The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics [online version ed.] 0199988692, 9780199988693

Cognitive neuroscience has grown into a rich and complex discipline, some 35 years after the term was coined. Given the

English Pages 621 [1111] Year 2013

Author / Uploaded
Kevin Ochsner
Stephen M. Kosslyn

Oxford Library of Psychology

Oxford Library of Psychology The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics Edited by Kevin N. Ochsner and Stephen Kosslyn Print Publication Date: Dec 2013 Subject: Psychology Online Publication Date: Dec 2013

Editor-in-Chief: Peter E. Nathan

Area Editors:
Clinical Psychology: David H. Barlow
Cognitive Neuroscience: Kevin N. Ochsner and Stephen M. Kosslyn
Cognitive Psychology: Daniel Reisberg
Counseling Psychology: Elizabeth M. Altmaier and Jo-Ida C. Hansen
Developmental Psychology: Philip David Zelazo
Health Psychology: Howard S. Friedman
History of Psychology: David B. Baker
Methods and Measurement: Todd D. Little
Neuropsychology: Kenneth M. Adams
Organizational Psychology: Steve W. J. Kozlowski
Personality and Social Psychology: Kay Deaux and Mark Snyder

Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide.

Oxford New York Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto

With offices in Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam

Oxford is a registered trademark of Oxford University Press in the UK and certain other countries.

Published in the United States of America by Oxford University Press, 198 Madison Avenue, New York, NY 10016

© Oxford University Press 2013

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by license, or under terms agreed with the appropriate reproduction rights organization. Inquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above. You must not circulate this work in any other form and you must impose this same condition on any acquirer.

Library of Congress Cataloging-in-Publication Data
The Oxford handbook of cognitive neuroscience / edited by Kevin Ochsner, Stephen M. Kosslyn.
volumes cm.—(Oxford library of psychology)
ISBN 978-0-19-998869-3
1. Cognitive neuroscience—Handbooks, manuals, etc. 2. Neuropsychology—Handbooks, manuals, etc. I. Ochsner, Kevin N. (Kevin Nicholas) II. Kosslyn, Stephen Michael, 1948– III. Title: Handbook of cognitive neuroscience.
QP360.5.O94 2013
612.8'233—dc23
2013026213

9 8 7 6 5 4 3 2 1
Printed in the United States of America on acid-free paper

Oxford Library of Psychology

The Oxford Library of Psychology, a landmark series of handbooks, is published by Oxford University Press, one of the world’s oldest and most highly respected publishers, with a tradition of publishing significant books in psychology. The ambitious goal of the Oxford Library of Psychology is nothing less than to span a vibrant, wide-ranging field and, in so doing, to fill a clear market need.

Encompassing a comprehensive set of handbooks, organized hierarchically, the Library incorporates volumes at different levels, each designed to meet a distinct need. At one level is a set of handbooks designed broadly to survey the major subfields of psychology; at another are numerous handbooks that cover important current focal research and scholarly areas of psychology in depth and detail. Planned as a reflection of the dynamism of psychology, the Library will grow and expand as psychology itself develops, thereby highlighting significant new research that will influence the field. Adding to its accessibility and ease of use, the Library will be published in print and electronically.

The Library surveys psychology’s principal subfields with a set of handbooks that capture the current status and future prospects of those major subdisciplines. This initial set includes handbooks of social and personality psychology, clinical psychology, counseling psychology, school psychology, educational psychology, industrial and organizational psychology, cognitive psychology, cognitive neuroscience, methods and measurements, history, neuropsychology, personality assessment, developmental psychology, and more. Each handbook undertakes to review one of psychology’s major subdisciplines with breadth, comprehensiveness, and exemplary scholarship.
In addition to these broadly conceived volumes, the Library also includes a large number of handbooks designed to explore in depth more specialized areas of scholarship and research, such as stress, health and coping, anxiety and related disorders, cognitive development, and child and adolescent assessment. In contrast to the broad coverage of the subfield handbooks, each of these latter volumes focuses on an especially productive, more highly focused line of scholarship and research. Whether at the broadest or most specific level, however, all of the Library handbooks offer synthetic coverage that reviews and evaluates the relevant past and present research and anticipates research in the future. Each handbook in the Library includes introductory and concluding chapters written by its editor or editors to provide a roadmap to the handbook’s table of contents and to offer informed anticipations of significant future developments in that field.

An undertaking of this scope calls for handbook editors and chapter authors who are established scholars in the areas about which they write. Many of the nation’s and world’s most productive and respected psychologists have agreed to edit Library handbooks or write authoritative chapters in their areas of expertise.

For whom has the Oxford Library of Psychology been written? Because of its breadth, depth, and accessibility, the Library serves a diverse audience, including graduate students in psychology and their faculty mentors, scholars, researchers, and practitioners in psychology and related fields. All will find in the Library the information they seek on the subfield or focal area of psychology in which they work or are interested.

Befitting its commitment to accessibility, each handbook includes a comprehensive index, as well as extensive references to help guide research. And because the Library was designed from its inception as an online as well as a print resource, its structure and contents will be readily and rationally searchable online. Further, once the Library is released online, the handbooks will be regularly and thoroughly updated.

In summary, the Oxford Library of Psychology will grow organically to provide a thoroughly informed perspective on the field of psychology, one that reflects both psychology’s dynamism and its increasing interdisciplinarity. Once published electronically, the Library is also destined to become a uniquely valuable interactive tool, with extended search and browsing capabilities.
As you begin to consult this handbook, we sincerely hope you will share our enthusiasm for the more than 500-year tradition of Oxford University Press for excellence, innovation, and quality, as exemplified by the Oxford Library of Psychology.

Peter E. Nathan
Editor-in-Chief
Oxford Library of Psychology

About the Editors


Kevin N. Ochsner

Kevin N. Ochsner is Associate Professor of Psychology at Columbia University. He graduated summa cum laude from the University of Illinois, where he received his B.A. in Psychology. Ochsner then received an M.A. and Ph.D. in psychology from Harvard University, working in the laboratory of Dr. Daniel Schacter, where he studied emotion and memory. Also at Harvard, he began his postdoctoral training in the lab of Dr. Daniel Gilbert, where he first began integrating social cognitive and neuroscience approaches to emotion-cognition interactions, and along with Matthew Lieberman published the first articles on the emerging field of social cognitive neuroscience. Ochsner later completed his postdoctoral training at Stanford University in the lab of Dr. John Gabrieli, where he conducted some of the first functional neuroimaging studies examining the brain systems supporting cognitive forms of regulation. He is now director of the Social Cognitive Neuroscience Laboratory at Columbia University, where current studies examine the psychological and neural bases of emotion, emotion regulation, empathy, and person perception in both healthy and clinical populations. Ochsner has received various awards for his research and teaching, including the American Psychological Association’s Division 3 New Investigator Award, the Cognitive Neuroscience Society’s Young Investigator Award, and Columbia University’s Lenfest Distinguished Faculty Award.

Stephen M. Kosslyn

Stephen M. Kosslyn is the Founding Dean of the university at the Minerva Project, based in San Francisco. Before that, he served as Director of the Center for Advanced Study in the Behavioral Sciences and Professor of Psychology at Stanford University, and was previously chair of the Department of Psychology, Dean of Social Science, and the John Lindsley Professor of Psychology in Memory of William James at Harvard University. He received a B.A. from UCLA and a Ph.D. from Stanford University, both in psychology. His original graduate training was in cognitive science, which focused on the intersection of cognitive psychology and artificial intelligence; faced with limitations in those approaches, he eventually turned to study the brain. Kosslyn’s research has focused primarily on the nature of visual cognition, visual communication, and individual differences; he has authored or coauthored 14 books and over 300 papers on these topics. Kosslyn has received the American Psychological Association’s Boyd R. McCandless Young Scientist Award, the National Academy of Sciences Initiatives in Research Award, a Cattell Award, a Guggenheim Fellowship, the J-L. Signoret Prize (France), an honorary Doctorate from the University of Caen, an honorary Doctorate from the University of Paris Descartes, an honorary Doctorate from Bern University, and election to Academia Rodinensis pro Remediatione (Switzerland), the Society of Experimental Psychologists, and the American Academy of Arts and Sciences.

Contributors


Claude Alain

Rotman Research Institute

Baycrest Centre

Toronto, Ontario, Canada

Agnès Alsius

Department of Psychology

Queen’s University

Kingston, Ontario, Canada

George A. Alvarez

Department of Psychology

Harvard University

Cambridge, MA

Stephen R. Arnott

Rotman Research Institute

Baycrest Centre

Toronto, Ontario, Canada

Moshe Bar

Martinos Center for Biomedical Imaging

Massachusetts General Hospital

Harvard Medical School

Charlestown, MA

Bryan L. Benson

Department of Psychology

School of Kinesiology

University of Michigan

Ann Arbor, MI

Damien Biotti

Lyon Neuroscience Research Center

Bron, France

Annabelle Blangero

Lyon Neuroscience Research Center

Bron, France

Sheila E. Blumstein

Department of Cognitive, Linguistic, and Psychological Sciences

Brown Institute for Brain Science

Brown University

Providence, RI

Grégoire Borst

University Paris Descartes

Laboratory for the Psychology of Child Development and Education (CNRS Unit 3521)

Paris, France

Department of Psychology

Harvard University

Cambridge, MA

Nathaniel B. Boyden

Department of Psychology

University of Michigan

Ann Arbor, MI

Andreja Bubic

Martinos Center for Biomedical Imaging

Massachusetts General Hospital

Harvard Medical School

Charlestown, MA

Bradley R. Buchsbaum

Rotman Research Institute

Baycrest Centre

Toronto, Ontario, Canada

Roberto Cabeza

Center for Cognitive Neuroscience

Duke University

Durham, NC

Denise J. Cai

Department of Psychology

University of California, San Diego

La Jolla, CA

Alfonso Caramazza

Department of Psychology

Harvard University

Cambridge, MA

Center for Mind/Brain Sciences

University of Trento

Rovereto, Italy

Evangelia G. Chrysikou

Department of Psychology

University of Kansas

Lawrence, KS

Jared Danker

Department of Psychology

New York University

New York, NY

Sander Daselaar

Donders Institute for Brain, Cognition, and Behaviour

Radboud University

Nijmegen, Netherlands

Center for Cognitive Neuroscience

Duke University

Durham, NC

Lila Davachi

Center for Neural Science

Department of Psychology

New York University

New York, NY

Mark D’Esposito

Helen Wills Neuroscience Institute

Department of Psychology

University of California

Berkeley, CA

Benjamin J. Dyson

Department of Psychology

Ryerson University

Toronto, Ontario, Canada

Jessica Fish

MRC Cognition and Brain Sciences Unit

Cambridge, UK

Angela D. Friederici

Department of Neuropsychology

Max Planck Institute for Human Cognitive and Brain Sciences

Leipzig, Germany

Melvyn A. Goodale

The Brain and Mind Institute

University of Western Ontario

London, Ontario, Canada

Kalanit Grill-Spector

Department of Psychology and Neuroscience Institute

Stanford University

Stanford, CA

Argye E. Hillis

Departments of Neurology, Physical Medicine and Rehabilitation, and Cognitive Science

Johns Hopkins University

Baltimore, MD

Ray Jackendoff

Center for Cognitive Studies

Tufts University

Medford, MA

Petr Janata

Center for Mind and Brain

Department of Psychology

University of California Davis

Davis, CA

Roni Kahana

Department of Neurobiology

Weizmann Institute of Science

Rehovot, Israel

Stephen M. Kosslyn

Minerva Project

San Francisco, CA

Youngbin Kwak

Neuroscience Program

University of Michigan

Ann Arbor, MI

Bruno Laeng

Department of Psychology

University of Oslo

Oslo, Norway

Ewen MacDonald

Department of Psychology

Queen’s University

Ontario, Canada

Centre for Applied Hearing Research

Department of Electrical Engineering

Technical University of Denmark

Lyngby, Denmark

Bradford Z. Mahon

Departments of Neurosurgery and Brain and Cognitive Sciences

University of Rochester

Rochester, NY

Claudia Männel

Department of Neuropsychology

Max Planck Institute for Human Cognitive and Brain Sciences

Leipzig, Germany

Jason B. Mattingley

Queensland Brain Institute

University of Queensland

St. Lucia, Queensland, Australia

Josh H. McDermott

Department of Brain and Cognitive Sciences

Massachusetts Institute of Technology

Cambridge, MA

Kevin Munhall

Department of Psychology

Queen’s University

Kingston, Ontario, Canada

Emily B. Myers

Department of Psychology

Department of Speech, Language, and Hearing Sciences

University of Connecticut

Storrs, CT

Jeffrey Nicol

Department of Psychology

Nipissing University

North Bay, Ontario, Canada

Kevin N. Ochsner

Department of Psychology

Columbia University

New York, NY

Laure Pisella

Lyon Neuroscience Research Center

Bron, France

Gilles Rode

Lyon Neuroscience Research Center

University Lyon

Hospices Civils de Lyon

Hôpital Henry Gabrielle

St. Genis Laval, France

Yves Rossetti

Lyon Neuroscience Research Center

University Lyon

Mouvement et Handicap

Plateforme IFNL-HCL

Hospices Civils de Lyon

Lyon, France

M. Rosario Rueda

Departamento de Psicología Experimental

Centro de Investigación Mente, Cerebro y Comportamiento (CIMCYC)

Universidad de Granada

Granada, Spain

Rachael D. Seidler

Department of Psychology

School of Kinesiology

Neuroscience Program

University of Michigan

Ann Arbor, MI

Noam Sobel

Department of Neurobiology

Weizmann Institute of Science

Rehovot, Israel

Sharon L. Thompson-Schill

Department of Psychology

University of Pennsylvania

Philadelphia, PA

Caroline Tilikete

Lyon Neuroscience Research Center

University Lyon

Hospices Civils de Lyon

Hôpital Neurologique

Lyon, France

Kyrana Tsapkini

Departments of Neurology, and Physical Medicine and Rehabilitation

Johns Hopkins University

Baltimore, MD

Alain Vighetto

Lyon Neuroscience Research Center

University Lyon

Hospices Civils de Lyon

Hôpital Neurologique

Lyon, France

Barbara A. Wilson

Department of Psychology

Institute of Psychiatry

King’s College

London, UK

John T. Wixted

Department of Psychology

University of California, San Diego

La Jolla, CA

Eiling Yee

Basque Center on Cognition, Brain, and Language

San Sebastian, Spain

Josef Zihl

Neuropsychology Unit

Department of Psychology

University of Munich

Max Planck Institute of Psychiatry

Munich, Germany

Introduction to The Oxford Handbook of Cognitive Neuroscience: Cognitive Neuroscience—Where Are We Now?

Kevin N. Ochsner and Stephen Kosslyn
Subject: Psychology, Cognitive Neuroscience
DOI: 10.1093/oxfordhb/9780199988693.013.0001

Abstract and Keywords

This two-volume set reviews the current state of the art in cognitive neuroscience. The introductory chapter outlines central elements of the cognitive neuroscience approach and provides a brief overview of the eight sections of the book’s two volumes. Volume 1 is divided into four sections comprising chapters that examine core processes, ways in which they develop across the lifespan, and ways they may break down in special populations. The first section deals with perception and addresses topics such as the abilities to represent and recognize objects and spatial relations and the use of top-down processes in visual perception. The second section focuses on attention and how it relates to action and visual motor control. The third section, on memory, covers topics such as working memory, semantic memory, and episodic memory. Finally, the fourth section, on language, includes chapters on abilities such as speech perception and production, semantics, the capacity for written language, and the distinction between linguistic competence and performance.

Keywords: cognitive neuroscience, perception, attention, language, memory, spatial relations, visual perception, visual motor control, semantics, linguistic competence

On a night in the late 1970s, something important happened in a New York City taxicab: A new scientific field was named. En route to a dinner at the famed Algonquin Hotel, the neuroscientist Michael Gazzaniga and the cognitive psychologist George Miller coined the term “cognitive neuroscience.” This field would go on to change the way we think about the relationship between behavior, mind, and brain.

This is not to say that the field was born on that day. Indeed, as Hermann Ebbinghaus (1910) noted, “Psychology has a long past, but a short history,” and cognitive neuroscience clearly has a rich and complex set of ancestors. Although it is difficult to say exactly when a new scientific discipline came into being, the groundwork for the field had begun to be laid decades before the term was coined. As has been chronicled in detail elsewhere (Gardner, 1985; Posner & DiGirolamo, 2000), as behaviorism gave way to the cognitive revolution, and as computational and neuroscientific approaches to understanding the mind became increasingly popular, researchers in numerous allied fields came to believe that understanding the relationships between behavior and the mind required understanding their relationship to the brain.

This two-volume set reviews the current state of the art in cognitive neuroscience, some 35 years after the field was named. In these intervening years, the field has grown tremendously, so much so, in fact, that cognitive neuroscience is now less a bounded discipline focused on specific topics and more an approach that permeates psychological and neuroscientific inquiry. As such, no collection of chapters could possibly encompass the entire breadth and depth of cognitive neuroscience. That said, this two-volume set attempts systematically to survey eight core areas of inquiry in cognitive neuroscience, four per volume, in a total of 55 chapters.

As an appetizer to this scientific feast, this introductory chapter offers a quick sketch of some central elements of the cognitive neuroscience approach and a brief overview of the eight sections of the Handbook’s two volumes.

The Cognitive Neuroscience Approach

Among the many factors that gave rise to cognitive neuroscience, we highlight three signal insights. In part, we explicitly highlight these key ideas because they lay bare elements of the cognitive neuroscience approach that have become so commonplace today that their importance may be forgotten even as they implicitly influence the ways research is conducted.

Multiple Levels of Analysis

The first crucial influence on cognitive neuroscience was a set of insights presented in a book by the late British vision scientist David Marr. Published in 1982, the book Vision took an old idea, levels of analysis, and made a strong case that we can only understand visual perception if we integrate descriptions cast at three distinct, but fundamentally interrelated (Kosslyn & Maljkovic, 1990), levels. At the topmost computational level, one describes the problem at hand, such as how one can see edges, derive three-dimensional structure of shapes, and so on; this level characterizes “what” the system does. At the middle algorithm level, one describes how a specific computational problem is solved by a system that includes specific processes that operate on specific representations; this level characterizes “how” the system operates. And at the lowest implementation level, one describes how the representations and processes that constitute the algorithm are instantiated in the brain. All three levels are crucial, and characteristics of the description at each level affect the way we must describe characteristics at the other levels.

This approach proved enormously influential in vision research, and researchers in other domains quickly realized that it could be applied more broadly. This multilevel approach is now the foundation for cognitive neuroscience inquiry more generally, although we often use different terminology to refer to these levels of analysis. For instance, many researchers now talk about the levels of behavior and experience, psychological processes (or information processing mechanisms), and neural systems (Mitchell, 2006; Ochsner, 2007; Ochsner & Lieberman, 2001). But the core idea is still the same as that articulated by Marr: A complete understanding of the ways in which vision, memory, emotion, or any other cognitive or emotional faculty operates necessarily involves connecting descriptions of phenomena across levels of analysis.

The resulting multilevel descriptions have many advantages over the one- or two-level accounts that are typical of traditional approaches in allied disciplines such as cognitive psychology. These advantages include the ability to use both behavioral and brain data in combination (rather than just one or the other taken alone) to draw inferences about psychological processes. In so doing, one constructs theories that are constrained by, must connect to, and must make sense in the context of more types of data than theories that are couched solely at the behavioral or at the behavioral and psychological levels. We return to some of these advantages below.

Use of Multiple Methods

If we are to study human abilities and capacities at multiple levels of analysis, we must necessarily use multiple types of methods to do so. In fact, many methods exist to measure phenomena at each of the levels of analysis, and new measures are continually being invented (Churchland & Sejnowski, 1988).

Today, this observation is taken as a given by many graduate students who study cognitive neuroscience. They take it for granted that we should use studies of patient populations, electrophysiological methods, functional imaging methods, transcranial magnetic stimulation (TMS, which uses magnetic fields to temporarily impair or enhance neural functioning in a specific brain area), and other new techniques as they are developed.

But this view wasn’t always the norm. This fact is illustrated nicely by a debate that took place in the early 1990s about whether and how neuroscience data should inform psychological models of cognitive processes. On one side was the view from cognitive neuropsychology, which centered on the idea that studies of patient populations may be sufficient to understand the structure of cognitive processing (Caramazza, 1992). The claim was that by studying the ways in which behavior changes as a result of the unhappy accidents of nature (e.g., strokes, traumatic brain injuries) that caused lesions of language areas, memory areas, and so on, we can discover the processing modules that constitute the mind. The key assumption here is that researchers can identify direct relationships between behavioral deficits and specific areas of the brain that were damaged.

On the other side of the debate was the view from cognitive neuroscience, which centered on the idea that the more methods used, the better (Kosslyn & Intriligator, 1992).
Because every method has its limitations, the more methods researchers could bring to bear, the more likely they are to have a correct picture of how behavior is related to neural functioning. In the case of patient populations, for example, in some cases the deficits in behavior might not simply reflect the normal functions of the damaged regions; rather, they could reflect reorganization of function after brain damage or diffuse damage to multiple regions that affects multiple separate functions. If so, then observing patterns of dissociations and associations of abilities following brain damage would not necessarily allow researchers to delineate the structure of cognitive processing. Other methods would be required (such as neuroimaging) to complement studies of brain-damaged patients.

The field quickly adopted the second perspective, drawing on multiple methods when constructing and testing theories of cognitive processing. Researchers realized that they could use multiple methods together in complementary ways: They could use functional imaging methods to describe the network of processes active in the healthy brain when engaged in a particular behavior; they could use lesion methods or TMS to assess the causal relationships between activity in specific brain areas and particular forms of information processing (which in turn give rise to particular types of behavior); they could use electrophysiological methods to study the temporal dynamics of cortical systems as they interactively relate to the behavior of interest. And so on. The cognitive neuroscience approach adopted the idea that no single technique provides all the answers.

That said, there is no denying that some techniques have proved more powerful and generative than others during the past 35 years. In particular, it is difficult to overstate the impact of functional imaging of the healthy intact human brain, first ushered in by positron emission tomography studies in the late 1980s (Petersen et al., 1988) and given a tremendous boost by the advent of, and subsequent boom of, functional magnetic resonance imaging in the early 1990s (Belliveau et al., 1992).
The advent of functional imaging is in many ways the single most important contributor to the rise of cognitive neuroscience. Without the ability to study cortical and subcortical brain systems in action in healthy adults, it’s not clear whether cognitive neuroscience would have become the central paradigm that it is today.

We must, however, offer a cautionary note: Functional imaging is by no means the be-all and end-all of cognitive neuroscience techniques. Like any other method, it has its own strengths and weaknesses (which have been described in detail elsewhere, e.g., Poldrack, 2006, 2008, 2011; Van Horn & Poldrack, 2009; Yarkoni et al., 2010). Researchers trained in cognitive neuroscience understand many, if not all, of these limitations, but unfortunately, many outside the field do not. This can cause two problems. The first is that newcomers to the field may improperly use functional imaging in the service of overly simplistic “brain mapping” (e.g., seeking to identify “love spots” in the brain; Fisher et al., 2002) and may commit other inferential errors (Poldrack, 2006). The second, less appreciated problem is that when nonspecialists read about studies of such overly simplistic hypotheses, they may assume that all cognitive neuroscientists traffic in this kind of experimentation and theorizing. As the chapters in these volumes make clear, most cognitive neuroscientists appreciate the strengths and limits of the various techniques they use, and understand that functional imaging is simply one of a number of techniques that allow neuroscience data to constrain theories of psychological processes. In the next section, we turn to exactly this point.


Constraints and Convergence

One implication of using multiple methods to study phenomena at multiple levels of analysis is that we have numerous types of data. These data provide converging evidence for, and constrain the nature of, theories of human cognition, emotion, and behavior. That is, the data must fit together, painting different facets of the same picture (this is what we mean by convergence). And even though each type of data alone does not dictate a particular interpretation, each type helps to narrow the range of possible interpretations (this is what we mean by constraining the nature of theories). Researchers in cognitive neuroscience acknowledge that data always can be interpreted in various ways, but they also rely on the fact that data limit the range of viable interpretations; the more types of data, the more strongly they will narrow down the range of possible theories. In this sense, constraints and convergence are the very core of the cognitive neuroscience approach (Ochsner & Kosslyn, 1999).

We note that the principled use of constraining and converging evidence does not privilege evidence couched at any one level of analysis. Brain data are not more important, more real, or more intrinsically valuable than behavioral data, and vice versa. Rather, both kinds of data constrain the range of possible theories of psychological processes, and as such, both are valuable. In addition, both behavioral and brain data can spark changes in theories of psychological processes.

This claim stands in contrast to claims made by those who have argued that brain data can never change, or in any way constrain, a psychological theory. According to this view, brain data are ambiguous without a psychological theory to interpret them (Kihlstrom, 2012). Such arguments fail to appreciate the fact that the goal of cognitive neuroscience is to construct theories couched at all three levels of analysis.
Moreover, behavioral and brain data often are dependent variables collected in the same experiments. This is not arbitrary; we have ample evidence that behavior and brain function are intimately related: When the brain is damaged in a particular location, specific behaviors are disrupted, and when a person engages in specific behaviors, specific brain areas are activated. Dependent measures are always what science uses to constrain theorizing, and thus it follows that both behavioral and brain data must constrain our theories of the intervening psychological processes.

This point is so important that we want to illustrate it with two examples. The first begins with classic studies of the amnesic patient known for decades only by his initials, H.M. (Corkin, 2002). After he died, his brain was famously donated to science and dissected live on the Internet in 2009 (see http://thebrainobservatory.ucsd.edu/hm_live.php). We now know that his name was Henry. In the early 1950s, Henry suffered from severe epilepsy, arising from abnormal neural tissue in his temporal lobes, that could not be treated with medication. At the time, he suffered horribly from seizures, and the last remaining course of potential treatment was a neurosurgical operation that removed the tips of Henry's temporal lobes (and with them, the neural origins of his epileptic seizures).

When Henry awoke after his operation, the epilepsy was gone, but so was his ability to form new memories of events he experienced. Henry was stuck in the eternal present, forevermore awakening each day with his sense of time frozen at the age at which he had the operation. The time horizon for his experience was about two minutes, or the amount of time information could be retained in short-term memory before it required transfer to a longer-term episodic memory store.

To say that the behavioral sequelae of H.M.'s operation were surprising to the scientific community at that time is an understatement. Many psychologists and neuroscientists spent the better part of the next 20 to 30 years reconfiguring their theories of memory in order to accommodate these and subsequent findings. It wasn't until the early 1990s that the far-reaching theoretical implications of Henry's amnesia finally became clear (Schacter & Tulving, 1994), when a combination of behavioral, functional imaging, and patient lesion data converged to implicate a multiple-systems account of human memory.

This understanding of H.M.'s deficits was hard won, and emerged only after an extended "memory systems debate" in psychology and neuroscience (Schacter & Tulving, 1994). This debate was between, on the one hand, behavioral and psychological theorists who argued that we have a single memory system (which has multiple processes) and, on the other hand, neuroscience-inspired theorists who argued that we have multiple memory systems (each of which instantiates a particular kind of process or processes). The initial observation of H.M.'s amnesia, combined with decades of subsequent careful experimentation using multiple behavioral and neuroscience techniques, decisively came down on the side of the multiple memory systems theorists.
Cognitive processing relies on multiple types of memory, and each uses a distinct set of representations and processes. This was a clear victory for the cognitive neuroscience approach over purely behavioral approaches.

A second example of the utility of combining neuroscientific and behavioral evidence comes from the "imagery debate" (Kosslyn, Thompson, & Ganis, 2006). On one hand, some psychologists and philosophers argued that the pictorial characteristics of visual mental images that are evident to experience are epiphenomenal, like heat produced by a light bulb when someone is reading: something that could be experienced but played no role in accomplishing the function. On the other hand, cognitive neuroscientists argued that visual mental images are analogous to visual percepts in that they use space in a representation to specify space in the world.

This debate went back and forth for many years without resolution, and at one point a mathematical proof was offered that behavioral data alone could never resolve it (Anderson, 1978). The advent of neuroimaging helped bring this debate largely to a close (Kosslyn, Thompson, & Ganis, 2006). A key finding was that the first cortical areas that process visual input during perception each are topographically mapped, such that adjacent locations in the visual world are represented in adjacent locations in the visual cortex. That is, these areas use space on the cortex to represent space in the world. In the early 1990s, researchers showed that visualizing objects typically activates these areas, and increasing the size of a visual mental image activates portions of this cortex that register increasingly larger sizes in perception. Moreover, in the late 1990s researchers showed that temporarily impairing these areas using TMS hampers imagery and perception to the same degree. Hence, these brain-based findings provided clear evidence that visual mental images are, indeed, analogous to visual percepts in that both represent space in the world by using space in a representation.

We have written as if both debates (about memory systems and mental imagery representation) are now definitely closed. But this is a simplification; not everyone is convinced of one or another view. Our crucial point is that the advent of neuroscientific data has shifted the terms of the debate. When only behavioral data were available, in both cases the two alternative positions seemed equally plausible, but after the relevant neuroscientific data were introduced, the burden of proof shifted dramatically to one side, and a clear consensus emerged in the field (e.g., see Reisberg, Pearson, & Kosslyn, 2003).

In the years since these debates, evidence from cognitive neuroscience has constrained theories of a wide range of phenomena. Many such examples are chronicled in this Handbook.

Overview of the Handbook

Cognitive neuroscience in the new millennium is a broad and diverse field, defined by a multileveled integrative approach. To provide a systematic overview of this field, we've divided this Handbook into two volumes.

Volume 1

The first volume surveys classic areas of interest in cognitive neuroscience: perception, attention, memory, and language. Twenty years ago, when Kevin Ochsner was a graduate student and Stephen Kosslyn was one of his professors, research on these topics formed the backbone of cognitive neuroscience research. And this is still true today, for two reasons.

First, when cognitive neuroscience took off, these were the areas of research within psychology that had the most highly developed behavioral, psychological, and neuropsychological (i.e., brain-damaged patient based) models in place. And in the case of research on perception, attention, and memory, these were topics for which fairly detailed models of the underlying neural circuitry already had been developed on the basis of rodent and nonhuman primate studies. As such, these areas were poised to benefit from the use of brain-based techniques in humans.

Second, research on the representations and processes used in perception, attention, memory, and language in many ways forms a foundation for studying other kinds of complex behaviors, which are the focus of the second volume. This is true both in terms of the findings themselves and in terms of the evidence such findings provided that the cognitive neuroscience approach could be successful.

With this in mind, each of the four sections in Volume 1 includes a selection of chapters that cover core processes and the ways in which they develop across the lifespan and may break down in special populations.

The first section, on perception, includes chapters on the abilities to represent and recognize objects and spatial relations. In addition, this section contains chapters on the use of top-down processes in visual perception and on the ways in which such processes enable us to construct and use mental images. We also include chapters on perceptual abilities that have seen tremendous research growth in the past 5 to 10 years, such as olfaction, audition, and music perception. Finally, there is a chapter on disorders of perception.

The second section, on attention, includes chapters on the abilities to attend to auditory and spatial information as well as on the relationships between attention, action, and visual motor control. These are followed by chapters on the development of attention and its breakdown in various disorders.

The third section, on memory, includes chapters on the abilities to maintain information in working memory as well as semantic memory, episodic memory, and the consolidation process that governs the transfer of information from working to semantic and episodic memory. There is also a chapter on the ability to acquire skills, which depends on different systems than those used in other forms of memory, as well as chapters on changes in memory function with older age and the ways in which memorial processes break down in various disorders.

Finally, the fourth section, on language, includes chapters on abilities such as speech perception and production, the distinction between linguistic competence and performance, semantics, the capacity for written language, and multimodal and developmental aspects of speech perception.

Volume 2

Whereas Volume 1 addresses the classics of cognitive neuroscience, Volume 2 focuses on the "new wave" of research that has developed primarily in the past 10 years. As noted earlier, in many ways the success of these relatively newer research directions builds on the successes of research in the classic domains. Indeed, our knowledge of the systems implicated in perception, attention, memory, and language literally (and in this Handbook) provided the foundation for the work described in Volume 2.

The first section, on emotion, begins with processes involved in interactions between emotion, perception, and attention, as well as the generation and regulation of emotion. This is followed by chapters that provide models for understanding broadly how emotion affects cognition as well as the contribution that bodily sensation and control make to affective and other processes. This section concludes with chapters on genetic and developmental approaches to emotion.

The second section, on self and social cognition, begins with a chapter on the processes that give rise to the fundamental ability to know and understand oneself. This is followed by chapters on increasingly complex abilities involved in perceiving others, starting with the perception of nonverbal cues and perception–action links, and from there ranging to face recognition, impression formation, drawing inferences about others' mental states, empathy, and social interaction. This section concludes with a chapter on the development of social cognitive abilities.

The third section, on higher cognitive functions, surveys abilities that largely depend on processes in the frontal lobes of the brain, which interact with the kinds of core perceptual, attentional, and memorial processes described in Volume 1. Here, we include chapters on conflict monitoring and cognitive control, the hierarchical control of action, thinking, decision making, categorization, expectancies, numerical cognition, and neuromodulatory influences on higher cognitive abilities.

Finally, in the fourth section, four chapters illustrate how disruptions of the mechanisms of cognition and emotion produce abnormal functioning in clinical populations. This section begins with a chapter on attention deficit-hyperactivity disorder and from there moves to chapters on anxiety, post-traumatic stress disorder, and obsessive-compulsive disorder.

Summary

Before moving from the appetizer to the main course, we offer two last thoughts.

First, we edited this Handbook with the goal of providing a broad-reaching compendium of research on cognitive neuroscience that will be accessible to a wide audience. Toward this end, the chapters included in this Handbook are available online to be downloaded individually. This is the first time that chapters of a Handbook of this sort have been made available in this way, and we hope this facilitates access to and dissemination of some of cognitive neuroscience's greatest hits.

Second, we hope that, whether you are a student, an advanced researcher, or an interested layperson, this Handbook whets your appetite for learning more about this exciting and growing field. Although reading survey chapters of the sort provided here is an excellent way to become oriented in the field and to start building your knowledge of the topics that interest you most, we encourage you to take your interests to the next level: Delve into the primary research articles cited in these chapters, and perhaps even get involved in doing this sort of research!


References

Anderson, J. R. (1978). Arguments concerning representations for mental imagery. Psychological Review, 85, 249–277.
Belliveau, J. W., Kwong, K. K., Kennedy, D. N., Baker, J. R., Stern, C. E., et al. (1992). Magnetic resonance imaging mapping of brain function: Human visual cortex. Investigative Radiology, 27 (Suppl 2), S59–S65.
Caramazza, A. (1992). Is cognitive neuropsychology possible? Journal of Cognitive Neuroscience, 4, 80–95.
Churchland, P. S., & Sejnowski, T. J. (1988). Perspectives on cognitive neuroscience. Science, 242, 741–745.
Corkin, S. (2002). What's new with the amnesic patient H.M.? Nature Reviews Neuroscience, 3, 153–160.
Fisher, H. E., Aron, A., Mashek, D., Li, H., & Brown, L. L. (2002). Defining the brain systems of lust, romantic attraction, and attachment. Archives of Sexual Behavior, 31, 413–419.
Gardner, H. (1985). The mind's new science: A history of the cognitive revolution. New York: Basic Books.
Kihlstrom, J. F. (2012). Social neuroscience: The footprints of Phineas Gage. Social Cognition, 28, 757–782.
Kosslyn, S. M., & Intriligator, J. I. (1992). Is cognitive neuropsychology plausible? The perils of sitting on a one-legged stool. Journal of Cognitive Neuroscience, 4, 96–105.
Kosslyn, S. M., & Maljkovic, V. M. (1990). Marr's metatheory revisited. Concepts in Neuroscience, 1, 239–251.
Kosslyn, S. M., Thompson, W. L., & Ganis, G. (2006). The case for mental imagery. New York: Oxford University Press.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco: W. H. Freeman.
Mitchell, J. P. (2006). Mentalizing and Marr: An information processing approach to the study of social cognition. Brain Research, 1079, 66–75.
Ochsner, K. (2007). Social cognitive neuroscience: Historical development, core principles, and future promise. In A. Kruglanski & E. T. Higgins (Eds.), Social psychology: A handbook of basic principles (pp. 39–66). New York: Guilford Press.
Ochsner, K. N., & Kosslyn, S. M. (1999). The cognitive neuroscience approach. In B. M. Bly & D. E. Rumelhart (Eds.), Cognitive science (pp. 319–365). San Diego, CA: Academic Press.
Ochsner, K. N., & Lieberman, M. D. (2001). The emergence of social cognitive neuroscience. American Psychologist, 56, 717–734.
Petersen, S. E., Fox, P. T., Posner, M. I., Mintun, M., & Raichle, M. E. (1988). Positron emission tomographic studies of the cortical anatomy of single-word processing. Nature, 331, 585–589.
Poldrack, R. A. (2006). Can cognitive processes be inferred from neuroimaging data? Trends in Cognitive Sciences, 10, 59–63.
Poldrack, R. A. (2008). The role of fMRI in cognitive neuroscience: Where do we stand? Current Opinion in Neurobiology, 18, 223–227.
Poldrack, R. A. (2011). Inferring mental states from neuroimaging data: From reverse inference to large-scale decoding. Neuron, 72, 692–697.
Posner, M. I., & DiGirolamo, G. J. (2000). Cognitive neuroscience: Origins and promise. Psychological Bulletin, 126, 873–889.
Reisberg, D., Pearson, D. G., & Kosslyn, S. M. (2003). Intuitions and introspections about imagery: The role of imagery experience in shaping an investigator's theoretical views. Applied Cognitive Psychology, 17, 147–160.
Schacter, D. L., & Tulving, E. (Eds.) (1994). Memory systems 1994. Cambridge, MA: MIT Press.
Van Horn, J. D., & Poldrack, R. A. (2009). Functional MRI at the crossroads. International Journal of Psychophysiology, 73, 3–9.
Yarkoni, T., Poldrack, R. A., Van Essen, D. C., & Wager, T. D. (2010). Cognitive neuroscience 2.0: Building a cumulative science of human brain function. Trends in Cognitive Sciences, 14, 489–496.

Kevin N. Ochsner

Kevin N. Ochsner is a professor in the Department of Psychology at Columbia University in New York, NY.

Stephen Kosslyn

Stephen M. Kosslyn, Center for Advanced Study in the Behavioral Sciences, Stanford University, Stanford, CA


Representation of Objects

Kalanit Grill-Spector
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0002

Abstract and Keywords

Functional magnetic resonance imaging (fMRI) has enabled neuroscientists and psychologists to understand the neural bases of object recognition in humans. This chapter reviews fMRI research that yielded important insights about the nature of object representations in the human brain. Combining fMRI with psychophysics may offer clues about what kind of visual processing is implemented in distinct cortical regions. This chapter explores how fMRI has influenced current understanding of object representations by focusing on two aspects of object representation: how the underlying representations provide for invariant object recognition and how category information is represented in the ventral stream. It first provides a brief introduction of the functional organization of the human ventral stream and a definition of object-selective cortex before describing cue-invariant responses in the lateral occipital complex (LOC), neural bases of invariant object recognition, object and position information in the LOC, and viewpoint sensitivity across the LOC. The chapter concludes by commenting on debates about the nature of functional organization in the human ventral stream.

Keywords: functional magnetic resonance imaging, object recognition, psychophysics, brain, object representation, category information, ventral stream, object-selective cortex, lateral occipital complex, functional organization

Introduction

Humans can effortlessly recognize objects in a fraction of a second despite large variability in the appearance of objects (Thorpe et al., 1996). What are the underlying representations and computations that enable this remarkable human ability? One way to answer these questions is to investigate the neural mechanisms of object recognition in the human brain. With the advent of functional magnetic resonance imaging (fMRI) about 20 years ago, neuroscientists and psychologists began to examine the neural bases of object recognition in humans. fMRI is an attractive method because it is a noninvasive technique that allows multiple measurements of brain activation in the same awake behaving human. Among noninvasive techniques, it provides the best spatial resolution currently available, enabling us to localize cortical activations at a spatial resolution of millimeters (as fine as 1 mm) and at a reasonable time scale (on the order of seconds).

Before the advent of fMRI, knowledge about the function of the ventral stream was based on single-unit electrophysiology measurements in monkeys and on lesion studies. These studies showed that neurons in the monkey inferotemporal (IT) cortex respond to shapes (Fujita et al., 1992) and complex objects such as faces (Desimone et al., 1984), and that lesions to the ventral stream can produce specific deficits in object recognition such as agnosia (inability to recognize objects) and prosopagnosia (inability to recognize faces; see Farah, 1995). However, interpreting lesion data is complicated because lesions are typically diffuse (usually more than one region is damaged), may disrupt both a cortical region and its connectivity, and are not replicable across patients. Therefore, the primary knowledge gained from fMRI research was which cortical sites in the normal human brain are involved in object recognition. The first set of fMRI studies of object and face recognition in humans identified the regions in the human brain that respond selectively to objects and faces (Malach et al., 1995; Kanwisher et al., 1997; Grill-Spector et al., 1998b). Next, a series of studies demonstrated that activation in object- and face-selective regions correlates with success at recognizing objects and faces, respectively, providing striking evidence for the involvement of these regions in recognition (Bar et al., 2001; Grill-Spector et al., 2000, 2004). After researchers determined which regions in cortex are involved in object recognition, their focus shifted to examining the nature of representations and computations that are implemented in these regions to understand how they enable efficient object recognition in humans.
In this chapter, I review fMRI research that provided important knowledge about the nature of object representations in the human brain. For example, one of the fundamental problems in recognition is how to recognize an object across variations in its appearance (invariant object recognition). Understanding how a biological system has solved this problem may give hints for how to build a robust artificial recognition system. Further, fMRI is better suited to measuring object representations than to measuring the temporal sequence of computations en route to object recognition, because the time scale of fMRI measurements is longer than the time scale of the recognition process (the temporal resolution of fMRI is on the order of seconds, whereas object recognition takes about 100 to 250 ms). Nevertheless, combining psychophysics with fMRI may give us some clues to the kinds of visual processing implemented in distinct cortical regions. For example, finding regions whose activation is correlated with success at some tasks, but not others, may suggest the involvement of particular cortical regions in one computation, but not another.

In discussing how fMRI has affected our current understanding of object representations, I focus on results pertaining to two aspects of object representation:

• How do the underlying representations provide for invariant object recognition?
• How is category information represented in the ventral stream?

I have chosen these topics for three reasons: (1) they are central topics in the field of object recognition for which fMRI has substantially advanced our understanding, (2) some findings related to these topics stirred considerable debate (see the later section, Debates about the Nature of Functional Organization in the Human Ventral Stream), and (3) some of the fMRI findings in humans are surprising given prior knowledge from single-unit electrophysiology in monkeys. In terms of the chapter's organization, I begin with a brief introduction of the functional organization of the human ventral stream and a definition of object-selective cortex, and then describe research that elucidated the properties of these regions with respect to basic coding principles. I continue with findings related to invariant object recognition, and then end with research and theories regarding category representation and specialization in the human ventral stream.

Functional Organization of the Human Ventral Stream

The first set of fMRI studies on object and face recognition in humans was devoted to identifying the regions in the brain that are object and face selective. Electrophysiology research in monkeys suggested that neurons in higher level regions respond to shapes and objects more than simple stimuli such as lines, edges, and patterns (Desimone et al., 1984; Fujita et al., 1992; Logothetis et al., 1995). Based on these findings, fMRI studies measured brain activation when people viewed pictures of objects, as opposed to when people viewed scrambled objects (i.e., pictures that have the same local information and statistics, but do not contain an object) or texture patterns (e.g., checkerboards, which are robust visual stimuli, but do not elicit a percept of a global form). These studies found a constellation of regions in the lateral occipital cortex termed the lateral occipital complex (LOC), extending from the lateral occipital sulcus, posterior to the medial temporal hMT+ region ventrally to the occipito-temporal sulcus (OTS) and the fusiform gyrus (Fus), that respond more to objects than controls. The LOC is located lateral and anterior to early visual areas (Grill-Spector et al., 1998a, 1998b) and is typically divided into two subregions: LO, a region in lateral occipital cortex adjacent and posterior to the hMT+ region; and pFus/OTS, a ventral region overlapping the OTS and the posterior fusiform gyrus (pFus) (Figure 2.1). More recent experiments indicate that the posterior subregion (LO) overlaps a visual field map representation between V3a and hMT+ called LO2 (Sayres & Grill-Spector, 2008).
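The logic of the objects > scrambled objects contrast described above can be illustrated with a toy calculation. The following is only a minimal sketch, assuming a paired t statistic computed per region over block responses; published fMRI analyses typically fit a general linear model to every voxel's time course. The function names, the response values, and the threshold constant are all hypothetical and were invented for illustration.

```python
import math
import statistics


def paired_t(cond_a, cond_b):
    """Paired t statistic for two equal-length lists of block responses."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("zero variance in the paired differences")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


def object_selective(responses, t_threshold):
    """Names of regions whose objects > scrambled t statistic exceeds the threshold."""
    return [
        name
        for name, (objects, scrambled) in responses.items()
        if paired_t(objects, scrambled) > t_threshold
    ]


# Invented percent-signal-change values (not real data): six blocks per
# condition, listed as (objects, scrambled objects) for each region.
responses = {
    "LO":   ([2.1, 1.9, 2.3, 2.0, 2.2, 1.8], [1.0, 0.9, 1.2, 1.1, 1.0, 0.8]),
    "V1":   ([2.0, 2.1, 1.9, 2.2, 2.0, 1.8], [2.1, 2.0, 2.0, 2.1, 1.9, 1.9]),
    "pFus": ([1.8, 1.7, 2.0, 1.9, 1.6, 1.9], [1.0, 1.1, 1.1, 1.0, 0.9, 1.2]),
}

# Assumed threshold: about 5.89 is the one-sided critical t for p < 0.001
# with 5 degrees of freedom (six paired blocks).
T_THRESHOLD = 5.89

selective = object_selective(responses, T_THRESHOLD)
print(selective)  # prints ['LO', 'pFus']
```

In this invented example the early visual region responds strongly to both stimulus types, so its difference score hovers around zero and it does not pass the contrast, whereas the two LOC subregions do.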


Figure 2.1 Object-, face- and place-selective cortex. (a) Data of one representative subject shown on her partially inflated right hemisphere. Left: lateral view. Right: ventral view. Dark gray: sulci. Light gray: gyri. White lines delineate retinotopic regions. Blue: object-selective regions (objects > scrambled objects), including LO and pFus/OTS ventrally as well as dorsal foci along the intraparietal sulcus (IPS). Red: face-selective regions (faces > objects, body parts, places & words), including two regions in the fusiform gyrus (FFA-1, FFA-2), a region in the inferior occipital gyrus (IOG), and two regions in the posterior superior temporal sulcus (STS). Magenta: overlap between face- and object-selective regions. Green: place-selective regions (places > faces, body parts, objects and words), including the PPA and a dorsal region lateral to the IPS. Yellow: Body part selective regions (bodies > other categories). Black: Visual word form area (VWFA), words > other categories. All maps thresholded at p < 0.001, voxel level. (b) LO and pFus (but not V1) responses are correlated with recognition performance (Ungerleider et al., 1983; Grill-Spector et al., 2000). To superimpose recognition performance and fMRI signals on the same plot, all values were normalized relative to the maximal response for the 500-ms duration stimulus.


Representation of Objects The LOC responds robustly to many kinds of objects and object categories (including nov el objects) and is thought to be in the intermediate or high-level stages of the visual hier archy. Importantly, LOC activations are correlated with subjects’ object recognition per formance. High LOC responses correlate with successful object recognition (hits), and low LOC responses correlate with trials in which objects are present, but are not recog nized (misses) (see Figure 2.1b). There are also object-selective regions in the dorsal stream (Grill-Spector, 2003; Grill-Spector & Malach, 2004), but these regions do not cor relate with object recognition performance (Fang & He, 2005) and may be involved in computations related to visually guided actions toward objects (Culham et al., 2003). However, a comprehensive discussion of the dorsal stream’s role in object perception is beyond the scope of this chapter. In addition to the LOC, researchers found several ventral regions that show preferential responses to specific object categories. Searching for regions with categorical preference was motivated by reports that suggested that lesions to the ventral stream can produce very specific deficits, such as the inability to recognize faces or the inability to read words, whereas other visual (and recognition) faculties are preserved. 
By contrasting ac tivations to different kinds of objects, researchers found ventral regions that show higher responses to specific object categories, such as lateral fusiform regions that respond more to animals than tools and medial fusiform regions that respond to tools more than animals (Chao et al., 1999; Martin et al., 1996); a region in the left OTS that responds more strongly to letters than textures (the visual word form area [VWFA]; Cohen et al., 2000); several foci that respond more strongly to faces than other objects (Grill-Spector et al., 2004; Haxby et al., 2000; Hoffman & Haxby, 2000; Kanwisher et al., 1997; Weiner & Grill-Spector, 2012), including the fusiform face areas (FFA-1, FFA-2; Kanwisher et al., 1997; Weiner & Grill-Spector, 2010); regions that respond more strongly to houses and places than faces and objects, including a region in the parahippocampal gyrus, the parahippocampal place area (PPA; Epstein & Kanwisher, 1998); and regions that respond more strongly to body parts than faces and objects, including a region near the MT called the extrastriate body area (EBA; Downing et al., 2001); and a region in the fusiform gyrus, the fusiform body area (FBA; Schwarzlose et al., 2005, or OTS-limbs, Weiner and GrillSpector, 2011). Nevertheless, many of these object-selective and category-selective re gions respond to more than one object category and also respond strongly to object frag ments (Grill-Spector et al., 1998b; Lerner et al., 2001, 2008). This suggests that caution is needed when interpreting the nature of the selective responses. It is possible that the un derlying representation is perhaps of object parts, features, and/or fragments and not of whole objects or object categories. Findings of category-selective regions in the human brain initiated a fierce debate about the (p. 14) principles of functional organization in the ventral stream. 
Should one consider only the maximal responses to the preferred category, or do the nonmaximal responses also carry information? How abstract is the information represented in these regions? For example, is category information represented in these regions, or are low-level visual features that are associated with categories represented? I address these questions in detail in the later section, Debates about the Nature of Functional Organization in the Human Ventral Stream.

Cue-Invariant Responses in the Lateral Occipital Complex

Although findings of object-selective responses in the human brain were suggestive of the involvement of these regions in processing objects, there are many differences between objects and scrambled objects (or objects and texture patterns). Objects have a shape, surfaces, and contours; they are associated with a meaning and semantic information; and they are generally more interesting than texture patterns. Each of these factors may contribute to the higher fMRI response to objects than to controls. Further, differences in low-level visual properties across objects and controls may be driving differences in response amplitudes.

Figure 2.2 Selective responses to objects across multiple visual cues across the lateral occipital complex. Statistical maps of selective responses to objects from luminance, stereo, and motion information in a representative subject. All maps were thresholded at p < 0.005, voxel level, and are shown on the inflated right hemisphere of a representative subject. (a) Luminance objects > scrambled luminance objects. (b) Objects generated from random dot stereograms vs. structureless random dot stereograms (perceived as a cloud of dots). (c) Objects generated from dot motion vs. the same dots moving randomly. Visual meridians are represented by the red (upper), blue (horizontal), and green (lower) lines. White contour: motion-selective region, MT. (Adapted from Vinberg & Grill-Spector, 2008.)

Converging evidence from several studies revealed an important aspect of coding in the LOC: it responds to object shape, not low-level visual features. These studies showed that all LOC subregions (LO and pFus/OTS) respond more strongly when subjects view objects, independently of the type of visual information that defines the object form (Gilaie-Dotan et al., 2002; Grill-Spector et al., 1998a; Kastner et al., 2000; Kourtzi & Kanwisher, 2000, 2001; Vinberg & Grill-Spector, 2008) (Figure 2.2). The LOC responds more strongly to (1) objects defined by luminance compared with luminance textures, (2) objects generated from random dot stereograms compared with structureless random dot stereograms, (3) objects generated from structure from motion relative to random (structureless) motion, and (4) objects generated from textures compared with texture patterns. LOC response to objects is also similar across object formats (gray-scale, line drawings, silhouettes), and the LOC responds to objects delineated by both real and illusory contours (Mendola et al., 1999; Stanley & Rubin, 2003). Kourtzi and Kanwisher (2001) also showed that when objects have the same shape but different contours, there is fMRI adaptation (fMRI-A, indicating a common neural substrate), but there is no fMRI-A when the shared contours are identical but the perceived shape is different, suggesting that the LOC responds to global shape rather than local contours (see also Kourtzi et al., 2003; Lerner et al., 2002). Overall, these studies provided fundamental knowledge showing that LOC activation is driven by shape rather than low-level visual information.

More recently, we examined whether LOC response to objects is driven by their global shape or their surfaces and whether LOC subregions are sensitive to border ownership. One open question in object recognition is whether the region in the image that belongs to the object is first segmented from the rest of the image (figure–ground segmentation) and then recognized, or whether knowing the shape of an object aids its segmentation (Nakayama et al., 1995; Peterson & Gibson, 1994a, 1994b). To address these questions, we scanned subjects while they viewed stimuli that were matched for their low-level information but generated different percepts. Conditions included: (1) a flat object in front of a flat background object, (2) a flat surface with a shaped hole (same shape as the object) in front of a flat background, (3) two flat surfaces without shapes, (4) local edges (created by scrambling the object contour) in front of a background, or (5) random dot stimuli with no structure (Vinberg & Grill-Spector, 2008) (Figure 2.3a). Note that conditions 1 and 2 both contain a shape, but only condition 1 contains an object.
We repeated the experiment twice, once with random dots that were presented stereoscopically and once with random dots that moved, to determine whether the pattern of results varied across stereo and motion cues. We found that LOC responses (both LO and pFus/OTS) were higher for objects and shaped holes than for surfaces, local edges, or random stimuli (see Figure 2.3b). We observed these results for both motion and stereo cues. In contrast, LOC responses were not higher for surfaces than for random stimuli and were not higher for local edges than for random stimuli. Thus, adding either local edge information or global surface information does not increase LOC response. However, adding a global shape produces a significant increase in LOC response. These results provide clear evidence that cue-invariant responses in the LOC are driven by object shape, rather than by global surface information or local edge information.

Additional studies revealed that the LOC is also sensitive to border ownership (Appelbaum et al., 2006; Vinberg & Grill-Spector, 2008). Specifically, LO and pFus/OTS responses were higher for objects (shapes presented in the foreground) than for the same shapes when they defined holes in the foreground. Since objects and holes had the same shape, the only difference between the objects and the holes was the border ownership of the contour defining the shape. In the former case, the border belongs to the object, and in the latter case, it belongs to the flat surface in which the hole is punched. Interestingly, this higher response to objects than holes was a unique characteristic of LOC subregions and did not occur in other visual regions (see Figure 2.3). This result suggests that the LOC prefers shapes (and contours) when they define the figure region. One implication of this result is that the same cortical machinery determines both what the object in the visual input is and which region of the visual input is the figure region corresponding to the object.

Neural Bases of Invariant Object Recognition

Figure 2.3 Responses to shape, edges, and surfaces across the ventral stream. (a) Schematic illustration of experimental conditions. Stimuli were generated from either motion or stereo information alone and had no luminance edges or surfaces (except for the screen border, which was present during the entire experiment, including blank baseline blocks). For illustration purposes, darker regions indicate front surfaces. From left to right: Object: object on the front surface in front of a flat background plane. Hole: shaped hole on the front surface in front of a flat background. Edges: disconnected edges in front of a flat background; edges were generated by scrambling the shape contours. Surfaces: two semitransparent flat surfaces at different depths. Random: random stimuli with no coherent structure, edges, global surfaces, or global shape. Random stimuli had the same relative disparity or depth range as other conditions. See examples of stimuli: http://www-psych.stanford.edu/~kalanit/jnpstim/. (b) Responses to objects, holes, edges, and global surfaces across the visual ventral processing hierarchy. Responses: mean ± SEM across eight subjects. O: object; H: hole; S: surfaces; E: edges; R: random. Diamonds: significantly different from random at p < 0.05. (Adapted with permission from Vinberg & Grill-Spector, 2008.)

The literature reviewed so far provides accumulating evidence that the LOC is involved in processing object form. Given the role of the LOC in object perception, the next question one may ask is: how does it deal with the variability in objects’ appearance? There are many factors that can affect the appearance of objects. Changes in object appearance can occur as a result of the object being at different locations relative to the observer, which will affect the retinal projection of the object in terms of its size and position. Also, the two-dimensional (2D) projection of a three-dimensional (3D) object on the retina varies considerably owing to changes in its rotation and viewpoint relative to the observer. Other changes in appearance occur because of differential illumination conditions, which affect the object’s color, contrast, and shadowing. Nevertheless, humans are able to recognize objects across large changes in their appearance, which is referred to as invariant object recognition.

A central topic of research in the study of object recognition is understanding how invariant recognition is accomplished. One view suggests that invariant object recognition is accomplished because the underlying neural representations are invariant to the appearance of objects. Thus, there will be similar neural responses even when the appearance of an object changes considerably. One means by which this can be achieved is by extracting from the visual input features or fundamental elements (such as geons; Biederman, 1987) that are relatively insensitive to changes in objects’ appearance. According to one influential model (the recognition by components [RBC] model; Biederman, 1987), objects are represented by a library of geons (that are easy to detect in many viewing conditions) and their spatial relations. Other theories suggest that invariance may be generated through a sequence of computations across a hierarchically organized processing stream in which the level of sensitivity to object transformation decreases from one level of processing to the next. For example, at the lowest level, neurons code local features, and in higher levels of the processing stream, neurons respond to more complex shapes and are less sensitive to changes in position and size (Riesenhuber & Poggio, 1999).

Neuroimaging studies of invariant object recognition found differential sensitivity across the ventral stream to object transformations such as size, position, illumination, and viewpoint. Intermediate regions such as LO show higher sensitivity to image transformations than higher level regions such as pFus/OTS.
Notably, accumulating evidence from many studies suggests that at no point in the ventral stream are neural representations entirely invariant to object transformations. These results support an account in which invariant recognition is supported by a pooled response across neural populations that are sensitive to object transformations. One way in which this can be accomplished is by a neural code that contains independent sensitivity to object information and object transformation (DiCarlo & Cox, 2007). For example, neurons may be sensitive to both object category and object position. As long as the categorical preference is retained across object transformations, invariant object information can be extracted.

Object and Position Information in the Lateral Occipital Complex

Of the object transformations that the recognition system needs to overcome, size and position invariance are thought to be accomplished in part by an increase in the size of neural receptive fields along the visual hierarchy. That is, as one ascends the visual hierarchy, neurons respond to stimuli across a larger part of the visual field. At the same time, a more complex visual stimulus is necessary to elicit significant responses in neurons (e.g., shapes instead of oriented lines). Findings from electrophysiology suggest that even at the highest stages of the visual hierarchy, neurons retain some sensitivity to object location and size (although electrophysiology reports vary significantly about the degree of position sensitivity of IT neurons; DiCarlo & Maunsell, 2003; Op De Beeck & Vogels, 2000; Rolls, 2000). A related issue is whether position sensitivity of neurons in higher visual areas manifests as an orderly, topographic representation of the visual field.

Several studies documented sensitivity to both eccentricity and polar angle in distinct ventral stream regions. Both object-selective and category-selective regions in the ventral stream respond to objects presented at multiple positions and sizes. However, the amplitude of response to an object varies across different retinal positions. The LO, pFus/OTS, and category-selective regions (e.g., FFA, PPA) respond more strongly to objects presented in the contralateral compared with the ipsilateral visual field (Grill-Spector et al., 1998b; Hemond et al., 2007; McKyton & Zohary, 2007). Some regions (LO and EBA) also respond more strongly to objects presented in the lower visual field (Sayres & Grill-Spector, 2008; Schwarzlose et al., 2008). Responses also vary with eccentricity: the FFA and the VWFA respond more strongly to centrally presented stimuli, and the PPA responds more strongly to peripherally presented stimuli (Hasson et al., 2002, 2003; Levy et al., 2001; Sayres & Grill-Spector, 2008). Further, more recently, Arcaro and colleagues discovered that the PPA contains two visual field map representations (Arcaro et al., 2009).

Using fMRI-A, my colleagues and I have shown that the pFus/OTS, but not the LO, exhibits some degree of insensitivity to objects’ size and position (Grill-Spector et al., 1999). fMRI-A is a method that allows characterization of the sensitivity of neural representations to stimulus transformations at a subvoxel resolution.
fMRI-A is based on findings from single-unit electrophysiology showing that when objects repeat, there is a stimulus-specific decrease in IT cells’ response to the repeated image, but not to other object images (Miller et al., 1991; Sawamura et al., 2006). Similarly, fMRI signals in higher visual regions show a stimulus-specific reduction (fMRI-A) in response to repetition of identical object images (Grill-Spector et al., 1999, 2006a; Grill-Spector & Malach, 2001). We showed that fMRI-A can be used to test the sensitivity of neural responses to object transformation by adapting cortex with a repeated presentation of an identical stimulus and examining adaptation effects when the stimulus is changed along an object transformation (e.g., changing its position). If the response remains adapted, it indicates that neurons are insensitive to the change. However, if the response returns to the initial level (i.e., recovers from adaptation), it indicates sensitivity to the change (Grill-Spector & Malach, 2001).

Using fMRI-A, we found that repeated presentation of the same face or object at the same position and size produces reduced fMRI activation. This is thought to reflect stimulus-specific neural adaptation. Presenting the same face or object in different positions in the visual field or at different sizes also produces fMRI-A in pFus/OTS and FFA, indicating insensitivity to object size and position in the range we tested (Grill-Spector et al., 1999; see also Vuilleumier et al., 2002). This result is consistent with electrophysiology findings showing that IT neurons that respond similarly to stimuli at different positions in the visual field also show adaptation when the same object is shown in different positions (Lueschow et al., 1994). In contrast, the LO recovered from fMRI-A to images of the same face or object when presented at different sizes or positions. This indicates that the LO is sensitive to object position and size.

Recently, several groups examined the sensitivity of the distributed response across the visual stream to object category and object position (Sayres & Grill-Spector, 2008; Schwarzlose et al., 2008) and also object identity and object position (Eger et al., 2008). These studies used multivoxel pattern analyses (MVPA) and classifier methods developed in machine learning to examine what information is present in the distributed responses across voxels in a cortical region. The distributed response can carry different information from the mean response of a region of interest (ROI) when there is variation across voxel responses.

To examine sensitivity to position information, several studies examined whether distributed response patterns to the same object category (or object exemplar) are the same (or different) when the same stimulus is presented in a different position in the visual field. In MVPA, researchers typically split the data into two independent sets and examine the cross-correlation between the distributed responses to the same (or different) stimulus in the same (or different) position across the two datasets. This gives a measure of the sensitivity of distributed responses to object information and position. When responses are position invariant, there is a high correlation between the distributed responses to the same object category (or exemplar) at different positions. When responses are sensitive to position, there is a low correlation between responses to the same object category (or exemplar) at different positions.

When exemplars from the same object category are shown in the same position in the visual field, LO responses are reliable (or positively correlated).
Surprisingly, showing objects from the same category, but at a different position, significantly reduced the correlation between activation patterns (Figure 2.4, first vs. third bars), and this reduction was larger than that produced by changing the object category in the same position (see Figure 2.4, first vs. second bars). Importantly, position and category effects were independent because there were no significant interactions between position and category (all F values < 1.02, all p values > 0.31). Thus, changing both object category and position produced maximal decorrelation between distributed responses (see Figure 2.4, fourth bar).
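
The split-half correlation logic described above can be sketched on synthetic data. This is only an illustration, not the analysis pipeline of the cited studies: the voxel count, the noise level, the additive combination of a category pattern and a position pattern, and the names `simulate_half` and `split_half_correlations` are all assumptions introduced here. The position pattern is deliberately given a larger amplitude than the category pattern so that the toy data mimic the qualitative ordering reported for LO.

```python
import numpy as np

def simulate_half(rng, cat_patterns, pos_patterns, noise_sd):
    """One independent half of the data: (category, position) -> voxel pattern.

    Assumption: each pattern is the sum of a category component, a position
    component, and independent measurement noise.
    """
    half = {}
    for c, cat in enumerate(cat_patterns):
        for p, pos in enumerate(pos_patterns):
            half[(c, p)] = cat + pos + rng.normal(0.0, noise_sd, size=cat.shape)
    return half

def split_half_correlations(half_a, half_b):
    """Mean between-half correlation for each of the four comparison types."""
    collected = {}
    for (c1, p1), a in half_a.items():
        for (c2, p2), b in half_b.items():
            key = ("same_cat" if c1 == c2 else "diff_cat",
                   "same_pos" if p1 == p2 else "diff_pos")
            collected.setdefault(key, []).append(np.corrcoef(a, b)[0, 1])
    return {key: float(np.mean(vals)) for key, vals in collected.items()}

rng = np.random.default_rng(0)
n_voxels = 2000                                            # hypothetical ROI size
cat_patterns = rng.normal(0.0, 1.0, size=(3, n_voxels))    # three categories
pos_patterns = rng.normal(0.0, 1.5, size=(2, n_voxels))    # two positions
half_a = simulate_half(rng, cat_patterns, pos_patterns, noise_sd=1.0)
half_b = simulate_half(rng, cat_patterns, pos_patterns, noise_sd=1.0)

corr = split_half_correlations(half_a, half_b)
for key in sorted(corr):
    print(key, round(corr[key], 2))
```

Under these assumptions, a position-invariant region would correspond to a small position component, in which case the same-category correlation would stay high across positions.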


Figure 2.4 Mean cross correlations between LO distributed responses across two independent halves of the data for the same or different category at the same or different position in the visual field. Position effects: LO response patterns to the same category were substantially more similar if they were presented at the same position versus different positions (first and third bars, p < 10⁻⁷). Category effects: the mean correlation was higher for same-category response patterns than for different-category response patterns when presented in the same retinotopic position (first two bars; p < 10⁻⁴). Error bars indicate SEM across subjects. (Adapted with permission from Sayres & Grill-Spector, 2008.)

Is the position information in the LO a consequence of an orderly retinotopic map (similar to retinotopic organization in lower visual areas)? By measuring retinotopic maps in the LO using standard traveling wave paradigms (Sayres & Grill-Spector, 2008; Wandell, 1999), we found a continuous mapping of the visual field in the LO in terms of both eccentricity and polar angle. This retinotopic map contained an over-representation of the contralateral and lower visual field (more voxels preferred these visual field positions than ipsilateral and upper visual fields). Although we did not consistently find a single visual field map (a single hemifield or quarterfield representation) in LO, it overlapped the visual map named LO2 (Larsson & Heeger, 2006) and extended inferior to it. This suggests that there is retinotopic information in the LO, which explains the position sensitivity found in the MVPA.

A related recent study examined position sensitivity using pattern analysis more broadly across the ventral stream, providing additional evidence for a hierarchical organization across the ventral stream (Schwarzlose et al., 2008). Schwarzlose and colleagues found that distributed responses to a particular object category (faces, body parts, or scenes) were similar across positions in ventral temporal regions (e.g., pFus/OTS and FBA) but changed across positions in occipital regions (e.g., EBA and LO). Thus, accumulating evidence from both fMRI-A and pattern analysis studies suggests a hierarchy of representations in the human ventral stream through which representations become less sensitive to object position as one ascends the visual hierarchy.

Implications for Theories of Object Recognition

It is important to relate imaging results to the concept of position-invariant representations of objects and object categories. What exactly is implied by the term invariance depends on the scientific context. In some instances, this term is taken to reflect a neural representation that is abstracted so as to be independent of viewing conditions. A fully invariant representation, in this meaning of the term, is expected to be completely independent of retinal position information (Biederman & Cooper, 1991). However, in the context of studies of visual cortex, the term is more often considered to be a graded phenomenon, in which neural populations are expected to retain some degree of sensitivity to visual transformations (like position changes) but in which stimulus selectivity is preserved across these transformations (DiCarlo & Cox, 2007; Kobatake & Tanaka, 1994; Rolls & Milward, 2000). In support of this view, a growing literature suggests that maintaining local position information within a distributed neural representation may actually aid invariant recognition in several ways (DiCarlo & Cox, 2007; Dill & Edelman, 2001; Sayres & Grill-Spector, 2008). First, maintaining separable information about position and category may also allow maintaining information about the structural relationships between object parts (Edelman & Intrator, 2000). Second, separable position and object information may provide a robust way to generate position invariance by using a population code. According to this model, objects are represented as manifolds in a high-dimensional space spanned by a population of neurons. The separability of position and object information may allow for fast decisions based on linear computations (e.g., linear discriminant functions) to determine the object identity (or category) across positions (see DiCarlo & Cox, 2007).
Finally, separable object and position information may allow concurrent localization and recognition of objects, that is, recognizing what the object is and also where it is.
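
The idea that separable position and category information permits a linear readout across positions can be illustrated with a toy population. This is a sketch under strong assumptions, not a model taken from the cited work: the population response is assumed to be the sum of a category vector, a position vector, and noise; the population size, trial count, and the name `population_response` are hypothetical; and the readout is a simple difference-of-means linear discriminant fitted at one position and tested at another.

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_trials = 200, 50   # hypothetical population size and trial count

# Assumption: category and position contribute additively (separably).
category_axis = rng.normal(0.0, 1.0, size=(2, n_neurons))
position_axis = rng.normal(0.0, 1.0, size=(2, n_neurons))

def population_response(category, position, n):
    """Simulate n noisy population responses to one category at one position."""
    noise = rng.normal(0.0, 1.0, size=(n, n_neurons))
    return category_axis[category] + position_axis[position] + noise

# Fit a difference-of-means linear discriminant using position 0 only.
train_a = population_response(0, 0, n_trials)
train_b = population_response(1, 0, n_trials)
mean_a, mean_b = train_a.mean(axis=0), train_b.mean(axis=0)
w = mean_a - mean_b                    # discriminant direction
b = w @ (mean_a + mean_b) / 2.0        # threshold halfway between class means

def classify(x):
    """Return 0 or 1 for each row of x using the linear discriminant."""
    return np.where(x @ w - b > 0, 0, 1)

# Test the same discriminant at the untrained position 1.
test_a = population_response(0, 1, n_trials)
test_b = population_response(1, 1, n_trials)
predicted = np.concatenate([classify(test_a), classify(test_b)])
truth = np.concatenate([np.zeros(n_trials, int), np.ones(n_trials, int)])
accuracy = float(np.mean(predicted == truth))
print(f"cross-position accuracy: {accuracy:.2f}")
```

In this toy case the individual units remain position sensitive, yet the category boundary learned at one position still separates the categories at the other, because the position change shifts both categories by the same vector.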

Evidence for Viewpoint Sensitivity Across the Lateral Occipital Complex

Another source of change in object appearance that merits separate consideration is change across rotation in depth. In contrast to position or size changes, for which invariance may be achieved by a linear transformation, the shape of objects changes with depth rotation. This is because the visual system receives 2D retinal projections of 3D objects. Some theories suggest that view-invariant recognition across object rotations or changes in the observer viewing angle is accomplished by largely view-invariant representations of objects (generalized cylinders, Marr, 1980; the RBC model, Biederman, 1987). That is, the underlying neural representations respond similarly to an object across its views. However, other theories suggest that object representations are view dependent, that is, consist of several 2D views of an object (Bulthoff et al., 1995; Bulthoff & Edelman, 1992; Edelman & Bulthoff, 1992; Poggio & Edelman, 1990; Tarr & Bulthoff, 1995; Ullman, 1989). Invariant object recognition is accomplished by interpolation across these views (Logothetis et al., 1995; Poggio & Edelman, 1990; Ullman, 1989) or by a distributed neural code across view-tuned neurons (Perrett et al., 1998).

Single-unit electrophysiology studies in primates indicate that most neurons in monkey IT cortex are view dependent (Desimone et al., 1984; Logothetis et al., 1995; Perrett, 1996; Vogels & Biederman, 2002; Wang et al., 1996), with a small minority (5–10 percent) of neurons showing view-invariant responses across object rotations (Booth & Rolls, 1998; Logothetis et al., 1995).

In humans, results vary considerably. Short-lagged fMRI-A experiments, in which the test stimulus is presented immediately after the adapting stimulus (Grill-Spector et al., 2006a), suggest that object representations in the lateral occipital complex are view dependent (Fang et al., 2007; Gauthier et al., 2002; Grill-Spector et al., 1999; but see Valyear et al., 2006). However, long-lagged fMRI-A experiments, in which many intervening stimuli occur between the test and adapting stimulus (Grill-Spector et al., 2006a), have provided some evidence for view-invariant representations in the ventral LOC, especially in the left hemisphere (James et al., 2002; Vuilleumier et al., 2002), and in the PPA (Epstein et al., 2008). Also, a recent study showed that the distributed LOC responses to objects remained stable across 60-degree rotations (Eger et al., 2008). Presently, there is no consensus across experimental findings on the degree to which ventral stream representations are view dependent or view invariant. These variable results may reflect differences in the neural representations depending on object category and cortical region, or methodological differences across studies (e.g., level of object rotation and fMRI-A paradigm used).
To address these differential findings, in a recent study we used a parametric approach to investigate sensitivity to object rotation and used a computational model to link putative neural tuning to the resultant fMRI measurements (Andresen et al., 2009). The parametric approach allows a richer characterization of rotation sensitivity because it measures the degree of sensitivity to rotations rather than characterizing representations as one of two possible alternatives: “invariant” or “not invariant.” We used fMRI-A to measure viewpoint sensitivity as a function of the rotation level for two object categories: animals and vehicles. Overall, we found sensitivity to object rotation in the LOC. However, there were differences across categories and regions. First, there was higher sensitivity to vehicle rotation than animal rotation. Rotations of 60 degrees produced a complete recovery from adaptation for vehicles, but rotations of 120 degrees were necessary to produce recovery from adaptation for animals (Figure 2.5). Second, we found evidence for over-representation of the front view of animals in the right pFus/OTS: its responses to animals were higher for the front view than the back view (compare black and gray circles in Figure 2.5b, right). In addition, fMRI-A effects across rotation varied according to the adapting view (see Figure 2.5b, right). When adapting with the back view of animals, we found recovery from adaptation for rotations of 120 degrees or larger, but when adapting with the front view of animals, there was no significant recovery from adaptation across rotations. One interpretation is that there is less sensitivity to rotation when adapting with front views than back views of animals. However, subjects’ behavioral performance in a discrimination task across object rotations showed that they are equally sensitive to rotations (performance decreases with rotation level) whether rotations are relative to the front or back of an animal (Andresen et al., 2009), suggesting that this interpretation is unlikely. Alternatively, the apparent rotation cross-adaptation may be due to lower responses for back views of animals. That is, the apparent adaptation across rotation from the front view to the back view is driven by lower responses to the back view rather than adaptation across 180-degree rotations.

Figure 2.5 LO and pFus/OTS responses during fMRI-A experiments of rotation sensitivity. Each line represents response after adapting with a front (dashed black) or back (solid gray) view of an object. The nonadapted response is indicated by diamonds (black for front view and gray for back view). The open circles indicate significant adaptation, lower than nonadapted, p < 0.05, paired t-test across subjects. (a) Vehicle data. (b) Animal data. Responses are plotted relative to a blank fixation baseline. Error bars indicate SEM across eight subjects. (Adapted with permission from Andresen, Vinberg, & Grill-Spector, 2009.)

To better characterize the underlying representations and examine which representations may lead to our observed results, we simulated putative neural responses in a voxel and predicted the resultant blood oxygen level dependent (BOLD) responses. In the model, each voxel contains a mixture of neural populations, each tuned to a different object view (Andresen et al., 2009) (Figure 2.6). BOLD responses were modeled to be proportional to the sum of responses across all neural populations. We simulated the BOLD responses in fMRI-A experiments. Results of the simulations indicate that two main parameters affected the pattern of fMRI data: (1) the view tuning width of the neural population and (2) the proportion of neurons in a voxel that prefer a specific object view.

Figure 2.6a shows the response characteristics of a model of a putative voxel containing a mixture of view-dependent neural populations tuned to different object views, in which the distribution of neurons tuned to different object views is uniform. In this model, narrower neural tuning to object view (left) results in recovery from fMRI-A for smaller rotations than wider view tuning (right). Responses to front and back views are identical when there is no adaptation (see Figure 2.6a, diamonds), and the pattern of adaptation as a function of rotation is similar when adapting with the front or back views (see Figure 2.6a). Such a model provides an account of responses to vehicles across object-selective cortex (as measured with fMRI), and for animals in the LO. Thus, this model suggests that the difference between the representation of animals and vehicles in the LO is likely due to a narrower population view tuning for vehicles than animals (a tuning width of σ < 40° produces complete recovery from adaptation for rotations larger than 60 degrees, as observed for vehicles).

Figure 2.6b shows simulation results when there is a prevalence of neurons tuned to the front view of objects. This simulation shows higher BOLD responses to frontal views without adaptation (gray vs. black diamonds) and a flatter profile of fMRI-A across rotations when adapting with the front view. These simulation results are consistent with our observations in pFus/OTS and indicate that the differential recovery from adaptation as a function of the adapting animal view may be a consequence of a larger neural population tuned to front views of animals.
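
The kind of simulation described above can be sketched in a few lines. This is a minimal illustration, not the published model code. The six Gaussian populations spaced 60° apart and the summed BOLD response follow the text and the Figure 2.6 caption. The adaptation rule (each population is suppressed in proportion to its response to the adapting view), the adaptation strength of 0.5, the way the front-view bias is implemented as a single larger weight, and all function names are assumptions made here for illustration.

```python
import numpy as np

PREFERRED_VIEWS = np.arange(0, 360, 60)   # six populations, 60 degrees apart

def circular_distance(a, b):
    """Smallest absolute angular difference between views, in degrees."""
    return np.abs((np.asarray(a) - np.asarray(b) + 180.0) % 360.0 - 180.0)

def population_responses(view, sigma):
    """Gaussian view tuning of each population to a single view."""
    d = circular_distance(view, PREFERRED_VIEWS)
    return np.exp(-d**2 / (2.0 * sigma**2))

def bold(test_view, sigma, weights, adapt_view=None, strength=0.5):
    """Voxel response: weighted sum over populations, optionally after adaptation."""
    r_test = population_responses(test_view, sigma)
    gain = np.ones_like(r_test)
    if adapt_view is not None:
        # Assumed rule: suppression scales with how strongly the adapting
        # view drove each population.
        gain = 1.0 - strength * population_responses(adapt_view, sigma)
    return float(np.sum(weights * gain * r_test))

def recovery_curve(adapt_view, sigma, weights, rotations=(0, 60, 120, 180)):
    """Adapted response divided by nonadapted response at each rotation."""
    curve = []
    for rot in rotations:
        test_view = (adapt_view + rot) % 360
        curve.append(bold(test_view, sigma, weights, adapt_view)
                     / bold(test_view, sigma, weights))
    return np.array(curve)

# (a) Uniform distribution of view preferences, narrow vs. wide tuning.
uniform = np.full(6, 1.0 / 6.0)
narrow = recovery_curve(0, sigma=20.0, weights=uniform)
wide = recovery_curve(0, sigma=60.0, weights=uniform)
print("narrow tuning:", np.round(narrow, 2))
print("wide tuning:  ", np.round(wide, 2))

# (b) More neurons tuned to the front view (0 degrees); 1.4 is one of the
# front-to-back ratios mentioned in the Figure 2.6 caption.
front_biased = np.ones(6)
front_biased[0] = 1.4
front_biased /= front_biased.sum()
front_response = bold(0, 40.0, front_biased)
back_response = bold(180, 40.0, front_biased)
print("nonadapted front vs. back:", round(front_response, 3), round(back_response, 3))
```

A ratio near 1 in `recovery_curve` means the response has recovered from adaptation, and a ratio well below 1 means it remains adapted. With these assumed parameters, the narrowly tuned voxel recovers at a 60° rotation while the broadly tuned voxel does not, and the front-biased voxel responds more to the front view than to the back view even without adaptation.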


Implications for Theories of Object Recognition

Figure 2.6 Simulations predicting fMRI responses of putative voxels containing a mixture of view-dependent neural populations. Left: schematic illustration of the view tuning and distribution of neural populations tuned to different views in a voxel. Right: result of model simulations illustrating the predicted fMRI-A data. In all panels, the model includes six Gaussians tuned to specific views around the viewing circle, spaced 60° apart. Across columns, the view tuning width varies; across rows, the distribution of neural populations preferring specific views varies. Diamonds, responses without adaptation; black, back view; gray, front view; lines, response after adaptation with a front view (dashed gray line) or back view (solid black line). (a) Mixture of view-dependent neural populations that are equally distributed in a voxel. Narrower tuning (left) shows recovery from fMRI-A for smaller rotations than wider view tuning (right). This model predicts the same pattern of recovery from adaptation when adapting with the front or back view. (b) Mixture of view-dependent neural populations in a voxel with a higher proportion of neurons that prefer the front view. The number on the right indicates the ratio between the percentages of neurons tuned to the front vs. back view. Top row: ratio = 1.2; bottom row: ratio = 1.4. Because there are more neurons tuned to the front view in this model, it predicts higher BOLD responses to frontal views without adaptation (gray vs. black diamonds) and a flatter profile of fMRI-A across rotations when adapting with the front view. (Adapted with permission from Andresen, Vinberg, & Grill-Spector, 2009.)

Overall, recent results provide empirical evidence for view-dependent object representation across human object-selective cortex that is evident with both standard fMRI and fMRI-A measurements. These data provide important empirical constraints for theories of object recognition and highlight the importance of parametric manipulations for capturing neural selectivity to any type of stimulus transformation. These findings also generate new questions. For example, if there is no view-invariant neural representation in the human ventral temporal cortex, how is view-invariant object recognition accomplished?

One possibility is that view-invariant recognition is achieved by a population code across neurons, in which no single neuron is view invariant, but the population responses to views of one object are separable from the responses to views of other objects (Perrett et al., 1998; DiCarlo & Cox, 2007). Thus, the distributed pattern of responses across neurons separates views of one object from views of another object. Another possibility is that downstream neurons in the anterior temporal lobe or prefrontal cortex read out the information from ventral temporal cortex and that these downstream neurons contain view-invariant representations supporting behavior (Freedman et al., 2003, 2008; Quiroga et al., 2005, 2008).
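The population-code possibility can be made concrete with a toy example. The sketch below is illustrative only and assumes everything not stated in the text: three objects, six view-tuned units per object, a tuning width of 40°, and a weak response of each unit to non-preferred objects (`CROSS_TALK`). Under these assumptions no single unit identifies an object across views, yet a simple linear readout that pools each object's units does.

```python
import numpy as np

PREFERRED_VIEWS = np.arange(0, 360, 60)  # assumed: six view-tuned units per object
N_OBJECTS = 3                            # assumed toy number of objects
SIGMA = 40.0                             # assumed view tuning width (deg)
CROSS_TALK = 0.3                         # assumed relative response to non-preferred objects

def unit_responses(obj, view):
    """Responses of all units, shape (N_OBJECTS, n_views), to one object at one view.

    Each unit has Gaussian view tuning; units preferring other objects
    respond at a reduced level, so single units are ambiguous.
    """
    d = np.abs(view - PREFERRED_VIEWS) % 360.0
    d = np.minimum(d, 360.0 - d)
    tuning = np.exp(-0.5 * (d / SIGMA) ** 2)
    responses = CROSS_TALK * np.tile(tuning, (N_OBJECTS, 1))
    responses[obj] = tuning
    return responses

def read_out_object(responses):
    """Linear readout: pool the units preferring each object and pick the largest sum."""
    return int(np.argmax(responses.sum(axis=1)))
```

In this toy model the unit preferring object 0 at 0° responds more strongly to object 1 at 0° than to object 0 at 120°, so its response alone is not diagnostic of identity, whereas `read_out_object(unit_responses(obj, view))` returns `obj` at every view.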

Debates about the Nature of Functional Organization in the Human Ventral Stream

So far, we have considered general computational principles that are required by any object recognition system. Nevertheless, it is possible that some object classes or domains require specialized computations. The rest of this chapter examines functional specialization in the ventral stream that may be linked to these putative “domain-specific” computations. As illustrated in Figure 2.1, several regions in the ventral stream exhibit higher responses to particular object categories such as places, faces, and body parts compared with other object categories. Findings of category-selective regions initiated a fierce debate about the principles of functional organization in the ventral stream. Are there regions in the cortex that are specialized for any object category? Is there something special about computations relevant to specific categories that generates specialized cortical regions for these computations? That is, perhaps some general processing is applied to all objects, but some computations may be specific to certain domains and may require additional brain resources.

In explaining the pattern of functional selectivity in the ventral stream, four prominent views have emerged. The main debate centers on the question of whether regions that elicit maximal response for a category should be treated as a module for the representation of that category, or whether they are part of a more general object recognition system.

Handful of Category-Specific Modules and a General Purpose Region for Processing All Other Objects

Kanwisher and coworkers (Kanwisher, 2000; Op de Beeck et al., 2008) suggested that the ventral temporal cortex contains a limited number of modules specialized for the recognition of special object categories such as faces (in the FFA), places (in the PPA), and body parts (in the EBA and FBA). The remaining object-selective cortex (LOC), which shows little selectivity for particular object categories, is a general-purpose mechanism for perceiving any kind of visually presented object or shape. The underlying hypothesis is that there are few “domain-specific modules” that perform computations specific to these classes of stimuli beyond what would be required from a general object recognition system. For example, faces, like other objects, need to be recognized across variations in their appearance (a domain-general process). However, given the importance of face processing for social interactions, there are aspects of face processing that are unique. Specialized face processing may include identifying faces at the individual level (e.g., John vs. Harry), extracting gender information, gaze, expression, and so forth. These unique face-related computations may be implemented in face-selective regions.

Process Maps

Tarr and Gauthier (2000) proposed that object representations are clustered according to the type of processing that is required, rather than according to their visual attributes. It is possible that different levels of processing may require dedicated computations that are performed in localized cortical regions. For example, faces are usually recognized at the individual level (e.g., “Bob Jacobs”), but many objects are typically recognized at the category level (e.g., “a horse”). Following this reasoning, and evidence that objects of expertise activate the FFA more than other objects (Gauthier et al., 1999, 2000), Gauthier, Tarr, and their colleagues have suggested that the FFA is a region for subordinate identification of any object category that is automated by expertise (Gauthier et al., 1999, 2000; Tarr & Gauthier, 2000).

Distributed Object Form Topography

Haxby et al. (2001) posited an “object form topography” in which occipito-temporal cortex contains a topographically organized representation of shape attributes. The representation of an object is reflected by a distinct pattern of response across all ventral cortex, and this distributed activation produces the visual perception. Haxby et al. showed that the activation patterns for eight object categories were replicable, and that the response to a given category could be determined by the distributed pattern of activation across the ventral temporal cortex. Further, they showed that it is possible to predict what object category subjects viewed even when regions that show maximal activation to a particular category (e.g., the FFA) were excluded (Haxby et al., 2001). Thus, this model suggests that the ventral temporal cortex represents object category information in an overlapping and distributed fashion.

This view is appealing for two reasons. First, a distributed code is a combinatorial code that allows representation of a large number of object categories. Given Biederman’s rough estimate that humans can recognize about 30,000 categories (Biederman, 1987), this provides a neural substrate that has the capacity to represent such a large number of categories. Second, this model posited a provocative view that when considering information in the ventral stream, one needs to consider the weak signals as much as the strong signals because both convey useful information.
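One common way to test whether a distributed pattern identifies the viewed category is a split-half correlation analysis, sketched below. This is a hedged illustration on synthetic data, not a reproduction of the analysis in Haxby et al. (2001): the function names, the noise level, the number of voxels, and the voxel-exclusion rule are all assumptions made here. In a real analysis the excluded voxels would be defined from independent localizer data rather than from the patterns being classified.

```python
import numpy as np

def split_half_accuracy(half_a, half_b):
    """Fraction of categories identified by split-half pattern correlation.

    half_a, half_b: arrays of shape (n_categories, n_voxels) holding the mean
    response pattern per category from two independent halves of the data.
    A category counts as identified when its pattern in half_a correlates
    best with the same category's pattern in half_b.
    """
    # Remove each voxel's mean across categories so that correlations
    # reflect category-specific patterns rather than overall activation.
    a = half_a - half_a.mean(axis=0, keepdims=True)
    b = half_b - half_b.mean(axis=0, keepdims=True)
    n = a.shape[0]
    corr = np.corrcoef(a, b)[:n, n:]  # rows: half_a categories, cols: half_b
    return float(np.mean(np.argmax(corr, axis=1) == np.arange(n)))

def drop_preferring_voxels(half_a, half_b, category):
    """Exclude voxels whose maximal mean response is to the given category."""
    preferred = np.argmax((half_a + half_b) / 2.0, axis=0)
    keep = preferred != category
    return half_a[:, keep], half_b[:, keep]

def make_synthetic_halves(n_categories=8, n_voxels=200, noise=0.5, seed=0):
    """Synthetic data (assumption): a shared category signal plus independent noise."""
    rng = np.random.default_rng(seed)
    signal = rng.normal(size=(n_categories, n_voxels))
    half_a = signal + noise * rng.normal(size=signal.shape)
    half_b = signal + noise * rng.normal(size=signal.shape)
    return half_a, half_b
```

Running `split_half_accuracy` before and after `drop_preferring_voxels` shows, for the synthetic data, that the category can still be read out from the remaining voxels, which mirrors the logic of the exclusion analysis described above.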

Sparsely Distributed Representations of Faces and Body Parts

Recently, using high-resolution fMRI (HR-fMRI), we reported a series of alternating face- and limb-selective activations that are arranged in a consistent spatial organization relative to each other as well as to retinotopic regions and hMT+ (Weiner & Grill-Spector, 2010, 2011, 2013). Specifically, our data illustrate that there is not just one distinct region selective for each category (i.e., a single FFA or FBA) in the ventral temporal cortex, but rather a series of face- and limb-selective clusters that minimally overlap, with a consistent organization relative to one another on a posterior-to-anterior axis on the OTS and fusiform gyrus (FG). Our data also show an interaction between localized cortical clusters and distributed responses across voxels outside these clusters. Our results further illustrate that even in weakly selective voxels outside of these clusters, the distributed responses for faces and limbs are distinct from one another. Nevertheless, there is significantly more face information in the distributed responses in weakly and highly selective voxels than in nonselective voxels, indicating that these subsets of voxels carry different amounts of information.

These data suggest a fourth model, a sparsely distributed organization in the ventral temporal cortex, that mediates the debate between modular and distributed theories of object representation. Sparsely refers to the presence of several face- and limb-selective clusters with a distinct, minimally overlapping organization, and distributed refers to the presence of information in weakly selective and nonselective voxels outside of these clusters. This sparsely distributed organization is supported by recent cortical connectivity studies indicating a hybrid modular and distributed organization (Borra et al., 2010; Zangenehpour & Chaudhuri, 2005), as well as by theoretical work on sparse distributed networks (Kanerva, 1988).

Presently, there is no consensus in the field about which account best explains ventral stream functional organization.
Much of the debate centers on the degree to which object processing is constrained to discrete modules or involves distributed computations across large stretches of the ventral stream (Op de Beeck et al., 2008). The debate is both about the spatial scale on which computations for object recognition occur and about the fundamental principles that underlie specialization in the ventral stream. On the one hand, domain-specific theories need to address two issues. First, there are multiple foci in the ventral stream that respond more strongly to faces than to objects, so a strong modular account of a single “face module” for face recognition is unlikely. Second, the spatial extent of these putative modules is undetermined, and it is unclear whether each of these category-selective regions corresponds to a visual area. On the other hand, a very distributed and overlapping account of object representation in the ventral stream suffers from the potential problem that in order to resolve category information, the brain may need to read out information present across the entire ventral stream (which is inefficient). Further, the fact that there is information in the distributed response does not mean that the brain uses the information in the same way that an independent classifier does. It is possible that activation in localized regions is more informative for perceptual decisions than the information available across the entire ventral stream (Grill-Spector et al., 2004; Williams et al., 2007). For example, FFA responses predict when subjects recognize faces and birds, but do not predict when subjects recognize houses, guitars, or flowers (Grill-Spector et al., 2004).

The recent sparsely distributed model we proposed attempts to bridge the extreme modular views and the highly distributed and overlapping views of organization of the ventral temporal cortex. One particular appeal of this view is that it is closely tied to the measurements and allows for additional clusters to be incorporated into the model. As scanning resolutions improve for human fMRI studies, the number of clusters is likely to increase, but the alternating nature of face and limb representations is likely to remain in adjacent activations, as also suggested by monkey fMRI (Pinsk et al., 2009).

Open Questions and Future Directions

In sum, neuroimaging research has advanced our understanding of object representations in the human brain. These studies have identified regions involved in object recognition, and have laid fundamental stepping stones in understanding the neural mechanisms underlying invariant object recognition.

However, many questions remain. First, what is the relationship between neural sensitivity to object transformations and behavioral sensitivity to object transformations? Do biases in neural representations produce biases in performance? For example, empirical evidence shows over-representation of the lower visual field in LO. Does this lead to better recognition in the lower than in the upper visual field? Second, what information does the visual system use to build invariant object representations? Third, what computations are implemented in distinct cortical regions involved in object recognition? Does the “aha” moment in recognition involve a specific response in a particular brain region, or does it involve a distributed response across a large cortical expanse? Combining experimental methods such as fMRI and EEG will provide high spatial and temporal resolution, which is critical to addressing this question. Fourth, why do representations of a few categories such as faces or body parts yield local clustered activations, while many other categories (e.g., manmade objects) produce more diffuse and less intense responses across the ventral temporal cortex? Fifth, what is the pattern of connectivity between ventral stream visual regions in the human brain? Although the connectivity in monkey visual cortex has been extensively explored (Moeller et al., 2008; Van Essen et al., 1990), there is little knowledge about connectivity between cortical visual areas in the human ventral stream. This knowledge is necessary for building a model of hierarchical processing in humans and any neural network model of object recognition.

Future directions that combine methodologies, such as psychophysics with fMRI, EEG with fMRI, or diffusion tensor imaging with fMRI, will be instrumental in addressing these fundamental questions.

Acknowledgements

I thank David Andresen, Rory Sayres, Joakim Vinberg, and Kevin Weiner for their contributions to the research summarized in this chapter. This work was supported by NSF grant and NEI grant.

References

Andresen, D. R., Vinberg, J., & Grill-Spector, K. (2009). The representation of object viewpoint in the human visual cortex. NeuroImage, 45, 522–536.

Appelbaum, L. G., Wade, A. R., Vildavski, V. Y., Pettet, M. W., & Norcia, A. M. (2006). Cue-invariant networks for figure and background processing in human visual cortex. Journal of Neuroscience, 26, 11695–11708.

Bar, M., Tootell, R. B., Schacter, D. L., Greve, D. N., Fischl, B., Mendola, J. D., Rosen, B. R., & Dale, A. M. (2001). Cortical mechanisms specific to explicit visual object recognition. Neuron, 29, 529–535.

Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115–147.

Biederman, I., & Cooper, E. E. (1991). Evidence for complete translational and reflectional invariance in visual object priming. Perception, 20, 585–593.

Booth, M. C., & Rolls, E. T. (1998). View-invariant representations of familiar objects by neurons in the inferior temporal visual cortex. Cerebral Cortex, 8, 510–523.

Borra, E., Ichinohe, N., Sato, T., Tanifuji, M., & Rockland, K. S. (2010). Cortical connections to area TE in monkey: Hybrid modular and distributed organization. Cerebral Cortex, 20 (2), 257–270.

Bulthoff, H. H., & Edelman, S. (1992). Psychophysical support for a two-dimensional view interpolation theory of object recognition. Proceedings of the National Academy of Sciences U S A, 89, 60–64.

Bulthoff, H. H., Edelman, S. Y., & Tarr, M. J. (1995). How are three-dimensional objects represented in the brain? Cerebral Cortex, 5, 247–260.

Chao, L. L., Haxby, J. V., & Martin, A. (1999). Attribute-based neural substrates in temporal cortex for perceiving and knowing about objects. Nature Neuroscience, 2, 913–919.

Cohen, L., Dehaene, S., Naccache, L., Lehericy, S., Dehaene-Lambertz, G., Henaff, M. A., & Michel, F. (2000). The visual word form area: Spatial and temporal characterization of an initial stage of reading in normal subjects and posterior split-brain patients. Brain, 123 (2), 291–307.

Culham, J. C., Danckert, S. L., DeSouza, J. F., Gati, J. S., Menon, R. S., & Goodale, M. A. (2003). Visually guided grasping produces fMRI activation in dorsal but not ventral stream brain areas. Experimental Brain Research, 153, 180–189.

Desimone, R., Albright, T. D., Gross, C. G., & Bruce, C. (1984). Stimulus-selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.

DiCarlo, J. J., & Cox, D. D. (2007). Untangling invariant object recognition. Trends in Cognitive Science, 11, 333–341.

DiCarlo, J. J., & Maunsell, J. H. (2003). Anterior inferotemporal neurons of monkeys engaged in object recognition can be highly sensitive to object retinal position. Journal of Neurophysiology, 89, 3264–3278.

Dill, M., & Edelman, S. (2001). Imperfect invariance to object translation in the discrimination of complex shapes. Perception, 30, 707–724.

Downing, P. E., Jiang, Y., Shuman, M., & Kanwisher, N. (2001). A cortical area selective for visual processing of the human body. Science, 293, 2470–2473.

Edelman, S., & Bulthoff, H. H. (1992). Orientation dependence in the recognition of familiar and novel views of three-dimensional objects. Vision Research, 32, 2385–2400.

Edelman, S., & Intrator, N. (2000). (Coarse coding of shape fragments) + (retinotopy) approximately = representation of structure. Spatial Vision, 13, 255–264.

Eger, E., Ashburner, J., Haynes, J. D., Dolan, R. J., & Rees, G. (2008). fMRI activity patterns in human LOC carry information about object exemplars within category. Journal of Cognitive Neuroscience, 20, 356–370.

Epstein, R., & Kanwisher, N. (1998). A cortical representation of the local visual environment. Nature, 392, 598–601.

Epstein, R. A., Parker, W. E., & Feiler, A. M. (2008). Two kinds of fMRI repetition suppression? Evidence for dissociable neural mechanisms. Journal of Neurophysiology, 99 (6), 2877–2886.

Fang, F., & He, S. (2005). Cortical responses to invisible objects in the human dorsal and ventral pathways. Nature Neuroscience, 8, 1380–1385.

Farah, M. J. (1995). Visual agnosia. Cambridge, MA: MIT Press.

Freedman, D. J., Riesenhuber, M., Poggio, T., & Miller, E. K. (2001). Categorical representation of visual stimuli in the primate prefrontal cortex. Science, 291, 312–316.

Freedman, D. J., Riesenhuber, M., Poggio, T., & Miller, E. K. (2003). A comparison of primate prefrontal and inferior temporal cortices during visual categorization. Journal of Neuroscience, 23, 5235–5246.

Fujita, I., Tanaka, K., Ito, M., & Cheng, K. (1992). Columns for visual features of objects in monkey inferotemporal cortex. Nature, 360, 343–346.

Gauthier, I., Skudlarski, P., Gore, J. C., & Anderson, A. W. (2000). Expertise for cars and birds recruits brain areas involved in face recognition. Nature Neuroscience, 3, 191–197.

Gauthier, I., Tarr, M. J., Anderson, A. W., Skudlarski, P., & Gore, J. C. (1999). Activation of the middle fusiform “face area” increases with expertise in recognizing novel objects. Nature Neuroscience, 2, 568–573.

Gilaie-Dotan, S., Ullman, S., Kushnir, T., & Malach, R. (2002). Shape-selective stereo processing in human object-related visual areas. Human Brain Mapping, 15, 67–79.

Grill-Spector, K. (2003). The neural basis of object perception. Current Opinion in Neurobiology, 13, 159–166.

Grill-Spector, K., Golarai, G., & Gabrieli, J. (2008). Developmental neuroimaging of the human ventral visual cortex. Trends in Cognitive Science, 12, 152–162.

Grill-Spector, K., Henson, R., & Martin, A. (2006a). Repetition and the brain: Neural models of stimulus-specific effects. Trends in Cognitive Science, 10, 14–23.

Grill-Spector, K., Knouf, N., & Kanwisher, N. (2004). The fusiform face area subserves face perception, not generic within-category identification. Nature Neuroscience, 7, 555–562.

Grill-Spector, K., Kushnir, T., Edelman, S., Avidan, G., Itzchak, Y., & Malach, R. (1999). Differential processing of objects under various viewing conditions in the human lateral occipital complex. Neuron, 24, 187–203.

Grill-Spector, K., Kushnir, T., Edelman, S., Itzchak, Y., & Malach, R. (1998a). Cue-invariant activation in object-related areas of the human occipital lobe. Neuron, 21, 191–202.

Grill-Spector, K., Kushnir, T., Hendler, T., Edelman, S., Itzchak, Y., & Malach, R. (1998b). A sequence of object-processing stages revealed by fMRI in the human occipital lobe. Human Brain Mapping, 6, 316–328.

Grill-Spector, K., Kushnir, T., Hendler, T., & Malach, R. (2000). The dynamics of object-selective activation correlate with recognition performance in humans. Nature Neuroscience, 3, 837–843.

Grill-Spector, K., & Malach, R. (2001). fMR-adaptation: A tool for studying the functional properties of human cortical neurons. Acta Psychologica (Amst), 107, 293–321.

Grill-Spector, K., & Malach, R. (2004). The human visual cortex. Annual Review of Neuroscience, 27, 649–677.

Hasson, U., Harel, M., Levy, I., & Malach, R. (2003). Large-scale mirror-symmetry organization of human occipito-temporal object areas. Neuron, 37, 1027–1041.

Hasson, U., Levy, I., Behrmann, M., Hendler, T., & Malach, R. (2002). Eccentricity bias as an organizing principle for human high-order object areas. Neuron, 34, 479–490.

Haxby, J. V., Gobbini, M. I., Furey, M. L., Ishai, A., Schouten, J. L., & Pietrini, P. (2001). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293, 2425–2430.

Haxby, J. V., Hoffman, E. A., & Gobbini, M. I. (2000). The distributed human neural system for face perception. Trends in Cognitive Science, 4, 223–233.

Hemond, C. C., Kanwisher, N. G., & Op de Beeck, H. P. (2007). A preference for contralateral stimuli in human object- and face-selective cortex. PLoS ONE, 2, e574.

James, T. W., Humphrey, G. K., Gati, J. S., Menon, R. S., & Goodale, M. A. (2002). Differential effects of viewpoint on object-driven activation in dorsal and ventral streams. Neuron, 35, 793–801.

Johnson, M. H. (2001). Functional brain development in humans. Nature Reviews, Neuroscience, 2, 475–483.

Kanwisher, N. (2000). Domain specificity in face perception. Nature Neuroscience, 3, 759–763.

Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302–4311.

Kastner, S., De Weerd, P., & Ungerleider, L. G. (2000). Texture segregation in the human visual cortex: A functional MRI study. Journal of Neurophysiology, 83, 2453–2457.

Kobatake, E., & Tanaka, K. (1994). Neuronal selectivities to complex object features in the ventral visual pathway of the macaque cerebral cortex. Journal of Neurophysiology, 71, 856–867.

Kourtzi, Z., & Kanwisher, N. (2000). Cortical regions involved in perceiving object shape. Journal of Neuroscience, 20, 3310–3318.

Kourtzi, Z., & Kanwisher, N. (2001). Representation of perceived object shape by the human lateral occipital complex. Science, 293, 1506–1509.

Kourtzi, Z., Tolias, A. S., Altmann, C. F., Augath, M., & Logothetis, N. K. (2003). Integration of local features into global shapes: Monkey and human fMRI studies. Neuron, 37, 333–346.

Larsson, J., & Heeger, D. J. (2006). Two retinotopic visual areas in human lateral occipital cortex. Journal of Neuroscience, 26, 13128–13142.

Lerner, Y., Epshtein, B., Ullman, S., & Malach, R. (2008). Class information predicts activation by object fragments in human object areas. Journal of Cognitive Neuroscience, 20, 1189–1206.

Lerner, Y., Hendler, T., Ben-Bashat, D., Harel, M., & Malach, R. (2001). A hierarchical axis of object processing stages in the human visual cortex. Cerebral Cortex, 11, 287–297.

Lerner, Y., Hendler, T., & Malach, R. (2002). Object-completion effects in the human lateral occipital complex. Cerebral Cortex, 12, 163–177.

Levy, I., Hasson, U., Avidan, G., Hendler, T., & Malach, R. (2001). Center-periphery organization of human object areas. Nature Neuroscience, 4, 533–539.

Logothetis, N. K., Pauls, J., & Poggio, T. (1995). Shape representation in the inferior temporal cortex of monkeys. Current Biology, 5, 552–563.

Malach, R., Levy, I., & Hasson, U. (2002). The topography of high-order human object areas. Trends in Cognitive Science, 6, 176–184.

Malach, R., Reppas, J. B., Benson, R. R., Kwong, K. K., Jiang, H., Kennedy, W. A., Ledden, P. J., Brady, T. J., Rosen, B. R., & Tootell, R. B. (1995). Object-related activity revealed by functional magnetic resonance imaging in human occipital cortex. Proceedings of the National Academy of Sciences U S A, 92, 8135–8139.

Marr, D. (1980). Visual information processing: The structure and creation of visual representations. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 290, 199–218.

Martin, A., Wiggs, C. L., Ungerleider, L. G., & Haxby, J. V. (1996). Neural correlates of category-specific knowledge. Nature, 379, 649–652.

McKyton, A., & Zohary, E. (2007). Beyond retinotopic mapping: The spatial representation of objects in the human lateral occipital complex. Cerebral Cortex, 17, 1164–1172.

Mendola, J. D., Dale, A. M., Fischl, B., Liu, A. K., & Tootell, R. B. (1999). The representation of illusory and real contours in human cortical visual areas revealed by functional magnetic resonance imaging. Journal of Neuroscience, 19, 8560–8572.

Miller, E. K., Li, L., & Desimone, R. (1991). A neural mechanism for working and recognition memory in inferior temporal cortex. Science, 254, 1377–1379.

Moeller, S., Freiwald, W. A., & Tsao, D. Y. (2008). Patches with links: A unified system for processing faces in the macaque temporal lobe. Science, 320, 1355–1359.

Nakayama, K., He, Z. J., & Shimojo, S. (1995). Visual surface representation: A critical link between low-level and high-level vision. In S. M. Kosslyn & D. N. Osherson (Eds.), An invitation to cognitive sciences: Visual cognition. Cambridge, MA: MIT Press.

Op de Beeck, H. P., Haushofer, J., & Kanwisher, N. G. (2008). Interpreting fMRI data: Maps, modules and dimensions. Nature Reviews, Neuroscience, 9, 123–135.

Op De Beeck, H., & Vogels, R. (2000). Spatial sensitivity of macaque inferior temporal neurons. Journal of Comparative Neurology, 426, 505–518.

Perrett, D. I. (1996). View-dependent coding in the ventral stream and its consequence for recognition. In R. Camaniti, K. P. Hoffmann, & A. J. Lacquaniti (Eds.), Vision and movement mechanisms in the cerebral cortex (pp. 142–151). Strasbourg: HFSP.

Perrett, D. I., Oram, M. W., & Ashbridge, E. (1998). Evidence accumulation in cell populations responsive to faces: An account of generalisation of recognition without mental transformations. Cognition, 67, 111–145.

Peterson, M. A., & Gibson, B. S. (1994a). Must shape recognition follow figure-ground organization? An assumption in peril. Psychological Science, 5, 253–259.

Peterson, M. A., & Gibson, B. S. (1994b). Object recognition contributions to figure-ground organization: Operations on outlines and subjective contours. Perception and Psychophysics, 56, 551–564.

Pinsk, M. A., Arcaro, M., Weiner, K. S., Kalkus, J. F., Inati, S. J., Gross, C. G., & Kastner, S. (2009). Neural representations of faces and body parts in macaque and human cortex: A comparative fMRI study. Journal of Neurophysiology, (5), 2581–2600.

Poggio, T., & Edelman, S. (1990). A network that learns to recognize three-dimensional objects. Nature, 343, 263–266.

Quiroga, R. Q., Mukamel, R., Isham, E. A., Malach, R., & Fried, I. (2008). Human single-neuron responses at the threshold of conscious recognition. Proceedings of the National Academy of Sciences U S A, 105, 3599–3604.

Quiroga, R. Q., Reddy, L., Kreiman, G., Koch, C., & Fried, I. (2005). Invariant visual representation by single neurons in the human brain. Nature, 435, 1102–1107.

Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2, 1019–1025.

Rolls, E. T. (2000). Functions of the primate temporal lobe cortical visual areas in invariant visual object and face recognition. Neuron, 27, 205–218.

Rolls, E. T., & Milward, T. (2000). A model of invariant object recognition in the visual system: Learning rules, activation functions, lateral inhibition, and information-based performance measures. Neural Computation, 12, 2547–2572.

Sawamura, H., Orban, G. A., & Vogels, R. (2006). Selectivity of neuronal adaptation does not match response selectivity: A single-cell study of the fMRI adaptation paradigm. Neuron, 49, 307–318.

Sayres, R., & Grill-Spector, K. (2008). Relating retinotopic and object-selective responses in human lateral occipital cortex. Journal of Neurophysiology, 100 (1), 249–267.

Schwarzlose, R. F., Baker, C. I., & Kanwisher, N. K. (2005). Separate face and body selectivity on the fusiform gyrus. Journal of Neuroscience, 25, 11055–11059.

Schwarzlose, R. F., Swisher, J. D., Dang, S., & Kanwisher, N. (2008). The distribution of category and location information across object-selective regions in human visual cortex. Proceedings of the National Academy of Sciences U S A, 105, 4447–4452.

Stanley, D. A., & Rubin, N. (2003). fMRI activation in response to illusory contours and salient regions in the human lateral occipital complex. Neuron, 37, 323–331.

Tarr, M. J., & Bulthoff, H. H. (1995). Is human object recognition better described by geon structural descriptions or by multiple views? Comment on Biederman and Gerhardstein (1993). Journal of Experimental Psychology: Human Perception and Performance, 21, 1494–1505.

Tarr, M. J., & Gauthier, I. (2000). FFA: A flexible fusiform area for subordinate-level visual processing automatized by expertise. Nature Neuroscience, 3, 764–769.

Thorpe, S., Fize, D., & Marlot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.

Ullman, S. (1989). Aligning pictorial descriptions: An approach to object recognition. Cognition, 32, 193–254.

Ungerleider, L. G., Mishkin, M., & Macko, K. A. (1983). Object vision and spatial vision: Two cortical pathways. Trends in Neuroscience, 6, 414–417.

Van Essen, D. C., Felleman, D. J., DeYoe, E. A., Olavarria, J., & Knierim, J. (1990). Modular and hierarchical organization of extrastriate visual cortex in the macaque monkey. Cold Spring Harbor Symposia on Quantitative Biology, 55, 679–696.

Vinberg, J., & Grill-Spector, K. (2008). Representation of shapes, edges, and surfaces across multiple cues in the human visual cortex. Journal of Neurophysiology, 99, 1380–1393.

Vogels, R., & Biederman, I. (2002). Effects of illumination intensity and direction on object coding in macaque inferior temporal cortex. Cerebral Cortex, 12, 756–766.

Vuilleumier, P., Henson, R. N., Driver, J., & Dolan, R. J. (2002). Multiple levels of visual object constancy revealed by event-related fMRI of repetition priming. Nature Neuroscience, 5, 491–499.

Wandell, B. A. (1999). Computational neuroimaging of human visual cortex. Annual Review of Neuroscience, 22, 145–173.

Wang, G., Tanaka, K., & Tanifuji, M. (1996). Optical imaging of functional organization in the monkey inferotemporal cortex. Science, 272, 1665–1668.

Weiner, K. S., & Grill-Spector, K. (2010). Sparsely-distributed organization of face and limb activations in human ventral temporal cortex. NeuroImage, 52, 1559–1573.

Weiner, K. S., & Grill-Spector, K. (2011). Not one extrastriate body area: Using anatomical landmarks, hMT+, and visual field maps to parcellate limb-selective activations in human lateral occipitotemporal cortex. NeuroImage, 56 (4), 2183–2199.

Weiner, K. S., & Grill-Spector, K. (2013). Neural representations of faces and limbs neighbor in human high-level visual cortex: Evidence for a new organization principle. Psychological Research, 277 (1), 74–97.

Williams, M. A., Dang, S., & Kanwisher, N. G. (2007). Only some spatial patterns of fMRI response are read out in task performance. Nature Neuroscience, 10, 685–686.

Zangenehpour, S., & Chaudhuri, A. (2005). Patchy organization and asymmetric distribution of the neural correlates of face processing in monkey inferotemporal cortex. Current Biology, 15 (11), 993–1005.

Kalanit Grill-Spector

Kalanit Grill-Spector is Associate Professor, Department of Psychology and Neuroscience Institute, Stanford University.


Representation of Spatial Relations

Bruno Laeng
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0003

Abstract and Keywords

Research in cognitive neuroscience in humans and animals has revealed a considerable degree of specialization of the brain for spatial functions, and has also revealed that the brain’s representation of space is separable from its representation of object identities. The current picture is that multiple and parallel frames of reference cooperate to make up a representation of space that allows efficient navigation and action within the surrounding physical environment. As humans, however, we do not simply “act” in space, but we also “know” it and “talk” about it. Hence, the human brain’s spatial representations may involve specifically human and novel patterns of lateralization and of brain areas’ specializations. Pathologies of space perception and spatially directed attention, like spatial neglect, can be explained by damage to one or several of these maps and frames of reference. The multiple spatial cognitive maps used by the human brain clearly cooperate toward flexible representations of spatial relations that are progressively abstract (or categorical) and may be apt to support the human ability to communicate spatial information and understand mathematical concepts. Nevertheless, a representation of space as extended and continuous is also necessary for the control of action and navigation.

Keywords: spatial representations, frames of reference, spatial neglect, lateralization, cognitive maps

Representing the space around our human bodies seems to serve three main functions or goals: to “act,” to “know,” and to “talk.” First of all, humans need to navigate in their environment. Some navigational skills require a very precise adaptation of the movement of the entire body to the presence of other external bodies and obstacles (whether these are static or also mobile). Without an adequate representation of physical distances between external bodies (and between these and oneself), actions like running in a crowd or driving in traffic would be impossible. Humans also possess hands and need to manipulate objects; their highly developed control of fine finger movements makes possible the construction and use of tools. Engaging in complex sequences of manual actions requires the positioning and direction of movements over precise and narrow areas of space (e.g., when typing on the keyboard, opening a lock with a key, making a knot, playing a musical instrument). All these behaviors would be impossible without a fine-grained representation of the spatial distances and actual sizes of the objects and of their position relative to each other and to the body and hand involved in the action.

However, humans do not solely “act” in space. Humans also “cognize” about space. For example, we can think about an object being present in a place although neither the object nor the place is any longer visible (i.e., Piagetian object permanence). We can think about simple spatial schemata or complex mathematical spaces and geometries (Aflalo & Graziano, 2006); the ability to represent space as a continuum may lie at the basis of our understanding of objects’ permanence in space and, therefore, of numerosity (Dehaene, 1997). We can also engage in endless construction of meaning and parse the physical world into classes and concepts; in fact, abstraction and recognition of equivalences between events (i.e., categorization) have obvious advantages, as the cultural evolution of humans demonstrates. Categories can be expressed in symbols, which can be exchanged. All this applies to spatial representations as well. For example, the category “to the left” identifies as equivalent a whole class of positions and can be expressed either in verbal language or with pictorial symbols (e.g., an arrow: ←). Clearly, humans not only “act” in space and “know” space, but they also “talk” about space. The role played by spatial cognition in linguistic function should not be underestimated (e.g., “thinking for speaking”; Slobin, 1996). Thus, a categorical, nonmetric representation of space constitutes an additional spatial ability, one that could lay the groundwork for spatial reference in language.
The present discussion focuses on vision, in part because evolution has devoted to it a large amount of the primate brain’s processing capacity (e.g., in humans, about 4 to 6 billion neurons and 20 percent of the entire cortical area; Maunsell & Newsome, 1987; Wandell et al., 2009); in part because vision is the most accurate spatial sense in primates and is central to the human representation of space (“to see is to know what is where by looking”; Marr, 1982). Nevertheless, it is important to acknowledge that a representation of space can be obtained in humans through several sensory modalities (e.g., kinesthesia) and that blind people do possess a detailed knowledge of space and can move and act in it as well as talk about it. Some animals possess a representation of navigational space that can be obtained through sensory information that is not available to humans (e.g., a “magnetic” sense; Johnsen & Lohmann, 2005; Ritz, 2009). Honeybees’ “dances” can communicate spatial information that refers to locations where food can be found (Kirchner & Braun, 1994; Menzel et al., 2000).

We begin by discussing (1) the existence of topographic maps in the brain. From these maps, the brain extracts higher order representations of the external world that are dedicated to the basic distinction between (2) “what” is out there versus “where” it is. These two types of information need to be integrated for the control of action, and this yields in turn the representation of (3) “how” an object can be an object of action and “which” specific object among multiple ones is located in a specific place at a given time. However, localization of objects can take place according to multiple and parallel (4) spatial frames of reference; the deployment of spatial attention can also occur along these multiple reference frames, as the pathology of spatial attention or (5) neglect, after brain damage, has clearly revealed. Regarding the brain’s specialization for spatial cognition, humans show a strong degree of (6) cerebral lateralization that may be unique in the animal world. Current evidence indicates a right-hemisphere specialization for analog (coordinate) spatial information versus a left-hemisphere specialization for digital (categorical) spatial information. The representation of categorical spatial relations is also relevant for (7) object recognition because an object’s structural description is made in terms of parts and categorical spatial relations between these (i.e., the “where of what”). Finally, we discuss the brain’s representation of large-scale space, as it is used in navigation, or the (8) “cognitive map” of the environment.

1. The Brain’s Topographic Maps

In the evolutionary history of vision, the primeval function of photoreceptors or “eyes” may have been a raw sense of location (e.g., cardinal directions as up vs. down based on sunlight; detecting motion paths or distances by use of optic flow; Ings, 2007). However, the ability to focus a detailed “image” with information about wavelengths and spatial frequencies that allow the extraction of colored surfaces and forms requires the evolution of camera-like eyes with a single lens focusing light onto a mosaic of photoreceptors (e.g., the image-forming eye of the squid or the eyes of vertebrates; Lamb et al., 2007). The human retina allows the formation of an image that is very detailed spatially, and this detail seems conserved at the early cortical level in the flow of information from the eye to successive brain areas. The smallest cortical receptive fields processing spatial information in human vision possess receptive field centers hardly wider than the single-cone photoreceptors (Smallman et al., 1996).

The retina provides the initial topographic map for humans, where nearby scene points are represented in the responses of nearby photoreceptors and, in turn, in a matrix of neurons to which they provide their input. Areas of the brain receiving retinal input are also topographically organized in retinotopic “visual field maps” (Tootell, Hadjikhani, et al., 1998; Wandell et al., 2005). These preserve, to some extent, the geometric structure of the retina, which in turn, by the laws of optic refraction, reflects the geometric structure of the external visual world as a planar projection onto a two-dimensional surface (Figure 3.1).


Figure 3.1 Topographic representations in primary visual cortex (V1). Reprinted with permission from Tootell, Hadjikhani, Vanduffel W, et al., 1998. © 1998 National Academy of Sciences, U.S.A.

There is now consensus that the topographic features of cortical and subcortical maps are not incidental, but instead are essential to brain function (Kaas, 1997). A topographically organized structure can depict visual information as “points” organized by their relative locations in space and varying in size, brightness, and color. Points near each other in the represented space are represented by points near each other in the representing substrate; this (internal) space can be used to represent (external) space (Markman, 1999). Thus, one fundamental property of the brain’s representation of space is that the brain uses space on the cortex to represent space in the world (Kosslyn, Thompson, & Ganis, 2006). Specifically, topographic maps can evaluate how input from one set of receptors differs from that of adjoining sets of receptors. Local connections among neurons that are topographically organized can easily set up center-surround receptive fields and compare adjacent features. Other types of brain organization between units sensitive to adjacent points require more complex arrays and longer connections (Cherniak, 1990), which are metabolically (and evolutionarily) costly and result in increases in neural transmission time. Cortical maps appear to be further arranged in spatial clusters at a coarser scale (Wandell et al., 2005, 2009). This organization allows neural mosaics in different maps that serve similar computational goals to share resources (e.g., coordinating the timing of neural signals or temporarily storing memories; Wandell et al., 2005). Thus, cortical maps may organize themselves to optimize nearest neighbor relationships (Kohonen, 2001) so that neurons that process similar information are located near each other, minimizing wiring length.
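As a rough illustration of how purely local connections on a topographic map can yield center-surround behavior, the following Python sketch treats a one-dimensional row of receptors as the map. Each unit subtracts the mean of its two immediate neighbors from its own input. The function name, the one-dimensional layout, the equal neighbor weights, and the edge handling are all assumptions made for illustration; this is not a model taken from the studies cited above.

```python
def center_surround(receptors):
    """Hypothetical 1-D center-surround units on a topographic row.

    Each unit takes its own receptor as the excitatory center and the mean
    of its two immediate neighbors as the inhibitory surround. At the two
    ends of the row, the missing neighbor is replaced by the edge value
    (an arbitrary choice made for this sketch).
    """
    n = len(receptors)
    outputs = []
    for i in range(n):
        left = receptors[max(i - 1, 0)]        # neighbor on one side
        right = receptors[min(i + 1, n - 1)]   # neighbor on the other side
        outputs.append(receptors[i] - (left + right) / 2)
    return outputs


if __name__ == "__main__":
    # A dark region followed by a bright region: a single luminance step.
    print(center_surround([0, 0, 0, 1, 1, 1]))
```

In this toy setting, uniform regions cancel out and only the two units flanking the step respond, which is the sense in which adjacency on the map makes comparing neighboring inputs cheap: every unit needs connections only to the units next to it.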
A topographical neural design is clearly revealed by the effects of localized cortical damage, resulting in a general loss of visual function restricted to a corresponding region within the visual field (Horton & Hoyt, 1991). Several areas of the human visual cortex, but also brainstem nuclei (e.g., the superior colliculus) and thalamus (e.g., the lateral geniculate nucleus and the pulvinar), are organized into retinotopic maps, which preserve both left–right and top–bottom ordering. Consequently, cells that are close together on the sending surface (the retina) project to regions that are close together on the target surface (Thivierge & Marcus, 2007). Remarkably, the dorsal surface of the human brain, extending from the posterior portion of the intraparietal sulcus forward, contains several maps (400–700 mm²) that are much smaller than the V1 map (4000 mm²; Wandell et al., 2005). The visual field is represented continuously as in V1, but the visual field is split along the vertical meridian so that input to each hemisphere originates from the contralateral visual hemifield. The two halves are thus “seamed” together by the long connections of the corpus callosum.

Although early vision’s topography has high resolution, later maps in the hierarchy are progressively less organized topographically (Malach, Levy, & Hasson, 2002). As a result, the image is represented at successive stages with decreasing spatial precision and resolution. In addition, beyond the initial V1 cortical map, retinotopic maps are complex and patchy. Consequently, adjacent points in the visual field are not represented in adjacent regions of the same area in every case (Sereno et al., 1995). However, the topographic organization of external space in the visual cortex is extraordinarily veridical compared with other modalities. For example, the representation of bodily space in primary somatosensory cortex (or the so-called homunculus) clearly violates a smooth and continuous spatial representation of body parts. The face representation is inverted (Servos et al., 1999; Yang et al., 1994), and the facial skin areas are located between those representing the thumb and the lower lip of the mouth (Nguyen et al., 2004).
Also, similarly to the extensive representation of the thumb in somatosensory cortex (in fact larger than that of the skin of the whole face), cortical visual maps magnify or compress distances in some portions of the visual field. The central 15 degrees of vision take up about 70 percent of cortical area, and the central 24 degrees cover 80 percent (Fishman, 1997; Zeki, 1969). In V2, parts of the retina that correspond to the upper half of the visual field are represented separately from parts that correspond to the lower half of the visual field. Area MT represents only the binocular field, and V4 only the central 30 to 40 degrees, whereas the parietal areas represent more of the periphery (Gattass et al., 2005; Sereno et al., 2001). Callosal connections in humans allow areas of the inferior parietal cortex and the fusiform gyrus in the temporal lobe to deal with stimuli presented in the ipsilateral visual field (Tootell, Mendola, et al., 1998). These topographic organizations have been revealed by a variety of methods, including clinical studies of patients (Fishman, 1997); animal research (Felleman & Van Essen, 1991; Tootell et al., 1982); and more recently, neuroimaging in healthy humans (Engel et al., 1994, 1997; Sereno et al., 1995; DeYoe et al., 1996; Wandell, 1999) and brain stimulation (Kastner et al., 1998). In some of the retinotopic mapping studies with functional magnetic resonance imaging (fMRI), participants performed working memory tasks or planned eye movements (Silver & Kastner, 2009). These studies revealed the existence of previously unknown topographic maps of visual space in the human parietal (e.g., Sereno & Huang, 2006) and frontal (e.g., Kastner et al., 2007) lobes.


Another clear advantage of a topographical organization of the visual brain would be in guiding ocular movements by maintaining a faithful representation of the position of the target of a saccade (Optican, 2005). In addition, a topographical organization provides explicit and accessible information that represents the external world, beginning with the extraction of boundaries and limits of surfaces of objects and ground (Barlow, 1981). According to Marr (1982): “A representation is a formal system for making explicit certain entities or types of information.” If different properties or features of the physical information are encoded or made explicit at any level of the flow of information, then qualitatively different types of information will be represented.

2. “What” Versus “Where” in the Visual System

The brain can be defined as a largely special-purpose machine in which a “division of labor” between brain areas is the pervasive principle of neural organization. After more than a century of systematic brain research, the idea that functions fractionate into a progressively modular brain structure has achieved axiomatic status (Livingstone & Hubel, 1988; Zeki, 2001). Specifically, a perceptual problem is most easily dealt with by dividing the problem into smaller subproblems, as independent of the others as possible so as not to disrupt each other (Gattass et al., 2005). One way in which the visual brain accomplishes this division of labor is by separating visual information into two streams of processing, namely a “what” system and a “where” system.

It may appear odd that the brain or cognitive system separates visual attributes that in the physical world are conjoined. Disjoining attributes of the same object exacerbates the problem of integrating them (i.e., the so-called binding problem; Treisman, 1996; Revonsuo & Newman, 1999; Seth et al., 2004). However, computer simulations with artificial neural networks have demonstrated that two subsystems can be more efficient than one in computing different mappings of the same input at the same time (Otto et al., 1992; Rueckl, Cave, & Kosslyn, 1989). Thus, cognitive neuroscience reveals that the brains of humans and, perhaps, of all other vertebrates process space and forms (bodies) as independent aspects of reality.

In particular, the visual brain can be divided between “two cortical visual systems” (Ingle, 1967; Mishkin, Ungerleider, & Macko, 1983; Schneider, 1967; Ungerleider & Mishkin, 1982). A ventral (mostly temporal cortex in humans) visual stream mediates object identification (“what is it?”), and a dorsal (mostly parietal cortex) visual stream mediates localization of objects in space (“where is it?”). This partition is most clearly documented for the visual modality in our species (and in other primates), although it appears to be equally valid for other sensory processes (e.g., for the auditory system, Alain et al., 2001; Lomber & Malhotra, 2008; Romanski et al., 1999; for touch, Reed et al., 2005). Information processed in the two streams appears to be integrated at a successive stage in the superior temporal lobe (Morel & Bullier, 1990), where integration of the two pathways could recruit ocular movements to the location of a shape and provide some positional information to the temporal areas (Otto et al., 1992). Further integration of shape and position information occurs in the hippocampus, where long-term memories of “configurations” of the life environment (Sutherland & Rudy, 1988) or its topography (O’Keefe & Nadel, 1978) are formed.

Similarly to the classic Newtonian distinction between matter and absolute space, the distinction between what and where processing assumes that (1) objects can be represented by the brain independently from their locations, and (2) locations can be perceived as immaterial points in an immaterial medium, empty of objects. Indeed, at a phenomenological level, space primarily appears as an all-pervading stuff, an incorporeal or “ethereal” receptacle that can be filled by material bodies, or an openness in which matter spreads out. In everyday activities, a region of space is typically identified with reference to “what” is or could be located there (e.g., “the book is on the table”; Casati & Varzi, 1999). However, we also need to represent empty or “negative” space because trajectories and paths toward other objects or places (Collett, 1982) occur in a spatial medium that is free of material obstacles. The visual modality also estimates distance (i.e., the empty portion of space between objects), and vision can represent the future position of a moving object at some unoccupied point in space (Gregory, 2009).

Humans exemplify the division of labor between the two visual systems. Bálint (1909) was probably the first to report a human case in which the perception of visual location was impaired while the visual recognition of an object was relatively spared. Bálint’s patient, who had bilateral inferior parietal lesions, was unable to reach for objects or estimate distances between objects.
Holmes and Horax (1919) described similar patients and showed that they had difficulty in judging differences in the lengths of two lines, but could easily judge whether a unitary shape made by connecting the same two lines was a trapezoid and not a rectangle (i.e., when the lines were part of a unitary shape). In general, patients with damage to the parietal lobes lack the ability to judge objects’ positions in space, as shown by their difficulties in reaching, grasping, pointing to, or verbally describing their position and size (De Renzi, 1982). In contrast, “blindsight” patients, who are unaware of the presence of an object, can point to the location of an object in the blind field of vision (Perenin & Jeannerod, 1978; Weiskrantz, 1986). Children with Williams (developmental) syndrome show remarkable sparing of object recognition with severe breakdown of spatial abilities (Landau et al., 2006). In contrast, patients with inferior temporal-occipital lesions, who have difficulty in recognizing the identity of objects and reading (i.e., visual agnosia), typically show unimpaired spatial perception; they can reach and manipulate objects and navigate without bumping into objects (Kinsbourne & Warrington, 1962). Patients with visual agnosia after bilateral damage to the ventral system (Goodale et al., 1991) can also guide a movement in a normal and natural manner toward a vertical opening by inserting a piece of cardboard into it (Figure 3.2), but perform at chance when asked to report the orientation of the opening either verbally or by adjusting the cardboard to match it (Milner & Goodale, 2008). It would thus seem that the spatial representations of the dorsal system can effectively guide action but cannot make even simple pattern discriminations.


Figure 3.2 Performance of a patient with extensive damage to the ventral system in perceptually matching a linear gap versus inserting an object into the gap. From Milner & Goodale, 2006. Reprinted with permission from Oxford University Press.

Importantly, several neurobiological investigations in which brain regions of nonhuman primates were selectively damaged have shown a double dissociation between perception of space and of objects (Mishkin et al., 1983). In one condition, monkeys had to choose the one of two identical objects (e.g., two square plaques) that was located closer to a landmark object (a cylinder). In another condition, two different objects (e.g., a cube and a pyramid), each with a different pattern on its surface (e.g., checkerboard vs. stripes), were shown. In both tasks, the monkeys had to reach a target object, but the solution in the first task mainly depended on registering spatial information, whereas in the other, information about shape and pattern was crucial to obtain the reward. A group of monkeys lacked parts of the parietal cortex, whereas another group lacked parts of the temporal cortex. The “parietal” monkeys were significantly more impaired in the spatial task, whereas the “temporal” monkeys were significantly more impaired in the object discrimination task. The results are counterintuitive because one would think that the monkeys with an intact dorsal system (parietal lobe) should be able to discriminate checkerboards and stripes (because these differ in both size and slant of their elements), despite the damage to the ventral system (temporal lobe). The inability of monkeys to do so clearly indicates that spatial representations of their dorsal system are used to guide action, not to discriminate patterns (Kosslyn, 1994; Kosslyn & Koenig, 1992).

In addition, electrical recordings from individual neurons in monkeys’ parietal lobes reveal cells that encode the shape of objects (Sereno & Maunsell, 1998; Taira et al., 1990; Sakata et al., 1997).
However, these pattern (or “what”) representations in the dorsal cortex are clearly action related; their representations of the geometrical properties of shapes (e.g., orientation, size, depth, and motion) are used exclusively when reaching and grasping objects. That is, they represent space in a “pragmatic” sense and without a conceptual content that is reportable (Faillenot et al., 1997; Jeannerod & Jacob, 2005; Valyear et al., 2005).

Neuroimaging studies show shape-selective activations in humans’ dorsal areas (Denys et al., 2004); simply seeing a manipulable human-made object (e.g., a tool like a hammer) evokes changes in neural activity within the human dorsal system (Chao & Martin, 2000). This is consistent with primate studies showing that the parietal lobes contain neurons that encode the shape of objects. Thus, human parietal structures contain motor-relevant information about the shapes of some objects, information that would seem necessary to efficiently control specific actions. These dorsal areas’ shape information could also be used in conjunction with the ventral system and act as an organizing principle for a “category-specific” representation of semantic categories (in this case, for “tools”; Mahon et al., 2007). Remarkably, shape information in the dorsal system per se does not seem to be able to support the conscious perception of object qualities; a profound object agnosia (that includes the recognition of manipulable objects) is observed after temporal lesions. Dissociations of motor-relevant shape information from conscious shape perception have also been documented with normal observers, when actions were directed toward visual illusions (Aglioti et al., 1995; Króliczak et al., 2006). Evidently, the dorsal (parietal) system’s visual processing does not lead to a conscious description (identification) of objects’ shapes (Fang & He, 2005; Johnson & Haggard, 2005). It is also clear that the dorsal system’s shape sensitivity does not derive from information relayed by the ventral system because monkeys with large temporal lesions and profound object recognition deficits are able to normally grasp small objects (Glickstein et al., 1998) and catch flying insects. Similarly, patient D.F.
(Goodale et al., 1991; Milner et al., 1991; Milner & Goodale, 2006) showed preserved visuomotor abilities and could catch a ball in flight (Carey et al., 1996) but could not recognize a drawing of an apple. When asked to make a copy of it, she arranged straight lines into a spatially incoherent squarelike configuration (Servos et al., 1999). D.F.’s drawing deficit indicates that her spared dorsal shape representations cannot be accessed or expressed symbolically, despite being usable as the targets of directed action. The fact that D.F. performed at chance, when reporting the orientation of the opening in the previously described “posting” experiment, does not necessarily indicate that the conscious representation of space is an exclusive function of the ventral system (Milner & Goodale, 2006, 2008). Nor does it indicate that the dorsal system’s representation of space should be best described as a “zombie agent” (Koch, 2004) or as a form of “blindsight without blindness” (Weiskrantz, 1997). In fact, patient D.F. made accurate metric judgments of which elements in an array were nearest and which were farthest; when asked to reproduce an array of irregularly positioned colored dots on a page, her rendition of their relative positions (e.g., what element was left of or below another element) was accurate, although their absolute positioning was deficient (Carey et al., 2006). It seems likely that a perfect reproduction of an array of elements requires comparing the copy to the model array as an overall “gestalt” or perceptual template, a strategy that may depend on the shape perception mechanisms of D.F.’s (damaged) ventral system.

Finally, it would seem that the ventral system in the absence of normal parietal lobes cannot support an entirely normal perception of shapes. Patients with extensive and bilateral parietal lesions (i.e., with Bálint’s syndrome) do not show completely normal object or shape perception; their object recognition is typically limited to one object or just a part of it (Luria, 1963). Remarkably, these patients need an extraordinarily long time to accomplish recognition of even a single object, thus revealing a severe reduction in object processing rate (Duncan et al., 2003). Kahneman, Treisman, and Gibbs (1992) made a distinction between object identification and object perception. They proposed that identification (i.e., the conscious experience of seeing an instance of an object) depends on forming a continuously updated, integrated representation of the shapes and their space–time coordinates. The ventral and dorsal system may each support a form of “phenomenal” consciousness (cf. Baars, 2002; Block, 1996), but they necessitate the functions of the other system in order to generate a conscious representation that is reportable and accessible to other parts of the brain (e.g., the executive areas of the frontal lobes; Lamme, 2003, 2006).

3. “Where,” “How,” or “Which” Systems?

“What” versus “where” functional distinctions have also been identified in specific areas of the frontal lobe of monkeys (Goldman-Rakic, 1987; Passingham, 1985; Rao et al., 1997; Wilson et al., 1993). These areas appear to support working memory (short-term retention) of associations between shape and spatial information. In other words, they encode online information about “what is where?” or “which is it?” (when seeing multiple objects). Prompted by these findings with animals, similar functional distinctions have been described in human patients (e.g., Newcombe & Russell, 1969) as well as in localized brain activations in healthy subjects (e.g., Courtney et al., 1997; Haxby et al., 1991; Smith et al., 1995; Ungerleider & Haxby, 1994; Zeki et al., 1991).

Spatial information (in particular fine-grained spatial information about distances, direction, and size) is clearly essential to action. In this respect, much of the spatial information of the “where” system is actually in the service of movement planning and guidance or of “praxis” (i.e., how to accomplish an action, especially early-movement planning; Andersen & Buneo, 2002). In particular, the posterior parietal cortex of primates performs the function of transforming visual information into a motor plan (Snyder et al., 1997). For example, grasping an object with one hand is a common primate behavior that, according to studies of both monkeys and humans, depends on the spatial functions of the parietal lobe (Castiello, 2005). Indeed, patients with damage to the superior parietal lobe show striking deficits in visually guided grasping (i.e., optic ataxia; Perenin & Vighetto, 1988).
Damage to this area may result in difficulties generating visual-motor transformations that are necessary to mold the hand’s action to the shape and size of the object (Jeannerod et al., 1994; Khan et al., 2005), as well as to take into account the position of potential obstacles (Schindler et al., 2004).


Neuroimaging studies of healthy participants scanned during reach-to-grasp actions show activity in the posterior parietal cortex (especially when a precision grip is required; Culham et al., 2003; Gallivan et al., 2009). The superior parietal lobe appears to contain a topographic map that represents memory-driven saccade direction (Sereno et al., 2001) as well as the direction of a pointing movement (Medendorp et al., 2003). The parietal cortex may be tiled with spatiotopic maps, each representing space in the service of a specific action (Culham & Valyear, 2006). It is likely that the computation of motor commands for reaching depends on the simultaneous processing of mutually connected areas of the parietal and frontal lobes, which together provide an integrated coding system for the control of reaching (Burnod et al., 1999; Thiebaut de Schotten et al., 2005). Importantly, while lesions to the ventral system invariably impair object recognition, object-directed grasping is spared in the same patients (James et al., 2003).

The parietal lobes clearly support spatial representations detailed enough to provide the coordinates for precise actions like reaching, grasping, pointing, touching, looking, and avoiding a projectile. Given the spatial precision of both manual action and oculomotor behavior, one would expect that the neural networks of the parietal cortex would include units with the smallest possible spatial tuning (i.e., very small receptive fields; Gross & Mishkin, 1977). By the same reasoning, the temporal cortex may preferentially include units with large spatial tuning because the goal of such a neural network is to represent the presence of a particular object, regardless of its spatial attributes (i.e., show dimensional and translational invariance).
However, it turns out that both the parietal and tem poral cortices contain units with large receptive fields (from 25 to 100 degrees; O’Reilly et al., 1990) that exclude the fovea and can even represent large bilateral regions of the visual field (Motter & Mountcastle, 1981). Therefore, some property of these neural populations other than receptive field size must underlie the ability of the parietal lobes to code positions precisely. A hint is given by computational models (Ballard, 1986; Eurich & Schwegler, 1997; Fahle & Poggio, 1981; Hinton, McClelland, & Rumelhart, 1986; O’Reilly et al., 1990) showing that a population of neurons with large receptive fields, if these are appropriately overlapping, can be su perior to a population of neurons with smaller receptive fields in its ability to pinpoint something. Crucially, receptive fields of parietal neurons are staggered toward the pe riphery of the visual field, whereas receptive fields of temporal neurons tend to crowd to ward the central, foveal position of the visual field. Consequently, parietal neurons can ex ploit coarse coding to pinpoint locations, whereas the temporal neurons, which provide less variety in output (i.e., they all respond to stimuli in the fovea), trade off the ability to locate an object with the ability to show invariance to spatial transformations. Spatial at tention evokes increased activity in individual parietal cells (Constantinidis & Steinmetz, 2001) and within whole regions of the parietal lobes (Corbetta et al., 1993; 2000). There fore, focusing attention onto a region of space can also facilitate computations of the rela tive activation of overlapping receptive fields of cells and thus enhance the ability to pre cisely localize objects (Tsal & Bareket, 2005; Tsal, Meiran, & Lamy, 1995).
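The coarse-coding argument can be illustrated with a toy simulation. The sketch below is not taken from any of the cited models; it assumes a one-dimensional visual field, noise-free Gaussian tuning curves, and a center-of-mass readout, and all function names and parameter values are illustrative. With the number of units held fixed, broad overlapping fields let several units respond in a graded fashion to the same stimulus, so that their relative activation interpolates its position, whereas narrow fields leave gaps and the readout snaps to the nearest preferred location.

```python
import numpy as np

def population_response(x, centers, sigma):
    """Noise-free Gaussian tuning: response of each unit to a stimulus at x (deg)."""
    return np.exp(-0.5 * ((x - centers) / sigma) ** 2)

def decode_center_of_mass(responses, centers):
    """Read out position as the response-weighted mean of the preferred locations."""
    return float(np.sum(responses * centers) / np.sum(responses))

# Assumed layout: 21 units whose preferred locations are spaced 10 deg apart.
centers = np.linspace(-100.0, 100.0, 21)
stimulus = 3.0  # true position, lying between two preferred locations

# Same units, two receptive field widths (standard deviations, in deg).
broad = decode_center_of_mass(population_response(stimulus, centers, 25.0), centers)
narrow = decode_center_of_mass(population_response(stimulus, centers, 1.0), centers)

broad_error = abs(broad - stimulus)
narrow_error = abs(narrow - stimulus)
print(f"broad, overlapping fields: error = {broad_error:.4f} deg")
print(f"narrow, isolated fields:   error = {narrow_error:.4f} deg")
```

The comparison holds only under these simplifying assumptions; with response noise or a different readout the advantage of broad tuning would have to be re-evaluated.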


Representation of Spatial Relations Following the common parlance among neuro-scientists, who refer to major divisions be tween neural streams with interrogative pronouns, the “where” system is to a large ex tent also the “how” system of the brain (Goodale & Milner, 1992), or the “vision-for-ac tion” system (whereas the “what” system has been labeled the “vision-for-perception’ sys tem by Milner & Goodale, 1995, 2008). However, spatial representations do much more than guide action; we also “know” space and can perceive it without having an intention al plan or having to perform any action. Although neuroimaging studies show that areas within the human posterior parietal cortex are active when the individual is preparing to act, the same areas are also active during the mere observation of others’ actions (Bucci no et al., 2004; Rizzolatti & Craighero, 2004). One possibility is that these areas may be automatically registering the “affordances” of objects that could be relevant to action, if an action were to be performed (Culham & Valyear, 2006). The superior and intraparietal regions of the parietal lobes are particularly engaged with eye movements (Corbetta et al., 1998; Luna et al., 1998), but neuroimaging studies also show that the parietal areas are active when gaze is fixed on a point on the screen and no action is required, while the observer simply attends to small objects moving randomly on the screen (Culham et al., 1998). The larger the number of moving objects that have to be attentively monitored on the screen, the greater is the activation in the parietal lobe (Cul ham et al., 2001). Moreover, covert attention to spatial positions that are empty of objects (i.e., before they appear in the expected locations) strongly engages mappings in the pari etal lobes (Corbetta et al., 2000; Kastner et al., 1999). 
Monkeys also show neural popula tion activity within the parietal cortex when they solve visual maze tasks and when men tally following a path, without moving their eyes or performing any action (Crowe et al., 2005). In human neuroimaging studies, a stimulus position judgment (left vs. right) in re lation to the body midline mainly activates the superior parietal lobe (Neggers et al., 2006), although pointing to or reaching is not required. In addition, our cognition of space is also qualitative or “categorical” (Hayward & Tarr, 1995). Such a type of spatial information is too abstract to be useful in fine motor guid ance. Yet, neuropsychological evidence clearly indicates that this type of (p. 36) spatial perception and knowledge is also dependent on parietal lobe function (Laeng, 1994). Thus, the superior and inferior regions of the parietal lobes may specialize, respectively, in two visual-spatial functions: vision-for-action versus vision-for-knowledge, or a “prag matic” versus a “semantic” function that encodes the behavioral relevance or meaning of stimuli (Freedman & Assad, 2006; Jeannerod & Jacob, 2005). The superior parietal lobe (i.e., the dorsal part of the dorsal system) may have a functional role close to the idea of an “agent” directly operating in space, whereas the inferior parietal lobe plays a role clos er to that of an “observer” that understands space and registers others’ actions as they evolve in space (Rizzolatti & Matelli, 2003). According to Milner and Goodale (2006), the polysensory areas of the inferior parietal lobe and superior temporal cortex may have developed in humans as new functional ar eas and be absent in monkeys. Thus, they can be considered a “third stream” of visual processing. More generally, they may function as a supramodal convergent system be Page 12 of 59

Representation of Spatial Relations tween the dorsal and ventral systems that supports forms of spatial cognition that are unique to our species (e.g., use of pictorial maps; Farrell & Robertson, 2000; Semmes et al., 1955). These high-level representational systems in the human parietal lobe may also provide the substrate for the construction and spatial manipulation of mental images (e.g., the three-dimensional representation of shapes and the ability to “mentally rotate”). Indeed, mental rotation of single shapes is vulnerable to lesions of the posterior parietal lobe (in the right hemisphere; Butters et al., 1970) or to its temporary and reversible de activation after transcranial magnetic stimulation (Harris & Miniussi, 2003) or stimula tion with cortically implanted electrodes in epileptic patients (Zacks et al., 2003). Neu roimaging confirms that imagining spatial transformations of shapes (i.e., mental rota tion) produces activity in the parietal lobes (e.g., Alivisatos & Petrides, 1997; Carpenter et al., 1999a; Harris et al., 2000; Jordan et al., 2002; Just et al., 2001; Kosslyn, DiGirolamo, et al., 1998). Specifically, it is the right superior parietal cortex that seems most involved in the mental rotation of objects (Parsons, 2003). Note that space in the visual cortex is represented as two-dimensional space (i.e., as a planar projection of space in the world), but disparity information from each retina can be used to reconstruct the objects’ threedimensional (3D) shape and the depth and global 3D layout of a scene. Neuroimaging in humans and electrical recordings in monkeys both indicate that the posterior parietal lobe is crucial to cortical 3D processing (Naganuma et al., 2005; Tsao et al., 2003). The parietal lobe of monkeys also contains neurons that are selectively sensitive to 3D infor mation from monocular information like texture gradients (Tsutsui et al., 2002). 
Thus, the human inferior parietal lobe or Brodmann area 39 (also known as the angular gyrus) would then be a key brain structure for our subjective experience of space or for “space awareness.” This area may contribute to forming individual representations of multiple objects by representing the spatial distribution of their contours and boundaries (Robertson et al., 1997). Such a combination of “what” with “where” information would result in selecting “which” objects will be consciously perceived. In sum, the posterior parietal cortex in humans and other primates appears to be the control center for visual-spatial functions and the hub of a widely distributed brain system for the processing of spatial information (Graziano & Gross, 1995; Mountcastle, 1995). This distributed spatial system would include the premotor cortex, putamen, frontal eye fields, superior colliculus, and hippocampus. Parietal lesions, however, would disrupt critical input to this distributed system of spatial representations.

4. Spatial Frames of Reference

In patients with Bálint’s syndrome, bilateral parietal lesions destroy the representation of spatial relations. These patients act as though there is no frame of reference on which to hang the objects of vision (Robertson, 2003). Normally, in order to reach, grasp, touch, look, point toward, or avoid something, we need to compute relative spatial locations between objects and our body (or the body’s gravitational axis) or of some of its parts (e.g.,


Representation of Spatial Relations the eyes or the head’s vertical axis). Such computations are in principle possible with the use of various frames of reference. The initial visual frame of reference is retinotopic. However, this frame represents a view of the world that changes with each eye movement and therefore is of limited use for con trolling action. In fact, the primate brain uses multiple frames of reference, which are ob tained by integrating information from the other sense modalities with the retinal infor mation. These subsequent frames of reference provide a more stable representation of the visual world (Feldman, 1985). In the parietal lobe, neurons combine information to a stimulus in a particular location with information about the position (p. 37) of the eyes (Andersen & Buneo, 2002; Andersen, Essick, & Siegel, 1985), which is updated across saccades (Heide et al., 1995). Neurons with head-centered receptive fields are also found in regions of the monkey’s parietal lobe (Duhamel et al., 1997). Comparing location on the retina to one internal to the observer’s body is an effective way to compute position within a spatiotopic frame of reference, as also shown by computer simulations (Zipser & Andersen, 1988). In the monkey, parietal neurons can also code spatial relationships as referenced to an object and not necessarily to an absolute position relative to the viewer (Chafee et al., 2005, 2007). A reference frame can be defined by an origin and its axes. These can be conceived as rigidly attached or fixed onto something (an object or an element of the environment) or someone (e.g., the viewer’s head or the hand). For example, the premotor cortex of mon keys contains neurons that respond to touch, and their receptive fields form a crude map of the body surface (Avillac et al., 2005). These neurons are bimodal in that they also re spond to visual stimuli that are adjacent in space to the area of skin they represent (e.g., the face or an arm). 
However, these cells’ receptive fields are not retinotopic; instead, when the eyes move, their visual receptive fields remain in register with their respective tactile fields (Gross & Graziano, 1995; Kitada et al., 2006). For example, a bimodal neu ron with a facial tactile field responds as if its visual field is an inflated balloon glued to the side of the face. About 20 percent of these bimodal neurons continue their activity af ter lights are turned off, so as to also code the memory of an object’s location (Graziano, Hu, & Gross, 1997). Apparently, some of these neurons are specialized for withdrawing from an object rather than for reaching it (Graziano & Cooke, 2006). Neurons that inte grate several modalities at once have also been found within the premotor cortex of the monkey; trimodal neurons (visual, tactile, and auditory; Figure 3.3) have receptive fields that respond to a sound stimulus located in the space surrounding the head, within roughly 30 cm (Graziano, Reiss, & Gross, 1999). Neuroimaging reveals maximal activity in the human dorsal parieto-occipital sulcus when viewing objects looming near the face (i.e., in a range of 13 to 17 cm), and this neural response decreases proportionally to dis tance from the face (Quinlan & Culham, 2007). The superior parietal lobe of monkeys might be the substrate for body-centered positional codes for limb movements, where coordinates define the azimuth, elevation, and distance of the hand (Lacquaniti et al.,


1995). In other words, premotor and parietal areas can represent visual space near the body in “arm-centered” coordinates (Graziano et al., 1994). Visual space is constructed many times over, attached to different parts of the body for different functions (Graziano & Gross, 1998). Neuroimaging in humans confirms that at least one of the topographic maps of the parietal lobes uses a head-centered coordinate frame (Sereno & Huang, 2006). Thus, a plurality of sensorimotor action spaces may be related to specific effectors that can move independently from the rest of the body (e.g., hand, head, and eye). In these motor-oriented frames of reference, a spatial relationship between two locations can be coded in terms of the movement required to get from one to the other (Paillard, 1991).

Figure 3.3 Frames of reference of neural cells of the macaque centered on a region of the face and extending into a bounded region in near space. Such cells are multimodal and can respond to either visual or auditory stimuli localized within their head-centered receptive field. From Graziano et al., 1999. Reprinted with permission from Nature.

Finally, the underpinning of our sense of direction is gravity, which leads to the perception of “up” versus “down” or of a vertical direction that is clearly (p. 38) distinct from all other directions. This gravitational axis appears as irreversible, whereas front–back and left–right change continuously in our frame of reference simply by our turning around (Clément & Reschke, 2008). The multimodal cortex integrates the afferent signals from the peripheral retina with those from the vestibular organs (Battista & Peters, 2010; Brandt & Dietrich, 1999; Kahane et al., 2003; Waszak, Drewing, & Mausfeld, 2005) so as to provide a sense of the body’s position in relation to the environment.

5. Neglecting Space

Localizing objects according to multiple and parallel spatial frames of reference is also relevant to the manner in which spatial attention is deployed. After brain damage, an attentional deficit (the “neglect” of space) clearly reveals how attention can be allocated within different frames of reference. Neglect is a clinical disorder that is characterized by a failure to notice objects to one side (typically, the left). However, “left” and “right” must be defined with respect to some frame of reference (Beschin et al., 1997; Humphreys & Riddoch, 1994; Pouget & Snyder, 2000), and several aspects of the neglect syndrome are best understood in terms of different and specific frames of reference. That is, an object

can be on the left side with respect to the eyes, head, or body, or with respect to some axis placed on the object (e.g., the left side of the person facing the patient). In the latter case, one can consider the left side of an object (e.g., of a ball) as (1) based on a vector originating from the viewer or (2) based on the intrinsic geometry of the object (e.g., the left paw of the cat). These frames of reference can be dissociated by positioning differently the parts of the body or of the object. For example, a patient’s head may turn to the right, but gaze can be positioned far to the left. Thus, another person directly in front of the patient would lie to the left with respect to the patient’s head and to the right with respect to the patient’s eyes. Moreover, the person in front of the patient would have her right hand to the left of the patient’s body, but if she turned 180 degrees, her right hand would then lie to the right of the patient’s body.

Although right-hemisphere damage typically leads to severe neglect (Heilman et al., 1985; Heilman & Van Den Abell, 1980; Vallar et al., 2003), some patients with left-sided lesions tend to neglect the ends of words (i.e., the right side, in European languages), even when the word appears rotated 180 degrees or is written backward or in mirror fashion (Caramazza & Hillis, 1990). Such errors occurring for a type of stimulus (e.g., words) in an object-centered or word-centered frame of reference imply (1) a spatial representation of the object’s parts (e.g., of the letters, from left to right, for words) and (2) that damage can specifically affect how one reference frame is transformed into another. As discussed earlier, the parietal cortex contains neurons sensitive to all combinations of eye position and target location.
Consequently, a variety of reference frame transformations are possible because any function over that input space can be created with appropriate combinations of neurons (Pouget & Sejnowski, 1997; Pouget & Snyder, 2000). That is, sensory information can be recoded into a flexible intermediate representation to facilitate the transformation into a motor command. In fact, regions of the parietal lobes where cells represent space in eye-centered coordinates may not form any single spatial coordinate system but rather carry the raw information necessary for other brain areas to construct other spatial coordinate systems (Andersen & Buneo, 2002; Chafee et al., 2007; Colby & Goldberg, 1999; Graziano & Gross, 1998; Olson, 2001, 2003; Olson & Gettner, 1995).
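The dissociation between body-, head-, and eye-centered frames in the example of the patient whose head is turned to the right while gaze is directed to the left can be made concrete with simple arithmetic. The following is a minimal sketch under strong assumptions: positions are reduced to a single horizontal angle (positive values to the right), rotations simply subtract, and the angles chosen are purely illustrative. The function names are hypothetical and do not correspond to any model in the cited literature.

```python
def to_head_frame(azimuth_body, head_on_body):
    """Azimuth of a target relative to the head, given head rotation on the body."""
    return azimuth_body - head_on_body

def to_eye_frame(azimuth_head, eye_in_head):
    """Azimuth of a target relative to the line of gaze, given eye rotation in the head."""
    return azimuth_head - eye_in_head

def side(azimuth):
    """Label an azimuth (deg, positive = right) as left, right, or straight ahead."""
    if azimuth < 0:
        return "left"
    if azimuth > 0:
        return "right"
    return "straight ahead"

target_body = 0.0    # person directly in front of the patient's trunk
head_on_body = 20.0  # head turned 20 deg to the right (illustrative value)
eye_in_head = -40.0  # eyes rotated 40 deg to the left in the head (illustrative value)

target_head = to_head_frame(target_body, head_on_body)  # target relative to the head
target_eye = to_eye_frame(target_head, eye_in_head)     # target relative to gaze

print("body frame:", side(target_body))
print("head frame:", side(target_head))
print("eye frame: ", side(target_eye))
```

The same target is thus straight ahead of the body, to the left of the head, and to the right of gaze, which is why “left neglect” has to be specified relative to a particular frame.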



Figure 3.4 Left, The eight conditions used to probe neglect in multiple reference frames: viewer centered, object centered, and extrapersonal. In condition A, the patient viewed the cubes on a table and “near” his body. In condition B, the patient viewed the cubes on a table and “far” from his body. In condition C, the patient viewed the cubes held by the experimenter while she sat “near” the patient, facing him. In condition D, the patient viewed the cubes held by the experimenter while she sat “far” away and facing the patient. In condition E, the patient viewed the cubes held by the experimenter while she sat “far” and turned her back to the patient. In condition F, the patient viewed the cubes in the “far” mirror while these were positioned on a “near” table. In condition G, the patient viewed the cubes in the “far” mirror while the experimenter facing him held the cubes in her hands. In condition H, the patient viewed the cubes in the “far” mirror while the experimenter turned her back to the patient and held the cubes in her hands. Note that in the last three conditions, the cubes are seen only in the mirror (in extrapersonal space) and not directly (in peripersonal space). Right, Results for conditions D and E, showing a worsening of performance when the target was held in the left hand of the experimenter and in left hemispace compared with the other combinations of object-centered and viewer-centered frames of reference. From Laeng, Brennen, et al., 2002. Reprinted with permission of Elsevier.

According to a notion of multiple coordinate systems, different forms of neglect will manifest depending on the specific portions of parietal or frontal cortex that are damaged. These will reflect a complex mixture of various coordinate frames. Thus, if a lesion of the parietal lobe causes a deficit in a distributed code of locations that can be read out in a variety of reference frames (Andersen & Buneo, 2002), neglect behavior will emerge in the successive visual transformations (Driver & Pouget, 2000). It may also be manifested within an object-centered reference frame (Behrmann & Moscovitch, 1994). Indeed, neglect patients can show various mixtures and dissociations between the reference frames; thus, some patients show both object-centered and viewer-centered neglect (Behrmann & Tipper, 1999), but other patients show neglect in just one of these frames (Hillis & Caramazza, 1991; Tipper & Behrmann, 1996).

For example, Laeng and colleagues (2002) asked a neglect patient to report the colors of two objects (cubes) that could either lie on a table positioned near or far from the patient or be held in the left and right hands of the experimenter. In the latter case, the experimenter either faced the patient or turned backward so that the cubes held in her hands could lie in either the left or right hemispace (Figure 3.4). Thus, the cubes’ position in space was also determined by the experimenter’s (p. 39) body position (i.e., they could be described according to an external body’s object-centered frame). Moreover, by use of a mirror, the cubes could be seen in the mirror far away, although they were “near” the patient’s body, so that the patient actually looked at a “far” location (i.e., the surface of the mirror) to see the physically near object. The experiment confirmed the presence of all forms of neglect. Not only did the patient name the color of a cube seen in his left hemispace more slowly than in his right hemispace, but also latencies increased for a cube held by the experimenter in her left hand and in the patient’s left hemispace (whether the left hand was seen directly or as a mirror reflection). Finally, the patient’s performance was worse for “far” than “near” locations. He neglected cubes located near his body (i.e., within “grasping” space) but seen in the mirror, thus dissociating the direction of gaze (toward extrapersonal space) from the physical location of the object (in peripersonal space).
In most accounts of spatial attention, shifting occurs within coordinate frames that can be defined by a portion (or side) of a spatial plane that is orthogonally transected by some egocentric axis (Bisiach et al., 1985). However, together with the classic form of neglect for stimuli or features to the left of the body’s (or an object’s) midline, neglect can also occur below or above the horizontal plane or in the lower (Butter et al., 1989) versus the upper visual field (Shelton et al., 1990). In addition, several neglect behaviors would seem to occur in spatial frames of reference that are best defined by vectors (p. 40) (Kinsbourne, 1993) or polar coordinates (Halligan & Marshall, 1995), so that either there is no abrupt boundary for the deficit to occur or the neglected areas are best described by annular regions of space around the patient’s body (e.g., grasping or near, peripersonal, space). Neurological studies have identified patients with more severe neglect for stimuli within near or reaching space than for stimuli confined beyond the peripersonal region in far, extrapersonal, space (Halligan & Marshall, 1991; Laeng et al., 2002) as well as patients with the opposite pattern of deficit (Cowey et al., 1994; Mennemeier et al., 1992). These findings appear consistent with the evidence from neurophysiology studies in monkeys (e.g., Graziano et al., 1994), where spatial position can be defined within a bounded region of space anchored to the head or arm. Moreover, a dysfunction within patients’ inferior parietal regions is most likely to result in neglect occurring in an “egocentric” spatial frame of reference (i.e., closely related to action control within personal space), whereas dysfunction within the superior temporal region is most likely to result in “allocentric” neglect occurring in a spatial frame of reference centered on the visible objects in extrapersonal space (Committeri et al., 2004, 2007; Hillis, 2006).

Patients with right parietal lesions also have difficulties exploring “virtual space” (i.e., locating objects within their own mental images). For example, patients with left-sided neglect are unable to describe from their visual memory left-sided buildings in a city scene (“left” being based on their imagined position within a city’s square; Beschin et al., 2000; Bisiach & Luzzatti, 1978). Such patients may also be unable to spell the beginning of words (i.e., unable to read the left side of the word from an imaginary display; Baxter & Warrington, 1983). However, patients with neglect specific to words (or “neglect dyslexia”) after a left-hemisphere lesion can show a spelling deficit for the ends of words (Caramazza & Hillis, 1990).

Neuropsychological findings also have shown that not only lesions of the inferior parietal lobe but also those of the frontal lobe and the temporal-parietal-occipital junction lead to unilateral neglect. Remarkably, damage to the rostral part of the superior temporal cortex (of the right hemisphere) results in profound spatial neglect in humans and monkeys (Karnath et al., 2001), characterized by a lack of awareness for objects in the left hemispace. Because the homologous area of the left hemisphere is specialized for language in humans, this may have preempted the spatial function of the left superior temporal cortex, causing a right-sided dominance for space-related information (Weintraub & Mesulam, 1987). One possibility is that the right-sided superior temporal cortex plays an integrative role with regard to the ventral and dorsal streams (Karnath, 2001) because the superior temporal gyrus is adjacent to the inferior areas of the dorsal system and receives input from both streams and is therefore a site for multimodal sensory convergence (Seltzer & Pandya, 1978).
However, none of these individual areas should be in terpreted as the “seat” of the conscious perception of spatially situated objects. In fact, no cortical area alone may be sufficient for visual awareness (Koch, 2004; Lamme et al., 2000). Most likely, a conscious percept is the expression of a distributed neural network and not of any neural bottleneck. That is, a conscious percept is the gradual product of recurrent and interacting neural activity from several reciprocally interconnected regions and streams (Lamme, 2003, 2006). Nevertheless, the selective injury of a convergence zone, like the superior temporal lobe, could disrupt representations that are necessary (but not sufficient) to spatial awareness. Interestingly, patients with subcortical lesions and without detectable damage of either temporal or parietal cortex also show neglect symptoms. However, blood perfusion mea surements in these patients reveal that the inferior parietal lobe is hypoperfused and therefore dysfunctional (Hillis et al., 2005). Similarly, damage to the temporoparietal junction, an area neighboring both the ventral and dorsal systems, produces abnormal correlation of the resting state signal between left and right inferior parietal lobes, which are not directly damaged; this abnormality correlates with the severity of neglect (Corbet ta et al., 2008; He et al., 2007). Therefore, the “functional lesion” underlying neglect may include a more extensive area than what is revealed by structural magnetic resonance, by disrupting underlying association or recurrent circuits (e.g., parietal-frontal pathways; Thiebaut de Schotten et al., 2005).



6. Lateralization of Spatial Representations

Differential functional specializations of the two sides of the brain are already present in early vertebrates (Sovrano et al., 2005; Vallortigara & Rogers, 2005), suggesting that lateralization may be the expression of a strategy of division of labor that evolved millions of years before the appearance of the human species. In several species, the right brain appears to be specialized for vigilance (p. 41) and recognition of novel or surprising stimuli. For example, birds appear more adept at gathering food or catching prey seen with the right eye (i.e., left brain) than with the left eye (i.e., right brain). Such a segregation of functions would seem at first glance not so adaptive because it may put the animal at great risk (by making its behavior predictable to both prey and predators). An evolutionary account that can explain this apparently nonadaptive brain organization is based on the hypothesis that a complementary lateralization makes the animal superior in performing several tasks at the same time (Vallortigara et al., 2001), counteracting the ecological disadvantages of lateral bias. Evidence indicates that birds that are strongly lateralized are more efficient at parallel processing than birds of the same species that are weakly lateralized (Rogers et al., 2004). Thus, a complementary lateral specialization would seem to make the animals apt to attend to two domains simultaneously.

There is a clear analogy between this evolutionarily adaptive division of labor between the vertebrate cerebral hemispheres and the performance of artificial neural networks that segregate processing to multiple, smaller subsystems (Otto et al., 1992; Rueckl, Cave, & Kosslyn, 1989). Most relevant, this principle of division of labor has also been applied to the modularization of function for types of spatial representations.
Specifically, Kosslyn (1987) proposed the existence of two neural subnetworks within the dorsal stream that process qualitatively different types of spatial information. One spatial repre sentation is based on a quantitative parsing of space and therefore closely related to that of spatial information in the service of action. This type of representation is called coordi nate (Kosslyn, 1987) because it is derived from representations that provide coordinates for navigating into the environment as well as for performing targeted actions such as reaching, grasping, hitting, throwing, and pointing to something. In contrast, the other hypothesized type of spatial representation, labeled categorical spatial relation, parses space in a qualitative manner. For example, two configurations can be described as “one to the left of the other.” Thus, qualitative spatial relations are based on the perception of spatial categories, where an object (but also an empty place) is assigned to a broad equiv alence class of spatial positions (e.g., if a briefcase can be on the floor, and being “on the floor” is satisfied by being placed on any of the particular tiles that make up the whole floor). Each of the two proposed separate networks would be complementarily lateralized. Thus, the brain can represent in parallel the same spatial layout in at least two separate man ners (Laeng et al., 2003): a right-hemisphere mode that assesses spatial “analog” spatial relations (e.g., the distance between two objects) and a left-hemisphere mode that assess es “digital” spatial relations (e.g., whether two objects are attached to one another or above or below the other). The underlying assumption in the above account is that com Page 20 of 59

Representation of Spatial Relations puting separately the two spatial relations (instead of, e.g., taking the quantitative repre sentation and making it coarser by grouping the finer locations) could result in a more ef ficient representation of space, where both properties can be attended simultaneously. Artificial neural network simulations of these spatial judgments provide support for more efficient processing in “split” networks than unitary networks (Jacobs & Kosslyn, 1994; Kosslyn, Chabris, et al., 1992; Kosslyn & Jacobs, 1994). These studies have shown that, when trained to make either digital or analog spatial judgments, the networks encode more effectively each relation if their input is based, respectively, on units with relatively small, nonoverlapping receptive fields, as opposed to units with relatively large, overlap ping receptive fields (Jacobs & Kosslyn, 1994). Overlap of location detectors would then promote the representation of distance, based on a “coarse coding” strategy (Ballard, 1986, Eurich & Schwegler, 1997; Fahle & Poggio, 1981; Hinton, McClelland, & Rumel hart, 1986). In contrast, the absence of overlap between location detectors benefits the representation of digital or categorical spatial relations, by effectively parsing space. Consistent with the above computational account, Laeng, Okubo, Saneyoshi, and Michi mata (2011) observed that spreading the attention window to encompass an area that in cludes two objects or narrowing it to encompass an area that includes only one of the ob jects can modulate the ability to represent each type of spatial relation. In this study, the spatial attention window was manipulated to select regions of differing areas by use of cues of differing sizes that preceded the presentation of pairs of stimuli. The main as sumption was that larger cues would encourage a more diffused attention allocation, whereas the small cues would encourage a more focused mode of attention. 
When the attention window was large (by cueing an area that included both objects as well as the empty space between them), spatial transformations of distance between two objects were noticed faster than when the attention window was relatively smaller (i.e., when cueing an area that included no more than one of the objects in the pair). Laeng and colleagues concluded that a relatively larger attention window would facilitate the processing of an increased number of overlapping spatial detectors so as to include (and thus “measure”) the empty space in between or the spatial extent of each form (when judging, e.g., size or volume). In contrast, smaller nonoverlapping spatial detectors would facilitate parsing space into discrete bins or regions and, therefore, the processing of categorical spatial transformations; indeed, left–right and above–below were noticed faster in the relatively smaller cueing condition than in the larger (see also Okubo et al., 2010).

The theoretical distinction between analog and digital spatial functions is relatively recent, but early neurological investigations had already noticed that some spatial functions (e.g., distinguishing left from right) are commonly impaired after damage to the posterior parietal cortex of the left hemisphere, whereas impairment of other spatial functions, like judging an object’s orientation or exact position, is typical after damage to the same area in the opposite, right, hemisphere (Luria, 1973). The fact that different forms of spatial dysfunctions can occur independently for each hemisphere has been repeatedly confirmed by studies of patients with unilateral lesions (Laeng, 1994, 2006; Palermo et al., 2008) as well as by neuroimaging studies of normal individuals (Baciu et al., 1999; Kosslyn et al., 1998; Slotnick & Moo, 2006; Trojano et al., 2002). Complementary results have been obtained with behavioral methods that use the lateralized (and tachistoscopic) presentation of visual stimuli to normal participants (e.g., Banich & Federmeier, 1999; Bruyer et al., 1997; Bullens & Postma, 2008; Hellige & Michimata, 1989; Kosslyn, 1987; Kosslyn et al., 1989, 1995; Laeng et al., 1997; Laeng & Peters, 1995; Roth & Hellige, 1998; Rybash & Hoyer, 1992).

Laeng (1994, 2006) showed a double dissociation between failures to notice changes in categorical spatial relations and changes in coordinate spatial relations. A group of patients with unilateral damage to the right hemisphere had difficulty noticing a change in distance or angle between two figures of animals presented successively. The same patients had less difficulty noticing a change of relative orientation (e.g., left vs. right or above vs. below) between the same animals. In contrast, the patients with left-hemisphere damage had considerably less difficulty noticing that the distance between the two animals had either increased or decreased. In another study (Laeng, 2006), similar groups of patients with unilateral lesions made corresponding errors in spatial construction tasks from memory (e.g., building patterns made of matchsticks; relocating figures of animals on a cardboard). Distortions in reproducing the angle between two elements and inaccuracies in relocating the objects in their original positions were more common after damage to the right hemisphere (see also Kessels et al., 2002), whereas mirror reversals of elements of a pattern were more common after damage to the left hemisphere. A study by Palermo and colleagues (2008) showed that patients with damage confined to the left hemisphere had difficulty visually imaging whether a dot shown in a specific position would fall inside or outside of a previously seen circle.
These patients were relatively better in visually imaging whether a dot shown in a specific position would be nearer to or farther from the circle’s circumference than another dot previously seen together with the same circle. The opposite pattern of deficit was observed in the patients with right-hemisphere damage. Another study with patients by Amorapanth, Widick, and Chatterjee (2010) showed that lesions to a network of areas in the left hemisphere resulted in more severe impairment in judging categorical spatial relations (i.e., the above–below relations between pairs of objects) than lesions to homologous areas of the right hemisphere. Also in this study, the reverse pattern of impairment was observed for coordinate spatial processing, where right-hemisphere damage produced a more severe deficit than left-hemisphere damage.
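The two kinds of judgment in the dot-and-circle imagery task can be stated operationally. The sketch below only illustrates the categorical/coordinate distinction under assumed two-dimensional Cartesian coordinates; the function names and example values are hypothetical and are not taken from Palermo et al. (2008).

```python
import math

def is_inside(dot, center, radius):
    """Categorical judgment: the dot falls in one of two classes, inside or outside."""
    return math.dist(dot, center) < radius

def distance_to_circumference(dot, center, radius):
    """Coordinate quantity: metric distance from the dot to the circle's edge."""
    return abs(math.dist(dot, center) - radius)

def nearer_to_circumference(dot_a, dot_b, center, radius):
    """Coordinate judgment: is dot_a nearer to the circumference than dot_b?"""
    return (distance_to_circumference(dot_a, center, radius)
            < distance_to_circumference(dot_b, center, radius))

if __name__ == "__main__":
    center, radius = (0.0, 0.0), 2.0
    first, second = (1.0, 0.0), (1.5, 0.0)
    # Same categorical relation (both inside), different coordinate relation.
    print(is_inside(first, center, radius), is_inside(second, center, radius))
    print(nearer_to_circumference(second, first, center, radius))
```

The example makes the dissociation explicit: two dots can share the same categorical relation to the circle while differing in their metric relation to it, so a system that computes only one of the two quantities cannot answer the other question.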


Figure 3.5 Spatial memories for “coordinate” relations showed increased activity in the right hemisphere’s prefrontal cortex, whereas memories for “categorical” relations showed increased activity in the left hemisphere’s prefrontal cortex. From Slotnick & Moo, 2006. Reprinted with permission from Elsevier.

The above evidence with patients is generally consistent with that from studies with healthy participants, in particular studies using the lateralized tachistoscopic method. In these studies, the relative advantages in speed of response to stimuli presented either to the left or right of fixation indicated superiority of the right hemisphere (i.e., left visual field) for analog judgments and of the left hemisphere (i.e., right visual field) for digital judgments. However, in studies with healthy subjects, the lateral differences appear to be small (i.e., in the order of a few tens of milliseconds according to a meta-analysis; Laeng et al., 2003). Nevertheless, small effect sizes identified with such a noninvasive method are actually greater than effect sizes in percent blood oxygenation observed with fMRI. Most important, both behavioral effects can predict very dramatic outcomes after damage to the same region or side of the brain. Another method, whereby the same cortical sites can be temporarily and reversibly deactivated (i.e., transcranial magnetic stimulation [TMS]), provides converging evidence. Left-sided stimulation can effectively mimic the deficit in categorical perception after left-hemisphere damage, whereas right-sided stimulation mimics the deficit in coordinate space perception after right-hemisphere damage (Slotnick et al., 2001; Trojano et al., 2006).

A common finding from studies using methods with localizing power (e.g., neuroimaging, TMS, and selected patients) is that both parietal lobes play a key role in supporting the perception of spatial relations (e.g., Amorapanth et al., 2010; Baciu et al., 1999; Kosslyn et al., 1998; Laeng et al., 2002; Trojano et al., 2006).
Moreover, areas of the left and right prefrontal cortex that receive direct input from ipsilateral parietal areas also show activity when categorical or coordinate spatial information, respectively, is held in memory (Kosslyn, Thompson, et al., 1998; Trojano et al., 2002). In an fMRI study (Slotnick & Moo, 2006), participants viewed in each trial a configuration consisting of a shape and a dot placed at a variable distance from the shape (either “on” or “off” the shape and, in the latter case, either “near” or “far” from the shape). In the subsequent retrieval task, the shape was presented without the dot, and participants responded to queries about the previously seen spatial layout (e.g., either about a categorical spatial relation property: “was the dot ‘on’ or ‘off’ the shape?”; or about a coordinate spatial relation property: “was the dot ‘near’ to or ‘far’ from the shape?”). Spatial memories for coordinate relations were accompanied by increased activity in the right hemisphere’s prefrontal cortex, whereas memories for categorical relations were accompanied by activity in the left hemisphere’s prefrontal cortex (see Figure 3.5).

One should note that the above studies on the perception of categorical and coordinate relations do not typically involve any specific action in space, but instead involve only observational judgments (e.g., noticing or remembering the position of objects in a display). Indeed, a child’s initial cognition of space and of objects’ numerical identity may be entirely based on a purely observational representation of space whereby the child notices that entities preserve their identities and trajectories when they disappear behind other objects and reappear within gaps of empty space (Dehaene & Changeux, 1993; Xu & Carey, 1996). The above findings from neuroscience studies clearly point to a role of the dorsal system in representing spatial information beyond the mere service of action (cf. Milner & Goodale, 2008). Thus, the evidence from categorical and coordinate spatial processing, together with the literature on other spatial transformations or operations (e.g., mental rotations of shapes, visual maze solving), clearly indicates that a parietal-frontal system supports not merely the “act” function but also two other central functions of visual-spatial representations: to “know” and “talk.”

The latter, symbolic function would seem of particular relevance to our species and the only one that we do not share with other living beings (except, perhaps, honeybees; Kirchner & Braun, 1994; Menzel et al., 2000). That is, humans can put into words or verbal propositions (as well as into gestures) any type of spatial relations, whether quantitative (by use of numerical systems and geometric systems specifying angles and eccentricities) or qualitative (by use of prepositions and locutions). However, quantitative propositions may require measurement with tools, whereas establishing qualitative spatial relations between objects would seem to require merely looking at them (Ullman, 1984). If abstract spatial relations between objects in a visual scene can be effortlessly perceived, these representations are particularly apt to be efficiently coded in a propositional manner (e.g., “on top of”). The latter linguistic property would seem pervasive in all languages of the world and also pervade daily conversations. Some “localist” linguists have proposed that the deep semantic structure of language is intrinsically spatial (Cook, 1989). Some cognitive neuroscientists have also suggested that language in our species may have originated precisely from the need to transmit information about the spatial layout of an area from one person to another (O’Keefe, 2003; O’Keefe & Nadel, 1978).

Locative prepositions are often used to refer to different spatial relations in a quick and frugal manner (e.g., “above,” “alongside,” “around,” “behind,” “between,” “inside,” “left,” “on top of,” “opposite,” “south,” “toward,” “underneath”); their grammatical class may exist in all languages (Jackendoff & Landau, 1992; Johnson-Laird, 2005; Kemmerer, 2006; Miller & Johnson-Laird, 1976; Pinker, 2007). Clearly, spatial prepositions embedded in sentences (e.g., the man is “in” the house) can express spatial relations only in a rather abstract manner (compared, for example, with how GPS coordinates can pinpoint space) and can guide actions and navigation only in a very coarse sense (e.g., by narrowing down an area of search). Locative prepositions resemble categorical spatial relations in that they express spatial relationships in terms of sketchy or schematic structural properties of the objects, often ignoring details of spatial metrics (e.g., size, orientation, distance; Talmy, 2000). Nevertheless, the abstract relations of locative prepositions seem extremely useful to our species because they can become the referents of vital communication. Moreover, categorical spatial representations and their verbal expression counterparts may underlie the conceptual structure of several other useful representations (Miller & Johnson-Laird, 1976), like the representations of time and of numerical entities (Hubbard et al., 2005). Indeed, categorical spatial representations could provide the basic mental scaffolding for semantics (Cook, 1989; Jeannerod & Jacob, 2005), metaphors (Lakoff & Johnson, 1999), and reasoning in general (Goel et al., 1998; Johnson-Laird, 2005; Pinker, 1990).

O’Keefe (1996, 2003) has proposed that the primary function of locative prepositions is to identify a set of spatial vectors between places. The neural substrate supporting such a function would consist of a specific class of neurons or “place cells” within the right hippocampus and of cerebral structures interconnected with the hippocampus. Specifically, a combination of the receptive fields of several space cells would define boundaries of regions in space that effectively constitute the referential meaning of a preposition. For example, the preposition “below” would identify a “place field” with its center on the vertical direction vector from a reference object.
The width of such a place field would typically be larger than the width of the reference object but would taper with distance so as to form a tear-dropped region attached to the bottom surface of the reference object (see also Carlson et al., 2003; Hayward & Tarr, 1995).

Cognitive neuroscience studies have found neuroanatomical correlates of locative prepositions within the left inferior prefrontal and left inferior parietal regions (Friederici, 1982; Tranel & Kemmerer, 2004). Consistently, neuroimaging studies have found that naming spatial relationships with prepositions activated the same regions in healthy subjects (Carpenter et al., 1999b; Damasio et al., 2001). Similar results have been found with speakers of sign language (Emmorey et al., 2002). Kemmerer and Tranel (2000) found a double dissociation between linguistic representations and perceptual representations; that is, some patients had difficulties using locative prepositions but not making perceptual judgments, and other patients had the opposite problem. Laeng (1994) also noticed that patients who made errors in a matching-to-sample task with pictures differing in their categorical spatial relations were nonetheless able to follow the instructions of the Token Test (where the comprehension of locative prepositions is necessary; De Renzi & Vignolo, 1962). These findings indicate that the encoding of categorical spatial relations (and their loss after left-hemisphere damage) cannot be easily reduced to the mediation of semantic or verbal codes. In fact, the evidence suggests that, although perceptual representations may be crucial for establishing the meaning of locative prepositions (Hayward & Tarr, 1995), once these are learned, they can be supported and interpreted within the semantic network and also selectively disrupted by brain damage. The conceptual representation of locative prepositions also appears to be separated from other linguistic representations (e.g., action verbs, despite several of these verbs sharing with prepositions the conceptual domain of space) because brain damage can dissociate the meanings of these terms (Kemmerer & Tranel, 2003).

Although the evidence for the existence of a division of labor for analog versus digital spatial relations in humans is now clearly established, an analogous lateralization of brain function in nonhuman species remains unclear (Vauclair et al., 2006). Importantly, lateralization is under the influence of “opportunistic” processes of brain development that optimize the interaction of different subsystems within the cerebral architecture (Jacobs, 1997, 1999). Thus, in our species, local interactions with linguistic and semantic networks may play a key role in the manner in which the spatial system is organized. That is, biasing categorical spatial representations within a left hemisphere’s substrate by “yoking” them with linguistic processes may facilitate a joint operation between perception, language, and thought (Jacobs & Kosslyn, 1994; Kosslyn, 1987).

7. The “Where” of “What”: Spatial Information Within the Object

The conceptual distinctions of categorical and coordinate spatial relations also have strong similarities to different types of geometries (e.g., “topological” versus “Euclidean” or “affine” geometries). For example, inside–outside judgments are topological judgments. Piaget and Inhelder (1956) considered topological judgments as basic and suggested that children learn topological spatial concepts earlier than other types of spatial concepts, such as projective and “Euclidean-like” geometry; however, even infants are sensitive to metric qualities (Liben, 2009).

Research in neuroscience shows that topological judgments are accomplished by the parietal lobe (also in rats; Goodrich-Hunsaker et al., 2008); in humans, these judgments have a robust left-hemisphere advantage (Wang et al., 2007). As originally reasoned by Franco and Sperry (1977), given that we can represent multiple geometries (e.g., Euclidean, affine, projective, topological) and the right hemisphere’s spatial abilities are superior to those of the left (the prevalent view in the 1970s), the right hemisphere should match shapes by their geometrical properties better than the left hemisphere. They tested this idea with a group of (commissurotomized) “split-brain” patients in an intermodal (vision and touch) task. Five geometrical forms of the same type were examined visually, while one hand searched behind a curtain for one shape among three with the matching geometry. As expected, the split-brain patients’ left-hand performance was clearly superior to that of their right hand. This geometrical discrimination task required the perception of fine spatial properties of shapes (e.g., differences in slant and gradient of surfaces, angular values, presence of concavities or holes).
Thus, the superior performance of the left hand, which is controlled by the right hemisphere, reflects the use of the right hemisphere’s coordinate spatial relations system in solving a shape discrimination task that crucially depends on the fine metrics of the forms. Later investigations on split-brain patients showed that the left hand outperforms the right hand also when copying drawings from memory or in rearranging blocks of the WAIS-R Block Design Test (LeDoux, Wilson, & Gazzaniga, 1977). Specifically, LeDoux and Gazzaniga (1978) proposed that the right hemisphere possesses a perceptual capacity, which they called a manipulospatial subsystem, that is specifically dedicated to the analysis of space in the service of the organization of action or movement planning. Again, a left-hand (right-hemisphere) superiority in these patients’ constructions is consistent with a view that rearranging multiple items that are identical in shape (and share colors) may require a coordinate representation of the matrix of the design or array (Laeng, 2006).

A shape is intuitively a geometrical entity that occupies a volume of space. As such, a shape is nothing but the spatial arrangement of the points in space occupied by it. However, many real-world objects can be parsed into component elements or simpler shapes, and many objects differ in the locations of similar or identical constitutive elements (Biederman, 1987). Intuitively, it would seem that a multipart object (e.g., a bicycle) is nothing but the spatial relations among its parts. According to several accounts, an object can be represented as a structural description (Marr, 1982) or a representation of connections between parts (e.g., geons, in Biederman’s, 1987, recognition by components model). In these computational models, an object’s connections are conceived as abstract spatial specifications of how an object’s parts are put together. The resulting representations can differentially describe whole classes of similar objects (e.g., cups versus buckets). In this case, abstract, categorical, spatial relations (Hayward & Tarr, 1995; Hummel & Biederman, 1992) could provide the spatial ingredient of such structural descriptions.
Indeed, Kosslyn (1987) proposed that the left dorsal system’s categorical spatial representations can play a role in object recognition, by representing spatial relations among the object’s parts. In this account, shape properties are stored in a visual memory system within the inferior temporal lobe (Tanaka et al., 1991) as a nontopographical “population code” that ignores locations, whereas the dorsal system registers locations in a topographic map that ignores shape (e.g., the object can be represented here simply as a point and its location specified relative to other points or indices). Such a map of indices or spatial tokens could then represent the locations of objects in a scene or of parts of objects in space and form an “object map” (Kosslyn et al., 2006) or “skeletal image” (Kosslyn, 1980). This information can be used to reconstruct the image by positioning (back-propagating) each part representation in its correct location within the high-detail topographic maps of the occipital lobes. When reconstituting a mental image or, generally, in recollection (O’Regan & Noë, 2001), the set of locations retrieved from the “object map” could also be specified by the relation of parts to eye position during learning (Laeng & Teodorescu, 2002).

Based on the above account, one would expect that lesions of the dorsal system (in particular, of the left hemisphere) would result in object recognition problems. However, as already discussed, patients with parietal lesions do not present the dramatic object recognition deficits of patients with temporal lesions. Patients with unilateral lesions localized in the parietal lobe often appear to lack knowledge of the spatial orientation of objects, yet they appear to achieve normal object recognition (Turnbull et al., 1995, 1997). For example, they may fail to recognize the objects’ correct orientation or, when drawing from memory, they rotate shapes by 90 or 180 degrees. Most remarkably, patients with bilateral parietal lesions (i.e., with Bálint’s syndrome and simultanagnosia), despite being catastrophically impaired in their perception of spatial relations between separate objects, can recognize an individual object (albeit very slowly; Duncan et al., 2003) on the basis of its shape. Hence, we face something of a paradox: Object identity depends on spatial representations among parts (i.e., within-object relations), but damage to the dorsal spatial systems does not seem to affect object recognition (Farah, 1990). It may appear that representing spatial relations “within” and “without” shapes depends on different perceptual mechanisms.

However, there exists evidence that lesions in the dorsal system can cause specific types of object recognition deficits (e.g., Warrington, 1982; Warrington & Taylor, 1973). First of all, patients with Bálint’s syndrome do not have entirely normal object perception (Duncan et al., 2003; Friedman-Hill et al., 1995; Robertson et al., 1997). Specifically, patient R.M., with bilateral parieto-occipital lesions, is unable to judge both relative and absolute visual locations. Concomitantly, he makes mistakes in combining the colors and shapes of separate objects or the shape of an object with the size of another (i.e., the patient commits several “illusory conjunctions”). Thus, an inadequate spatial representation or loss of spatial awareness of the features of forms, due to damage to parietal areas, appears to underlie both the deficit in spatial judgment and that of binding shape features.
According to feature integration theory (Treisman, 1988), perceptual representations of two separate objects currently in view require integration of information in the dorsal and ventral system, so that each object’s specific combination of features in their proper locations can be obtained.

Additional evidence that spatial information plays a role in shape recognition derives from a study with lateralized stimuli (Laeng, Shah & Kosslyn, 1999). This study revealed a short-lived advantage for the left hemisphere (i.e., for stimuli presented tachistoscopically to the right visual field) in the recognition of pictures of contorted poses of animals. It was reasoned that nonrigid multipart objects (typically animal bodies but also some artifacts, e.g., a bicycle) can take a number of contortions that, combined with an unusual perspective, are likely to be novel or rarely experienced by the observer. In such cases, the visual system may opt to operate in a different mode from the usual matching of stored representations (i.e., bottom-up matching of global templates) and initiate a hypothesis-testing procedure (i.e., a top-down search for connected parts and a serial matching of these to stored structural descriptions). In the latter case, the retrieval of categorical spatial information (i.e., a hypothesized dorsal and left hemisphere’s function) seems to be crucial for recognition. Abstract spatial information about the connectivity of the object’s parts would facilitate the formation of a perceptual hypothesis and its verification by matching visible parts to the object’s memorized spatial configuration. In other words, an “object map” in the dorsal system specifies the spatial relations among the part representations of the complex pattern represented by the ventral system (Kosslyn, Ganis, & Thompson, 2006).

Figure 3.6 Stimuli used in an object recognition task with patients with unilateral posterior lesions. Patients with damage to the left hemisphere had greater difficulties with the noncanonical views of the nonrigid objects (animals) than those with damage to the right hemisphere, whereas those with damage to the right hemisphere had relatively greater difficulties with the noncanonical views of rigid objects. Reprinted with permission from Laeng et al., 2000.

A subsequent study (Laeng et al., 2000) of patients with unilateral posterior damage (mainly affecting the parietal lobe) confirmed that patients with left-hemisphere damage had greater difficulties in recognizing the contorted bodies of animals (i.e., the same images used in Laeng et al.’s, 1999, study) than those with right-hemisphere damage (Figure 3.6). However, left-hemisphere damage resulted in less difficulty than right-hemisphere damage when recognizing pictures of the same animals seen in conventional poses but from noncanonical (unusual) views as well as when recognizing rigid objects (artifacts) from noncanonical views. As originally shown in studies by Warrington (1982; Warrington & Taylor, 1973), patients with right parietal lesions showed difficulties in the recognition or matching of objects when viewed at unconventional perspectives or in the presence of strong shadows. According to Marr (1982), these findings suggested that the patients’ difficulties reflect the inability to transform or align an internal spatial frame of reference centered on the object’s intrinsic coordinates (i.e., its axes of elongation) to match the perceived image.

To conclude, the dorsal system plays a role in object recognition but as an optional resource (Warrington & James, 1988) by cooperating with the ventral system during challenging visual situations (e.g., novel contortions of flexible objects or very unconventional views or difficult shape-from-shadows discriminations; Warrington & James, 1986) or when making fine judgments about the shapes of objects that differ by subtle variations in size or orientation (Aguirre & D’Esposito, 1997; Faillenot et al., 1997). In ordinary circumstances, different from these “visual problem solving” situations (Farah, 1990), a spatial analysis provided by the dorsal system seems neither necessary nor sufficient to achieve object recognition.

8. Cognitive Maps

As thinking agents, we accumulate in our lifetime a spatial understanding of our surrounding physical world. Also, we can remember and think about spatial relations either in the immediate, visible, physical environment or in the invisible environments of a large geographic scale. We can also manipulate virtual objects in a virtual space and imagined geometry (Aflalo & Graziano, 2008). As a communicative species, we can transfer knowledge about physical space to others through symbolic systems like language and geographical maps (Liben, 2009). Finally, as a social species, we tend to organize space into territories and safety zones, and to develop a sense of personal place.

A great deal of our daily behavior must be based on spatial decisions and choices between routes, paths, and trajectories. A type of spatial representation, called the cognitive map, appears to be concerned with the knowledge of large-scale space (Cheng, 1986; Kosslyn et al., 1974; Kuipers, 1978; Tolman, 1948; Wolbers & Hegarty, 2010). A distinction can be made between (1) a map-like representation, consisting of a spatial frame external to the navigating organism (this representation is made by the overall geometric shape of the environment [survey knowledge] and/or a set of spatial relationships between locales [landmarks and place]); and (2) an internal spatial frame that is based on egocentric cues generated by self-motion (route knowledge) and vestibular information (Shelton & McNamara, 2001). Cognitive maps may be based on categorical spatial information (often referred to as topological; e.g., Poucet, 1993), which affords a coarse representation of the connectivity of space and its overall arrangement, combined with coordinate (metric) information (e.g., information about angles and distances) of the large-scale environment.
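The idea that a cognitive map combines topological (categorical) connectivity with metric (coordinate) information can be illustrated with a small data structure. The sketch below is purely illustrative: the place names, coordinates, and connections are hypothetical, and breadth-first search is used only as a convenient way to find a route from connectivity alone; it is not a claim about how the brain computes routes.

```python
import math
from collections import deque

# Coordinate (metric) information: hypothetical (x, y) positions in arbitrary units.
POSITIONS = {"home": (0, 0), "bakery": (3, 0), "park": (3, 4), "school": (0, 4)}

# Categorical (topological) information: which places are directly connected.
CONNECTIONS = {
    "home": ["bakery", "school"],
    "bakery": ["home", "park"],
    "park": ["bakery", "school"],
    "school": ["home", "park"],
}

def find_route(start, goal):
    """Breadth-first search that uses connectivity only, ignoring all distances."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for neighbor in CONNECTIONS[path[-1]]:
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(path + [neighbor])
    return None

def straight_line_distance(a, b):
    """Metric relation between two places, available only from the coordinates."""
    return math.dist(POSITIONS[a], POSITIONS[b])

def route_length(path):
    """Total traveled distance along a route, combining both kinds of information."""
    return sum(straight_line_distance(a, b) for a, b in zip(path, path[1:]))

if __name__ == "__main__":
    path = find_route("home", "park")
    print(path, route_length(path), straight_line_distance("home", "park"))
```

Connectivity alone is enough to find a route between two places, but estimating how long the route is, or how far apart the two places are as the crow flies, requires the metric layer.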
Navigation (via path integration or dead reckoning or via the more flexible map-like representation) and environmental knowledge can be disrupted by damage to a variety of brain regions. Parietal lesions result in difficulties when navigating in immediate space (DiMattia & Kesner, 1988; Stark et al., 1996) and can degrade topographic knowledge of the environment (Newcombe & Russell, 1969; Takahashi et al., 1997). However, areas supporting cognitive map representations in several vertebrate species appear to involve portions of the hippocampus and surrounding areas (Wilson & McNaughton, 1993). As originally revealed by single-cell recording studies in rats (O’Keefe, 1976; O’Keefe & Nadel, 1978) and later in primates (O’Keefe et al., 1998; Ludvig et al., 2004) and also humans (Ekstrom et al., 2003), some hippocampal cells can provide a spatial map-like representation within a reference frame fixed onto the external environment. For example, some of these cells have visual receptive fields that do not move with the position of the animal or with changes in viewpoint but instead fire whenever the animal (e.g., the monkey; Rolls et al., 1989; Fyhn et al., 2004) is in a certain place in the local environment. Thus, these cells can play an important functional role as part of a navigational system (Lenck-Santini et al., 2001). Reactivation of place cells has also been observed during sleep episodes in rats (Wilson & McNaughton, 1994), which can be interpreted as an offline consolidation process of spatial memories. Reactivation of whole past sequences of place cell activity has been recorded in rats during maze navigation whenever they stop at a turning point (Foster & Wilson, 2006); in some cases, place cell discharges can indicate future locations along the path (Johnson & Redish, 2007) before the animals choose between alternative trajectories. Another type of cell (Figure 3.7) has been found in the rat entorhinal cortex (adjacent to the hippocampus). These cells present tessellating firing fields or “grids” (Hafting et al., 2005; Solstad et al., 2008) that could provide the elements of a spatial map based on path integration (Kjelstrup et al., 2008; Moser et al., 2008) and thus complement the function of the place cells.
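Path integration (dead reckoning) and place-specific firing can be sketched in a few lines. The example below is a deliberately idealized illustration, not a model from the cited studies: it assumes noise-free self-motion signals given as (heading, distance) steps, and it stands in for a place cell with a hypothetical Gaussian firing field whose width and peak rate are arbitrary. Grid cells are not modeled.

```python
import math

def path_integrate(start, steps):
    """Dead reckoning: accumulate self-motion steps into a position estimate.

    Each step is (heading_in_radians, distance); no landmarks are consulted.
    """
    x, y = start
    for heading, distance in steps:
        x += distance * math.cos(heading)
        y += distance * math.sin(heading)
    return x, y

def homing_vector(start, position):
    """Direct vector from the current position estimate back to the start."""
    return start[0] - position[0], start[1] - position[1]

def place_cell_rate(position, field_center, field_width=0.5, peak_rate=20.0):
    """Idealized place field: firing depends only on where the animal is."""
    squared = ((position[0] - field_center[0]) ** 2
               + (position[1] - field_center[1]) ** 2)
    return peak_rate * math.exp(-squared / (2 * field_width ** 2))

if __name__ == "__main__":
    # Hypothetical outbound path: one unit east, one north, one west.
    steps = [(0.0, 1.0), (math.pi / 2, 1.0), (math.pi, 1.0)]
    here = path_integrate((0.0, 0.0), steps)
    print("estimated position:", here)
    print("homing vector:", homing_vector((0.0, 0.0), here))
    print("rate of a cell with its field at (0, 1):", place_cell_rate(here, (0.0, 1.0)))
```

Because the position estimate is built only from self-motion, any error in heading or distance would accumulate over the path in a noisy version of this sketch, which is one reason an externally anchored, map-like code is a useful complement.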

Figure 3.7 Neural firing of “place cells” and “grid cells” of rats while navigating in their cage environment. Reprinted with permission from Moser et al., 2008.

Ekstrom and colleagues (2003) recorded directly from hippocampal and parahippocampal cells of epileptic patients undergoing neurosurgery. The patients played a videogame (a taxi-driving game in which a player navigates within a virtual city) while neural activity was recorded simultaneously from multiple cells. A significant proportion of the recorded cells showed spiking properties identical to those of place cells already described for the rat’s hippocampus. Other cells were instead view responsive. They responded to the view of a specific landmark (e.g., the picture of a particular building) and were relatively more common in the patients’ parahippocampal region. Thus, these findings support an account of the human hippocampus as computing a flexible map-like representation of space by combining visual and spatial elements with a coarser representation of salient scenes, views, and landmarks formed in the parahippocampal region. In addition, neuroimaging studies in humans revealed activity in the hippocampal region during navigational memory tasks (e.g., in taxi drivers recalling routes; Grön et al., 2000; Maguire et al., 1997; Wolbers et al., 2007). Lesion studies of animals and neurological cases have demonstrated deficits after temporal lesions that include the hippocampus (Barrash et al., 2000; Kessels et al., 2001; Maguire et al., 1996). However, the hippocampus and the entorhinal cortex may not constitute necessary spatial structures for humans and for all types of navigational abilities; patients with lesions in these areas can maintain a path in mind and point to, or estimate the distance from, a starting point by keeping track of a reference location while moving (Shrager et al., 2008). In rats, the parietal cortex is also clearly involved in the processing of spatial information (Save & Poucet, 2000) and constitutes another important structure for navigation (Nitz, 2006; Rogers & Kesner, 2006). One hypothesis is that the parietal cortex is involved in combining visual-spatial information and self-motion information so that egocentrically acquired information can be relayed to the hippocampus to generate and update an allocentric representation of space. Based on the role of the human dorsal system in the computation of both categorical and coordinate types of spatial representations (Laeng et al., 2003), one would expect a strong interaction between processing in the hippocampal formation and in the posterior parietal cortex. The human parietal cortex could provide both coordinate (distance and angle) and categorical information (boundary conditions, connectivity, and topological information; Poucet, 1993) to the hippocampus. In turn, the hippocampus could combine the above spatial information with spatial scenes encoded by the parahippocampal area and ventral areas specialized for landscape object recognition (e.g., recognition of a specific building; Aguirre et al., 1998). In addition, language-based spatial information (Hermer & Spelke, 1996; Hermer-Vazquez et al., 2001) could play an active role for this navigational system. Neuroimaging studies with humans revealed activity in the parahippocampal cortex when healthy participants passively viewed an environment or large-scale scenes (Epstein & Kanwisher, 1998), including an empty room, as well as during navigational tasks (e.g., in virtual environments; Aguirre et al., 1996; Maguire et al., 1998, 1999).
Patients with damage in this area show problems in scene recognition and route learning (Aguirre & D’Esposito, 1999; Epstein et al., 2001). Subsequent research with patients and monkeys has clarified the involvement of the parahippocampal cortex in memorizing objects’ locations within a large-scale scene or room’s geometry (Bohbot et al., 1998; Malkova & Mishkin, 2003), more than in supporting navigation or place knowledge (Burgess & O’Keefe, 2003). One proposal is that, when remembering a place or scene, the parietal cortex, based on reciprocal connections, can also translate an allocentric (North, South, East, West) parahippocampal representation into an egocentric (left, right, ahead, behind) representation (Burgess, 2008). By this account, neglect in scene imagery (e.g., the Milan square’s neglect experiment of Bisiach & Luzzatti, 1978) after parietal lesions would result from an intact ventral allocentric representation of space (i.e., the whole square) along with damage to the parietal egocentric representation.
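The allocentric-to-egocentric translation described above can be pictured as an ordinary change of coordinates. The Python sketch below is only a geometric illustration under stated assumptions, not a claim about how parietal or parahippocampal circuits implement the mapping; the function name `allocentric_to_egocentric` and the convention that the x axis points East and the y axis points North are hypothetical choices made for this example.

```python
import math


def allocentric_to_egocentric(landmark, observer, heading):
    """Re-express a map location relative to the observer's body.

    Assumptions of this sketch: locations are (east, north) pairs in a
    fixed external frame, and heading is the facing direction in radians
    measured counterclockwise from East (so North is pi / 2).
    Returns (ahead, left): positive 'ahead' is in front of the observer,
    positive 'left' is on the observer's left-hand side.
    """
    dx = landmark[0] - observer[0]
    dy = landmark[1] - observer[1]
    ahead = dx * math.cos(heading) + dy * math.sin(heading)
    left = -dx * math.sin(heading) + dy * math.cos(heading)
    return ahead, left


# A building one unit East of the observer, viewed from two opposite headings.
building = (1.0, 0.0)
observer = (0.0, 0.0)
facing_north = allocentric_to_egocentric(building, observer, math.pi / 2)
facing_south = allocentric_to_egocentric(building, observer, -math.pi / 2)
```

With the observer facing North the building falls on the right (negative `left` value); with the observer facing South the same building falls on the left. The allocentric location never changes, only its egocentric description does, which is the pattern the account above relies on when the neglected side of an imagined square reverses with the imagined viewpoint.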

Conclusion
As humans, we “act” in space, “know” space, and “talk” about space: three functions that together would seem to require the whole human brain in order to be accomplished. Indeed, research on the human brain’s representation of spatial relations includes rather different traditions and theoretical backgrounds, which taken together provide us with a complex and rich picture of our cognition of space. Neuroscience has revealed (1) the existence of topographic maps in the brain or, in other words, the brain’s representation of space by the spatial organization of the brain itself. The visual world is then represented by two higher-order representational streams of the brain, anatomically located ventrally and dorsally, that make the basic distinction between (2) “what” is in the world and “where” it is. However, these two forms of information need to be integrated in other representations that specify (3) “how” an object can be acted on and “which” object is the current target. Additionally, the brain localizes objects according to multiple and parallel (4) spatial frames of reference that are also relevant to the manner in which spatial attention is deployed. After brain damage, attentional deficits (5) or neglect clearly reveal the relevance of allocating attention along different frames of reference. Although many of the reviewed functions are shared among humans and other animals, humans show a strong degree of (6) cerebral lateralization for spatial cognition, and the current evidence indicates complementary hemispheric specializations for digital (categorical) and for analog (coordinate) spatial information. The representation of categorical spatial relations is also relevant for (7) object recognition by specifying the spatial arrangement of parts within an object (i.e., the “where of what”). Humans, like other animals, can also represent space at a very large scale as (8) a cognitive map of the external environment, which is useful for navigation. The most striking finding of cognitive neuroscience is the considerable degree of functional specialization of the brain’s areas. Interestingly, the discovery that the visual brain separates visual information into two streams of processing (“what” versus “where”) does particular justice to Kant’s classic concept of space as a separate mode of knowledge (Moser et al., 2008).
In the Critique of Pure Reason, space was defined as what is left when one ignores all the attributes of a shape: “If we remove from our empirical concept of a body, one by one, every feature in it which is empirical, the color, the hardness or softness, the weight, even the impenetrability, there still remains the space which the body (now entirely vanished) occupied, and this cannot be removed” (Kant, 1787; 2008, p. 377).

Author Note
I am grateful for comments and suggestions on drafts of the chapter to Charlie Butter, Michael Peters, and Peter Svenonius. Please address correspondence to Bruno Laeng, Ph.D., Department of Psychology, University of Oslo, 1094 Blindern, 0317 Oslo, Norway; e-mail: [email protected].

References
Aflalo, T. N., & Graziano, M. S. A. (2006). Possible origins of the complex topographic organization of motor cortex: Reduction of a multidimensional space onto a two-dimensional array. Journal of Neuroscience, 26, 6288–6297.
Aflalo, T. N., & Graziano, M. S. A. (2008). Four-dimensional spatial reasoning in humans. Journal of Experimental Psychology: Human Perception and Performance, 34, 1066–1077.
Aglioti, S., Goodale, M. A., & DeSouza, J. F. X. (1995). Size-contrast illusions deceive the eye but not the hand. Current Biology, 5, 679–685.
Aguirre, G. K., & D’Esposito, M. (1997). Environmental knowledge is subserved by separable dorsal/ventral neural areas. Journal of Neuroscience, 17, 2512–2518.
Aguirre, G. K., & D’Esposito, M. (1999). Topographical disorientation: A synthesis and taxonomy. Brain, 122, 1613–1628.
Aguirre, G. K., Zarahn, E., & D’Esposito, M. (1998). An area within human ventral cortex sensitive to “building” stimuli: Evidence and implications. Neuron, 17, 373–383.
Alain, C., Arnott, S. R., Hevenor, S., Graham, S., & Grady, C. L. (2001). “What” and “where” in the human auditory system. Proceedings of the National Academy of Sciences U S A, 98, 12301–12306.
Alivisatos, B., & Petrides, M. (1997). Functional activation of the human brain during mental rotation. Neuropsychologia, 35, 111–118.
Amorapanth, P. X., Widick, P., & Chatterjee, A. (2010). The neural basis for spatial relations. Journal of Cognitive Neuroscience, 22, 1739–1753.
Andersen, R. A., & Buneo, C. A. (2002). Intentional maps in posterior parietal cortex. Annual Review of Neuroscience, 25, 189–220.
Andersen, R. A., Essick, G. K., & Siegel, R. M. (1985). The encoding of spatial location by posterior parietal neurons. Science, 230, 456–458.
Avillac, M., Deneve, S., Olivier, E., Pouget, A., & Duhamel, J. R. (2005). Reference frames for representing visual and tactile locations in parietal cortex. Nature Neuroscience, 8, 941–949.
Baars, B. J. (2002). The conscious access hypothesis: Origins and recent evidence. Trends in Cognitive Sciences, 6, 47–52.
Baciu, M., Koenig, O., Vernier, M.-P., Bedoin, N., Rubin, C., & Segebarth, C. (1999). Categorical and coordinate spatial relations: fMRI evidence for hemispheric specialization. Neuroreport, 10, 1373–1378.
Bálint, R. (1909). Seelenlähmung des “Schauens”, optische Ataxie, räumliche Störung der Aufmerksamkeit. European Neurology, 25 (1), 51–66, 67–81.
Ballard, D. H. (1986). Cortical connections and parallel processing: Structure and function. Behavioral and Brain Sciences, 9, 67–120.

Banich, M. T., & Federmeier, K. D. (1999). Categorical and metric spatial processes distinguished by task demands and practice. Journal of Cognitive Neuroscience, 11 (2), 153–166.
Barlow, H. (1981). Critical limiting factors in the design of the eye and visual cortex. Proceedings of the Royal Society of London, Biological Sciences, 212, 1–34.
Barrash, J., Damasio, H., Adolphs, R., & Tranel, D. (2000). The neuroanatomical correlates of route learning impairment. Neuropsychologia, 38, 820–836.
Battista, C., & Peters, M. (2010). Ecological aspects of mental rotation around the vertical and horizontal axis. Learning and Individual Differences, 31 (2), 110–113.
Baxter, D. M., & Warrington, E. K. (1983). Neglect dysgraphia. Journal of Neurology, Neurosurgery, and Psychiatry, 46, 1073–1078.
Behrmann, M., & Moscovitch, M. (1994). Object-centered neglect in patients with unilateral neglect: Effects of left-right coordinates of objects. Journal of Cognitive Neuroscience, 6, 1–16.
Behrmann, M., & Tipper, S. P. (1999). Attention accesses multiple reference frames: Evidence from visual neglect. Journal of Experimental Psychology: Human Perception and Performance, 25, 83–101.
Beschin, N., Basso, A., & Della Sala, S. (2000). Perceiving left and imagining right: Dissociation in neglect. Cortex, 36, 401–414.
Beschin, N., Cubelli, R., Della Sala, S., & Spinazzola, L. (1997). Left of what? The role of egocentric coordinates in neglect. Journal of Neurosurgery and Psychiatry, 63, 483–489.
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115–147.
Bisiach, E., Capitani, E., & Porta, E. (1985). Two basic properties of space representation in the brain: Evidence from unilateral neglect. Journal of Neurology, Neurosurgery, and Psychiatry, 48, 141–144.
Bisiach, E., & Luzzatti, C. (1978). Unilateral neglect of representational space. Cortex, 14, 129–133.
Block, N. (1996). How can we find the neural correlate of consciousness. Trends in Neurosciences, 19, 456–459.
Bohbot, V. D., Kalina, M., Stepankova, K., Spackova, N., Petrides, M., & Nadel, L. (1998). Spatial memory deficits in patients with lesions to the right hippocampal and the right parahippocampal cortex. Neuropsychologia, 36, 1217–1238.
Brandt, T., & Dietrich, M. (1999). The vestibular cortex: Its locations, functions and disorders. Annals of the New York Academy of Sciences, 871, 293–312.

Bruyer, R., Scailquin, J. C., & Coibon, P. (1997). Dissociation between categorical and coordinate spatial computations: Modulation by cerebral hemispheres, task properties, mode of response, and age. Brain and Cognition, 33, 245–277.
Buccino, G., Lui, F., Canessa, N., Patteri, I., Lagravinese, G., Benuzzi, F., Porro, C. A., & Rizzolatti, G. (2004). Neural circuits involved in the recognition of actions performed by nonconspecifics: An fMRI study. Journal of Cognitive Neuroscience, 16, 114–126.
Bullens, J., & Postma, A. (2008). The development of categorical and coordinate spatial relations. Cognitive Development, 23, 38–47.
Burgess, N. (2008). Spatial cognition and the brain. Annals of the New York Academy of Sciences, 1124, 77–97.
Burgess, N., & O’Keefe, J. (2003). Neural representations in human spatial memory. Trends in Cognitive Sciences, 7, 517–519.
Burnod, Y., Baraduc, P., Battaglia-Mayer, A., Guigon, E., Koechlin, E., Ferraina, S., Lacquaniti, F., & Caminiti, R. (1999). Parieto-frontal coding of reaching: An integrated framework. Experimental Brain Research, 129, 325–346.
Butter, C. M., Evans, J., Kirsh, N., & Kewman, D. (1989). Altitudinal neglect following traumatic brain injury: A case report. Cortex, 25, 135–146.
Butters, N., Barton, M., & Brody, B. A. (1970). Role of the right parietal lobe in the mediation of cross-modal associations and reversible operations in space. Cortex, 6, 174–190.
Caramazza, A., & Hillis, A. E. (1990). Spatial representation of words in the brain implied by the studies of a unilateral neglect patient. Nature, 346, 267–269.
Carey, D. P., Dijkerman, H. C., Murphy, K. J., Goodale, M. A., & Milner, A. D. (2006). Pointing to places and spaces in a patient with visual form agnosia. Neuropsychologia, 44, 1584–1594.
Carey, D. P., Harvey, M., & Milner, A. D. (1996). Visuomotor sensitivity for shape and orientation in a patient with visual form agnosia. Neuropsychologia, 3, 329–337.
Carlson, L., Regier, T., & Covey, E. (2003). Defining spatial relations: Reconciling axis and vector representations. In E. van der Zee & J. Slack (Eds.), Representing direction in language and space (pp. 111–131). Oxford, UK: Oxford University Press.
Carpenter, P. A., Just, M. A., Keller, T. A., Eddy, W., & Thulborn, K. (1999a). Graded functional activation in the visuospatial system with the amount of task demand. Journal of Cognitive Neuroscience, 11, 9–24.
Carpenter, P. A., Just, M. A., Keller, T. A., Eddy, W., & Thulborn, K. (1999b). Time-course of fMRI activation in language and spatial networks during sentence comprehension. NeuroImage, 10, 216–224.

Casati, R., & Varzi, A. (1999). Parts and places: The structures of spatial representation. Boston: MIT Press.
Castiello, U. (2005). The neuroscience of grasping. Nature Reviews: Neuroscience, 6, 726–736.
Cavanagh, P. (1998). Attention: Exporting vision to the mind. In S. Saida & P. Cavanagh (Eds.), Selection and integration of visual information (pp. 3–11). Tsukuba, Japan: STA & NIBH-T.
Chafee, M. V., Averbeck, B. B., & Crowe, D. A. (2007). Representing spatial relationships in posterior parietal cortex: Single neurons code object-referenced position. Cerebral Cortex, 17, 2914–2932.
Chafee, M. V., Crowe, D. A., Averbeck, B. B., & Georgopoulos, A. P. (2005). Neural correlates of spatial judgement during object construction in parietal cortex. Cerebral Cortex, 15, 1393–1413.
Chao, L. L., & Martin, A. (2000). Representation of manipulable man-made objects in the dorsal stream. NeuroImage, 12, 478–484.
Cheng, K. (1986). A purely geometric module in the rat’s spatial representation. Cognition, 23, 149–178.
Cherniak, C. (1990). The bounded brain: Toward a quantitative neuroanatomy. Journal of Cognitive Neuroscience, 2, 58–68.
Clément, G., & Reschke, M. F. (2008). Neuroscience in space. New York: Springer.
Colby, C. L., & Goldberg, M. E. (1999). Space and attention in parietal cortex. Annual Review of Neuroscience, 23, 319–349.
Collett, T. (1982). Do toads plan routes? A study of the detour behaviour of Bufo Viridis. Journal of Comparative Physiology, 146, 261–271.
Committeri, G., Galati, G., Paradis, A. L., Pizzamiglio, L., Berthoz, A., & LeBihan, D. (2004). Reference frames for spatial cognition: Different brain areas are involved in viewer-, object-, and landmark-centered judgments about object location. Journal of Cognitive Neuroscience, 16, 1517–1535.
Committeri, G., Pitzalis, S., Galati, G., Patria, F., Pelle, G., Sabatini, U., Castriota-Scanderbeg, A., Piccardi, L., Guariglia, C., & Pizzamiglio, L. (2007). Neural bases of personal and extrapersonal neglect in humans. Brain, 130, 431–441.
Constantinidis, C., & Steinmetz, M. A. (2001). Neuronal responses in Area 7a to multiple-stimulus displays: I. Neurons encode the location of the salient stimulus. Cerebral Cortex, 11, 581–591.
Cook, W. A. (1989). Case grammar theory. Washington, DC: Georgetown University Press.

Corbetta, M., Akbudak, E., Conturo, T. E., Snyder, A. Z., Ollinger, J. M., Drury, H. A., Linenweber, M. R., Petersen, S. E., Raichle, M. E., Van Essen, D. C., & Shulman, G. L. (1998). A common network of functional areas for attention and eye movements. Neuron, 21, 761–773.
Corbetta, M., Kincade, J. M., Ollinger, J. M., McAvoy, M. P., & Shulman, G. L. (2000). Voluntary orienting is dissociated from target detection in human posterior parietal cortex. Nature, 3, 292–297.
Corbetta, M., Miezin, F. M., Shulman, G. L., & Petersen, S. E. (1993). A PET study of visuospatial attention. Journal of Neuroscience, 13, 1202–1226.
Corbetta, M., Patel, G., & Shulman, G. L. (2008). The reorienting system of the human brain: From environment to theory of mind. Neuron, 58, 306–324.
Courtney, S. M., Ungerleider, L. G., Keil, K., & Haxby, J. V. (1997). Transient and sustained activity in a distributed neural system for human working memory. Nature, 386, 608–611.
Cowey, A., Small, M., & Ellis, S. (1994). Left visuo-spatial neglect can be worse in far than in near space. Neuropsychologia, 32, 1059–1066.
Crowe, D. A., Averbeck, B. B., Chafee, M. V., & Georgopoulos, A. P. (2005). Dynamics of parietal neural activity during spatial cognitive processing. Neuron, 47, 885–891.
Culham, J. C., Brandt, S. A., Cavanagh, P., Kanwisher, N. G., Dale, A. M., & Tootell, R. B. H. (1998). Cortical fMRI activation produced by attentive tracking of moving targets. Journal of Neurophysiology, 80, 2657–2670.
Culham, J. C., Cavanagh, P., & Kanwisher, N. G. (2001). Attention response functions: Characterizing brain areas using fMRI activation during parametric variations of attentional load. Neuron, 32, 737–745.
Culham, J. C., Danckert, S. L., DeSouza, J. F., Gati, J. S., Menon, R. S., & Goodale, M. A. (2003). Visually guided grasping produces fMRI activation in dorsal but not ventral stream brain areas. Experimental Brain Research, 153 (2), 180–189.
Culham, J. C., & Valyear, K. F. (2006). Human parietal cortex in action. Current Opinion in Neurobiology, 16 (2), 205–212.
Damasio, H., Grabowski, T. J., Tranel, D., Ponto, L. L. B., Hichwa, R. D., & Damasio, A. R. (2001). Neural correlates of naming actions and of naming spatial relations. NeuroImage, 13, 1053–1064.
Dehaene, S. (1997). The number sense. Oxford, UK: Oxford University Press.
Dehaene, S., & Changeux, J.-P. (1993). Development of elementary numerical abilities: A neuronal model. Journal of Cognitive Neuroscience, 5, 390–407.


Denys, K., Vanduffel, W., Fize, D., Nelissen, K., Peuskens, H., Van Essen, D., & Orban, G. A. (2004). The processing of visual shape in the cerebral cortex of human and nonhuman primates: A functional magnetic resonance imaging study. Journal of Neuroscience, 24, 2551–2565.
De Renzi, E. (1982). Disorders of space exploration and cognition. New York: John Wiley & Sons.
De Renzi, E., & Vignolo, L. (1962). The Token Test: A sensitive test to detect receptive disturbances in aphasics. Brain, 85, 665–678.
DeYoe, E. A., Carman, G. J., Bandettini, P., Glickman, S., Wieser, J., Cox, R., Miller, D., & Neitz, J. (1996). Mapping striate and extrastriate visual areas in human cerebral cortex. Proceedings of the National Academy of Sciences U S A, 93, 2382–2386.
DiMattia, B. V., & Kesner, R. P. (1988). Role of the posterior parietal association cortex in the processing of spatial event information. Behavioral Neuroscience, 102, 397–403.
Driver, J., & Pouget, A. (2000). Object-Centered Visual Neglect, or Relative Egocentric Neglect? Journal of Cognitive Neuroscience, 12 (3), 542–545.
Duhamel, J.-R., Bremmer, F., BenHamed, S., & Graf, W. (1997). Spatial invariance of visual receptive fields in parietal cortex neurons. Nature, 389, 845–848.
Duncan, J., Bundesen, C., Olson, A., Humphreys, G., Ward, R., Kyllingsbæk, S., van Raamsdonk, M., Rorden, R., & Chavda, S. (2003). Attentional functions in dorsal and ventral simultanagnosia. Cognitive Neuropsychology, 20, 675–701.
Ekstrom, A. D., Kahana, M. J., Caplan, J. B., Fields, T. A., Isham, E. A., Newman, E. L., & Fried, I. (2003). Cellular networks underlying human spatial navigation. Nature, 425, 184–187.
Emmorey, K., Damasio, H., McCullough, S., Grabowski, T., Ponto, L., Hichwa, R., & Bellugi, U. (2002). Neural systems underlying spatial language in American Sign Language. Neuroimage, 17, 812–824.
Engel, S. A., Glover, G. H., & Wandell, B. A. (1997). Retinotopic organization in human visual cortex and the spatial precision of functional MRI. Cerebral Cortex, 7, 181–192.
Engel, S. A., Rumelhart, D. E., Wandell, B. A., Lee, A. T., Glover, G. H., Chichilnisky, E. J., & Shadlen, M. N. (1994). fMRI of human visual cortex. Nature, 369, 525.
Epstein, R., DeYoe, E. A., Press, D. Z., Rosen, A. C., & Kanwisher, N. (2001). Neuropsychological evidence for a topographical learning mechanism in parahippocampal cortex. Cognitive Neuropsychology, 18, 481–508.
Epstein, R., & Kanwisher, N. (1998). A cortical representation of the local visual environment. Nature, 392 (6676), 598–601.

Eurich, C. W., & Schwegler, H. (1997). Coarse coding: Calculation of the resolution achieved by a population of large receptive field neurons. Biological Cybernetics, 76, 357–363.
Fahle, M., & Poggio, T. (1981). Visual hyperacuity: Spatiotemporal interpolation in human vision. Philosophical Transactions of the Royal Society of London: Series B, 213, 451–477.
Faillenot, I., Toni, I., Decety, J., Grégoire, M.-C., & Jeannerod, M. (1997). Visual pathways for object-oriented action and object recognition: Functional anatomy with PET. Cerebral Cortex, 7, 77–85.
Fang, F., & He, S. (2005). Cortical responses to invisible objects in the human dorsal and ventral pathways. Nature Neuroscience, 8, 1380–1385.
Farah, M. J. (1990). Visual agnosia: Disorders of object recognition and what they tell us about normal vision. Cambridge, MA: MIT Press.
Farrell, M. J., & Robertson, I. H. (2000). The automatic updating of egocentric spatial relationships and its impairment due to right posterior cortical lesions. Neuropsychologia, 38, 585–595.
Feldman, J. (1985). Four frames suffice: A provisional model of vision and space. Behavioral and Brain Sciences, 8, 265–289.
Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in primate cerebral cortex. Cerebral Cortex, 1, 1–47.
Fishman, R. S. (1997). Gordon Holmes, the cortical retina, and the wounds of war. Documenta Ophthalmologica, 93, 9–28.
Foster, D. J., & Wilson, M. A. (2006). Reverse replay of behavioural sequences in hippocampal place cells during the awake state. Nature, 440, 680–683.
Franco, L., & Sperry, R. W. (1977). Hemisphere lateralization for cognitive processing of geometry. Neuropsychologia, 75, 107–114.
Freedman, D. J., & Assad, J. A. (2006). Experience-dependent representation of visual categories in parietal cortex. Nature, 443, 85–88.
Friederici, A. D. (1982). Syntactic and semantic processes in aphasic deficits: The availability of prepositions. Brain and Language, 15, 249–258.
Friedman-Hill, S. R., Robertson, L. C., & Treisman, A. (1995). Parietal contributions to visual feature binding: Evidence from a patient with bilateral lesions. Science, 269, 853–855.
Fyhn, M., Molden, S., Witter, M. P., Moser, E. I., & Moser, M.-B. (2004). Spatial representation in the entorhinal cortex. Science, 305, 1258–1264.

Gallivan, J. P., Cavina-Pratesi, C., & Culham, J. C. (2009). Is that within reach? fMRI reveals that the human superior parieto-occipital cortex (SPOC) encodes objects reachable by the hand. Journal of Neuroscience, 29, 4381–4391.
Gattass, R., Nascimento-Silva, S., Soares, J. G. M., Lima, B., Jansen, A. K., Diogo, A. C. M., Farias, M. F., Marcondes, M., Botelho, E. P., Mariani, O. S., Azzi, J., & Fiorani, M. (2005). Cortical visual areas in monkeys: Location, topography, connections, columns, plasticity and cortical dynamics. Philosophical Transactions of the Royal Society, B, 360, 709–731.
Glickstein, M., Buchbinder, S., & May, J. L. (1998). Visual control of the arm, the wrist and the fingers: Pathways through the brain. Neuropsychologia, 36, 981–1001.
Goel, V., Gold, B., Kapur, S., & Houle, S. (1998). Neuroanatomical correlates of human reasoning. Journal of Cognitive Neuroscience, 10, 293–302.
Goldman-Rakic, P. S. (1987). Circuitry of primate prefrontal cortex and regulation of behavior by representational memory. Handbook of Physiology, 5, 373–417.
Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15, 20–25.
Goodale, M. A., Milner, A. D., Jakobson, L. S., & Carey, D. P. (1991). A neurological dissociation between perceiving objects and grasping them. Nature, 349, 154–156.
Goodrich-Hunsaker, N. J., Howard, B. P., Hunsaker, M. R., & Kesner, R. P. (2008). Human topological task adapted for rats: Spatial information processes of the parietal cortex. Neurobiology of Learning and Memory, 90, 389–394.
Graziano, M. S. A., & Cooke, D. F. (2006). Parieto-frontal interactions, personal space, and defensive behavior. Neuropsychologia, 44, 845–859.
Graziano, M. S. A., & Gross, C. G. (1995). Multiple representations of space in the brain. The Neuroscientist, 1, 43–50.
Graziano, M. S. A., & Gross, C. G. (1998). Spatial maps for the control of movement. Current Opinion in Neurobiology, 8, 195–201.
Graziano, M. S. A., Hu, X. T., & Gross, C. G. (1997). Coding the locations of objects in the dark. Science, 277, 239–241.
Graziano, M. S. A., Reiss, L. A. J., & Gross, C. G. (1999). A neuronal representation of the location of nearby sounds. Nature, 397, 428–430.
Graziano, M. S. A., Yap, G. S., & Gross, C. G. (1994). Coding of visual space by pre-motor neurons. Science, 226, 1054–1057.
Gregory, R. (2009). Seeing through illusions. Oxford, UK: Oxford University Press.


Grön, G., Wunderlich, A. P., Spitzer, M., Tomczak, R., & Riepe, M. W. (2000). Brain activation during human navigation: Gender-different neural networks as a substrate of performance. Nature Neuroscience, 3, 404–408.
Gross, C. G., & Graziano, M. S. (1995). Multiple representations of space in the brain. Neuroscientist, 1, 43–50.
Gross, C. G., & Mishkin, M. (1977). The neural basis of stimulus equivalence across retinal translation. In S. Harnad, R. Doty, J. Jaynes, L. Goldstein, and G. Krauthamer (Eds.), Lateralization in the nervous system (pp. 109–122). New York: Academic Press.
Hafting, T., Fyhn, M., Molden, S., Moser, M.-B., & Moser, E. I. (2005). Microstructure of a spatial map in the entorhinal cortex. Nature, 436, 801–806.
Halligan, P. W., & Marshall, J. C. (1991). Left neglect for near but not far space in man. Nature, 350, 498–500.
Halligan, P. W., & Marshall, J. C. (1995). Lateral and radial neglect as a function of spatial position: A case study. Neuropsychologia, 33, 1697–1702.
Harris, I. M., Egan, G. F., Sonkkila, C., Tochon-Danguy, H. J., Paxinos, G., & Watson, J. D. (2000). Selective right parietal lobe activation during mental rotation: A parametric PET study. Brain, 123, 65–73.
Harris, I. M., & Miniussi, C. (2003). Parietal lobe contribution to mental rotation demonstrated with rTMS. Journal of Cognitive Neuroscience, 15, 315–323.
Haxby, J. V., Grady, C. L., Horwitz, B., Ungerleider, L. G., Mishkin, M., Carson, R. E., et al. (1991). Dissociation of object and spatial visual processing pathways in human extrastriate cortex. Proceedings of the National Academy of Sciences U S A, 88, 1621–1625.
Hayward, W. G., & Tarr, M. J. (1995). Spatial language and spatial representation. Cognition, 55, 39–84.
He, B. J., Snyder, A. Z., Vincent, J. L., Epstein, A., Shulman, G. L., & Corbetta, M. (2007). Breakdown of functional connectivity in frontoparietal networks underlies behavioral deficits in spatial neglect. Neuron, 53, 905–918.
Heide, W., Blankenburg, M., Zimmermann, E., & Kompf, D. (1995). Cortical control of double-step saccades: Implications for spatial orientation. Annals of Neurology, 38, 739–748.
Heilman, K. M., Bowers, D., Coslett, H. B., Whelan, H., & Watson, R. T. (1985). Directional hypokinesia: Prolonged reaction times for leftward movements in patients with right hemisphere lesions and neglect. Neurology, Cleveland, 35, 855–859.
Heilman, K. M., & Van Den Abell, T. (1980). Right hemisphere dominance for attention: The mechanism underlying hemispheric asymmetries of inattention (neglect). Neurology, 30, 327–330.

Representation of Spatial Relations Hellige, J. B., & Michimata, C. (1989). Categorization versus distance: Hemispheric differ ences for processing spatial information. Memory & Cognition, 17, 770–776. Hermer, L., & Spelke, E. (1996). Modularity and development: The case of spatial reorien tation. Cognition, 61, 195–232. Hermer-Vazquez, L., Moffet, A., & Munkholm, P. (2001). Language, space, and the devel opment of cognitive flexibility in humans: The case of two spatial memory tasks. Cogni tion, 79, 263–299. Hillis, A. E. (2006). Neurobiology of unilateral spatial neglect. Neuroscientist, 12, 153– 163. Hillis, A. E., & Caramazza, A. (1991). Spatially-specific deficit to stimulus-centered letter shape representations in a case of “neglect dyslexia.” Neuropsychologia, 29, 1223–1240. Hillis, A. E., Newhart, M., Heidler, J., Barker, P. B., & Degaonkar, M. (2005). Anatomy of spatial attention: Insights from perfusion imaging and hemispatial neglect in acute stroke. Journal of Neuroscience, 25, 3161–3167. Hinton, G. E., McClelland, J. L., & Rumelhart, D. E. (1986). Distributed representations. In D. E. Rumelhart & D. L. McClelland (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition. Volume 1: Foundations (pp. 77–109). Cambridge, MA: MIT Press. Holmes, G., & Horax, G. (1919). Disturbances of spatial orientation and visual attention, with loss of stereoscopic vision. Archives of Neurology and Psychiatry, 1, 385–407. Horton, J. C., & Hoyt, W. F. (1991). The representation of the visual field in human striate cortex: A revision of the classic Holmes map. Archives of Ophthalmology, 109, 816–824. Hubbard, E. M., Piazza, M., Pinel, P., & Dehaene, S. (2005). Interactions between number and space in parietal cortex. Nature Reviews, Neuroscience, 6, 435–448. Hummel, J. E., & Biederman, I. (1992). Dynamic binding in a neural network for shape recognition. Psychological Review, 99, 480–517. Humphreys, G. W., & Riddoch, M. J. (1994). 
Attention to within-object and between-object spatial representation: Multiple sites for visual selection. Cognitive Neuropsychology, 11, 207–241.
Ingle, D. J. (1967). Two visual mechanisms underlying the behaviour of fish. Psychologische Forschung, 31, 44–51.
Ings, S. (2007). A natural history of seeing. New York: Norton & Company.
Jackendoff, R., & Landau, B. (1992). Spatial language and spatial cognition. In R. Jackendoff (Ed.), Languages of the mind: Essays on mental representation (pp. 99–124). Cambridge, MA: MIT Press.

Jacobs, R. A. (1997). Nature, nurture, and the development of functional specializations: A computational approach. Psychonomic Bulletin & Review, 4, 299–309.
Jacobs, R. A. (1999). Computational studies of the development of functionally specialized modules. Trends in Cognitive Sciences, 3, 31–38.
Jacobs, R. A., & Kosslyn, S. M. (1994). Encoding shape and spatial relations: The role of receptive field size in coordinating complementary representations. Cognitive Science, 18, 361–386.
James, T. W., Culham, J. C., Humphrey, G. K., Milner, A. D., & Goodale, M. A. (2003). Ventral occipital lesions impair object recognition but not object-directed grasping: An fMRI study. Brain, 126, 2463–2475.
Jeannerod, M., Decety, J., & Michel, F. (1994). Impairment of grasping movements following a bilateral posterior parietal lesion. Neuropsychologia, 32, 369–380.
Jeannerod, M., & Jacob, P. (2005). Visual cognition: A new look at the two-visual systems model. Neuropsychologia, 43, 301–312.
Johnsen, S., & Lohmann, K. J. (2005). The physics and neurobiology of magnetoreception. Nature Reviews Neuroscience, 6, 703–712.
Johnson, H., & Haggard, P. (2005). Motor awareness without perceptual awareness. Neuropsychologia, 43, 227–237.
Johnson, A., & Redish, A. D. (2007). Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point. Journal of Neuroscience, 27, 12176–12189.
Johnson-Laird, P. N. (2005). Mental models in thought. In K. Holyoak & R. J. Sternberg (Eds.), The Cambridge handbook of thinking and reasoning (pp. 179–212). Cambridge, UK: Cambridge University Press.
Jordan, K., Wustenberg, T., Heinze, H. J., Peters, M., & Jancke, L. (2002). Women and men exhibit different cortical activation patterns during mental rotation tasks. Neuropsychologia, 40, 2397–2408.
Just, M. A., Carpenter, P. A., Maguire, M., Diwadkar, V., & McMains, S. (2001).
Mental rotation of objects retrieved from memory: A functional MRI study of spatial processing. Journal of Experimental Psychology: General, 130, 493–504.
Kaas, J. H. (1997). Topographic maps are fundamental to sensory processing. Brain Research Bulletin, 44, 107–112.
Kahane, P., Hoffman, D., Minotti, L., & Berthoz, A. (2003). Reappraisal of the human vestibular cortex by cortical electrical stimulation study. Annals of Neurology, 54, 615–624.


Kahneman, D., Treisman, A., & Gibbs, B. (1992). The reviewing of object files: Object-specific integration of information. Cognitive Psychology, 24, 175–219.
Kant, I. (1781). Kritik der reinen Vernunft (translation: Critique of Pure Reason, 2008, Penguin Classics).
Karnath, H. O. (2001). New insights into the functions of the superior temporal cortex. Nature Reviews, Neuroscience, 2, 568–576.
Karnath, H. O., Ferber, S., & Himmelbach, M. (2001). Spatial awareness is a function of the temporal not the posterior parietal lobe. Nature, 411, 950–953.
Khan, A. Z., Pisella, L., Vighetto, A., Cotton, F., Luauté, J., Boisson, D., Salemme, R., Crawford, J. D., & Rossetti, Y. (2005). Optic ataxia errors depend on remapped, not viewed, target location. Nature Neuroscience, 8, 418–420.
Kastner, S., Demner, I., & Ziemann, U. (1998). Transient visual field defects induced by transcranial magnetic stimulation. Experimental Brain Research, 118, 19–26.
Kastner, S., DeSimone, K., Konen, C. S., Szczepanski, S. M., Weiner, K. S., & Schneider, K. A. (2007). Topographic maps in human frontal cortex revealed in memory-guided saccade and spatial working-memory tasks. Journal of Neurophysiology, 97, 3494–3507.

Kastner, S., Pinsk, M. A., De Weerd, P., Desimone, R., & Ungerleider, L. G. (1999). Increased activity in human visual cortex during directed attention in the absence of visual stimulation. Neuron, 22, 751–761.
Kemmerer, D. (2006). The semantics of space: Integrating linguistic typology and cognitive neuroscience. Neuropsychologia, 44, 1607–1621.
Kemmerer, D., & Tranel, D. (2000). A double dissociation between linguistic and perceptual representations of spatial relationships. Cognitive Neuropsychology, 17, 393–414.
Kemmerer, D., & Tranel, D. (2003). A double dissociation between the meanings of action verbs and locative prepositions. Neurocase, 9, 421–435.
Kessels, R. P. C., de Haan, E. H. F., Kappelle, L. J., & Postma, A. (2001). Varieties of human spatial memory: A meta-analysis on the effects of hippocampal lesions. Brain Research Reviews, 35, 295–303.
Kessels, R. P. C., Kappelle, L. J., de Haan, E. H. F., & Postma, A. (2002). Lateralization of spatial-memory processes: evidence on spatial span, maze learning, and memory for object locations. Neuropsychologia, 40, 1465–1473.
Kirchner, W. H., & Braun, U. (1994). Dancing honey bees indicate the location of food sources using path integration rather than cognitive maps. Animal Behaviour, 48, 1437–1441.

Kinsbourne, M. (1993). Orientational bias model of unilateral neglect: Evidence from attentional gradients within hemispace. In I. H. Robertson & J. C. Marshall (Eds.), Unilateral neglect: Clinical and experimental studies (pp. 63–86). Hillsdale, NJ: Erlbaum.
Kinsbourne, M., & Warrington, E. K. (1962). A disorder of simultaneous form perception. Brain, 85, 461–486.
Kitada, R., Kito, T., Saito, D. N., Kochiyama, T., Matsamura, M., Sadato, N., & Lederman, S. J. (2006). Multisensory activation of the intraparietal area when classifying grating orientation: A functional magnetic resonance imaging study. Journal of Neuroscience, 26, 7491–7501.
Kjelstrup, K. B., Solstad, T., Brun, V. H., Hafting, T., Leutgeb, S., Witter, M. P., Moser, E. I., & Moser, M.-B. (2008). Finite scales of spatial representation in the hippocampus. Science, 321, 140–143.
Koch, C. (2004). The quest for consciousness: A neurobiological approach. Englewood, CO: Roberts and Company.
Kohonen, T. (2001). Self-organizing maps. Berlin: Springer.
Kosslyn, S. M. (1987). Seeing and imagining in the cerebral hemispheres: A computational approach. Psychological Review, 94 (2), 148–175.
Kosslyn, S. M. (1980). Image and mind. Cambridge, MA: Harvard University Press.
Kosslyn, S. M. (1994). Image and brain: The resolution of the imagery debate. Cambridge, MA: MIT Press.
Kosslyn, S. M., Chabris, C. F., Marsolek, C. J., & Koenig, O. (1992). Categorical versus coordinate spatial relations: Computational analyses and computer simulations. Journal of Experimental Psychology: Human Perception and Performance, 18 (2), 562–577.
Kosslyn, S. M., DiGirolamo, G. J., Thompson, W. L., & Alpert, N. M. (1998). Mental rotation of objects versus hands: neural mechanisms revealed by positron emission tomography. Psychophysiology, 35, 151–161.
Kosslyn, S. M., & Jacobs, R. A. (1994). Encoding shape and spatial relations: A simple mechanism for coordinating complementary representations. In V.
Honavar & L. M. Uhr (Eds.), Artificial intelligence and neural networks: Steps toward principled integration (pp. 373–385). Boston: Academic Press.
Kosslyn, S. M., & Koenig, O. (1992). Wet mind: The new cognitive neuroscience. New York: Free Press.
Kosslyn, S. M., Koenig, O., Barrett, A., Cave, C. B., Tang, J., & Gabrieli, J. D. E. (1989). Evidence for two types of spatial representations: Hemispheric specialization for categorical


and coordinate relations. Journal of Experimental Psychology: Human Perception and Performance, 15 (4), 723–735.
Kosslyn, S. M., Maljkovic, V., Hamilton, S. E., Horwitz, G., & Thompson, W. L. (1995). Two types of image generation: Evidence for left and right hemisphere processes. Neuropsychologia, 33 (11), 1485–1510.
Kosslyn, S. M., Pick, H. L., & Fariello, G. R. (1974). Cognitive maps in children and men. Child Development, 45, 707–716.
Kosslyn, S. M., Thompson, W. T., & Ganis, G. (2006). The case for mental imagery. New York: Oxford University Press.
Kosslyn, S. M., Thompson, W. T., Gitelman, D. R., & Alpert, N. M. (1998). Neural systems that encode categorical versus coordinate spatial relations: PET investigations. Psychobiology, 26 (4), 333–347.
Króliczak, G., Heard, P., Goodale, M. A., & Gregory, R. L. (2006). Dissociation of perception and action unmasked by the hollow-face illusion. Brain Research, 1080, 9–16.
Kuipers, B. (1978). Modeling spatial knowledge. Cognitive Science, 2, 129–153.
Lacquaniti, F., Guigon, E., Bianchi, L., Ferraina, S., & Caminiti, R. (1995). Representing spatial information for limb movement: The role of area 5 in the monkey. Cerebral Cortex, 5, 391–409.
Laeng, B. (1994). Lateralization of categorical and coordinate spatial functions: A study of unilateral stroke patients. Journal of Cognitive Neuroscience, 6 (3), 189–203.
Laeng, B. (2006). Constructional apraxia after left or right unilateral stroke. Neuropsychologia, 44, 1519–1523.
Laeng, B., Brennen, T., Johannessen, K., Holmen, K., & Elvestad, R. (2002). Multiple reference frames in neglect? An investigation of the object-centred frame and the dissociation between “near” and “far” from the body by use of a mirror. Cortex, 38, 511–528.
Laeng, B., Carlesimo, G. A., Caltagirone, C., Capasso, R., & Miceli, G. (2000). Rigid and non-rigid objects in canonical and non-canonical views: Effects of unilateral stroke on object identification.
Cognitive Neuropsychology, 19, 697–720.
Laeng, B., Chabris, C. F., & Kosslyn, S. M. (2003). Asymmetries in encoding spatial relations. In K. Hugdahl and R. Davidson (Eds.), The asymmetrical brain (pp. 303–339). Cambridge, MA: MIT Press.
Laeng, B., Okubo, M., Saneyoshi, A., & Michimata, C. (2011). Processing spatial relations with different apertures of attention. Cognitive Science, 35, 297–329.
Laeng, B., & Peters, M. (1995). Cerebral lateralization for the processing of spatial coordinates and categories in left- and right-handers. Neuropsychologia, 33, 421–439.

Laeng, B., Peters, M., & McCabe, B. (1997). Memory for locations within regions. Spatial biases and visual hemifield differences. Memory and Cognition, 26, 97–107.
Laeng, B., Shah, J., & Kosslyn, S. M. (1999). Identifying objects in conventional and contorted poses: Contributions of hemisphere-specific mechanisms. Cognition, 70 (1), 53–85.
Laeng, B., & Teodorescu, D.-S. (2002). Eye scanpaths during visual imagery reenact those of perception of the same visual scene. Cognitive Science, 26, 207–231.

Lakoff, G., & Johnson, M. (1999). Philosophy in the flesh. Chicago: University of Chicago Press.
Lamb, T. D., Collin, S. P., & Pugh, E. N. (2007). Evolution of the vertebrate eye: Opsins, photoreceptors, retina and eye cup. Nature Reviews, Neuroscience, 8, 960–975.
Lamme, V. A. F. (2003). Why visual attention and awareness are different. Trends in Cognitive Sciences, 7, 12–18.
Lamme, V. A. F. (2006). Towards a true neural stance on consciousness. Trends in Cognitive Sciences, 10, 494–501.
Lamme, V. A. F., & Roelfsema, P. R. (2000). The distinct modes of vision offered by feedforward and recurrent processing. Trends in Neuroscience, 23, 571–579.
Lamme, V. A. F., Super, H., Landman, R., Roelfsema, P. R., & Spekreijse, H. (2000). The role of primary visual cortex (V1) in visual awareness. Vision Research, 40, 1507–1521.
Landau, B., Hoffman, J. E., & Kurz, N. (2006). Object recognition with severe spatial deficits in Williams syndrome: Sparing and breakdown. Cognition, 100, 483–510.
LeDoux, J. E., & Gazzaniga, M. S. (1978). The integrated mind. New York: Plenum.
LeDoux, J. E., Wilson, D. H., & Gazzaniga, M. S. (1977). Manipulospatial aspects of cerebral lateralization: Clues to origin of lateralization. Neuropsychologia, 15, 743–750.
Lenck-Santini, P.-P., Save, E., & Poucet, B. (2001). Evidence for a relationship between place-cell spatial firing and spatial memory performance. Hippocampus, 11, 337–390.
Liben, L. S. (2009). The road to understanding maps. Current Directions in Psychological Science, 18, 310–315.
Livingstone, M. S., & Hubel, D. H. (1988). Segregation of form, color, movement, and depth: Anatomy, physiology, and perception. Science, 240, 740–749.
Lomber, S. G., & Malhotra, S. (2008). Double dissociation of “what” and “where” processing in auditory cortex. Nature Neuroscience, 11, 609–616.


Ludvig, N., Tang, H. M., Gohil, B. C., & Botero, J. M. (2004). Detecting location-specific neuronal firing rate increases in the hippocampus of freely moving monkeys. Brain Research, 1014, 97–109.
Luna, B., Thulborn, K. R., Strojwas, M. H., McCurtain, B. J., Berman, R. A., Genovese, C. R., et al. (1998). Dorsal cortical regions subserving visually guided saccades in humans: an fMRI study. Cerebral Cortex, 8 (1), 40–47.
Luria, A. R. (1963). Restoration of function after brain injury. Pergamon Press.
Luria, A. R. (1973). The working brain: An introduction to neuropsychology. New York: Basic Books.
Maguire, E. A., Burgess, N., & O’Keefe, J. (1999). Human spatial navigation: cognitive maps, sexual dimorphism. Current Opinion in Neurobiology, 9, 171–177.
Maguire, E. A., Burke, T., Phillips, J., & Staunton, H. (1996). Topographical disorientation following unilateral temporal lobe lesions in humans. Neuropsychologia, 34, 993–1001.
Maguire, E. A., Frackowiak, R. S., & Frith, C. D. (1997). Recalling routes around London: Activation of the right hippocampus in taxi drivers. Journal of Neuroscience, 17, 7103–7110.
Maguire, E. A., Frith, C. D., Burgess, N., Donnett, J. G., & O’Keefe, J. (1998). Knowing where things are: Parahippocampal involvement in encoding object relations in virtual large-scale space. Journal of Neuroscience, 10, 61–76.
Mahon, B. Z., Milleville, S. C., Negri, G. A. L., Rumiati, R. I., Caramazza, A., & Martin, A. (2007). Action-related properties shape object representations in the ventral stream. Neuron, 55, 507–520.
Malach, R., Levy, I., & Hasson, U. (2002). The topography of high-order human object areas. Trends in Cognitive Science, 6, 176–184.
Malkova, L., & Mishkin, M. (2003). One-trial memory for object-place associations after separate lesions of hippocampus and posterior parahippocampal region in the monkey. Journal of Neuroscience, 23 (5), 1956–1965.
Markman, A. B. (1999). Knowledge representation.
Mahwah, NJ: Psychology Press.
Marr, D. (1982). Vision. San Francisco: Freeman and Company.
Maunsell, J. H. R., & Newsome, W. T. (1987). Visual processing in the monkey extrastriate cortex. Annual Review of Neuroscience, 10, 363–401.
Medendorp, W. P., Goltz, H. C., Vilis, T., & Crawford, J. D. (2003). Gaze-centered updating of visual space in human parietal cortex. Journal of Neuroscience, 23, 6209–6214.


Mennemeier, M., Wertman, E., & Heilman, K. M. (1992). Neglect of near peripersonal space. Brain, 115, 37–50.
Menzel, R., Brandt, R., Gumbert, A., Komischke, B., & Kunze, J. (2000). Two spatial memories for honeybee navigation. Proceedings of the Royal Society of London, Biological Sciences, 267, 961–968.
Miller, G. A., & Johnson-Laird, P. N. (1976). Language and perception. Cambridge, MA: Harvard University Press.
Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. Oxford: Oxford University Press.
Milner, D. A., & Goodale, M. A. (2006). The visual brain in action. New York: Oxford University Press.
Milner, D. A., & Goodale, M. A. (2008). Two visual systems re-viewed. Neuropsychologia, 46, 774–785.
Milner, D. A., Perrett, D. I., Johnston, R. S., Benson, P. J., Jordan, T. R., Heeley, D. W., et al. (1991). Perception and action in “visual form agnosia.” Brain, 114, 405–428.
Mishkin, M., Ungerleider, L. G., & Macko, K. A. (1983). Object vision and spatial vision: Two cortical pathways. Trends in Neurosciences, 6, 414–417.
Morel, A., & Bullier, J. (1990). Anatomical segregation of two cortical visual pathways in the macaque monkey. Visual Neuroscience, 4, 555–578.
Moser, E. I., Kropff, E., & Moser, M.-B. (2008). Place cells, grid cells and the brain’s spatial representation system. Annual Review of Neuroscience, 31, 69–89.
Motter, B. C., & Mountcastle, V. B. (1981). The functional properties of light-sensitive neurons of the posterior parietal cortex studied in waking monkeys: Foveal sparing and opponent vector organization. Journal of Neuroscience, 1, 3–26.
Mountcastle, V. B. (1995). The parietal system and some higher brain functions. Cerebral Cortex, 5, 377–390.
Naganuma, T., Nose, I., Inoue, K., Takemoto, A., Katsuyama, N., & Taira, M. (2005). Information processing of geometrical features of a surface based on binocular disparity cues: An fMRI study. Neuroscience Research, 51, 147–155.
Neggers, S. F.
W., Van der Lubbe, R. H. J., Ramsey, N. F., & Postma, A. (2006). Interactions between ego- and allocentric neuronal representations of space. NeuroImage, 31, 320–331.
Newcombe, F., & Russell, W. R. (1969). Dissociated visual perceptual and spatial deficits in focal lesions of the right hemisphere. Journal of Neurology, Neurosurgery, and Psychiatry, 32, 73–81.

Nguyen, B. T., Trana, T. D., Hoshiyama, M., Inuia, K., & Kakigi, R. (2004). Face representation in the human primary somatosensory cortex. Neuroscience Research, 50, 227–232.

Nitz, D. A. (2006). Tracking route progression in the posterior parietal cortex. Neuron, 49, 747–756.
O’Keefe, J. (1976). Place units in the hippocampus of the freely moving rat. Experimental Neurology, 51, 78–109.
O’Keefe, J. (1996). The spatial prepositions in English, vector grammar, and the cognitive map theory. In P. Bloom, M. A. Peterson, L. Nadel, & M. F. Garrett (Eds.), Language and space (pp. 277–316). Cambridge, MA: The MIT Press.
O’Keefe, J. (2003). Vector grammar, places, and the functional role of spatial prepositions in English. In E. van der Zee & J. Slack (Eds.), Representing direction in language and space (pp. 69–85). Oxford, UK: Oxford University Press.
O’Keefe, J., Burgess, N., Donnett, J. G., Jeffery, J. K., & Maguire, E. A. (1998). Place cells, navigational accuracy, and the human hippocampus. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 353, 1333–1340.
O’Keefe, J., & Nadel, L. (1978). The hippocampus as a cognitive map. Oxford, UK: Clarendon Press.
Okubo, M., Laeng, B., Saneyoshi, A., & Michimata, C. (2010). Exogenous attention differentially modulates the processing of categorical and coordinate spatial relations. Acta Psychologica, 135, 1–11.
Olson, C. R. (2001). Object-based vision and attention in primates. Current Opinion in Neurobiology, 11, 171–179.
Olson, C. R. (2003). Brain representations of object-centered space in monkeys and humans. Annual Review of Neuroscience, 26, 331–354.
Olson, C. R., & Gettner, S. N. (1995). Object-centered direction selectivity in the macaque supplementary eye field. Science, 269, 985–988.
Optican, L. M. (2005). Sensorimotor transformation for visually guided saccades. Annals of the New York Academy of Sciences, 1039, 132–148.
O’Regan, J. K., & Noë, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24, 939–1011.
O’Reilly, R. C., Kosslyn, S. M., Marsolek, C. J., & Chabris, C. F. (1990).
Receptive field characteristics that allow parietal lobe neurons to encode spatial properties of visual input: A computational analysis. Journal of Cognitive Neuroscience, 2, 141–155.


Otto, I., Grandguillaume, P., Boutkhil, L., & Guigon, E. (1992). Direct and indirect cooperation between temporal and parietal networks for invariant visual recognition. Journal of Cognitive Neuroscience, 4, 35–57.
Paillard, J. (1991). Motor and representational framing of space. In J. Paillard (Ed.), Brain and space (pp. 163–182). Oxford, UK: Oxford University Press.
Palermo, L., Bureca, I., Matano, A., & Guariglia, C. (2008). Hemispheric contribution to categorical and coordinate representational processes: A study on brain-damaged patients. Neuropsychologia, 46, 2802–2807.
Parsons, L. M. (2003). Superior parietal cortices and varieties of mental rotation. Trends in Cognitive Sciences, 7, 515–517.
Perenin, M.-T., & Vighetto, A. (1988). Optic ataxia: A specific disruption in visuomotor mechanisms. Brain, 111, 643–674.
Piaget, J., & Inhelder, B. (1956). The child’s conception of space. London: Routledge & Kegan Paul.
Perenin, M. T., & Jeannerod, M. (1978). Visual function within the hemianopic field following early cerebral hemidecortication in man. I. Spatial localization. Neuropsychologia, 16, 1–13.
Pinker, S. (1990). A theory of graph comprehension. In R. Friedle (Ed.), Artificial intelligence and the future of testing (pp. 73–126). Hillsdale, NJ: Erlbaum.
Pinker, S. (2007). The stuff of thought: Language as a window into human nature. New York: Penguin Books.
Poucet, B. (1993). Spatial cognitive maps in animals: New hypotheses on their structure and neural mechanisms. Psychological Review, 100, 163–182.
Pouget, A., & Sejnowski, T. J. (1997). A new view of hemineglect based on the response properties of parietal neurones. Philosophical Transactions of the Royal Society, Series B, Biological Sciences, 352, 1449–1459.
Pouget, A., & Snyder, L. H. (2000). Computational approaches to sensorimotor transformations. Nature Neuroscience, 3, 1192–1198.
Quinlan, D. J., & Culham, J. C. (2007).
fMRI reveals a preference for near viewing in the human parietal-occipital cortex. NeuroImage, 36, 167–187.
Rao, S. C., Rainer, G., & Miller, E. K. (1997). Integration of what and where in the primate prefrontal cortex. Science, 276, 821–824.
Reed, C. L., Klatzky, R. L., & Halgren, E. (2005). What vs. where in touch: An fMRI study. NeuroImage, 25, 718–726.

Revonsuo, A., & Newman, J. (1999). Binding and consciousness. Consciousness and Cognition, 8, 123–127.
Ritz, T. (2009). Magnetic sense in animal navigation. In L. Squire (Ed.), Encyclopedia of neuroscience (pp. 251–257). Elsevier: Network Version.
Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annual Review of Neuroscience, 27, 169–192.
Rizzolatti, G., & Matelli, M. (2003). Two different streams form the dorsal visual system: Anatomy and functions. Experimental Brain Research, 153, 146–157.
Robertson, L. C. (2003). Binding, spatial attention and perceptual awareness. Nature Reviews: Neuroscience, 4, 93–102.
Robertson, L. C., Treisman, A., Friedman-Hill, S., & Grabowecky, M. (1997). The interaction of spatial and object pathways: Evidence from Bálint’s syndrome. Journal of Cognitive Neuroscience, 9, 295–317.
Rogers, J. L., & Kesner, R. P. (2006). Lesions of the dorsal hippocampus or parietal cortex differentially affect spatial information processing. Behavioral Neuroscience, 120, 852–860.
Rogers, L. J., Zucca, P., & Vallortigara, G. (2004). Advantage of having a lateralized brain. Proceedings of the Royal Society of London B (Suppl.): Biology Letters, 271, 420–422.
Rolls, E. T., Miyashita, Y., Cahusac, P. M. B., Kesner, R. P., Niki, H., Feigenbaum, J., et al. (1989). Hippocampal neurons in the monkey with activity related to the place in which a stimulus is shown. Journal of Neuroscience, 9, 1835–1845.
Romanski, L. M., Tian, B., Fritz, J., Mishkin, M., Goldman-Rakic, P. S., & Rauschecker, J. P. (1999). Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex. Nature Neuroscience, 2, 1131–1136.
Roth, E. C., & Hellige, J. B. (1998). Spatial processing and hemispheric asymmetry: Contributions of the transient/magnocellular visual system. Journal of Cognitive Neuroscience, 10, 472–484.
Rueckl, J. G., Cave, K. R., & Kosslyn, S. M. (1989).
Why are ‘what’ and ‘where’ processed by separate visual systems? A computational investigation. Journal of Cognitive Neuroscience, 1 (2), 171–186.
Rybash, J. M., & Hoyer, W. J. (1992). Hemispheric specialization for categorical and coordinate spatial representations: A reappraisal. Memory & Cognition, 20 (3), 271–276.

Sakata, H., Taira, M., Kusunoki, M., Murata, A., & Tanaka, Y. (1997). The parietal association cortex in depth perception and visual control of hand action. Trends in Neuroscience, 20, 350–357.

Save, E., & Poucet, B. (2000). Hippocampal-parietal cortical interactions in spatial cognition. Hippocampus, 10, 491–499.
Schindler, I., Rice, N. J., McIntosh, R. D., Rossetti, Y., Vighetto, A., & Milner, A. D. (2004). Automatic avoidance of obstacles is a dorsal stream function: Evidence from optic ataxia. Nature Neuroscience, 7, 779–784.
Schneider, G. E. (1967). Contrasting visuomotor functions of tectum and cortex in the golden hamster. Psychologische Forschung, 30, 52–62.
Seltzer, B., & Pandya, D. N. (1978). Afferent cortical connections and architectonics of the superior temporal sulcus and surrounding cortex in the rhesus monkey. Brain Research, 149, 1–24.
Semmes, J., Weinstein, S., Ghent, L., & Teuber, H. L. (1955). Spatial orientation in man after cerebral injury: I. Analyses by locus of lesion. Journal of Psychology, 39, 227–244.
Sereno, A. B., & Maunsell, J. H. R. (1998). Shape selectivities in primate lateral intraparietal cortex. Nature, 395, 500–503.
Sereno, M. I., Dale, A. M., Reppas, J. B., Kwong, K. K., Belliveau, J. W., et al. (1995). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889–893.
Sereno, M. I., & Huang, R.-S. (2006). A human parietal face area contains head-centered visual and tactile maps. Nature Neuroscience, 9, 1337–1343.
Sereno, M. I., Pitzalis, S., & Martinez, A. (2001). Mapping of contralateral space in retinotopic coordinates by a parietal cortical area in humans. Science, 294, 1350–1354.
Servos, P., Engel, S. A., Gati, J., & Menon, R. (1999). FMRI evidence for an inverted face representation in human somatosensory cortex. Neuroreport, 10, 1393–1395.
Seth, A. K., McKinstry, J. L., Edelman, G. M., & Krichmar, J. L. (2004). Visual binding through reentrant connectivity and dynamic synchronization in a brain-based device. Cerebral Cortex, 14, 1185–1199.
Shelton, P. A., Bowers, D., & Heilman, K. M. (1990).
Peripersonal and vertical neglect. Brain, 113, 191–205.
Shelton, A. L., & McNamara, T. P. (2001). Systems of spatial reference in human memory. Cognitive Psychology, 43, 274–310.
Shrager, Y., Kirwan, C. B., & Squire, L. R. (2008). Neural basis of the cognitive map: Path integration does not require hippocampus or entorhinal cortex. Proceedings of the National Academy of Sciences U S A, 105, 12034–12038.
Silver, M., & Kastner, S. (2009). Topographic maps in human frontal and parietal cortex. Trends in Cognitive Sciences, 11, 488–495.

Slobin, D. I. (1996). From “thought and language” to “thinking for speaking.” In J. J. Gumperz & S. C. Levinson (Eds.), Rethinking linguistic relativity (pp. 70–96). Cambridge, UK: Cambridge University Press.
Slotnick, S. D., & Moo, L. R. (2006). Prefrontal cortex hemispheric specialization for categorical and coordinate visual spatial memory. Neuropsychologia, 44, 1560–1568.
Slotnick, S. D., Moo, L., Tesoro, M. A., & Hart, J. (2001). Hemispheric asymmetry in categorical versus coordinate visuospatial processing revealed by temporary cortical deactivation. Journal of Cognitive Neuroscience, 13, 1088–1096.
Smallman, H. S., MacLeod, D. I. A., He, S., & Kentridge, R. W. (1996). Fine grain of the neural representation of human spatial vision. Journal of Neuroscience, 16 (5), 1852–1859.
Smith, E. E., Jonides, J., Koeppe, R. A., Awh, E., Schumacher, E., & Minoshima, S. (1995). Spatial vs. object working memory: PET investigations. Journal of Cognitive Neuroscience, 7, 337–358.
Snyder, L. H., Batista, A. P., & Andersen, R. A. (1997). Coding of intention in the posterior parietal cortex. Nature, 386, 167–170.
Solstad, T., Boccara, C. N., Kropff, E., Moser, M.-B., & Moser, E. I. (2008). Representation of geometric borders in the entorhinal cortex. Science, 322, 1865–1868.
Sovrano, V. A., Dadda, M., & Bisazza, A. (2005). Lateralized fish perform better than nonlateralized fish in spatial reorientation tasks. Behavioural Brain Research, 163, 122–127.
Stark, M., Coslett, H. B., & Saffran, E. M. (1996). Impairment of an egocentric map of locations: Implications for perception and action. Cognitive Neuropsychology, 13, 481–523.
Sutherland, R. J., & Rudy, J. W. (1988). Configural association theory: The role of the hippocampal formation in learning, memory and amnesia. Psychobiology, 16, 157–163.
Tanaka, K., Saito, H., Fukada, Y., & Moriya, M. (1991). Coding visual images of objects in the inferotemporal cortex of the macaque monkey.
Journal of Neurophysiology, 66, 170–189.
Taira, M., Mine, S., Georgopoulos, A. P., Murata, A., & Sakata, H. (1990). Parietal cortex neurons of the monkey related to the visual guidance of hand movement. Experimental Brain Research, 83, 29–36.
Takahashi, N., Kawamura, M., Shiota, J., Kasahata, N., & Hirayama, K. (1997). Pure topographic disorientation due to right retrosplenial lesion. Neurology, 49, 464–469.
Talmy, L. (2000). Toward a cognitive semantics. Cambridge, MA: MIT Press.


Thiebaut de Schotten, M., Urbanski, M., Duffau, H., Volle, E., Levy, R., Dubois, B., & Bartolomeo, P. (2005). Direct evidence for a parietal-frontal pathway subserving spatial awareness in humans. Science, 309, 2226–2228.
Thivierge, J.-P., & Marcus, G. (2007). The topographic brain: From neural connectivity to cognition. Trends in Neuroscience, 30, 251–259.
Tipper, S. P., & Behrmann, M. (1996). Object-centered not scene based visual neglect. Journal of Experimental Psychology: Human Perception and Performance, 22, 1261–1278.
Tolman, E. C. (1948). Cognitive maps in rats and men. Psychological Review, 55, 189–208.
Tootell, R. B., Hadjikhani, N., Hall, E. K., Marrett, S., Vanduffel, W., Vaughan, J. T., & Dale, A. M. (1998). The retinotopy of visual spatial attention. Neuron, 21, 1409–1422.
Tootell, R. B., Hadjikhani, N., Vanduffel, W., Liu, A. K., Mendola, J. D., Sereno, M. I., & Dale, A. M. (1998). Functional analysis of primary visual cortex (V1) in humans. Proceedings of the National Academy of Sciences U S A, 95, 811–817.
Tootell, R. B., Mendola, J. D., Hadjikhani, N., Liu, A. K., & Dale, A. M. (1998). The representation of the ipsilateral visual field in human cerebral cortex. Proceedings of the National Academy of Sciences U S A, 95, 818–824.
Tootell, R. B., Silverman, M. S., Switkes, E., & DeValois, R. L. (1982). Deoxyglucose analysis of retinotopic organization in primate striate cortex. Science, 218, 902–904.
Tranel, D., & Kemmerer, D. (2004). Neuroanatomical correlates of locative prepositions. Cognitive Neuropsychology, 21, 719–749.
Treisman, A. (1996). The binding problem. Current Opinion in Neurobiology, 6, 171–178.
Treisman, A. (1988). The perception of features and objects. In R. D. Wright (Ed.), Visual attention (pp. 26–54). New York: Oxford University Press.

Trojano, L., Conson, M., Maffei, R., & Grossi, D. (2006). Categorical and coordinate spatial processing in the imagery domain investigated by rTMS. Neuropsychologia, 44, 1569–1574.
Trojano, L., Grossi, D., Linden, D. E. J., Formisano, E., Goebel, R., & Cirillo, S. (2002). Coordinate and categorical judgements in spatial imagery: An fMRI study. Neuropsychologia, 40, 1666–1674.
Tsal, Y., & Bareket, T. (2005). Localization judgments under various levels of attention. Psychonomic Bulletin & Review, 12 (3), 559–566.

Page 56 of 59

Representation of Spatial Relations Tsal, Y., Meiran, N., & Lamy, D. (1995). Toward a resolution theory of visual attention. Vi sual Cognition, 2, 313–330. Tsao, D. Y., Vanduffel, W., Sasaki, Y., Fize, D., Knutsen, T. A., Mandeville, J. B., Wald, L. L., Dale, A. M., Rosen, B. R., Van Essen, D. C., Livingstone, M. S., Orban, G. A., & Tootell, R. B. H. (2003). Stereopsis activates V3A and caudal intraparietal areas in macaques and humans. Neuron, 31, 555–568. Tsutsui, K.-I., Sakata, H., Naganuma, T., & Taira, M. (2002). Neural correlates for percep tion of 3D surface orientation from texture gradients. Science, 298, 409–412. Turnbull, O. H., Beschin, N., & Della Sala, S. (1997). Agnosia for object orientation: Impli cations for theories of object recognition. Neuropsychologia, 35, 153–163. Turnbull, O. H., Laws, K. R., & McCarthy, R. A. (1995). Object recognition without knowl edge of object orientation. Cortex, 31, 387–395. Ullman, S. (1984). Visual routines. Cognition, 18, 97–159. Ungerleider, L. G., & Haxby, J. V. (1994). “What” and “where” in the human brain. Current Opinion in Neurobiology, 4, 157–165. Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press. Vallar, G., Bottini, G., & Paulesu, E. (2003). Neglect syndromes: the role of the parietal cortex. Advances in Neurology, 93, 293–319. Vallortigara, G., Cozzutti, C., Tommasi, L., & Rogers, L. J. (2001) How birds use their eyes: Opposite left-right specialisation for the lateral and frontal visual hemifield in the domes tic chick. Current Biology, 11, 29–33. Vallortigara, G., & Rogers, L. J. (2005). Survival with an asymmetrical brain: advantages and disadvantages of cerebral lateralization. Behavioral and Brain Sciences, 28, 575–589. Valyear, K. F., Culham, J. C., Sharif, N., Westwood, D., & Goodale, M. A. (2005). 
A double dissociation between sensitivity to changes in object identity and object orientation in the ventral and dorsal visual streams: A human fMRI study. Neuropsychologia, 44, 218–228. Vauclair, J., Yamazaki, Y., & Güntürkün, O. (2006). The study of hemispheric specialization for categorical and coordinate spatial relations in animals. Neuropsychologia, 44, 1524– 1534. Wandell, B. A. (1999). Computational neuroimaging of human visual cortex. Annual Re view of Neuroscience, 22, 145–173. Wandell, B.A., Brewer, A.A., & Dougherty, R. F. (2005). Visual field map clusters in human cortex. Philosophical Transactions of the Royal Society, B, 360, 693–707. Page 57 of 59

Representation of Spatial Relations Wandell, B. A., Dumoulin, S. O., & Brewer, A. A. (2009). Visual cortex in humans. In L. Squire (Ed.), Encyclopedia of neuroscience (pp. 251–257). Elsevier: Network Version. Wang, B., Zhou, T. G., Zhuo, Y., & Chen, L. (2007). Global topological dominance in the left hemisphere. Proceedings of the National Academy of Sciences U S A, 104, 21014– 21019. Warrington, E. K. (1982). Neuropsychological Studies of Object Recognition. Philosophi cal Transactions of the Royal Society of London. Series B, Biological, 298, 15–33. Warrington, E. K., & James, A. M. (1986). Visual object recognition in patients with righthemisphere lesions: Axes or features. Perception, 15, 355–366. Warrington, E. K., & James, A. M. (1988). Visual apperceptive agnosia: A clinico-anatomi cal study of three cases. Cortex, 24, 13–32. Warrington, E. K., & Taylor, A. M. (1973). The contribution of the right parietal lobe to ob ject recognition. Cortex, 9, 152–164. Waszak, F., Drewing, K., & Mausfeld, R. (2005). Viewerexternal frames of reference in the mental transformation of 3-D objects. Perception & Psychophysics, 67, 1269–1279. Weintraub, S., & Mesulam, M.-M. (1987). Right cerebral dominance in spatial attention: Further evidence based on ipsilateral neglect. Archives of Neurology, 44, 621–625. Weiskrantz, L. (1986). Blindsight: A Case Study and Implications. Oxford: Oxford Univer sity Press. Weiskrantz, L. (1997). Consciousness lost and found: A neuropsychological exploration. Oxford, UK: Oxford University Press. Wilson, F. A., Scalaidhe, S. P., & Goldman-Rakic, P. S. (1993). Dissociation of object and spatial processing domains in primate prefrontal cortex. Science, 260, 1955–1958. Wilson, M. A., & McNaughton, B. L. (1993). Dynamics of the hippocampal ensemble code for space. Science, 261, 1055–1058. Wilson, M. A., & McNaughton, B. L. (1994). Reactivation of the hippocampal ensemble memories during sleep. Science, 265, 676–679. Wolbers, T., & Hegarty, M. (2010). 
What determines our navigational abilities? Trends in Cognitive Sciences, 14 (3), 138–146. Wolbers, T., Wiener, J. M., Mallot, H. A., & Bűchel, C. (2007). Differential recruitment of the hippocampus, medial prefrontal vortex, and the human motion complex during path integration in humans. Journal of Neuroscience, 27, 9408–9416. Xu, F., & Carey, S. (1996). Infants’ metaphysics: The case of numerical identity. Cognitive Psychology, 30, 111–153. Page 58 of 59

Representation of Spatial Relations Yang, T. T., Gallen, C. C., Schwartz, B. J., & Bloom, F. E. (1994). Noninvasive somatosenso ry homunculus mapping in humans by using a large-array biomagnetometer. Proceedings of the National Academy of Sciences U S A, 90, 3098–3102. Zacks, J. M., Gilliam, F., & Ojemann, J. G. (2003). Selective disturbance of mental rotation by cortical stimulation. Neuropsychologia, 41, 1659–1667. Zeki, S. (1969). Representation of central visual fields in prestriate cortex of monkey. Brain Research, 14, 271–291. Zeki, S. (2001). Localization and globalization in conscious vision. Annual Review of Neu roscience, 24, 57–86. Zeki, S., Watson, J. D. G., Luexk, C. J., Friston, K. J., Kennard, C., & Frackowiak, R. S. J. (1991). A direct demonstration of functional specialization in human visual cortex. Journal of Neuroscience, 11, 641–649. Zipser, D., & Andersen, R. A. (1988). A back-propagation programmed network that simu lates response properties of a subset of posterior parietal neurons. Nature, 331, 679–684.

Bruno Laeng

Bruno Laeng is professor in cognitive neuropsychology at the University of Oslo.


Top-Down Effects in Visual Perception

Moshe Bar and Andreja Bubic
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0004

Abstract and Keywords

The traditional belief that our perception is determined by sensory input is an illusion. Empirical findings and theoretical ideas from recent years indicate that our experience, memory, expectations, goals, and desires can substantially affect the appearance of the visual information that is in front of us. Indeed, our perception is equally shaped by the incoming bottom-up information that captures the objective appearance of the world surrounding us, as by our previous knowledge and personal characteristics that constitute different sources of top-down perceptual influences. The present chapter focuses on such feedback effects in visual perception, and describes how the interplay between prefrontal and posterior visual cortex underlies the mechanisms through which top-down predictions guide visual processing. One major conclusion is that perception and cognition are more interactive than typically thought, and any artificial boundary between them may be misleading.

Keywords: top-down, feedback, predictions, expectations, perception, visual cortex, prefrontal cortex

Introduction

The importance of top-down effects in visual perception is widely acknowledged by now. There is nothing speculative in recognizing the fact that our perception is influenced by our previous experiences, expectations, emotions, or motivation. Research in cognitive neuroscience has revealed how such factors systematically influence the processing of signals that originate from our retina or other sensory organs. Nevertheless, even most textbooks still view percepts as objective reflections of the external world. According to this view, when placed in a particular environment, we all principally see the same things, but may mutually differ with respect to postperceptual, or higher level, cognitive processes. Thus, perception is seen as exclusively determined by the external sensory, so-called bottom-up, inputs. And, although the importance of the constant exchange between incoming sensory data and existing knowledge used for postulating hypotheses regarding sensations has occasionally been recognized during the early days of cognitive psychology (MacKay, 1956), only rarely have the top-down effects been considered to be as important as bottom-up factors (Gregory, 1980; Koffka, 1935).

Although increased emphasis has been placed on top-down effects in perception within modern cognitive neuroscience theories and approaches, it is important to note that these are mostly treated as external, modulatory effects. This view incorporates an implicit and rather strong assumption according to which top-down effects represent secondary phenomena whose occurrence is not mandatory, but rather is contingent on environmental conditions such as the level of visibility or ambiguity. As such, they are even discussed as exclusively attentional or higher-level executive effects that may influence and bias visual perception, without necessarily constituting one of its inherent, core elements.

This chapter reviews a more comprehensive outlook on visual perception. In this account, interpreting the visual world cannot be accomplished by relying solely on bottom-up information, but rather it emerges from the integration of external information with preexisting knowledge. Several sources of such knowledge that trigger different types of top-down biases and may modulate and guide visual processing are described here. First, we discuss how our brain, when presented with a visual object, employs a proactive strategy of testing multiple hypotheses regarding its identity. Another source of early top-down facilitatory bias derived from the presented object concerns its potential emotional salience, which is also extracted during the early phases of visual processing. At the same time, a comparable associative and proactive strategy is used for formulating and testing hypotheses based on the context in which the object is presented and on other items likely to accompany it. Finally, factors outside the currently presented input, such as task demands, previously encountered events, or the behavioral context, may be used as additional sources of top-down influences in visual recognition. In conclusion, although triggered and constrained by the incoming bottom-up signals, visual perception is strongly guided and shaped by top-down mechanisms whose origin, temporal dynamics, and neural basis are reviewed and synthesized here.

Understanding the Visual World

The common view of visual perception posits that the flow of visual information starts once the visual information is acquired by the eye, and continues as it is transmitted further, to the primary and then to higher order visual cortices. This hierarchical view is incomplete because it ignores the influences that our preexisting knowledge and various dispositions have on the way we process and understand visual information.

In our everyday life, we are rarely presented with clearly and fully perceptible visual information, and we rarely approach our environment without certain predictions. Even highly familiar objects are constantly encountered in different circumstances that greatly vary in terms of lighting, occlusions, and other parameters. Naturally, all of these variations limit our ability to structure and interpret the presented scene in a meaningful manner without relying on previous experience. Therefore, outside the laboratory, successful recognition has to rely not only on the immediately available visual input but also on different sources of related preexisting information represented at higher processing levels (Bullier, 2001; Gilbert & Sigman, 2007; Kveraga et al., 2007b), which trigger the so-called top-down effects in visual perception. Consequently, to understand visual recognition, we need to consider differential, as well as integrative, effects of bottom-up and top-down processes, and their respective neural bases.

Before reviewing specific instances in which top-down mechanisms are of prominent relevance, it is important to mention that the phrase “top-down modulation” has typically been utilized in several different ways in the literature. Engel et al. (2001) summarize four variants of its use: anatomical (equating top-down influences with the feedback activity in a processing hierarchy), cognitivist (equating top-down influences with hypothesis- or expectation-driven processing), perceptual or gestaltist (equating top-down influences with contextual modulation of perceptual items), and dynamicist (equating top-down influences with the enslavement of local processing elements by large-scale neuronal dynamics). Although separable, these variants are on some occasions partly overlapping (e.g., it may be hard to clearly separate cognitivist and perceptual definitions) or mutually complementary (e.g., cognitivist influences may be mediated by anatomical or dynamicist processing mechanisms). A more general definition might specify top-down influences as instances in which complex types of information represented at higher processing stages influence simpler processing occurring at earlier stages (Gilbert & Sigman, 2007). However, even this definition might be problematic in some instances because the term “complexity of information” can prove to be hard to specify, or even to apply in some instances of top-down modulations.
Nevertheless, bringing different definitions and processing levels together, it is possible to principally suggest that bottom-up information flow is supported by feedforward connections that transfer information from lower to higher level regions, in contrast to feedback connections that transmit the signals originating from the higher level areas downstream within the processing hierarchy. Feedforward connections are typically considered to be those that originate from supragranular layers and terminate in and around the fourth cortical layer, in contrast to feedback connections that originate in infragranular and end in agranular layers (Felleman & Van Essen, 1991; Friston, 2005; Maunsell & Van Essen, 1983; Mumford, 1992; Shipp, 2005). Furthermore, feedforward connections are relatively focused and tend to project to a smaller number of mostly neighboring regions in the processing hierarchy. In contrast, feedback connections are more diffuse because they innervate many regions and form wider connection patterns within them (Shipp & Zeki, 1989; Ungerleider & Desimone, 1986). Consequently, feedforward connections carry the main excitatory input to cortical neurons and are considered as driving connections, unlike the feedback ones that typically have modulatory effects (Buchel & Friston, 1997; Girard & Bullier, 1989; Salin & Bullier, 1995; Sherman & Guillery, 2004). For example, feedback connections potentiate responses of low-order areas by nonlinear mechanisms such as gain control (Bullier et al., 2001), increase the synchronization in lower order areas (Munk et al., 1995), and contribute to attentional processing (Gilbert et al., 2000). Mumford (1992) has argued that feedback and feedforward connections have to be considered of equal importance, emphasizing that most cognitive processes reflect a balanced exchange of information between pairs of brain regions that are often hierarchically asymmetrical. However, the equal importance does not imply equal functionality: as suggested by Lamme et al. (1998), feedforward connections may be fast, but they are not necessarily linked with perceptual experience. Instead, attentive vision and visual awareness, which may in everyday life introspectively be linked to a conscious perceptual experience, arise from recurrent processing within the hierarchy (Lamme & Roelfsema, 2000).

As described in more detail later in this chapter, a similar conceptualization of feedforward and feedback connections has been advocated by a number of predictive approaches to cortical processing, all of which emphasize that efficient perception arises from a balanced exchange of bottom-up and top-down signals. Within such recurrent pathways, feedback connections are suggested to trigger “templates,” namely expected reconstructions of sensory signals that can be compared with the input being received from lower-level areas by the feedforward projections. According to one theory, the residual, or the difference between the template and incoming input, is calculated and transmitted to higher level areas (Mumford, 1992). Different models propose different ways to handle the discrepancy between top-down hypotheses and bottom-up input, but they mostly agree that a comparison between ascending and descending information is needed to facilitate convergence.

In conclusion, our understanding of visual recognition strongly requires that we characterize the sources and dynamics of top-down effects in sensory processing. First, we need to distinguish between different types of cognitive biases in vision, some of which may become available before the presentation of a specific stimulus, whereas others are triggered by its immediate appearance.
Specifically, although in some situations we may be able to use prior knowledge to generate more or less specific predictions regarding the upcoming stimulation, in other contexts hypotheses regarding the identity or other features of incoming stimuli can only be generated after the object is presented. In either case, visual perception does not simply reflect a buildup of independently processed stimulus features (e.g., shape, color, or edge information), which are first integrated into a recognizable image and thereafter complemented with other, already existing information about that particular object. Instead, links between the presented input and preexisting representations are created before, or at the initial moment of, object presentation and continuously refined thereafter until the two sufficiently overlap and the object is successfully recognized.

Relevance of Prediction in Simple Visual Recognition

Even the simplest, most impoverished contexts of visual recognition rely on an integration of bottom-up and top-down processing mechanisms. It has been proposed that this type of recognition is predictive in nature, in the sense that predictions are initialized after stimulus presentation based on certain features of the presented input (e.g., global object shape), which are rapidly processed and used for facilitating the processing of other object features (Bar, 2003, 2007; Kveraga et al., 2007b). One proposal is that perception, even when single objects are presented in isolation, progresses through several phases that constitute an activate-predict-confirm perception cycle (Enns & Lleras, 2008). Explicitly or implicitly, this view is in accordance with predictive theories of cortical processing (Bar, 2003; Friston, 2005; Grossberg, 1980; Mumford, 1992; Rao & Ballard, 1999; Ullman, 1995) that were developed in attempts to elucidate the mechanisms of iterative recurrent cortical processing underlying successful cognition. Their advancement was strongly motivated by the increase in knowledge regarding the structural and functional properties of feedback and feedforward connections described earlier, as well as the posited distinctions between forward and inverse models in computer vision (Ballard et al., 1983; Friston, 2005; Kawato et al., 1993). Based on these developments, predictive theories suggested that different cortical systems share a common mechanism of constant formulation and communication of expectations and other top-down biases from higher to lower level cortical areas, which thereby become primed for the anticipated events. This allows the input that arrives at these presensitized areas to be compared and integrated with the postulated expectations, perhaps through specific synchrony patterns visible across different levels of the hierarchy (Engel et al., 2001; Ghuman et al., 2008; Kveraga et al., 2007b; von Stein et al., 2000; von Stein & Sarnthein, 2000). Such comparison and integration of top-down and bottom-up information has been posited to rely on iterative error-minimization mechanisms (Friston, 2005; Grossberg, 1980; Kveraga et al., 2007b; Mumford, 1992; Ullman, 1995) that support successful cognitive functioning.
With respect to visual processing, this means that an object can be recognized once a match between the prepostulated hypothesis and sensory representation is reached, such that no significant difference exists between the two. As mentioned before, this implies that feedforward pathways carry error signals, or information regarding the residual discrepancy between predicted and actual events (Friston, 2005; Mumford, 1992; Rao & Ballard, 1999). This suggestion is highly important because it posits a privileged status for errors of prediction in cortical processing. These events require more pronounced and prolonged analysis because they typically signal inadequacy of the preexisting knowledge for efficient functioning. Consequently, in addition to providing a powerful signal for novelty detection, discrepancy signals often trigger a reevaluation of current knowledge, new learning, or a change in behavior (Corbetta et al., 2002; Escera et al., 2000; Schultz & Dickinson, 2000; Winkler & Czigler, 1998). In contrast, events that are predicted correctly typically carry little informational value (because of successful learning, we expected these to occur all along) and are therefore processed in a more efficient manner (i.e., faster and more accurately) than the unexpected or wrongly predicted ones. Although the described conceptualization of the general dynamics of iterative recurrent processing across different levels of cortical hierarchies is shared by most predictive theories of neural processing, these theories differ with respect to their conceptualizations of more specific features of such processing (e.g., the level of abstraction of the top-down mediated templates or the existence of information exchange outside neighboring levels of cortical hierarchies; cf. Bar, 2003; Kveraga et al., 2007b; Mumford, 1992).
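The iterative error-minimization idea described above can be illustrated with a deliberately simplified sketch. It is not an implementation of any of the cited models: the function name `recognize`, the vector encoding of the stimulus, the update rule, and the parameters `learning_rate` and `tolerance` are all assumptions made for illustration. A higher level holds a template (its prediction), the feedforward sweep carries only the residual between input and template, and the feedback step revises the template until the residual becomes negligible.

```python
def recognize(sensory_input, initial_template, learning_rate=0.5,
              tolerance=1e-3, max_iterations=100):
    """Toy prediction-error loop (illustrative assumption, not a cited model).

    Returns the final template, the number of comparison cycles used,
    and the size of the remaining residual.
    """
    template = list(initial_template)
    for iteration in range(1, max_iterations + 1):
        # Feedforward step: only the residual (input minus prediction) is passed up.
        residual = [s - t for s, t in zip(sensory_input, template)]
        error = sum(r * r for r in residual) ** 0.5
        if error < tolerance:
            # Hypothesis and input match closely enough: "recognition".
            return template, iteration, error
        # Feedback step: the higher level revises its template using the residual.
        template = [t + learning_rate * r for t, r in zip(template, residual)]
    # No convergence within the allowed cycles: report the remaining residual.
    residual = [s - t for s, t in zip(sensory_input, template)]
    error = sum(r * r for r in residual) ** 0.5
    return template, max_iterations, error


stimulus = [1.0, 0.0, 0.5]  # hypothetical feature vector for a stimulus

# A correctly predicted event needs a single comparison cycle ...
_, cycles_expected, _ = recognize(stimulus, initial_template=stimulus)
# ... whereas a wrongly predicted one requires prolonged processing.
final_template, cycles_unexpected, final_error = recognize(
    stimulus, initial_template=[0.0, 0.0, 0.0])
```

In this toy setting, the correctly predicted stimulus is settled in one cycle, while the mispredicted one takes several cycles, loosely mirroring the claim that prediction errors demand more pronounced and prolonged analysis.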


One of the most influential models that highlight the relevance of recurrent connections and top-down feedback in visual processing is the “global-to-local” integrated model of visual processing of Bullier (2001). This model builds on our knowledge of the anatomy and functionality of the visual system, especially the differences between magnocellular and parvocellular visual pathways that carry the visual information from the retina to the brain. Specifically, it takes into account findings showing that the magnocellular and parvocellular pathways are separated over the first few processing stages and that, following stimulus presentation, area V1 receives activation from lateral geniculate nucleus magnocellular neurons around 20 ms earlier than from parvocellular neurons. This faster activation of the M-channel (characterized by high contrast but poor chromatic sensitivity, larger receptive fields, and lower spatial sampling rate), together with the high degree of myelination, could account for the early activation of the dorsal visual processing stream after visual stimulation, which enables the generation of fast feedback connections from higher to lower areas (V1 and V2) at exactly the time when feedforward activation from the P-channel arrives (Bullier, 2001; Kaplan, 2004; Kaplan & Shapley, 1986; Merigan & Maunsell, 1993; Schiller & Malpeli, 1978; Goodale & Milner, 1992). This view is very different not only from the classic theories that emphasize the importance of feedforward connections but also from the more “traditional” account of feedback connections stating that, regardless of the overall importance of recurrent connections, the first sweep of activity through the hierarchy of (both dorsal and ventral) visual areas is still primarily determined by the pattern of feedforward connections (Thorpe & Imbert, 1989). The integrated model of visual processing treats V1 and V2 areas as general-purpose representators that integrate computations from higher levels, allowing global information to influence the processing of more detailed information. V1 could be a place for perceptual integration that reunites information returned from different higher level areas by feedback connections after being divided during the first activity sweep for a very simple reason: This cortical area still has a high-resolution map of almost all relevant feature information. This view resonates well with recent theories of visual processing and awareness, such as Zeki’s theory of visual consciousness (Zeki & Bartels, 1999), Hochstein and Ahissar’s theory of perceptual processing and learning (Hochstein & Ahissar, 2002), or Lamme’s (2006) views on consciousness.

Another model that suggests a similar division of labor and offers a predictive view of visual processing in object recognition was put forth by Bar and colleagues (Bar, 2003, 2007; Bar et al., 2006; Kveraga et al., 2007a). This model identifies the information content that triggers top-down hypotheses and characterizes the exact cortical regions that bias visual processing by formulating and communicating those top-down predictions to lower-level cortical areas. It also explains how top-down information modulates bottom-up processing in situations in which single objects are presented in isolation and in which prediction is driven by a subset of an object’s own properties that facilitate the recognition of the object itself as well as other objects it typically appears with. Specifically, this top-down facilitation model posits that visual recognition progresses from an initial stage aimed at determining what an object resembles to a later stage geared toward specifying its fine details. For this to occur, the top-down facilitation model critically assumes

that different features of the presented stimulus are processed at different processing stages. First, the coarse, global information regarding the object shape is rapidly extracted and used for activating in memory existing representations that most resemble the global properties of the given object to be recognized. These include all objects that share the rudimentary properties of the presented object and look highly similar if viewed in blurred or decontextualized circumstances (e.g., a mushroom, desk lamp, and umbrella). Although an object cannot be identified with full certainty based on coarse stimulus outline, such rapidly extracted information is still highly useful for basic-level recognition of resemblance, creating the so-called analogies (Bar, 2007). Generally, analogies allow the formulation of links between the presented input and relevant preexisting representations in memory, which may be based on different types of similarity between the two (e.g., perceptual, semantic, functional, or conceptual). In the context of visual recognition, analogies are based on global perceptual similarity between the input object and objects in memory, which allows the brain to generate multiple hypotheses or guesses regarding the object’s most likely identity.

For these analogies and initial guesses to be useful, they need to be triggered early during visual processing, while they can still bias the slower incoming bottom-up input. Thus, information regarding the coarse stimulus properties has to be processed first, before the finer object features. Indeed, it has been suggested that such information is rapidly extracted and conveyed using low spatial frequencies of the visual input through the magnocellular (M) pathway, which is ideal for transmitting coarse information regarding the general object outlines at higher velocity compared with other pathways (Merigan & Maunsell, 1993; Schiller & Malpeli, 1978).
This information is transmitted to the orbitofrontal cortex (OFC) (Bar, 2003; Bar et al., 2006), a polymodal region implicated primarily in the processing of rewards and affect (Barbas, 2000; Carmichael & Price, 1995; Cavada et al., 2000; Kringelbach & Rolls, 2004), as well as supporting some aspects of visual processing (Bar et al., 2006; Freedman et al., 2003; Frey & Petrides, 2000; Meunier et al., 1997; Rolls et al., 2005; Thorpe et al., 1983). In the present framework, the OFC is suggested to generate predictions regarding the object’s identity by activating all representations that share the global appearance with the presented image by relying on rapidly analyzed low spatial frequencies (Bar, 2003, 2007). Once fed back into the ventral processing stream, these predictions interact with the slower incoming bottom-up information and facilitate the recognition of the presented object. Experimental findings corroborate the hypothesis regarding the relevance of the M-pathway for transmitting coarse visual information (Bar, 2003; Bar et al., 2006; Kveraga et al., 2007b), as well as the importance of OFC activity, and the interaction between the OFC and inferior temporal cortex, for recognizing visual objects (Bar et al., 2001, 2006). It has also been suggested that low spatial frequency information is used for generating predictions regarding other objects and events that are likely to be encountered in the same context (Bar, 2004; Oliva & Torralba, 2007). As also suggested by Ganis and Kosslyn (2007), the associative nature of our long-term memory plays a crucial role in matching the presented input with preexisting representations and all associated information relevant for object identification. Finally, as suggested by Barrett and Bar (2009), different portions of the OFC are also implicated in mediating affective predictions by supporting a representation of objects’ emotional salience that constitutes an inherent part of visual recognition. The exact dynamics of affective predictions, and their interaction with other types of top-down biases in visual recognition, remain to be elucidated.
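The two-stage logic of the top-down facilitation model can be made concrete with a small sketch. It is only a caricature under stated assumptions: the 4 x 4 binary "images", the 2 x 2 block averaging used as a stand-in for low spatial frequency information, the stored object shapes, and the threshold value are all hypothetical and are not taken from the cited studies. A coarse version of the input first selects a set of candidate identities (the "initial guesses"), and the full-resolution input then selects among those candidates only.

```python
def grid(rows):
    """Convert strings such as '0110' into a list of pixel rows."""
    return [[int(ch) for ch in row] for row in rows]


def coarse(image, block=2):
    """Block-average the image: a crude stand-in for low spatial frequencies."""
    size = len(image)
    return [[sum(image[r + dr][c + dc]
                 for dr in range(block) for dc in range(block)) / block ** 2
             for c in range(0, size, block)]
            for r in range(0, size, block)]


def distance(a, b):
    """Sum of absolute differences between two equally sized grids."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


# Hypothetical stored representations (toy shapes, not real stimuli).
memory = {
    "mushroom": grid(["0110", "1111", "0110", "0110"]),
    "umbrella": grid(["0110", "1111", "0010", "0110"]),
    "lamp":     grid(["0110", "1111", "0110", "1111"]),
    "pole":     grid(["1000", "1000", "1000", "1000"]),
}

presented = grid(["0110", "1111", "0110", "0110"])

# Stage 1: fast, coarse analysis yields several candidate identities.
coarse_threshold = 0.6  # assumed value, chosen for this toy example
candidates = [name for name, image in memory.items()
              if distance(coarse(presented), coarse(image)) <= coarse_threshold]

# Stage 2: slower, detailed input is compared only against the candidates.
best_match = min(candidates, key=lambda name: distance(presented, memory[name]))
```

Here the coarse stage keeps the mushroom, umbrella, and lamp as plausible guesses and discards the pole, and the detailed stage settles on the mushroom. The point of the sketch is the division of labor, namely that the detailed comparison is restricted to a preselected candidate set, rather than any claim about how cortex computes similarity.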

Contextual Top-Down Effects

In the previous sections, we described how the processing of single objects might be facilitated by using rapidly processed global shape information for postulating hypotheses regarding their identity. However, in the real world, objects are rarely presented in isolation. Instead, they are embedded in particular environments and surrounded by other objects that are typically not random, but rather are mutually related in that they typically share the same context. Such relations that are based on frequent co-occurrence of objects within the same spatial or temporal context may be referred to as contextual associations. Importantly, objects that are typically found in the same environment can share many qualitatively different types of relations. For example, within one particular context (e.g., a kitchen), it may be possible to find objects that are semantically or categorically related in the sense that they belong to the same semantic category (e.g., kitchen appliances such as a dishwasher and a refrigerator), as well as those that are typically used together, thus sharing a mutual functional relation (e.g., a frying pan and oil). However, the existence of such relations is not crucial for defining contextual associates because some may only have the environment that they typically coinhabit in common (e.g., a shower curtain and a toilet brush). Furthermore, some categorically or contextually related objects may be perceptually similar (e.g., an orange and a peach) or dissimilar (e.g., an apple and a banana). In addition, various contextual associates may share spatial relations of different flexibility, for example, such that a bathroom mirror is typically placed above the sink, whereas a microwave and a refrigerator may be encountered in different relational positions within the kitchen.
Finally, although some object pairs may share only one relation (e.g., categorical: a cat and a goat; or contextual: a towel and sunscreen), others may be related in more than one way (e.g., a mouse and a cat are related both contextually and categorically, whereas a cat and a dog are related contextually, categorically, and perceptually). Although the number of different association types shared by two objects may be of some relevance, it is mainly the frequency and consistency of their co-occurrence that determine the strength of their associative connections (Bar, 2004; Biederman, 1981; Hutchison, 2003; Spence & Owens, 1990). Such associative strength is important because it provides information that our brain can use to make accurate predictions. Our nervous system is extremely efficient in capturing the statistics of natural visual scenes (Fiser & Aslin, 2001; Torralba & Oliva, 2003), as well as in learning the repeated spatial contingencies of even arbitrarily distributed abstract stimuli (Chun & Jiang, 1999). In addition to being sensitive to contextual relations and successful in learning them, our brain is also highly efficient in using this knowledge to facilitate its own processing. In accordance with the general notion of a proactively predictive brain (Bar, 2009), it has repeatedly been demonstrated that learned contextual information is constantly used to increase the efficiency of visual search and recognition (Bar, 2004; Biederman et al., 1982; Davenport & Potter, 2004; Torralba et al., 2006). Specifically, objects presented against familiar backgrounds, especially if they appear in the expected spatial configuration, are processed faster and more accurately than those presented in incongruent settings. Furthermore, the contextual and semantic redundancy provided by the presentation of several related objects in such settings allows the observer to resolve the uncertainties and ambiguities of individual objects more efficiently. Such context-based predictions are extremely useful because they save processing resources and reduce the need to exert mental effort while dealing with predictable aspects of our surroundings. They allow us to allocate attention to relevant aspects of the environment more efficiently, and they help guide our actions and behavior.

However, to understand the mechanisms underlying contextual top-down facilitation effects, it is important to explain how the overall meaning, or gist, of a complex image can be processed fast enough to become useful for guiding the processing of the individual objects presented within it (Bar, 2004; Oliva, 2005; Oliva & Torralba, 2001). Studies that have addressed this issue have demonstrated that extracting the gist of a scene relies mainly on the low spatial frequencies present in the image. This is not surprising because, as described earlier, such frequencies can be rapidly analyzed within the M-pathway, which in this case allows them to aid the rapid classification of the presented context (Bar, 2004; Oliva & Torralba, 2001; Schyns & Oliva, 1994).
Specifically, information regarding global scene features can be used proactively to activate context frames, namely, representations of the objects and relations that are common to that specific context (Bar, 2004; Bar & Ullman, 1996). Similar to the ideas of schemata (Hock et al., 1978), scripts (Schank, 1975), and frames (Minsky, 1975) from the 1970s and 1980s, such context frames are suggested to aid the processing of individual objects presented in the scene. At this point, it is important to mention that context frames should not be understood as static images that are activated in an all-or-none manner, but rather as dynamic entities that are processed gradually. In this view, a prototypical spatial template of the global structure of a familiar context is activated first and is then filled with more instance-specific details until it develops into an episodic context frame that includes specific information regarding an individual instantiation of the given context. Overall, there is ample evidence of our brain’s efficiency in extracting the gist of even highly complex visual images, which allows it to identify the individual objects typically encountered in such settings.

This, however, represents only one level of predictive, contextual top-down modulation. On another level, it is possible that individual objects that are typically encountered together in a certain context are also able to mutually facilitate each other’s processing. In other words, although it has long been known that a kitchen environment helps in recognizing the refrigerator that it typically contains, it was less clear whether seeing an image of that refrigerator in isolation could automatically invoke the image of the contextual setting in which it is typically embedded. If so, it would be plausible to expect that presenting objects typically associated with a certain context in isolation could facilitate subsequent recognition of their typical contextual associates, even when these are presented outside the shared setting. This hypothesis has been addressed and substantiated in a series of studies conducted by Bar and colleagues (Aminoff et al., 2007; Bar et al., 2008a, 2008b; Bar & Aminoff, 2003; Fenske et al., 2006) that indicated the relevance of the parahippocampal cortex (PHC), the retrosplenial complex (RSC), and the medial prefrontal cortex (MPFC) for contextual processing. In addition to identifying the regions relevant for contextual relations, these studies revealed a greater sensitivity of the PHC to associations with greater physical specificity, in contrast to the RSC, which seems to represent contextual associations in a more abstract manner (Bar et al., 2008b). Similar to the neighboring OFC’s role in generating predictions related to object identity, the involvement of the neighboring part of the MPFC was suggested to reflect the formulation of predictions based on familiar contextual associations, both visual-spatial and more abstract (Bar & Aminoff, 2003; Bar et al., 2008a, 2008b). These findings are highly relevant because they illustrate how the organization of the brain as a whole may automatically and parsimoniously support the strongly associative nature of our cognitive mind (Figure 4.1).

Before moving on to other types of top-down modulations, it is important to highlight that the described contextual effects in visual recognition represent only one type of contextual phenomenon in vision. As described, they mostly address the functional or even semantic context in which objects appear. On a somewhat more basic level, simpler stimulus contextual effects also need to be mentioned as a form of modulatory influence in vision.
For example, contextual effects in simple figure–ground segregation are visible in situations in which the response of a neuron is affected by the global characteristics of the contour defining an object that lies outside the neuron’s receptive field (Li et al., 2006). Generally, it is hard to determine whether such influences should be considered top-down because some of them might be mediated solely by local connections intrinsic to the area involved in processing a certain feature. Thus, strictly speaking, they might not fit the earlier definitions of top-down modulations as influences from higher level processing stages (Gilbert & Sigman, 2007). However, given the influence of these stimulus contextual effects on many elements of visual processing, including contour integration, scene segmentation, color constancy, and motion processing (Gilbert & Sigman, 2007), their relevance for intact visual perception has to be acknowledged. In a sense, even gestalt rules of perceptual grouping may be seen as similar examples of modulations in vision because they summarize how contours are perceptually linked as a consequence of their geometric relationships (Todorović, 2007). These rules most clearly indicate how, indeed, “the whole is greater than the sum of its parts,” because they show how our perception of each stimulus strongly depends on the contextual setting in which it is embedded. A very interesting feature of gestalt rules, and of information integration in general, is that knowledge regarding the grouping process itself does not easily modify the percept. In the case of the Müller-Lyer illusion, for example, knowing that the two lines are of equal length does not necessarily make the illusion go away (Bruno & Franz, 2009).


Figure 4.1 Parallel to the systematic bottom-up progression of image details that are mainly mediated by high spatial frequency (HSF) information along the ventral visual pathway, rapid projections of coarse low spatial frequencies (LSFs) trigger the generation of hypotheses or “initial guesses” regarding the exact object identity and the context within which it typically appears. Both of these types of predictions are validated and refined with the gradual arrival of HSFs (Bar, 2004). IT, inferior temporal cortex; MPFC, medial prefrontal cortex; OFC, orbital frontal cortex; RSC, retrosplenial cortex; PHC, parahippocampal cortex. Modified, with permission, from Bar (2009).
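
As a rough illustration of the coarse-to-fine scheme summarized in Figure 4.1, the following sketch splits a tiny grayscale image into a low spatial frequency (LSF) component and a high spatial frequency (HSF) residual. The box blur, the 3 × 3 window, the toy image, and all function names are illustrative assumptions; they are not the filtering procedures used in the studies cited in this chapter.

```python
def box_blur(image, radius=1):
    """Return a low-pass (blurred) copy of a 2D list of floats.

    Each output pixel is the mean of the input pixels inside a
    (2 * radius + 1)-wide square window, clipped at the image borders.
    This crude filter is an assumption chosen for simplicity.
    """
    height, width = len(image), len(image[0])
    blurred = []
    for r in range(height):
        row = []
        for c in range(width):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            ]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def split_spatial_frequencies(image, radius=1):
    """Split an image into a coarse LSF part and a detailed HSF residual."""
    lsf = box_blur(image, radius)
    hsf = [
        [pixel - coarse for pixel, coarse in zip(image_row, lsf_row)]
        for image_row, lsf_row in zip(image, lsf)
    ]
    return lsf, hsf


# Hypothetical 4 x 4 "image": a bright square on a dark background.
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
lsf, hsf = split_spatial_frequencies(image)
```

In this toy case the LSF array keeps only the blob-like layout of the image, the kind of coarse information that the text links to rapid initial guesses, whereas adding the HSF residual back to it recovers the original image.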

Interfunctional Nature of Top-Down Modulations

In the previous section, a “purely perceptual” aspect of top-down modulations was introduced because the theoretical proposals and experimental findings described there mostly focused on top-down modulations that are triggered by the presentation of a single stimulus. However, top-down influences in visual perception include a much wider array of phenomena. Specifically, in many cases, top-down-mediated preparation begins before the actual presentation of the stimulus and is triggered by factors such as an instruction (Carlsson et al., 2000) or a specific task cue (Simmons et al., 2004). These types of influences may be considered contextual in the sense that they reflect the behavioral context of visual processing, which is related to the perceptual task at hand (Watanabe et al., 1998). In this case, the participant may choose to focus on a certain aspect of a stimulus that is expected to be relevant in the future, thus triggering a modulatory effect on the feedforward analysis of the stimulus once it appears. This is similar to the way in which prior knowledge regarding the likely spatial location or other features of objects expected to appear in the near future influences our perception (Bar, 2004; Biederman, 1972, 1981; Biederman et al., 1973; Driver & Baylis, 1998; Scholl, 2001). Typically, knowing what to expect allows one to attend to the location or other features of the expected stimulus. Influences based not only on spatial location but also on object features, objects or categories, temporal context, or other perceptual groups could be considered different subtypes of top-down influences (Gilbert & Sigman, 2007).

In addition, the prior presentation of events that provide clues regarding the identity of the incoming stimulation may also influence visual processing. One specific form of the influence of previous experience on current visual processing is priming (Schacter et al., 2004; Tulving & Schacter, 1990; Wiggs & Martin, 1998). Events that occur before target presentation and that influence its processing may include those that have a long-term semantic or contextual relation to the target event (Kveraga et al., 2007b), as well as stimuli that have become associated with the target stimulus through short-term learning within the same (Schubotz & von Cramon, 2001, 2002) or a different (Widmann et al., 2007) modality. It has been suggested that, in some cases, and especially in the auditory domain, predictions regarding forthcoming stimuli can be realized within the sensory system itself (Näätänen et al., 2007). In other cases, however, these predictive sensory phenomena may be related to the computations of the motor domain. Specifically, expectations regarding the incoming stimulation may be formulated based on an “efference copy” elicited by the motor system in situations in which a person’s own actions trigger such stimulation (Sperry, 1950). In the 1950s, von Holst, Mittelstaedt, and Sperry provided the first experimental evidence demonstrating the importance of motor-to-sensory feedback in controlling behavior (Bays & Wolpert, 2008; Wolpert & Flanagan, 2001).
This motivated an increased interest in the so-called internal model framework, which can now be considered a prevailing, widely accepted view of action generation (Miall & Wolpert, 1996; Wolpert & Kawato, 1998). According to this framework, not only is there a prediction for sensory events that may be considered consequences of one’s own behavior (Blakemore et al., 1998), but the same system may also be used for anticipating sensory events that are strongly associated with other sensory events co-occurring on a short time scale (Schubotz, 2007). Before moving away from the motor system, it is important to mention one more, somewhat different view that also emphasizes a strong link between perceptual and motor behavior. According to Gross and colleagues (1999), any discussion of perceptual processing should place a strong focus on sensorimotor anticipation because it allows one to directly characterize a visual scene in categories of behavior. Thus, perception is not simply predictive but “predictive with purpose” and may thus be described as behavior based (Gross et al., 1999) or, as suggested by Arbib (1972), action oriented.

Up to now, top-down modulatory influences have been described in a multitude of different contexts and circumstances. However, the one that is possibly the most obvious has not yet been addressed. Specifically, we have not discussed in any detail the relevance and necessity of top-down modulations in situations in which we can, from our daily experience, explicitly appreciate the aid of prior knowledge in processing the stimuli at hand. This mostly concerns situations in which the visual input is impoverished or ambiguous and in which we almost consciously rely on top-down information to complete the percept. For example, contextual information is crucial in recovering visual scene properties lost because of blur or superimposition in the visual image (Albright & Stoner, 2002). In situations in which an object is ambiguous and may be interpreted in more than one way, a prior template that may guide recognition is even more important than when viewing a clear percept (Cavanagh, 1991). Examples of such stimuli include two faces or a vase versus one face behind a vase (Costall, 1980) or, as shown in Figure 4.2, the face of a young versus an old woman, or a man playing a saxophone seen in silhouette versus a woman’s face in shadow (Shepard, 1990).

Figure 4.2 Ambiguous figures demonstrate how our experience of the visual world is not determined solely by bottom-up input: example of the figure of a young woman versus an old woman, and a woman’s face versus a man playing the saxophone.

Generally, in the previous sections, a wide array of top-down phenomena has been mentioned, and some were described in more detail. From this description, it became clear how difficult it is to categorize these effects in relation to other cognitive functions, the most important of which is attention. Specifically, not only is it hard to clearly distinguish between attentional and other forms of top-down phenomena, but also, given that the same effect is sometimes discussed as an attentional and sometimes as a perceptual phenomenon, a clear separation may not be possible. Such a separation might not even be necessary in all situations because, for most practical purposes, the mechanisms underlying such phenomena and the effects they introduce may be considered the same. When discussing top-down influences of selective attention, one typically refers to processing guided by hypotheses or expectations and to the influence of prior knowledge or other personal features on stimulus selection (Engel et al., 2001), which is quite consistent with the way top-down influences have been addressed in this chapter. Even the aspects of visual perception mentioned here might, at least in part, be categorized under anticipatory attention, which involves preparation for the upcoming stimulus and improves the speed and precision of stimulus processing (Posner & Dehaene, 1994; Brunia, 1999). Not surprisingly, then, instead of always categorizing a certain effect into one functional “box,” Gazzaley (2010) uses the general term “top-down modulation,” which includes changes in sensory cortical activity associated with relevant and irrelevant information that stand at the crossroads of perception, memory, and attention. Similarly, a general conceptualization of top-down influences as those that reflect an interaction of goals, action plans, working memory, and selective attention is suggested by Engel et al. (2001). In an attempt to better differentiate the basic types of top-down influences in perception, Ganis and Kosslyn (2007) suggested a crucial distinction between strategic top-down processes, which are under voluntary control, and involuntary, reflexive types of top-down modulations. In a somewhat different view, Summerfield and Egner (2009) argued that biases in visual processing include attentional mechanisms, which prioritize processing based on motivational relevance, and expectations, which bias processing based on prior experience. Despite these initial attempts to create a clear taxonomy of top-down influences in perceptual processing, there is still considerable disagreement in this area that remains to be settled.

On a more anatomical and physiological side, it has been suggested that top-down modulations should not be regarded as an intrinsic property of individual sensory areas, but instead as a phenomenon realized through long-range connections between distant brain regions (Gazzaley, 2010). In this context, a lot of work has been invested in clearly defining the two types of areas involved in such interactive processing: sites and sources of biases. Specifically, sites are those areas in which the analysis of afferent signals takes place, whereas sources are those that provide the biasing information that modulates processing (Frith & Dolan, 1997).
Although some of the previously mentioned theories of visual perception that emphasize the role of feedback connections (Grossberg, 1980; Mumford, 1992; Ullman, 1995) often consider the immediate external stimulation to constitute the source of feedback information, others consider modulatory “bias” signals to be triggered by the system itself based on prior knowledge (Engel et al., 2001). The sources of such signals most often include the prefrontal, but also the parietal and temporal, cortices, as well as the limbic system, depending on the specific type of information influencing processing (Beck & Kastner, 2009; Engel et al., 2001; Hopfinger et al., 2000; Miller & Cohen, 2001). All of them aid and guide visual processing by providing relevant information necessary for the evaluation and interpretation of the incoming stimulus.

An important thing to keep in mind, however, is that defining top-down modulations in terms of long-range connections may make it harder to recognize some influences that have a modulatory role in perception and contribute to our experience of an intact percept. In this conceptualization, it is not quite clear how distant two regions have to be for their interaction to be considered a “top-down” phenomenon, or whether some short-range and local connections that modulate our perception may also be considered “top-down.” Nor is it clear what should be more relevant for defining top-down modulations: the distance between regions or the existence of recurrent processing between separate units that allows us to use prior experiences and knowledge to modulate the feedforward sweep of processing.
This modulation may, as suggested in the attentional theory of Kastner and Ungerleider (2000), be realized through an increase in the neural response to the relevant or attended stimulus and an attenuation of the response to the irrelevant stimulus before or after its presentation, a claim that has been experimentally corroborated (Desimone & Duncan, 1995; Pessoa et al., 2003; Reynolds & Chelazzi, 2004). Furthermore, as suggested by Dehaene et al. (1998), such top-down attentional amplification, or an increase in activity related to the relevant stimulus, is also relevant as the mechanism that allows the stimulus to be made available to consciousness.
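
The combination of response enhancement for the attended stimulus and attenuation for the unattended one can be made concrete with a toy calculation. The sketch below is only an illustration under stated assumptions: the multiplicative gain, the divisive normalization that makes the stimuli compete, the parameter values, and the function name are hypothetical choices and are not taken from Kastner and Ungerleider (2000) or the other studies cited above.

```python
def biased_competition(drives, attended, gain=2.0, sigma=1.0):
    """Toy biased-competition calculation (illustrative assumption).

    drives   -- dict mapping a stimulus name to its bottom-up drive
    attended -- name of the stimulus favored by the top-down bias,
                or None for no bias
    gain     -- multiplicative top-down bias applied to the attended drive
    sigma    -- constant that keeps the normalization pool above zero
    """
    # Top-down bias: scale the drive of the attended stimulus only.
    biased = {
        name: drive * (gain if name == attended else 1.0)
        for name, drive in drives.items()
    }
    # Competition: every response is divided by the pooled activity.
    pool = sigma + sum(biased.values())
    return {name: value / pool for name, value in biased.items()}


# Two hypothetical stimuli with identical bottom-up drive.
drives = {"target": 1.0, "distractor": 1.0}
neutral = biased_competition(drives, attended=None)
biased = biased_competition(drives, attended="target")
```

With these made-up numbers, biasing the target raises its response relative to the neutral case while the distractor response drops, even though the sensory input is unchanged. The particular values carry no empirical meaning.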

Concluding Remarks

Visual recognition was depicted here as a process that reflects the integration of two equally relevant streams of information. The first, the bottom-up stream, captures the input present in the visual image itself and conveyed through the senses. The second, the top-down stream, contributes information based on prior experiences, current personal and behavioral sets, and future expectations, and modifies the processing of the presented stimulus. In this conceptualization, perception may be considered the process of integrating the incoming input with our preexisting knowledge, a process that takes place in all contexts, even in very controlled and simple viewing circumstances.

In this chapter, it has been argued that top-down effects in visual recognition may be of different complexity and may be triggered by the stimulus itself or by information external to the presented stimulus, such as the task instruction, copies of motor commands, or the prior presentation of events informative for the current stimulation. These various types of top-down influences originate from different cortical sources, reflecting the fact that they are triggered by different types of information. Regardless of the differences in their respective origins, different types of top-down biases may nevertheless result in similar local effects or rely on similar mechanisms for the communication between lower level visual areas and higher level regions to enable the described effects. In that sense, understanding the principles of one type of top-down facilitation effect may provide important clues regarding the general mechanisms of triggering and integrating top-down and bottom-up information that underlie successful visual recognition.

In conclusion, the body of research synthesized here demonstrates the richness, complexity, and importance of top-down effects in visual perception and illustrates some of the fundamental mechanisms underlying them.
Critically, it is suggested that top-down effects should not be considered a secondary phenomenon that occurs only in special, extreme settings. Instead, the wide, heterogeneous set of such biases suggests an ever-present and complex information processing stream that, together with the bottom-up stream, constitutes the core of visual processing. As suggested by Gregory (1980), perception can then be defined as a dynamic search for the best interpretation of the sensory data, a claim that highlights both the active and the constructive nature of visual perception. Consequently, visual recognition itself should be considered a proactive, predictive, and dynamic process of integrating different sources of information, the success of which is determined by their mutual resonance and correspondence and by our ability to learn from the past in order to predict the future.


Author Note

Work on this chapter was supported by NIH grant R01EY019477-01, NSF grant 0842947, and DARPA grant N10AP20036.

References

Albright, T. D., & Stoner, G. R. (2002). Contextual influences on visual processing. Annual Review of Neuroscience, 25, 339–379.
Aminoff, E., Gronau, N., & Bar, M. (2007). The parahippocampal cortex mediates spatial and nonspatial associations. Cerebral Cortex, 17, 1493–1503.
Arbib, M. (1972). The metaphorical brain: An introduction to cybernetics as artificial intelligence and brain theory. New York: Wiley Interscience.
Ballard, D. H., Hinton, G. E., & Sejnowski, T. J. (1983). Parallel visual computation. Nature, 306, 21–26.
Bar, M. (2009). The proactive brain: Memory for predictions. Theme issue: Predictions in the brain: Using our past to generate a future (M. Bar, Ed.), Philosophical Transactions of the Royal Society, Series B, Biological Sciences, 364, 1235–1243.
Bar, M. (2007). The proactive brain: Using analogies and associations to generate predictions. Trends in Cognitive Sciences, 11 (7), 280–289.
Bar, M. (2004). Visual objects in context. Nature Reviews, Neuroscience, 5 (8), 617–629.
Bar, M. (2003). A cortical mechanism for triggering top-down facilitation in visual object recognition. Journal of Cognitive Neuroscience, 15, 600–609.
Bar, M., & Aminoff, E. (2003). Cortical analysis of visual context. Neuron, 38 (2), 347–358.
Bar, M., Aminoff, E., & Ishai, A. (2008a). Famous faces activate contextual associations in the parahippocampal cortex. Cerebral Cortex, 18 (6), 1233–1238.
Bar, M., Aminoff, E., & Schacter, D. L. (2008b). Scenes unseen: The parahippocampal cortex intrinsically subserves contextual associations, not scenes or places per se. Journal of Neuroscience, 28, 8539–8544.
Bar, M., Kassam, K. S., Ghuman, A. S., Boshyan, J., Schmid, A. M., Dale, A. M., Hämäläinen, M. S., Marinkovic, K., Schacter, D. L., Rosen, B. R., & Halgren, E. (2006). Top-down facilitation of visual recognition. Proceedings of the National Academy of Sciences U S A, 103 (2), 449–454.
Bar, M., Tootell, R., Schacter, D., Greve, D., Fischl, B., Mendola, J., Rosen, B. R., & Dale, A. M. (2001). Cortical mechanisms specific to explicit visual object recognition. Neuron, 29 (2), 529–535.
Bar, M., & Ullman, S. (1996). Spatial context in recognition. Perception, 25 (3), 343–352.
Barbas, H. (2000). Connections underlying the synthesis of cognition, memory, and emotion in primate prefrontal cortices. Brain Research Bulletin, 52 (5), 319–330.
Barrett, L. F., & Bar, M. (2009). See it with feeling: Affective predictions during object perception. Theme issue: Predictions in the brain: Using our past to generate a future (M. Bar, Ed.), Philosophical Transactions of the Royal Society, Series B, Biological Sciences, 364, 1325–1334.
Bays, P. M., & Wolpert, D. M. (2008). Predictive attenuation in the perception of touch. In P. Haggard, Y. Rossetti, & M. Kawato (Eds.), Sensorimotor foundations of higher cognition: Attention and performance XXII (pp. 339–359). New York: Oxford University Press.
Beck, D. M., & Kastner, S. (2009). Top-down and bottom-up mechanisms in biasing competition in the human brain. Vision Research, 49 (10), 1154–1165.
Biederman, I. (1981). On the semantics of a glance at a scene. In M. Kubovy & J. R. Pomerantz (Eds.), Perceptual organization (pp. 213–263). Hillsdale, NJ: Erlbaum.
Biederman, I. (1972). Perceiving real-world scenes. Science, 177, 77–80.
Biederman, I., Glass, A. L., & Stacy, W. (1973). Searching for objects in real-world scenes. Journal of Experimental Psychology, 97, 22–27.
Biederman, I., Mezzanotte, R. J., & Rabinowitz, J. C. (1982). Scene perception: Detecting and judging objects undergoing relational violations. Cognitive Psychology, 14 (2), 143–177.
Blakemore, S. J., Rees, G., & Frith, C. D. (1998). How do we predict the consequences of our actions? A functional imaging study. Neuropsychologia, 36 (6), 521–529.
Brunia, C. H. M. (1999). Neural aspects of anticipatory behavior. Acta Psychologica, 101, 213–242.
Bruno, N., & Franz, V. H. (2009). When is grasping affected by the Müller-Lyer illusion? A quantitative review. Neuropsychologia, 47 (6), 1421–1433.
Büchel, C., & Friston, K. J. (1997). Modulation of connectivity in visual pathways by attention: Cortical interactions evaluated with structural equation modeling and fMRI. Cerebral Cortex, 7 (8), 768–778.
Bullier, J. (2001). Integrated model of visual processing. Brain Research Reviews, 36, 96–107.
Carlsson, K., Petrovic, P., Skare, S., Petersson, K. M., & Ingvar, M. (2000). Tickling anticipations: Neural processing in anticipation of a sensory stimulus. Journal of Cognitive Neuroscience, 12, 691–703.
Carmichael, S. T., & Price, J. L. (1995). Limbic connections of the orbital and medial prefrontal cortex in macaque monkeys. Journal of Comparative Neurology, 363 (4), 615–641.
Cavada, C., Company, T., Tejedor, J., Cruz-Rizzolo, R. J., & Reinoso-Suarez, F. (2000). The anatomical connections of the macaque monkey orbitofrontal cortex: A review. Cerebral Cortex, 10 (3), 220–242.
Cavanagh, P. (1991). What’s up in top-down processing? In A. Gorea (Ed.), Representations of vision: Trends and tacit assumptions in vision research (pp. 295–304). Cambridge, UK: Cambridge University Press.
Chun, M. M., & Jiang, Y. (1999). Top-down attentional guidance based on implicit learning of visual covariation. Psychological Science, 10, 360–365.
Corbetta, M., Kincade, J. M., & Shulman, G. L. (2002). Neural systems for visual orienting and their relationships to spatial working memory. Journal of Cognitive Neuroscience, 14, 508–523.
Costall, A. (1980). The three faces of Edgar Rubin. Perception, 9, 115.
Davenport, J. L., & Potter, M. C. (2004). Scene consistency in object and background perception. Psychological Science, 15 (8), 559–564.
Dehaene, S., Kerszberg, M., & Changeux, J. P. (1998). A neuronal model of a global workspace in effortful cognitive tasks. Proceedings of the National Academy of Sciences U S A, 95, 14529–14534.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222.
Driver, J., & Baylis, G. C. (1998). Attention and visual object segmentation. In R. Parasuraman (Ed.), The attentive brain (pp. 299–325). Cambridge, MA: MIT Press.
Engel, A. K., Fries, P., & Singer, W. (2001). Dynamic predictions: Oscillations and synchrony in top-down processing. Nature Reviews, Neuroscience, 2 (10), 704–716.
Enns, J. T., & Lleras, A. (2008). What’s next? New evidence for prediction in human vision. Trends in Cognitive Sciences, 12, 327–333.
Escera, C., Alho, K., Schröger, E., & Winkler, I. (2000). Involuntary attention and distractibility as evaluated with event-related brain potentials. Audiology and Neurootology, 5, 151–166.
Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in primate visual cortex. Cerebral Cortex, 1, 1–47.
Fenske, M. J., Aminoff, E., Gronau, N., & Bar, M. (2006). Top-down facilitation of visual object recognition: Object-based and context-based contributions. Progress in Brain Research, 155, 3–21.
Fiser, J., & Aslin, R. N. (2001). Unsupervised statistical learning of higher-order spatial structures from visual scenes. Psychological Science, 12, 499–504.
Freedman, D. J., Riesenhuber, M., Poggio, T., & Miller, E. K. (2003). A comparison of primate prefrontal and inferior temporal cortices during visual categorization. Journal of Neuroscience, 23 (12), 5235–5246.
Frey, S., & Petrides, M. (2000). Orbitofrontal cortex: A key prefrontal region for encoding information. Proceedings of the National Academy of Sciences U S A, 97, 8723–8727.
Frith, C., & Dolan, R. J. (1997). Brain mechanisms associated with top-down processes in perception. Philosophical Transactions of the Royal Society, Series B, Biological Sciences, 352 (1358), 1221–1230.
Friston, K. (2005). A theory of cortical responses. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 360 (1456), 815–836.
Ganis, G., & Kosslyn, S. M. (2007). Multiple mechanisms of top-down processing in vision. In S. Funahashi (Ed.), Representation and brain (pp. 21–45). Tokyo: Springer-Verlag.
Gazzaley, A. (2010). Top-down modulation: The crossroads of perception, attention and memory. Proceedings of SPIE-IS&T Electronic Imaging, SPIE Vol. 7527, 75270A.
Ghuman, A., Bar, M., Dobbins, I. G., & Schnyer, D. (2008). The effects of priming on frontal-temporal communication. Proceedings of the National Academy of Sciences U S A, 105 (24), 8405–8409.
Gross, H., Heinze, A., Seiler, T., & Stephan, V. (1999). Generative character of perception: A neural architecture for sensorimotor anticipation. Neural Networks, 12 (7–8), 1101–1129.
Gilbert, C., Ito, M., Kapadia, M., & Westheimer, G. (2000). Interactions between attention, context and learning in primary visual cortex. Vision Research, 40 (10–12), 1217–1226.
Gilbert, C. D., & Sigman, M. (2007). Brain states: Top-down influences in sensory processing. Neuron, 54, 677–696.
Girard, P., & Bullier, J. (1989). Visual activity in area V2 during reversible inactivation of area 17 in the macaque monkey. Journal of Neurophysiology, 62 (6), 1287–1302.
Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15 (1), 20–25.
Gregory, R. L. (1980). Perceptions as hypotheses. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 290, 181–197.
Grossberg, S. (1980). How does a brain build a cognitive code? Psychological Review, 87 (1), 1–51.
Hochstein, S., & Ahissar, M. (2002). View from the top: Hierarchies and reverse hierarchies in the visual system. Neuron, 36, 791–804.
Hock, H. S., Romanski, L., Galie, A., & Williams, C. S. (1978). Real-world schemata and scene recognition in adults and children. Memory and Cognition, 6, 423–431.
Hopfinger, J. B., Buonocore, M. H., & Mangun, G. R. (2000). The neural mechanisms of top-down attentional control. Nature Neuroscience, 3 (3), 284–291.
Hutchison, K. A. (2003). Is semantic priming due to association strength or feature overlap? A microanalytic review. Psychonomic Bulletin & Review, 10 (4), 785–813.

Kaplan, E. (2004). The M, P, and K pathways of the primate visual system. In L. M. Chalupa & J. S. Werner (Eds.), The visual neuroscience (pp. 481–494). Cambridge, MA: MIT Press.
Kaplan, E., & Shapley, R. M. (1986). The primate retina contains two types of ganglion cells, with high and low contrast sensitivity. Proceedings of the National Academy of Sciences U S A, 83 (8), 2755–2757.
Kastner, S., & Ungerleider, L. G. (2000). Mechanisms of visual attention in the human cortex. Annual Review of Neuroscience, 23, 315–341.
Kawato, M., Hayakawa, H., & Inui, T. (1993). A forward-inverse optics model of reciprocal connections between visual cortical areas. Network, 4, 415–422.
Koffka, K. (1935). The principles of gestalt psychology. New York: Harcourt, Brace, & World.
Kringelbach, M. L., & Rolls, E. T. (2004). The functional neuroanatomy of the human orbitofrontal cortex: Evidence from neuroimaging and neuropsychology. Progress in Neurobiology, 72 (5), 341–372.
Kveraga, K., Boshyan, J., & Bar, M. (2007a). Magnocellular projections as the trigger of top-down facilitation in recognition. Journal of Neuroscience, 27 (48), 13232–13240.
Kveraga, K., Ghuman, A. S., & Bar, M. (2007b). Top-down predictions in the cognitive brain. Brain and Cognition, 65, 145–168.
Lamme, V. A. F. (2006). Towards a true neural stance on consciousness. Trends in Cognitive Sciences, 10 (11), 494–501.
Lamme, V. A. F., & Roelfsema, P. R. (2000). The distinct modes of vision offered by feedforward and recurrent processing. Trends in Neurosciences, 23, 571–579.
Lamme, V. A. F., Super, H., & Spekreijse, H. (1998). Feedforward, horizontal, and feedback processing in the visual cortex. Current Opinion in Neurobiology, 8, 529–535.


Li, W., Piech, V., & Gilbert, C. D. (2006). Contour saliency in primary visual cortex. Neuron, 50, 951–962.
MacKay, D. (1956). Towards an information-flow model of human behaviour. British Journal of Psychiatry, 43, 30–43.
Maunsell, J. H. R., & Van Essen, D. C. (1983). Functional properties of neurons in the middle temporal visual area of the macaque monkey. II. Binocular interactions and the sensitivity to binocular disparity. Journal of Neurophysiology, 49, 1148–1167.
Merigan, W. H., & Maunsell, J. H. (1993). How parallel are the primate visual pathways? Annual Review of Neuroscience, 16, 369–402.
Meunier, M., Bachevalier, J., & Mishkin, M. (1997). Effects of orbital frontal and anterior cingulate lesions on object and spatial memory in rhesus monkeys. Neuropsychologia, 35, 999–1015.
Miall, R. C., & Wolpert, D. M. (1996). Forward models for physiological motor control. Neural Networks, 9 (8), 1265–1279.
Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24, 167–202.
Minsky, M. (1975). A framework for representing knowledge. In P. H. Winston (Ed.), The psychology of computer vision (pp. 163–189). New York: McGraw-Hill.
Mumford, D. (1992). On the computational architecture of the neocortex. I. The role of cortico-cortical loops. Biological Cybernetics, 66 (3), 241–251.
Munk, M. H., Nowak, L. G., Nelson, J. I., & Bullier, J. (1995). Structural basis of cortical synchronization. II. Effects of cortical lesions. Journal of Neurophysiology, 74 (6), 2401–2414.
Näätänen, R., Paavilainen, P., Rinne, T., & Alho, K. (2007). The mismatch negativity (MMN) in basic research of central auditory processing: A review. Clinical Neurophysiology, 118, 2544–2590.
Oliva, A. (2005). Gist of the scene. In L. Itti, G. Rees, & J. K. Tsotsos (Eds.), Encyclopedia of neurobiology of attention (pp. 251–256). San Diego, CA: Elsevier.
Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42, 145–175.
Oliva, A., & Torralba, A. (2007). The role of context in object recognition. Trends in Cognitive Sciences, 11 (12), 520–527.
Pessoa, L., Kastner, S., & Ungerleider, L. G. (2003). Neuroimaging studies of attention: From modulation of sensory processing to top-down control. Journal of Neuroscience, 23 (10), 3990–3998.

Posner, M. I., & Dehaene, S. (1994). Attentional networks. Trends in Neuroscience, 17, 75–79.
Rao, R. P., & Ballard, D. H. (1999). Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2, 79–87.
Reynolds, J. H., & Chelazzi, L. (2004). Attentional modulation of visual processing. Annual Review of Neuroscience, 27, 611–647.
Rolls, E. T., Browning, A. S., Inoue, K., & Hernadi, I. (2005). Novel visual stimuli activate a population of neurons in the primate orbitofrontal cortex. Neurobiology of Learning and Memory, 84 (2), 111–123.
Salin, P. A., & Bullier, J. (1995). Corticocortical connections in the visual system: Structure and function. Physiological Reviews, 75 (1), 107–154.
Schacter, D. L., Dobbins, I. G., & Schnyer, D. M. (2004). Specificity of priming: A cognitive neuroscience perspective. Nature Reviews Neuroscience, 5 (11), 853–862.
Schank, R. C. (1975). Conceptual information processing. New York: Elsevier Science Ltd.
Schiller, P. H., & Malpeli, J. O. (1978). Functional specificity of lateral geniculate nucleus laminae of the rhesus monkey. Journal of Neurophysiology, 41, 788–797.
Schyns, P. G., & Oliva, A. (1994). From blobs to boundary edges: Evidence for time- and spatial-dependent scene recognition. Psychological Science, 5 (4), 195–200.
Scholl, B. J. (2001). Objects and attention: The state of the art. Cognition, 80 (1–2), 1–46.
Schubotz, R. I. (2007). Prediction of external events with our motor system: Towards a new framework. Trends in Cognitive Sciences, 11, 211–218.
Schubotz, R. I., & von Cramon, D. Y. (2002). Dynamic patterns make the premotor cortex interested in objects: Influence of stimulus and task revealed by fMRI. Brain Research: Cognitive Brain Research, 14, 357–369.
Schubotz, R. I., & von Cramon, D. Y. (2001). Functional organization of the lateral premotor cortex: fMRI reveals different regions activated by anticipation of object properties, location and speed. Brain Research: Cognitive Brain Research, 11 (1), 97–112.
Schultz, W., & Dickinson, A. (2000). Neuronal coding of prediction errors. Annual Review of Neuroscience, 23, 473–500.

Shepard, R. (1990). Mind sights. New York: W. H. Freeman.
Sherman, S. M., & Guillery, R. W. (2004). The visual relays in the thalamus. In L. M. Chalupa & J. S. Werner (Eds.), The visual neuroscience (pp. 565–592). Cambridge, MA: MIT Press.

Shipp, S. (2005). The importance of being agranular: A comparative account of visual and motor cortex. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 360, 797–814.
Shipp, S., & Zeki, S. (1989). The organization of connections between areas V5 and V2 in macaque monkey visual cortex. European Journal of Neuroscience, 1 (4), 333–354.
Simmons, A., Matthews, S. C., Stein, M. B., & Paulus, M. P. (2004). Anticipation of emotionally aversive visual stimuli activates right insula. Neuroreport, 15, 2261–2265.
Spence, D. P., & Owens, K. C. (1990). Lexical co-occurrence and association strength. Journal of Psycholinguistic Research, 19, 317–330.
Sperry, R. (1950). Neural basis of the spontaneous optokinetic response produced by visual inversion. Journal of Comparative and Physiological Psychology, 43, 482–489.
Summerfield, C., & Egner, T. (2009). Expectation (and attention) in visual cognition. Trends in Cognitive Sciences, 13 (9), 403–408.
Thorpe, S., & Imbert, M. (1989). Biological constraints on connectionist modeling. In R. Pfeifer, Z. Schreter, F. Fogelman-Soulié, & L. Steels (Eds.), Connectionism in perspective (pp. 63–93). Amsterdam: Elsevier.
Thorpe, S. J., Rolls, E. T., & Maddison, S. (1983). Neuronal activity in the orbitofrontal cortex of the behaving monkey. Experimental Brain Research, 49, 93–115.
Todorović, D. (2007). W. Metzger: Laws of seeing. Gestalt Theory, 28, 176–180.
Torralba, A., & Oliva, A. (2003). Statistics of natural image categories. Network, 14 (3), 391–412.
Torralba, A., Oliva, A., Castelhano, M. S., & Henderson, J. M. (2006). Contextual guidance of attention in natural scenes: The role of global features on object search. Psychological Review, 113, 766–786.
Tulving, E., & Schacter, D. L. (1990). Priming and human memory systems. Science, 247, 301–306.
Ullman, S. (1995). Sequence seeking and counter streams: A computational model for bidirectional information flow in the visual cortex. Cerebral Cortex, 1, 1–11.
Ungerleider, L. G., & Desimone, R. (1986). Cortical connections of visual area MT in the macaque. Journal of Comparative Neurology, 248 (2), 190–222.
von Stein, A., Chiang, C., König, P., & Lorincz, A. (2000). Top-down processing mediated by interareal synchronization. Proceedings of the National Academy of Sciences U S A, 97 (26), 14748–14753.


von Stein, A., & Sarnthein, J. (2000). Different frequencies for different scales of cortical integration: From local gamma to long range alpha/theta synchronization. International Journal of Psychophysiology, 38 (3), 301–313.
Watanabe, T., Harner, A. M., Miyauchi, S., Sasaki, Y., Nielsen, M., Palomo, D., & Mukai, I. (1998). Task-dependent influences of attention on the activation of human primary visual cortex. Proceedings of the National Academy of Sciences U S A, 95, 11489–11492.
Widmann, A., Gruber, T., Kujala, T., Tervaniemi, M., & Schröger, E. (2007). Binding symbols and sounds: Evidence from event-related oscillatory gamma-band activity. Cerebral Cortex, 17, 2696–2702.
Wiggs, C. L., & Martin, A. (1998). Properties and mechanisms of visual priming. Current Opinion in Neurobiology, 8, 227–233.
Winkler, I., & Czigler, I. (1998). Mismatch negativity: Deviance detection or the maintenance of the “standard.” Neuroreport, 9, 3809–3813.
Wolpert, D. M., & Flanagan, J. R. (2001). Motor prediction. Current Biology, 11 (18), R729–R732.
Wolpert, D. M., & Kawato, M. (1998). Multiple paired forward and inverse models for motor control. Neural Networks, 11 (7–8), 1317–1329.
Zeki, S., & Bartels, A. (1999). Toward a theory of visual consciousness. Consciousness and Cognition, 8, 225–259.

Moshe Bar

Moshe Bar is a neuroscientist, director of the Gonda Multidisciplinary Brain Research Center at Bar-Ilan University, associate professor in psychiatry and radiology at Harvard Medical School, and associate professor in psychiatry and neuroscience at Massachusetts General Hospital. He directs the Cognitive Neuroscience Laboratory at the Athinoula A. Martinos Center for Biomedical Imaging.

Andreja Bubic

Andreja Bubic, Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, Charlestown, MA


Neural Underpinning of Object Mental Imagery, Spatial Imagery, and Motor Imagery

Grégoire Borst
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0005

Abstract and Keywords

Mental imagery is one of the cognitive functions that has received a lot of attention in the past 40 years from both philosophers and cognitive psychologists. Recently, researchers started to use neuroimaging techniques to tackle fundamental properties of mental images, such as their depictive nature, which was fiercely debated for almost 30 years. Results from neuroimaging, brain-damaged patient, and transcranial magnetic stimulation studies converged in showing that visual, spatial, and motor mental imagery rely on the same basic brain mechanisms used, respectively, in visual perception, spatial cognition, and motor control. Thus, neuroimaging and lesion studies have proved critical in addressing the imagery debate between depictive and propositionalist theorists. Partly because of the controversy that surrounded the nature of mental images, the neural bases of mental imagery are probably more closely defined than those of any other higher cognitive function.

Keywords: visual mental imagery, spatial mental imagery, motor mental imagery, visual perception, neuroimaging, brain-damaged patients, transcranial magnetic stimulation

When we think of the best way to load luggage in the trunk of a car, of the fastest route from point A to point B, or of the easiest way to assemble bookshelves, we generally rely on our ability to simulate those events by visualizing them instead of actually performing them. When we do so, we experience “seeing with the mind’s eye,” which is the hallmark of a specific type of representation processed by our brain, namely, visual mental images. According to Kosslyn, Thompson, and Ganis (2006), mental images are representations that are similar to those created in the initial phase of perception but that do not require external stimulation to be created. In addition, those representations preserve the perceptible properties of the stimuli they represent.


Early Studies of Mental Imagery and the Imagery Debate

Visual mental imagery has a very specific place in the study of human mental activity. In fact, dating back to early theories of mental activity, Greek philosophers such as Plato proposed that memory might be analogous to a wax tablet into which our perceptions and thoughts stamp images of themselves, as a signet ring stamps impressions in wax. According to this view, seeing with the mind’s eye is considered a phenomenon closely related to perceptual activities. Thus, the idea of an analogy between mental imagery and perception is not new. However, because of the inherently private and introspective nature of mental imagery, garnering objective empirical evidence of the nature of these representations has been a great challenge for psychology researchers. The introspective nature of imagery led behaviorists (who championed the idea that psychology should focus on observable stimuli and the responses to these stimuli) such as Watson (1913) to deny the existence of mental images by asserting that thinking was solely constituted by subtle movements of the vocal apparatus. Behaviorism has had a long-lasting negative impact on the legitimacy of studying mental imagery. In fact, neither the cognitive revolution of the 1950s (during which the human mind started to be conceived of as like computer software) nor the first results of Paivio (1971) showing that mental imagery improves the memorization of words were sufficient to legitimize the study of mental imagery.

The revival of mental imagery was driven not only by empirical evidence that mental imagery was a key part of memory, problem solving, and creativity but also by the type of questions and methodologies researchers used. Researchers shifted away from phenomenological questions and introspective methods and started to focus on refining the understanding of the nature of the representations involved in mental imagery and of the cognitive processes that interpret those representations. The innovative idea was to use chronometric data as a “mental tape measure” of the cognitive processes that interpret mental images, in order to characterize the properties of the underlying representations and processes. One of the most influential works that helped mental imagery regain its respectability was that of Shepard and Metzler (1971). In their mental rotation paradigm, participants viewed two three-dimensional (3D) objects with several arms, each consisting of small cubes, and decided whether the two objects had the same shape, regardless of differences in the orientations of the objects. The key finding was that response times increased linearly with increasing angular disparity between the two objects. The results demonstrated for the first time that people mentally rotated one of the objects into congruence with the orientation of the other object. Other paradigms, such as the image scanning paradigm (e.g., Finke & Pinker, 1982; Kosslyn, Ball, & Reiser, 1978), allowed researchers to characterize not only the properties of the cognitive processes at play in visual mental imagery but also the nature of visual mental images. Critically, the data of these experiments suggested that visual mental images are depictive representations. By depictive, researchers mean that (1) each part of the representation corresponds to a part of the represented object, such that (2) the distances between representations of the parts (in a representational space) correspond to the distances between the parts on the object itself (see Kosslyn et al., 2006).

However, not all researchers interpreted behavioral results in mental imagery studies as evidence that mental images are depictive. For example, Pylyshyn (1973, 2002, 2003a, 2003b, 2007) proposed a propositional account of mental imagery. According to this view, results obtained in mental imagery experiments can be best explained by the fact that participants rely not on visuo-spatial mental images, but instead on descriptive representations (the sort of representations that underlie language). Pylyshyn (1981) championed the idea that the conscious experience of visualizing an object is purely epiphenomenal, as is the power light on an electronic device: the light does not play a functional role in the way the electronic device works. Thus, it became evident that behavioral data would not be sufficient to resolve the mental imagery debate between propositional and depictive researchers. In fact, Anderson (1978) demonstrated that any behavioral data collected in a visual mental imagery experiment could be explained equally well by inferring that depictive representations were processed or that a set of propositions was processed.

As cognitive neuroscience started to elucidate the neural underpinning of a number of higher cognitive functions and of visual perception, it became evident that neuroimaging data could resolve the imagery debate initiated in the 1970s. The rationale of using neuroimaging to characterize the nature of visual mental images followed directly on the heels of the functional and structural equivalence documented in behavioral studies between visual mental imagery and visual perception (see Kosslyn, 1980).
Researchers reasoned that if visual mental imagery relies on representations and cognitive processes similar to those involved during visual perception, then visual mental imagery should rely on the same brain areas that support visual perception (Kosslyn, 1994).

In this chapter we report results collected in positron emission tomography (PET), functional magnetic resonance imaging (fMRI), transcranial magnetic stimulation (TMS), and brain lesion studies, which converged in showing that visual mental imagery relies on the same brain areas as those engaged when one perceives the world or initiates an action. The different methods serve different purposes. For example, fMRI allows researchers to monitor the whole brain at work with good spatial resolution by contrasting the mean blood oxygen level–dependent (BOLD) signal in a baseline condition with the BOLD signal in an experimental condition. However, fMRI provides information on the correlations between the brain areas activated and the tasks performed but not on the causal relations between the two. In contrast, brain-damaged patient and TMS studies can provide such causal relations. In fact, if performance in a particular task is selectively impaired following a virtual lesion (TMS) or an actual brain lesion, the affected brain area plays a causal role in the cognitive processes and representations engaged in that particular task. However, researchers need to rely on previous fMRI or PET studies to decide what specific brain areas to target with TMS or which patients to include in their study. Thus, a comprehensive view of the neural underpinning of any cognitive function requires taking into account data from all of these approaches.

In this chapter, we first discuss and review the results of studies that document an overlap of the neural circuitry in the early visual cortex between visual mental imagery and visual perception. Then, we present studies that specifically look at the neural underpinning of shape-based mental imagery and spatial mental imagery. Finally, we report studies on the neural bases of motor imagery and how they overlap with those recruited when one initiates an action.
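The chronometric logic behind the mental rotation finding of Shepard and Metzler (1971), discussed earlier in this section, can be made concrete with a minimal sketch. If response time grows linearly with angular disparity, a straight-line fit yields a slope (seconds per degree) whose reciprocal can be read as a rate of mental rotation. The function name `fit_line` and all numbers below are hypothetical values chosen for illustration; they are not data from any study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical values for illustration only (not data from any study).
angles_deg = [0, 40, 80, 120, 160]   # angular disparity between the two objects
rts_s = [1.0, 1.8, 2.6, 3.4, 4.2]    # mean response times in seconds

intercept, slope = fit_line(angles_deg, rts_s)
rotation_rate = 1.0 / slope  # degrees per second implied by the slope
```

Under this reading, the intercept absorbs the time needed for everything other than rotation (encoding, comparison, response), whereas the slope isolates the rotation process itself.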

Visual Mental Imagery and the Early Visual Areas

The early visual cortex comprises Brodmann areas 17 and 18, which receive input from the retina. These visual areas are retinotopically organized: Two objects located close to each other in a visual scene activate neurons in areas 17 and 18 relatively close to each other (Sereno et al., 1995). Thus, visual space is represented topographically in the visual cortex using two dimensions: eccentricity and polar angle. “Eccentricity” is the distance from the fovea (i.e., high-resolution central region of the visual field) of a point projected on the retina. Crucially, the farther away a point is located from the fovea, the more anterior the activation observed in the early visual cortex.

Building on the way eccentricity is represented in the cortex, Kosslyn and collaborators (1993) used PET to study whether area 17 was recruited during visual mental imagery of letters. In their task, participants visualized letters, maintained the mental images of the letters for 4 seconds, and then were asked to make a judgment about a visual property of the letters, such as whether the letters possessed a straight line. Blood flow was monitored through PET. The hypothesis was that if visual mental images were depictive and recruited topographical areas of the visual cortex, then when participants were asked to visualize letters as small as possible (while remaining visible), the activation of area 17 should be more anterior than when participants visualized letters as large as possible (while being entirely visible). The results were consistent with their hypothesis: Large visualized letters activated posterior regions of area 17, whereas small visualized letters recruited anterior regions of area 17. Kosslyn, Thompson, Kim, and Alpert (1995) replicated the results in a PET study in which participants visualized line drawings of objects previously memorized in boxes of three different sizes. These two studies used a neuroimaging technique with limited spatial resolution, which led some to raise doubts about these results.

However, similar findings were reported when fMRI was used, a technique that provides better spatial resolution of the brain areas activated. For example, Klein, Paradis, Poline, Kosslyn, and Le Bihan (2000), in an event-related fMRI study, documented an activation of area 17 that started 2 seconds after the auditory cue prompting participants to form a mental image, peaked around 4 to 6 seconds, and dropped off after 8 seconds or so. In a follow-up experiment, Klein et al. (2004) demonstrated that the orientation with which a bowtie-shaped stimulus was visualized modulated the activation of the visual cortex. The activation elicited by visualizing the bowtie vertically or horizontally matched the cortical representations of the vertical and horizontal meridians. Moreover, in an fMRI study, Slotnick, Thompson, and Kosslyn (2005) found that the retinotopic maps produced by the visual presentation of rotating and flickering checkerboard wedges were similar to the ones produced when rotating and flickering checkerboard wedges were visualized. To some extent, the imagery maps were more similar to the perceptual maps than were the maps produced in an attention-based condition. Finally, Thirion and colleagues (2006) adopted an “inverse retinotopy” approach to infer the content of visual images based on the brain activations observed. Participants were asked in a perceptual condition to fixate rotating Gabor patches and in the mental imagery condition to visualize one of the six Gabor patches rotating right or left of a fixation point. The authors were able to predict accurately the stimuli participants had seen and, to a certain degree, the stimuli participants had visualized. Crucially, most of the voxels leading to a correct prediction of the stimuli visualized or presented visually were located in areas 17 and 18.
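The two retinotopic dimensions described in this section, eccentricity and polar angle, are simply the polar coordinates of a point in the visual field. The sketch below converts a position given in degrees from fixation into those two values. It also includes a log-scaling function as one possible way to express the idea that larger eccentricities map to more anterior cortical positions; that function, its name `cortical_distance_mm`, and its parameter values `k_mm` and `a_deg` are assumptions introduced here for illustration and are not taken from the studies reviewed above.

```python
import math


def visual_field_to_polar(x_deg, y_deg):
    """Convert a visual-field position (degrees from fixation) into
    eccentricity (degrees) and polar angle (degrees, counterclockwise
    from the right horizontal meridian, in the range [0, 360))."""
    eccentricity = math.hypot(x_deg, y_deg)
    polar_angle = math.degrees(math.atan2(y_deg, x_deg)) % 360.0
    return eccentricity, polar_angle


def cortical_distance_mm(eccentricity_deg, k_mm=17.3, a_deg=0.75):
    """Illustrative assumption: distance along the cortex from the foveal
    representation grows with the logarithm of eccentricity. The parameter
    values are placeholders, not estimates from this chapter."""
    return k_mm * math.log(1.0 + eccentricity_deg / a_deg)


ecc, angle = visual_field_to_polar(3.0, 4.0)   # a point up and to the right
small_letter = cortical_distance_mm(1.0)       # stimulus edge near fixation
large_letter = cortical_distance_mm(10.0)      # stimulus edge far from fixation
```

Whatever the exact scaling, the only property the argument in the text relies on is monotonicity: a point farther from the fovea is represented farther from the foveal representation, which is the ordering of activations that the imagery studies above tested.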

Figure 5.1 Set of stimuli (left) and mean response times for each participant (numbered 1 to 5) in the two experimental conditions (perception vs. mental imagery) as a function of the repetitive transcranial magnetic stimulation (rTMS) condition (real vs. sham).

Taken together, results from fMRI and PET studies converge in showing that visual mental imagery activates the early visual areas and that the spatial structure of the activations elicited by the mental imagery task is accounted for by standard retinotopic mapping. Nevertheless, the question remained as to whether activation of the early visual areas plays any functional role in visual imagery. In order to address this question, Kosslyn et al. (1999) designed a task in which participants first memorized four patterns of black and white stripes (which varied in length, width, orientation, and spacing of the stripes; Figure 5.1) in four quadrants, and then were asked to visualize two of the patterns and to compare them on a given visual attribute (such as the orientation of the stripes). The same participants performed the task in a perceptual condition in which their judgments were based on patterns of stripes displayed on the screen. In both conditions, before participants compared two patterns of stripes, repetitive TMS (rTMS) was delivered to the early visual cortex, which had been shown to be activated using PET. In rTMS studies, a coil is used to deliver low-frequency magnetic pulses, which decrease cortical excitability for several minutes in the cortical area targeted (see Siebner et al., 2000). This technique has the advantage that the disruption is reversible and lasts for only a few minutes. In addition, because the disruption is transient, there are no compensatory phenomena as with real lesions. When stimulation was delivered to the posterior occipital lobe (real rTMS condition), participants required more time to compare two patterns of stripes than when stimulation was delivered away from the brain (sham rTMS condition). The effect of real rTMS (as denoted by the difference between the sham and real stimulations; see Figure 5.1) was similar in visual mental imagery and visual perception, which makes sense if area 17 is critical for both.

Sparing et al. (2002) used another TMS approach to determine whether visual mental imagery modulates cortical excitability. The rationale of their approach was to use the phosphene threshold (PT; i.e., the minimum TMS intensity that evokes phosphenes) to determine the cortical excitability of the primary visual areas of the brain. Single-pulse TMS was delivered to the primary visual cortex to produce phosphenes in the right lower quadrant of the visual field. Concurrently, participants performed either a visual mental imagery task or an auditory control task. For each participant, the PT was determined by increasing TMS intensity on each trial until participants reported experiencing phosphenes. Visual mental imagery decreased the PT compared with the baseline condition, whereas the auditory task had no effect on the PT. The results indicate that visual mental imagery enhances excitability in the visual cortex, which supports the functional role of the primary visual cortex in visual mental imagery.
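The threshold procedure just described, in which intensity is raised trial by trial until a phosphene is first reported, amounts to a simple ascending search. The sketch below illustrates that logic only. The starting intensity, step size, unit (percent of maximum stimulator output), and the simulated observers are all hypothetical choices made for illustration, not parameters reported by Sparing et al. (2002).

```python
def phosphene_threshold(reports_phosphene, start=30, step=2, max_intensity=100):
    """Ascending search: raise the intensity on each trial and return the
    first intensity at which a phosphene is reported, or None if no
    phosphene is reported up to max_intensity.

    reports_phosphene: callable taking an intensity and returning True
    when the participant reports a phosphene on that trial.
    """
    intensity = start
    while intensity <= max_intensity:
        if reports_phosphene(intensity):
            return intensity
        intensity += step
    return None


# Hypothetical simulated observers (illustration only): the visual cortex is
# assumed to be more excitable during imagery, so less intensity is needed.
def baseline_observer(intensity):
    return intensity >= 60


def imagery_observer(intensity):
    return intensity >= 54


pt_baseline = phosphene_threshold(baseline_observer)
pt_imagery = phosphene_threshold(imagery_observer)
```

In this toy case the threshold is lower during imagery than at baseline, which is the direction of the effect reported in the text: a lower PT means that less stimulation is needed to evoke a phosphene, that is, higher cortical excitability.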
Consistent with the role of area 17 in visual mental imagery, the apparent horizontal size of visual mental images of a patient who had the occipital lobe surgically removed in one hemisphere was half the apparent size of mental images in normal participants (Farah, Soso, & Dasheiff, 1992).

However, not all studies reported a functional role of area 17 in visual mental imagery. In fact, neuropsychological studies offered compelling evidence that cortically blind patients could have spared visual mental imagery abilities (see Bartolomeo, 2002, 2008). Anton (1899) and Goldenberg, Müllbacher, and Nowak (1995) reported cortically blind patients who seemed to be able to form visual mental images. In addition, Chatterjee and Southwood (1995) reported two patients with cortical blindness resulting from medial occipital lesions who showed no impairment of their capacity to imagine object forms, such as capital letters or common animals. These two patients could also draw a set of common objects from memory.

Finally, Kosslyn and Thompson (2003) reviewed more than 50 neuroimaging studies (fMRI, PET, and single-photon emission computed tomography, or SPECT) and found that in nearly half, there was no activation of the early visual cortex. A meta-analysis of the neuroimaging literature on visual mental imagery revealed that three factors accounted for the probability of activation in area 17. Sensitivity of the technique is one of the factors: 19 of 27 fMRI studies reported activation in area 17, compared with only 2 of 9 SPECT studies. The degree of detail of the visual mental images that need to be generated is also important, with high-resolution mental images more likely to elicit activation in area 17. Finally, the type of judgment accounts for the probability of activation in area 17: If a spatial judgment is required, activation in area 17 is less likely. Thus, activation in area 17 most likely reflects the computation needed to generate the visual images, at least for certain types of mental images, such as high-resolution, shape-based mental images.

Visual Mental Imagery and Higher Visual Areas

The overlap of the brain areas engaged by visual perception and mental imagery was studied not only in early visual areas but also in higher visual areas. The visual system is organized hierarchically, with early visual cortical areas (areas 17 and 18) located at the lowest level (see Felleman & Van Essen, 1991). Brain lesion and neuroimaging studies document that the visual system is then organized in two parallel streams with different functions (e.g., Goodale & Milner, 1992; Haxby et al., 1991; Ungerleider & Mishkin, 1982). The ventral stream (running from the occipital lobes down to the inferior temporal lobes) is specialized in processing object properties of percepts (such as shape, color, and texture), whereas the dorsal stream (running from the occipital lobes up to the posterior parietal lobes) is specialized in processing spatial properties of percepts (such as orientation and location) and action (but see Borst, Thompson, & Kosslyn, 2011, for a discussion). A critical finding is that parallel deficits occur in visual mental imagery (e.g., Levine, Warach, & Farah, 1985): Damage to the ventral stream disrupts the ability to visualize the shape of objects (such as the shape of a stop sign), whereas damage to the dorsal stream disrupts the ability to create a spatial mental image (such as the locations of landmarks on a map).

In the next section, we review neuroimaging and brain-damaged patient studies showing that shape-based mental imagery (including mental images of faces) and visual perception engage most of the same higher visual areas in the ventral stream and that spatial mental imagery and spatial vision recruit most of the same areas in the dorsal stream.

Ventral Stream, Shape-Based Mental Imagery and Color Imagery

Brain imaging and neuropsychological data document a spatial segregation of visual object representations in the higher visual areas. For example, Kanwisher and Yovel (2006) demonstrated that the lateral fusiform gyrus responds more strongly to pictures of faces than to other categories of objects, whereas the medial fusiform gyrus and the parahippocampal gyri respond selectively to pictures of buildings (Downing, Chan, Peelen, Dodds, & Kanwisher, 2006).

To demonstrate the similarity between the cognitive processes and representations in vision and visual mental imagery, researchers investigated whether the spatial segregation of visual objects in the ventral stream can be found during shape-based mental imagery. Following this logic, O’Craven and Kanwisher (2000) asked a group of participants either to recognize pictures of familiar faces and buildings or to visualize those pictures in an fMRI study. In the perceptual condition, a direct comparison of activation elicited by the two types of stimuli (buildings and faces) revealed a clear segregation within the ventrotemporal cortex, with activation found in the fusiform face area (FFA) for faces and in the parahippocampal place area (PPA) for buildings. In the visual mental imagery condition, (p. 79) the same pattern was observed but with weaker activation and smaller patches of cortex activated. Crucially, there was no hint of activation of the FFA when participants visualized buildings, nor of the PPA when they visualized faces. The similarity between vision and mental imagery in the higher visual areas was further demonstrated by the fact that more than 84 percent of the voxels activated in the mental imagery condition were activated in the perceptual condition.

These results were replicated in another fMRI study (Ishai, Ungerleider, & Haxby, 2000). In this study, participants were asked either to view passively pictures of three object categories (i.e., faces, houses, and chairs), to view scrambled versions of these pictures (perceptual control condition), to visualize the pictures while looking at a gray background, or to stare passively at the gray background (mental imagery control condition). When the activation elicited by the three object categories was compared in the perceptual condition (after removing the activation in the respective control condition), different regions in the ventral stream showed differential responses to faces (FFA), houses (PPA), and chairs (inferior temporal gyrus). Fifteen percent of the voxels in these three ventral stream regions showed a similar pattern of activation in the mental imagery condition. Mental images of the three categories of objects elicited additional activation in the parietal and the frontal regions that was not found in the perceptual condition.
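The voxel-overlap figures reported in these studies are directional proportions: of the voxels active during imagery, what share was also active during perception. A minimal Python sketch of that computation follows. The function name `imagery_overlap` and the toy voxel coordinates are hypothetical and are not taken from O’Craven and Kanwisher (2000) or Ishai et al. (2000), whose analyses relied on thresholded statistical maps rather than hand-built sets.

```python
def imagery_overlap(imagery_voxels, perception_voxels):
    """Return the fraction of imagery-active voxels that are also perception-active.

    Both arguments are iterables of hashable voxel identifiers,
    for example (x, y, z) index tuples. Names and data are illustrative.
    """
    imagery = set(imagery_voxels)
    perception = set(perception_voxels)
    if not imagery:
        raise ValueError("no voxels are active in the imagery condition")
    return len(imagery & perception) / len(imagery)


# Toy data (assumed): perception activates a 10 x 10 patch of 100 voxels.
perception = {(x, y, 0) for x in range(10) for y in range(10)}

# Imagery activates 25 voxels: 21 inside the perception patch, 4 outside it.
imagery = {(x, y, 0) for x in range(3) for y in range(7)}
imagery |= {(50, i, 0) for i in range(4)}

print(imagery_overlap(imagery, perception))   # 0.84
print(imagery_overlap(perception, imagery))   # 0.21
```

The two printed values differ because the measure is asymmetric: a small imagery map can lie almost entirely inside a large perceptual map while covering only a minor part of it, which is consistent with the weaker and smaller imagery activations described above.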
In a follow-up study, Ishai, Haxby, and Ungerleider (2002) studied the activation elicited by famous faces either presented visually or visualized. In the mental imagery condition, participants studied pictures of half of the famous faces before the experiment. For the other half of the trials, participants had to rely on their long-term memory to generate the mental images of the faces. In the mental imagery and perceptual conditions, the FFA (lateral fusiform gyrus) was activated, and 25 percent of the voxels activated in the mental imagery condition were within regions recruited during the perceptual condition. The authors found that activation within the FFA was stronger for faces studied before the experiment than for faces generated on the basis of information stored in long-term memory. In addition, given that visual attention did not modulate the activity recorded in higher visual areas, Ishai and colleagues argued that attention and mental imagery are dissociated to some degree.

Finally, although mental imagery and perception recruit the same category-selective areas in the ventral stream, these areas are activated predominantly through bottom-up inputs during perception and through top-down inputs during mental imagery. In fact, a new analysis of the data reported by Ishai et al. (2000) revealed that the functional connectivity of ventral stream areas was stronger with the early visual areas in visual perception, whereas during visual mental imagery, stronger functional connections were found between the higher visual areas and the frontal and parietal areas (Mechelli, Price, Friston, & Ishai, 2004).

A recent fMRI study further supported the similarity of the brain areas recruited in the ventral stream during visual mental imagery and visual perception (Stokes, Thompson, Cusack, & Duncan, 2009). In this study, participants were asked to visualize an “X” or an “O” based on an auditory cue, or to view passively the two capital letters displayed on a computer screen. During both conditions (i.e., visual mental imagery and visual perception), the visual cortex was significantly activated. Above-baseline activation was recorded in the calcarine sulcus, cuneus, and lingual gyrus, and it extended to the fusiform and middle temporal gyri. In addition, in both conditions, a multivoxel pattern analysis restricted to the anterior and posterior regions of the lateral occipital cortex (LOC) revealed that different populations of neurons code for the two types of stimuli (“X” and “O”). Critically, a perceptual classifier trained on patterns of activation elicited by the perceptual presentation of the stimuli was able to predict the type of visual images generated in the mental imagery condition. The data indicate that mental imagery and visual perception activate shared content-specific representations in high-level visual areas, including in the LOC.

Brain lesion studies generally present converging evidence that mental imagery and perception rely on the same cortical areas in the ventral stream (see Ganis, Thompson, Mast, & Kosslyn, 2003). The logic underlying the brain lesion studies is that if visual mental imagery and perception engage the same visual areas, then the same pattern of impairment should be observed in the two functions. Moreover, given that different visual mental images (i.e., houses vs.
faces) selectively elicit activation in different areas of the ventral stream, an impairment in one domain of mental imagery (color or shape) should yield a parallel deficit in that specific domain in visual perception. In fact, patients with impairment in face recognition (i.e., prosopagnosia) are impaired in their ability to visualize faces (Shuttleworth, Syring, & Allen, 1982; (p. 80) Young, Humphreys, Riddoch, Hellawell, & De Haan, 1994). In a single case study, a selective deficit in identifying animals was accompanied by a similar deficit in describing animals or drawing them from memory (Sartori & Job, 1988). In addition, as revealed by an early review of the literature (Farah, 1984), object agnosia was generally associated with a deficit in the ability to visualize objects. Even finer parallel deficits can be observed in the ventral stream. For example, Farah, Hammond, Mehta, and Ratcliff (1989) reported the case of a prosopagnosic patient with a specific deficit in his ability to visualize living things (such as animals or faces) but not in his ability to visualize nonliving things. In addition, some brain-damaged patients cannot distinguish colors perceptually and present similar deficits in color imagery (e.g., Rizzo, Smith, Pokorny, & Damasio, 1993). Critically, patients with color perception deficits have good general mental imagery abilities but are specifically impaired in color mental imagery tasks (e.g., Riddoch & Humphreys, 1987).

However, not all neuropsychological findings report parallel deficits in mental imagery and perception. Cases of patients were reported who had altered perception but preserved imagery (e.g., Bartolomeo et al., 1998; Behrmann, Moscovitch, & Winocur, 1994; Servos & Goodale, 1995). For example, Behrmann et al. (1994) reported the case of C.K., a brain-damaged patient with a left homonymous hemianopia and a possible thinning of the occipital cortex (as revealed by a PET and MRI scan) who was severely impaired at recognizing objects but who had no apparent deficit in shape-based mental imagery. In fact, C.K. could draw objects with considerable detail from memory and could use information derived from visual images in a variety of tasks. Yet he could not identify objects presented visually, even those he drew from memory. A similar dissociation between perceptual impairments and relatively spared ability in mental imagery was observed in Madame D. (Bartolomeo et al., 1998). Following bilateral brain lesions to the extrastriate visual areas (i.e., Brodmann areas 18, 19 bilaterally and 37 in the right hemisphere), Madame D. developed severe alexia, agnosia, prosopagnosia, and achromatopsia. Her ability to recognize objects presented visually was severely impaired except for very simple shapes like geometric figures. In contrast, she could draw objects from memory, but she could not identify them. She performed well on an object mental imagery test. Her impairment was not restricted to shape processing. In fact, she could not discriminate between colors, match colors, or point to the correct color. In sharp contrast, she presented no deficit in color imagery, being able, for example, to determine which of two objects had a darker hue when presented with a pair of object names.

In some instances, studies reported the reverse pattern of dissociation with relatively normal perception associated with deficits in visual mental imagery (e.g., Goldenberg, 1992; Guariglia, Padovani, Pantano, & Pizzamiglio, 1993; Jacobson, Pearson, & Robertson, 2008; Moro, Berlucchi, Lerch, Tomaiuolo, & Aglioti, 2008).
For example, of two patients who performed a battery of mental imagery tests in several sensory domains (visual, tactile, auditory, gustatory, olfactory, and motor), one showed a pure visual imagery deficit and the other showed visual and tactile imagery deficits. Critically, the two patients had no apparent perceptual, language, or memory deficits (Moro et al., 2008). Lesions were located in the middle and inferior temporal gyri of the left hemisphere in one patient and in the temporo-occipital area and the left medial and superior parietal lobe in the other patient.

The fact that some brain-damaged patients can present spared mental imagery with a deficit in visual perception or spared visual perception with a deficit in mental imagery could reveal a double dissociation between shape- and color-based imagery and visual perception. In fact, visualizing an object relies on top-down processes that are not always necessary to perceive this object, whereas perceiving an object relies on bottom-up organizational processes not required to visualize it (e.g., Ganis, Thompson, & Kosslyn, 2004; Kosslyn, 1994). This double dissociation is supported by the fact that not all of the same brain areas are activated during visual mental imagery and visual perception (Ganis et al., 2004; Kosslyn, Thompson, & Alpert, 1997). In an attempt to quantify the similarity between visual mental imagery and visual perception, Ganis et al. (2004) in an fMRI study asked participants to judge visual properties of objects (such as whether the object was taller than wide) based either on a visual mental image of that object or on a picture of that object presented visually. Across the entire brain, the amount of overlap of the brain regions activated during visual mental imagery and visual perception reached 90 percent. The amount of overlap in activation was smaller in the occipital and temporal lobes than in the frontal and parietal lobes, which suggests that perception relies in part on bottom-up organizational processes that are not used as extensively during mental imagery. However, visual imagery elicited activation in regions that were a (p. 81) subset of the regions activated during the perceptual condition.

Dorsal Stream and Spatial Mental Imagery

In the same way that researchers have studied brain areas in the ventral stream involved in shape- and color-based mental imagery, researchers have identified brain areas recruited during spatial mental imagery in the dorsal stream. A number of neuroimaging studies used a well-understood mental imagery phenomenon to investigate the brain areas engaged during spatial mental imagery, namely, the image scanning paradigm.

In the image scanning paradigm, participants first learn a map of an island with a number of landmarks, then they mentally scan the distance between each pair of landmarks after hearing the names of a pair of landmarks (e.g., Denis & Cocude, 1989; Kosslyn et al., 1978). The landmarks are positioned in such a way that distances between each pair of landmarks are different. The classic finding is a linear increase of response times with increasing distance between landmarks (see Denis & Kosslyn, 1999). The linear relationship between distance and scanning times suggests that spatial images incorporate the metric properties of the objects they represent, which constitutes some of the evidence that spatial images depict information. In a PET study, Mellet, Tzourio, Denis, and Mazoyer (1995) investigated the neural basis of image scanning. After learning the map of a circular island, participants were asked either to scan between each landmark on a map presented visually in a clockwise or counterclockwise direction or to scan a mental image of the same map in the same way. When compared with a rest condition, both conditions elicited brain activation in the bilateral superior external occipital regions and in the left internal parietal region (precuneus). However, primary visual areas were activated only in the perceptual condition.
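The linear relationship between scanned distance and response time can be summarized by the slope and intercept of a least-squares line. The Python sketch below fits such a line to made-up data. The helper `fit_line` and the distance and time values are illustrative assumptions, not data from Kosslyn et al. (1978) or Denis and Cocude (1989).

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept). Plain-Python sketch for illustration.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical scanning data: distance between landmarks (cm on the map)
# and mean response time (ms). Values are invented for the example.
distances_cm = [2, 4, 6, 8, 10]
scan_times_ms = [900, 1000, 1100, 1200, 1300]

slope, intercept = fit_line(distances_cm, scan_times_ms)
print(slope, intercept)  # 50.0 800.0
```

Under these assumed values, the slope (50 ms per cm) would be read as the time cost of scanning one additional unit of distance, and the intercept (800 ms) as the time taken by processes that do not depend on distance.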
fMRI studies provided further evidence that spatial processing of spatial images and spatial processing of the same material presented visually share the same brain areas in the dorsal stream (e.g., Trojano et al., 2000, 2004). For example, Trojano et al. (2000) asked participants to visualize two analogue clock faces and then to decide on which of them the clock hands formed the greater angle. In the perceptual task, the task of the participants was identical, but the two clock faces were presented visually. When compared with a control condition (i.e., participants judged which of the two times was numerically greater), the mental imagery condition elicited activation in the posterior parietal cortex and several frontal regions. In both conditions, brain activation was found in the intraparietal sulcus (IPS). Critically, when the two conditions (imagery and perception) were directly contrasted, the activity in the IPS was no longer observed. The neuroimaging data suggest that the IPS supports spatial processing of mental images and of visual percepts. In a follow-up study using the clock-face mental imagery task in an event-related fMRI study, Formisano et al. (2002) found similar activation of the posterior parietal cortex with a peak of activation in the IPS 2 seconds after the auditory presentation of the hours to visualize.

Interestingly, the frontoparietal network at play during spatial imagery is not restricted to the processing of static spatial representations. In fact, Kaas, Weigelt, Roebroeck, Kohler, and Muckli (2010) used fMRI to study the brain areas recruited when participants were imagining objects in movement. In the motion imagery task, participants were asked to visualize a blue ball moving back and forth within either the upper right corner or the lower left corner of a computer screen. Participants imagined the motion of the ball at different speeds, adjusted as a function of the duration of an auditory cue. To determine whether participants visualized the ball at the correct speed, participants were required upon hearing a specific auditory cue to decide which of two visual targets was closer to the imagined blue ball. The motion imagery task elicited activation in a parietofrontal network comprising bilaterally the superior and inferior parietal lobules (areas 7 and 40) and the superior frontal gyrus (area 6), in addition to activation in the left middle occipital gyrus and hMT/V5+. Finally, in V1, V2, and V3, a negative BOLD response was found. Kaas and colleagues argue that this negative BOLD signal might reflect an inhibition of these areas to prevent visual inputs from interfering with motion imagery in higher visual areas such as hMT/V5+.

The recruitment of the dorsal route for spatial imagery does not depend on the modality in which information is presented. Mellet et al. (2002) found similar activation in a parietofrontal network (i.e., intraparietal sulcus, presupplementary motor area, and superior frontal sulcus) when participants mentally scanned an environment described verbally or an environment learned visually.
Activation of similar brain areas in the dorsal route is also observed when participants generate spatial images of cubes assembled on the basis of verbal information (Mellet et al., 1996). In addition, neuroimaging studies on (p. 82) blind participants suggest that representations and cognitive processes in spatial imagery are not visuo-spatial. For example, during a spatial mental imagery task, the superior occipital (area 19), the precuneus, and the superior parietal lobes (area 7) were activated in the same way in sighted and early blind participants (Vanlierde, de Volder, Wanet-Defalque, & Veraart, 2003). The task required participants to generate a pattern in a 6 × 6 grid by filling in cells based on verbal instructions. Once they generated the mental image of the pattern, participants judged the symmetry of this pattern. The fact that vision is not necessary to form and to process spatial images was further demonstrated in an rTMS study. Aleman et al. (2002) found that participants required more time to determine whether a cross presented visually “fell” on the uppercase letter they visualized in a real rTMS condition (compared with a sham rTMS condition) only when repetitive pulses were delivered to the posterior parietal cortex (P4 position) but not when they were delivered to the early visual cortex (Oz position).

The functional role of the dorsal route in spatial mental imagery is supported by results collected on brain-damaged patients (e.g., Farah et al., 1988; Levine et al., 1985; Luzzatti, Vecchi, Agazzi, Cesa-Bianchi, & Vergani, 1998; Morton & Morris, 1995). For example, Morton and Morris (1995) reported a patient called M.G. with a left parieto-occipital lesion who was selectively impaired in visuo-spatial processing. M.G. had no deficit in face recognition and visual memory tests nor in an image inspection task. In contrast, she was impaired not only on a mental rotation task but also on an image scanning task. She could learn the map of the island and indicate the correct positions of the landmarks, but she was not able to mentally scan the distance between the landmarks. She presented a similar deficit when asked to scan the contour of block letters.

Motor Imagery

In the previous sections, we presented evidence that visual mental imagery and spatial imagery rely on the same brain areas as the ones engaged during vision and spatial vision, respectively. Given that motor imagery occurs when a movement is mentally simulated, motor imagery should recruit brain areas involved in physical movement. And in fact there is a growing body of evidence that motor areas are activated during motor imagery. In the next section, we review evidence that motor imagery engages the same brain areas as the ones recruited during a physical movement, including in some instances the primary motor cortex, and that motor imagery is one of the strategies used to transform mental images.

Motor Imagery and Physical Movement

Decety and Jeannerod (1995) demonstrated that if one is asked to mentally walk from point A to point B, the time to realize this “mental travel” is similar to the time one would take to walk that distance. This mental travel effect (i.e., similarity of the time to imagine an action and the time to perform that action) constitutes strong evidence that motor imagery is crucial to simulating actual physical movements. Motor imagery is a particular type of mental imagery and differs from visual imagery (and to a certain extent from spatial imagery). In fact, a number of studies have documented that visual mental imagery and motor imagery rely on distinct mechanisms and brain areas (Tomasino, Borroni, Isaja, & Rumiati, 2005; Wraga, Shepard, Church, Iniati, & Kosslyn, 2005; Wraga, Thompson, Alpert, & Kosslyn, 2003). A single-cell recording of the motor strip of monkeys first demonstrated that motor imagery relies partially on areas of the cortex that carry out motor control: Neurons in the motor cortex fired in sequence depending on their orientation tuning while monkeys were planning to move a lever along a specific arc (Georgopoulos, Lurito, Petrides, Schwartz, & Massey, 1989). Crucially, the neurons fired when the animals were preparing to move their arms, not actually moving them.

To study motor imagery in humans, researchers often used mental rotation paradigms. In the seminal mental rotation paradigm designed by Shepard and Metzler (1971), a pair of 3D objects with several arms (each consisting of small cubes) is presented visually (Figure 5.2). The task of the participants is to decide whether the two objects have the same shape, regardless of difference in their orientation. The key finding is that the time to make this judgment increases linearly as the angular disparity between the two objects increases (i.e., mental rotation effect). Subsequent studies showed that the mental rotation effect is found with alphanumerical stimuli (e.g., Cooper & Shepard, 1973; Koriat & Norman, 1985), two-dimensional line drawings of letter-like asymmetrical characters (e.g., Tarr & Pinker, 1989), and pictures of common objects (e.g., Jolicoeur, 1985).
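The mental rotation effect can be written as a simple linear model. The following is a sketch under that assumption: the symbols $RT_0$ (time for processes that do not depend on angle) and $\omega$ (rate of mental rotation), as well as the numerical values, are chosen for illustration and are not taken from Shepard and Metzler (1971).

```latex
\[
  RT(\theta) = RT_0 + \frac{\theta}{\omega}
\]
% Illustrative values only: RT_0 = 1\,\mathrm{s}, \omega = 60^{\circ}/\mathrm{s}
\[
  RT(120^{\circ}) = 1\,\mathrm{s} + \frac{120^{\circ}}{60^{\circ}/\mathrm{s}} = 3\,\mathrm{s}
\]
```

In this form, the slope of response time against angular disparity is $1/\omega$, so a steeper slope corresponds to a slower rate of mental rotation.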

Figure 5.2 Example of a pair of Shepard and Metzler–like three-dimensional objects with (a) identical and (b) different shapes with a 50-degree rotation of the object on the right.

Richter et al. (2000) in an fMRI study found that mental rotation of Shepard and Metzler stimuli elicited activation in the superior parietal lobes bilaterally, the supplementary motor cortex, and the left (p. 83) primary motor cortex. Results from a hand mental rotation study provided additional evidence that motor processes were involved during image transformation (Parsons et al., 1995). Pictures of hands were presented in the right or left visual field with different orientations, and participants determined whether each picture depicted a left or right hand. Parsons and colleagues reasoned that the motor cortex would be recruited if participants mentally rotated their own hand in congruence with the orientation of the stimulus presented to make their judgment. Bilateral activation was found in the supplementary motor cortex, and critically, activation in the prefrontal and the insular premotor areas occurred in the hemisphere contralateral to the stimulus handedness. Activation was not restricted to brain areas that implemented motor functions; significant activation was also reported in the frontal and parietal lobes as well as in area 17.

According to Decety (1996), image rotation occurs because we anticipate what we would see if we manipulated an object, which implies that motor areas are recruited during mental rotation regardless of the category of objects rotated. Kosslyn, DiGirolamo, Thompson, and Alpert (1998) in a PET study directly tested this assumption by asking participants to mentally rotate either inanimate 3D armed objects or pictures of hands. In both conditions, the two objects (or the two hands) were presented with different angular disparities, and participants judged whether the two objects (or hands) were identical. To determine the brain areas specifically activated during mental rotation, each experimental condition was compared with a baseline condition in which the two objects (or hands) were presented in the same orientation.
The researchers found activation in the primary motor cortex (area M1), premotor cortex, and posterior parietal lobe when participants rotated hands. In contrast, none of the frontal motor areas was activated when participants mentally rotated inanimate objects. The findings suggest that there are at least two ways objects in images can be rotated: one that relies heavily on motor processes, and one that does not. However, the type of stimuli rotated might not predict when the motor cortex is recruited. In fact, Cohen et al. (1996) in an fMRI study found that motor areas were activated in half of the participants in a mental rotation task using 3D armed objects similar to those used in the Kosslyn et al. (1998) study.

Strategies in Mental Rotation Tasks

The fact that mental rotation of inanimate objects elicits activation in frontal motor areas in some participants but not others suggests that there might be more than one strategy to rotate this type of object. Kosslyn, Thompson, Wraga, and Alpert (2001) tested whether in a mental rotation task of 3D armed objects participants could imagine the rotation of objects in two different ways: as if an external force (such as a motor) was rotating the objects (i.e., external action condition), or as if the objects were being physically manipulated (i.e., internal action condition). Participants received different sets of instructions and practice procedures to prompt them to use one of the two strategies (external action vs. internal action). In the practice of the external action condition, a wooden model of a typical Shepard and Metzler object was rotated by an electric motor. In contrast, in the internal action condition, participants rotated the wooden model physically. The object used during practice was not used on the experimental trials. On each new set of trials, participants were instructed to mentally rotate the object in the exact same way the wooden model was rotated in the preceding practice session. The crucial finding was that area M1 was activated when participants mentally rotated the object on the internal action trials but not on the external action trials. However, posterior parietal and secondary motor (p. 84) areas were recruited in both conditions. The results have two implications: First, mental rotation in general (independently of the type of stimuli) can be achieved by imagining the physical manipulation of the object. Second, participants can adopt one or the other strategy voluntarily regardless of their cognitive styles or cognitive abilities.

However, the previous study left open the question of whether one can spontaneously use a motor strategy to perform a mental rotation task of inanimate objects.
Wraga et al. (2003) addressed this issue in a PET study. In their experiment, participants performed either a mental rotation task of pictures of hands (similar to the one used by Kosslyn et al., 1998) and then a Shepard and Metzler rotation task or two Shepard and Metzler tasks. The authors reasoned that for the group that started with the mental rotation task of hands, motor processes involved in the hand rotation task would covertly transfer to the Shepard and Metzler task. In fact, when the brain activation in the two groups of participants was compared in the second mental rotation task (Shepard and Metzler task in both groups), activation in the motor areas (areas 6 and M1) was found only in the group that performed a hand rotation task before the Shepard and Metzler task (Figure 5.3). The results clearly demonstrate that motor processes can be used spontaneously to mentally rotate objects that are not body parts.


Functional Role of Area M1

Figure 5.3 Brain activations observed in the internal action minus the external action conditions.

The studies we reviewed suggest that area M1 plays a role in the mental transformation of objects. However, none addressed whether M1 plays a functional role in mental transformation and more specifically in mental rotation of objects. To address this issue, Ganis, Keenan, Kosslyn, and Pascual-Leone (2000) administered single-pulse TMS to the left primary motor cortex of participants while they performed mental rotations of line drawings of hands or feet presented in their right visual field. Single-pulse TMS was administered at different time intervals from the stimulus onset (400 or 650 ms) to determine when primary motor areas are recruited during mental rotation. In addition, to test whether mental rotation of body parts is achieved by imagining the movement of the corresponding part of the body, single-pulse TMS was delivered specifically to the hand area of M1. Participants required more time and made more errors when a single-pulse TMS was delivered to M1, when the single-pulse TMS was delivered 650 ms rather than 400 ms after stimulus onset, and when participants mentally rotated hands rather than feet. Within the limits of the spatial resolution of the TMS methodology, the results suggest that M1 is required to perform mental rotation of body parts by mapping the movement on one’s own body part but only after the visual and spatial relations of the stimuli have been encoded. Tomasino et al. (2005) reported converging data supporting the functional role of M1 in mental rotation by using a mental rotation task of hands in a TMS study.

However, the data are not sufficient to claim that the computations are actually taking place in M1. It is possible that M1 relays information computed elsewhere in the brain (such as in the posterior parietal cortex). And in fact, Sirigu, Duhamel, Cohen, Pillon, Dubois, and Agid (1996) demonstrated that the parietal cortex, not the motor cortex, is critical to generate mental movement representations.
Patients with lesions restricted to the parietal cortex showed a deficit in predicting the time necessary to perform specific finger movements, whereas no such deficit was reported for a patient with lesions restricted to M1.

Conclusion

Some remain dubious that mental imagery can be functionally meaningful and can constitute a topic of research on its own. However, by moving away from a purely introspective approach to mental imagery to embrace more objective approaches, and notably by using neuroimaging, researchers have collected evidence that mental images are depictive representations interpreted by cognitive processes at play in other systems, like the perceptual and the (p. 85) motor systems. In fact, we hope that this review of the literature has made clear that there is little evidence to counter the claims that most of the same neural processes underlying perception are also used in visual mental imagery and that motor imagery can recruit the motor system in a similar way that physical action does. Researchers now rely on what is known of the organization of the perceptual and motor systems and of the key features of the neural mechanisms in those systems to refine the characterization of the cognitive mechanisms at play in the mental imagery system. The encouraging note is that each new characterization of the perceptual and motor systems brings a chance to better understand neural mechanisms at play in mental imagery.

Finally, with the ongoing development of more elaborate neuroimaging techniques and analyses of the BOLD signal, mental imagery researchers have an increasing set of tools at their disposal to resolve complicated questions about mental imagery. A number of questions remain to be answered in order to achieve a full understanding of the neural mechanisms carrying shape, color, spatial, and motor imagery. For example, although much evidence points toward an overlap of perceptual and visual mental imagery processes in high-level visual cortices (temporal and parietal lobes), evidence remains mixed at this point concerning the role of lower level processes in visual mental imagery.
Indeed, we need to understand the circumstances under which the early visual cortex is recruited during mental imagery. Another problem that warrants further investigation is the neural basis of the individual differences observed in mental imagery abilities. As a prerequisite, we can develop objective methods to measure individual differences in those abilities.


Grégoire Borst

Grégoire Borst is an assistant professor in developmental psychology and cognitive neuroscience at Paris Descartes University.


Looking at the Nose Through Human Behavior, and at Human Behavior Through the Nose

Roni Kahana and Noam Sobel
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0006

Abstract and Keywords

Mammalian olfaction is highly stereotyped. It consists of a sensory epithelium in the nose, where odorants are transduced to form neural signals. These neural signals are projected via the olfactory nerve to the olfactory bulb, where they generate spatiotemporal patterns of neural activity subserving odorant discrimination. This information is then projected via the olfactory tract to olfactory cortex, a neural substrate optimized for olfactory object perception. In contrast to popular notions, human olfaction is quite keen. Thus, systematic analysis of human olfactory perception has uncovered fundamental properties of mammalian olfactory processing, and mammalian olfaction explains fundamental properties of human behaviors such as eating, mating, and social interaction, which are all critical for survival.

Keywords: olfactory perception, olfaction, behavior, odorant, olfactory epithelium, olfactory discrimination, piriform cortex, eating, mating, social interaction

Introduction

Even in reviews on olfaction, it is often stated that human behavior and perception are dominated by vision, or that humans are primarily visual creatures. This reflects the consensus in cognitive neuroscience (Zeki & Bartels, 1999). Indeed, if asked which distal sense we would soonest part with, most (current authors included) would select olfaction before audition or vision. Thus, whereas primarily olfactory animals such as rodents are referred to as macrosmatic, humans are considered microsmatic.

That said, we trust our nose over our eyes and ears in the two most critical decisions we make: what we eat, and with whom we mate (Figure 6.1). We review various studies in this respect, yet first we turn to the reader’s clear intuition: Given a beautiful-looking slice of cake that smells of sewage and a mushy-looking shapeless mixture that smells of cinnamon and banana, which do you eat? Given a gorgeous-looking individual who smells like yeast and a profoundly physically unattractive person who smells like sweet spice, with whom do you mate? In both of these key behaviors, humans, like all mammals, are primarily olfactory. With this simple truth in mind, namely, that in our most important decisions we follow our nose, should humans nevertheless still be considered microsmatic (Stoddart, 1990)?

Functional Neuroanatomy of the Mammalian Olfactory System

Figure 6.1 The primacy of human olfaction. Humans trust olfaction over vision and audition in key behaviors related to survival, such as mate selection and determination of edibility. Courtesy of Gilad Larom.


Figure 6.2 Schematic of the human olfactory system. Odorants are transduced at the olfactory epithelium (1). Receptors of different subtypes (three illustrated, ∼1,000 in mammals) converge via the olfactory nerve onto common glomeruli at the olfactory bulb (2). From here, information is conveyed via the lateral olfactory tract to primary olfactory cortex (3). From here, information is conveyed throughout the brain, most notably to orbitofrontal cortex (5) via a direct and indirect route through the thalamus (4). (From Sela & Sobel, 2010. Reprinted with permission from Springer.)

Before considering the behavioral significance of human olfaction, we first provide a basic overview of olfactory system organization. The mammalian olfactory system follows a rather clear hierarchy, starting with transduction at the olfactory epithelium in the nose, then initial processing subserving odor discrimination in the olfactory bulb, and finally higher order processing related to odor object formation and odor memory in primary olfactory cortex (R. I. Wilson & Mainen, 2006) (Figure 6.2). This organization is bilateral and symmetrical, and although structural connectivity appears largely ipsilateral (left epithelium to left bulb to left cortex) (Powell, Cowan, & Raisman, 1965), functional measurements have implied more contralateral than ipsilateral driving of activity (Cross et al., 2006; McBride & Slotnick, 1997; J. Porter, Anand, Johnson, Khan, & Sobel, 2005; Savic & Gulyas, 2000; D. A. Wilson, 1997). The neural underpinnings of this functional contralaterality remain unclear.


More than One Nose in the Nose

Figure 6.3 More than one nose in the nose. A, The olfactory system in the mouse contains multiple subsystems: the olfactory epithelium (OE), the vomeronasal organ (VNO), the Grueneberg ganglion (GG), and the septal organ (SO). Sensory neurons positioned in the OE, SO, and GG project to the main olfactory bulb (MOB), whereas sensory neurons of the VNO project to the accessory olfactory bulb (AOB). (From Ferrero & Liberles, 2010, originally adapted from Buck, 2000.) B, The human nose is innervated by both olfactory and trigeminal sensory nerve endings. (Modification of illustration by Patrick J. Lynch.)

Odorants are concurrently processed in several neural subsystems beyond the above-described main olfactory system (Breer, Fleischer, & Strotmann, 2006) (Figure 6.3). For example, air-borne molecules are transduced at endings of the trigeminal nerve in the eye, nose, and throat (Hummel, 2000). It is trigeminal activation that provides the cooling sensation associated with odorants such as menthol, or the stinging sensation associated with odorants such as ammonia or onion. In rodents, at least three additional sensing mechanisms have been identified in the nose. These include (1) the septal organ, which consists of a small patch of olfactory receptors that are anterior to the main epithelium (Ma et al., 2003); (2) the Grueneberg organ, which contains small grape-like clusters of receptors at the anterior end of the nasal passage that project to a separate subset of main olfactory bulb targets (Storan & Key, 2006); and (3) the vomeronasal system, or accessory olfactory system (Halpern, 1987; Wysocki & Meredith, 1987). The accessory olfactory system is equipped with a separate bilateral epithelial structure, the vomeronasal organ, or VNO (sometimes also referred to as Jacobson’s organ). The VNO is a pit-shaped structure at the anterior portion of the nasal passage, containing receptors that project to an accessory olfactory bulb, which in turn projects directly to critical components of the limbic system such as the amygdala and hypothalamus (Keverne, 1999; Meredith, 1983) (see Figure 6.3). In rodents, the accessory olfactory system plays a key role in mediating social chemosignaling (Halpern, 1987; Kimchi, Xu, & Dulac, 2007; Wysocki & Meredith, 1987). Whether humans have a septal organ or Grueneberg organ has not been carefully studied, and it is largely held that humans do not have an accessory olfactory system, although this issue remains controversial (Frasnelli, Lundström, Boyle, Katsarkas, & Jones-Gotman; Meredith, 2001; Monti-Bloch, Jennings-White, Dolberg, & Berliner, 1994; Witt & Hummel, 2006). Regardless of this debate, it is clear that the sensation of smell in humans and other mammals is the result of common activation across several neural subsystems (Restrepo, Arellano, Oliva, Schaefer, & Lin, 2004; Spehr et al., 2006). However, before air-borne stimuli are processed, they first must be acquired.

Sniffs: More than a Mechanism for Odorant Sampling

Figure 6.4 Sniffing. Careful visualization of human sniff airflow revealed that although the nostrils are structurally close together, an asymmetry in nasal airflow generates a “functional distance” between the nostrils. A, A PIV laser light sheet was oriented in a coronal plane intersecting the nostrils at their midpoint. B and C, PIV images of particle-laden inspired air stream for two example sniffs. D, A contour plot of velocity magnitude of the inspired air stream into the nose of a subject sniffing at 0.2 Hz. E, Velocity profiles of the right and left naris; abscissa indicates distance from the tip of the nose to the lateral extent of the naris. From Porter et al., 2007. Reprinted with permission from Nature.

Mammalian olfaction starts with a sniff, a critical act of odor sampling. Sniffs are not merely an epiphenomenon of olfaction, but rather are an intricate component of olfactory perception (Kepecs, Uchida, & Mainen, 2006, 2007; Mainland & Sobel, 2006; Schoenfeld & Cleland, 2006). Sniffs are in part a reflexive action (Tomori, Benacka, & Donic, 1998), which is then rapidly modified in accordance with odorant content (Laing, 1983) (Figure 6.4). Humans begin tailoring their sniff according to odorant properties within about 160 ms of sniff onset, reducing sniff magnitude for both intense (B. N. Johnson, Mainland, & Sobel, 2003) and unpleasant (Bensafi et al., 2003) odorants. We have proposed that the mechanism that tailors a sniff to its content is cerebellar (Sobel, Prabhakaran, Hartley, et al., 1998), and cerebellar lesions indeed negate this mechanism (Mainland, Johnson, Khan, Ivry, & Sobel, 2005). Moreover, not only are sniffs the key mechanism for odorant sampling, they also play a key role in the timing and organization of neural representation in the olfactory system. This influence of sniffing on neural representation in olfaction may begin at the earliest phase of olfactory processing because olfactory receptors are also mechanosensitive (Grosmaitre, Santarelli, Tan, Luo, & Ma, 2007), potentially responding to sniffs even without odor. Sniff properties are then reflected in neural activity at both the olfactory bulb (Verhagen, Wesson, Netoff, White, & Wachowiak, 2007) and olfactory cortex (Sobel, Prabhakaran, Desmond, et al., 1998). Indeed, negating sniffs (whether their execution, or only their intention) may underlie in part the pronounced differences in olfactory system neural activity during wake and anesthesia (Rinberg, Koulakov, & Gelperin, 2006). Finally, odor sampling is not only through the nose (orthonasal) but also through the mouth (retronasal): Food odors make their way to the olfactory system by ascending through the posterior nares of the nasopharynx (Figure 6.5). Several lines of evidence have suggested partially overlapping yet partially distinct neural substrates subserving orthonasal and retronasal human olfaction (Bender, Hummel, Negoias, & Small, 2009; Hummel, 2008; Small, Gerber, Mak, & Hummel, 2005).

Olfactory Epithelium: The Site of Odorant Transduction

Figure 6.5 Schematic drawing of the nasal cavity with the lower, middle, and upper turbinates. Airflow in relation to orthonasal (through the nostrils) or retronasal (from the mouth/pharynx to the nasal cavity) is indicated by arrows, both leading to the olfactory epithelium located just beneath the cribriform plate. From Negoias, Visschers, Boelrijk, & Hummel, 2008. Reprinted with permission from Elsevier.

Once sniffed, an odorant makes its way up the nasal passage, where it crosses a mucous membrane before interacting with olfactory receptors that line the olfactory epithelium. This step is not inconsequential to the olfactory process. Odorants cross this mucus following the combined action of passive gradients and active transporters, which generate an odorant-specific pattern of dispersion (Moulton, 1976). These so-called sorption properties have been hypothesized to play a key role in odorant discrimination, in that they form a sort of chromatographic separation at the nose (Mozell & Jagodowicz, 1973). The later identification of an inordinately large family of specific olfactory receptor types (L. Buck & Axel, 1991; Zhang & Firestein, 2002) shifted the focus of enquiry regarding odor discrimination to that of receptor–ligand interactions, but the chromatographic component of this process has never been negated and likely remains a key aspect of odorant processing.

Once an odorant crosses the mucosa, it interacts with olfactory receptors at the sensory end of olfactory receptor neurons. Humans have about 12 million bipolar receptor neurons (Moran, Rowley, Jafek, & Lovell, 1982) that differ from typical neurons in that they constantly regenerate from a basal cell layer throughout the lifespan (Graziadei & Monti Graziadei, 1983). These neurons send their dendritic process to the olfactory epithelial surface, where they form a knob from which five to twenty thin cilia extend into the mucus. These cilia contain the olfactory receptors: 7-transmembrane G-protein–coupled second-messenger receptors, where a cascade of events that starts with odorant binding culminates in the opening of cross-membrane cation channels that depolarize the cell (Firestein, 2001; Spehr & Munger, 2009; Zufall, Firestein, & Shepherd, 1994) (Figure 6.6). The mammalian genome contains more than 1,000 such receptor types (L. Buck & Axel, 1991), yet humans functionally express only about 400 of these (Gilad & Lancet, 2003). Typically, each receptor neuron expresses only one receptor type, although recent evidence from Drosophila has suggested that in some cases a single neuron may express two receptor types (Goldman, Van der Goes van Naters, Lessing, Warr, & Carlson, 2005). In rodents, receptor types are grouped into four functional expression zones along a dorsoventral epithelial axis, yet are randomly dispersed within each zone (Ressler, Sullivan, & Buck, 1993; Strotmann, Wanner, Krieger, Raming, & Breer, 1992; Vassar, Ngai, & Axel, 1993). Each receptor type is typically responsive to a small subset of odorants (Hallem & Carlson, 2006; Malnic, Hirono, Sato, & Buck, 1999; Saito, Chi, Zhuang, Matsunami, & Mainland, 2009), although some receptors may be responsive to only very few odorants (Keller, Zhuang, Chi, Vosshall, & Matsunami, 2007), and other receptors may be responsive to a very wide range of odorants (Grosmaitre et al., 2009). Despite some alternative hypotheses (Franco, Turin, Mershin, & Skoulakis, 2011), this receptor-to-odorant specificity is widely considered the basis for olfactory coding (Su, Menuz, & Carlson, 2009).
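
As a rough illustration of how receptor-to-odorant specificity could serve as a code, the following Python sketch treats each odorant as the set of receptor types it activates. Everything in it is an assumption made for illustration: the receptor names, the odorant names, and the tuning sets are invented, and real receptor responses are graded rather than all-or-none.

```python
# Toy illustration only. Receptor names, odorant names, and tuning sets are
# hypothetical and do not correspond to any real receptor or measurement.
RECEPTOR_TUNING = {
    "R_narrow": {"odorant_A"},                      # responsive to very few odorants
    "R_typical_1": {"odorant_A", "odorant_B"},      # responsive to a small subset
    "R_typical_2": {"odorant_B", "odorant_C"},      # responsive to a small subset
    "R_broad": {"odorant_A", "odorant_B",
                "odorant_C", "odorant_D"},          # responsive to a wide range
}


def receptor_code(odorant):
    """Return the set of receptor types that respond to the given odorant."""
    return frozenset(
        name for name, ligands in RECEPTOR_TUNING.items() if odorant in ligands
    )


def discriminable(odorant_1, odorant_2):
    """Two odorants are distinguishable here if their activation sets differ."""
    return receptor_code(odorant_1) != receptor_code(odorant_2)


if __name__ == "__main__":
    for odorant in ("odorant_A", "odorant_B", "odorant_C", "odorant_D"):
        print(odorant, sorted(receptor_code(odorant)))
```

In this toy table the broadly tuned receptor alone cannot tell the four odorants apart, yet the combined activation patterns are all distinct, which is the sense in which a population of differently tuned receptors could support discrimination.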


Olfactory Bulb: A Neural Substrate for Odorant Discrimination

Figure 6.6 Receptor events in olfaction. Signal transduction in an olfactory sensory neuron. Binding of an odorant to its cognate odorant receptor (OR) results in the activation of heterotrimeric G protein (Gαolf plus Gβγ). Activated Gαolf in turn activates type III adenylyl cyclase (AC3), leading to the production of cyclic adenosine monophosphate (cAMP) from adenosine triphosphate (ATP). cAMP gates or opens the cyclic nucleotide–gated (CNG) ion channel, leading to the influx of Na+ and Ca2+, depolarizing the cell. This initial depolarization is amplified through the activation of a Ca2+-dependent Cl− channel. In addition, cAMP activates protein kinase A (PKA), which can regulate other intracellular events, including transcription of cAMP-regulated genes. Reprinted with permission from DeMaria & Ngai, 2010.

Whereas receptor types appear randomly dispersed throughout each epithelial subzone, the path from epithelium to bulb via the olfactory nerve entails a unique pattern of convergence that brings together all receptor neurons that express a particular receptor type. These synapse onto one of two common points at the olfactory bulb, termed glomeruli (Mombaerts et al., 1996). Thus, the number of glomeruli is expected to be about double the number of receptor types, and the receptive range of a glomerulus is expected to reflect the receptive range of a given receptor type (Feinstein & Mombaerts, 2004). Within the glomeruli, receptor axons contact dendrites of either mitral or tufted output neurons and periglomerular interneurons. Whereas these rules have been learned mostly from studies in rodents, the human olfactory system may be organized slightly differently; rather than the expected about 750 glomeruli (about double the number of expressed receptor types), postmortem studies revealed many thousands of glomeruli in the human olfactory bulb (Maresh, Rodriguez Gil, Whitman, & Greer, 2008).
The stereotyped connectivity from epithelium to bulb generates a spatial representation of receptor types on the olfactory bulb surface. In simple terms, each activated glomerulus reflects the activation of a given receptor type. Thus, the spatiotemporal pattern of bulbar activation is largely considered the base for olfactory discrimination coding (Firestein, 2001). The common notion is that a given odorant is represented by the particular pattern of glomeruli activation in time. Indeed, various methods of recording neural activity at the olfactory bulb have converged to support this notion (Leon & Johnson, 2003; Su et al., 2009; Uchida, Takahashi, Tanifuji, & Mori, 2000) (Figure 6.7). Although it is easy to grasp and convey this notion of a purely spatial substrate where different odors induce different patterns of activation, this is clearly a simplified view because the particular timing of neural activity also clearly plays a role in odor coding at this stage. The role of temporal neural activity patterns in odor coding was primarily uncovered in insects (Laurent, 1997, 1999; Laurent, Wehr, & Davidowitz, 1996) but has been revealed in mammals as well (Bathellier, Buhl, Accolla, & Carleton, 2008; Lagier, Carleton, & Lledo, 2004; Laurent, 2002). Moreover, it is noteworthy that olfactory bulb lesions have a surprisingly limited impact on olfactory discrimination (Slotnick & Schoonover, 1993), and a spatiotemporal bulbar activation code has yet to be linked to meaningful olfactory information within a predictive framework (Mainen, 2006). In other words, a “map of odors” on the olfactory bulb is a helpful concept in understanding the olfactory system, but it is not the whole story.

Primary Olfactory Cortex: A Loosely Defined Structure with Loosely Defined Function

The structural and functional properties of epithelium and bulb are relatively straightforward: The epithelium is the site of transduction, where odorants become neural signals. The bulb is the site of discrimination, where different odors form different spatiotemporal patterns of neural activity. By contrast, the structure and function of primary olfactory cortex remain unclear. In other words, there is no clear agreement as to what constitutes primary olfactory cortex, let alone what it does.
By current definition, primary olfactory cortex consists of all brain regions that receive direct input from the mitral and tufted cell axons of the olfactory bulb (Allison, 1954; Carmichael, Clugnet, & Price, 1994; de Olmos, Hardy, & Heimer, 1978; Haberly, 2001; J. L. Price, 1973, 1987, 1990; Shipley, 1995). These comprise most of the paleocortex, including (by order along the olfactory tract) the anterior olfactory cortex (also referred to as the anterior olfactory nucleus) (Brunjes, Illig, & Meyer, 2005), ventral tenia tecta, anterior hippocampal continuation and indusium griseum, olfactory tubercle, piriform cortex, anterior cortical nucleus of the amygdala, periamygdaloid cortex, and rostral entorhinal cortex (Carmichael et al., 1994) (Figure 6.8).
As can be appreciated by both the sheer area and diversity of cortical real estate that is considered primary olfactory cortex, this definition is far from functional. One cannot assign a single function to “primary olfactory cortex” when primary olfactory cortex is a label legitimately applied to a large proportion of the mammalian brain. The term primary typically connotes basic functional roles such as early feature extraction, yet as can be expected, a region comprising in part piriform cortex, amygdala, and entorhinal cortex is involved in far more complex sensory processing than mere early feature extraction.
With this in mind, several authors have simply shifted the definition by referring to the classic primary olfactory structures as secondary olfactory structures, noting that the definition of mammalian primary olfactory cortex may better fit the olfactory bulb than piriform cortex (Cleland & Sullivan, 2003; Haberly, 2001).

Figure 6.7 Spatial coding at the olfactory bulb. Patterns of rat glomeruli activation (by 2-deoxyglucose uptake) evoked by different odors. Activation is represented as the average z-score pattern for both bulbs of up to four separate rats exposed to each odor. Warmer colors indicate higher uptake. From Johnson, Ong, & Leon, 2010. Copyright © 2009 Wiley-Liss, Inc.

At the same time, there has been a growing tendency to use the term primary olfactory cortex for piriform cortex alone. Piriform cortex, the largest component of primary olfactory cortex in mammals, lies along the olfactory tract at the junction of temporal and frontal lobes and continues onto the dorsomedial aspect of the temporal lobe (see Figure 6.8A and B). Consistent with the latter approach, here we restrict our review of olfactory cortex to the piriform portion of primary olfactory cortex alone.


Piriform Cortex: A Neural Substrate for Olfactory Object Formation

Figure 6.8 Human olfactory cortex. A, Ventral view of the human brain in which the right anterior temporal lobe has been resected in the coronal plane to expose the limbic olfactory areas. B, Afferent output from the olfactory bulb (OB) passes through the lateral olfactory tract (LOT) and projects monosynaptically to numerous regions, including the anterior olfactory nucleus (AON), olfactory tubercle (OTUB), anterior piriform cortex (APC), posterior piriform cortex (PPC), amygdala (AM), and entorhinal cortex (EC). Downstream relays include the hippocampus (HP) and the putative olfactory projection site in the human orbitofrontal cortex (OFColf). C, Schematic representation of the cellular organization of the piriform cortex. Pyramidal neurons are located in cell body layers II and III, and their apical dendrites project to molecular layer I. Layer I is subdivided into a superficial layer (Ia) that contains the sensory afferents from the olfactory bulb (shown in red) and a deeper layer (Ib) that contains the associative inputs from other areas of the primary olfactory cortex and higher order areas (shown in blue). Most of the layer Ia afferents terminate in the APC, whereas most of the layer Ib associative inputs terminate in the posterior piriform cortex (PPC). Reprinted with permission from Gottfried, 2010.

Piriform cortex is three-layered paleocortex that has been described in detail (Martinez, Blanco, Bullon, & Agudo, 1987). In brief, layer I is subdivided into layer Ia, where afferent fibers from the olfactory bulb terminate, and layer Ib, where association fibers terminate (see Figure 6.8C). Layer II is a compact zone of neuronal cell bodies. Layer III contains neuronal cell bodies at a lower density than layer II and a large number of dendritic and axonal elements. Piriform input is widely distributed, and part of piriform output feeds back into piriform as further distributed input. Moreover, piriform cortex is reciprocally and extensively connected with several high-order areas of the cerebral cortex, including the prefrontal, amygdaloid, perirhinal, and entorhinal cortices (Martinez et al., 1987).


The current understanding of piriform cortex function largely originated from the work of Lew Haberly and colleagues (Haberly, 2001; Haberly & Bower, 1989; Illig & Haberly, 2003). These authors hypothesized that the structural organization of piriform cortex renders it highly suitable to function as a content-addressable memory system, where fragmented input can be used to “neurally reenact” a stored representation. Haberly and colleagues identified or predicted several aspects of piriform organization that render it an ideal substrate for such a system. These predictions have been remarkably borne out in later studies of structure and function. Haberly and colleagues noted that first, associative networks depend on spatially distributed input systems. Several lines of evidence have indeed suggested that the projection from bulb to piriform is in fact spatially distributed. In other words, in contrast to the spatial clustering of responses at the olfactory bulb, this ordering is apparently obliterated in the projection to piriform cortex (Stettler & Axel, 2009). Second, the discriminative power of associative networks relies on positive feedback via interconnections between the processing units that receive the distributed input. Indeed, in piriform cortex, each pyramidal cell makes a small number of synaptic contacts on a large number (>1,000) of other cells in piriform cortex at disparate locations. Axons from individual pyramidal cells also arborize extensively within many neighboring cortical areas, most of which send strong projections back to piriform cortex (D. M. G. Johnson, Illig, Behan, & Haberly, 2000). Third, in associative memory models, individual inputs are typically weak relative to output threshold, a situation that indeed likely occurs in piriform (Barkai & Hasselmo, 1994).
Finally, content-addressable memory systems typically require activity-dependent changes in excitatory synaptic strengths. Again, this pattern has since consistently been demonstrated in piriform cortex, where enhanced olfactory learning capability is accompanied by long-term enhancement of synaptic transmission in both the descending and ascending inputs (Cohen, Reuveni, Barkai, & Maroun, 2008).
In addition to the above materialization of Haberly’s predictions on piriform structure, several studies have similarly borne out his predictions on function. In a series of studies, Don Wilson and colleagues have demonstrated the importance of piriform cortex associative memory-like properties in olfactory pattern formation, completion, and separation from background (Barnes, Hofacer, Zaman, Rennaker, & Wilson, 2008; Kadohisa & Wilson, 2006; Linster, Henry, Kadohisa, & Wilson, 2007; Linster, Menon, Singh, & Wilson, 2009; D. A. Wilson, 2009a, 2009b; D. A. Wilson & Stevenson, 2003). In a critical recent study, these authors taught rats to discriminate between various mixtures, each containing ten monomolecular components (Barnes et al., 2008). They found that rats easily discriminated between a target mixture of ten components (10C) and a second mixture in which only one of the ten components was replaced with a novel component (10CR1). In turn, rats were poor at discriminating this same target mixture from a mixture where one of the components was deleted (10C-1). The authors concluded that through pattern completion, 10C-1 was “completed” to 10C, yet through pattern separation, 10CR1 was perceived as something new altogether. Critically, the authors accompanied these behavioral studies with electrical recordings from both olfactory bulb and piriform cortex. They found that a shift from 10C to 10C-1 induced a significant decorrelation in the activity of olfactory bulb mitral cell ensembles. In other words, olfactory bulb mitral cell ensembles readily separated these overlapping patterns. In contrast, piriform cortex ensembles showed no significant decorrelation across 10C and 10C-1 mixtures. In other words, the piriform ensemble filled in the missing component and responded as if the full 10C mixture were present, consistent with pattern completion. In contrast, a shift from 10C to 10CR1 produced significant cortical ensemble pattern separation. In other words, the ensemble results were consistent with behavior whereby introduction of a novel component into a complex mixture was relatively easy to detect, whereas removal of a single component was difficult to detect.
Consistent with the above, Jay Gottfried and colleagues have used functional magnetic resonance imaging (fMRI) to investigate piriform activity in humans (Gottfried & Wu, 2009). In an initial study, they uncovered a heterogeneous response profile whereby odorant physicochemical properties were evident in activity patterns measured in anterior piriform cortex, and odorant perceptual properties were associated with activity patterns measured in posterior piriform (Gottfried, Winston, & Dolan, 2006). In that posterior piriform is richer than anterior piriform in the extent of associational connectivity, this finding is consistent with the previously described findings in rodents. Moreover, using multivariate fMRI analysis techniques, they found that odorants with similar perceived quality induced similar patterns of ensemble activity in posterior piriform cortex alone (Howard, Plailly, Grueschow, Haynes, & Gottfried, 2009).
Taken together, these results from both rodents and humans depict piriform cortex as a critical component allowing the olfactory system to deal with an ever-changing olfactory environment, while still allowing stable olfactory object formation and constancy.
Finally, beyond primary olfactory cortex, olfactory information is distributed widely throughout the brain. Whereas other sensory modalities traverse a thalamic relay en route from periphery to primary cortex, in olfaction information reaches primary cortex directly. This is not to say, however, that there is no olfactory thalamus. A recent lesion study has implicated thalamic involvement in olfactory identification, hedonic processing, and olfactory motor control (Sela et al., 2009), and a recent imaging study has implicated a thalamic role in olfactory attention (Plailly, Howard, Gitelman, & Gottfried, 2008), a finding further supported by lesion studies (Tham, Stevenson, & Miller, 2010). From the thalamus, olfactory information radiates widely, yet most notable in its projections is the orbitofrontal cortex that is largely considered secondary olfactory cortex (J. L. Price, 1990). Both human fMRI studies and single-cell recordings in monkeys suggest that orbitofrontal cortex is critical for coding odor identity (Rolls, Critchley, & Treves, 1996; Tanabe, Iino, Ooshima, & Takagi, 1974) and may further be key for conscious perception of smell (Li et al., 2010).
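The content-addressable memory account of piriform cortex described earlier in this section can be made concrete with a toy example. The sketch below uses a textbook Hopfield network, which is swapped in here purely for illustration: it is not the model of Haberly and colleagues, nor a model of piriform circuitry. The network size (100 units), the three stored patterns, and the amount of corruption are all assumptions. It shows the generic property at issue: recurrent feedback plus activity-dependent (Hebbian) weights lets a degraded input settle back onto a stored pattern.

```python
import numpy as np


def train_hopfield(patterns):
    """Hebbian outer-product rule for +1/-1 patterns of shape (n_patterns, n_units)."""
    n_units = patterns.shape[1]
    weights = (patterns.T @ patterns) / n_units
    np.fill_diagonal(weights, 0.0)  # no self-connections
    return weights


def recall(weights, cue, n_sweeps=5):
    """Asynchronous updates: each unit takes the sign of its recurrent input."""
    state = cue.copy()
    for _ in range(n_sweeps):
        for i in range(state.size):
            state[i] = 1 if weights[i] @ state >= 0 else -1
    return state


rng = np.random.default_rng(0)
stored = rng.choice([-1, 1], size=(3, 100))  # three hypothetical stored "odor objects"
weights = train_hopfield(stored)

# Degrade the first pattern by flipping 10 of its 100 units (a fragmented input).
cue = stored[0].copy()
cue[:10] *= -1

completed = recall(weights, cue)

# Overlap of 1.0 means the state is identical to the stored pattern.
cue_overlap = float(cue @ stored[0]) / stored.shape[1]
completed_overlap = float(completed @ stored[0]) / stored.shape[1]
print(cue_overlap, completed_overlap)
```

Comparing `cue_overlap` with `completed_overlap` is loosely analogous to asking whether an ensemble response to 10C-1 resembles the response to 10C; the analogy is conceptual only and makes no claim about the recordings reported by Barnes et al. (2008).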


Looking at the Nose Through Human Behavior

As reviewed above, the basic functional architecture of mammalian olfaction is well understood. In the olfactory epithelium, there is a thorough understanding of receptor events that culminate in transduction of odorants into neural signals. In the olfactory bulb, there is a comprehensive view of how such neural signals form spatiotemporal patterns that allow odor discrimination. Finally, in piriform cortex, there is an emerging view of how sparse neural representation enables formation of stable olfactory objects. However, despite this good understanding of olfaction at the genetic, molecular, and cellular levels, we have only poor understanding of structure–function relations in this system (Mainen, 2006). Put simply, there is not a scientist or perfumer in the world who can look at a novel molecule and predict its odor, or smell a novel smell and predict its structure.
One reason for this state of affairs is that the olfactory stimulus, namely, a chemical, has typically been viewed as it would be by chemists. For example, carbon chain length has been the most consistently studied odorant property, yet there is no clear importance for carbon chain length in mammalian olfactory behavior (Boesveldt, Olsson, & Lundstrom, 2010). Indeed, as elegantly stated by the late Larry Katz at a lecture he gave at the Association for Chemoreception Science: “The olfactory system did not evolve to decode the catalogue of Sigma-Aldrich, it evolved to decode the world around us.” In other words, perhaps if we reexamine the olfactory stimulus space from a perceptual rather than a chemical perspective, we may gain important insight into the function of the olfactory system. It is with this notion in mind that we have recently generated an olfactory perceptual metric, and tested its application to perception and neural activity in the olfactory system.
In an effort led by Rehan Khan (Khan et al., 2007), we constructed a perceptual “odor space” using data from Dravnieks’ Atlas of Odor Character Profiles, wherein about 150 experts (perfumers and olfactory scientists) rated (from 0 to 5, reflecting “absent” to “extremely” representative) 160 odorants (144 monomolecular species and 16 mixtures) against each of the 146 verbal descriptors (Dravnieks, 1982, 1985). We applied principal components analysis (PCA), a well-established method for dimension reduction that generates a new set of dimensions (principal components, or PCs) for the profile space in which (1) each successive dimension has the maximal possible variance and (2) all dimensions are uncorrelated. We found that the effective dimensionality of the odor profile space was much smaller than 146, with the first four PCs accounting for 54 percent of the variance (Figure 6.9A). To generate a perceptual odor space, we projected the odorants onto a subspace formed by these first four PCs (Figure 6.9B; a navigable version of this space is available at the odor space link at http://www.weizmann.ac.il/neurobiology/worg). In a series of experiments, we found that this space formed a valid representation of odorant perception: Simple Euclidean distances in the space predicted both explicit (Figure 6.9C) and implicit (Figure 6.9D) odor similarity. In other words, odorants close in the space smell similar, and odorants far-separated in the space smell dissimilar (Khan et al., 2007). Moreover, we found that the primary dimension in the space (PC1) was tightly linked to odorant pleasantness, that is, a continuum ranging from very unpleasant at one end to very pleasant at the other (Haddad et al., 2010; Figure 6.10).
Finding that pleasantness was the primary dimension of human olfactory perception was consistent with many previous efforts. Odorant pleasantness was the primary aspect of odor spontaneously used by subjects in olfactory discrimination tasks (S. S. Schiffman, 1974), and odorant pleasantness was the primary criterion spontaneously used by subjects in order to combine odorants into groups (Berglund, Berglund, Engen, & Ekman, 1973; S. Schiffman, Robinson, & Erickson, 1977). When using large numbers of verbal descriptors in order to describe odorants, pleasantness repeatedly emerged as the primary dimension in multidimensional analyses of the resultant descriptor space (Khan et al., 2007; Moskowitz & Barbe, 1977). Studies with newborns suggested that at least some aspects of olfactory pleasantness are innate (Soussignan, Schaal, Marlier, & Jiang, 1997; Steiner, 1979). For example, neonates’ behavioral markers of disgust (nose wrinkling, upper lip raising) discriminated between vanillin judged as being pleasant and butyric acid judged to be unpleasant by adult raters (Soussignan et al., 1997). Moreover, there is agreement in the assessments of pleasantness by adults and children for various pure odorants (Schmidt & Beauchamp, 1988) and personal odors (Mallet & Schaal, 1998).
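The construction of the perceptual odor space described above (PCA on an odorant-by-descriptor matrix, projection onto the first four PCs, and Euclidean distances between odorants) can be sketched as follows. This is a minimal illustration under stated assumptions, not the published analysis pipeline: the ratings matrix is synthetic random data standing in for the atlas profiles, its size (144 odorants by 146 descriptors) is taken from the text, and the use of a singular value decomposition to compute the PCA is a choice made here.

```python
import numpy as np

rng = np.random.default_rng(1)
n_odorants, n_descriptors = 144, 146

# Synthetic placeholder data with a few latent dimensions plus noise.
# These numbers are NOT the atlas ratings; they only give the code something to run on.
latent = rng.normal(size=(n_odorants, 4))
loadings = rng.normal(size=(4, n_descriptors))
ratings = latent @ loadings + 0.5 * rng.normal(size=(n_odorants, n_descriptors))

# PCA via singular value decomposition of the column-centered matrix.
centered = ratings - ratings.mean(axis=0)
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)

# Proportion of variance per PC and its running total (cf. the two curves in Figure 6.9A).
explained = singular_values**2 / np.sum(singular_values**2)
cumulative = np.cumsum(explained)

# Project each odorant onto the first four PCs to obtain its coordinates in the space.
n_keep = 4
coords = centered @ components[:n_keep].T  # shape (144, 4)

# Pairwise Euclidean distances between odorants in the reduced space.
diffs = coords[:, None, :] - coords[None, :, :]
distances = np.sqrt((diffs**2).sum(axis=-1))

print(cumulative[n_keep - 1])  # fraction of variance captured by the first four PCs
```

In such a space, the distance `distances[i, j]` would be the quantity compared against rated similarity or reaction time for odorants `i` and `j`.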


Figure 6.9 Olfactory perceptual space. A, The proportion of (descending line) and cumulative (ascending line) variance in perceptual descriptions explained by each of the principal components (PCs). B, The 144 odorants projected into a two-dimensional space made of the first and second PCs. Nine odorants used in experiments depicted in C and D: [acetophenone (AC), amyl acetate (AA), diphenyl oxide (DP), ethyl butyrate (EB), eugenol (EU), guaiacol (GU), heptanal (HP), hexanoic acid (HX), and phenyl ethanol (PEA)]. C, For the nine odorants, the correlation between explicit perceived similarity ratings and PCA-based distance for all pairwise comparisons. Odorants closer in the perceptual space were perceived as more similar. D, Reaction time for correct trials in a forced-choice same–different task using five of the nine odorants. Error bars reflect SE. The reaction time was longer for odorant pairs that were closer in PCA-based space, thus providing an implicit validation of the perceptual space. Reprinted with permission from Khan et al., 2007.

Figure 6.10 Identifying pleasantness as the first PC of perception. A, The five descriptors that flanked each end of PC1 of perception. B, For the nine odorants in Figure 6.9, the correlation between the pairwise difference in pleasantness and the pairwise distance along the first PC. Distance along the first PC was a strong predictor of difference in pleasantness. Reprinted with permission from Khan et al., 2007.


After using PCA to reduce the apparent dimensionality of olfactory perception, we set out to independently apply the same approach to odorant structure. We used structural chemistry software to obtain 1,514 physicochemical descriptors for each of 1,565 odorants. These descriptors were of many types (e.g., atom counts, functional group counts, counts of types of bonds, molecular weights, topological descriptors). We applied PCA to these data and found that much of the variance could be explained by a relatively small number of PCs. The first PC accounted for about 32 percent of the variance, and the first ten accounted for about 70 percent of the variance.

Figure 6.11 Relating physicochemical space to perceptual space. A, The correlation between the first to fourth (descending in the figure) perceptual PCs and each of the first seven physicochemical PCs for the 144 odorants. Error bars reflect the SE from 1,000 bootstrap replicates. The best correlation was between the first PC of perception and the first PC of physicochemical space. This correlation was significantly larger than all other correlations. B, For the 144 odorants, the correlation between their actual first perceptual PC value and the value our model predicted from their physicochemical data. Reprinted with permission from Khan et al., 2007.

Because we separately generated PC spaces for perception and structure, we could then ask whether these two spaces were related in any way. In other words, we tested for a correlation between perceptual PCs and physicochemical PCs. Strikingly, the strongest correlation was between the first perceptual PC and the first physicochemical PC (Figure 6.11A). In other words, there was a privileged relationship between PC1 of perception and PC1 of physicochemical organization. The single best axis for explaining the variance in the physicochemical data was the best predictor of the single best axis for explaining the variance in the perceptual data.
Having established that the physicochemical space is related to the perceptual space, we next built a linear predictive model through a cross-validation procedure that allowed us to predict odor perception from odorant structure (Figure 6.11B). To test the predictive power of our model, we obtained physicochemical parameters for 52 odorants commonly used in olfaction experiments, but not present in the set of 144 used in the model building. We applied our model to the 52 new molecules so that for each we had predicted values for the first PC of perceptual space. We found that using these PC values, we could convincingly predict the rank-order of pleasantness of these molecules (Spearman rank correlation, r = 0.72; p = 0.0004), and modestly yet significantly predict their actual pleasantness ratings (r = 0.55; p = 0.004). Moreover, we obtained similar predictive power across three different cultures: urban Americans in California, rural Muslim Arab Israelis, and urban Jewish Israelis (Figure 6.12).
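The general shape of this procedure (fit a linear model from physicochemical PCs to the first perceptual PC, check it by cross-validation, then score held-out molecules by rank correlation) can be sketched as below. Everything specific here is an assumption made for illustration: the data are synthetic, the use of seven physicochemical PCs as predictors, ordinary least squares, and six folds are choices of this sketch, and the helper names `fit_linear`, `predict_linear`, and `spearman` are hypothetical. The text does not specify the regression method or the cross-validation scheme that was actually used.

```python
import numpy as np


def fit_linear(x, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_linear(coef, x):
    """Apply fitted coefficients (intercept first) to new predictors."""
    return np.column_stack([np.ones(len(x)), x]) @ coef


def spearman(a, b):
    """Spearman rank correlation for tie-free data: Pearson r computed on ranks."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(rank_a, rank_b)[0, 1])


rng = np.random.default_rng(2)
n_train, n_test, n_pcs = 144, 52, 7          # set sizes from the text; n_pcs is assumed
true_coef = np.linspace(1.0, 0.1, n_pcs)      # hypothetical ground truth for the toy data

x_train = rng.normal(size=(n_train, n_pcs))   # stand-in for physicochemical PC scores
y_train = x_train @ true_coef + rng.normal(size=n_train)  # stand-in for perceptual PC1
x_test = rng.normal(size=(n_test, n_pcs))
y_test = x_test @ true_coef + rng.normal(size=n_test)

# k-fold cross-validation on the model-building set: each odorant is predicted
# by a model that never saw it during fitting.
folds = np.array_split(rng.permutation(n_train), 6)
cv_pred = np.empty(n_train)
for held_out in folds:
    keep = np.setdiff1d(np.arange(n_train), held_out)
    coef = fit_linear(x_train[keep], y_train[keep])
    cv_pred[held_out] = predict_linear(coef, x_train[held_out])
cv_r = float(np.corrcoef(cv_pred, y_train)[0, 1])

# Final model fitted on all training odorants, then applied to the new molecules.
final_coef = fit_linear(x_train, y_train)
test_pred = predict_linear(final_coef, x_test)
rank_r = spearman(test_pred, y_test)
print(cv_r, rank_r)
```

Rank correlation is used for the held-out set because the question posed in the text is whether the predicted values recover the ordering of molecules by pleasantness, which does not require the predicted and rated scales to match.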

Figure 6.12 Predicting odorant pleasantness across cultures. Twenty-seven odorous molecules not commonly used in olfactory studies, and not previously tested by us, were presented to three cultural groups of naïve subjects: urban Americans (23 subjects), rural Arab Israelis (22 subjects), and urban Jewish Israelis (20 subjects). Reprinted with permission from Khan et al., 2007.

An aspect of these results that has been viewed as challenging by many is that they imply that pleasantness is written into the molecular structure of odorants and is therefore by definition innate. This can be viewed as inconsistent with the high levels of cross-individual and cross-cultural variability in odor perception (Ayabe-Kanamura et al., 1998; Wysocki, Pierce, & Gilbert, 1991). We indeed think that odor pleasantness is hard-wired and innate. Consistent with this, many odors have clear hedonic value despite no previous experience or exposure (Soussignan et al., 1997; Steiner, 1979), and moreover, the metric that links this hedonic value with odorant structure (PC1 of structure) predicts responses across species (Mandairon, Poncelet, Bensafi, & Didier, 2009). Nevertheless, we stress that an innate hard-wired link remains highly susceptible to the influences of learning, experience, and context. For example, no one would dispute that perceived color is innately and hard-wire-linked to wavelength. However, a given wavelength can be perceived to have very different colors as a function of context (see striking online demonstrations at http://www.purveslab.net/seeforyourself/). Moreover, no one would dispute that location in space is reflected in location on the retina in an innate and hard-wired fashion. Nevertheless, context can alter spatial perception, as clearly evident in the Müller-Lyer illusion. Similarly, we argue that odor pleasantness is hard-wire-linked to (PC1 of) odorant structure, yet this link is clearly modified by learning, experience, and context.
Olfaction is often described as multidimensional. It is made of a multidimensional stimulus, which is transduced by about 1,000 different receptor types, giving rise to a neural image of similarly high dimensionality. Yet our results suggested that very few dimensions, in fact primarily one, capture a significant portion of the variance in olfactory perception, and critically, this one dimension allows for modest yet accurate predictions of odor perception from odorant structure. With this in mind, in an effort led by Rafi Haddad (Haddad, Khan, et al., 2008; Haddad, Lapid, Harel, & Sobel, 2008; Haddad et al., 2010), we set out to ask whether this reduced dimensionality was reflected in any way in neural activity. We mined all available previously published data sets that reported the neural response in a sizable number of receptor types or glomeruli to a sizable number of odorants. This yielded 12 data sets obtained using either electrical or optical recording methods. Once again, we applied PCA to these data. The first two PCs alone explained about 58 percent of the variance in the neural activity data. Moreover, in nine of the twelve data sets we analyzed, we found a strong correlation between PC1 of neural response space and the summed activity of the sampled population, whether spike rates or optical signal, with r values ranging between 0.73 and 0.98 (all p < 0.001). Considering that the summed response in the olfactory system of insects was previously identified as strongly predictive of insect approach or withdrawal (Kreher, Mathew, Kim, & Carlson, 2008) (Figure 6.13A), we set out here to ask whether PC1 of neural activity in the mammalian olfactory system was similarly related to behavior and perception.
One of the datasets we studied was that of Saito et al. (2009), who reported the neural response of ten human receptors and fifty-three mouse receptors in vitro to a set of sixty-two odorants. We asked eighteen human subjects to rate the pleasantness of twenty-six odorants randomly selected from those tested by Saito et al. (2009). The correlation between human receptor PC1 and odorant pleasantness was 0.49 (p < 0.009), and if we added the mouse receptor response, it was 0.71 (p < 0.0001) (Figure 6.13B). To reiterate this critical result, PC1 of odorant-induced neural activity measured in a dish by one group at Duke University in the United States was a significant predictor of odorant pleasantness, as estimated by human subjects tested by a different group at the Weizmann Institute in Israel.

Finally, here we also conducted an initial investigation into the second principal component of activity. Given that PC1 of neural activity reflected approach or withdrawal in animals, we speculated that once an odor is approached, a second decision to be made regarding it is whether it is edible or poisonous. Consistent with this prediction, we found significant correlations between PC2 of neural activity and odorant toxicity in mice and in rats (Figure 6.13C), as well as a significant correlation between toxicity/edibility and PC2 of perception in humans (Figure 6.13D). Similar findings have been obtained by others independently (Zarzo, 2008).

To conclude this section, we found that if one uses the human nose as a window onto olfaction, one obtains a surprisingly simplified picture that explains a significant portion of the variance in both neural activity and perception in this system. This picture relied on a set of simple linear transforms. It suggested that the primary axis of perception was linked to the primary axis of odorant structure, and that both of these were in turn related to the primary axis of neural activity in this system. Moreover, the second axis of perception was linked to the second axis of neural activity. Critically, these transforms allowed for modest but significant predictions of perception, structure, and neural activity across species.
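The kind of analysis described in this section can be illustrated with a minimal sketch. It is not the authors' code or data: the response matrix below is synthetic, and the names `pca_scores`, `gain`, and `sensitivity` are hypothetical. Under the assumption that each odorant drives all receptor types with a shared overall gain plus noise, the sketch applies PCA to an odorants-by-receptors matrix and correlates the PC1 scores with the summed population response.

```python
import numpy as np


def pca_scores(responses, n_components=2):
    """Return PC scores and explained-variance fractions using an SVD."""
    # Center each receptor (column) so PCA captures variance across odorants.
    centered = responses - responses.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Synthetic stand-in for a published dataset (hypothetical values):
# rows are odorants, columns are receptor types or glomeruli.
rng = np.random.default_rng(0)
n_odorants, n_receptors = 60, 20
gain = rng.gamma(shape=2.0, scale=1.0, size=(n_odorants, 1))
sensitivity = rng.uniform(0.5, 1.5, size=(1, n_receptors))
noise = rng.normal(0.0, 0.3, size=(n_odorants, n_receptors))
responses = gain * sensitivity + noise

scores, explained = pca_scores(responses)

# Summed activity of the sampled population, one value per odorant.
summed = responses.sum(axis=1)

# Pearson correlation between PC1 scores and summed activity.
r = np.corrcoef(scores[:, 0], summed)[0, 1]

print("variance explained by PC1 and PC2:", explained)
print("|r| between PC1 and summed activity:", abs(r))
```

The sign of a principal component is arbitrary, so only the magnitude of the correlation is meaningful here. With real recordings, the strength of this relationship is an empirical question rather than something built in by construction, as it is in this synthetic example.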

Looking at Human Behavior Through the Nose

Figure 6.13 The principal axes of neural space reflected olfactory behavior and perception. A, Correlation between PC1 of neural population activity and the odor preferences of Drosophila larvae. Every dot represents a single odor. B, Correlation between PC1 of neural space in humans and mice with human odor pleasantness. Every dot represents a single odor. C, Correlation between PC2 of neural population activity and oral toxicity for rats (LD50 values in mg/kg). Every dot represents an odor. D, Correlation between PC2 of human perceptual space and LD50 values of rats. Reprinted with permission from Haddad et al., 2010.

In the previous section, the human nose taught us about the mammalian olfactory system. This was possible because, in contrast to popular notions, the human nose is an astonishingly acute device. This is evident in unusually keen powers of detection and discrimination, which in some cases compete with those of macrosmatic mammals, or with those of sophisticated analytical equipment. These abilities have been detailed within recent reviews (Sela & Sobel, 2010; Shepherd, 2004, 2005; Stevenson, 2010; Yeshurun & Sobel, 2010; Zelano & Sobel, 2005). Here, we highlight key cases in which these keen olfactory abilities clearly influence human behavior.

As noted in the introduction, two aspects of human behavior that are, in our view, macrosmatic, are eating and mating: Notably, both are critical for survival. A third human behavior for which olfactory influences are less clearly apparent, yet in our view are nevertheless critical, is social interaction. It is beyond the scope of this chapter to provide a comprehensive review of human chemosignaling, as recently done elsewhere (Stevenson, 2010). Here, we selectively choose examples to highlight the role of olfaction in human behavior.

Eating

We eat what tastes good (Drewnowski, 1997). Taste, or more accurately flavor, is dominated by smell (Small, Jones-Gotman, Zatorre, Petrides, & Evans, 1997). Hence, things taste good because they smell good (Letarte, 1997). In other words, by determining the palatability and hedonic value of food, olfaction influences the balance of food intake (Rolls, 2006; Saper, Chou, & Elmquist, 2002; Yeomans, 2006). In addition to this very simple basic premise, there are also several direct and indirect lines of evidence that highlight the significance of olfaction in eating behavior. For example, olfaction drives salivation even at subthreshold odor concentrations (Pangborn & Berggren, 1973; Rogers & Hill, 1989). Odors regulate appetite (Rogers & Hill, 1989) and affect the cephalic phase of insulin secretion (W. G. Johnson & Wildman, 1983; Louis-Sylvestre & Le Magnen, 1980) and gastric acid secretion (Feldman & Richardson, 1986).

The interaction between olfaction and eating is bidirectional. Olfaction influences eating, and eating behavior and mechanisms influence olfaction. The nature of this influence, however, remains controversial. For example, whereas some studies suggest that hunger increases olfactory sensitivity to food odors (Guild, 1956; Hammer, 1951; Schneider & Wolf, 1955; Stafford & Welbeck, 2010), others failed to replicate these results (Janowitz & Grossman, 1949; Zilstorff-Pedersen, 1955), or even found the opposite, namely higher sensitivity in satiety (Albrecht et al., 2009). Hunger and satiety influence not only sensitivity but also hedonics: Odors of foods consumed to satiety become less pleasant (Albrecht et al., 2009; Rolls & Rolls, 1997). This satiety-driven shift in hedonic value is accompanied by altered brain representation. This was uncovered in an elegant human brain-imaging study in which eating bananas to satiety changed the representation of banana odor in the orbitofrontal cortex (O’Doherty et al., 2000). Also, an odor encoded during inactivation of taste cortex in rats was later remembered as the same only during similar taste-cortex inactivation (Fortis-Santiago, Rodwin, Neseliler, Piette, & Katz, 2009). The mechanism for these shifted representations may be evident at the earliest stages of olfactory processing: Perfusion of the eating-related hormones insulin and leptin onto olfactory receptor neurons in rats significantly increased spontaneous firing frequency in the absence of odors and decreased odorant-induced peak amplitude in response to food odors (Ketterer et al., 2010; Savigner et al., 2009). Therefore, by increasing spontaneous activity but reducing odorant-induced activity of olfactory receptor neurons, elevated levels of insulin and leptin (such as after a meal) may result in a decreased global signal-to-noise ratio in the olfactory epithelium (Ketterer et al., 2010; Savigner et al., 2009).
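The signal-to-noise argument can be made concrete with a small worked sketch. The firing values below are hypothetical, chosen only for illustration, and treating signal-to-noise as the ratio of odorant-evoked peak response to spontaneous firing is a simplifying assumption rather than a measure taken from the cited studies.

```python
def signal_to_noise(evoked_peak, spontaneous_rate):
    """Ratio of odorant-evoked peak response to spontaneous firing."""
    if spontaneous_rate <= 0:
        raise ValueError("spontaneous_rate must be positive")
    return evoked_peak / spontaneous_rate


# Hypothetical values in arbitrary units, not measured data.
# Before a meal: lower spontaneous firing, larger evoked response.
snr_fasted = signal_to_noise(evoked_peak=10.0, spontaneous_rate=2.0)

# After a meal (elevated insulin and leptin): spontaneous firing rises
# while the evoked peak falls, so the ratio drops on both counts.
snr_fed = signal_to_noise(evoked_peak=8.0, spontaneous_rate=4.0)

print("fasted:", snr_fasted, "fed:", snr_fed)
```

Because the numerator falls while the denominator rises, the two hormonal effects compound rather than offset each other in this simplified picture.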


The importance of olfaction for human eating behavior is clearly evidenced in cases of olfactory loss. Anosmic patients experience distortions in flavor perception (Bonfils, Avan, Faulcon, & Malinvaud, 2005) and changes in eating behavior (Aschenbrenner et al., 2008). Simulating anosmia in healthy subjects by intranasal lidocaine administration resulted in reduced hunger ratings (Greenway et al., 2007). Nevertheless, the rate of abnormal body mass index among anosmic people is no higher than in the general population (Aschenbrenner et al., 2008).

Several eating disorders, ranging from obesity (Hoover, 2010; Obrebowski, Obrebowska-Karsznia, & Gawlinski, 2000; Richardson, Vander Woude, Sudan, Thompson, & Leopold, 2004; Snyder, Duffy, Chapo, Cobbett, & Bartoshuk, 2003) to anorexia (Fedoroff, Stoner, Andersen, Doty, & Rolls, 1995; Roessner, Bleich, Banaschewski, & Rothenberger, 2005), have been associated with alterations in olfactory perception, and the nutritional challenge associated with aging has been clearly linked to the age-related loss of olfaction (Cain & Gent, 1991; Doty, 1989; S. S. Schiffman, 1997). Accordingly, artificially increasing the odorous properties of foods helps overcome the nutritional challenge in aging (Mathey, Siebelink, de Graaf, & Van Staveren, 2001; S. S. Schiffman & Warwick, 1988; Stevens & Lawless, 1981).

Consistent with the bidirectional influences of olfaction and eating behavior, edibility is clearly a key category in odor perception. It was identified as the second principal axis of perception independently by us (Haddad et al., 2010) and others (Zarzo, 2008). Consistent with edibility as an olfactory category, olfactory responses are stronger (Small et al., 2005) and faster (Boesveldt, Frasnelli, Gordon, & Lundstrom, 2010), and identification is more accurate (Fusari & Ballesteros, 2008), for food over nonfood odors. Moreover, whereas humans are poor at spontaneous odor naming, they are very good at spontaneous rating of odor edibility, even in childhood (de Wijk & Cain, 1994a, 1994b). Indeed, olfactory preferences of neonates are influenced by their mother’s food preferences during pregnancy (Schaal, Marlier, & Soussignan, 2000), suggesting that the powerful link between olfactory preferences and eating behavior is formed at the earliest stages of development.

Mating

When explaining their choice of a sexual partner, some may list physical and personality qualities, whereas others may just explain the choice by a “simple click” or “chemistry.” Is this “click” indeed chemical? Although, as noted, humans tend to underestimate their own olfactory abilities, humans can nevertheless use olfaction to discriminate the genetic makeup of potential mating partners. The human genome includes a region called human leukocyte antigen (HLA), which consists of many genes related to the immune system, in addition to olfactory receptor genes and pseudogenes. Several studies have found that women can use smell to discriminate between men as a function of similarity between their own and the men’s HLA alleles (Eggert, Muller-Ruchholtz, & Ferstl, 1998; Jacob, McClintock, Zelano, & Ober, 2002; Ober et al., 1997; Wedekind & Furi, 1997; Wedekind, Seebeck, Bettens, & Paepke, 1995). The “ideal” smell of genetic makeup remains controversial, yet most evidence suggests that women prefer the odor of a man with HLA alleles not identical to their own, but at the same time not too different (Jacob et al., 2002; T. Roberts & Roiser, 2010). In turn, this preference may be for major histocompatibility complex (MHC) heterozygosity rather than dissimilarity (Thornhill et al., 2003). Olfactory mate preference, however, is plastic. For example, single women preferred odors of MHC-similar men, whereas women in relationships preferred odors of MHC-dissimilar men (S. C. Roberts & Little, 2008). Moreover, olfactory mate preferences are influenced by the menstrual cycle (Gangestad & Cousins, 2001; Havlicek, Roberts, & Flegr, 2005; Little, Jones, & Burriss, 2007; Singh & Bronstad, 2001) (Figure 6.14A) and by hormone-based contraceptives (S. C. Roberts, Gosling, Carter, & Petrie, 2008; Wedekind et al., 1995; Wedekind & Furi, 1997).

Finally, although not directly related to mate selection, the clearest case of chemical communication in humans also has clear potential implications for mating behavior. This is the phenomenon of menstrual synchrony, whereby women who live in close proximity, such as roommates in dorms, synchronize their menstrual cycles over time (McClintock, 1971). This effect is mediated by an odor in sweat. This was verified in a series of studies in which experimenters obtained underarm sweat extracts from donor women during either the ovulatory or follicular menstrual phase. These extracts were then deposited on the upper lips of recipient women, where follicular sweat accelerated ovulation, and ovulatory sweat delayed it (Russell, Switz, & Thompson, 1980; Stern & McClintock, 1998) (Figure 6.14B). Moreover, variation in menstrual timing can be increased by the odor of other lactating women (Jacob et al., 2004) or regulated by the odor of male hormones (Cutler et al., 1986; Wysocki & Preti, 2004).

Olfactory influences on mate preferences are not restricted to women. Men can detect an HLA odor different from their own when taken from either men or women odor donors, and rate the similar odor as more pleasant for both sexes (Thornhill et al., 2003; Wedekind & Furi, 1997). In addition, men preferred the scent of common over rare MHC alleles (Thornhill et al., 2003). Moreover, unrelated to HLA similarity, male raters can detect the menstrual phase of female body odor donors. The follicular phase is rated as more pleasant and sexy than the luteal phase (Singh & Bronstad, 2001), an effect that is diminished when the women use hormonal contraceptives (Kuukasjarvi et al., 2004; Thornhill et al., 2003).


Figure 6.14 Human chemosignaling. A, Women’s preference for symmetrical men’s odor as a function of probability of conception, based on actuarial values. Normally ovulating (non-pill-using) women only. Positive regression values reflect increased relative attraction to scent of symmetrical males; r = 0.54, p < 0.005. (From Gangestad & Thornhill, 1998.) B, Change in length of the recipient’s cycle. Cycles were shorter than baseline during exposure to follicular compounds (t = 1.78; p ≤ 0.05, 37 cycles) but longer during exposure to ovulatory compounds (t = 2.7; p ≤ 0.01, 38 cycles). Cycles during exposure to the carrier were not different from baseline (t = 0.05; p ≤ 0.96, 27 cycles). (From Stern & McClintock, 1998.) C, Post-smell testosterone levels (controlling for pre-smell testosterone levels) among men exposed to the odor of a woman close to ovulation, the odor of a woman far from ovulation, or a control odor. Error bars represent standard errors. Reprinted with permission from Miller & Maner, 2010.

These behavioral results are echoed in hormone expression. Men exposed to the scent of an ovulating woman subsequently displayed higher levels of testosterone than did men exposed to the scent of a nonovulating woman or a control scent (Miller & Maner, 2010) (Figure 6.14C). Moreover, a recent study on chemosignals in human tears revealed a host of influences on sexual arousal (Gelstein et al., 2011). Sniffing negative-emotion-related odorless tears obtained from women donors induced reductions in sexual appeal attributed by men to pictures of women’s faces. Sniffing tears also reduced self-rated sexual arousal, reduced physiological measures of arousal, and reduced levels of testosterone. Finally, fMRI revealed that sniffing women’s tears selectively reduced activity in brain substrates of sexual arousal in men (Gelstein et al., 2011) (Figure 6.15).


Social Interaction

Figure 6.15 A chemosignal in human tears. Sniffing odorless emotional tears obtained from women donors altered brain activity in the substrates of arousal in men and significantly lowered levels of salivary testosterone.

Whereas olfactory influences on human eating and mating are intuitively obvious, olfactory cues may play into aspects of human social interaction that have been less commonly associated with smell. Many such types of social chemosignaling have been examined (Meredith, 2001), but here we will detail only one particular case that has received more attention than others, and that is the ability of humans to smell fear. Fear or distress chemosignals are prevalent throughout animal species (Hauser et al., 2008; Pageat & Gaultier, 2003). In an initial study in humans, Chen and Haviland-Jones (2000) collected underarm odors on gauze pads from young women and men after they watched funny or frightening movies. They later asked other women and men to determine by smell which was the odor of people when they were “happy” or “afraid.” Women correctly identified happiness in men and women, and fear in men. Men correctly identified happiness in women and fear in men. A similar result was later obtained in a study that examined women only (Ackerl, Atzmueller, & Grammer, 2002). Moreover, women had improved performance in a cognitive verbal task after smelling fear sweat versus neutral sweat (Chen, Katdare, & Lucas, 2006), and the smell of fearful sweat biased women toward interpreting ambiguous expressions as more fearful, but had no effect when the facial emotion was more discernible (Zhou & Chen, 2009). Moreover, subjects had an increased startle reflex when exposed to anxiety-related sweat versus sports-related sweat (Prehn, Ohrt, Sojka, Ferstl, & Pause, 2006). Finally, imaging studies have revealed dissociable brain representations after smelling anxiety sweat versus sports-related sweat (Prehn-Kristensen et al., 2009). These differences are particularly pronounced in the amygdala, a brain substrate common to olfaction, fear responses, and emotional regulation of behavior (Mujica-Parodi et al., 2009). Taken together, this body of research strongly suggests that humans can discriminate the scent of fear from other body odors, and it is likely that this influences behavior. We think that smelling fear or distress is by no means one of the key roles of human chemical communication, yet we have chosen to detail this particular example of human social chemosignaling because it has received increased experimental attention. We think that chemosignaling in fact plays into many aspects of human social interaction, and uncovering these instances of chemosignaling is a major goal for research in our field.

Final Word

We have described the functional neuroanatomy of the mammalian sense of smell. This system is highly conserved (Ache & Young, 2005), and therefore the human sense of smell is not very different from that of other mammals. With this in mind, just as a deep understanding of human visual psychophysics provided the basis for probing vision neurobiology, we propose that a solid understanding of human olfactory psychophysics is a prerequisite to understanding the neurobiological mechanisms of the sense of smell. Moreover, olfaction significantly influences critical human behaviors directly related to survival, such as eating, mating, and social interaction. Better understanding of these olfactory influences is key, in our view, to a comprehensive picture of human behavior.

References

Ache, B. W., & Young, J. M. (2005). Olfaction: Diverse species, conserved principles. Neuron, 48 (3), 417–430.
Ackerl, K., Atzmueller, M., & Grammer, K. (2002). The scent of fear. Neuroendocrinology Letters, 23 (2), 79–84.
Albrecht, J., Schreder, T., Kleemann, A. M., Schopf, V., Kopietz, R., Anzinger, A., et al. (2009). Olfactory detection thresholds and pleasantness of a food-related and a non-food odour in hunger and satiety. Rhinology, 47 (2), 160–165.
Allison, A. (1954). The secondary olfactory areas in the human brain. Journal of Anatomy, 88, 481–488.
Aschenbrenner, K., Hummel, C., Teszmer, K., Krone, F., Ishimaru, T., Seo, H. S., et al. (2008). The influence of olfactory loss on dietary behaviors. Laryngoscope, 118 (1), 135–144.
Ayabe-Kanamura, S., Schicker, I., Laska, M., Hudson, R., Distel, H., Kobayakawa, T., et al. (1998). Differences in perception of everyday odors: A Japanese-German cross-cultural study. Chemical Senses, 23 (1), 31–38.
Barkai, E., & Hasselmo, M. E. (1994). Modulation of the input/output function of rat piriform cortex pyramidal cells. Journal of Neurophysiology, 72 (2), 644.

Barnes, D. C., Hofacer, R. D., Zaman, A. R., Rennaker, R. L., & Wilson, D. A. (2008). Olfactory perceptual stability and discrimination. Nature Neuroscience, 11 (12), 1378–1380.
Bathellier, B., Buhl, D. L., Accolla, R., & Carleton, A. (2008). Dynamic ensemble odor coding in the mammalian olfactory bulb: Sensory information at different timescales. Neuron, 57 (4), 586–598.
Bender, G., Hummel, T., Negoias, S., & Small, D. M. (2009). Separate signals for orthonasal vs. retronasal perception of food but not nonfood odors. Behavioral Neuroscience, 123 (3), 481–489.
Bensafi, M., Porter, J., Pouliot, S., Mainland, J., Johnson, B., Zelano, C., et al. (2003). Olfactomotor activity during imagery mimics that during perception. Nature Neuroscience, 6 (11), 1142–1144.
Berglund, B., Berglund, U., Engen, T., & Ekman, G. (1973). Multidimensional analysis of 21 odors. Scandinavian Journal of Psychology, 14 (2), 131–137.
Boesveldt, S., Frasnelli, J., Gordon, A. R., & Lundstrom, J. N. (2010). The fish is bad: Negative food odors elicit faster and more accurate reactions than other odors. Biological Psychology, 84 (2), 313–317.
Boesveldt, S., Olsson, M. J., & Lundstrom, J. N. (2010). Carbon chain length and the stimulus problem in olfaction. Behavioral Brain Research, 215 (1), 110–113.
Bonfils, P., Avan, P., Faulcon, P., & Malinvaud, D. (2005). Distorted odorant perception: Analysis of a series of 56 patients with parosmia. Archives of Otolaryngology–Head and Neck Surgery, 131 (2), 107–112.
Breer, H., Fleischer, J., & Strotmann, J. (2006). The sense of smell: Multiple olfactory subsystems. Cellular and Molecular Life Sciences, 63 (13), 1465–1475.
Brunjes, P. C., Illig, K. R., & Meyer, E. A. (2005). A field guide to the anterior olfactory nucleus (cortex). Brain Research, Brain Research Reviews, 50 (2), 305–335.
Buck, L. B. (2000). The molecular architecture of odor and pheromone sensing in mammals. Cell, 100 (6), 611–618.
Buck, L., & Axel, R. (1991). A novel multigene family may encode odorant receptors: A molecular basis for odor recognition. Cell, 65 (1), 175–187.
Cain, W. S., & Gent, J. F. (1991). Olfactory sensitivity: Reliability, generality, and association with aging. Journal of Experimental Psychology: Human Perception and Performance, 17 (2), 382–391.
Carmichael, S. T., Clugnet, M. C., & Price, J. L. (1994). Central olfactory connections in the macaque monkey. Journal of Comparative Neurology, 346 (3), 403–434.

Chen, D., & Haviland-Jones, J. (2000). Human olfactory communication of emotion. Perceptual and Motor Skills, 91 (3 Pt 1), 771.
Chen, D., Katdare, A., & Lucas, N. (2006). Chemosignals of fear enhance cognitive performance in humans. Chemical Senses, 31 (5), 415.
Cleland, T. A., & Sullivan, R. M. (2003). Central olfactory structures. In R. L. Doty (Ed.), Handbook of olfaction and gustation (2nd ed., pp. 165–180). New York: Marcel Dekker.
Cohen, Y., Reuveni, I., Barkai, E., & Maroun, M. (2008). Olfactory learning-induced long-lasting enhancement of descending and ascending synaptic transmission to the piriform cortex. Journal of Neuroscience, 28 (26), 6664.
Cross, D. J., Flexman, J. A., Anzai, Y., Morrow, T. J., Maravilla, K. R., & Minoshima, S. (2006). In vivo imaging of functional disruption, recovery and alteration in rat olfactory circuitry after lesion. NeuroImage, 32 (3), 1265–1272.
Cutler, W. B., Preti, G., Krieger, A., Huggins, G. R., Garcia, C. R., & Lawley, H. J. (1986). Human axillary secretions influence women’s menstrual cycles: The role of donor extract from men. Hormones and Behavior, 20 (4), 463–473.
de Olmos, J., Hardy, H., & Heimer, L. (1978). The afferent connections of the main and the accessory olfactory bulb formations in the rat: An experimental HRP-study. Journal of Comparative Neurology, 15 (181), 213–244.
de Wijk, R. A., & Cain, W. S. (1994a). Odor identification by name and by edibility: Lifespan development and safety. Human Factors, 36 (1), 182–187.
de Wijk, R. A., & Cain, W. S. (1994b). Odor quality: Discrimination versus free and cued identification. Perception and Psychophysics, 56 (1), 12–18.
DeMaria, S., & Ngai, J. (2010). The cell biology of smell. Journal of Cell Biology, 191 (3), 443–452.
Doty, R. L. (1989). Influence of age and age-related diseases on olfactory function. Annals of the New York Academy of Sciences, 561, 76–86.
Dravnieks, A. (1982). Odor quality: Semantically generated multi-dimensional profiles are stable. Science, 218, 799–801.
Dravnieks, A. (1985). Atlas of odor character profiles. Philadelphia: ASTM Press.
Drewnowski, A. (1997). Taste preferences and food intake. Annual Review of Nutrition, 17 (1), 237–253.
Eggert, F., Muller-Ruchholtz, W., & Ferstl, R. (1998). Olfactory cues associated with the major histocompatibility complex. Genetica, 104 (3), 191–197.

Fedoroff, I. C., Stoner, S. A., Andersen, A. E., Doty, R. L., & Rolls, B. J. (1995). Olfactory dysfunction in anorexia and bulimia nervosa. International Journal of Eating Disorders, 18 (1), 71–77.
Feinstein, P., & Mombaerts, P. (2004). A contextual model for axonal sorting into glomeruli in the mouse olfactory system. Cell, 117 (6), 817–831.
Feldman, M., & Richardson, C. T. (1986). Role of thought, sight, smell, and taste of food in the cephalic phase of gastric acid secretion in humans. Gastroenterology, 90 (2), 428–433.
Ferrero, D. M., & Liberles, S. D. (2010). The secret codes of mammalian scents. Wiley Interdisciplinary Reviews: Systems Biology and Medicine, 2 (1), 23–33.
Firestein, S. (2001). How the olfactory system makes sense of scents. Nature, 413 (6852), 211–218.
Fortis-Santiago, Y., Rodwin, B. A., Neseliler, S., Piette, C. E., & Katz, D. B. (2009). State dependence of olfactory perception as a function of taste cortical inactivation. Nature Neuroscience, 13 (2), 158–159.
Franco, M. I., Turin, L., Mershin, A., & Skoulakis, E. M. (2011). Molecular vibration-sensing component in Drosophila melanogaster olfaction. Proceedings of the National Academy of Sciences U S A, 108 (9), 3797–3802.
Frasnelli, J., Lundstrom, J. N., Boyle, J. A., Katsarkas, A., & Jones-Gotman, M. (2011). The vomeronasal organ is not involved in the perception of endogenous odors. Human Brain Mapping, 32 (3), 450–460.
Fusari, A., & Ballesteros, S. (2008). Identification of odors of edible and nonedible stimuli as affected by age and gender. Behavior Research Methods, 40 (3), 752.
Gangestad, S. W., & Cousins, A. J. (2001). Adaptive design, female mate preferences, and shifts across the menstrual cycle. Annual Review of Sex Research, 12, 145–185.
Gangestad, S. W., & Thornhill, R. (1998). Menstrual cycle variation in women’s preferences for the scent of symmetrical men. Proceedings of the Royal Society of London. B. Biological Sciences, 265 (1399), 927–933.
Gelstein, S., Yeshurun, Y., Rozenkrantz, L., Shushan, S., Frumin, I., Roth, Y., et al. (2011). Human tears contain a chemosignal. Science, 331 (6014), 226–230.
Gilad, Y., & Lancet, D. (2003). Population differences in the human functional olfactory repertoire. Molecular Biology and Evolution, 20 (3), 307–314.
Goldman, A. L., Van der Goes van Naters, W., Lessing, D., Warr, C. G., & Carlson, J. R. (2005). Coexpression of two functional odor receptors in one neuron. Neuron, 45 (5), 661–666.

Gottfried, J. A. (2010). Central mechanisms of odour object perception. Nature Reviews, Neuroscience, 11 (9), 628–641.
Gottfried, J. A., Winston, J. S., & Dolan, R. J. (2006). Dissociable codes of odor quality and odorant structure in human piriform cortex. Neuron, 49 (3), 467–479.
Gottfried, J. A., & Wu, K. N. (2009). Perceptual and neural pliability of odor objects. Annals of the New York Academy of Sciences, 1170, 324–332.
Graziadei, P. P., & Monti Graziadei, A. G. (1983). Regeneration in the olfactory system of vertebrates. American Journal of Otolaryngology, 4 (4), 228–233.
Greenway, F. L., Martin, C. K., Gupta, A. K., Cruickshank, S., Whitehouse, J., DeYoung, L., et al. (2007). Using intranasal lidocaine to reduce food intake. International Journal of Obesity (London), 31 (5), 858–863.
Grosmaitre, X., Fuss, S. H., Lee, A. C., Adipietro, K. A., Matsunami, H., Mombaerts, P., et al. (2009). SR1, a mouse odorant receptor with an unusually broad response profile. Journal of Neuroscience, 29 (46), 14545–14552.
Grosmaitre, X., Santarelli, L. C., Tan, J., Luo, M., & Ma, M. (2007). Dual functions of mammalian olfactory sensory neurons as odor detectors and mechanical sensors. Nature Neuroscience, 10 (3), 348–354.
Guild, A. A. (1956). Olfactory acuity in normal and obese human subjects: Diurnal variations and the effect of d-amphetamine sulphate. Journal of Laryngology and Otology, 70 (7), 408–414.
Haberly, L. B. (2001). Parallel-distributed processing in olfactory cortex: New insights from morphological and physiological analysis of neuronal circuitry. Chemical Senses, 26 (5), 551–576.
Haberly, L. B., & Bower, J. M. (1989). Olfactory cortex: Model circuit for study of associative memory? Trends in Neurosciences, 12 (7), 258–264.
Haddad, R., Khan, R., Takahashi, Y. K., Mori, K., Harel, D., & Sobel, N. (2008). A metric for odorant comparison. Nature Methods, 5 (5), 425–429.
Haddad, R., Lapid, H., Harel, D., & Sobel, N. (2008). Measuring smells. Current Opinion in Neurobiology, 18 (4), 438–444.
Haddad, R., Weiss, T., Khan, R., Nadler, B., Mandairon, N., Bensafi, M., et al. (2010). Global features of neural activity in the olfactory system form a parallel code that predicts olfactory behavior and perception. Journal of Neuroscience, 30 (27), 9017–9026.
Hallem, E. A., & Carlson, J. R. (2006). Coding of odors by a receptor repertoire. Cell, 125 (1), 143–160.

Halpern, M. (1987). The organization and function of the vomeronasal system. Annual Review of Neuroscience, 10 (1), 325–362.
Hammer, F. J. (1951). The relation of odor, taste and flicker-fusion thresholds to food intake. Journal of Comparative Physiology and Psychology, 44 (5), 403–411.
Hauser, R., Marczak, M., Karaszewski, B., Wiergowski, M., Kaliszan, M., Penkowski, M., et al. (2008). A preliminary study for identifying olfactory markers of fear in the rat. Laboratory Animals (New York), 37 (2), 76–80.
Havlicek, J., Roberts, S. C., & Flegr, J. (2005). Women’s preference for dominant male odour: Effects of menstrual cycle and relationship status. Biology Letters, 1 (3), 256–259.
Hoover, K. C. (2010). Smell with inspiration: The evolutionary significance of olfaction. American Journal of Physical Anthropology, 143 (Suppl 51), 63–74.
Howard, J. D., Plailly, J., Grueschow, M., Haynes, J. D., & Gottfried, J. A. (2009). Odor quality coding and categorization in human posterior piriform cortex. Nature Neuroscience, 12 (7), 932–938.
Hummel, T. (2000). Assessment of intranasal trigeminal function. International Journal of Psychophysiology, 36 (2), 147–155.
Hummel, T. (2008). Retronasal perception of odors. Chemical Biodiversity, 5 (6), 853–861.
Illig, K. R., & Haberly, L. B. (2003). Odor-evoked activity is spatially distributed in piriform cortex. Journal of Comparative Neurology, 457 (4), 361–373.
Jacob, S., McClintock, M. K., Zelano, B., & Ober, C. (2002). Paternally inherited HLA alleles are associated with women’s choice of male odor. Nature Genetics, 30 (2), 175–179.
Jacob, S., Spencer, N. A., Bullivant, S. B., Sellergren, S. A., Mennella, J. A., & McClintock, M. K. (2004). Effects of breastfeeding chemosignals on the human menstrual cycle. Human Reproduction, 19 (2), 422–429.
Janowitz, H. D., & Grossman, M. I. (1949). Gustoolfactory thresholds in relation to appetite and hunger sensations. Journal of Applied Physiology, 2 (4), 217–222.
Johnson, B. A., Ong, J., & Leon, M. (2010). Glomerular activity patterns evoked by natural odor objects in the rat olfactory bulb are related to patterns evoked by major odorant components. Journal of Comparative Neurology, 518 (9), 1542–1555.
Johnson, B. N., Mainland, J. D., & Sobel, N. (2003). Rapid olfactory processing implicates subcortical control of an olfactomotor system. Journal of Neurophysiology, 90 (2), 1084–1094.
Johnson, D. M. G., Illig, K. R., Behan, M., & Haberly, L. B. (2000). New features of connectivity in piriform cortex visualized by intracellular injection of pyramidal cells suggest that “primary” olfactory cortex functions like “association” cortex in other sensory systems. Journal of Neuroscience, 20 (18), 6974.
Johnson, W. G., & Wildman, H. E. (1983). Influence of external and covert food stimuli on insulin secretion in obese and normal persons. Behavioral Neuroscience, 97 (6), 1025–1028.
Kadohisa, M., & Wilson, D. A. (2006). Separate encoding of identity and similarity of complex familiar odors in piriform cortex. Proceedings of the National Academy of Sciences U S A, 103 (41), 15206–15211.
Keller, A., Zhuang, H., Chi, Q., Vosshall, L. B., & Matsunami, H. (2007). Genetic variation in a human odorant receptor alters odour perception. Nature, 449 (7161), 468–472.
Kepecs, A., Uchida, N., & Mainen, Z. F. (2006). The sniff as a unit of olfactory processing. Chemical Senses, 31 (2), 167.
Kepecs, A., Uchida, N., & Mainen, Z. F. (2007). Rapid and precise control of sniffing during olfactory discrimination in rats. Journal of Neurophysiology, 98 (1), 205.
Ketterer, C., Heni, M., Thamer, C., Herzberg-Schafer, S. A., Haring, H. U., & Fritsche, A. (2010). Acute, short-term hyperinsulinemia increases olfactory threshold in healthy subjects. International Journal of Obesity (London), 35 (8), 1135–1138.
Keverne, E. B. (1999). The vomeronasal organ. Science, 286 (5440), 716.
Khan, R., Luk, C., Flinker, A., Aggarwal, A., Lapid, H., Haddad, R., et al. (2007). Predicting odor pleasantness from odorant structure: Pleasantness as a reflection of the physical world. Journal of Neuroscience, 27 (37), 10015–10023.
Kimchi, T., Xu, J., & Dulac, C. (2007). A functional circuit underlying male sexual behaviour in the female mouse brain. Nature, 448 (7157), 1009–1014.
Kreher, S. A., Mathew, D., Kim, J., & Carlson, J. R. (2008). Translation of sensory input into behavioral output via an olfactory system. Neuron, 59 (1), 110–124.
Kuukasjarvi, S., Eriksson, C. J. P., Koskela, E., Mappes, T., Nissinen, K., & Rantala, M. J. (2004). Attractiveness of women’s body odors over the menstrual cycle: The role of oral contraceptives and receiver sex. Behavioral Ecology, 15 (4), 579–584.
Lagier, S., Carleton, A., & Lledo, P. M. (2004). Interplay between local GABAergic interneurons and relay neurons generates gamma oscillations in the rat olfactory bulb. Journal of Neuroscience, 24 (18), 4382.
Laing, D. G. (1983). Natural sniffing gives optimum odour perception for humans. Perception, 12 (2), 99–117.
Laurent, G. (1997). Olfactory processing: Maps, time and codes. Current Opinion in Neurobiology, 7 (4), 547–553.

Looking at the Nose Through Human Behavior, and at Human Behavior Through the Nose Laurent, G. (1999). A systems perspective on early olfactory coding. Science, 286 (5440), 723–728. Laurent, G. (2002). Olfactory network dynamics and the coding of multidimensional sig nals. Nature Reviews, Neuroscience, 3 (11), 884–895. Laurent, G., Wehr, M., & Davidowitz, H. (1996). Temporal representations of odors in an olfactory network. Journal of Neuroscience, 16 (12), 3837–3847. Leon, M., & Johnson, B. A. (2003). Olfactory coding in the mammalian olfactory bulb. Brain Research, Brain Research Reviews, 42 (1), 23–32. Letarte, A. (1997). Similarities and differences in affective and cognitive origins of food likings and dislikes* 1. Appetite, 28 (2), 115–129. Li, W., Lopez, L., Osher, J., Howard, J. D., Parrish, T. B., & Gottfried, J. A. (2010). Right or bitofrontal cortex mediates conscious olfactory perception. Psychological Science, 21 (10), 1454–1463. Linster, C., Henry, L., Kadohisa, M., & Wilson, D. A. (2007). Synaptic adaptation and odor-background segmentation. Neurobiology of Learning and Memory, 87 (3), 352– 360. (p. 108)

Linster, C., Menon, A. V., Singh, C. Y., & Wilson, D. A. (2009). Odor-specific habituation arises from interaction of afferent synaptic adaptation and intrinsic synaptic potentiation in olfactory cortex. Learning and Memory, 16 (7), 452–459. Little, A. C., Jones, B. C., & Burriss, R. P. (2007). Preferences for masculinity in male bod ies change across the menstrual cycle. Hormones and Behavior, 51 (5), 633–639. Louis-Sylvestre, J., & Le Magnen, J. (1980). Palatability and preabsorptive insulin release. Neuroscience and Biobehavioral Reviews, 4 (Suppl 1), 43–46. Ma, M., Grosmaitre, X., Iwema, C. L., Baker, H., Greer, C. A., & Shepherd, G. M. (2003). Olfactory signal transduction in the mouse septal organ. Journal of Neuroscience, 23 (1), 317. Mainen, Z. F. (2006). Behavioral analysis of olfactory coding and computation in rodents. Current Opinion on Neurobiology, 16 (4), 429–434. Mainland, J., Johnson, B. N., Khan, R., Ivry, R. B., & Sobel, N. (2005). Olfactory impair ments in patients with unilateral cerebellar lesions are selective to inputs from the con tralesion nostril. Journal of Neuroscience, 25 (27), 6362–6371. Mainland, J., & Sobel, N. (2006). The sniff is part of the olfactory percept. Chemical Sens es, 31 (2), 181–196. Mallet, P., & Schaal, B. (1998). Rating and recognition of peers’ personal odors by 9-yearold children: an exploratory study. Journal of General Psychology, 125 (1), 47–64. Page 33 of 41

Looking at the Nose Through Human Behavior, and at Human Behavior Through the Nose Malnic, B., Hirono, J., Sato, T., & Buck, L. B. (1999). Combinatorial receptor codes for odors. Cell, 96 (5), 713–723. Mandairon, N., Poncelet, J., Bensafi, M., & Didier, A. (2009). Humans and mice express similar olfactory preferences. PLoS One, 4 (1), e4209. Maresh, A., Rodriguez Gil, D., Whitman, M. C., & Greer, C. A. (2008). Principles of glomerular organization in the human olfactory bulb—implications for odor processing. PLoS One, 3 (7), e2640. Martinez, M. C., Blanco, J., Bullon, M. M., & Agudo, F. J. (1987). Structure of the piriform cortex of the adult rat: A Golgi study. J Hirnforsch, 28 (3), 341–834. Mathey, M. F., Siebelink, E., de Graaf, C., & Van Staveren, W. A. (2001). Flavor enhance ment of food improves dietary intake and nutritional status of elderly nursing home resi dents. Journals of Gerontology. A. Biological Sciences and Medical Sciences, 56 (4), M200–M205. McBride, S. A., & Slotnick, B. (1997). The olfactory thalamocortical system and odor re versal learning examined using an asymmetrical lesion paradigm in rats. Behavioral Neu roscience, 111 (6), 1273. McClintock, M. K. (1971). Menstrual synchrony and suppression. Nature, 229 (5282), 244–245. Meredith, M. (1983). Sensory physiology of pheromone communication. In J. G. Vanden bergh (Ed.), Pheromones and reproduction in mammals (pp. 200–252). New York: Acade mic Press. Meredith, M. (2001). Human vomeronasal organ function: a critical review of best and worst cases. Chemical Senses, 26 (4), 433. Miller, S. L., & Maner, J. K. (2010). Scent of a woman: men’s testosterone responses to ol factory ovulation cues. Psychological Sciences, 21 (2), 276–283. Mombaerts, P., Wang, F., Dulac, C., Chao, S. K., Nemes, A., Mendelsohn, M., et al. (1996). Visualizing an olfactory sensorymap. Cell, 87 (4), 675–686. Monti-Bloch, L., Jennings-White, C., Dolberg, D. S., & Berliner, D. L. (1994). The human vomeronasal system. 
Psychoneuroendocrinology, 19 (5–7), 673–686. Moran, D. T., Rowley, J. C., Jafek, B. W., & Lovell, M. A. (1982). The fine-structure of the olfactory mucosa in man. Journal of Neurocytology, 11 (5), 721–746. Moskowitz, H. R., & Barbe, C. D. (1977). Profiling of odor components and their mixtures. Sensory Processes, 1 (3), 212–226. Moulton, D. G. (1976). Spatial patterning of response to odors in peripheral olfactory sys tem. Physiological Reviews, 56 (3), 578–593. Page 34 of 41

Looking at the Nose Through Human Behavior, and at Human Behavior Through the Nose Mozell, M. M., & Jagodowicz, M. (1973). Chromatographic separation of odorants by the nose: Retention times measured across in vivo olfactory mucosa. Science, 181 (106), 1247–1249. Mujica-Parodi, L. R., Strey, H. H., Frederick, B., Savoy, R., Cox, D., Botanov, Y., et al. (2009). Chemosensory cues to conspecific emotional stress activate amygdala in humans. PLoS One, 4 (7), 113–123. Negoias, S., Visschers, R., Boelrijk, A., & Hummel, T. (2008). New ways to understand aroma perception. Food Chemistry, 108 (4), 1247–1254. Ober, C., Weitkamp, L. R., Cox, N., Dytch, H., Kostyu, D., & Elias, S. (1997). HLA and mate choice in humans. Am J Hum Genet, 61 (3), 497–504. Obrebowski, A., Obrebowska-Karsznia, Z., & Gawlinski, M. (2000). Smell and taste in chil dren with simple obesity. International Journal of Pediatric Otorhinolaryngology, 55 (3), 191–196. O’Doherty, J., Rolls, E. T., Francis, S., Bowtell, R., McGlone, F., Kobal, G., et al. (2000). Sensory-specific satiety-related olfactory activation of the human orbitofrontal cortex. Neuroreport, 11 (4), 893–897. Pageat, P., & Gaultier, E. (2003). Current research in canine and feline pheromones. Vet erinary Clinics of North America, Small Animal Practice, 33 (2), 187–211. Pangborn, R. M., & Berggren, B. (1973). Human parotid secretion in response to pleasant and unpleasant odorants. Psychophysiology, 10 (3), 231–237. Plailly, J., Howard, J. D., Gitelman, D. R., & Gottfried, J. A. (2008). Attention to odor modu lates thalamocortical connectivity in the human brain. Journal of Neuroscience, 28 (20), 5257–5267. Porter, J., Anand, T., Johnson, B., Khan, R. M., & Sobel, N. (2005). Brain mechanisms for extracting spatial information from smell. Neuron, 47 (4), 581–592. Porter, J., Craven, B., Khan, R. M., Chang, S. J., Kang, I., Judkewitz, B., et al. (2007). Mechanisms of scent-tracking in humans. Nature Neuroscience, 10 (1), 27–29. Powell, T. 
P., Cowan, W. M., & Raisman, G. (1965). The central olfactory connexions. Jour nal of Anatomy, 99 (Pt 4), 791. Prehn, A., Ohrt, A., Sojka, B., Ferstl, R., & Pause, B. M. (2006). Chemosensory anxiety sig nals augment the startle reflex in humans. Neuroscience Letters, 394 (2), 127–130. Prehn-Kristensen, A., Wiesner, C., Bergmann, T. O., Wolff, S., Jansen, O., Mehdorn, H. M., et al. (2009). Induction of empathy by the smell of anxiety. PLoS One, 4 (6), e5987.

Page 35 of 41

Looking at the Nose Through Human Behavior, and at Human Behavior Through the Nose Price, J. L. (1973). An autoradiographic study of complementary laminar patterns of termination of afferent fibers to the olfactory cortex. Journal of Comparative Neurology, 150, 87–108. (p. 109)

Price, J. L. (1987). The central olfactory and accessory olfactory systems. In T. E. Finger & W. L. Silver (Eds.), Neurobiology of taste and smell (179–203). New York: Wiley. Price, J. L. (1990). Olfactory system. In G. Paxinos (Ed.), Human nervous system (pp. 979– 1001). San Diego: Academic Press. Ressler, K. J., Sullivan, S. L., & Buck, L. B. (1993). A zonal organization of odorant recep tor gene expression in the olfactory epithelium. Cell, 73 (3), 597–609. Restrepo, D., Arellano, J., Oliva, A. M., Schaefer, M. L., & Lin, W. (2004). Emerging views on the distinct but related roles of the main and accessory olfactory systems in respon siveness to chemosensory signals in mice. Hormones and Behavior, 46 (3), 247–256. Richardson, B. E., Vander Woude, E. A., Sudan, R., Thompson, J. S., & Leopold, D. A. (2004). Altered olfactory acuity in the morbidly obese. Obesity Surgery, 14 (7), 967–969. Rinberg, D., Koulakov, A., & Gelperin, A. (2006). Sparse odor coding in awake behaving mice. Journal of Neuroscience, 26 (34), 8857. Roberts, S. C., Gosling, L. M., Carter, V., & Petrie, M. (2008). MHC-correlated odour pref erences in humans and the use of oral contraceptives. Proceedings of the Royal Society of London. B. Biological Sciences, 275 (1652), 2715–2722. Roberts, S. C., & Little, A. C. (2008). Good genes, complementary genes and human mate preferences. Genetica, 132 (3), 309–321. Roberts, T., & Roiser, J. P. (2010). In the nose of the beholder: are olfactory influences on human mate choice driven by variation in immune system genes or sex hormone levels? Experimental Biology and Medicine (Maywood), 235 (11), 1277–1281. Roessner, V., Bleich, S., Banaschewski, T., & Rothenberger, A. (2005). Olfactory deficits in anorexia nervosa. European Archives of Psychiatry and Clinical Neuroscience, 255 (1), 6– 9. Rogers, P. J., & Hill, A. J. (1989). 
Breakdown of dietary restraint following mere exposure to food stimuli: interrelationships between restraint, hunger, salivation, and food intake. Addictive Behavior, 14 (4), 387–397. Rolls, E. T. (2006). Brain mechanisms underlying flavour and appetite. Philosophical Transactions of the Royal Society of London. B. Biological Sciences, 361 (1471), 1123– 1136. Rolls, E. T., Critchley, H. D., & Treves, A. (1996). Representation of olfactory information in the primate orbitofrontal cortex. Journal of Neurophysiology, 75 (5), 1982–1996. Page 36 of 41

Looking at the Nose Through Human Behavior, and at Human Behavior Through the Nose Rolls, E. T., & Rolls, J. H. (1997). Olfactory sensory-specific satiety in humans. Physiology & Behavior, 61 (3), 461–473. Russell, M. J., Switz, G. M., & Thompson, K. (1980). Olfactory influences on the human menstrual cycle. Pharmacology, Biochemisty, and Behavior, 13 (5), 737–738. Saito, H., Chi, Q., Zhuang, H., Matsunami, H., & Mainland, J. D. (2009). Odor coding by a Mammalian receptor repertoire. Science Signal, 2 (60), ra9. Saper, C. B., Chou, T. C., & Elmquist, J. K. (2002). The need to feed: Homeostatic and he donic control of eating. Neuron, 36 (2), 199–211. Savic, I., & Gulyas, B. (2000). PET shows that odors are processed both ipsilaterally and contralaterally to the stimulated nostril. Neuroreport, 11 (13), 2861–2866. Savigner, A., Duchamp-Viret, P., Grosmaitre, X., Chaput, M., Garcia, S., Ma, M., et al. (2009). Modulation of spontaneous and odorant-evoked activity of rat olfactory sensory neurons by two anorectic peptides, insulin and leptin. Journal of Neurophysiology, 101 (6), 2898–2906. Schaal, B., Marlier, L., & Soussignan, R. (2000). Human foetuses learn odours from their pregnant mother’s diet. Chemical Senses, 25 (6), 729–737. Schiffman, S. S. (1974). Physicochemical correlates of olfactory quality. Science, 185 (146), 112–117. Schiffman, S. S. (1997). Taste and smell losses in normal aging and disease. Journal of the American Medical Association, 278 (16), 1357–1362. Schiffman, S., Robinson, D. E., & Erickson, R. P. (1977). Multidimensional-scaling of odor ants—examination of psychological and physiochemical dimensions. Chemical Senses & Flavour, 2 (3), 375–390. Schiffman, S. S., & Warwick, Z. S. (1988). Flavor enhancement of foods for the elderly can reverse anorexia. Neurobiology of Aging, 9 (1), 24–26. Schmidt, H. J., & Beauchamp, G. K. (1988). Adult-like odor preferences and aversions in three-year-old children. Child Development, 1136–1143. Schneider, R. 
A., & Wolf, S. (1955). Olfactory perception thresholds for citral utilizing a new type olfactorium. Journal of Applied Physiology, 8 (3), 337–342. Schoenfeld, T. A., & Cleland, T. A. (2006). Anatomical contributions to odorant sampling and representation in rodents: zoning in on sniffing behavior. Chemical Senses, 31 (2), 131. Sela, L., Sacher, Y., Serfaty, C., Yeshurun, Y., Soroker, N., & Sobel, N. (2009). Spared and impaired olfactory abilities after thalamic lesions. Journal of Neuroscience, 29 (39), 12059. Page 37 of 41

Looking at the Nose Through Human Behavior, and at Human Behavior Through the Nose Sela, L., & Sobel, N. (2010). Human olfaction: A constant state of change-blindness. Ex perimental Brain Research, 1–17. Shepherd, G. M. (2004). The human sense of smell: Are we better than we think? PLoS Bi ology, 2 (5), E146. Shepherd, G. M. (2005). Outline of a theory of olfactory processing and its relevance to humans. Chemical Senses, 30, I3–I5. Shipley, M. T. (1995). Olfactory system. In G. Paxinos (Ed.), Rat nervous system (2nd ed., pp. 899–928). San Diego: Academic Press. Singh, D., & Bronstad, P. M. (2001). Female body odour is a potential cue to ovulation. Proceedings of the Royal Society of London. B. Biological Sciences, 268 (1469), 797–801. Slotnick, B. M., & Schoonover, F. W. (1993). Olfactory sensitivity of rats with transection of the lateral olfactory tract. Brain Research, 616 (1–2), 132–137. Small, D. M., Gerber, J. C., Mak, Y. E., & Hummel, T. (2005). Differential neural responses evoked by orthonasal versus retronasal odorant perception in humans. Neuron, 47 (4), 593–605. Small, D. M., Jones-Gotman, M., Zatorre, R. J., Petrides, M., & Evans, A. C. (1997). Flavor processing: more than the sum of its parts. Neuroreport, 8 (18), 3913–3917. Snyder, D., Duffy, V., Chapo, A., Cobbett, L., & Bartoshuk, L. (2003). Childhood taste dam age modulates obesity risk: Effects on fat perception and preference. Obesity Research, 11, A147–A147. Sobel, N., Prabhakaran, V., Desmond, J. E., Glover, G. H., Goode, R. L., Sullivan, E. V., et al. (1998). Sniffing and smelling: Separate subsystems in the human olfactory cortex. Na ture, 392 (6673), 282–286. Sobel, N., Prabhakaran, V., Hartley, C. A., Desmond, J. E., Zhao, Z., Glover, G. H., et al. (1998). Odorant-induced and sniff-induced activation in the cerebellum of the hu man. Journal of Neuroscience, 18 (21), 8990–9001. (p. 110)

Soussignan, R., Schaal, B., Marlier, L., & Jiang, T. (1997). Facial and autonomic responses to biological and artificial olfactory stimuli in human neonates: Re-examining early hedo nic discrimination of odors. Physiology & Behavior, 62 (4), 745–758. Spehr, M., & Munger, S. D. (2009). Olfactory receptors: G protein-coupled receptors and beyond. Journal of Neurochemistry, 109 (6), 1570–1583. Spehr, M., Spehr, J., Ukhanov, K., Kelliher, K., Leinders-Zufall, T., & Zufall, F. (2006). Sig naling in the chemosensory systems: Parallel processing of social signals by the mam malian main and accessory olfactory systems. Cellular and Molecular Life Sciences, 63 (13), 1476–1484. Page 38 of 41

Looking at the Nose Through Human Behavior, and at Human Behavior Through the Nose Stafford, L. D., & Welbeck, K. (2010). High hunger state increases olfactory sensitivity to neutral but not food odors. Chemical Senses, 36 (2), 189–198. Steiner, J. E. (1979). Human facial expressions in response to taste and smell stimulation. Advances in Child Development and Behavior, 13, 257–295. Stern, K., & McClintock, M. K. (1998). Regulation of ovulation by human pheromones. Na ture, 392 (6672), 177–179. Stettler, D. D., & Axel, R. (2009). Representations of odor in the piriform cortex. Neuron, 63 (6), 854–864. Stevens, D. A., & Lawless, H. T. (1981). Age-related changes in flavor perception. Appetite, 2 (2), 127–136. Stevenson, R. J. (2010). An initial evaluation of the functions of human olfaction. Chemical Senses, 35 (1), 3. Stoddart, D. M. (1990). The scented ape: The biology and culture of human odour: Cam bridge, UK: Cambridge University Press. Storan, M. J., & Key, B. (2006). Septal organ of Gr¸neberg is part of the olfactory system. Journal of Comparative Neurology, 494 (5), 834–844. Strotmann, J., Wanner, I., Krieger, J., Raming, K., & Breer, H. (1992). Expression of odor ant receptors in spatially restricted subsets of chemosensory neurons. Neuroreport, 3 (12), 1053–1056. Su, C. Y., Menuz, K., & Carlson, J. R. (2009). Olfactory perception: receptors, cells, and circuits. Cell, 139 (1), 45–59. Tanabe, T., Iino, M., Ooshima, Y., & Takagi, S. F. (1974). Olfactory area in prefrontal lobe. Brain Research, 80 (1), 127–130. Tham, W. W. P., Stevenson, R. J., & Miller, L. A. (2010). The role of the mediodorsal thala mic nucleus in human olfaction. Neurocase, 99999 (1), 1–12. Thornhill, R., Gangestad, S. W., Miller, R., Scheyd, G., McCollough, J. K., & Franklin, M. (2003). Major histocompatibility complex genes, symmetry, and body scent attractiveness in men and women. Behavioral Ecology, 14 (5), 668–678. Tomori, Z., Benacka, R., & Donic, V. (1998). 
Mechanisms and clinicophysiological implica tions of the sniff- and gasp-like aspiration reflex. Respiration Physiology, 114 (1), 83–98. Uchida, N., Takahashi, Y. K., Tanifuji, M., & Mori, K. (2000). Odor maps in the mammalian olfactory bulb: Domain organization and odorant structural features. Nature Neuro science, 3 (10), 1035–1043.

Page 39 of 41

Looking at the Nose Through Human Behavior, and at Human Behavior Through the Nose Vassar, R., Ngai, J., & Axel, R. (1993). Spatial segregation of odorant receptor expression in the mammalian olfactory epithelium. Cell, 74 (2), 309–318. Verhagen, J. V., Wesson, D. W., Netoff, T. I., White, J. A., & Wachowiak, M. (2007). Sniffing controls an adaptive filter of sensory input to the olfactory bulb. Nature Neuroscience, 10 (5), 631–639. Wedekind, C., & Furi, S. (1997). Body odour preferences in men and women: Do they aim for specific MHC combinations or simply heterozygosity? Proceedings of the Royal Soci ety of London. B. Biological Sciences, 264 (1387), 1471–1479. Wedekind, C., Seebeck, T., Bettens, F., & Paepke, A. J. (1995). MHC-dependent mate pref erences in humans. Proceedings of the Royal Society of London. B. Biological Sciences, 260 (1359), 245–249. Wilson, D. A. (1997). Binaral interactions in the rat piriform cortex. Journal of Neurophys iology, 78 (1), 160–169. Wilson, D. A. (2009a). Olfaction as a model system for the neurobiology of mammalian short-term habituation. Neurobiology of Learning and Memory, 92 (2), 199–205. Wilson, D. A. (2009b). Pattern separation and completion in olfaction. Annals of the New York Academy of Sciences, 1170, 306–312. Wilson, D. A., & Stevenson, R. J. (2003). The fundamental role of memory in olfactory per ception. Trends in Neurosciences, 26 (5), 243–247. Wilson, R. I., & Mainen, Z. F. (2006). Early events in olfactory processing. Neuroscience, 29 (1), 163. Witt, M., & Hummel, T. (2006). Vomeronasal versus olfactory epithelium: is there a cellu lar basis for human vomeronasal perception? International review of cytology, 248, 209– 259. Wysocki, C. J., & Meredith, M. (1987). The vomeronasal system. In T. E. Finger & W. L. Silver (Eds.), Neurobiology of taste and smell (pp. 125–150). New York: John Wiley & Sons. Wysocki, C. J., Pierce, J. D., & Gilbert, A. N. (1991). 
Geographic, cross-cultural, and indi vidual variation in human olfaction. In T. V. Getchell (Ed.), Smell and taste in health and disease (pp. 287–314). New York: Raven Press. Wysocki, C. J., & Preti, G. (2004). Facts, fallacies, fears, and frustrations with human pheromones. Anatomical Record. A. Discoveries in Molecular, Cellular, and Evolutionary Biology, 281 (1), 1201–1211. Yeomans, M. R. (2006). Olfactory influences on appetite and satiety in humans. Physiolo gy of Behavior, 89 (1), 10–14. Page 40 of 41

Looking at the Nose Through Human Behavior, and at Human Behavior Through the Nose Yeshurun, Y., & Sobel, N. (2010). An odor is not worth a thousand words: from multidi mensional odors to unidimensional odor objects. Annual Review of Psychology, 61, 219– 241. Zarzo, M. (2008). Psychologic dimensions in the perception of everyday odors: Pleasant ness and edibility. Journal of Sensory Studies, 23 (3), 354–376. Zeki, S., & Bartels, A. (1999). Toward a theory of visual consciousness* 1. Consciousness and Cognition, 8 (2), 225–259. Zelano, C., & Sobel, N. (2005). Humans as an animal model for systems-level organization of olfaction. Neuron, 48 (3), 431–454. Zhang, X., & Firestein, S. (2002). The olfactory receptor gene superfamily of the mouse. Nature Neuroscience, 5 (2), 124–133. Zhou, W., & Chen, D. (2009). Fear-related chemosignals modulate recognition of fear in ambiguous facial expressions. Psychological Science, 20 (2), 177. Zilstorff-Pedersen, K. (1955). Olfactory threshold determinations in relation to food in take. Acta Otolaryngologica, 45 (1), 86–90. Zufall, F., Firestein, S., & Shepherd, G. M. (1994). Cyclic nucleotide-gated ion channels and sensory transduction in olfactory receptor neurons. Annual Review of Biophysics and Biomolecular Structure, 23 (1), 577–607.

Roni Kahana, Department of Neurobiology, Weizmann Institute of Science, Rehovot, Israel

Noam Sobel, Department of Neurobiology, Weizmann Institute of Science, Rehovot, Israel


Cognitive Neuroscience of Music

Petr Janata
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0007

Abstract and Keywords

Humans engage with music in many ways, and music is associated with many aspects of our personal and social lives. Music represents an organization of our auditory environments, and many neural processes must be recruited and coordinated both to perceive and to create musical patterns. Accordingly, our musical experiences depend on the interplay of diverse brain systems underlying perception, cognition, action, and emotion. Compared with the study of other human faculties, the neuroscientific study of music is relatively recent. Paradigms for examining musical functions have been adopted from other domains of neuroscience and also developed de novo. The relationship of music to other cognitive domains, in particular language, has garnered considerable attention. This chapter provides a survey of the experimental approaches taken, and emphasizes consistencies across the various studies that help us understand musical functions within the broader field of cognitive neuroscience.

Keywords: music, auditory environments, cognition, neuroscience

Introduction

Discoveries of ancient bone flutes illustrate that music has been a part of human society for millennia (Adler, 2009). Although the reason why the human brain enables musical capacities is hotly debated (did it evolve like language, or is it an evolutionary epiphenomenon?), the fact that human individuals and societies devote considerable time and resources to incorporating music into their lives is indisputable.

Scientific study of the psychology and neuroscience of music is relatively recent. In terms of human behaviors, music is usually viewed and judged in relation to language. Accordingly, the organization of music in the brain has been viewed by many in terms of modular organization, whereby music-specific processing modules are associated with specialized neural substrates in much the same way that discrete brain areas (e.g., Broca’s and Wernicke’s areas) have been associated traditionally with language functions (Peretz & Coltheart, 2003). Indeed, neuropsychological case studies in which the loss of language abilities can be dissociated from the loss of musical abilities, and vice versa, clearly support a modular view. To the extent that language has been treated as a domain of cognitive neuroscience that is largely separate from other functional domains (such as perception, memory, attention, action, and emotion), music, too, has been regarded as a cognitive novelty with a unique embodiment in the human brain. In other words, music has been subjected to the same localizationist pressures that have historically pervaded cognitive neuroscience.

Neuroscientific inquiry into musical functions has therefore focused most often on the auditory cortices situated along the superior surface of the temporal lobes. The logic is simple: if music is an auditory phenomenon and the auditory cortex processes auditory information, then music must reside in the auditory cortex.

Although it is true that we mainly listen to music, meaningful engagement with music is much more than a perceptual phenomenon. The most obvious example is that of musical performance, whereby the action systems of the brain must be engaged. However, as described below in detail, underneath overt perception and action lie more subtle aspects of musical engagement: attention, various forms of memory, and covert action such as mental imagery. Finally, music plays on our emotions in various ways. Thus, the most parsimonious view of how music coexists with the other complex behaviors that the human brain supports is not one in which all of the cognitive functions on which music depends are localized to a specific area of the brain, but rather one in which musical experiences are instantiated through the coupling of networks that serve domain-general functions. Following a brief description of key principles underlying the organization of acoustic information into music, this chapter treats the neuroscience of each of those functions in turn.

Building Blocks of Music

In order to discuss the cognitive neuroscience of music, it is necessary to describe briefly some of the basic dimensions on which music is organized. These broad dimensions encompass time, tonality, and timbre. Music from different cultures utilizes these dimensions to varying degrees, and in general it is the variability in the ways that these dimensions are utilized that gives rise to concepts of musical styles and genres that vary within and across cultures. Here I restrict my discussion to Western tonal music.

Time

The patterning of acoustic events in time is a crucial element of music that gives rise to rhythm and meter, properties that are essential in determining how we move along with music. We often identify “the beat” in music by tapping our feet or bobbing our heads with a regular periodicity that seems to fit best with the music, and this capacity for entrainment has strong social implications in terms of coordinated action and shared experience (Pressing, 2002; Wiltermuth & Heath, 2009). Importantly, meter and rhythm provide a temporal scaffold that guides our expectations for when events will occur (Jones, 1976; London, 2004). This scaffold might be thought of in terms of coupled oscillators that focus attention at particular moments in time (Large & Jones, 1999; Large & Palmer, 2002) and thereby influence our perception of other musical attributes such as pitch (Barnes & Jones, 2000; Jones et al., 2002).

Our perception of metric structure, that aspect of music that distinguishes a waltz from a march, is associated with hierarchies of expectations for when events will occur (Jones & Boltz, 1989; Palmer & Krumhansl, 1990). Some temporal locations within a metric structure are more likely to be associated with events (“strong beats”), whereas others are less likely (“weak beats”). The manipulation of when events actually occur relative to weak and strong beat locations is associated with syncopation, a salient feature of many rhythms, which in turn characterize musical styles and shape our tendency to move along with music (Pressing, 2002).

Tonality

Tonality refers to notes (pitches) and the relationships among notes. Even though there are many notes that differ in their fundamental frequency (e.g., the frequency associated with each key on a piano keyboard), tonality in Western tonal music is based on twelve pitch classes, with each pitch class associated with a label such as C or C-sharp (Figure 7.1). The organization into twelve pitch classes arises because (1) two sounds that are separated by an octave (a doubling in frequency) are perceptually similar and are given the same note name (e.g., F), and (2) an octave is divided into twelve equal (logarithmic) frequency steps called semitones. When we talk about melody we refer to sequences of notes, and when we talk about harmony, we refer to two or more notes that sound at the same time to produce an interval or chord. Melodies are defined in large part by their contours (the pattern of ups and downs of the notes), such that changes in contour are easier to detect than contour-preserving shifts of isolated notes when the entire melody is transposed to a different key (Dowling, 1978).
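The two facts that give rise to twelve pitch classes (octave equivalence and the division of the octave into twelve equal logarithmic steps) can be illustrated with a short sketch. This is only an illustration under stated assumptions: the 440 Hz reference for the note A and the function names `pitch_class` and `semitone_up` are hypothetical choices made here, not something specified in the chapter.

```python
import math

# Labels for the twelve pitch classes, starting from C.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz, a_ref_hz=440.0):
    """Return the pitch-class label nearest to a frequency.

    Assumption: equal temperament with the note A tuned to `a_ref_hz`
    (440 Hz is a common convention, not a value taken from the text).
    """
    # Distance from the reference A in semitones: 12 equal log steps per octave.
    semitones_from_a = round(12 * math.log2(freq_hz / a_ref_hz))
    # A sits at index 9; wrapping modulo 12 implements octave equivalence.
    return PITCH_CLASSES[(9 + semitones_from_a) % 12]


def semitone_up(freq_hz):
    """Raise a frequency by one semitone (a factor of 2 ** (1/12))."""
    return freq_hz * 2 ** (1 / 12)
```

Doubling or halving a frequency leaves the label returned by `pitch_class` unchanged, and applying `semitone_up` twelve times in succession doubles the frequency, which is the sense in which the semitone steps are equal on a logarithmic scale.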

Figure 7.1 Tonal relationships.
A, The relationship between pitch height and pitch chroma (pitch class) is obtained by placing successively higher pitches on a spiral, such that one full turn of the spiral corresponds to a frequency doubling. This arrangement captures the phenomenon of octave equivalence. When the spiral is viewed from above, the chroma circle emerges. The chroma circle comprises the twelve pitch classes (C, C#, D, etc.) that are used in Western tonal music.
B, The seven pitch classes (notes) belonging to the C-major scale are shown in musical notation. The notes on the staff are in approximate alignment with the notes along the pitch spiral. The unfilled symbols above the notes labeled C, D, and E illustrate the additional notes that would be played in conjunction with each root note to form a triad, a type of chord. Roman numerals are used to designate the scale position of each note. The case of the numeral indicates whether the triad formed with that note as a root has a major (uppercase) or minor (lowercase) quality.
C, Although seven of the twelve possible pitch classes belong to a key, they are not equally representative of the key. This fact is embodied in the concept of tonal hierarchies (key profiles), which can be derived via psychological methods such as goodness-of-fit ratings of probe tones corresponding to each of the twelve possible pitch classes. Shown in blue is the canonical Krumhansl & Kessler key profile (Krumhansl, 1990). The generality of this profile is illustrated by the red bars, which were obtained from about 150 undergraduate students of varying musical aptitude in the author’s psychology of music class. Each student made a single judgment about each probe tone. Very similar key profiles are obtained when the distributions of notes in pieces of music are generated, suggesting that tonal knowledge embodies the statistics of the music we hear. The seven notes belonging to the key (black note names) are judged to fit better than the five notes that don’t belong to the key (blue note names). The number symbol (#) following a letter indicates a pitch that is raised (sharp) by one semitone, whereas a “b” following a letter indicates a note that is lowered (flat) by one semitone.
D, The fact that each major and minor key is associated with a different key profile gives rise to the concept of distance between the different keys. In music theory, the distances are often represented by the circle of fifths for the major (red) and minor (cyan) keys. The circle is so named because working in a clockwise direction, the fifth scale degree, which is the second most stable note in a key (e.g., the note G in C-major), becomes the most stable note (the tonic) of the next key (e.g., G-major). The distances between major and minor keys (pitch probability distributions) are represented most parsimoniously on the surface of a torus. The toroidal representation is arrived at either by multidimensional scaling of subjective distances between the keys or by self-organizing neural networks that are served music that moves through all twenty-four major and minor keys as input. Each location on the torus represents a particular pitch probability distribution. Accordingly, a tally of the notes in a musical segment can be projected onto the toroidal surface and rendered in color to indicate the likelihood that the piece of music is in a particular key at a particular moment in time. Because the notes that make up a piece’s melodies and harmonies change in time, thereby creating variation in the momentary key profiles, the activation on the torus changes dynamically in time.
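The circle of fifths described in panel D can be generated by stepping repeatedly through an interval of seven semitones (a perfect fifth). The sketch below also counts how many notes two major scales share, as a crude proxy for key distance. The sharps-only note spelling and the function names are assumptions made for illustration; the chapter itself measures key distance through key profiles, not through shared-note counts.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_SCALE_STEPS = (0, 2, 4, 5, 7, 9, 11)  # semitones above the tonic

def circle_of_fifths(start="C"):
    """Tonics visited by moving up a perfect fifth (7 semitones) twelve times."""
    first = NOTE_NAMES.index(start)
    return [NOTE_NAMES[(first + 7 * k) % 12] for k in range(12)]

def major_scale(tonic):
    """Set of pitch-class numbers belonging to the major key on this tonic."""
    root = NOTE_NAMES.index(tonic)
    return {(root + step) % 12 for step in MAJOR_SCALE_STEPS}

def shared_notes(tonic_a, tonic_b):
    """Number of pitch classes two major keys have in common (0 to 7)."""
    return len(major_scale(tonic_a) & major_scale(tonic_b))

print(circle_of_fifths())
# ['C', 'G', 'D', 'A', 'E', 'B', 'F#', 'C#', 'G#', 'D#', 'A#', 'F']
print(shared_notes("C", "G"), shared_notes("C", "F#"))
# 6 2
```

Neighbors on the circle share six of their seven notes, whereas keys on opposite sides share only two, which is one way to see why the circle works as a map of key distance.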

Central to the way that tonality works are the concepts of pitch probability distributions and the related notion of key (Krumhansl, 1990; Temperley, 2001, 2007). When we say that a piece of music is in the key of G-major or g-minor, it means that certain notes, such as G or D, will be perceived as fitting well into that key, whereas others, like G-sharp or C-sharp, will sound out of place. Tallying up the notes of many different pieces written in the key of G-major, we would find that G and D are the notes that occur most often, whereas G-sharp and C-sharp would not be very frequent. That is, the key of G-major is associated with a particular probability distribution across the twelve possible pitch classes. The key of D-major is associated with a slightly different probability distribution, albeit one that is closely related to that of G-major in that the note (pitch class) D figures prominently in both. However, the probability distribution for the key of C-sharp major differs markedly from that of G-major. These probability distributions are often referred to as key profiles or tonal hierarchies, and they simultaneously reflect the statistics of music as well as perceptual distances between individual notes and the keys (or tonal contexts) that have been established in the minds of listeners (Krumhansl, 1990; Temperley, 2007). Knowledge of the statistics underlying tonality is presumably acquired through implicit statistical learning mechanisms (Tillmann et al., 2000), as evidenced by the brain’s rapid adaptation to novel tonal systems (Loui et al., 2009b). Conveniently, the perceived, statistical, and music-theoretical distance relationships between keys can be represented geometrically on the surface of a ring (torus) such that keys that have many of their notes in common are positioned close to each other on the toroidal surface (Krumhansl & Kessler, 1982; Krumhansl, 1990).
Each location on the toroidal surface is associated with a probability distribution across the twelve pitch classes. If a melody or sequence of chords is played that contains notes in proportions that correspond to the probability distribution for a particular location on the torus, then that region of tonal space is considered activated or primed. If a chord is played whose constituent notes belong to a very different probability distribution (i.e., a distantly situated location on the torus corresponding to a distant key), the chord will sound jarring and out of place. However, if several other chords are now played that are related to the chord that was jarring, the perceived tonal center will shift to that other region of the torus. Therefore, movement of music in tonal space is dynamic and dependent on the melodies and chord progressions that are being played (Janata, 2007; Janata et al., 2002b; Toiviainen, 2007; Toiviainen & Krumhansl, 2003).
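The idea that a tally of notes points to a key can be sketched by correlating a pitch-class tally with each of the twenty-four key profiles and picking the best match. The profile values below are the probe-tone ratings commonly reported for Krumhansl and Kessler (1982); the correlation-based matching follows the widely used Krumhansl-Schmuckler style of key finding and is a simplification, not the toroidal projection used in the studies cited above. The example tally and all function names are invented for illustration.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Probe-tone ratings as commonly reported for Krumhansl & Kessler (1982),
# listed from the tonic upward in semitone steps.
MAJOR_PROFILE = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                          2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR_PROFILE = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                          2.54, 4.75, 3.98, 2.69, 3.34, 3.17])

def key_correlations(pitch_class_counts):
    """Correlate a 12-element tally (index 0 = C) with all 24 key profiles."""
    counts = np.asarray(pitch_class_counts, dtype=float)
    result = {}
    for tonic in range(12):
        # np.roll moves the tonic weight to the position of this key's tonic.
        for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
            rotated = np.roll(profile, tonic)
            result[f"{NOTE_NAMES[tonic]} {mode}"] = float(np.corrcoef(counts, rotated)[0, 1])
    return result

def estimate_key(pitch_class_counts):
    """Return the name of the key whose profile correlates best with the tally."""
    correlations = key_correlations(pitch_class_counts)
    return max(correlations, key=correlations.get)

# Hypothetical tally: notes of the G-major scale, with G and D most frequent.
tally = np.zeros(12)
for name, count in {"G": 4, "D": 3, "B": 2, "C": 1, "E": 1, "F#": 1, "A": 1}.items():
    tally[NOTE_NAMES.index(name)] = count

print(estimate_key(tally))
```

Recomputing the tally over a sliding window of recent notes would give a time-varying estimate, which is the simplest analogue of the shifting tonal center described above.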

Timbre

The third broad dimension of music is timbre. Timbre is what distinguishes instruments from each other: it is the spectrotemporal signature of an instrument, a description of how the frequency content of the sound changes in time. If one considers the range of musical sounds that are generated not just by physical instruments but also by electronic synthesizers, human voices, and environmental sounds that are used for musical purposes, the perceptual space underlying timbre is vast. Based on multidimensional scaling analyses of similarity judgments between pairs of sounds, it has been possible to identify candidate dimensions of timbre (Caclin et al., 2005; Grey, 1977; Lakatos, 2000; McAdams et al., 1995). Most consistently identified across studies are dimensions corresponding to attack and spectral centroid. Attack refers to the onset characteristics of the amplitude envelope of the sound (e.g., percussive sounds have a rapid attack, whereas bowed sounds have a slower attack). Centroid refers to the location along the frequency axis of the center of mass of the spectral energy (the amplitude-weighted mean frequency), and is commonly described as the “brightness” of a sound. The use of acoustic features beyond attack and centroid for the purpose of judging the similarities between pairs of sounds is much less consistent and appears to be context dependent (Caclin et al., 2005). For example, variation in spectral fine structure, such as the relative weighting of odd and even frequency components, and time-varying spectral features (spectrotemporal flux) also influence similarity judgments. The fact that timbre cannot be decomposed into a compact set of consistent dimensions that satisfactorily explain the abundance of perceptual variation somewhat complicates the search for the functional representation of timbre in the brain.
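The two most consistently identified timbre dimensions can be estimated from a waveform in a few lines. The sketch below computes the spectral centroid as the amplitude-weighted mean frequency of the magnitude spectrum and the attack time as the interval between the first samples at which the rectified signal reaches 10% and 90% of its peak. The 10% and 90% thresholds, the use of the rectified signal as a stand-in for the amplitude envelope, and the synthetic test tones are all assumptions chosen for illustration; published timbre studies use a variety of definitions.

```python
import numpy as np

def spectral_centroid(signal, sample_rate):
    """Amplitude-weighted mean frequency (Hz) of the magnitude spectrum."""
    magnitudes = np.abs(np.fft.rfft(signal))
    frequencies = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(np.sum(frequencies * magnitudes) / np.sum(magnitudes))

def attack_time(signal, sample_rate, low=0.1, high=0.9):
    """Seconds between first reaching `low` and `high` fractions of the peak.

    The rectified signal is used as a crude amplitude envelope (an assumption).
    """
    envelope = np.abs(signal)
    peak = envelope.max()
    first_low = int(np.argmax(envelope >= low * peak))    # first index above 10%
    first_high = int(np.argmax(envelope >= high * peak))  # first index above 90%
    return (first_high - first_low) / sample_rate

def ramped_tone(freq_hz, ramp_s, sample_rate=8000, duration_s=0.5):
    """Synthetic sine tone whose amplitude rises linearly over `ramp_s` seconds."""
    t = np.arange(int(sample_rate * duration_s)) / sample_rate
    return np.minimum(t / ramp_s, 1.0) * np.sin(2 * np.pi * freq_hz * t)

sr = 8000
t = np.arange(sr) / sr  # one second of samples
dull = np.sin(2 * np.pi * 440 * t)                                    # fundamental only
bright = sum(np.sin(2 * np.pi * 440 * k * t) for k in (1, 2, 3))      # three equal harmonics

print(spectral_centroid(dull, sr), spectral_centroid(bright, sr))
print(attack_time(ramped_tone(440, 0.005), sr), attack_time(ramped_tone(440, 0.1), sr))
```

Adding energy at higher harmonics raises the centroid (a "brighter" sound), and lengthening the onset ramp lengthens the attack, as in the contrast between percussive and bowed sounds.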

Perception and Cognition

Tonality

Pitch and Melody

Some of the earliest attempts to understand the neural substrates for music processing focused on how well patients in whom varying amounts of the temporal lobes, and thereby auditory cortical areas, had been removed could detect alterations in short melodies. A consistent result has been that right temporal lobe (RTL) damage impairs the ability of individuals to detect a variety of changes in melodies, including the starkest type of change in which the altered note violates both the contour of the melody and the key, whereas left temporal lobe damage leads to little or no impairment (Dennis & Hopyan, 2001; Liegeois-Chauvel et al., 1998; Samson & Zatorre, 1988; Warrier & Zatorre, 2004; Zatorre, 1985). However, more difficult discriminations, in which the altered notes preserve the contour and the key of the initial melody, suffer when either hemisphere is damaged. Patients with RTL damage have difficulty judging whether one pitch is higher or lower than the next, a critical ability for determining both the contour and the specific intervals (distances between notes) that define a melody, even though their basic ability to discriminate whether the notes are the same or different remains intact (Johnsrude et al., 2000). Whereas having the context of a familiar melody generally facilitates the ability to detect when a final note is mistuned (Warrier & Zatorre, 2002), RTL damage reduces the amount of that facilitation (Warrier & Zatorre, 2004), indicating that the processing of pitch relationships in the temporal lobes also affects more basic perceptual processes such as intonation judgments. Interestingly, the melody processing functions of the RTL appear to depend in large part on areas that are anterior to the primary auditory cortex, which is situated on Heschl’s gyrus (HG; Johnsrude et al., 2000; Samson & Zatorre, 1988).

The results from the patient studies are supported by a number of functional magnetic resonance imaging (fMRI) studies of pitch and melody processing. A hierarchy of pitch processing is observed in the auditory cortex following a medial to lateral progression. Broadband noise or sounds that have no clear pitch activate medial HG, sounds with distinct pitch produce more activation in the lateral half of HG, and sequences in which the pitch varies, in either tonal or random melodies, generate activity that extends rostrally from HG along the superior temporal gyrus (STG) toward the planum polare, biased toward the right hemisphere (Patterson et al., 2002).

One of the critical features of pitch in music is the distinction between pitch height and pitch chroma (pitch class). The chroma of a pitch is referred to by its note name (e.g., D, D#). Critically, chroma represents a perceptual constancy that allows notes played in different octaves to be identified as the same note. These separable aspects of pitch appear to have partially distinct neural substrates, with preferential processing of pitch height posterior to HG in the planum temporale, and processing of chroma anterolateral to HG in the planum polare (Warren et al., 2003), consistent with the proposed role of anterior STG regions in melody processing (Griffiths et al., 1998; Patterson et al., 2002; Schmithorst & Holland, 2003). The neural representations of individual pitches in melodic sequences are also influenced by the statistical structure of the sequences (Patel & Balaban, 2000, 2004). In these experiments, the neural representations were quantified by examining the quality of the coupling between the amplitude modulations of the tones used to create the sequence and the amplitude modulations in the response recorded above the scalp using magnetoencephalography (MEG). Random sequences elicited little coupling, whereas highly predictable musical scales elicited the strongest coupling. These effects were strongest above temporal and lateral prefrontal sensor sites, consistent with a hypothesis that a frontotemporal circuit supports the processing of melodic structure.

Detecting Wrong Notes and Wrong Chords

The representation and processing of tonal information is the aspect of music that has been most extensively studied using cognitive neuroscience methods. Mirroring the approach taken in much of the rest of cognitive neuroscience, “expectancy violation” paradigms have been the primary approach to establishing the existence of a cognitive schema through which we assimilate pitch information. In other words, how does the brain respond when a target event, usually the terminal note of a melody or chord of a harmonic progression, is unexpected given the preceding musical context?

When scales or melodies end in notes that do not belong to the established key, large positive deflections are evident in event-related potentials (ERPs) recorded at posterior scalp sites, indicating the activation of congruency monitoring and context-updating processes indexed by the P300 or late-positive complex (LPC) components of ERP waveforms (Besson & Faïta, 1995; Besson & Macar, 1987; Paller et al., 1992). These effects are accentuated in subjects with musical training and when the melodies are familiar (Besson & Faïta, 1995; Miranda & Ullman, 2007). Similarly, short sequences of chords that terminate with a chord that is unexpected given the tonal context established by the preceding chords elicit P300 and LPC components (Beisteiner et al., 1999; Carrion & Bly, 2008; Janata, 1995; Koelsch et al., 2000; Patel et al., 1998), even in natural musical contexts (Koelsch & Mulder, 2002). As would be expected given the sensitivity of the P300 to global event probability (Tueting et al., 1970), the magnitude of the posterior positive responses increases as the starkness of the harmonic violation increases (Janata, 1995; Patel et al., 1998).

The appearance of the late posterior positivities depends on overt processing of the target chords, that is, on making either a detection or a categorization judgment. When explicit judgments about target chords are eliminated, the most prominent deviance-related response is distributed frontally, and typically manifests as an attenuated positivity approximately 200 ms after the onset of a deviant chord. This relative negativity in response to contextually irregular chords was termed an early right anterior negativity (ERAN; Koelsch et al., 2000), although in many subsequent studies, it was found to be distributed bilaterally. The ERAN has been studied extensively and is interesting for two principal reasons.
First, the ERAN and the right anterior-temporal negativity (RATN; Patel et al., 1998) have been interpreted as markers of syntactic processing in music, paralleling the left anterior negativities associated with the processing of syntactic deviants in language (Koelsch et al., 2000; Patel et al., 1998). Localization of the ERAN to Broca’s area using MEG supports such an interpretation (Maess et al., 2001). (The parallels between syntactic processing in music and language are discussed in a later subsection.) Second, the ERAN is regarded as an index of automatic harmonic syntax processing in that it is elicited even when the irregular chords themselves are not task relevant (Koelsch et al., 2000, 2002b). Whether the ERAN is attenuated when attention is oriented away from musical material is a matter of some debate (Koelsch et al., 2002b; Loui et al., 2005).

The ERAN is a robust index of harmonic expectancy violation processing, and it is sensitive to musical training. It is found in children (Koelsch et al., 2003b), and it increases in amplitude with musical training in both adults (Koelsch et al., 2002a) and children (Jentschke & Koelsch, 2009). The amplitude of the ERAN is also sensitive to the probability with which a particular chord occurs at a particular location in a sequence. For example, the ERAN to the same irregular chord function, such as a Neapolitan sixth, is weaker when that chord occurs at a sequence location that is more plausible from a harmonic syntax perspective (Leino et al., 2007). Similarly, using Bach chorales, chords that are part of the original score, but not the most expected from a music-theoretical point of view, elicit an ERAN in comparison to more expected chords that have been substituted in, but a much weaker ERAN than highly unexpected Neapolitan sixth chords inserted into the same location (Steinbeis et al., 2006).

The automaticity of the ERAN naturally leads to comparisons with the mismatch negativity (MMN), the classic marker of preattentive detection of deviant items in auditory streams (Näätänen, 1992; Näätänen & Winkler, 1999). Given that irregular chords might be regarded as violations of an abstract context established by a sequence of chords, the ERAN could just be a form of “abstract MMN.” The ERAN and MMN occur within a similar latency range, and their frontocentral distributions often make them difficult to distinguish from one another based on their scalp topographies (e.g., Koelsch et al., 2001; Leino et al., 2007). Nonetheless, the ERAN and MMN are dissociable (Koelsch, 2009). For example, an MMN elicited by an acoustically aberrant stimulus, such as a mistuned chord or a change in the frequency of a note (frequency MMN), does not show sensitivity to location within a harmonic context (Koelsch et al., 2001; Leino et al., 2007). Moreover, if the sensory properties of the chord sequences are carefully controlled in terms of repetition priming for specific notes or the relative roughness of target chords and those in the preceding context, an ERAN is elicited by harmonically incongruent chords even if the harmonically incongruent chords are more similar in their sensory characteristics to the penultimate chords than are the harmonically congruent chords (Koelsch et al., 2007).

A number of fMRI studies have contributed to the view that musical syntax is evaluated in the ventrolateral prefrontal cortex (VLPFC), in a region comprising the ventral aspect of the inferior frontal gyrus (IFG), frontal operculum, and anterior insula.
The evaluation of target chords in a harmonic priming task results in bilateral activation of this region, and the activation is greater for harmonically unrelated targets than for harmonically related targets (Koelsch et al., 2005b; Tillmann et al., 2003). Similarly, chord sequences that contain modulations (strong shifts in the tonal center toward another key) activate this region in the right hemisphere (Koelsch et al., 2002c). Further evidence that the VLPFC is sensitive to contextual coherence comes from a paradigm in which subjects listened to a variety of 23-second excerpts of familiar and unfamiliar classical music. Each excerpt was rendered incoherent by being chopped up into 250- to 350-ms segments and then reconstituted with a random arrangement of the segments. Bilaterally, the inferior frontal cortex and adjoining insula responded more strongly to the normal music, compared with reordered music (Levitin & Menon, 2003).
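The scrambling manipulation described above (cutting an excerpt into 250- to 350-ms segments and reassembling them in random order) is simple to sketch. The version below is a minimal illustration on a NumPy array of samples; the uniform choice of segment lengths, the fixed random seed, and the absence of any smoothing at segment boundaries are assumptions, and the function name is hypothetical. It is not a reconstruction of the stimulus-generation code used by Levitin and Menon (2003).

```python
import numpy as np

def scramble_excerpt(samples, sample_rate, min_ms=250, max_ms=350, seed=0):
    """Cut `samples` into segments of random length and shuffle their order.

    Segment lengths are drawn uniformly between min_ms and max_ms (inclusive);
    the final segment may be shorter if the excerpt runs out.
    """
    rng = np.random.default_rng(seed)
    segments = []
    start = 0
    while start < len(samples):
        length_ms = int(rng.integers(min_ms, max_ms + 1))  # upper bound is exclusive
        length = max(1, int(length_ms * sample_rate / 1000))
        segments.append(samples[start:start + length])
        start += length
    order = rng.permutation(len(segments))
    return np.concatenate([segments[i] for i in order])

# Hypothetical 23-second "excerpt": a ramp, so the reordering is easy to see.
sample_rate = 1000
excerpt = np.arange(23 * sample_rate)
scrambled = scramble_excerpt(excerpt, sample_rate)
print(len(excerpt), len(scrambled))
```

Because the segments are only reordered, the scrambled version contains exactly the same samples as the original, so the manipulation disrupts temporal coherence while leaving the overall content of the excerpt unchanged.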

Tonal Dynamics

Despite their considerable appeal from an experimental standpoint, trial-based expectancy violation paradigms are limited in their utility in investigating the brain dynamics that accompany listening to extended passages of music in which the manipulation of expectancies is typically more nuanced and ongoing. When examined more closely, chord sequences such as those used in the experiments described above do more than establish a particular static tonal context. They actually create patterns of movement within tonal space, the system of major and minor keys that can be represented on a torus. The details of the trajectories depend on the specific chords and the sequence in which they occur (Janata, 2007). Different pieces of music will create different patterns of movement through tonal space, depending on the notes in the melodies and harmonic accompaniments. The time-varying pattern on the toroidal surface can be quantified for any piece of music, and this quantification can then be used to probe the time-varying structure of the fMRI activity recorded while a person listens to the music. This procedure identifies brain areas that are sensitive to the movements of the music through tonal space (Janata, 2005, 2009; Janata et al., 2002b).

The “tonality-tracking” approach has suggested a role of the medial prefrontal cortex (MPFC) in the maintenance of tonal contexts and the integration of tonal contexts with music-evoked autobiographical memories (Janata, 2009). When individuals underwent fMRI scans while listening attentively to an arpeggiated melody that systematically moved (modulated) through all twenty-four major and minor keys over the course of 8 minutes (Janata et al., 2003), the MPFC was the one brain region that was consistently active across three to four repeated sessions within listeners and across listeners, even though consistent tonality tracking responses were observed at the level of individuals in several brain areas (Janata, 2005; Janata et al., 2002b). ERP studies provide converging evidence for a context maintenance interpretation in that a midline negativity with a very frontal focus characterizes both the N5 component, a late negative peak that has been interpreted to reflect contextual integration of harmonically incongruous material (Koelsch et al., 2000; Loui et al., 2009b), and a sustained negative shift in response to modulating sequences (Koelsch et al., 2003a). As discussed below, tonality tracking in the MPFC is germane to understanding how music interacts with functions underlying a person’s sense of self because the MPFC is known to support such functions (Gilbert et al., 2006; Northoff & Bermpohl, 2004; Northoff et al., 2006).

Rhythm and Meter

As in the case of melody perception, the question arises to what extent auditory cortical areas in the temporal lobe are engaged in the processing of musical rhythmic patterns. Studies of patients in whom varying amounts of either the left or right anterior temporal lobes have been removed indicate that the greatest impairment in reproducing rhythmic patterns is found in patients with excisions that encroach on secondary auditory areas in HG in the right hemisphere (Penhune et al., 1999). The deficits are observed when exact durational patterns are to be reproduced, but not when the patterns can be encoded categorically as sequences of long and short intervals.

Given the integral relationship between timing and movement, and the propensity of humans to move along with the beat in the music, neuroimaging experiments of rhythm and meter perception have examined the degree to which motor systems of the brain are engaged alongside auditory areas during passive listening to rhythms or attentive listening while performing a secondary discrimination task (Grahn & Rowe, 2009), listening with the intent to subsequently synchronize with or reproduce the rhythm (Chen et al., 2008a), or listening with the intent to make a same/different comparison with a target rhythm (Grahn & Brett, 2007). Discrimination of metrically simple, complex, and nonmetric rhythms recruits, bilaterally, the auditory cortex, cerebellum, IFG, and a set of premotor areas including the basal ganglia (putamen), pre–supplementary motor area (pSMA) or supplementary motor area (SMA), and dorsal premotor cortex (PMC) (Grahn & Brett, 2007). The putamen has been found to respond more strongly to simple rhythms than complex rhythms, suggesting that its activation is central to the experienced salience of a beat. A subsequent study found stronger activation throughout the basal ganglia in response to beat versus nonbeat rhythms, along with greater coupling of the putamen with the auditory cortex and medial and lateral premotor areas in the beat conditions (Grahn & Rowe, 2009). Interestingly, the response of the putamen increased as external accenting cues weakened, suggesting that activity within the putamen is also shaped by the degree to which listeners generate a subjective beat.

Activity in premotor areas and the cerebellum is differentiated by the degree of engagement with a rhythm (Chen et al., 2008a). The SMA and mid-PMC are active during passive listening to rhythms of varying complexity. Listening to a rhythm with the requirement to subsequently synchronize with that rhythm recruits these regions along with ventral premotor and inferior frontal areas. These regions are then also active when subjects subsequently synchronize their taps with the rhythm. Similar results are obtained in response to short 3-second piano melodies: lateral premotor areas are activated both during listening and during execution of arbitrary key press sequences without auditory feedback (Bangert et al., 2006). Converging evidence for the recruitment of premotor areas during listening to rhythmic structure in music has been obtained through analyses of MEG data in which a measure related to the amplitude envelope of the auditory stimulus is correlated with the same measure applied to the MEG data (Popescu et al., 2004). One study of attentive listening to polyphonic music also found increased activation of premotor areas (pSMA, mid-PMC), although the study did not seek to associate these activations directly with the rhythmic structure in the music (Janata et al., 2002a).
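The general logic of correlating a stimulus envelope with a recorded signal can be sketched on synthetic data. The example below estimates an amplitude envelope by rectifying and smoothing each signal and then computes a Pearson correlation between the two envelopes. Everything here is an assumption made for illustration: the 50-ms moving-average smoother, the 2-Hz modulation, and the simulated "response" signals are invented, and the measure is a simplified stand-in rather than the analysis used in the MEG studies cited above.

```python
import numpy as np

def amplitude_envelope(signal, sample_rate, smooth_ms=50):
    """Rectify the signal and smooth it with a moving average (assumed method)."""
    window = max(1, int(sample_rate * smooth_ms / 1000))
    kernel = np.ones(window) / window
    return np.convolve(np.abs(signal), kernel, mode="same")

def envelope_correlation(stimulus, response, sample_rate):
    """Pearson correlation between the envelopes of two equal-length signals."""
    env_stimulus = amplitude_envelope(stimulus, sample_rate)
    env_response = amplitude_envelope(response, sample_rate)
    return float(np.corrcoef(env_stimulus, env_response)[0, 1])

# Synthetic data only: a noise carrier with a 2-Hz amplitude modulation.
rng = np.random.default_rng(1)
sample_rate = 1000
t = np.arange(10 * sample_rate) / sample_rate
modulation = 0.5 * (1 + np.sin(2 * np.pi * 2 * t))

stimulus = modulation * rng.standard_normal(len(t))
# A simulated response that follows the same modulation, plus independent noise.
coupled = modulation * rng.standard_normal(len(t)) + 0.5 * rng.standard_normal(len(t))
# A simulated response with no relationship to the stimulus.
unrelated = rng.standard_normal(len(t))

coupled_r = envelope_correlation(stimulus, coupled, sample_rate)
unrelated_r = envelope_correlation(stimulus, unrelated, sample_rate)
print(coupled_r, unrelated_r)
```

The two carriers are independent noise, so any correlation between the envelopes reflects the shared slow modulation rather than the fine structure of the waveforms, which is the property that makes envelope-based measures useful for relating rhythmic structure in a stimulus to neural recordings.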

Timbre

Given the strong dependence of music on variation in timbre (instrument sounds), it is surprising that relatively few studies have addressed the representation of timbre in the brain. Similarity judgments of pairs of heard or imagined orchestral instrument sounds drive activity in auditory cortical areas along the posterior half of the STG, around HG and within the planum temporale (Halpern et al., 2004). When activation in response to more complex timbres (sounds consisting of more spectral components, or harmonics, and greater temporal variation in those harmonics) is compared with simpler timbres or pure tones, regions of the STG surrounding primary auditory areas stand out as more active (Menon et al., 2002; Meyer et al., 2006). Processing of attack and spectral centroid cues is more impaired in individuals with right temporal lobe resections than in those with left temporal lobe resections or in normal controls (Samson & Zatorre, 1994).

A number of studies have made explicit use of the multidimensional scaling approach to examine the organization of timbral dimensions in the brain. For example, when based on similarity judgments of isolated tones varying in attack or spectral centroid, the perceptual space of patients with resections of the right temporal lobe is considerably more distorted than is that of normal controls or individuals with left temporal lobe resections (Samson et al., 2002). These impairments are ameliorated to a great extent, but not entirely, when the timbral similarity of eight-note melodies is judged (Samson et al., 2002), although the extent to which melodic perception as opposed to simple timbral reinforcement drives this effect is unclear.

Evidence that timbral variation is assessed at relatively early stages of auditory cortical processing comes from observations that the MMN is similar in response to changes in the emotional connotation of a tone played by a violin, a change in timbre from violin to flute, and changes in pitch (Goydke et al., 2004). Moreover, MMN responses are largely additive when infrequent ignored deviant sounds deviate from standard ignored sounds on multiple timbre dimensions simultaneously, suggesting that timbral dimensions are processed within separate sensory memory channels (Caclin et al., 2006). Also, within the representation of the spectral centroid dimension, the magnitude of the MMN varies linearly with perceptual and featural similarity (Toiviainen et al., 1998).

Although timbre can be considered in terms of underlying feature dimensions, musical sounds nonetheless have a holistic object-like quality to them. Indeed, the perceptual processing of timbre dimensions is not entirely independent (Caclin et al., 2007). The interactivity of timbral dimensions becomes evident when timbral categorization judgments are required and manifests itself mainly in the amplitude of later decision-related components such as the P300 (Caclin et al., 2008). An understanding of how timbral objects are represented and behave within broader brain networks in a task-dependent manner, such as when musical pieces are recognized based on very brief timbral cues (Schellenberg et al., 1999), or emotional distinctions are made (Goydke et al., 2004; Bigand et al., 2005), remains to be elaborated.

Attention

Most of the research described to this point was aimed at understanding the representation of basic musical features and dimensions, and the processing of change along those dimensions, without much regard for the broader and perhaps more domain-general psychological processes that are engaged by music.

Following from the fact that expectancy violation paradigms have been a staple of cognitive neuroscience research on music, considerable information has been collected about attentional capture by deviant musical events. Working from a literature based primarily on visual attention, Corbetta and Shulman (2002) proposed a distinction between a dorsal and a ventral attentional system, whereby the ventral attentional system is engaged by novel or unexpected sensory input while the dorsal attentional system is active during endogenously guided expectations, such as the orientation of attention to a particular spatial location. Overall, the orienting and engagement of attention in musical tasks recruit these attention systems. Monitoring for target musical events and the targets themselves cause activity increases in the ventral system, specifically in the VLPFC in the region of the frontal operculum where the IFG meets the anterior insula (Janata et al., 2002a; Koelsch et al., 2002c; Tillmann et al., 2003). The strongest activation arises when the targets violate harmonic expectations (Maess et al., 2001; Tillmann et al., 2003).

Structural markers in music, such as the boundaries of phrases or the breaks between movements, also cause the ventral and dorsal attentional systems to become engaged (Nan et al., 2008; Sridharan et al., 2007), with the ventral system leading the dorsal system (Sridharan et al., 2007). Attentive listening to excerpts of polyphonic music engages both systems even in the absence of specific targets or boundary markers (Janata et al., 2002a; Satoh et al., 2001). Interestingly, the ventral attentional system is engaged, bilaterally, when (1) an attentive listening task requires target detection in either selective or divided attention conditions, or (2) selective listening is required without target detection, but not during divided/global listening without target detection. When task demands are shifted from target detection to focusing attention on an instrument as though one were trying to memorize the part the instrument is playing, working memory areas in the dorsolateral prefrontal cortex (DLPFC) are recruited bilaterally (Janata et al., 2002a).

Given the integral relationship between attention and timing (Jones, 1976; Large & Jones, 1999), elements of the brain’s attention and action systems interact when attention is focused explicitly on timing judgments, and this interaction is modulated by individual differences in listening style. For example, while frontoparietal attention areas, together with auditory and premotor areas, are engaged overall, greater activation is observed within the ventral attentional system in those individuals who tend to orient their attention toward a longer, rather than a subdivided, beat period (Grahn & McAuley, 2009).

Memory

Music depends on many forms of memory that aid in its perception and production. For example, we form an implicit understanding of tonal structure that allows us to form expectations and detect violations of those expectations. We also store knowledge about musical styles in terms of the harmonic progressions, timbres, and orchestration that characterize them. Beyond the memory for structural aspects of music are memories for specific pieces of music or autobiographical memories that may be inextricably linked to those pieces of music. We also depend on working memory to play music in our minds, either when imagining a familiar song that we have retrieved from long-term memory or when repeating a jingle from a commercial that we just heard. Because linguistic material is often an integral part of music (i.e., the lyrics in songs), the set of memory processes that needs to be considered in association with music necessarily extends to include those associated with language. Two questions arise: how do the different memory systems interact, and how might they be characterized in terms of their overlap with memory systems identified using different tasks and sensory modalities?

Working Memory

An early neuroimaging study of musical processes used short eight-note melodies and found that parietal and lateral prefrontal areas were recruited when subjects had to compare the pitch of the first and last notes (Zatorre et al., 1994). This result suggested that there is an overlap of musical working memory with more general working memory systems. Direct evidence that verbal working memory and tonal working memory share the same neural substrates (auditory, lateral prefrontal, and parietal cortices) was obtained in two studies in which verbal and tonal material was presented and rehearsed in separate trials (Hickok et al., 2003) or in which the stimuli were identical but the task instructions emphasized encoding and rehearsal of either the verbal or tonal material (Koelsch et al., 2009). Further studies relevant to musical working memory are discussed below in the section on musical imagery.

Episodic Memory

If we have a melody running through our minds, that is, if we are maintaining a melody in working memory, it is likely the consequence of having heard and memorized the melody at some point in the past. The melody need not even be one that we heard repeatedly during our childhood (remote episodic memory), but could be one that we heard for the first time earlier in the experimental session (recent episodic memory). Neuroimaging experiments have examined both types of episodic memory.

During an incidental encoding phase of an episodic memory experiment, familiarity judgments about 5-second excerpts of melodies (half of which were familiar) with no associated lyrics resulted in activation of medial prefrontal and anterior temporal lobe regions (Platel et al., 2003). The same pattern of activations was observed when a familiarity judgment task about nursery tunes was contrasted against a change detection judgment task using those same melodies (Satoh et al., 2006). Familiarity judgments based on CD recordings (as opposed to synthesized melodies) of instrumental music without lyrics or voice were found to increase activation within the MPFC, but not the anterior temporal lobes (Plailly et al., 2007). Similarly, both increased familiarity and autobiographical salience of popular music excerpts that did contain lyrics resulted in stronger activation of the dorsal medial prefrontal cortex, but not the anterior temporal lobes. In addition to showing a stronger response to familiar and memory-evoking music, the MPFC tracked the trajectories of musical excerpts in tonal space, supporting a hypothesis that tonal contexts are integrated with self-relevant information within this region (Janata, 2009). One possible reason for the discrepant findings in the anterior temporal lobes is the neuroimaging technique used: the positron emission tomography studies (Platel et al., 2003; Satoh et al., 2006) were not susceptible to signal loss in those regions, as the fMRI experiments (Janata, 2009; Plailly et al., 2007) were. Another is the use of complex recorded musical material compared with monophonic melodies. Familiarity judgments about monophonic melodies must be based solely on the pitch and temporal cues of a single melodic line, whereas recordings of orchestral or popular music contain a multitude of timbral, melodic, harmonic, and rhythmic cues that can facilitate familiarity judgments.

Recruitment of the anterior temporal lobes is consistent with the neuropsychological evidence of melody processing and recognition deficits following damage to those areas (Ayotte et al., 2000; Peretz, 1996). Indeed, when engaged in a recognition memory test in which patients were first presented with twenty-four unfamiliar folk tune fragments, and then made old/new judgments, those with right temporal lobe excisions were mainly impaired on tune recognition, whereas left temporal lobe excisions resulted in impaired recognition of the words (Samson & Zatorre, 1991). When tunes and words were combined, new words paired with old tunes led to impaired tune recognition in both patient groups, suggesting some sort of text–tune integration process involving the temporal lobes of both hemispheres.

In contrast to making judgments about or experiencing the long-term familiarity of musical materials, making judgments about whether a melody (either familiar or unfamiliar) was recently heard is more comparable with typical laboratory episodic memory tasks in which lists of items are memorized. The results from the small number of studies that have examined brain activations during old/new judgments for musical material consistently indicate that different brain areas than those described above are recruited. Moreover, the reported activation patterns are quite heterogeneous, including the right hippocampus (Watanabe et al., 2008), and a large number of prefrontal, parietal, and lateral temporal loci distributed bilaterally (Klostermann et al., 2009; Platel et al., 2003; Watanabe et al., 2008). Most consistent among those are the recruitment of the lateral anterior prefrontal cortex along the middle frontal gyrus (Brodmann area 10) and assorted locations in the precuneus.

Absolute Pitch

The rare ability to accurately generate the note name for a tone played in isolation without an external referent is popularly revered as a highly refined musical ability. Neuroimaging experiments indicate that regions of the left DLPFC that are associated with working memory functions become more active when absolute pitch possessors passively listen to or make melodic interval categorization judgments about pairs of tones relative to musically trained individuals without absolute pitch (Zatorre et al., 1998). The hypothesis that these regions become more active because of the process of associating a note with a label is supported by the observation that nonmusician subjects who are trained to associate chords with arbitrary numbers show activity within this region during that task following training (Bermudez & Zatorre, 2005). Although the classic absolute pitch ability as defined above is rare, the ability to distinguish above chance whether a recording of popular music has been transposed by one or two semitones is common (Schellenberg & Trehub, 2003), although not understood at a neuroscientific level.


Parallels Between Music and Language

Syntax

As noted in the section on the processing of harmonic/tonal structure in music, harmonic incongruities appear to engage brain regions and processes that are similar to those involved in the processing of syntactic relations in language. These observations suggest that music and language may share neural resources that are more generally devoted to syntactic processing (Fedorenko et al., 2009; Patel, 2003). If there is a shared resource, then processing costs or altered ERP signatures should be observed when a person is attending to and making judgments about one domain and syntactic violations occur in the unattended domain. Indeed, when chords are presented synchronously with words, and subjects have to make syntactic or semantic congruity judgments regarding the final word of each sentence, the amplitude of the left anterior negativity (LAN), a marker of linguistic syntax processing, is reduced in response to a syntactically incongruous word when it is accompanied by a harmonically irregular chord compared with when it is accompanied by a harmonically regular chord (Koelsch et al., 2005a). Conversely, the ERAN is reduced in amplitude when an incongruous chord is accompanied by a syntactically incongruous word (Steinbeis & Koelsch, 2008). Interestingly, there is no effect of harmonic (in)congruity on the N400, a marker of semantic incongruity, when words are being judged (Koelsch et al., 2005a). Nor is there an effect of physical auditory incongruities that give rise to an MMN response when sequences of tones, rather than chords, accompany the words. The latter result further supports the separability of processes underlying the ERAN and MMN.

Semantics

The question of whether music conveys meaning is another interesting point of comparison between music and language. Music lacks sufficient specificity to unambiguously convey relationships between objects and concepts, but it can be evocative of scenes and emotions (e.g., Beethoven’s Pastoral Symphony). Even without direct reference to language, music specifies relationships among successive tonal elements (e.g., notes and chords), by virtue of the probability structures that govern music’s movement in tonal space. Less expected transitions create a sense of tension, whereas expected transitions release tension (Lerdahl & Krumhansl, 2007). Similarly, manipulations of timing (e.g., tempo, rhythm, and phrasing) parallel concepts associated with movement.

Evidence of music’s ability to interact with the brain’s semantic systems comes from two elegant studies that make use of simultaneously presented musical and linguistic material. The first (Koelsch et al., 2004) used short passages of real music to prime semantic contexts and then examined the ERP response to probe words that were either semantically congruous or incongruous with the concept ostensibly primed by the music. The incongruous words elicited an N400, indicating that the musical passage had indeed primed a meaningful concept as intended. The second (Steinbeis & Koelsch, 2008) used simultaneously presented words and chords, along with a moderately demanding dual task in which attention had to be oriented to both the musical and linguistic information, and found that semantically incongruous words affected the processing of irregular (Neapolitan) chords. The semantic incongruities did not affect the ERAN, but rather affected the N500, a late frontal negativity that follows the ERAN in response to incongruous chords and is interpreted as a stage of contextual integration of the anomalous chord (Koelsch et al., 2000).

Action

Although technological advances over the past few decades have made it possible to sequence sounds and produce music without the need for a human to actively generate each sound, music has been and continues to be intimately dependent on the motor systems of the human brain. Musical performance is the obvious context in which the motor systems are engaged, but the spectrum of actions associated with music extends beyond overt playing of an instrument. Still within the realm of overt action are movements that are coordinated with the music, be they complex dance moves or the simple tapping, nodding, or bobbing along with the perceived beat. Beyond that is the realm of covert action, in which no overt movements are detectable, but movements or sensory inputs are imagined. Even expectancy, the formation of mental images of anticipated sensory input, can be viewed as a form of action (Fuster, 2001; Schubotz, 2007).

As described above, listening to rhythms in the absence of overt action drives activity within premotor areas, the basal ganglia, and the cerebellum. Here, we examine musical engagement of the brain’s action system when some form of action, either overt or covert, is required. One of the beautiful things about music is the ability to engage the action system across varying degrees of complexity and still have it be a compelling musical experience, from simple isochronous synchronization with the beat to virtuosic polyrhythmic performance on an instrument. Within other domains of cognitive neuroscience, there is an extensive literature on timing and sequencing behaviors that shows differential engagement of the action systems as a function of complexity (Janata & Grafton, 2003), which is largely echoed in the emerging literature pertaining explicitly to music.

Sensorimotor Coordination

Tapping

Perhaps the simplest form of musical engagement is tapping isochronously with a metronome whereby a sense of meter is imparted through the periodic accentuation of a subset of the pacing events (e.g., accenting every other beat to impart the sense of a march or every third beat to impart the sense of a waltz). In these types of situations, a very simple coupling between the posterior auditory cortex and dorsal premotor areas is observed, in which the strength of the response in these two areas is positively correlated with the strength of the accent that drives the metric salience and the corresponding behavioral manifestation of longer taps to more salient events (Chen et al., 2006). When the synchronization demands increase as simple and complex metric and then nonmetric rhythms are introduced, behavioral variability increases, particularly among nonmusicians. Positively correlated with the increased performance variability is the activity within a more extensive network comprising the pSMA, SMA, ventral PMC, DLPFC, inferior parietal lobule, thalamus, and cerebellum, with a few additional differences between musicians and nonmusicians (Chen et al., 2008b). Thus, premotor areas are coupled with attention and working memory areas as the sensorimotor coupling demands increase.

Basic synchronization with a beat also provides a basis for interpersonal synchronization and musical communication. Simultaneous electroencephalogram (EEG) recordings from guitarists given the task of mentally synchronizing with a metronome and then commencing to play a melody together reveal that EEG activity recorded from electrodes situated above premotor areas becomes synchronized both within and between the performers (Lindenberger et al., 2009). Although the degree of interpersonal synchronization that arises by virtue of shared sensory input is difficult to estimate, such simultaneous recording approaches are bound to shape our understanding of socioemotional aspects of sensorimotor coordination.

Singing

The adjustment of one’s own actions based on sensory feedback is an important part of singing. In its simplest form, the repeated singing/chanting of a single note, in comparison to passive listening to complex tones, recruits auditory cortex, motor cortex, the SMA, and the cerebellum, with possible involvement of the anterior cingulate and basal ganglia (Brown et al., 2004b; Perry et al., 1999; Zarate & Zatorre, 2008). However, as pitch regulation demands are increased through electronic shifting of the produced pitch, additional premotor and attentional control regions such as the pSMA, ventral PMC, basal ganglia, and intraparietal sulcus are recruited across tasks that require the singer to either ignore the shift or try to compensate for it. The exact network of recruited areas depends on the amount of singing experience (Zarate & Zatorre, 2008). In regard to more melodic material, repetition of, or harmonization with, a melody also engages the anterior STG relative to monotonic vocalization (Brown et al., 2004b), although this activation is not seen during the singing of phrases from an aria that is familiar to the subject (Kleber et al., 2007).

Performance

Tasks involving the performance of complex musical sequences, beyond those that are reproduced via the imitation of a prescribed auditory pattern, afford an opportunity to observe the interaction of multiple brain systems. Performance can be externally guided by a musical score, internally guided as in the case of improvisation, or some combination of the two.

Score-Based

One of the first neuroimaging studies of musical functions examined performance of a Bach partita from a score and the simpler task of playing scales contrasted with listening to the corresponding music (Sergent et al., 1992). Aside from auditory, visual, and motor areas recruited by the basic processes of hearing, score reading, and motor execution, parietal regions were engaged, presumably by the visuomotor transformations associated with linking the symbols in the score with a semantic understanding of those symbols as well as associated actions (Bevan et al., 2003; McDonald, 2006; Schön et al., 2001; 2002). In addition, left premotor and IFG areas were engaged, presumably reflecting some of the sequencing complexity associated with the partita. A similar study (Parsons et al., 2005) in which bimanual performance of memorized Bach pieces was compared with bimanual playing of scales, found extensive activation of medial and lateral premotor areas, anterior auditory cortex, and subcortical activations in the thalamus and basal ganglia, presumably driven by the added demands of retrieving and executing complex sequences from memory.

Other studies complicate the interpretation that premotor cortices are driven by greater complexity in the music played. For example, separate manipulation of melodic and rhythmic complexity found some areas that were biased toward processing melodic information (mainly in the STG and calcarine sulcus), whereas others were biased toward processing rhythmic information (left inferior frontal cortex and inferior temporal gyrus), but there was no apparent activation of premotor areas (Bengtsson & Ullen, 2006).

Improvised

Music, like language, is often improvised with the intent of producing a syntactically (and semantically) coherent stream of auditory events. Given a task of continuing an unfamiliar melody or linguistic phrase with an improvised sung melodic or spoken linguistic phrase, a distributed set of brain areas is engaged in common for music and language, including the SMA, motor cortex, putamen, globus pallidus, cerebellum, posterior auditory cortex, and lateral inferior frontal cortex (Brodmann area 44/45), although the extent of activation in area 44/45 is greater for language (Brown et al., 2006). Lateral premotor areas are consistently found to be active during improvisation tasks that involve various degrees of piano performance realism. For instance, when unimanual production of melodies is constrained by a five-key keyboard and instructions that independently vary the amount of melodic or rhythmic freedom that can be exhibited by the subject, activity in mid-dorsal premotor cortex is modulated by complexity along both dimensions (Berkowitz & Ansari, 2008). A similar region is recruited during unimanual production while improvising around a visually presented score, both when the improvised performance must be memorized and when it is improvised freely without memorization (Bengtsson et al., 2007). A more dorsal premotor area is engaged during this type of improvisation also, mirroring effects found in a study of unimanual improvisation in which free improvisation without a score was contrasted with playing a jazz melody from memory (Limb & Braun, 2008). The latter study observed activation within an extensive network encompassing the ventrolateral prefrontal (Brodmann area 44), middle temporal, parietal, and cerebellar areas. Emotion areas in the ventromedial prefrontal cortex were active during improvisation also, providing the first neuroimaging evidence of how motor control areas are coupled with affective areas during a highly naturalistic task. Interestingly, both of the studies in which improvisation was least constrained also found substantial activation in extrastriate visual cortices that could not be attributed to visual input or score reading, suggesting perhaps that visual mental imagery processes accompany improvisation. One must note that in all of these studies, subjects were musicians, often with high levels of training.

Imagery

Music affords an excellent opportunity for examining mental imagery. It is common to sing to oneself or have a song stuck in one’s head, so it would seem that the brain’s sensorimotor system is covertly engaged by this mental pastime. Studies of musical imagery have tended to emphasize either the auditory or the motor components, with an interest in determining the degree to which the primary auditory and motor cortices are engaged.

Auditory Imagery

Activation of auditory association cortices is found using fMRI or PET when subjects sing through a short melody in order to compare the pitch of two notes corresponding to specific words in the lyric (Zatorre et al., 1996), or continue imagining the notes following the opening fragment of a television show theme song (Halpern & Zatorre, 1999). The activation of auditory areas is corroborated by EEG/MEG studies in which responses to an imagined note (Janata, 2001) or expected chord (Janata, 1995; Otsuka et al., 2008) closely resemble auditory evoked potentials with known sources in the auditory cortex (e.g., the N100). One study that used actual CD recordings of instrumental and vocal music found extensive activation of auditory association areas during silent gaps that were inserted into the recordings, with some activation of the primary auditory cortex when the gaps occurred in instrumental music (Kraemer et al., 2005). However, another study that used actual CD recordings to examine anticipatory imagery (the phenomenon of imagining the next track on a familiar album as soon as the current one ends) found no activation of the auditory cortices during the imagery period but extensive activation of a frontal and premotor network (Leaver et al., 2009).

Premotor areas, in particular the SMA, as well as frontal regions associated with memory retrieval, have been activated in most neuroimaging studies of musical imagery that have emphasized the auditory components (Halpern & Zatorre, 1999; Leaver et al., 2009; Zatorre et al., 1996), even under relatively simple conditions that could be regarded as maintenance of items in working memory during same/different comparison judgments of melodies or harmonized melodies lasting 4 to 6 seconds (Brown & Martinez, 2007). It has been argued, however, on the basis of comparing activations in an instrumental timbre imagery task with a visual object imagery task, that the frontal contribution may arise from general imagery task demands (Halpern et al., 2004). Nonetheless, effortful musical imagery tasks, such as those requiring the imagining of newly learned pairings of novel melodies (Leaver et al., 2009), imagery of expressive short phrases from an aria (Kleber et al., 2007), or imagining the sound or actions associated with a Mozart piano sonata when only the other modality is presented (Baumann et al., 2007), appear to be associated with activity in a widespread network of cortical and subcortical areas. This network matches quite well elements of both the ventral and dorsal attentional networks (Corbetta & Shulman, 2002) and the network observed when attentive listening to polyphonic music is contrasted with rest (Janata et al., 2002a).

Motor Imagery

Several studies have focused on motor imagery. In a group of pianists, violinists, and cellists, imagined performance of rehearsed pieces from the classical repertoire recruited frontal and parietal areas bearing resemblance to the dorsal attention network, together with the SMA and subcortical areas and cerebellum (Langheim et al., 2002). Imagining performing the right-hand part of one of Bartok’s Mikrokosmos while reading the score similarly activates the dorsal attentional network along with visual areas and the cerebellum (Meister et al., 2004). Interestingly, the SMA is not activated significantly when the source of the information to be imagined is external rather than internal (i.e., playing from memory), indicating that premotor and parietal elements of the dorsal attentional system coordinate with other brain regions based on the specific demands of the particular imagery task.

Emotion

The relationship between music and emotion is a complex one, and multiple mechanisms have been postulated through which music and the emotion systems of the brain can interact (Juslin & Vastfjall, 2008). Compared with the rather restricted set of paradigms that have been developed for probing the structural representations of music (e.g., tonality), the experimental designs for examining neural correlates of emotion in music are diverse. The precise emotional states that are captured in any given experiment, and their relevance to real music listening experiences, are often difficult to discern when the actual comparisons between experimental conditions are considered carefully. Manipulations have tended to fall into one of two categories: (1) normal music contrasted with the same music rendered dissonant or incoherent, or (2) selection or composition of musical stimuli to fall into discrete affective categories (e.g., happy, sad, fearful). In general, studies have found modulation of activity within limbic system areas of the brain.

When the relative dissonance of a mechanical performance of a piano melody is varied by the dissonance of the accompanying chords, activity in the right parahippocampal gyrus correlates positively with the increases in dissonance and perceived unpleasantness, whereas activity in the right orbitofrontal cortex and subcallosal cingulate increases as the consonance and perceived pleasantness increase (Blood et al., 1999). Similarly, when listening to pleasant dance music spanning a range of mostly classical genres is contrasted with listening to the same excerpts rendered dissonant and displeasing by mixing the original with two copies that have been pitch-shifted by a minor second and a tritone, medial temporal areas (the left parahippocampal gyrus, hippocampus, and bilateral amygdala) respond more strongly to the dissonant music, whereas areas more typically associated with listening to music (the auditory cortex, the left IFG, anterior insula and frontal operculum, and ventral premotor cortex) respond more strongly to the original pleasing versions (Koelsch et al., 2006). The same stimulus materials result in stronger theta activity along anterior midline sites in response to the pleasing music (Sammler et al., 2007). Listening to coherent excerpts of music rather than their temporally scrambled counterparts (Levitin & Menon, 2003) increases activity in parts of the dopaminergic pathway: the ventral tegmental area and nucleus accumbens (Menon & Levitin, 2005). These regions interact and are functionally connected to the left IFG, insula, hypothalamus, and orbitofrontal cortex, thus delineating a set of emotion-processing areas of the brain that are activated by music listening experiences that are relatively pleasing.

The results of the above-mentioned studies are somewhat heterogeneous and challenging to interpret because they depend on comparisons of relatively normal (and pleasing) music-listening experiences with highly abnormal (and displeasing) listening experiences, rather than comparing normal pleasing listening experiences with normal displeasing experiences. Nonetheless, modulation of the brain’s emotion circuitry is also observed when the statistical contrasts do not involve distorted materials. Listening to unfamiliar and pleasing popular music compared with silent rest activates the hippocampus, nucleus accumbens, ventromedial prefrontal cortex, right temporal pole, and anterior insula (Brown et al., 2004a). When listening to excerpts of unfamiliar and familiar popular music, activity in the VMPFC increases as the degree of experienced positive affect increases (Janata, 2009). Somewhat paradoxically, listening to music that elicits chills (goosebumps or shivers down the spine), something that is considered by many to be highly pleasing, reduces activity in the VMPFC (where activity increases tend to be associated with positive emotional responses), and activity in the right amygdala and in the left hippocampus/amygdala also decreases (Blood & Zatorre, 2001). Activity in other brain areas associated with positive emotional responses, such as the ventral striatum and orbitofrontal cortex, increases, along with activity in the insula and premotor areas (SMA and cerebellum).

The amygdala has been of considerable interest, given its general role in the processing of fearful stimuli. Patients with either unilateral or bilateral damage to the amygdala show impaired recognition of scary music and difficulty differentiating peaceful music from sad music (Gosselin et al., 2005, 2007). Right amygdala damage, in particular, leaves patients unable to distinguish intended fear in music from either positive or negative affective intentions (Gosselin et al., 2005). Chord sequences that contain irregular chord functions and elicit activity in the VLPFC are also regarded as less pleasing and elicit activity bilaterally in the amygdala (Koelsch et al., 2008). Violations of syntactic expectations also increase the perceived tension in a piece of music and are associated with changes in electrodermal activity, a measure of emotional arousal (Steinbeis et al., 2006).

Perhaps the most common association between music and emotion is the relationship between affective valence and the mode of the music: The minor mode is consistently associated with sadness, whereas the major mode is associated with happiness. Brain activations associated with mode manipulations are not as consistent across studies, however. In one study (Khalfa et al., 2005), the intended emotions of classical music pieces played on the piano were assessed on a five-point bivalent scale (sad to happy). Relative to major pieces, minor pieces elicited activity in the posterior cingulate and in the medial prefrontal cortex, whereas pieces in the major mode were not associated with any activity increases relative to minor pieces. A similar absence of response for major mode melodies relative to minor mode was observed in a different study in which unfamiliar monophonic melodies were used (Green et al., 2008). However, minor mode melodies elicited activity in the left parahippocampal gyrus and rostral anterior cingulate, indicating engagement of the limbic system, albeit in a unique constellation. A study in which responses to short four-chord sequences that established either major or minor tonalities were compared with repeated chords found bilateral activation of the IFG, irrespective of the mode (Mizuno & Sugishita, 2007). This result was consistent with the role of this region in the evaluation of musical syntax, but inconsistent with the other studies comparing major and minor musical material. Finally, a study using recordings of classical music that could be separated into distinct happy, sad, and neutral categories (Mitterschiffthaler et al., 2007) found that happy and sad excerpts strongly activated the auditory cortex bilaterally relative to neutral music. Responses to happy and sad excerpts (relative to neutral) were differentiated in that happy music elicited activity within the ventral striatum, several sections of the cingulate cortex, and the parahippocampal gyrus, whereas sad music was associated with activity in a region spanning the right hippocampus and amygdala, along with cingulate regions.

Anatomy, Plasticity, and Development

Music provides an excellent arena in which to study the effects of training and expertise on the brain, both in terms of structure and function (Munte et al., 2002), and also to examine structural differences in unique populations, such as those individuals who possess the ability to name pitches in isolation (absolute pitch) or those who have difficulty perceiving melodies (amusics).

Anatomical correlates of musical expertise have been observed both in perceptual and motor areas of the brain. Unsurprisingly, the auditory cortex has been the specific target of several investigations. An early investigation observed a larger planum temporale in the left hemisphere among musicians, although the effect was primarily driven by musicians with absolute pitch (Schlaug et al., 1995). In studies utilizing larger numbers of musically trained and untrained subjects, the volume of HG, where the primary and secondary auditory areas are situated, was found to increase with increasing musical aptitude (Schneider et al., 2002, 2005). The volumetric measures were positively correlated with the strength of the early (19–30 ms) stages of the evoked responses to amplitude-modulated pure tones (Schneider et al., 2002). Within the lateral extent of HG, the volume was positively correlated with the magnitude of a slightly later peak (50 ms post-stimulus) in the waveform elicited by sounds consisting of several harmonics. Remarkably, the hemispheric asymmetry in the volume of this region was indicative of the mode of perceptual processing of these sounds, with larger left-hemisphere volumes reflecting a bias toward processing the implied fundamental frequency of the sounds and larger right-hemisphere volumes indicating a bias toward spectral processing of the sounds (Schneider et al., 2005). In general, the auditory cortex appears to respond more strongly to musical sounds in musicians (Pantev et al., 1998) and as a function of the instrument with which they have had the most experience (Margulis et al., 2009).

Whole brain analyses using techniques such as cortical thickness mapping or voxel-based morphometry (VBM) have also revealed differences between individuals trained on musical instruments and those with no training, although there is considerable variability in the regions identified in the different studies, possibly a consequence of differences in the composition of the samples and the mapping technique used (Bermudez et al., 2009). Certain findings, such as a greater volume in trained pianists of primary motor and somatosensory areas and cerebellar regions responsible for hand and finger movements (Gaser & Schlaug, 2003), are relatively easy to interpret, and they parallel findings of stronger evoked responses in the hand regions of the right hemisphere that control the left (fingering) hand of violinists (Pascual-Leone et al., 1994). Larger volumes in musicians are also observed in lateral prefrontal cortex, both ventrally (Bermudez et al., 2009; Gaser & Schlaug, 2003; Sluming et al., 2002) and dorsally along the middle frontal gyrus (Bermudez et al., 2009). However, one particular type of musical aptitude, absolute pitch, is associated with decreased cortical thickness in dorsolateral frontal cortex in similar areas that are associated with increases in activation in listeners with absolute pitch relative to other musically trained subjects (Zatorre et al., 1998). Another paradox presented by VBM is the finding of greater gray-matter volumes in amusic subjects in some of the same ventrolateral regions that show greater cortical thickness in musicians (Bermudez et al., 2009; Hyde et al., 2007).
The anatomical differences that are observed as a function of musical training are perhaps better placed into a functional context when one observes the effects of short-term training on neural responses. Nonmusicians who were trained over the course of two weeks to play a cadence consisting of broken chords on a piano keyboard exhibited a stronger MMN to deviant notes in similar note patterns compared either with their responses before receiving training or with a group of subjects who received training by listening and making judgments about the sequences performed by the trained group (Lappe et al., 2008). Similarly, nonmusicians who, over the course of 5 days, learned to perform five-note melodies showed greater activation bilaterally in the dorsal IFG (Broca’s area) and lateral premotor areas when listening to the trained melodies compared with listening to comparison melodies on which they had not trained (Lahav et al., 2007). Thus, perceptual responses are stronger following sensorimotor training within the networks that are utilized during the training. When the training involves reading music from a score, medial superior parietal areas also show effects of training (Stewart et al., 2003). Both mental and physical practice of five-finger piano exercises are capable of strengthening finger representations in the motor cortex across a series of days, as measured by reduced transcranial magnetic stimulation (TMS) thresholds for eliciting movements (Pascual-Leone et al., 1995).


Disorders

Musical behaviors, like any other behaviors, are disrupted when the functioning of the neural substrates that support those behaviors is impaired. Neuropsychological investigations provide intriguing insights into component processes in musical behaviors and their likely locations in the brain. Aside from the few studies mentioned in the sections above of groups of patients who underwent brain surgery, there are many case studies documenting the effects of brain insults, typically caused by stroke, on musical functions (Brust, 2003). A synthesis of the findings from these many studies (Stewart et al., 2006) is beyond the scope of this chapter, as is a discussion of the burgeoning topic of using music for neurorehabilitation (Belin et al., 1996; Sarkamo et al., 2008; Thaut, 2005). Here, the discussion of brain disorders in relation to music is restricted to amusia, a music-specific disorder.

Amusia

Amusia, commonly referred to as “tone deafness,” refers to a profound impairment in accurately perceiving melodies. The impairment arises not so much from an inability to discriminate one note in a melody from the next (i.e., to recognize that a different note is being played), but rather from the inability to determine the direction of the pitch change (Ayotte et al., 2002; Foxton et al., 2004). The ability to perceive the direction of pitch change from one note to the next is critical to discerning the contour of the melody, that is, its defining feature. The impairment may be restricted to processing of melodic rather than rhythmic structure (Hyde & Peretz, 2004), although processing of rhythms is impaired when the pitch of individual notes is also changing (Foxton et al., 2006).

Impaired identification of pitch direction, but not basic pitch discrimination, has been observed in patients with right temporal lobe excisions that encroach on auditory cortex in HG (Johnsrude et al., 2000), which is likely to underlie the bias toward the right hemisphere for the processing of melodic information (Warrier & Zatorre, 2004). A diffusion tensor imaging study of amusics and normal controls found that the volume of the superior arcuate fasciculus in the right hemisphere was consistently smaller in the group of amusics than in normal controls (Loui et al., 2009a). The arcuate fasciculus connects the temporal and frontal lobes, specifically the posterior superior and middle temporal gyri and the IFG. VBM results additionally indicate a structural anomaly in a small region of the IFG in amusics (Hyde et al., 2006, 2007).
Taken together, the neuroimaging studies that have implicated the IFG in the processing of musical syntax and temporal structure (Koelsch et al., 2002c, 2005b; Levitin & Menon, 2003; Maess et al., 2001; Patel, 2003; Tillmann et al., 2003), the behavioral and structural imaging data from amusics, and the studies of deficits in melody processing in right temporal lobe lesion patients support a view that the ability to perceive, appreciate, and remember melodies depends in large part on intact functioning of a perception/action circuit in the right hemisphere (Fuster, 2000; Loui et al., 2009a).


Music and the Brain’s Ensemble of Functions

During the past 15 to 20 years, there has been a tremendous increase in the amount of knowledge pertaining to the ways in which the human brain interacts with music. Although it is expeditious for those outside the field to regard music as a tidy circumscribed object or unitary process that is bound to have a concrete representation in the brain, or perhaps conversely a complex cultural phenomenon for which there is no hope of understanding its neural basis, even a modest amount of deeper contemplation reveals music to be a multifaceted phenomenon that is integral to human life. My objective in this chapter was to provide an overview of the variety of musical processes that contribute to musical behaviors and experiences, and of the way that these processes interact with various domain-general brain systems. Clearly, music is a diverse phenomenon, and musical stimuli and musical tasks are capable of reaching almost every part of the brain. Given this complexity, is there any hope for generating process models of musical functions and behaviors that can lead to a grand unified theory of music and the brain?

Figure 7.2 A highly schematized and simplified summary of brain regions involved in different facets of music psychological processes. On the left is a lateral view of the right cerebral hemisphere. A medial view of the right hemisphere is shown on the right. There is no intent to imply lateralization of function in this portrayal. White lettering designates the different lobes. The colored circles correspond to the colored labels in the box below. AG, angular gyrus; DLPFC, dorsolateral prefrontal cortex; HG, Heschl’s gyrus; IFG, inferior frontal gyrus; IPS, intraparietal sulcus; MPFC, medial prefrontal cortex; PMC, premotor cortex; pSMA, pre–supplementary motor area; PT, planum temporale; SMA, supplementary motor area; STG, superior temporal gyrus; VLPFC, ventrolateral prefrontal cortex.

Obligatory components of process models are boxes with arrows between them, in which each box refers to a discrete function, and perhaps an associated brain area. Such models have been proposed with respect to music (Koelsch & Siebel, 2005; Peretz & Coltheart, 2003). Within such models, some component processes are considered music-specific, whereas others represent shared processes with other brain functions (e.g., language, emotion). The issue of music specificity, or modularity of musical functions, is of considerable interest, in large part because of its evolutionary implications (Peretz, 2006). The strongest evidence for modularity of musical functions derives from studies of individuals with brain damage in whom specific musical functions are selectively impaired (Peretz, 2006; Stewart et al., 2006). Such specificity in loss of function is remarkable given the usual extent of brain damage. Indeed, inferences about modularity must be tempered when not all possible parallel functions in other domains have been considered. The processing functions of the right lateral temporal lobes provide a nice example. As reviewed in this chapter and elsewhere (Zatorre et al., 2002), the neuropsychological and functional and structural neuroanatomical evidence suggests that the auditory cortex in the right hemisphere is more specialized than the left for the processing of pitch, pitch relationships, and thereby melody. Voice-selective regions of the auditory cortex are highly selective in the right hemisphere (Belin et al., 2000), and in general a large extent of the right lateral temporal lobes appears to be important for the processing of emotional prosody (Ethofer et al., 2009; Ross & Monnot, 2008; Schirmer & Kotz, 2006). Given evidence of parallels between melody and prosody, such as the connotation of sadness by an interval of a minor third in both music and speech (Curtis & Bharucha, 2010), or deficits among amusic individuals in processing speech intonation contours (Patel et al., 2005), it is likely that contour-related music and language functions are highly intertwined in the right temporal lobe.

Perhaps more germane to the question of how the brain processes music is the question of how one engages with the music. Few would doubt that the brain of someone dancing a tango at a club would show more extensive engagement with the music than that of someone incidentally hearing music while choosing what cereal to buy at a supermarket.
It would therefore seem that any given result from the cognitive neuroscience of music literature must be interpreted with regard to the experiential situation of the participants, both in terms of the affective, motivational, and task/goal states they might find themselves in, and in terms of the relationship of the (often abstracted) musical stimuli that they are being asked to interact with to the music they would normally interact with. In this regard, approaches that explicitly relate the dynamic time-varying properties of real musical stimuli to the behaviors and brain responses they engender will become increasingly important.

Figure 7.2 is a highly schematized (and incomplete) summary of sets of brain areas that may be recruited by different musical elements and more general processes that mediate musical experiences. It is not intended to be a process model or to represent the outcome of a quantitative meta-analysis. Rather, it serves mainly to suggest that one goal of research in the cognitive neuroscience of music might be to serve as a model system for understanding the coordination of the many processes that underlie creative goal-directed behavior in humans.

References

Adler, D. S. (2009). Archaeology: The earliest musical tradition. Nature, 460, 695–696.
Ayotte, J., Peretz, I., & Hyde, K. (2002). Congenital amusia: A group study of adults afflicted with a music-specific disorder. Brain, 125, 238–251.
Ayotte, J., Peretz, I., Rousseau, I., Bard, C., & Bojanowski, M. (2000). Patterns of music agnosia associated with middle cerebral artery infarcts. Brain, 123 (Pt 9), 1926–1938.
Bangert, M., Peschel, T., Schlaug, G., Rotte, M., Drescher, D., Hinrichs, H., Heinze, H.-J., & Altenmüller, E. (2006). Shared networks for auditory and motor processing in professional pianists: Evidence from fMRI conjunction. NeuroImage, 30, 917–926.
Barnes, R., & Jones, M. R. (2000). Expectancy, attention, and time. Cognitive Psychology, 41, 254–311.
Baumann, S., Koeneke, S., Schmidt, C. F., Meyer, M., Lutz, K., & Jancke, L. (2007). A network for audio-motor coordination in skilled pianists and non-musicians. Brain Research, 1161, 65–78.
Beisteiner, R., Erdler, M., Mayer, D., Gartus, A., Edward, V., Kaindl, T., Golaszewski, S., Lindinger, G., & Deecke, L. (1999). A marker for differentiation of capabilities for processing of musical harmonies as detected by magnetoencephalography in musicians. Neuroscience Letters, 277, 37–40.
Belin, P., VanEeckhout, P., Zilbovicius, M., Remy, P., Francois, C., Guillaume, S., Chain, F., Rancurel, G., & Samson, Y. (1996). Recovery from nonfluent aphasia after melodic intonation therapy: A PET study. Neurology, 47, 1504–1511.
Belin, P., Zatorre, R. J., Lafaille, P., Ahad, P., & Pike, B. (2000). Voice-selective areas in human auditory cortex. Nature, 403, 309–312.
Bengtsson, S. L., Csikszentmihalyi, M., & Ullen, F. (2007). Cortical regions involved in the generation of musical structures during improvisation in pianists. Journal of Cognitive Neuroscience, 19, 830–842.
Bengtsson, S. L., & Ullen, F. (2006). Dissociation between melodic and rhythmic processing during piano performance from musical scores. NeuroImage, 30, 272–284.
Berkowitz, A. L., & Ansari, D. (2008). Generation of novel motor sequences: The neural correlates of musical improvisation. NeuroImage, 41, 535–543.
Bermudez, P., Lerch, J. P., Evans, A. C., & Zatorre, R. J. (2009). Neuroanatomical correlates of musicianship as revealed by cortical thickness and voxel-based morphometry. Cerebral Cortex, 19, 1583–1596.
Bermudez, P., & Zatorre, R. J. (2005). Conditional associative memory for musical stimuli in nonmusicians: Implications for absolute pitch. Journal of Neuroscience, 25, 7718–7723.
Besson, M., & Faïta, F. (1995). An event-related potential (ERP) study of musical expectancy: Comparison of musicians with nonmusicians. Journal of Experimental Psychology: Human Perception and Performance, 21, 1278–1296.

Besson, M., & Macar, F. (1987). An event-related potential analysis of incongruity in music and other non-linguistic contexts. Psychophysiology, 24, 14–25.
Bevan, A., Robinson, G., Butterworth, B., & Cipolotti, L. (2003). To play “B” but not to say “B”: Selective loss of letter names. Neurocase, 9, 118–128.
Bigand, E., Vieillard, S., Madurell, F., Marozeau, J., & Dacquet, A. (2005). Multidimensional scaling of emotional responses to music: The effect of musical expertise and of the duration of the excerpts. Cognition & Emotion, 19, 1113–1139.
Blood, A. J., & Zatorre, R. J. (2001). Intensely pleasurable responses to music correlate with activity in brain regions implicated in reward and emotion. Proceedings of the National Academy of Sciences USA, 98, 11818–11823.
Blood, A. J., Zatorre, R. J., Bermudez, P., & Evans, A. C. (1999). Emotional responses to pleasant and unpleasant music correlate with activity in paralimbic brain regions. Nature Neuroscience, 2, 382–387.
Brown, S., & Martinez, M. J. (2007). Activation of premotor vocal areas during musical discrimination. Brain and Cognition, 63, 59–69.
Brown, S., Martinez, M. J., & Parsons, L. M. (2004a). Passive music listening spontaneously engages limbic and paralimbic systems. Neuroreport, 15, 2033–2037.
Brown, S., Martinez, M. J., & Parsons, L. M. (2006). Music and language side by side in the brain: A PET study of the generation of melodies and sentences. European Journal of Neuroscience, 23, 2791–2803.
Brown, S., Martinez, M. J., Hodges, D. A., Fox, P. T., & Parsons, L. M. (2004b). The song system of the human brain. Cognitive Brain Research, 20, 363–375.
Brust, J. C. M. (2003). Music and the neurologist: A historical perspective. In I. Peretz & R. J. Zatorre (Eds.), Cognitive neuroscience of music (pp. 181–191). Oxford, UK: Oxford University Press.
Caclin, A., Brattico, E., Tervaniemi, M., Naatanen, R., Morlet, D., Giard, M. H., & McAdams, S. (2006). Separate neural processing of timbre dimensions in auditory sensory memory. Journal of Cognitive Neuroscience, 18, 1959–1972.
Caclin, A., Giard, M.-H., Smith, B. K., & McAdams, S. (2007). Interactive processing of timbre dimensions: A Garner interference study. Brain Research, 1138, 159–170.
Caclin, A., McAdams, S., Smith, B. K., & Giard, M. H. (2008). Interactive processing of timbre dimensions: An exploration with event-related potentials. Journal of Cognitive Neuroscience, 20, 49–64.

Caclin, A., McAdams, S., Smith, B. K., & Winsberg, S. (2005). Acoustic correlates of timbre space dimensions: A confirmatory study using synthetic tones. Journal of the Acoustical Society of America, 118, 471–482.

Carrion, R. E., & Bly, B. M. (2008). The effects of learning on event-related potential correlates of musical expectancy. Psychophysiology, 45, 759–775.
Chen, J. L., Penhune, V. B., & Zatorre, R. J. (2008a). Listening to musical rhythms recruits motor regions of the brain. Cerebral Cortex, 18, 2844–2854.
Chen, J. L., Penhune, V. B., & Zatorre, R. J. (2008b). Moving on time: Brain network for auditory-motor synchronization is modulated by rhythm complexity and musical training. Journal of Cognitive Neuroscience, 20, 226–239.
Chen, J. L., Zatorre, R. J., & Penhune, V. B. (2006). Interactions between auditory and dorsal premotor cortex during synchronization to musical rhythms. NeuroImage, 32, 1771–1781.
Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3, 201–215.
Curtis, M. E., & Bharucha, J. J. (2010). The minor third communicates sadness in speech, mirroring its use in music. Emotion, 10, 335–348.
Dennis, M., & Hopyan, T. (2001). Rhythm and melody in children and adolescents after left or right temporal lobectomy. Brain and Cognition, 47, 461–469.
Dowling, W. J. (1978). Scale and contour: Two components of a theory of memory for melodies. Psychological Review, 85, 341–354.
Ethofer, T., De Ville, D. V., Scherer, K., & Vuilleumier, P. (2009). Decoding of emotional information in voice-sensitive cortices. Current Biology, 19, 1028–1033.
Fedorenko, E., Patel, A., Casasanto, D., Winawer, J., & Gibson, E. (2009). Structural integration in language and music: Evidence for a shared system. Memory and Cognition, 37, 1–9.
Foxton, J. M., Dean, J. L., Gee, R., Peretz, I., & Griffiths, T. D. (2004). Characterization of deficits in pitch perception underlying “tone deafness.” Brain, 127, 801–810.
Foxton, J. M., Nandy, R. K., & Griffiths, T. D. (2006). Rhythm deficits in “tone deafness.” Brain and Cognition, 62, 24–29.
Fuster, J. M. (2000). Executive frontal functions. Experimental Brain Research, 133, 66–70.
Fuster, J. M. (2001). The prefrontal cortex—an update: Time is of the essence. Neuron, 30, 319–333.
Gaser, C., & Schlaug, G. (2003). Brain structures differ between musicians and non-musicians. Journal of Neuroscience, 23, 9240–9245.
Gilbert, S. J., Spengler, S., Simons, J. S., Steele, J. D., Lawrie, S. M., Frith, C. D., & Burgess, P. W. (2006). Functional specialization within rostral prefrontal cortex (Area 10): A meta-analysis. Journal of Cognitive Neuroscience, 18, 932–948.
Gosselin, N., Peretz, I., Johnsen, E., & Adolphs, R. (2007). Amygdala damage impairs emotion recognition from music. Neuropsychologia, 45, 236–244.
Gosselin, N., Peretz, I., Noulhiane, M., Hasboun, D., Beckett, C., Baulac, M., & Samson, S. (2005). Impaired recognition of scary music following unilateral temporal lobe excision. Brain, 128, 628–640.
Goydke, K. N., Altenmuller, E., Moller, J., & Munte, T. F. (2004). Changes in emotional tone and instrumental timbre are reflected by the mismatch negativity. Cognitive Brain Research, 21, 351–359.
Grahn, J. A., & Brett, M. (2007). Rhythm and beat perception in motor areas of the brain. Journal of Cognitive Neuroscience, 19, 893–906.
Grahn, J. A., & McAuley, J. D. (2009). Neural bases of individual differences in beat perception. NeuroImage, 47, 1894–1903.
Grahn, J. A., & Rowe, J. B. (2009). Feeling the beat: Premotor and striatal interactions in musicians and nonmusicians during beat perception. Journal of Neuroscience, 29, 7540–7548.
Green, A. C., Baerentsen, K. B., Stodkilde-Jorgensen, H., Wallentin, M., Roepstorff, A., & Vuust, P. (2008). Music in minor activates limbic structures: A relationship with dissonance? Neuroreport, 19, 711–715.
Grey, J. M. (1977). Multidimensional perceptual scaling of musical timbres. Journal of the Acoustical Society of America, 61, 1270–1277.
Griffiths, T. D., Buchel, C., Frackowiak, R. S. J., & Patterson, R. D. (1998). Analysis of temporal structure in sound by the human brain. Nature Neuroscience, 1, 422–427.
Halpern, A. R., & Zatorre, R. J. (1999). When that tune runs through your head: A PET investigation of auditory imagery for familiar melodies. Cerebral Cortex, 9, 697–704.
Halpern, A. R., Zatorre, R. J., Bouffard, M., & Johnson, J. A. (2004). Behavioral and neural correlates of perceived and imagined musical timbre. Neuropsychologia, 42, 1281–1292.
Hickok, G., Buchsbaum, B., Humphries, C., & Muftuler, T. (2003). Auditory-motor interaction revealed by fMRI: Speech, music, and working memory in area Spt. Journal of Cognitive Neuroscience, 15, 673–682.

Hyde, K. L., Lerch, J. P., Zatorre, R. J., Griffiths, T. D., Evans, A. C., & Peretz, I. (2007). Cortical thickness in congenital amusia: When less is better than more. Journal of Neuroscience, 27, 13028–13032.
Hyde, K. L., & Peretz, I. (2004). Brains that are out of tune but in time. Psychological Science, 15, 356–360.
Hyde, K. L., Zatorre, R. J., Griffiths, T. D., Lerch, J. P., & Peretz, I. (2006). Morphometry of the amusic brain: A two-site study. Brain, 129, 2562–2570.
Janata, P. (1995). ERP measures assay the degree of expectancy violation of harmonic contexts in music. Journal of Cognitive Neuroscience, 7, 153–164.
Janata, P. (2001). Brain electrical activity evoked by mental formation of auditory expectations and images. Brain Topography, 13, 169–193.
Janata, P. (2005). Brain networks that track musical structure. Annals of the New York Academy of Sciences, 1060, 111–124.
Janata, P. (2007). Navigating tonal space. In W. B. Hewlett, E. Selfridge-Field, & E. Correia (Eds.), Tonal theory for the digital age (pp. 39–50). Stanford, CA: Center for Computer Assisted Research in the Humanities.
Janata, P. (2009). The neural architecture of music-evoked autobiographical memories. Cerebral Cortex, 19, 2579–2594.
Janata, P., Birk, J. L., Tillmann, B., & Bharucha, J. J. (2003). Online detection of tonal popout in modulating contexts. Music Perception, 20, 283–305.
Janata, P., Birk, J. L., Van Horn, J. D., Leman, M., Tillmann, B., & Bharucha, J. J. (2002b). The cortical topography of tonal structures underlying Western music. Science, 298, 2167–2170.
Janata, P., & Grafton, S. T. (2003). Swinging in the brain: Shared neural substrates for behaviors related to sequencing and music. Nature Neuroscience, 6, 682–687.
Janata, P., Tillmann, B., & Bharucha, J. J. (2002a). Listening to polyphonic music recruits domain-general attention and working memory circuits. Cognitive, Affective, & Behavioral Neuroscience, 2, 121–140.
Jentschke, S., & Koelsch, S. (2009). Musical training modulates the development of syntax processing in children. NeuroImage, 47, 735–744.

Johnsrude, I. S., Penhune, V. B., & Zatorre, R. J. (2000). Functional specificity in the right human auditory cortex for perceiving pitch direction. Brain, 123, 155–163.
Jones, M. R. (1976). Time, our lost dimension—toward a new theory of perception, attention, and memory. Psychological Review, 83, 323–355.
Jones, M. R., & Boltz, M. (1989). Dynamic attending and responses to time. Psychological Review, 96, 459–491.
Jones, M. R., Moynihan, H., MacKenzie, N., & Puente, J. (2002). Temporal aspects of stimulus-driven attending in dynamic arrays. Psychological Science, 13, 313–319.
Juslin, P. N., & Vastfjall, D. (2008). Emotional responses to music: The need to consider underlying mechanisms. Behavioral and Brain Sciences, 31, 559–621.
Khalfa, S., Schon, D., Anton, J. L., & Liegeois-Chauvel, C. (2005). Brain regions involved in the recognition of happiness and sadness in music. Neuroreport, 16, 1981–1984.
Kleber, B., Birbaumer, N., Veit, R., Trevorrow, T., & Lotze, M. (2007). Overt and imagined singing of an Italian aria. NeuroImage, 36, 889–900.
Klostermann, E. C., Loui, P., & Shimamura, A. P. (2009). Activation of right parietal cortex during memory retrieval of nonlinguistic auditory stimuli. Cognitive, Affective, & Behavioral Neuroscience, 9, 242–248.
Koelsch, S. (2009). Music-syntactic processing and auditory memory: Similarities and differences between ERAN and MMN. Psychophysiology, 46, 179–190.
Koelsch, S., & Mulder, J. (2002). Electric brain responses to inappropriate harmonies during listening to expressive music. Clinical Neurophysiology, 113, 862–869.
Koelsch, S., & Siebel, W. A. (2005). Towards a neural basis of music perception. Trends in Cognitive Sciences, 9, 578–584.
Koelsch, S., Schmidt, B. H., & Kansok, J. (2002a). Effects of musical expertise on the early right anterior negativity: An event-related brain potential study. Psychophysiology, 39, 657–663.
Koelsch, S., Schroger, E., & Gunter, T. C. (2002b). Music matters: Preattentive musicality of the human brain. Psychophysiology, 39, 38–48.
Koelsch, S., Fritz, T., & Schlaug, G. (2008). Amygdala activity can be modulated by unexpected chord functions during music listening. Neuroreport, 19, 1815–1819.
Koelsch, S., Gunter, T., Friederici, A. D., & Schroger, E. (2000). Brain indices of music processing: “Nonmusicians” are musical. Journal of Cognitive Neuroscience, 12, 520–541.
Koelsch, S., Gunter, T., Schroger, E., & Friederici, A. D. (2003a). Processing tonal modulations: An ERP study. Journal of Cognitive Neuroscience, 15, 1149–1159.
Koelsch, S., Gunter, T. C., Wittfoth, M., & Sammler, D. (2005a). Interaction between syntax processing in language and in music: An ERP study. Journal of Cognitive Neuroscience, 17, 1565–1577.

Koelsch, S., Jentschke, S., Sammler, D., & Mietchen, D. (2007). Untangling syntactic and sensory processing: An ERP study of music perception. Psychophysiology, 44, 476–490.
Koelsch, S., Fritz, T., Schulze, K., Alsop, D., & Schlaug, G. (2005b). Adults and children processing music: An fMRI study. NeuroImage, 25, 1068–1076.
Koelsch, S., Fritz, T., v Cramon, D. Y., Muller, K., & Friederici, A. D. (2006). Investigating emotion with music: An fMRI study. Human Brain Mapping, 27, 239–250.
Koelsch, S., Gunter, T. C., Schroger, E., Tervaniemi, M., Sammler, D., & Friederici, A. D. (2001). Differentiating ERAN and MMN: An ERP study. Neuroreport, 12, 1385–1389.
Koelsch, S., Gunter, T. C., v Cramon, D. Y., Zysset, S., Lohmann, G., & Friederici, A. D. (2002c). Bach speaks: A cortical “language-network” serves the processing of music. NeuroImage, 17, 956–966.
Koelsch, S., Grossmann, T., Gunter, T. C., Hahne, A., Schroger, E., & Friederici, A. D. (2003b). Children processing music: Electric brain responses reveal musical competence and gender differences. Journal of Cognitive Neuroscience, 15, 683–693.
Koelsch, S., Kasper, E., Gunter, T. C., Sammler, D., Schulze, K., & Friederici, A. D. (2004). Music, language, and meaning: Brain signatures of semantic processing. Nature Neuroscience, 7, 302–307.
Koelsch, S., Schulze, K., Sammler, D., Fritz, T., Muller, K., & Gruber, O. (2009). Functional architecture of verbal and tonal working memory: An fMRI study. Human Brain Mapping, 30, 859–873.
Kraemer, D. J. M., Macrae, C. N., Green, A. E., & Kelley, W. M. (2005). Musical imagery: Sound of silence activates auditory cortex. Nature, 434, 158.
Krumhansl, C. L. (1990). Cognitive foundations of musical pitch. New York: Oxford University Press.
Krumhansl, C. L., & Kessler, E. J. (1982). Tracing the dynamic changes in perceived tonal organization in a spatial representation of musical keys. Psychological Review, 89, 334–368.
Lahav, A., Saltzman, E., & Schlaug, G. (2007). Action representation of sound: Audiomotor recognition network while listening to newly acquired actions. Journal of Neuroscience, 27, 308–314.
Lakatos, S. (2000). A common perceptual space for harmonic and percussive timbres. Perception and Psychophysics, 62, 1426–1439.
Langheim, F. J. P., Callicott, J. H., Mattay, V. S., Duyn, J. H., & Weinberger, D. R. (2002). Cortical systems associated with covert music rehearsal. NeuroImage, 16, 901–908.

Lappe, C., Herholz, S. C., Trainor, L. J., & Pantev, C. (2008). Cortical plasticity induced by short-term unimodal and multimodal musical training. Journal of Neuroscience, 28, 9632–9639.
Large, E. W., & Jones, M. R. (1999). The dynamics of attending: How people track time-varying events. Psychological Review, 106, 119–159.
Large, E. W., & Palmer, C. (2002). Perceiving temporal regularity in music. Cognitive Science, 26, 1–37.
Leaver, A. M., Van Lare, J., Zielinski, B., Halpern, A. R., & Rauschecker, J. P. (2009). Brain activation during anticipation of sound sequences. Journal of Neuroscience, 29, 2477–2485.
Leino, S., Brattico, E., Tervaniemi, M., & Vuust, P. (2007). Representation of harmony rules in the human brain: Further evidence from event-related potentials. Brain Research, 1142, 169–177.
Lerdahl, F., & Krumhansl, C. L. (2007). Modeling tonal tension. Music Perception, 24, 329–366.
Levitin, D. J., & Menon, V. (2003). Musical structure is processed in “language” areas of the brain: A possible role for Brodmann Area 47 in temporal coherence. NeuroImage, 20, 2142–2152.
Liegeois-Chauvel, C., Peretz, I., Babai, M., Laguitton, V., & Chauvel, P. (1998). Contribution of different cortical areas in the temporal lobes to music processing. Brain, 121, 1853–1867.

Limb, C. J., & Braun, A. R. (2008). Neural substrates of spontaneous musical performance: An fMRI study of jazz improvisation. PLoS One, 3, e1679.
Lindenberger, U., Li, S. C., Gruber, W., & Muller, V. (2009). Brains swinging in concert: Cortical phase synchronization while playing guitar. BMC Neuroscience, 10 (22), 1–12.
London, J. (2004). Hearing in time: Psychological aspects of musical meter. New York: Oxford University Press.
Loui, P., Alsop, D., & Schlaug, G. (2009a). Tone deafness: A new disconnection syndrome? Journal of Neuroscience, 29, 10215–10220.
Loui, P., Grent-’t-Jong, T., Torpey, D., & Woldorff, M. (2005). Effects of attention on the neural processing of harmonic syntax in Western music. Cognitive Brain Research, 25, 678–687.
Loui, P., Wu, E. H., Wessel, D. L., & Knight, R. T. (2009b). A generalized mechanism for perception of pitch patterns. Journal of Neuroscience, 29, 454–459.

Page 35 of 42

Cognitive Neuroscience of Music Maess, B., Koelsch, S., Gunter, T. C., & Friederici, A. D. (2001). Musical syntax is processed in Broca’s area: An MEG study. Nature Neuroscience, 4, 540–545. Margulis, E. H., Mlsna, L. M., Uppunda, A. K., Parrish, T. B., & Wong, P. C. M. (2009). Selective neurophysiologic responses to music in instrumentalists with different listening biographies. Human Brain Mapping, 30, 267–275. McAdams, S., Winsberg, S., Donnadieu, S., Desoete, G., & Krimphoff, J. (1995). Perceptual scaling of synthesized musical timbres: Common dimensions, specificities, and latent sub ject classes. Psychological Research, 58, 177–192. McDonald, I. (2006). Musical alexia with recovery: A personal account. Brain, 129, 2554– 2561. Meister, I. G., Krings, T., Foltys, H., Boroojerdi, B., Muller, M., Topper, R., & Thron, A. (2004). Playing piano in the mind: An fMRI study on music imagery and performance in pianists. Cognitive Brain Research, 19, 219–228. Menon, V., & Levitin, D. J. (2005). The rewards of music listening: Response and physio logical connectivity of the mesolimbic system. NeuroImage, 28, 175–184. Menon, V., Levitin, D. J., Smith, B. K., Lembke, A., Krasnow, B. D., Glazer, D., Glover, G. H., & McAdams, S. (2002). Neural correlates of timbre change in harmonic sounds. NeuroI mage, 17, 1742–1754. Meyer, M., Baumann, S., & Jancke, L. (2006). Electrical brain imaging reveals spatio-tem poral dynamics of timbre perception in humans. NeuroImage, 32, 1510–1523. Miranda, R. A., & Ullman, M. T. (2007). Double dissociation between rules and memory in music: An event-related potential study. NeuroImage, 38, 331–345. Mitterschiffthaler, M. T., Fu, C. H. Y., Dalton, J. A., Andrew, C. M., & Williams, S. C. R. (2007). A functional MRI study of happy and sad affective states induced by classical mu sic. Human Brain Mapping, 28, 1150–1162. Mizuno, T., & Sugishita, M. (2007). Neural correlates underlying perception of tonality-re lated emotional contents. 
Neuroreport, 18, 1651–1655. Munte, T. F., Altenmuller, E., & Jancke, L. (2002). The musician’s brain as a model of neu roplasticity. Nature Reviews, Neuroscience, 3, 473–478. Näätänen, R. (1992). Attention and brain function. Hillsdale, NJ: Erlbaum. Näätänen, R., & Winkler, I. (1999). The concept of auditory stimulus representation in cognitive neuroscience. Psychological Bulletin, 125, 826–859. Nan, Y., Knosche, T. R., Zysset, S., & Friedericil, A. D. (2008) Cross-cultural music phrase processing: An fMRI study. Human Brain Mapping, 29, 312–328. Page 36 of 42

Cognitive Neuroscience of Music Northoff, G., & Bermpohl, F. (2004). Cortical midline structures and the self. Trends in Cognitive Sciences, 8, 102–107. Northoff, G., Heinzel, A., Greck, M., Bennpohl, F., Dobrowolny, H., & Panksepp, J. (2006). Self-referential processing in our brain: A meta-analysis of imaging studies on the self. NeuroImage, 31, 440–457. Otsuka, A., Tamaki, Y., & Kuriki, S. (2008). Neuromagnetic responses in silence after mu sical chord sequences. Neuroreport, 19, 1637–1641. Paller, K. A., McCarthy, G., & Wood, C. C. (1992). Event-related potentials elicited by de viant endings to melodies. Psychophysiology, 29, 202–206. Palmer, C., & Krumhansl, C. L. (1990). Mental representations for musical meter. Journal of Experimental Psychology. Human Perception and Performance, 16, 728–741. Pantev, C., Oostenveld, R., Engelien, A., Ross, B., Roberts, L. E., & Hoke, M. (1998). In creased auditory cortical representation in musicians. Nature, 392, 811–814. Parsons, L. M., Sergent, J., Hodges, D. A., & Fox, P. T. (2005). The brain basis of piano per formance. Neuropsychologia, 43, 199–215. Pascual-Leone, A., Dang, N., Cohen, L. G., Brasilneto, J. P., Cammarota, A., & Hallett, M. (1995). Modulation of muscle responses evoked by transcranial magnetic stimulation dur ing the acquisition of new fine motor-skills. Journal of Neurophysiology, 74, 1037–1045. Pascual-Leone, A., Grafman, J., Hallett, M. (1994), Modulation of cortical motor output maps during development of implicit and explicit knowledge. Science, 263, 1287–1289. Patel, A. D. (2003). Language, music, syntax and the brain. Nature Neuroscience, 6, 674– 681. Patel, A. D., & Balaban, E. (2000). Temporal patterns of human cortical activity reflect tone sequence structure. Nature, 404, 80–84. Patel, A. D., & Balaban, E. (2004). Human auditory cortical dynamics during perception of long acoustic sequences: Phase tracking of carrier frequency by the auditory steady-state response. Cerebral Cortex, 14, 35–46. 
Patel, A. D., Foxton, J. M., & Griffiths, T. D. (2005). Musically tone-deaf individuals have difficulty discriminating intonation contours extracted from speech. Brain and Cognition, 59, 310–313. Patel, A. D., Gibson, E., Ratner, J., Besson, M., & Holcomb, P. J. (1998). Processing syntac tic relations in language and music: An event-related potential study. Journal of Cognitive Neuroscience, 10, 717–733. Patterson, R. D., Uppenkamp, S., Johnsrude, I. S., & Griffiths, T. D. (2002). The processing of temporal pitch and melody information in auditory cortex. Neuron, 36, 767–776. Page 37 of 42

Cognitive Neuroscience of Music Penhune, V. B., Zatorre, R. J., & Feindel, W. H. (1999). The role of auditory cortex in reten tion of rhythmic patterns as studied in patients with temporal lobe removals including Heschl’s gyrus. Neuropsychologia, 37, 315–331. Peretz, I. (1996). Can we lose memory for music? A case of music agnosia in a nonmusi cian. Journal of Cognitive Neuroscience, 8, 481–496. Peretz, I. (2006). The nature of music from a biological perspective. Cognition, 100, 1–32. (p. 133)

Peretz, I., & Coltheart, M. (2003). Modularity of music processing. Nature Neuroscience, 6, 688–691. Perry, D. W., Zatorre, R. J., Petrides, M., Alivisatos, B., Meyer, E., & Evans, A. C. (1999). Localization of cerebral activity during simple singing. Neuroreport, 10, 3979–3984. Plailly, J., Tillmann, B., & Royet, J.-P. (2007). The feeling of familiarity of music and odors: The same neural signature? Cerebral Cortex, 17, 2650–2658. Platel, H., Baron, J. C., Desgranges, B., Bernard, F., & Eustache, F. (2003). Semantic and episodic memory of music are subserved by distinct neural networks. NeuroImage, 20, 244–256. Popescu, M., Otsuka, A., & Ioannides, A. A. (2004). Dynamics of brain activity in motor and frontal cortical areas during music listening: A magnetoencephalographic study. Neu roImage, 21, 1622–1638. Pressing, J. (2002). Black Atlantic rhythm: Its computational and transcultural founda tions. Music Perception, 19, 285–310. Ross, E. D., & Monnot, M. (2008). Neurology of affective prosody and its functionalanatomic organization in right hemisphere. Brain and Language, 104, 51–74. Sammler, D., Grigutsch, M., Fritz, T., & Koelsch, S. (2007). Music and emotion: Electro physiological correlates of the processing of pleasant and unpleasant music. Psychophysi ology, 44, 293–304. Samson, S., & Zatorre, R. J. (1988). Melodic and harmonic discrimination following unilat eral cerebral excision. Brain and Cognition, 7, 348–360. Samson, S., & Zatorre, R. J. (1991). Recognition memory for text and melody of songs af ter unilateral temporal lobe lesion: Evidence for dual encoding. Journal of Experimental Psychology. Learning, Memory, and Cognition, 17, 793–804. Samson, S., & Zatorre, R. J. (1994). Contribution of the right temporal-lobe to musical timbre discrimination. Neuropsychologia, 32, 231–240.

Page 38 of 42

Cognitive Neuroscience of Music Samson, S., Zatorre, R. J., & Ramsay, J. O. (2002). Deficits of musical timbre perception af ter unilateral temporal-lobe lesion revealed with multidimensional scaling. Brain, 125, 511–523. Sarkamo, T., Tervaniemi, M., Laitinen, S., Forsblom, A., Soinila, S., Mikkonen, M., Autti, T., Silvennoinen, H. M., Erkkilae, J., Laine, M., Peretz, I., & Hietanen, M. (2008). Music listening enhances cognitive recovery and mood after middle cerebral artery stroke. Brain, 131, 866–876. Satoh, M., Takeda, K., Nagata, K., Hatazawa, J., & Kuzuhara, S. (2001). Activated brain regions in musicians during an ensemble: A PET study. Cognitive Brain Research, 12, 101–108. Satoh, M., Takeda, K., Nagata, K., Shimosegawa, E., & Kuzuhara, S. (2006). Positronemission tomography of brain regions activated by recognition of familiar music. Ameri can Journal of Neuroradiology, 27, 1101–1106. Schellenberg, E. G., Iverson, P., & McKinnon, M. C. (1999). Name that tune: Identifying popular recordings from brief excerpts. Psychonomic Bulletin and Review, 6, 641–646. Schellenberg, E. G., & Trehub, S. E. (2003). Good pitch memory is widespread. Psycholog ical Science, 14, 262–266. Schirmer, A., & Kotz, S. A. (2006). Beyond the right hemisphere: brain mechanisms medi ating vocal emotional processing. Trends in Cognitive Sciences, 10, 24–30. Schlaug, G., Jancke, L., Huang, Y. X., & Steinmetz, H. (1995). In-vivo evidence of structur al brain asymmetry in musicians. Science, 267, 699–701. Schmithorst, V. J., & Holland, S. K. (2003). The effect of musical training on music pro cessing: A functional magnetic resonance imaging study in humans. Neuroscience Letters, 348, 65–68. Schneider, P., Scherg, M., Dosch, H. G., Specht, H. J., Gutschalk, A., & Rupp, A. (2002). Morphology of Heschl’s gyrus reflects enhanced activation in the auditory cortex of musi cians. Nature Neuroscience, 5, 688–694. Schneider, P., Sluming, V., Roberts, N., Scherg, M., Goebel, R., Specht, H. 
J., Dosch, H. G., Bleeck, S., Stippich, C., & Rupp, A. (2005). Structural and functional asymmetry of lateral Heschl’s gyrus reflects pitch perception preference. Nature Neuroscience, 8, 1241–1247. Schön, D., Anton, J. L., Roth, M., & Besson, M. (2002). An fMRI study of music sight-read ing. Neuroreport, 13, 2285–2289. Schön, D., Semenza, C., & Denes, G. (2001). Naming of musical notes: A selective deficit in one musical clef. Cortex, 37, 407–421.

Page 39 of 42

Cognitive Neuroscience of Music Schubotz, R. I. (2007). Prediction of external events with our motor system: Towards a new framework. Trends in Cognitive Sciences, 11, 211–218. Sergent, J., Zuck, E., Terriah, S., & Macdonald, B. (1992). Distributed neural network un derlying musical sight-reading and keyboard performance. Science, 257, 106–109. Sluming, V., Barrick, T., Howard, M., Cezayirli, E., Mayes, A., & Roberts, N. (2002). Voxelbased morphometry reveals increased gray matter density in Broca’s area in male sym phony orchestra musicians. NeuroImage, 17, 1613–1622. Sridharan, D., Levitin, D. J., Chafe, C. H., Berger, J., & Menon, V. (2007). Neural dynamics of event segmentation in music: Converging evidence for dissociable ventral and dorsal networks. Neuron, 55, 521–532. Steinbeis, N., & Koelsch, S. (2008). Shared neural resources between music and language indicate semantic processing of musical tension-resolution patterns. Cerebral Cortex, 18, 1169–1178. Steinbeis, N., Koelsch, S., & Sloboda, J. A. (2006). The role of harmonic expectancy viola tions in musical emotions: Evidence from subjective, physiological, and neural responses. Journal of Cognitive Neuroscience, 18, 1380–1393. Stewart, L., Henson, R., Kampe, K., Walsh, V., Turner, R., & Frith, U. (2003). Brain changes after learning to read and play music. Neuroimage, 20 (1), 71–83. doi: http:// dx.doi.org/10.1016/S1053-8119(03)00248-9 Stewart, L., von Kriegstein, K., Warren, J. D., & Griffiths, T. D. (2006). Music and the brain: Disorders of musical listening. Brain, 129, 2533–2553. Temperley, D. (2001). The cognition of basic musical structures. Cambridge, MA: MIT Press. Temperley, D. (2007). Music and probability. Cambridge, MA: MIT Press. Thaut, M. (2005). Rhythm, music, and the brain: Scientific foundations and clinical appli cations. New York: Routledge. Tillmann, B., Bharucha, J. J., & Bigand, E. (2000). Implicit learning of tonality: A self-orga nizing approach. 
Psychological Review, 107, 885–913. Tillmann, B., Janata, P., & Bharucha, J. J. (2003). Activation of the inferior frontal cortex in musical priming. Cognitive Brain Research, 16, 145–161. (p. 134)

Toiviainen, P. (2007). Visualization of tonal content in the symbolic and audio domains. Computing in Musicology, 15, 187–199. Toiviainen, P., & Krumhansl, C. L. (2003). Measuring and modeling real-time responses to music: The dynamics of tonality induction. Perception, 32, 741–766. Page 40 of 42

Cognitive Neuroscience of Music Toiviainen, P., Tervaniemi, M., Louhivuori, J., Saher, M., Huotilainen, M., & Naatanen, R. (1998). Timbre similarity: Convergence of neural, behavioral, and computational ap proaches. Music Perception, 16, 223–241. Tueting, P., Sutton, S., & Zubin, J. (1970). Quantitative evoked potential correlates of the probability of events. Psychophysiology, 7, 385–394. Warren, J. D., Uppenkamp, S., Patterson, R. D., & Griffiths, T. D. (2003). Separating pitch chroma and pitch height in the human brain. Proceedings of the National Academy of Sciences U S A, 100, 10038–10042. Warrier, C. M., & Zatorre, R. J. (2002). Influence of tonal context and timbral variation on perception of pitch. Perception and Psychophysics, 64, 198–207. Warrier, C. M., & Zatorre, R. J. (2004). Right temporal cortex is critical for utilization of melodic contextual cues in a pitch constancy task. Brain, 127, 1616–1625. Watanabe, T., Yagishita, S., & Kikyo, H. (2008). Memory of music: Roles of right hip pocampus and left inferior frontal gyrus. NeuroImage, 39, 483–491. Wiltermuth, S. S., & Heath, C. (2009). Synchrony and cooperation. Psychological Science, 20, 1–5. Zarate, J. M., & Zatorre, R. J. (2008). Experience-dependent neural substrates involved in vocal pitch regulation during singing. NeuroImage, 40, 1871–1887. Zatorre, R. J. (1985). Discrimination and recognition of tonal melodies after unilateral cerebral excisions. Neuropsychologia, 23, 31–41. Zatorre, R. J., Belin, P., & Penhune, V. B. (2002). Structure and function of auditory cortex: Music and speech. Trends in Cognitive Sciences, 6, 37–46. Zatorre, R. J., Evans, A. C., & Meyer, E. (1994). Neural mechanisms underlying melodic perception and memory for pitch. Journal of Neuroscience, 14, 1908–1919. Zatorre, R. J., Halpern, A. R., Perry, D. W., Meyer, E., & Evans, A. C. (1996). Hearing in the mind’s ear: A PET investigation of musical imagery and perception. Journal of Cognitive Neuroscience, 8, 29–46. 
Zatorre, R. J., Perry, D. W., Beckett, C. A., Westbury, C. F., & Evans, A. C. (1998). Function al anatomy of musical processing in listeners with absolute pitch and relative pitch. Pro ceedings of the National Academy of Sciences U S A, 95, 3172–3177.

Petr Janata

Petr Janata is Professor at the University of California, Davis, in the Psychology Department and Center for Mind and Brain.

Audition

Josh H. McDermott
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0008

Abstract and Keywords

Audition is the process by which organisms use sound to derive information about the world. This chapter aims to provide a bird’s-eye view of contemporary audition research, spanning systems and cognitive neuroscience as well as cognitive science. The author provides brief overviews of classic areas of research as well as some central themes and advances from the past ten years. The chapter covers the sound transduction of the cochlea, subcortical and cortical anatomical and functional organization of the auditory system, amplitude modulation and its measurement, adaptive coding and plasticity, the perception of sound sources (with a focus on the classic research areas of location, loudness, and pitch), and auditory scene analysis (including sound segregation, streaming, filling in, and reverberation perception). The chapter concludes with a discussion of where hearing research seems to be headed at present.

Keywords: sound transduction, auditory system anatomy, modulation, adaptation, plasticity, pitch perception, auditory scene analysis, sound segregation, streaming, reverberation

Introduction

From the cry of a baby to the rumble of a thunderclap, many events in the world produce sound. Sound is created when matter in the world vibrates, and takes the form of pressure waves that propagate through the air, containing clues about the environment around us. Audition is the process by which organisms utilize these clues to derive information about the world.

Audition is a crucial sense for most organisms. Humans, in particular, use sound to infer a vast number of important things: what someone said, their emotional state when they said it, and the whereabouts and nature of objects we cannot see, to name but a few. When hearing is impaired (via congenital conditions, noise exposure, or aging), the consequences can be devastating, such that a large industry is devoted to the design of prosthetic hearing devices.

As listeners we are largely unaware of the computations underlying our auditory system’s success, but they represent an impressive feat of engineering. The computational challenges of everyday audition are reflected in the gap between biological and machine hearing systems: machine systems for interpreting sound currently fall far short of human abilities. Understanding the basis of our success in perceiving sound will hopefully help us to replicate it in machine systems and to restore it in biological auditory systems when their function becomes impaired.

The goal of this chapter is to provide a bird’s-eye view of contemporary hearing research. I provide brief overviews of classic areas of research as well as some central themes and advances from the past ten years. The first section describes the sensory transduction of the cochlea. The second section outlines subcortical and cortical functional organization. The third section discusses modulation and its measurement by subcortical and cortical regions of the auditory system, a key research focus of the past few decades. The fourth section describes adaptive coding and plasticity, encompassing the relationship between sensory coding and the environment as well as its adaptation to task demands. The fifth section discusses the perception of sound sources, focusing on location, loudness, and pitch. The sixth section presents an overview of auditory scene analysis. I conclude with a discussion of where hearing research is headed at present. Because other chapters in this handbook are devoted to auditory attention, music, and speech, I will largely avoid these topics.

The Problem

Just by listening, we can routinely apprehend many aspects of the world around us: the size of a room in which we are talking, whether it is windy or raining outside, the speed of someone approaching from behind, or whether the surface someone is walking on is gravel or marble. These abilities are nontrivial because the properties of the world that are of interest to a listener are generally not explicit in the acoustic input: they cannot be easily recognized or discriminated using the sound waveform itself. The brain must process the sound entering the ear to generate representations in which the properties of interest are more evident. One of the main objectives of hearing science is to understand the nature of these transformations and their instantiation in the brain.

Like other senses, audition is further complicated by a second challenge, that of scene analysis. Although listeners are generally interested in the properties of individual objects or events, the ears are rarely presented with the sounds from isolated sources. Instead, the sound signal that reaches the ear is typically a mixture of sounds from different sources. Such situations occur frequently in natural auditory environments, for example, in social settings, where a single speaker of interest may be talking among many others, and in music. From the mixture it receives as input, the brain must derive representations of the individual sound sources of interest, as are needed to understand someone’s speech, recognize a melody, or otherwise guide behavior. Known as the “cocktail party problem” (Cherry, 1953), or “auditory scene analysis” (Bregman, 1990), this problem has analogues in other sensory modalities, but the auditory version presents some uniquely challenging features.

Sound Measurement: The Peripheral Auditory System

The transformation of the raw acoustic input into representations that are useful for behavior is apparently instantiated over many brain areas and stages of neural processing, spanning the cochlea, midbrain, thalamus, and cortex (Figure 8.1). The early stages of this cascade are particularly intricate in the auditory system relative to other sensory systems, with many processing stations occurring before the cortex. The sensory organ of the cochlea is itself a complex multicomponent system, whose investigation remains a considerable challenge: the mechanical nature of the cochlea renders it much more difficult to probe (e.g., with electrodes) than the retina or olfactory epithelium, for instance. Peripheral coding of sound is also unusual relative to that of other senses in its degree of clinical relevance. Unlike vision, for which the most common forms of dysfunction are optical in nature, and can be fixed with glasses, hearing impairment typically involves altered peripheral neural processing, and its treatment has benefited from a detailed understanding of the processes that are altered. Much of hearing research has accordingly been devoted to understanding the nature of the measurements made by the auditory periphery, and they provide a natural starting point for any discussion of how we hear.

Frequency Selectivity and the Cochlea

Hearing begins with the ear, where the sound pressure waveform carried by the air is transduced into action potentials that are sent to the brain via the auditory nerve. Action potentials are a binary code, but what is conveyed to the brain is far from simply a binarized version of the incoming waveform. The transduction process is marked by several distinctive signal transformations, the most obvious of which is produced by frequency tuning.

Figure 8.1 The auditory system. Sound is transduced by the cochlea, processed by an interconnected set of subcortical areas, and then fed into the core regions of auditory cortex.

The coarse details of sound transduction are well understood (Figure 8.2). Sound induces vibrations of the eardrum, which are transmitted via the bones of the middle ear to the cochlea, the sensory organ of the auditory system. The cochlea is a coiled, fluid-filled tube, containing several membranes that extend along its length and vibrate in response to sound. Transduction of this mechanical vibration into an electrical signal occurs in the organ of Corti, a mass of cells attached to the basilar membrane. The organ of Corti in particular contains what are known as hair cells, named for the stereocilia that protrude from them. The inner hair cells are responsible for sound transduction. When the section of membrane on which they lie vibrates, the resulting deformation of the hair cell body opens mechanically gated ion channels, inducing a voltage change within the cell. Neurotransmitter release is triggered by the change in membrane potential, generating action potentials in the auditory nerve fiber that the hair cell synapses with. This electrical signal is carried by the auditory nerve fiber to the brain.

The frequency tuning of the transduction process occurs because different parts of the basilar membrane vibrate in response to different frequencies. This is partly due to mechanical resonances: the thickness and stiffness of the membrane vary along its length, producing a different resonant frequency at each point. However, the mechanical resonances are actively enhanced via a feedback process, believed to be mediated largely by a second set of cells, called the outer hair cells. The outer hair cells abut the inner hair cells on the organ of Corti and serve to alter the basilar membrane vibration rather than transduce it. They expand and contract in response to sound through mechanisms that are only partially understood (Ashmore, 2008; Dallos, 2008; Hudspeth, 2008). Their motion alters the passive mechanics of the basilar membrane, amplifying the response to low-intensity sounds and tightening the frequency tuning of the resonance. The upshot is that high frequencies produce vibrations at the basal end of the cochlea (close to the eardrum), whereas low frequencies produce vibrations at the apical end (far from the eardrum), with frequencies in between stimulating intermediate regions. The auditory nerve fibers that synapse onto individual inner hair cells are thus frequency tuned: they fire action potentials in response to a local range of frequencies, collectively providing the rest of the auditory system with a frequency decomposition of the incoming waveform. As a result of this behavior, the cochlea is often described functionally as a set of bandpass filters, filters that each pass frequencies within a particular range and eliminate those outside of it.
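The bandpass-filter description can be made concrete with a small sketch. The code below is an illustrative toy, not a model of cochlear mechanics: each "channel" measures how much energy a signal carries near one center frequency by projecting the signal onto a Hann-windowed sinusoid. The function names, the sample rate, and the octave-spaced center frequencies are all assumptions chosen for the example, and every channel here has the same bandwidth (set by the window length), which is a simplification.

```python
import math

def tone(freq_hz, sample_rate, n_samples):
    """Pure tone of unit amplitude."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

def channel_energy(signal, center_hz, sample_rate):
    """Energy near center_hz: project the signal onto a Hann-windowed
    cosine and sine at that frequency and sum the squared projections."""
    n_samples = len(signal)
    re = im = 0.0
    for n, x in enumerate(signal):
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (n_samples - 1))
        phase = 2 * math.pi * center_hz * n / sample_rate
        re += x * window * math.cos(phase)
        im += x * window * math.sin(phase)
    return (re * re + im * im) / n_samples ** 2

def filter_bank(signal, centers_hz, sample_rate):
    """One energy value per channel: a crude frequency decomposition."""
    return [channel_energy(signal, fc, sample_rate) for fc in centers_hz]

SAMPLE_RATE = 8000                                # Hz, assumed for the example
centers_hz = [125 * 2 ** k for k in range(5)]     # 125, 250, 500, 1000, 2000 Hz
signal = tone(500, SAMPLE_RATE, 1600)             # 0.2 s of a 500-Hz tone

energies = filter_bank(signal, centers_hz, SAMPLE_RATE)
best_channel_hz = centers_hz[energies.index(max(energies))]
print(best_channel_hz)
```

The channel centered on the tone's frequency responds far more strongly than its neighbors, which is the sense in which a bank of such filters provides a frequency decomposition of the input.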

Figure 8.2 Structure of the peripheral auditory system. Top right, Diagram of ear. The eardrum transmits sound to the cochlea via the middle ear bones (ossicles). Top middle, Inner ear. The semicircular canals abut the cochlea. Sound enters the cochlea via the oval window and causes vibrations along the basilar membrane, which runs through the middle of the cochlea. Top left, Cross section of cochlea. The organ of Corti, containing the hair cells that transduce sound into electrical potentials, sits on top of the basilar membrane. Bottom, Schematic of section of organ of Corti. The shearing that occurs between the basilar and tectorial membranes when they vibrate (in response to sound) causes the hair cell stereocilia to deform. The deformation causes a change in the membrane potential of the inner hair cells, transmitted to the brain via afferent auditory nerve fibers. The outer hair cells, which are three times more numerous than the inner hair cells, serve as a feedback system to alter the basilar membrane motion, tightening its tuning and amplifying the response to low amplitude sounds.

Figure 8.3 Frequency selectivity. A, Threshold tuning curves of auditory nerve fibers from a cat ear, plotting the level that was necessary to evoke a criterion increase in firing rate for a given frequency (Miller, Schilling, et al., 1997). B, The tonotopy of the cochlea. The position along the basilar membrane at which auditory nerve fibers synapse with a hair cell (determined by dye injections) is plotted vs. their best frequency (Liberman, 1982). Both parts of this figure are courtesy of Eric Young, 2010, who replotted data from the original sources.

The frequency decomposition of the cochlea is conceptually similar to the Fourier transform, but differs in the way that the frequency spectrum is decomposed. Whereas the Fourier transform uses linearly spaced frequency bins, each separated by the same number of hertz, the tuning bandwidth of auditory nerve fibers increases with their preferred frequency. This characteristic can be observed in Figure 8.3A, in which the frequency response of a set of auditory nerve fibers is plotted on a logarithmic frequency scale. Although the lowest frequency fibers are broader on a log scale than the high-frequency fibers, in absolute terms their bandwidths are much smaller: several hundred hertz instead of several thousand. The distribution of best frequency along the cochlea follows a roughly logarithmic function, apparent in Figure 8.3B, which plots the best frequency of a large set of nerve fibers against the distance along the cochlea of the hair cell that they synapse with. These features of frequency selectivity are present in most biological auditory systems. It is partly for this reason that a log scale is commonly used for frequency.

Cochlear frequency selectivity has a host of perceptual consequences: our ability to detect a particular frequency is limited largely by the signal-to-noise ratio of the cochlear filter centered on that frequency, for instance. There are many treatments of frequency selectivity and perception (Moore, 2003); it is perhaps the most studied aspect of hearing.
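The contrast between linear and logarithmic frequency spacing can be illustrated numerically. The following is a minimal sketch under stated assumptions: the frequency range, the number of channels, and the rule that bandwidth is a fixed fraction of center frequency (a "constant-Q" rule with a hypothetical q of 4) are chosen only for illustration. Real auditory nerve fibers are not exactly constant-Q; as noted above, low-frequency fibers are somewhat broader on a log scale.

```python
import math

def linear_centers(low_hz, high_hz, count):
    """Fourier-style spacing: neighbors differ by the same number of hertz."""
    step = (high_hz - low_hz) / (count - 1)
    return [low_hz + i * step for i in range(count)]

def log_centers(low_hz, high_hz, count):
    """Logarithmic spacing: neighbors differ by the same frequency ratio."""
    ratio = (high_hz / low_hz) ** (1 / (count - 1))
    return [low_hz * ratio ** i for i in range(count)]

def constant_q_bandwidths(centers_hz, q=4.0):
    """Assumed rule: bandwidth grows in proportion to center frequency."""
    return [fc / q for fc in centers_hz]

linear_hz = linear_centers(100.0, 6400.0, 7)
log_hz = log_centers(100.0, 6400.0, 7)
bandwidths_hz = constant_q_bandwidths(log_hz)

for fc, bw in zip(log_hz, bandwidths_hz):
    print(f"center {fc:8.1f} Hz   bandwidth {bw:7.1f} Hz")
```

With log spacing, each channel sits a fixed ratio above the previous one, so the channels are evenly spaced on a log-frequency axis while their bandwidths in hertz grow with center frequency.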

Although the frequency tuning of the cochlea is uncontroversial, the teleological question of why the cochlear transduction process is frequency-tuned remains less settled. How does frequency tuning aid the brain’s task of recovering useful information about the world from its acoustic input? Over the past two decades, a growing number of researchers have endeavored to explain properties of sensory systems as optimal for the task of encoding natural sensory stimuli, initially focusing on coding questions in vision, and using notions of efficiency as the optimality criterion (Field, 1987; Olshausen & Field, 1996). Lewicki and colleagues have applied similar concepts to hearing, using algorithms that derive efficient and sparse representations of sounds (Lewicki, 2002; Smith & Lewicki, 2006), properties believed to be desirable of early sensory representations. They report that for speech, or for combinations of environmental sounds and animal vocalizations, efficient representations for sound look much like the representation produced by auditory nerve fiber responses: sounds are represented with filters whose tuning is localized in frequency. Interestingly, the resulting representations share the dependence of bandwidth on frequency found in biological hearing: bandwidths increase with frequency as they do in the ear. Moreover, representations derived in the same way for “unnatural” sets of sounds, such as samples of white noise, do not exhibit frequency tuning, indicating that the result is at least somewhat specific to the sorts of sounds commonly encountered in the world. These results suggest that frequency tuning provides an efficient means to encode the sounds that were likely of importance when the auditory system evolved, possibly explaining its ubiquitous presence in auditory systems. It remains to be seen whether this framework can explain potential variation in frequency tuning bandwidths across species (humans have recently been claimed to possess narrower tuning than other species; Joris, Bergevin, et al., 2011; Shera, Guinan, et al., 2002) or the broadening of frequency tuning with increasing sound intensity (Rhode, 1978), but it provides one means by which to understand the origins of peripheral auditory processing.

Amplitude Compression

A second salient transformation that occurs in the cochlea is that of amplitude compression, whereby the mechanical response of the cochlea to a soft sound (and thus the neural response as well) is larger than would be expected given the response to a loud sound. The response elicited by a sound is thus not proportional to the sound’s amplitude (as it would be if the response were linear), but rather to a compressive nonlinear function of amplitude. The dynamic range of the response to sound is thus “compressed” relative to the dynamic range of the acoustic input. Whereas the range of audible sounds covers five orders of magnitude, or 100 dB, the range of cochlear response covers only one or two orders of magnitude (Ruggero, Rich, et al., 1997).

Compression appears to serve to map the range of amplitudes that the listener needs to hear (i.e., those commonly encountered in the environment) onto the physical operating range of the cochlea. Without compression, it would have to be the case that either sounds low in level would be inaudible, or sounds high in level would be indiscriminable (for they would fall outside the range that could elicit a response change). Compression permits very soft sounds to produce a physical response that is (just barely) detectable, while maintaining some discriminability of higher levels.

The compressive nonlinearity is often approximated as a power function with an exponent of 0.3 or so. It is not obvious why the compressive nonlinearity should take the particular form that it does. Many different functions could in principle serve to compress the output response range. It remains to be seen whether compression can be explained in terms of optimizing the encoding of the input, as has been proposed for frequency tuning (but see Escabi, Miller, et al., 2003). Most machine hearing applications also utilize amplitude compression before analyzing sound, however, and it is widely agreed to be useful to amplify low amplitudes relative to large ones when processing sound.

Amplitude compression was first noticed in measurements of the physical vibrations of the basilar membrane (Rhode, 1971; Ruggero, 1992) but is also apparent in auditory nerve fiber responses (Yates, 1990) and is believed to account for a number of perceptual phenomena (Moore & Oxenham, 1998). The effects of compression are related to “cochlear amplification,” in that compression results from response enhancement that is limited to low-intensity sounds. Compression is achieved in part via the outer hair cells, whose motility modifies the motion of the basilar membrane in response to sound (Ruggero & Rich, 1991). Outer hair cell function is frequently altered in hearing impairment, one consequence of which is a loss of compression; hearing aids attempt to mimic the lost compression.
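The arithmetic behind the power-function approximation can be checked with a minimal sketch. It assumes an exponent of exactly 0.3 and amplitudes in arbitrary units; the function names are illustrative and not taken from any model cited above.

```python
import math


def compress(amplitude, exponent=0.3):
    """Power-law compressive nonlinearity: response = amplitude ** exponent."""
    return amplitude ** exponent


def to_db(amplitude_ratio):
    """Express an amplitude ratio in decibels."""
    return 20.0 * math.log10(amplitude_ratio)


soft, loud = 1.0, 1e5  # input amplitudes spanning five orders of magnitude
input_range_db = to_db(loud / soft)
output_range_db = to_db(compress(loud) / compress(soft))
print(round(input_range_db, 1), round(output_range_db, 1))
```

Under this assumption a 100 dB input range maps onto roughly 30 dB of response, that is, about one and a half orders of magnitude, in line with the one or two orders of magnitude quoted above.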

Neural Coding in the Auditory Nerve

Figure 8.4 Phase locking. A, A 200-Hz pure tone stimulus waveform aligned in time with several overlaid traces of an auditory nerve fiber’s response to the tone. Note that the spikes are not uniformly distributed in time, but rather occur at particular phases of the sinusoidal input. B, A measure of phase locking for each of a set of nerve fibers in response to different frequencies. Phase locking decreases at high frequencies. Both parts of this figure are reprinted with permission from the original source: Javel & Mott, 1988.

Although frequency tuning and amplitude compression are at this point uncontroversial and relatively well understood, several other empirical questions about peripheral auditory coding remain unresolved. One important issue involves the means by which the auditory nerve encodes frequency information. As a result of the frequency tuning of the auditory nerve, the spike rate of a nerve fiber contains information about frequency (a large firing rate indicates that the sound input contains frequencies near the center of the range of the fiber’s tuning). Collectively, the firing rates of all nerve fibers could thus be used to estimate the instantaneous spectrum of a sound. However, spike timings also carry frequency information. At least for low frequencies, the spikes that are fired in response to sound do not occur randomly, but rather tend to occur at the peak displacements of the basilar membrane vibration. Because the motion of a particular section of the membrane mirrors the bandpass-filtered sound waveform, the spikes occur at the waveform peaks (Rose, Brugge, et al., 1967). If the input is a single frequency, spikes thus occur at a fixed phase of the frequency cycle (Figure 8.4A). This behavior is known as phase locking and produces spikes at regular intervals corresponding to the period of the frequency. The spike timings thus carry information that could potentially augment or supersede that conveyed by the rate of firing.

Phase locking degrades in accuracy as frequency is increased (Figure 8.4B) due to limitations in the temporal fidelity of the hair cell membrane potential (Palmer & Russell, 1986) and is believed to be largely absent for frequencies above 4 kHz in most mammals, although there is some variability across species (Johnson, 1980; Palmer & Russell, 1986).

The appeal of phase locking as a code for sound frequency is partly due to features of rate-based frequency selectivity that are unappealing from an engineering standpoint. Although frequency tuning in the auditory system (as measured by auditory nerve spike rates or psychophysical masking experiments) is narrow at low stimulus levels, it broadens considerably as the level is raised (Glasberg & Moore, 1990; Rhode, 1978).
Phase locking, by comparison, is robust to sound level—even though a nerve fiber responds to a broad range of frequencies when the level is high, the time intervals between spikes continue to convey frequency-specific information, as the peaks in the bandpass-filtered waveform tend to occur at integer multiples of the periods of the component frequencies.

Our ability to discriminate frequency is impressive, with thresholds on the order of 1 percent (Moore, 1973), and there has been long-standing interest in whether this ability in part depends on fine-grained spike timing information (Heinz, Colburn, et al., 2001). Although phase locking remains uncharacterized in humans because of the unavailability of human auditory nerve recordings, it is presumed to occur in much the same way as in nonhuman auditory systems. Moreover, several psychophysical phenomena are consistent with a role for phase locking in human hearing. For instance, frequency discrimination becomes much poorer for frequencies above 4 kHz (Moore, 1973), roughly the point at which phase locking declines in nonhuman animals. The fundamental frequency of the highest note on a piano is also approximately 4 kHz; this is also the point above which melodic intervals between pure tones (tones containing a single frequency) are much less evident (Attneave & Olson, 1971; Demany & Semal, 1990). These findings provide some circumstantial evidence that phase locking is important for deriving precise estimates of frequency, but definitive evidence remains elusive. It remains possible that the perceptual degradations at high frequencies reflect a lack of experience with such frequencies, or their relative unimportance for typical behavioral judgments, rather than a physiological limitation.

The upper limit of phase locking is also known to decrease markedly at each successive stage of the auditory system (Wallace, Anderson, et al., 2007). By primary auditory cortex, the upper cutoff is in the neighborhood of a few hundred hertz. It would thus seem that the phase locking that occurs robustly in the auditory nerve would need to be rapidly transformed into a spike rate code if it were to benefit processing throughout the auditory system. Adding to the puzzle is the fact that frequency tuning is not thought to be dramatically narrower at higher stages in the auditory system. Such tightening might be expected if the frequency information provided by phase-locked spikes was transformed to yield improved rate-based frequency tuning at subsequent stages (but see Bitterman, Mukamel, et al., 2008).
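Phase locking is commonly quantified with a vector strength measure, and a toy simulation shows why a fixed amount of timing imprecision hurts high frequencies more than low ones. The sketch below is an illustration under stated assumptions, not a model of a real nerve fiber: the spike generator, the 0.1 ms jitter, and the firing probability are all made-up values, and the jitter merely stands in for the hair cell limitations discussed above.

```python
import cmath
import math
import random


def vector_strength(spike_times, freq_hz):
    """Length of the mean unit vector of spike phases (0 = no locking, 1 = perfect)."""
    phasors = [cmath.exp(2j * math.pi * freq_hz * t) for t in spike_times]
    return abs(sum(phasors) / len(phasors))


def phase_locked_spikes(freq_hz, n_cycles, jitter_s, p_fire, rng):
    """Toy fiber: on each stimulus cycle, fire with probability p_fire near a
    fixed phase, with Gaussian timing jitter of standard deviation jitter_s."""
    period = 1.0 / freq_hz
    return [k * period + rng.gauss(0.0, jitter_s)
            for k in range(n_cycles) if rng.random() < p_fire]


rng = random.Random(0)
jitter = 0.0001  # 0.1 ms of timing jitter (assumed value)
vs_low = vector_strength(phase_locked_spikes(200.0, 2000, jitter, 0.5, rng), 200.0)
vs_high = vector_strength(phase_locked_spikes(4000.0, 2000, jitter, 0.5, rng), 4000.0)
print("200 Hz:", round(vs_low, 2), " 4 kHz:", round(vs_high, 2))
```

Because the fiber skips cycles at random, the interspike intervals are integer multiples of the stimulus period (plus jitter), which is the property the text identifies as carrying frequency information.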

II. Organization of the Auditory System

Subcortical Pathways

The auditory nerve feeds into a cascade of interconnected subcortical regions that lead up to the auditory cortex, as shown in Figure 8.1. The subcortical auditory pathways have complex anatomy, only some of which is depicted in Figure 8.1. In contrast to the subcortical pathways of the visual system, which are often argued to largely preserve the representation generated in the retina, the subcortical auditory areas exhibit a panoply of interesting response properties not found in the auditory nerve, many of which remain active topics of investigation. Several subcortical regions will be referred to in the sections that follow in the context of other types of acoustic measurements or perceptual functions.

Feedback to the Cochlea

Like other sensory systems, the auditory system can be thought of as a processing cascade, extending from the sensory receptors to cortical areas believed to mediate auditory-based decisions. This “feedforward” view of processing underlies much auditory research. As in other systems, however, feedback from later stages to earlier ones is ubiquitous and substantial, and in the auditory system is perhaps even more pronounced than elsewhere in the brain. Unlike the visual system, for instance, the auditory pathways contain feedback extending all the way back to the sensory receptors. The function of much of this feedback remains poorly understood, but one particular set of projections—the cochlear efferent system—has been the subject of much discussion.

Efferent connections to the cochlea originate primarily from the superior olivary nucleus, an area of the brainstem a few synapses removed from the cochlea (see Figure 8.1, although the efferent pathways are not shown). The superior olive is divided into two subregions, medial and lateral, and to first order, these give rise to two efferent projections: one from the medial superior olive to the outer hair cells, called the medial olivocochlear (MOC) efferents, and one from the lateral superior olive to the inner hair cells, called the lateral olivocochlear (LOC) efferents (Elgoyhen & Fuchs, 2010). The MOC efferents have been relatively well studied. Their activation (e.g., by electrical stimulation) is known to reduce the basilar membrane response to low-intensity sounds and to broaden the frequency tuning of the response. This is probably because the MOC efferents inhibit the outer hair cells, which are crucial to amplifying the response to low-intensity sounds and to sharpening frequency tuning.

The MOC efferents may serve a protective function by reducing the response to loud sounds (Rajan, 2000), but their most commonly proposed function is to enhance the response to transient sounds in noise (Guinan, 2006). When the MOC fibers are severed, for instance, performance on tasks involving discrimination of tones in noise is reduced (May & McQuone, 1995). Noise-related MOC effects are proposed to derive from their influence on adaptation, which, when induced by background noise, reduces the detectability of transient foreground sounds by decreasing the dynamic range of the auditory nerve’s response. Because MOC activation reduces the response to ongoing sound, adaptation induced by continuous background noise is reduced, thus enhancing the response to transient tones that are too brief to trigger the MOC feedback themselves (Kawase, Delgutte, et al., 1993; Winslow & Sachs, 1987). Another interesting but controversial proposal is that the MOC efferents play a role in auditory attention.
One study, for instance, found that patients whose vestibular nerve (containing the MOC fibers) had been severed were better at detecting unexpected tones after the surgery, suggesting that selective attention had been altered so as to prevent the focusing of resources on expected frequencies (Scharf, Magnan, et al., 1997). See Guinan, 2006, for a recent review of these and other ideas about MOC efferent function.

Less is known about the LOC efferents. One recent study found that destroying the LOC efferents to one ear in mice caused binaural responses to become “unbalanced” (Darrow, Maison, et al., 2006)—when sounds were presented binaurally at equal levels, responses from the two ears that were equal under normal conditions were generally not equal following the surgical procedure. The suggestion was that the LOC efferents serve to regulate binaural responses so that interaural intensity differences, crucial to sound localization (see below), can be accurately registered.

Tonotopy

Figure 8.5 Tonotopy. Best frequency of voxels in the human auditory cortex, measured with fMRI, plotted on the flattened cortical surface (Humphries, Liebenthal, et al., 2010). Note that the best frequency varies quasi-smoothly over the cortical surface and is suggestive of two maps that are approximately mirror images of each other.

Although many of the functional properties of subcortical and cortical neurons are distinct from what is found in auditory nerve responses, frequency tuning persists. Every subcortical region contains frequency-tuned neurons, and neurons tend to be spatially organized to some extent according to their best frequency, forming “tonotopic” maps. This organization is also evident in the cortex. Many cortical neurons have a preferred frequency, although they are often less responsive to pure tones (relative to sounds with more complex spectra) and often have broader tuning than neurons in peripheral stages (Moshitch, Las, et al., 2006). Cortical frequency maps were one of the first reported findings in single-unit neurophysiology studies of the auditory cortex in animals, and have since been found using functional magnetic resonance imaging (fMRI) in humans (Formisano, Kim, et al., 2003; Humphries, Liebenthal, et al., 2010; Talavage, Sereno, et al., 2004) as well as monkeys (Petkov, Kayser, et al., 2006). Figure 8.5 shows an example of a tonotopic map obtained in a human listener with fMRI. Although never formally quantified, it seems that tonotopy is less robust than the retinotopy found in the visual system (evident, e.g., in recent optical imaging studies; Bandyopadhyay, Shamma, et al., 2010; Rothschild, Nelken, et al., 2010).

Although the presence of some degree of tonotopy in the cortex is beyond question, its functional importance remains unclear. Frequency selectivity is not the end goal of the auditory system, and it does not obviously bear much relevance to behavior, so it is unclear why tonotopy would be a dominant principle of organization throughout the auditory system. It may be that other principles of organization are in fact more prominent but have yet to be discovered. At present, however, tonotopy remains a staple of textbooks and review chapters such as this.

Functional Organization

Largely on grounds of anatomy and connectivity, mammalian auditory cortex is standardly divided into three sets of regions, shown in Figure 8.6: a core region receiving direct input from the thalamus, a “belt” region surrounding it, and a “parabelt” region beyond that (Kaas & Hackett, 2000; Sweet, Dorph-Petersen, et al., 2005). Within these areas, tonotopy is often used to delineate distinct fields (a field is typically considered to contain a single tonotopic map). The core region is divided in this way into areas A1, R (for rostral), and RT (for rostrotemporal) in primates, with A1 and R receiving direct input from the medial geniculate nucleus of the thalamus. There are also multiple belt areas (Petkov, Kayser, et al., 2006), each receiving input from the core areas. Functional imaging reveals many additional areas that respond to sound in the awake primate, including parts of parietal and frontal cortex (Poremba, Saunders, et al., 2003). There are some indications that the three core regions have different properties (Bendor & Wang, 2008), and that stimulus selectivity increases in complexity from the core to surrounding areas (Kikuchi, Horwitz, et al., 2010; Rauschecker & Tian, 2004; Tian & Rauschecker, 2004), suggestive of a hierarchy of processing. However, at present, there is not a single widely accepted framework for auditory cortical organization. Several principles of organization have been proposed with varying degrees of empirical support; here, we review a few of them.

Figure 8.6 Anatomy of auditory cortex. A, Lateral view of macaque cortex. The approximate location of the parabelt region is indicated with dashed orange lines. B, View of the brain from (A) after removal of the overlying parietal cortex. Approximate locations of the core (solid red line), belt (dashed yellow line), and parabelt (dashed orange line) regions are shown. AS, arcuate sulcus; CS, central sulcus; INS, insula; LS, lateral sulcus; STG, superior temporal gyrus; STS, superior temporal sulcus. C, Connectivity between core and belt regions. Solid lines with arrows denote dense connections; dashed lines with arrows denote less dense connections. RT, R, and A1 compose the core; all three subregions receive input from the thalamus. The areas surrounding the core make up the belt, and the two regions outlined with dashed lines make up the parabelt. The core has few direct connections with the parabelt or more distant cortical areas. AL, anterolateral; CL, caudolateral; CM, caudomedial; CPB, caudal parabelt; ML, middle lateral; MM, middle medial; RM, rostromedial; RPB, rostral parabelt; RT, rostrotemporal; RTM, medial rostrotemporal; RTL, lateral rostrotemporal. All parts reprinted from original source: Kaas & Hackett, 2000.

Some of the proposed organizational principles clearly derive inspiration from the visual system. For instance, selectivity for vocalizations and selectivity for spatial location have been found to be partially segregated, each being most pronounced in a different part of the lateral belt (Tian, Reser, et al., 2001; Woods, Lopez, et al., 2006). These regions have thus been proposed to constitute the beginning of ventral “what” and dorsal “where” pathways analogous to those in the visual system, perhaps culminating in the same parts of the prefrontal cortex as the analogous visual pathways (Cohen, Russ, et al., 2009; Romanski, Tian, et al., 1999). Functional imaging results in humans have also been viewed as supportive of this framework (Alain, Arnott, et al., 2001; Warren, Zielinski, et al., 2002). Additional evidence for a “what/where” dissociation comes from a recent study in which sound localization and temporal pattern discrimination in cats were selectively impaired by reversibly deactivating different regions of nonprimary auditory cortex (Lomber & Malhotra, 2008). However, other studies have found less evidence for segregation of tuning properties in early auditory cortex (Bizley, Walker, et al., 2009). Moreover, the properties of the “what” stream remain relatively undefined (Recanzone, 2008); at this point, it has been defined mainly by reduced selectivity to spatial location.

There have been further attempts to extend the characterization of a ventral auditory pathway by testing for specialization for the analysis of particular categories of sounds, analogous to what has been found in the visual system (Kanwisher, 2010). The most widely proposed specialization is for vocalizations. Using functional imaging, regions of the anterior temporal lobe have been identified in both humans (Belin, Zatorre, et al., 2000) and macaques (Petkov, Kayser, et al., 2008) that appear to be somewhat selectively responsive to vocalizations and that could be homologous across species. Evidence for regions selective for other categories is less clear at present (Leaver & Rauschecker, 2010), although see the section below on pitch perception for a discussion of a cortical region putatively involved in pitch processing.

Another proposal is that the left and right auditory cortices are specialized for different aspects of signal processing, with the left optimized for temporal resolution and the right for frequency resolution (Zatorre, Belin, et al., 2002). This idea is motivated by the uncertainty principle of time–frequency analysis, whereby resolution cannot simultaneously be optimized for both time and frequency. The evidence for hemispheric differences comes mainly from functional imaging studies that manipulate spectral and temporal stimulus characteristics (Samson, Zeffiro, et al., 2011; Zatorre & Belin, 2001) and neuropsychology studies that find pitch perception deficits associated with right temporal lesions (Johnsrude, Penhune, et al., 2000; Zatorre, 1985). A related alternative idea is that the two hemispheres are specialized to analyze distinct timescales, with the left hemisphere more responsive to short-scale temporal variation (e.g., tens of milliseconds) and the right hemisphere more responsive to long-scale variation (e.g., hundreds of milliseconds) (Boemio, Fromm, et al., 2005; Poeppel, 2003).
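The time–frequency trade-off invoked above can be checked numerically. The sketch below measures the duration and bandwidth of a Gaussian pulse as standard deviations of its energy in time and in frequency. For such a pulse the product of the two equals 1/(4π), the smallest value any signal can achieve under these definitions. The pulse widths, grid sizes, and function names are illustrative choices, not taken from the studies cited.

```python
import cmath
import math


def spread(xs, weights):
    """Standard deviation of xs under non-negative weights (an energy density)."""
    total = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / total
    return math.sqrt(sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / total)


def time_frequency_spreads(width_s):
    """Return (duration, bandwidth) of the pulse exp(-t^2 / (2 * width_s^2))."""
    dt = width_s / 20.0
    ts = [k * dt for k in range(-100, 101)]      # covers +/- 5 pulse widths
    g = [math.exp(-t * t / (2.0 * width_s ** 2)) for t in ts]
    df = 1.0 / (80.0 * width_s)
    fs = [k * df for k in range(-100, 101)]      # covers +/- 1.25 / width_s
    # Direct (slow) Fourier transform; fine for a grid this small.
    spectrum = [sum(x * cmath.exp(-2j * math.pi * f * t) for x, t in zip(g, ts))
                for f in fs]
    return spread(ts, [x * x for x in g]), spread(fs, [abs(s) ** 2 for s in spectrum])


dur_short, bw_short = time_frequency_spreads(0.005)  # brief pulse
dur_long, bw_long = time_frequency_spreads(0.05)     # pulse ten times longer
print(dur_short * bw_short, dur_long * bw_long, 1.0 / (4.0 * math.pi))
```

Lengthening the pulse tenfold narrows its bandwidth tenfold, so an analysis window can favor temporal resolution or frequency resolution but not both.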

III. Sound Measurement—Modulation

Amplitude Modulation and the Envelope

Figure 8.7 Amplitude modulation. A, The output of a bandpass filter (centered at 340 Hz) for a recording of speech, plotted in blue, with its envelope plotted in red. B, Close-up of part of A (corresponding to the black rectangle in A). Note that the filtered sound signal (like the unfiltered signal) fluctuates around zero at a high rate, whereas the envelope is positive-valued and fluctuates more slowly. C, Spectrogram of the same speech signal. Spectrogram is formed from the envelopes (one of which is plotted in A) of a set of filters mimicking the frequency tuning of the cochlea. The spectrogram is produced by plotting each envelope horizontally in grayscale. D, Power spectra of the filtered speech signal in A and its envelope. Note that the envelope contains power only at low frequencies (modulation frequencies), whereas the filtered signal has power at a restricted range of high frequencies (acoustic frequencies).

The cochlea decomposes the acoustic input into frequency channels, but much of the important information in sound is conveyed by the way that the output of these frequency channels is modulated in amplitude. Consider Figure 8.7A, which displays in blue the output of one such frequency channel for a short segment of a speech signal. The blue waveform oscillates at a rapid rate, but its amplitude waxes and wanes at a much lower rate (evident in the close-up view of Figure 8.7B). This waxing and waning is known as amplitude modulation and is a common feature of many modes of sound production (e.g., vocal articulation). The amplitude is captured by what is known as the envelope of a signal, shown in red for the signal of Figures 8.7A and B. Often, the envelopes of each cochlear channel are stacked vertically and displayed as an image called a spectrogram, providing a depiction of how the sound energy in each frequency channel varies over time (Figure 8.7C). Figure 8.7D shows the spectra of the signal and envelope shown in Figures 8.7A and B. The signal spectrum is bandpass (because it is the output of a bandpass filter), with energy at frequencies in the audible range. The envelope spectrum, in contrast, is low-pass, with most of the power below 10 Hz, corresponding to the slow rate at which the envelope changes. The frequencies that compose the envelope are typically termed modulation frequencies, distinct from the acoustic frequencies that compose the signal that the envelope is derived from.

The information carried by a cochlear channel can thus be viewed as the product of “fine structure”—a waveform that varies rapidly, at a rate close to the center frequency of the channel—and an amplitude envelope that varies more slowly (Rosen, 1992). The envelope and fine structure have a clear relation to common signal processing formulations in which the output of a bandpass filter is viewed as a single sinusoid varying in amplitude and frequency—the envelope describes the amplitude variation, and the fine structure describes the frequency variation. The envelope of a frequency channel is also straightforward to extract from the auditory nerve—it can be obtained by low-pass filtering a spike train (because the amplitude changes reflected in the envelope are relatively slow). Despite the fact that envelope and fine structure are not completely independent (Ghitza, 2001), there has been much interest in the past decade in distinguishing their roles in different aspects of hearing (Smith, Delgutte, et al., 2002) and its impairment (Lorenzi, Gilbert, et al., 2006).

Perhaps surprisingly, the temporal information contained in amplitude envelopes can be sufficient for speech comprehension even when spectral information is severely limited. In a classic paper, Shannon and colleagues isolated the information contained in the amplitude envelopes of speech signals with a stimulus known as noise-vocoded speech (Shannon, Zeng, et al., 1995). Noise-vocoded speech is generated by filtering a speech signal and a noise signal into frequency bands, multiplying the frequency bands of the noise by the envelopes of the speech, and then summing the modified noise bands to synthesize a new sound signal. By using a small number of broad frequency bands, spectral information can be greatly reduced, leaving amplitude variation over time (albeit smeared across a broader than normal range of frequencies) as the primary signal cue. Examples are shown in Figure 8.8 for two, four, and eight bands. Shannon and colleagues found that the resulting stimulus was intelligible even when just a few bands were used (i.e., with much broader frequency tuning than is present in the cochlea), indicating that the temporal modulation of the envelopes contains much information about speech content.
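The vocoding steps just described (split into bands, extract each band's envelope, impose it on band-limited noise, sum) can be sketched in a few lines. This is a minimal illustration under assumptions of our own, not the procedure of Shannon and colleagues: the second-order band-pass filter, the rectify-and-smooth envelope, the band centers, the filter sharpness `q`, and the 30 Hz envelope cutoff are all placeholder choices, and a gated tone stands in for speech.

```python
import math
import random


def bandpass(signal, center_hz, q, fs):
    """Second-order (biquad) band-pass filter with unity gain at center_hz."""
    w0 = 2.0 * math.pi * center_hz / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    out, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1, y2, y1 = x1, x, y1, y
    return out


def envelope(signal, cutoff_hz, fs):
    """Crude envelope: full-wave rectify, then smooth with a one-pole low-pass."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    out, state = [], 0.0
    for x in signal:
        state = a * state + (1.0 - a) * abs(x)
        out.append(state)
    return out


def noise_vocode(speech, fs, centers_hz, q=2.0, env_cutoff_hz=30.0, seed=0):
    """Replace the fine structure in each band with noise, keeping the envelope."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in speech]
    out = [0.0] * len(speech)
    for fc in centers_hz:
        env = envelope(bandpass(speech, fc, q, fs), env_cutoff_hz, fs)
        carrier = bandpass(noise, fc, q, fs)
        for i, (e, c) in enumerate(zip(env, carrier)):
            out[i] += e * c
    return out


fs = 8000
# Stand-in for speech (an assumption): a 500 Hz tone on for 0.2 s, then silence.
speech = [math.sin(2.0 * math.pi * 500.0 * n / fs) if n < 1600 else 0.0
          for n in range(3200)]
vocoded = noise_vocode(speech, fs, centers_hz=[500.0, 1000.0, 2000.0])
```

The output is noise throughout, yet it is loud while the tone is on and falls silent when the tone stops, which is the sense in which the envelope survives while the fine structure is discarded.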

Modulation Tuning

Motivated by its perceptual importance, amplitude modulation has been proposed to be analyzed by dedicated banks of filters operating on the envelopes of cochlear filter outputs rather than the sound waveform itself (Dau, Kollmeier, et al., 1997). Early evidence for such a notion came from masking and adaptation experiments, which found that the detection of a modulated signal was impaired by a masker or adapting stimulus modulated at a similar frequency (Bacon & Grantham, 1989; Houtgast, 1989; Tansley & Suffield, 1983). There is now considerable evidence from neurophysiology that single neurons in the midbrain, thalamus, and cortex exhibit some degree of tuning to modulation frequency (Depireux, Simon, et al., 2001; Joris, Schreiner, et al., 2004; Miller, Escabi, et al., 2001; Rodriguez, Chen, et al., 2010; Schreiner & Urbas, 1986, 1988; Woolley, Fremouw, et al., 2005), loosely consistent with the idea of a modulation filter bank (Figure 8.9A). Because such filters are typically conceived to operate on the envelope of a particular cochlear channel, they are tuned both in acoustic frequency (courtesy of the cochlea) and modulation frequency.

Neurophysiological studies in nonhuman animals (Schreiner & Urbas, 1986, 1988) and neuroimaging results in humans (Boemio, Fromm, et al., 2005; Giraud, Lorenzi, et al., 2000; Schonwiesner & Zatorre, 2009) have generally found that the auditory cortex responds preferentially to low modulation frequencies (in the range of 4–8 Hz), whereas subcortical structures prefer higher rates (up to 100–200 Hz), with preferred modulation frequency generally decreasing up the auditory pathway. Based on this, it is intriguing to speculate that successive stages of the auditory system might process structure at progressively longer (slower) timescales, analogous to the progressive increase in receptive field size that occurs in the visual system from V1 to inferotemporal cortex (Lerner, Honey, et al., 2011). Within the cortex, however, no hierarchy is clearly evident as of yet, at least in the response to simple patterns of modulation (Boemio, Fromm, et al., 2005; Giraud, Lorenzi, et al., 2000). Moreover, there is considerable variation within each stage of the pathway in the preferred modulation frequency of individual neurons (Miller, Escabi, et al., 2001; Rodriguez, Chen, et al., 2010). There are several reports of topographic organization for modulation frequency in the inferior colliculus, in which a gradient of preferred modulation frequency is observed orthogonal to the tonotopic gradient of preferred acoustic frequency (Baumann, Griffiths, et al., 2011; Langner, Sams, et al., 1997). Whether there is topographic organization in the cortex remains unclear (Nelken, Bizley, et al., 2008).
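The idea of analyzing an envelope with channels tuned to different modulation rates can be illustrated with a deliberately simplified stand-in: instead of the band-pass modulation filters proposed by Dau and colleagues, each channel below is a single Fourier analyzer at one rate. The envelope, its sampling rate, and the list of channel rates are all hypothetical values chosen for the example.

```python
import cmath
import math


def modulation_gain(env, rate_hz, fs):
    """Amplitude of the envelope's Fourier component at one modulation rate."""
    mean = sum(env) / len(env)
    acc = sum((e - mean) * cmath.exp(-2j * math.pi * rate_hz * n / fs)
              for n, e in enumerate(env))
    return 2.0 * abs(acc) / len(env)


fs = 1000.0  # envelope sampling rate in Hz (assumed)
# Envelope of one cochlear channel, modulated at 4 Hz with depth 0.5.
env = [1.0 + 0.5 * math.cos(2.0 * math.pi * 4.0 * n / fs) for n in range(2000)]

rates = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]  # hypothetical channel rates (Hz)
bank = {rate: modulation_gain(env, rate, fs) for rate in rates}
best_rate = max(bank, key=bank.get)
print(best_rate, round(bank[best_rate], 3))
```

Only the 4 Hz channel responds appreciably, which is the sense in which such a bank would represent the modulation content of a channel separately from its acoustic frequency.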

Figure 8.8 Noise-vocoded speech. A, Spectrogram of a speech utterance, generated as in Figure 8.7C. B–D, Spectrograms of noise-vocoded versions of the utterance from A, generated with eight (B), four (C), or two (D) channels. To generate the noise-vocoded speech, the amplitude envelope of the original speech signal was first measured in each of the frequency bands in B, C, and D. A white noise signal was then filtered into these same bands, and the noise bands were multiplied by the corresponding speech envelopes. These modulated noise bands were then summed to generate a new sound signal. It is visually apparent that the sounds in parts B to D are spectrally coarser versions of the original utterance. Good speech intelligibility is usually obtained with only four channels, indicating that patterns of amplitude modulation can support speech recognition in the absence of fine spectral detail.

Modulation tuning in single neurons is often studied by measuring spectrotemporal receptive fields (STRFs) (Depireux, Simon, et al., 2001), conventionally estimated using techniques such as spike-triggered averaging. To compute an STRF, neuronal responses to a long, stochastically varying stimulus are recorded, after which the stimulus spectrogram segments preceding each spike are averaged to yield the STRF—the stimulus, described in terms of acoustic frequency content over time, that on average preceded a spike. In Figure 8.9B, for instance, the STRF consists of a decrease in power followed by an increase in power in the range of 10 kHz; the neuron would thus be likely to respond well to a rapidly modulated 10 kHz tone, and less so to a tone whose amplitude was constant. This STRF can be viewed as a filter that passes modulations in a certain range of rates, that is, modulation frequencies. Note, however, that it is also tuned in acoustic frequency (the dimension on the y-axis), responding only to modulations of fairly high acoustic frequencies.
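Spike-triggered averaging as described above can be sketched with a simulated neuron. Everything in the example is an assumption made for illustration: the stimulus is Gaussian noise in four channels, and the toy neuron fires when power in channel 2 rises after having been low, loosely echoing the decrease-then-increase pattern of Figure 8.9B. The function and variable names are ours.

```python
import random


def spike_triggered_average(spec, spikes, n_lags):
    """Average the spectrogram frames preceding each spike.

    spec[f][t] is the stimulus power in channel f at frame t, spikes is a list
    of frame indices, and the result sta[f][lag] is indexed by frames before
    the spike (lag 0 is the frame of the spike itself).
    """
    n_chan = len(spec)
    sta = [[0.0] * n_lags for _ in range(n_chan)]
    used = [t for t in spikes if t >= n_lags - 1]
    for t in used:
        for f in range(n_chan):
            for lag in range(n_lags):
                sta[f][lag] += spec[f][t - lag]
    return [[v / len(used) for v in row] for row in sta]


rng = random.Random(1)
n_chan, n_frames, n_lags = 4, 5000, 5
spec = [[rng.gauss(0.0, 1.0) for _ in range(n_frames)] for _ in range(n_chan)]

# Toy neuron (an assumption): excited by channel 2 one frame before the spike
# and suppressed by channel 2 three frames before it.
spikes = [t for t in range(n_lags, n_frames)
          if spec[2][t - 1] - spec[2][t - 3] > 1.5]

sta = spike_triggered_average(spec, spikes, n_lags)
for row in sta:
    print([round(v, 2) for v in row])
```

The average recovers a positive lobe at lag 1 and a negative lobe at lag 3 in channel 2, with values near zero elsewhere. With white-noise stimuli like this one, the average is proportional to the neuron's linear filter; for correlated natural stimuli, additional correction for stimulus correlations would be needed, which this sketch omits.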

Figure 8.9 Modulation tuning. A, Example of temporal modulation tuning curves for neurons in the medial geniculate nucleus of the thalamus (Miller, Escabi, et al., 2002). B, Example of the spectrotemporal receptive field (STRF) from a thalamic neuron (Miller, Escabi, et al., 2002). Note that the modulation in the STRF is predominantly along the temporal dimension, and that this neuron would thus be sensitive primarily to temporal modulation. C, Example of STRFs from cortical neurons (Mesgarani, David, et al., 2008). Note that the STRFs feature spectral modulation in addition to temporal modulation, and as such are selective for more complex acoustic features. Cortical neurons typically have longer latencies than subcortical neurons, but this is not evident in the STRFs, probably because of nonlinearities in the cortical neurons that produce small artifacts in the STRFs (Stephen David, personal communication). Figure parts are taken from the original sources.

The STRF approximates a neuron’s output as a linear function of the cochlear input—the result of convolving the spectrogram of the acoustic input with the STRF. However, it is clear that linear models are inadequate to explain neuronal responses (Christianson, Sahani, et al., 2008; Machens, Wehr, et al., 2004; Rotman, Bar Yosef, et al., 2001; Theunissen, Sen, et al., 2000). Understanding the nonlinear contributions is an important direction of future research (Ahrens, Linden, et al., 2008; David, Mesgarani, et al., 2009), as neuronal nonlinearities likely play critical computational roles, but at present much analysis is restricted to linear receptive field estimates. There are established methods for computing STRFs, and they exhibit many interesting properties even though they are clearly not the whole story.

Modulation tuning functions (e.g., those shown in Figure 8.9A) can be obtained via the Fourier transform of the STRF. Temporal modulation tuning is commonly observed, as previously discussed, but some tuning is normally also present for spectral modulation—variation in power that occurs along the frequency axis. Spectral modulation is often evident as well in spectrograms of speech (e.g., Figure 8.7C) and animal vocalizations. Modulation results both from individual frequency components and from formants—the broad spectral peaks that are present for vowel sounds due to vocal tract resonances. Tuning to spectral modulation is generally less pronounced than to amplitude modulation, especially subcortically (Miller, Escabi, et al., 2001), but is an important feature of cortical responses (Barbour & Wang, 2003; Mesgarani, David, et al., 2008). Examples of cortical STRFs with spectral modulation sensitivity are shown in Figure 8.9C.
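The linear STRF model described above can be illustrated with a minimal sketch. It assumes a toy spectrogram and a toy STRF stored as nested lists; the function name `predict_rate` and all numerical values are hypothetical and chosen only to make the convolution explicit. A purely linear prediction can go negative, which is one simple reason such models are incomplete descriptions of firing rates.

```python
def predict_rate(spectrogram, strf):
    """Linear STRF prediction: r(t) = sum over f and k of strf[f][k] * spectrogram[f][t - k].

    spectrogram: list of frequency channels, each a list of power values over time.
    strf: list of frequency channels, each a list of weights over time lag k (k = 0, 1, ...).
    """
    n_freq = len(spectrogram)
    n_time = len(spectrogram[0])
    n_lag = len(strf[0])
    rate = []
    for t in range(n_time):
        total = 0.0
        for f in range(n_freq):
            # Causal filter: only the current and past time bins contribute.
            for k in range(min(n_lag, t + 1)):
                total += strf[f][k] * spectrogram[f][t - k]
        rate.append(total)
    return rate


# Toy input (hypothetical): 3 frequency channels, 6 time bins,
# with a single impulse in the middle channel at time bin 2.
spectrogram = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
# Toy STRF (hypothetical): excitation at lag 1 followed by suppression at lag 2
# in one channel only, i.e., temporal modulation without spectral modulation.
strf = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, -0.5],
    [0.0, 0.0, 0.0],
]
rate = predict_rate(spectrogram, strf)
print(rate)
```

The predicted response to the impulse is simply a copy of the STRF's temporal profile in the stimulated channel, delayed to the time of the impulse.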


IV. Adaptive Coding and Plasticity

Because the auditory system evolved to enable behavior in natural auditory environments, it is likely to be adapted for the representation of naturally occurring sounds. Natural sounds thus in principle should provide hearing researchers with clues about the structure and function of the auditory system (Attias & Schreiner, 1997). In recent years there has been increasing interest in the use of natural sounds as experimental stimuli and in computational analyses of the relation between auditory representation and the environment. Most of the insights gained thus far from this approach are “postdictive”—they offer explanations of previously observed phenomena rather than revealing previously unforeseen mechanisms. For instance, we described earlier the attempts to explain cochlear frequency selectivity as optimal for encoding natural sounds (Lewicki, 2002; Smith & Lewicki, 2006). The efficient coding hypothesis has also been proposed to apply to modulation tuning in the inferior colliculus. Modulation tuning bandwidth tends to increase with preferred modulation frequency (Rodriguez, Chen, et al., 2010), as would be predicted if the lowpass modulation spectra of most natural sounds (Attias & Schreiner, 1997; McDermott, Wrobleski, et al., 2011; Singh & Theunissen, 2003) were to be divided into channels conveying equal power. Inferior colliculus neurons have also been found to convey more information about sounds whose amplitude distribution follows that of natural sounds rather than that of white noise (Escabi, Miller, et al., 2003). Along the same lines, studies of STRFs in the bird auditory system indicate that neurons are tuned to the properties of bird song and other natural sounds, maximizing discriminability of behaviorally important sounds (Hsu, Woolley, et al., 2004; Woolley, Fremouw, et al., 2005). Similar arguments have been made about the coding of binaural cues to sound localization (Harper & McAlpine, 2004).
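The equal-power argument for modulation tuning can be made concrete with a small sketch. It assumes, purely for illustration, a modulation power spectrum proportional to 1/f between two hypothetical limits (2 Hz and 512 Hz); the function name `equal_power_edges` and the choice of four channels are also assumptions. Under that assumption the power between two frequencies depends only on their ratio, so equal-power channels are equally wide in log frequency and their bandwidth in hertz grows with center frequency.

```python
import math


def equal_power_edges(f_lo, f_hi, n_channels):
    """Band edges that split a 1/f power spectrum on [f_lo, f_hi] into equal-power channels.

    For power density proportional to 1/f, the power between a and b is
    proportional to ln(b / a), so the edges are equally spaced in log frequency.
    """
    ratio = f_hi / f_lo
    return [f_lo * ratio ** (i / n_channels) for i in range(n_channels + 1)]


edges = equal_power_edges(f_lo=2.0, f_hi=512.0, n_channels=4)
lower, upper = edges[:-1], edges[1:]
bandwidths = [hi - lo for lo, hi in zip(lower, upper)]
powers = [math.log(hi / lo) for lo, hi in zip(lower, upper)]  # power per channel, up to a constant

for lo, hi, bw in zip(lower, upper, bandwidths):
    print(f"{lo:7.1f} to {hi:7.1f} Hz   bandwidth {bw:7.1f} Hz")
```

Each channel carries the same power, yet the highest channel is far broader than the lowest, which is the qualitative pattern the efficient coding account predicts.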
Other strands of research have explored whether the auditory system might further adapt to the environment by changing its coding properties in response to changing environmental statistics, so as to optimally represent the current environment. Following on research showing that the visual system adapts to local contrast statistics (Fairhall, Lewen, et al., 2001), numerous groups have reported evidence for neural adaptation in the auditory system—responses to a fixed stimulus that vary depending on the immediate history of stimulation (Ulanovsky, Las, et al., 2003; Kvale & Schreiner, 2004). In some cases, it can be shown that this adaptation increases information transmission. For instance, the “tuning” of neurons in the inferior colliculus to sound intensity (i.e., the function relating intensity to firing rate) depends on the mean and variance of the local intensity distribution (Dean, Harper, et al., 2005). Qualitatively, the rate–intensity curves shift so that the point of maximum slope (around which neural discrimination of intensity is best) is closer to the most commonly occurring intensity. Quantitatively, this behavior results in increased information transmission about stimulus level.

Some researchers have recently taken things a step further, showing that auditory responses are dependent not just on the stimulus history but also on the task a listener is performing. Fritz and colleagues found that the STRFs measured for neurons in the primary auditory cortex of awake ferrets change depending on whether the animals are performing a task (Fritz, Shamma, et al., 2003), and that the nature of the change depends on the task (Fritz, Elhilali, et al., 2005). For instance, STRF changes serve to accentuate the frequency of a tone being detected, or to enhance discrimination of a target tone from a reference. These changes are mirrored in sound-evoked responses in the prefrontal cortex (Fritz, David, et al., 2010), which may drive the changes that occur in auditory cortex during behavior. In some cases the STRF changes persist long after the animals are finished performing the task, and as such may play a role in sensory memory and perceptual learning.

Perhaps surprisingly, long-term plasticity appears to occur as early as the brainstem, where recent evidence in humans suggests considerable experience-dependent variation across individuals. The data in question derive from an evoked electrical potential known as the auditory brainstem response (ABR) (Skoe & Kraus, 2010). The ABR is recorded at the scalp but is believed to originate in the brainstem.
It often mirrors properties of the stimulus, such that its power spectrum, for instance, often resembles that of the acoustic input. The extent to which the ABR preserves the stimulus can thus be interpreted as a measure of processing integrity. Interestingly, the ABR more accurately tracks stimulus frequency for musician listeners than nonmusicians (Wong, Skoe, et al., 2007). This could in principle reflect innate differences in auditory ability that predispose listeners to become musicians or not, but it could also reflect the substantial differences in auditory experience between the two groups. Consistent with the latter notion, 10 hours of training on a pitch discrimination task is sufficient to improve the fidelity of the ABR response to frequency, providing clear evidence of experience-dependent plasticity (Carcagno & Plack, 2011). Aspects of the ABR are also altered in listeners with reading problems (Banai, Hornickel, et al., 2009). This line of research suggests that potentially important individual differences are present at early stages of the auditory system, and that these differences are in part the result of plasticity.

V. Sound Source Perception

Ultimately, we wish to understand not only what acoustic measurements are made by the auditory system, as were characterized in the previous sections, but also how these measurements give rise to perception—what we hear when we listen to sound. Following Helmholtz, we might suppose that the purpose of audition is to infer something about the events in the world that produce sound. We can often identify sound sources with a verbal label, for instance, and realize that we heard a finger snap, a flock of birds, or construction noise. Even if we cannot determine the object that caused the sound, we may nonetheless know something about what happened: that something fell onto a hard floor, or into water (Gaver, 1993). Despite the richness of these aspects of auditory recognition, remarkably little is known about them at present (speech recognition stands alone as an exception), mainly because they are rarely studied (but see Gygi, Kidd, et al., 2004; Lutfi, 2008; McDermott & Simoncelli, 2011).

Perhaps because isolated properties of sounds or their sources are more easily controlled and manipulated, researchers have been more inclined to study the perception of those properties instead. Much research has concentrated in particular on three well-known properties of sound: spatial location, pitch, and loudness. This focus is in some sense unfortunate because auditory perception is much richer than the hegemony of these three attributes in hearing science would indicate. However, their study has nonetheless given rise to fruitful lines of research that have yielded many useful insights about hearing more generally.

Localization

Localization is less precise in hearing than in vision but is nonetheless of great value, because sound enables us to localize objects that we may not be able to see. Human observers can judge the location of a source to within a few degrees if conditions are optimal. The processes by which this occurs are among the best understood in hearing.

Spatial location is not made explicit on the cochlea, which provides a map of frequency rather than of space, and instead must be derived from three primary sources of information. Two of these are binaural, resulting from differences in the acoustic input to the two ears. Due to the difference in path length from the source to the ears, and to the acoustic shadowing effect of the head, sounds to one side of the vertical meridian reach the two ears at different times and with different intensities. These interaural time and level differences vary with direction and thus provide a cue to a sound source’s location. Binaural cues are primarily useful for deriving the location of a sound in the horizontal plane, because changes in elevation do not change interaural time or intensity differences much. To localize sounds in the vertical dimension, or to distinguish sounds coming from in front of the head from those from in back, listeners rely on a third source of information: the filtering of sounds by the body and ears. This filtering is direction specific, such that a spectral analysis can reveal peaks and valleys in the frequency spectrum that are signatures of location in the vertical dimension (Figure 8.10; discussed further below).
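As an illustration of how an interaural time difference could be recovered from the two ear signals, the following sketch cross-correlates a noise burst with a delayed copy of itself and reports the lag of the correlation peak. This is a generic signal-processing estimate under stated assumptions, not a model of the neural computation; the function name `estimate_itd`, the 48 kHz sample rate, and the 0.5 ms delay are all hypothetical choices.

```python
import random


def estimate_itd(left, right, sample_rate, max_lag):
    """Return the lag (in seconds) at which the right-ear signal best matches the left.

    Positive values mean the sound reached the left ear first.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Only use sample pairs for which both indices are inside the signals.
        start = max(0, -lag)
        stop = min(n, n - lag)
        score = sum(left[i] * right[i + lag] for i in range(start, stop))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


sample_rate = 48000                      # Hz (assumed)
delay = 24                               # samples; 24 / 48000 s = 0.5 ms
rng = random.Random(0)                   # fixed seed so the example is repeatable
source = [rng.gauss(0.0, 1.0) for _ in range(400)]

left = source
right = [0.0] * delay + source[:-delay]  # the right ear receives a delayed copy

itd = estimate_itd(left, right, sample_rate, max_lag=40)
print(f"estimated ITD: {itd * 1e3:.2f} ms")
```

The search range of plus or minus 40 samples (about 0.8 ms) is chosen to bracket the fraction-of-a-millisecond delays that a human head can produce.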


Figure 8.10 Head-related transfer function (HRTF). Example HRTF for the left ear of one human listener. The gray level represents the amount by which a frequency originating at a particular elevation is attenuated or amplified by the torso, head, and ear of the listener. Sounds are filtered differently depending on their elevation, and the spectrum that is registered by the cochlea thus provides a localization cue. Note that most of the variation in elevation-dependent filtering occurs at high frequencies (above 4 kHz). Figure is reprinted with permission from original source: Zahorik, Bangayan, et al., 2006.

Interaural time differences (ITDs) are typically a fraction of a millisecond, and just-noticeable ITDs (which determine spatial acuity) can be as low as 10 microseconds (Klump & Eady, 1956). This is striking given that neural refractory periods (which determine the minimal interspike interval for a single neuron) are on the order of a millisecond, which one might think would put a limit on the temporal resolution of neural representations. Typical interaural level differences (ILDs) can be as large as 20 dB, with a just-noticeable difference of about 1 dB. ILDs result from the acoustic shadow cast by the head. To first order, ILDs are more pronounced for high frequencies because low frequencies are less affected by the acoustic shadow (because their wavelengths are comparable to the dimensions of the head). ITDs, in contrast, support localization most effectively at low frequencies, when the time difference between individual cycles of sinusoidal sound components can be detected via phase-locked spikes from the two ears (phase locking, as we discussed earlier, degrades at high frequencies). That said, ITDs between the envelopes of high-frequency sounds can also produce percepts of localization. The classic “duplex” view that localization is determined by either ILDs or ITDs, depending on the frequency (Rayleigh, 1907), is thus not fully appropriate for realistic natural sounds, which in general produce perceptible ITDs across the spectrum. See Middlebrooks and Green (1991) for a review of much of the classic behavioral work on sound localization.

The binaural cues to sound location are extracted in the superior olive, a subcortical region where inputs from the two ears are combined. In most animals there appears to be an elegant segregation of function, with ITDs being extracted in the medial superior olive (MSO) and ILDs being extracted in the lateral superior olive (LSO).
In both cases, accurate coding of interaural differences is made possible by neural signaling with unusually high temporal precision. This precision is needed to encode both sub-millisecond ITDs and ILDs of brief transient events, for which the inputs from the ears must be aligned in time. Brain structures subsequent to the superior olive largely inherit its ILD and ITD sensitivity. See Yin and Kuwada (2010) for a recent review of the physiology of binaural localization.

Binaural cues are of little use in distinguishing sounds at different locations on the vertical dimension (relative to the head), or in distinguishing front from back, because interaural time and level differences are largely unaffected by changes across these locations. Instead, listeners rely on spectral cues provided by the filtering of a sound by the torso, head, and ears of a listener. The filtering results from the reflection and absorption of sound by the surfaces of a listener’s body, with sound from different directions producing different patterns of reflection and thus different patterns of filtering. The effect of these interactions on the sound that reaches the eardrum can be described by a linear filter known as the head-related transfer function (HRTF). The overall effect is that of amplifying some frequencies while attenuating others. A broadband sound entering the ear will thus be endowed with peaks and valleys in its frequency spectrum (see Figure 8.10). Compelling sound localization can be perceived when these peaks and valleys are artificially induced. The effect of the filtering is obviously confounded with the spectrum of the unfiltered sound source, and the brain must make some assumptions about the source spectrum. When these assumptions are violated, as with narrowband sounds whose spectral energy occurs at a peak in the HRTF of a listener, sounds are mislocalized (Middlebrooks, 1992). For broadband sounds, however, HRTF filtering produces signatures that are sufficiently distinct as to support localization in the vertical dimension to within 5 degrees or so in some cases, although some locations are more accurately perceived than others (Makous & Middlebrooks, 1990; Wightman & Kistler, 1989).

The bulk of the filtering occurs in the outer ear (the pinna), the folds of which produce distinctive patterns of reflections.
Because pinna shapes vary across listeners, the HRTF is listener specific as well as location specific, with spectral peaks and valleys that are in different places for different listeners. Listeners appear to learn the HRTFs for their set of ears. When ears are artificially modified with plastic molds that change their shape, localization initially suffers considerably, but over a period of weeks, listeners regain the ability to localize with the modified ears (Hofman, Van Riswick, et al., 1998). Listeners thus learn at least some of the details of their particular HRTF through experience, although sounds can be localized even when the peaks and valleys of the pinna filtering are somewhat blurred (Kulkarni & Colburn, 1998). Moreover, compelling spatialization is often evident even if a generic HRTF is used.

The physiology of HRTF-related cues for localization is not as well characterized as that of binaural cues, but there is evidence that midbrain regions may again be important. Many inferior colliculus neurons, for instance, show tuning to sound elevation (Delgutte, Joris, et al., 1999). The selectivity for elevation presumably derives from tuning to particular spectral patterns (peaks and valleys in the spectrum) that are diagnostic of particular locations (May, Anderson, et al., 2008).

Although the key cues for sound localization are extracted subcortically, lesion studies reveal that the cortex is essential for localizing sound. Ablating auditory cortex typically produces large deficits in localization (Heffner & Heffner, 1990), with unilateral lesions producing deficits specific to locations contralateral to the side of the lesion (Jenkins & Masterton, 1982). Consistent with these findings, tuning to sound location is widespread in auditory cortical neurons, with the preferred location generally positioned in the contralateral hemifield (Middlebrooks, 2000). Topographic representations of space have not been found within individual auditory cortical areas, although one recent report argues that such topography may be evident across multiple areas (Higgins, Storace, et al., 2010).

Pitch

Although the word pitch is often used colloquially to refer to the perception of sound frequency, in hearing research it has a more specific meaning—pitch is the perceptual correlate of periodicity. Vocalizations, instrument sounds, and some machine sounds are all often produced by periodic physical processes. Our vocal cords open and close at regular intervals, producing a series of clicks separated by regular temporal intervals. Instruments produce sounds via strings that oscillate at a fixed rate, or via tubes in which the air vibrates at particular resonant frequencies, to give two examples. Machines frequently feature rotating parts, which often produce sounds at every rotation. In all these cases, the resulting sounds are periodic—the sound pressure waveform consists of a single shape that repeats at a fixed rate (Figure 8.11A). Perceptually, such sounds are heard as having a pitch that can vary from low to high, proportional to the frequency at which the waveform repeats (the fundamental frequency, i.e., the F0). The periodicity is distinct from whether a sound’s frequencies fall in high or low regions of the spectrum, although in practice periodicity and the spectral center of mass are sometimes correlated.

Pitch is important because periodicity is important—the period is often related to properties of the source that are useful to know, such as its size, or tension. Pitch is also used for communicative purposes, varying in speech prosody, for instance, to convey meaning or emotion. Pitch is a centerpiece of music, forming the basis of melody, harmony, and tonality. Listeners also use pitch to track sound sources of interest in auditory scenes. Many physically different sounds—all those with a particular period—have the same pitch.
Historically, pitch has been a focal point of hearing research because it is an important perceptual property with a nontrivial relationship to the acoustic input, whose mechanistic characterization has been resistant to unambiguous solution. Debates on pitch and related phenomena date back at least to Helmholtz, and continue to occupy many researchers today (Plack, Oxenham, et al., 2005).

One central debate concerns whether pitch is derived from an analysis of frequency or time. Periodic waveforms produce spectra whose frequencies are harmonically related—they form a harmonic series, being integer multiples of the fundamental frequency, whose period is the period of the waveform (Figure 8.11B). Although the fundamental frequency determines the pitch, the fundamental need not be physically present in the spectrum for a sound to have pitch—sounds missing the fundamental frequency but containing other harmonics of the fundamental are still perceived to have the pitch of the fundamental, an effect known as the missing fundamental illusion. What matters for pitch perception is whether the frequencies that are present are harmonically related. Pitch could thus conceivably be detected with harmonic templates applied to an estimate of a sound’s spectrum obtained from the cochlea (Goldstein, 1973; Shamma & Klein, 2000; Terhardt, 1974; Wightman, 1973). Alternatively, periodicity could be assessed in the time domain, for instance via the autocorrelation function (Cariani & Delgutte, 1996; de Cheveigne & Kawahara, 2002; Meddis & Hewitt, 1991). The autocorrelation measures the correlation of a signal with a delayed copy of itself. For a periodic signal that repeats with some period, the autocorrelation exhibits peaks at multiples of the period (Figure 8.11C).
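The autocorrelation account, including its handling of a missing fundamental, can be sketched in a few lines. The example below assumes a synthetic tone containing only harmonics 2 to 4 of a 200 Hz fundamental, sampled at 8 kHz; these values, the helper name `autocorrelation`, and the lag search range are illustrative assumptions, and the sketch is not any of the published pitch models cited above.

```python
import math


def autocorrelation(signal, lag, window):
    """Unnormalized autocorrelation at one lag, computed over a fixed-length window."""
    return sum(signal[i] * signal[i + lag] for i in range(window))


sample_rate = 8000            # Hz (assumed)
f0 = 200.0                    # Hz; the period is 40 samples (5 ms)
harmonics = [2, 3, 4]         # 400, 600, and 800 Hz; no energy at f0 itself
n_samples = 800

signal = [
    sum(math.sin(2 * math.pi * h * f0 * n / sample_rate) for h in harmonics)
    for n in range(n_samples)
]

# Candidate periods from 1.25 to 8.75 ms. The range is chosen to contain the
# true period but not twice the period, where an equally high peak would occur.
min_lag, max_lag = 10, 70
window = n_samples - max_lag
best_lag = max(
    range(min_lag, max_lag + 1),
    key=lambda lag: autocorrelation(signal, lag, window),
)
estimated_f0 = sample_rate / best_lag
print(f"best lag: {best_lag} samples, estimated F0: {estimated_f0:.1f} Hz")
```

Although the signal has no component at 200 Hz, the waveform still repeats every 5 ms, so the autocorrelation peak falls at the period of the absent fundamental.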

Figure 8.11 Periodicity and pitch. Waveform, spectrum, and autocorrelation function for a note played on an oboe. The note shown is the A above middle C, with a fundamental frequency (F0) of 440 Hz. A, Excerpt of waveform. Note that the waveform repeats every 2.27 ms (the period). B, Spectrum. Note the peaks at integer multiples of the F0, characteristic of a periodic sound. In this case, the F0 is physically present, but the second, third, and fourth harmonics actually have higher amplitude. C, Autocorrelation. The correlation coefficient is always 1 at a lag of 0 ms, but because the waveform is periodic, correlations close to 1 are also found at integer multiples of the period (2.27, 4.55, 6.82, and 9.09 ms in this example). Figure reprinted with permission from original source: McDermott & Oxenham, 2008.

Such analyses are in principle functionally equivalent because the power spectrum is related to the autocorrelation via the Fourier transform, and detecting periodicity in one domain versus the other might simply seem a question of implementation. In the context of the auditory system, however, the two concepts diverge, due to information being limited by distinct factors in the two domains. Time–domain models are typically assumed to utilize fine-grained spike timing (i.e., phase locking), with concomitant temporal resolution limits. In contrast, frequency-based models (often known as place models, in reference to the frequency–place mapping that occurs on the basilar membrane) rely on the pattern of excitation along the cochlea, which is limited in resolution by the frequency tuning of the cochlea (Cedolin & Delgutte, 2005). Cochlear frequency selectivity is present in time–domain models of pitch as well, but its role is typically not to estimate the spectrum but simply to restrict an autocorrelation analysis to a narrow frequency band (Bernstein & Oxenham, 2005), which might help improve its robustness in the presence of multiple sound sources. Reviews of the current debates and their historical origins are available elsewhere (de Cheveigne, 2004; Plack & Oxenham, 2005), and we will not discuss them exhaustively here. Suffice it to say that despite being a centerpiece of hearing research for decades, the mechanisms underlying pitch perception remain under debate.

Research on pitch has provided many important insights about hearing even though a conclusive account of pitch remains elusive. One contribution of pitch research has been to reveal the importance of the resolvability of individual frequency components by the cochlea, a principle that has importance in other aspects of hearing as well.
Because the frequency resolution of the cochlea is approximately constant on a logarithmic scale, whereas the components of a harmonic tone are equally spaced on a linear scale (separated by a fixed number of hertz, equal to the fundamental frequency of the tone; Figure 8.12A), multiple high-numbered harmonics fall within a single cochlear filter (Figure 8.12B). Because of the nature of the log scale, this is true regardless of whether the fundamental is low or high. As a result, the excitation pattern induced by a tone on the cochlea (of a human with normal hearing) is believed to contain resolvable peaks for only the first ten or so harmonics (Figure 8.12C).
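The contrast between linearly spaced harmonics and filters that broaden with center frequency can be shown with a rough calculation. The sketch below assumes the Glasberg and Moore (1990) approximation for the equivalent rectangular bandwidth of human auditory filters, which is not taken from this chapter, and simply expresses each filter's bandwidth in units of the harmonic spacing. It does not define a criterion for when a harmonic counts as resolved, so it should not be read as reproducing the "ten or so" figure.

```python
def erb_hz(center_hz):
    """Equivalent rectangular bandwidth (Hz) of the auditory filter at center_hz.

    Assumption: the Glasberg and Moore (1990) approximation,
    ERB = 24.7 * (4.37 * f / 1000 + 1), with f in Hz.
    """
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


f0 = 100.0                        # Hz, matching the tone described for Figure 8.12A
harmonic_numbers = range(1, 36)   # thirty-five harmonics

# Filter bandwidth at each harmonic, measured in units of the harmonic spacing (f0).
# Values well below 1 mean the filter passes essentially one harmonic;
# values above 1 mean several harmonics share the filter.
harmonics_per_filter = {n: erb_hz(n * f0) / f0 for n in harmonic_numbers}

for n in (1, 5, 10, 20, 35):
    print(f"harmonic {n:2d} at {n * f0:6.0f} Hz: "
          f"filter spans {harmonics_per_filter[n]:.2f} harmonic spacings")
```

Under these assumptions the filter centered on the first harmonic is much narrower than the harmonic spacing, whereas the filter centered on the thirty-fifth harmonic spans several harmonics.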


Figure 8.12 Resolvability. A, Spectrum of a harmonic complex tone composed of thirty-five harmonics of equal amplitude. The fundamental frequency is 100 Hz—the frequency of the lowest component in the spectrum and the amount by which adjacent harmonics are separated. B, Frequency responses of auditory filters, each of which represents a particular point on the cochlea. Note that because a linear frequency scale is used, the filters increase in bandwidth with center frequency, such that many harmonics fall within the passband of the high frequency filters. C, The resulting pattern of excitation along the cochlea in response to the tone in A. The excitation is the amplitude of vibration of the basilar membrane as a function of characteristic frequency (the frequency to which a particular point on the cochlea responds best, i.e., the center frequency of the auditory filter representing the response properties of the cochlea at that point). Note that the first ten or so harmonics produce resolvable peaks in the pattern of excitation, but that higher numbered harmonics do not. The latter are thus said to be “unresolved.” D, The pattern of vibration that would be observed on the basilar membrane at several points along its length. When harmonics are resolved, the vibration is dominated by the harmonic close to the characteristic frequency, and is thus sinusoidal. When harmonics are unresolved, the vibration pattern is more complex, reflecting the multiple harmonics that stimulate the cochlea at those points. Figure reprinted with permission from original source: Plack, 2005.

There is now abundant evidence that resolvability places strong constraints on pitch perception. For instance, the perception of pitch is determined predominantly by low-numbered harmonics (harmonics one to ten or so in the harmonic series), presumably owing to the peripheral resolvability of these harmonics. Moreover, the ability to discriminate pitch is much poorer for tones synthesized with only high-numbered harmonics than for tones containing only low-numbered harmonics, an effect not accounted for simply by the frequency range in which the harmonics occur (Houtsma & Smurzynski, 1990; Shackleton & Carlyon, 1994). This might be taken as evidence that the spatial pattern of excitation, rather than the periodicity that could be derived from the autocorrelation, underlies pitch perception, but variants of autocorrelation-based models have also been proposed to account for the effect of resolvability (Bernstein & Oxenham, 2005). Resolvability has since been demonstrated to constrain sound segregation as well as pitch (Micheyl & Oxenham, 2010); see below.

Just as computational theories of pitch remain a matter of debate, so do its neural correlates. One might expect that neurons at some stage of the auditory system would be tuned to stimulus periodicity, and there is one recent report of this in marmosets (Bendor & Wang, 2005). However, comparable results have yet to be reported in other species (Fishman, Reser, et al., 1998), and some have argued that pitch is encoded by ensembles of neurons with broad tuning rather than single neurons selective for particular fundamental frequencies (Bizley, Walker, et al., 2010). In general, pitch-related responses can be difficult to disentangle from artifactual responses to distortions introduced by the nonlinearities of the cochlea (de Cheveigne, 2010; McAlpine, 2004).

Given the widespread presence of frequency tuning in the auditory system, and the importance of harmonic frequency relations in pitch, sound segregation (Darwin, 1997), and music (McDermott, Lehr, et al., 2010), it is natural to think there might be neurons with multipeaked tuning curves selective for harmonic frequencies.
There are a few isolated reports of such tuning (Kadia & Wang, 2003; Sutter & Schreiner, 1991), but the tuning peaks do not always correspond to harmonic frequencies, and whether they relate to pitch is unclear. At least given how researchers have looked for it thus far, tuning for harmonicity is not as evident in the auditory system as might be expected.

If pitch is analyzed in a particular part of the brain, one might expect the region to respond more to stimuli with pitch than to those lacking it, other things being equal. Such response properties have in fact been reported in regions of auditory cortex identified with functional imaging in humans (Hall, Barrett, et al., 2005; Patterson, Uppenkamp, et al., 2002; Penagos, Melcher, et al., 2004; Schonwiesner & Zatorre, 2008). The regions are typically reported to lie outside primary auditory cortex, and could conceivably be homologous to the region claimed to contain pitch-tuned neurons in marmosets (Bendor & Wang, 2006), although again there is some controversy over whether pitch per se is implicated (Hall & Plack, 2009). See Winter, 2005, and Walker, Bizley, et al., 2010, for recent reviews of the brain basis of pitch.

In many contexts (e.g., the perception of music or speech intonation), it is the changes in pitch over time that matter rather than the absolute value of the F0. For instance, pitch increases or decreases are what capture the identity of a melody or the intention of a speaker. Less is known about how this relative pitch information is represented in the brain, but the right temporal lobe has been argued to be important, in part on the basis of brain-damaged patients with apparently selective deficits in relative pitch (Johnsrude, Penhune, et al., 2000). See McDermott and Oxenham, 2008, for a review of the perceptual and neural basis of relative pitch.

Loudness

Loudness is perhaps the most immediate perceptual property of sound, and has been actively studied for more than 150 years. To first order, loudness is the perceptual correlate of sound intensity. In real-world listening scenarios, loudness exhibits additional influences that suggest it serves to estimate the intensity of a sound source, as opposed to the intensity of the sound entering the ear (which changes with distance and the listening environment). However, loudness models that capture exclusively peripheral processing nonetheless have considerable predictive power.

For a sound with a fixed spectral profile, such as a pure tone or a broadband noise, the relationship between loudness and intensity can be approximated via the classic Stevens power law (Stevens, 1955). However, the relation between loudness and intensity is not as simple as one might imagine. For instance, loudness increases with increasing bandwidth: a sound whose frequencies lie in a broad range will seem louder than a sound whose frequencies lie in a narrow range, even when their physical intensities are equal.

Standard models of loudness thus posit something somewhat more complex than a simple power law of intensity: that loudness is linearly related to the total amount of neural activity elicited by a stimulus at the level of the auditory nerve (ANSI, 2007; Moore & Glasberg, 1996). The effect of bandwidth on loudness is explained via the compression that occurs in the cochlea: loudness is determined by the neural activity summed across nerve fibers, the spikes of which are generated after the output of a particular cochlear location is nonlinearly compressed. Because compression boosts low responses relative to high responses, the sum of several responses to low amplitudes (produced by the several frequency channels stimulated by a broadband sound) is greater than a single response to a high amplitude (produced by a single frequency channel responding to a narrow-band sound of equal intensity). Loudness also increases with duration for durations up to half a second or so (Buus, Florentine, et al., 1997), suggesting that it is computed from neural activity integrated over some short window.

The ability to predict perceived loudness is important in many practical situations, and is a central issue in the fitting of hearing aids. Cochlear compression is typically reduced in hearing-impaired listeners, and amplification runs the risk of making sounds uncomfortably loud unless compression is introduced artificially. There has thus been long-standing interest in quantitative models of loudness.

Loudness is also influenced in interesting ways by the apparent distance of a sound source. Because intensity attenuates with distance from a sound source, the intensity of a sound at the ear is determined conjointly by the intensity and distance of the source. At least in some contexts, the auditory system appears to use loudness as a perceptual estimate of a source’s intensity (i.e., the intensity at the point of origin), such that sounds that appear more distant seem louder than those that appear closer but have the same overall intensity at the ear. Visual cues to distance have some influence on perceived loudness (Mershon, Desaulniers, et al., 1981), but the cue provided by the amount of reverberation also seems to be important. The more distant a source, the weaker the direct sound from the source to the listener, relative to the reverberant sound that reaches the listener after reflection off surfaces in the environment (see Figure 8.14). This ratio of direct to reverberant sound appears to be used both to judge distance and to calibrate loudness perception (Zahorik & Wightman, 2001), although how the listener estimates this ratio from the sound signal remains unclear at present. Loudness thus appears to function somewhat like size or brightness perception in vision, in which perception is not based exclusively on retinal size or light intensity (Adelson, 2000).
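
The bandwidth effect on loudness described above can be illustrated with a toy calculation. The sketch below is not the ANSI or Moore and Glasberg model. It simply assumes that each frequency channel compresses its input intensity with a power-law exponent (0.3 here, chosen only for illustration) and that loudness is the sum of the compressed channel responses. The function name, the exponent, and the number of channels are all hypothetical.

```python
def compressed_loudness(channel_intensities, exponent=0.3):
    """Toy loudness: sum of power-law-compressed channel intensities.

    The exponent is an illustrative assumption, not a fitted value.
    """
    return sum(intensity ** exponent for intensity in channel_intensities)


total_intensity = 1.0

# Narrow-band sound: all of the energy falls in a single channel.
narrowband = [total_intensity]

# Broadband sound: the same total energy spread evenly over 8 channels.
broadband = [total_intensity / 8] * 8

narrow_loudness = compressed_loudness(narrowband)
broad_loudness = compressed_loudness(broadband)
```

Under these assumptions the broadband sound yields the larger summed response even though the two sounds have equal total intensity, which is the qualitative pattern the compression account predicts. With an exponent of 1 (no compression) the two values would be equal.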

VI. Auditory Scene Analysis

Thus far we have discussed how the auditory system represents single sounds in isolation, as might be produced by a note played on an instrument, or a word uttered by someone talking. The simplicity of such isolated sounds renders them convenient objects of study, yet in many auditory environments, isolated sounds are not the norm. It is often the case that many things make sound at the same time, causing the ear to receive a mixture of multiple sources as its input. Consider Figure 8.13, which displays the spectrogram of a single “target” speaker along with those of the mixtures that result from adding to it the utterances of one, three, and seven additional speakers, as might occur in a social setting. The brain’s task in this case is to take such a mixture as input and recover enough of the content of a target sound source to allow speech comprehension or otherwise support behavior. This is a nontrivial task. In the example of Figure 8.13, for instance, it is apparent that the structure of the target utterance is progressively obscured as more speakers are added to the mixture. Machine systems for recognizing speech suffer dramatically under such conditions, performing well in quiet, but much worse in the presence of multiple speakers (Lippmann, 1997). The presence of competing sounds greatly complicates the computational extraction of just about any sound source property, from pitch (de Cheveigne, 2006) to location. Human listeners, however, parse auditory scenes with a remarkable degree of success. In the example of Figure 8.13, the target remains largely audible to most listeners even in the mixture of eight speakers. This is the classic “cocktail party problem” (Bee & Micheyl, 2008; Bregman, 1990; Bronkhorst, 2000; Carlyon, 2004; Cherry, 1953; Darwin, 1997; McDermott, 2009).

Historically, the “cocktail party problem” has referred to two conceptually distinct problems that in practice are closely related. The first, known as sound segregation, is the problem of deriving representations of individual sound sources from a mixture of sounds. The second is the task of directing attention to one source among many, as when listening to a particular speaker at a party. These tasks are related because the ability to segregate sounds is probably dependent on attention (Carlyon, Cusack, et al., 2001; Shinn-Cunningham, 2008), although the extent and nature of this dependence remains an active area of study (Macken, Tremblay, et al., 2003). Here, we will focus on the first problem, sound segregation, which is usually studied under conditions in which listeners pay full attention to a target sound. Al Bregman, a Canadian psychologist, is typically credited with drawing interest to this problem and pioneering its study (Bregman, 1990).

Sound Segregation and Acoustic Grouping Cues

Figure 8.13 The cocktail party problem. Spectrograms of a single “target” utterance (top row), and the same utterance mixed with one, three, and seven additional speech signals from different speakers. The mixtures approximate the signal that would enter the ear if the additional speakers were talking as loud as the target speaker, but were standing twice as far away from the listener (to simulate cocktail party conditions). The grayscale denotes attenuation from the maximum energy level across all of the signals (in dB), such that gray levels can be compared across spectrograms. Spectrograms in the right column are identical to those on the left except for the superimposed color masks. Pixels labeled green are those where the original target speech signal is more than –50 dB but the mixture level is at least 5 dB higher, and thus masks the target speech. Pixels labeled red are those where the target had less than –50 dB and the mixture had more than –50 dB energy. Spectrograms were computed from a filter bank with bandwidths and frequency spacing similar to those in the ear. Each pixel is the rms amplitude of the signal within a frequency band and time window. Figure reprinted with permission from original source: McDermott, 2009.
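
The color-mask rule stated in the caption can be written out as a small labeling function. This is a minimal sketch that takes the –50 dB floor and the 5 dB margin from the caption; the function name, the "none" label, and the tiny example grids are hypothetical, and the original figure was not necessarily produced this way.

```python
def label_pixel(target_db, mixture_db, floor_db=-50.0, margin_db=5.0):
    """Label one time-frequency pixel following the Figure 8.13 caption.

    Levels are in dB relative to the maximum energy across all signals.
    """
    # Green: target has energy above the floor, but the mixture is at
    # least margin_db higher, so the target is masked at this pixel.
    if target_db > floor_db and mixture_db >= target_db + margin_db:
        return "green"
    # Red: target is below the floor, but the mixture is above it, so the
    # energy at this pixel comes from the other speakers.
    if target_db < floor_db and mixture_db > floor_db:
        return "red"
    return "none"


# Hypothetical 2 x 2 spectrogram excerpts (rows: frequency, columns: time).
target = [[-30.0, -60.0], [-30.0, -70.0]]
mixture = [[-20.0, -40.0], [-28.0, -65.0]]

labels = [
    [label_pixel(t, m) for t, m in zip(target_row, mixture_row)]
    for target_row, mixture_row in zip(target, mixture)
]
```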

Sound segregation is a classic example of an ill-posed problem in perception. Many different sets of sounds are physically consistent with the mixture that enters the ear (in that their sum is equal to the mixture), only one of which actually occurred in the world. The auditory system must infer the set of sounds that actually occurred. As in other ill-posed problems, this inference is only possible with the aid of assumptions that constrain the solution. In this case, the assumptions concern the nature of sounds in the world, and are presumably learned from experience with natural sounds (or perhaps hard-wired into the auditory system via evolution).

Grouping cues (i.e., sound properties that dictate whether sound elements are heard as part of the same sound) are examples of these assumptions. For instance, natural sounds that have pitch, such as vocalizations, contain frequencies that are harmonically related, evident as banded structures in the lower half of the spectrogram of the target speaker in Figure 8.13. Harmonically related frequencies are unlikely to occur from the chance alignment of multiple different sounds, and thus when they are present in a mixture, they are likely to be due to the same sound and are generally heard as such (de Cheveigne, McAdams, et al., 1995; Roberts & Brunstrom, 1998). Moreover, a component that is mistuned (in a tone containing otherwise harmonic frequencies) segregates from the rest of the tone (Moore, Glasberg, et al., 1986). Understanding sound segregation requires understanding the acoustic regularities, such as harmonicity, that characterize natural sound sources and that are used by the auditory system.

Perhaps the most important generic acoustic grouping cue is common onset: frequency components that begin and end at the same time are likely to belong to the same sound. Onset differences, when manipulated experimentally, cause frequency components to perceptually segregate from each other (Cutting, 1975; Darwin, 1981).
Interestingly, a component that has an earlier or later onset than the rest of a set of harmonics has reduced influence over the perceived pitch of the entire tone (Darwin & Ciocca, 1992), suggesting that pitch computations operate on frequency components that are deemed likely to belong together, rather than on the raw acoustic input.

Onset may be viewed as a special case of comodulation, that is, amplitude modulation that is common to different spectral regions. In some cases relatively slow comodulation promotes grouping of different spectral components (Hall, Haggard, et al., 1984), although abrupt onsets seem to be most effective. Common offset also promotes grouping but is less effective than common onset (Darwin, 1984), perhaps because abrupt offsets are less common in natural sounds (Cusack & Carlyon, 2004).

Not every intuitively plausible grouping cue produces a robust effect when assessed psychophysically. For instance, frequency modulation (FM) that is shared (“coherent”) across multiple frequency components, as in voiced speech, has been proposed to promote their grouping (Bregman, 1990; McAdams, 1989). However, listeners are poor at discriminating coherent from incoherent FM if the component tones are not harmonically related, indicating that sensitivity to FM coherence may simply be mediated by the deviations from harmonicity that occur when harmonic tones are incoherently modulated (Carlyon, 1991).

One might also think that the task of segregating sounds would be greatly aided by the tendency of distinct sound sources in the world to originate from distinct locations. In practice, spatial cues are indeed of some benefit, for instance, in hearing a target sentence from one direction amid distracting utterances from other directions (Bronkhorst, 2000; Hawley, Litovsky, et al., 2004; Ihlefeld & Shinn-Cunningham, 2008; Kidd, Arbogast, et al., 2005). However, spatial cues are surprisingly ineffective at segregating one frequency component from a group of others (Culling & Summerfield, 1995), especially when pitted against other grouping cues such as onset or harmonicity (Darwin & Hukin, 1997). The benefit of listening to a target with a distinct location (Bronkhorst, 2000; Hawley, Litovsky, et al., 2004; Ihlefeld & Shinn-Cunningham, 2008; Kidd, Arbogast, et al., 2005) may thus be due to the ease with which the target can be attentively tracked over time amid competing sound sources, rather than to a facilitation of auditory grouping per se (Darwin & Hukin, 1999). Moreover, humans are usually able to segregate monaural mixtures of sounds without difficulty, demonstrating that spatial separation is often not necessary for high performance. For instance, much popular music of the twentieth century was released in mono, and yet listeners have no trouble distinguishing many different instruments and voices in any given recording. Spatial cues thus contribute to sound segregation, but their presence or absence does not seem to fundamentally alter the problem.

The weak effect of spatial cues on segregation may reflect their fallibility in complex auditory scenes. Binaural cues can be contaminated when sounds are combined or degraded by reverberation (Brown & Palomaki, 2006) and can even be deceptive, as when caused by echoes (whose direction is generally different from that of the original sound source). It is possible that the efficacy of different grouping cues in general reflects their reliability in natural conditions. Evaluating this hypothesis will require statistical analysis of natural auditory scenes, an important direction for future research.
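
The harmonicity cue discussed in this section lends itself to a simple numerical illustration. The sketch below flags components of a complex tone that deviate from the nearest integer multiple of a given fundamental by more than a fixed proportion. The function name and the 3% tolerance are assumptions made for illustration; they are not a model of the perceptual threshold reported in the studies cited above, and the fundamental is taken as known rather than estimated.

```python
def mistuned_components(freqs_hz, f0_hz, tolerance=0.03):
    """Return the components that are not close to a harmonic of f0_hz.

    tolerance is the allowed deviation as a proportion of the nearest
    harmonic frequency (an arbitrary illustrative value).
    """
    flagged = []
    for freq in freqs_hz:
        # Nearest harmonic number, never lower than the fundamental itself.
        harmonic_number = max(1, round(freq / f0_hz))
        nearest_harmonic = harmonic_number * f0_hz
        deviation = abs(freq - nearest_harmonic) / nearest_harmonic
        if deviation > tolerance:
            flagged.append(freq)
    return flagged


# A 200 Hz complex tone whose fourth harmonic is shifted from 800 to 848 Hz.
components = [200.0, 400.0, 600.0, 848.0, 1000.0]
segregated = mistuned_components(components, f0_hz=200.0)
```

In this example only the 848 Hz component is flagged, mirroring the observation that a sufficiently mistuned component tends to be heard as separate from the rest of the tone.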

Sequential Grouping

Because the spectrogram approximates the input that the cochlea provides to the rest of the auditory system, it is common to view the problem of sound segregation as one of deciding how to group the various parts of the spectrogram (Bregman, 1990). However, the brain does not receive an entire spectrogram at once. Rather, the auditory input arrives gradually over time. Many researchers thus distinguish between the problem of simultaneous grouping (determining how the spectral content of a short segment of the auditory input should be segregated) and sequential grouping (determining how the groups from each segment should be linked over time, e.g., to form a speech utterance or a melody) (Bregman, 1990).

Although most of the classic grouping cues (e.g., onset/comodulation, harmonicity, ITD) are quantities that could be measured over short timescales, the boundary between what is simultaneous and what is sequential is unclear for most real-world signals, and it may be more appropriate to view grouping as being influenced by processes operating at multiple timescales rather than by two cleanly divided stages of processing. There are, however, contexts in which the bifurcation into simultaneous and sequential grouping stages is natural, as when the auditory input consists of discrete sound elements that do not overlap in time. In such situations interesting differences are sometimes evident between the grouping of simultaneous and sequential elements. For instance, spatial cues, which are relatively weak as a simultaneous cue, have a stronger influence on sequential grouping of tones (Darwin & Hukin, 1997).

Another clear case of sequential processing can be found in the effects of sound repetition. Sounds that occur repeatedly in the acoustic input are detected by the auditory system as repeating, and are inferred to be a single source. Perhaps surprisingly, this is true even when the repeating source is embedded in mixtures with other sounds, and is never presented in isolation (McDermott, Wrobleski, et al., 2011). In such cases the acoustic input itself does not repeat, but the source repetition induces correlations in the input that the auditory system detects and uses to extract the repeating sound. The informativeness of repetition presumably results from the fact that mixtures of multiple sounds tend not to occur repeatedly, such that when a structure does repeat, it is likely to be a single source.

Effects of repetition are also evident in classic results on “informational masking”: masking-like effects on the detectability of a target tone, so called because they cannot be explained in terms of conventional “energetic masking” (in which the response to the target is swamped by a masker that falls within the same peripheral channel). Demonstrations of informational masking typically present a target tone along with other tones that lie outside a “protected region” of the spectrum, such that they are unlikely to stimulate the same filters as the target tone. These “masking” tones nonetheless often elevate the detection threshold for the target, sometimes quite dramatically (Durlach, Mason, et al., 2003; Lutfi, 1992; Neff, 1995; Watson, 1987). The effect is presumably due to impairments in the ability to segregate the target tone from the masker tones, and can be reduced when the target is repeatedly presented (Kidd, Mason, et al., 1994; Kidd, Mason, et al., 2003).

Streaming

One type of sequential segregation effect has particularly captured the imagination of the hearing community and merits special mention. When two pure tones of different frequency are repeatedly presented in alternation, one of two perceptual states is commonly reported by listeners: one in which the two repeated tones are heard as a single “stream” whose pitch varies over time, and one in which two distinct streams are heard, one with the high tones and one with the low tones (Bregman & Campbell, 1971). If the frequency separation between the two tones is small, and if the rate of alternation is slow, one stream is generally heard. When the frequency separation is larger or the rate is faster, two streams tend to be heard, in which case “streaming” is said to occur (van Noorden, 1975).

An interesting hallmark of this phenomenon is that when two streams are perceived, judgments of the temporal order of elements in different streams are impaired (Bregman & Campbell, 1971; Micheyl & Oxenham, 2010). This latter finding provides compelling evidence for a substantive change in the representation underlying the two percepts. Subsequent research has demonstrated that separation along most dimensions of sound can elicit streaming (Moore & Gockel, 2002). The streaming effects in these simple stimuli may be viewed as a variant of grouping by similarity: elements are grouped together when they are similar along some dimension, and segregated when they are sufficiently different, presumably because this similarity reflects the likelihood of having been produced by the same source.

Filling in

Although it is common to view sound segregation as the problem of grouping the spectrogram-like output of the cochlea across frequency and time, this cannot be the whole story, in part because large swaths of a sound’s time–frequency representation are often physically obscured (masked) by other sources and are thus not physically available to be grouped. Masking is evident in the green pixels of Figure 8.13, which represent points where the target source has substantial energy, but where the mixture exceeds it in level. If these points are simply assigned to the target, or omitted from its representation, the target’s level at those points will be misconstrued, and the sound potentially misidentified. To recover an accurate estimate of the target source, it is necessary to infer not just the grouping of the energy in the spectrogram but also the structure of the target source in the places where it is masked.

There is in fact considerable evidence that the auditory system does just this, from experiments investigating the perception of partially masked sounds. For instance, tones that are interrupted by noise bursts are “filled in” by the auditory system, such that they are heard as continuous in conditions in which physical continuity is plausible given the stimulus (Warren, Obusek, et al., 1972). Known as the “continuity effect,” this phenomenon occurs only when the interrupting noise bursts are sufficiently intense in the appropriate part of the spectrum to have masked the tone had it been present continuously. Continuity is also heard for frequency glides (Ciocca & Bregman, 1987; Kluender & Jenison, 1992) as well as oscillating frequency-modulated tones (Carlyon, Micheyl, et al., 2004). The perception of continuity across intermittent maskers was actually first reported for speech signals interrupted by noise bursts (Warren, 1970). For speech, the effect is often termed phonemic restoration, and likely indicates that knowledge of speech acoustics (and perhaps of other types of sounds as well) influences the inference of the masked portion of sounds. Similar effects occur in the spectral domain: regions of the spectrum are perceptually filled in when evidence indicates they are likely to have been masked, e.g., by a continuous noise source (McDermott & Oxenham, 2008).

Filling-in effects in hearing are conceptually similar to completion under and over occluding surfaces in vision, although the ecological constraints provided by masking (involving the relative intensity of two sounds) are distinct from those provided by occlusion (involving the relative depth of two surfaces). Neurophysiological evidence indicates that the representation of tones in primary auditory cortex reflects the perceived continuity, responding as though the tone were continuously present despite being interrupted by noise (Petkov, O’Connor, et al., 2007; Riecke, van Opstal, et al., 2007).


Brain Basis of Sound Segregation

Recent years have seen great interest in how sound segregation is instantiated in the brain. One proposal that has attracted interest is that sounds are heard as segregated when they are represented in non-overlapping neural populations at some stage of the auditory system. This idea derives largely from studies of the pure-tone streaming phenomena described earlier, with the hope that it will extend to more realistic sounds. The notion is that conditions that cause two tones to be represented in distinct neural populations are also those that cause sequences of two tones to be heard as separate streams (Bee & Klump, 2004; Fishman, Arezzo, et al., 2004; Micheyl, Tian, et al., 2005; Pressnitzer, Sayles, et al., 2008). Because of tonotopy, different frequencies are processed in neural populations whose degree of overlap decreases as the frequencies become more separated. Moreover, tones that are more closely spaced in time are more likely to reduce each other’s response (via what is termed suppression), which also reduces overlap between the tone representations: a tone on the outskirts of a neuron’s receptive field might be sufficiently suppressed as to not produce a response at all. These two factors, frequency separation and suppression, predict the two key effects in pure-tone streaming: that streaming should increase when tones are more separated in frequency or are presented more quickly (van Noorden, 1975).

Experiments over the past decade in multiple animal species indicate that pure-tone sequences indeed produce non-overlapping neural responses under conditions in which streaming is perceived by human listeners (Bee & Klump, 2004; Fishman, Arezzo, et al., 2004; Micheyl, Tian, et al., 2005; Pressnitzer, Sayles, et al., 2008). Some of these experiments take advantage of another notable property of streaming, namely its strong dependence on time. Specifically, the probability that listeners report two streams increases with time from the beginning of the sequence, an effect termed buildup (Bregman, 1978). Buildup has been linked to neurophysiology via neural adaptation. Because neural responses decrease with stimulus repetition, over time it becomes less likely that two stimuli with distinct properties will both exceed the spiking threshold for the same neuron, such that the neural responses to two tones become increasingly segregated on a timescale consistent with that of perceptual buildup (Micheyl, Tian, et al., 2005; Pressnitzer, Sayles, et al., 2008). For a comprehensive review of these and related studies, see Snyder and Alain, 2007, and Fishman and Steinschneider, 2010.

A curious feature of these studies is that they suggest that streaming is an accidental side effect of what would appear to be general features of the auditory system: tonotopy, suppression, and adaptation. Given that sequential grouping seems likely to be of great adaptive significance (because it affects our ability to recognize sounds), it would seem important for an auditory system to behave close to optimally, that is, for the perception of one or two streams to be related to the likelihood of one or two streams in the world. It is thus striking that the phenomenon is proposed to result from apparently incidental features of processing. Consistent with this viewpoint, a recent study showed that synchronous high- and low-frequency tones produce neural responses that are just as segregated as those for the classic streaming configuration of alternating high and low tones, even though perceptual segregation does not occur when the tones are synchronous (Elhilali, Ma, et al., 2009). This finding indicates that non-overlapping neural responses are not sufficient for perceptual segregation, and that the relative timing of neural responses may be more important. The significance of neural overlap thus remains unclear, and the brain basis of streaming will undoubtedly continue to be debated in the years to come.
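
The tonotopic part of the neural-overlap proposal can be made concrete with a toy population model. The sketch below is purely illustrative and is not drawn from any of the cited physiological studies: it assumes a bank of hypothetical neurons with Gaussian tuning on a log-frequency axis (a bandwidth of 0.5 octaves is an arbitrary choice) and measures overlap as the ratio of shared to total response. It captures only the frequency-separation factor, not suppression, adaptation, or response timing, and so it inherits the limitations of the overlap account noted above.

```python
import math


def population_response(tone_hz, best_freqs_hz, bandwidth_octaves=0.5):
    """Response of each model neuron to a pure tone (Gaussian tuning in octaves)."""
    return [
        math.exp(-0.5 * (math.log2(best_freq / tone_hz) / bandwidth_octaves) ** 2)
        for best_freq in best_freqs_hz
    ]


def overlap(response_a, response_b):
    """Shared response divided by total response; 1.0 means identical patterns."""
    shared = sum(min(a, b) for a, b in zip(response_a, response_b))
    total = sum(max(a, b) for a, b in zip(response_a, response_b))
    return shared / total


# 50 hypothetical best frequencies, log-spaced from 250 Hz to 4000 Hz.
best_freqs = [250.0 * 2 ** (4 * k / 49) for k in range(50)]

low_tone = population_response(1000.0, best_freqs)
near_tone = population_response(1000.0 * 2 ** (1 / 12), best_freqs)  # 1 semitone up
far_tone = population_response(1000.0 * 2 ** (9 / 12), best_freqs)   # 9 semitones up

overlap_near = overlap(low_tone, near_tone)
overlap_far = overlap(low_tone, far_tone)
```

In this toy model the overlap for the one-semitone pair is larger than for the nine-semitone pair, consistent with the qualitative claim that population overlap decreases as tones become more separated in frequency.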

Separating Sound Sources from the Environment

Thus far we have mainly discussed how the auditory system segregates the signals from multiple sound sources, but listeners face a second important scene analysis problem. The sound that reaches the ear from a source is almost always altered to some extent by the surrounding environment, and these environmental influences must be separated from those of the source if the source content is to be estimated correctly. Typically the sound produced by a source reflects off multiple surfaces on its way to the ears, such that the ears receive some sound directly from the source, but also many reflected versions (Figure 8.14). These reflected versions (echoes) are delayed because their path to the ear is lengthened, but generally they also have altered frequency spectra because reflective surfaces absorb some frequencies more than others. Because each reflection can be well described with a linear filter applied to the source signal, the signal reaching the ear, which is the sum of the direct sound along with all the reflections, can be described simply as the result of applying a single composite linear filter to the source (Gardner, 1998). Significant filtering of this sort occurs in almost every natural listening situation, such that sound produced in anechoic conditions (in which all surfaces are minimally reflective) sounds noticeably strange and unnatural.

Listeners are often interested in the properties of sound sources, and one might think of the environmental effects as a nuisance that should simply be discounted. However, environmental filtering imbues the acoustic input with useful information, for instance about the size of a room where sound is produced and the distance of the source from the listener. It is thus more appropriate to think of separating source and environment, at least to some extent, rather than simply recovering the source. Reverberation is commonly used in music production, for instance, to create a sense of space or to give a different feel to particular instruments or voices.

The loudness constancy phenomena discussed earlier are one example of the brain inferring the properties of the sound source as separate from those of the environment, but there are many others. One of the most interesting involves the treatment of echoes in sound localization. The echoes that are common in most natural environments pose a problem for localization because they generally come from directions other than that of the source (Figure 8.14B). The auditory system appears to solve this problem by perceptually fusing similar impulsive sounds that occur within a brief interval of each other (on the order of 10 ms or so), and using the sound that occurs first to determine the perceived location. This precedence effect, so called because of the dominance of the sound that occurs first, was described and named by Hans Wallach (Wallach, Newman, et al., 1949), one of the great Gestalt psychologists, and has since been the subject of a large and interesting literature. For instance, the maximal delay at which echoes are perceptually suppressed increases as two pairs of sounds are repeatedly presented (Freyman, Clifton, et al., 1991), presumably because the repetition provides evidence that the second sound is indeed an echo of the first, rather than being due to a distinct source (in which case it would not occur at a consistent delay following the first sound). Moreover, reversing the order of presentation can cause an abrupt breakdown of the effect, such that two sounds are heard rather than one, each with a different location. See Litovsky, Colburn, et al., 1999, for a review.
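
The basic logic of the precedence effect, as summarized above, can be sketched as a simple fusion rule. The code below is a deliberately crude illustration rather than a model from the literature: it assumes that any arrival within a fixed window (10 ms here, following the order of magnitude given in the text) of the most recent perceived sound is fused with it, and that the perceived direction is that of the leading sound. It ignores the requirement that the fused sounds be similar, as well as the buildup and breakdown effects just described. All names are hypothetical.

```python
def perceived_events(arrivals, fusion_window_ms=10.0):
    """Collapse arrivals into perceived events using a first-arrival rule.

    arrivals: list of (time_ms, direction_deg) pairs, in any order.
    Returns the (time_ms, direction_deg) of each sound that is heard as a
    separate event; later arrivals inside the window are fused with it.
    """
    perceived = []
    for time_ms, direction_deg in sorted(arrivals):
        if perceived and time_ms - perceived[-1][0] <= fusion_window_ms:
            # Treated as an echo: fused with the leading sound, which
            # alone determines the perceived location.
            continue
        perceived.append((time_ms, direction_deg))
    return perceived


# Direct sound from -30 degrees, followed by two early reflections.
with_early_echoes = perceived_events([(0.0, -30.0), (4.0, 45.0), (7.0, 10.0)])

# A second sound arriving much later is heard as a separate event.
with_late_sound = perceived_events([(0.0, -30.0), (40.0, 45.0)])
```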

Figure 8.14 Reverberation. A, Impulse response for a classroom. This is the sound waveform recorded in this room in response to a click (impulse) produced at a particular location in the room. The top arrow indicates the impulse that reaches the microphone directly from the source (that thus arrives first). The lower arrow indicates one of the subsequent reflections, i.e., echoes. After the early reflections, a gradually decaying reverberation tail is evident (cut off at 250 ms for clarity). The sound signal resulting from an arbitrary source could be produced by convolving the sound from the source with this impulse response. B, Schematic diagram of the sound reflections that contribute to the signal that reaches a listener’s ears in a typical room. The brown box in the upper right corner depicts the speaker producing sound. The green lines depict the path taken by the direct sound to the listener’s ears. Blue and red lines depict sound reaching the ears after one and two reflections, respectively. Sound reaching the ear after more than two reflections is not shown. Part B is reprinted with permission from Culling & Akeroyd, 2010.
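
The convolution mentioned in the caption can be spelled out in a few lines. This is a minimal sketch using a made-up six-sample impulse response (a direct sound followed by two weaker, delayed reflections); the values are hypothetical and far shorter than a real room response, and a practical implementation would normally use an FFT-based routine from a numerical library rather than this direct double loop.

```python
def convolve(source, impulse_response):
    """Direct-form convolution of a source signal with an impulse response."""
    output = [0.0] * (len(source) + len(impulse_response) - 1)
    for i, source_sample in enumerate(source):
        for j, filter_tap in enumerate(impulse_response):
            # Each source sample launches a scaled, delayed copy of the
            # impulse response; the copies add linearly at the ear.
            output[i + j] += source_sample * filter_tap
    return output


# Hypothetical impulse response: direct sound at delay 0, then reflections
# at delays of 3 and 5 samples with progressively lower amplitude.
impulse_response = [1.0, 0.0, 0.0, 0.5, 0.0, 0.25]

# A click reproduces the impulse response itself.
click_at_ear = convolve([1.0], impulse_response)

# An arbitrary two-sample source arrives directly and again as two echoes.
source = [1.0, -1.0]
source_at_ear = convolve(source, impulse_response)
```

Because the operation is linear, the signal at the ear is the sum of the direct sound and its delayed, attenuated copies, which is the sense in which a single composite filter summarizes the environment.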


Reverberation poses a problem for sound recognition in addition to localization because different environments alter the sound from a source in different ways. Large amounts of reverberation (with prominent echoes at very long delays), as are present in some large auditoriums, can in fact greatly reduce the intelligibility of speech. Moderate amounts of reverberation, however, as are present most of the time, typically have minimal effect on our ability to recognize speech and other sounds. Recent work indicates that part of our robustness to reverberation derives from a process that adapts to the history of echo stimulation. In reverberant conditions, the intelligibility of a speech utterance has been found to be higher when preceded by another utterance than when not, an effect that does not occur in anechoic conditions (Brandewie & Zahorik, 2010). Such results, like those of the precedence effect, are consistent with the idea that listeners construct a model of the environment’s contribution to the acoustic input and use it to partially discount the environment when judging properties of a source. Analogous effects have been found with nonspeech sounds. When listeners hear instrument sounds preceded by speech or music that has been passed through a filter that “colors” the spectrum, the instrument sound is identified differently, as though listeners internalize the filter, assume it to be an environmental effect, and discount it to some extent when identifying the sound (Stilp, Alexander, et al., 2010).

VII. Current and Future Directions

Hearing science is one of the oldest areas of psychology and neuroscience, with a strong research tradition dating back over 100 years, yet there remain many important open questions. Although research on each of the senses need not be expected to proceed according to a single fixed trajectory, the contrast between hearing and vision nonetheless provides useful reminders of what remains poorly understood in audition. The classic methods of psychophysics were initially developed largely within hearing research, and were then borrowed by vision scientists to explore sensory encoding processes in vision. But while vision science quickly embraced perceptual and cognitive questions, hearing science remained more focused on the periphery. This can be explained in part by the challenge of understanding the cochlea, the considerable complexity of the early auditory system, and the clinical importance of peripheral audition. However, the focus on the periphery has left many central aspects of audition underexplored, and recent trends in hearing research reflect a shift toward the study of these neglected mid- and high-level questions.

One important set of questions concerns the interface of audition with the rest of cognition, via attention and memory. Attention research ironically also flourished in hearing early on (with Cherry’s [1953] classic dichotic listening studies), but then largely moved to the visual domain. Recent years have seen renewed interest (see chapter 11 in this volume), but there remain many open questions. Much is still unclear about what is represented about sound in the absence of attention, about how and what auditory attention selects, and about the role of attention in perceptual organization.


Another promising research area involves working memory. Auditory short-term memory may have some striking differences from its visual counterpart (Demany, Trost, et al., 2008) and appears closely linked to auditory scene analysis (Conway, Cowan, et al., 2001). Studies of these topics in audition also hold promise for informing us more generally about the structure of cognition: the similarities and differences with respect to visual cognition will reveal much about whether attention and memory mechanisms are domain general (perhaps exploiting central resources) or specific to particular sensory systems.

Interactions between audition and the other senses are also attracting increased interest. Information from other sensory systems likely plays a crucial role in hearing given that sound on its own often provides ambiguous information. The sounds produced by rain and applause, for instance, can in some cases be quite similar, such that multisensory integration (using visual, somatosensory, or olfactory input) may help to correctly recognize the sound source. Cross-modal interactions in localization (Alais & Burr, 2004) are similarly powerful. Understanding cross-modal effects within the auditory system (Bizley, Nodal, et al., 2007; Ghazanfar, 2009; Kayser, Petkov, et al., 2008) and their role in behavior will be a significant direction of research going forward.

In addition to the uncharted territory in perception and cognition, there remain important open questions about peripheral processing. Some of these unresolved issues, such as the mechanisms of outer hair cell function, have great importance for understanding hearing impairment. Others may dovetail with higher level function. For instance, the role of efferent connections to the cochlea is still uncertain, with some hypothesizing a role in attention or segregation (Guinan, 2006). The role of phase locking in frequency encoding and pitch perception is another basic issue that remains controversial and that has widespread relevance to mid-level audition.

As audition continues to evolve as a field, I believe useful guidance will come from a computational analysis of the inference problems the auditory system must solve (Marr, 1982). This necessitates thinking about the behavioral demands of real-world listening situations, as well as the constraints imposed by the way that information about the world is encoded in a sound signal. Many of these issues are becoming newly accessible with recent advances in computational power and signal processing techniques.

For instance, one of the most important tasks a listener must perform with sound is surely that of recognition: determining what it was in the world that caused a sound, be it a particular type of object or a type of event, such as something falling on the floor (Gaver, 1993; Lutfi, 2008). Recognition is computationally challenging because the same type of occurrence in the world typically produces a different sound waveform each time it occurs. A recognition system must generalize across the variation that occurs within categories, but not the variation that occurs across categories (DiCarlo & Cox, 2007). Realizing this computational problem allows us to ask how the auditory system solves it. One place where these issues have been explored to some extent is speech perception (Holt & Lotto, 2010). The ideas explored there, about how listeners achieve invariance across different speakers and infer the state of the vocal apparatus along with the accompanying intentions of the speaker, could perhaps be extended to audition more generally (Rosenblum, 2004).

The inference problems of audition can also be better appreciated by examining real-world sound signals, and formal analysis of these signals seems likely to yield valuable clues. As discussed in previous sections, statistical analysis of natural sounds has been a staple of recent computational auditory neuroscience (Harper & McAlpine, 2004; Rodriguez, Chen, et al., 2010; Smith & Lewicki, 2006), where natural sound statistics have been used to explain the mechanisms observed in the peripheral auditory system. However, sound analysis seems likely to provide insight into mid- and high-level auditory problems as well. For instance, the acoustic grouping cues used in sound segregation are almost surely rooted to some extent in natural sound statistics, and examining such statistics could reveal unexpected cues. Similarly, because sound recognition must generalize across the variability that occurs within sounds produced by a particular type of source, examining this variability in natural sounds may provide clues to how the auditory system achieves the appropriate invariance in this domain.

The study of real-world auditory competence will also necessitate measuring auditory abilities and physiological responses with more realistic sound signals. The tones and noises that have been the staple of classical psychoacoustics and auditory physiology have many uses, but also have little in common with many everyday sounds. One challenge of working with realistic signals is that actual recordings of real-world sounds are often uncontrolled, and typically introduce confounds associated with their familiarity. Methods of synthesizing novel sounds with naturalistic properties (Cavaco & Lewicki, 2007; McDermott, Wrobleski, et al., 2011; McDermott & Simoncelli, 2011) are thus likely to be useful experimental tools. Simulations of realistic auditory environments are also increasingly within reach, with methods for generating three-dimensional auditory scenes (Wightman & Kistler, 1989; Zahorik, 2009) being used in studies of sound localization and speech perception in realistic conditions.

We must also consider more realistic auditory behaviors. Hearing does not normally occur while we are seated in a quiet room, listening over headphones, and paying full attention to the acoustic stimulus, but rather in the context of everyday activities in which sound is a means to some other goal. The need to respect this complexity while maintaining sufficient control over experimental conditions presents a challenge, but not one that is insurmountable. For instance, neurophysiology experiments involving naturalistic behavior are becoming more common, with preparations being developed that will permit recordings from freely moving animals engaged in vocalization (Eliades & Wang, 2008) or locomotion, and ultimately perhaps in a real-world cocktail party.


Author Note

I thank Garner Hoyt von Trapp, Sam Norman-Haignere, Michael Schemitsch, and Sara Steele for helpful comments on earlier drafts of this chapter, the authors who kindly allowed me to reproduce their figures (acknowledged individually in the figure captions), and the Howard Hughes Medical Institute for support.

References

Adelson, E. H. (2000). Lightness perception and lightness illusions. In M. S. Gazzaniga (Ed.), The new cognitive neurosciences (2nd ed., pp. 339–351). Cambridge, MA: MIT Press.
Ahrens, M. B., Linden, J. F., et al. (2008). Influences in auditory cortical responses modeled with multilinear spectrotemporal methods. Journal of Neuroscience, 28 (8), 1929–1942.
Alain, C., Arnott, S. R., et al. (2001). “What” and “where” in the human auditory system. Proceedings of the National Academy of Sciences U S A, 98, 12301–12306.
Alais, D., & Burr, D. E. (2004). The ventriloquist effect results from near-optimal bimodal integration. Current Biology, 14, 257–262.
ANSI (2007). American national standard procedure for the computation of loudness of steady sounds. ANSI, S3–4.
Ashmore, J. (2008). Cochlear outer hair cell motility. Physiological Review, 88, 173–210.
Attias, H., & Schreiner, C. E. (1997). Temporal low-order statistics of natural sounds. Advances in Neural Information Processing (p. 9). In M. Mozer, Jordan, M., & Petsche, T. Cambridge, MA: MIT Press.
Attneave, F., & Olson, R. K. (1971). Pitch as a medium: A new approach to psychophysical scaling. American Journal of Psychology, 84 (2), 147–166.
Bacon, S. P., & Grantham, D. W. (1989). Modulation masking: Effects of modulation frequency, depth, and phase. Journal of the Acoustical Society of America, 85, 2575–2580.
Banai, K., Hornickel, J., et al. (2009). Reading and subcortical auditory function. Cerebral Cortex, 19 (11), 2699–2707.
Bandyopadhyay, S., Shamma, S. A., et al. (2010). Dichotomy of functional organization in the mouse auditory cortex. Nature Neuroscience, 13 (3), 361–368.
Barbour, D. L., & Wang, X. (2003). Contrast tuning in auditory cortex. Science, 299, 1073–1075.


Baumann, S., Griffiths, T. D., et al. (2011). Orthogonal representation of sound dimensions in the primate midbrain. Nature Neuroscience, 14 (4), 423–425.
Bee, M. A., & Klump, G. M. (2004). Primitive auditory stream segregation: A neurophysiological study in the songbird forebrain. Journal of Neurophysiology, 92, 1088–1104.
Bee, M. A., & Micheyl, C. (2008). The cocktail party problem: What is it? How can it be solved? And why should animal behaviorists study it? Journal of Comparative Psychology, 122 (3), 235–251.
Belin, P., Zatorre, R. J., et al. (2000). Voice-selective areas in human auditory cortex. Nature, 403, 309–312.
Bendor, D., & Wang, X. (2005). The neuronal representation of pitch in primate auditory cortex. Nature, 426, 1161–1165.
Bendor, D., & Wang, X. (2006). Cortical representations of pitch in monkeys and humans. Current Opinion in Neurobiology, 16, 391–399.
Bendor, D., & Wang, X. (2008). Neural response properties of primary, rostral, and rostrotemporal core fields in the auditory cortex of marmoset monkeys. Journal of Neurophysiology, 100 (2), 888–906.
Bernstein, J. G. W., & Oxenham, A. J. (2005). An autocorrelation model with place dependence to account for the effect of harmonic number on fundamental frequency discrimination. Journal of the Acoustical Society of America, 117 (6), 3816–3831.
Bitterman, Y., Mukamel, R., et al. (2008). Ultra-fine frequency tuning revealed in single neurons of human auditory cortex. Nature, 451 (7175), 197–201.
Bizley, J. K., Nodal, F. R., et al. (2007). Physiological and anatomical evidence for multisensory interactions in auditory cortex. Cerebral Cortex, 17, 2172–2189.
Bizley, J. K., Walker, K. M. M., et al. (2009). Interdependent encoding of pitch, timbre, and spatial location in auditory cortex. Journal of Neuroscience, 29 (7), 2064–2075.
Bizley, J. K., Walker, K. M. M., et al. (2010). Neural ensemble codes for stimulus periodicity in auditory cortex. Journal of Neuroscience, 30 (14), 5078–5091.
Boemio, A., Fromm, S., et al. (2005). Hierarchical and asymmetric temporal sensitivity in human auditory cortices. Nature Neuroscience, 8, 389–395.
Brandewie, E., & Zahorik, P. (2010). Prior listening in rooms improves speech intelligibility. Journal of the Acoustical Society of America, 128, 291–299.
Bregman, A. S. (1978). Auditory streaming is cumulative. Journal of Experimental Psychology: Human Perception and Performance, 4, 380–387.


Bregman, A. S. (1990). Auditory scene analysis: The perceptual organization of sound. Cambridge, MA: MIT Press.
Bregman, A. S., & Campbell, J. (1971). Primary auditory stream segregation and perception of order in rapid sequences of tones. Journal of Experimental Psychology, 89, 244–249.
Bronkhorst, A. W. (2000). The cocktail party phenomenon: A review of research on speech intelligibility in multiple-talker conditions. Acustica, 86, 117–128.

Brown, G. J., & Palomaki, K. J. (2006). Reverberation. In D. Wang & G. J. Brown (Eds.), Computational auditory scene analysis: Principles, algorithms, and applications (pp. 209–250). Hoboken, NJ: John Wiley & Sons.
Buus, S., Florentine, M., et al. (1997). Temporal integration of loudness, loudness discrimination, and the form of the loudness function. Journal of the Acoustical Society of America, 101, 669–680.
Carcagno, S., & Plack, C. J. (2011). Subcortical plasticity following perceptual learning in a pitch discrimination task. Journal of the Association for Research in Otolaryngology, 12, 89–100.
Cariani, P. A., & Delgutte, B. (1996). Neural correlates of the pitch of complex tones. I. Pitch and pitch salience. Journal of Neurophysiology, 76, 1698–1716.
Carlyon, R. P. (1991). Discriminating between coherent and incoherent frequency modulation of complex tones. Journal of the Acoustical Society of America, 89, 329–340.
Carlyon, R. P. (2004). How the brain separates sounds. Trends in Cognitive Sciences, 8 (10), 465–471.
Carlyon, R. P., Cusack, R., et al. (2001). Effects of attention and unilateral neglect on auditory stream segregation. Journal of Experimental Psychology: Human Perception and Performance, 27 (1), 115–127.
Carlyon, R. P., Micheyl, C., et al. (2004). Auditory processing of real and illusory changes in frequency modulation (FM) phase. Journal of the Acoustical Society of America, 116 (6), 3629–3639.
Cavaco, S., & Lewicki, M. S. (2007). Statistical modeling of intrinsic structures in impact sounds. Journal of the Acoustical Society of America, 121 (6), 3558–3568.
Cedolin, L., & Delgutte, B. (2005). Pitch of complex tones: Rate-place and interspike interval representations in the auditory nerve. Journal of Neurophysiology, 94, 347–362.
Cherry, E. C. (1953). Some experiments on the recognition of speech, with one and two ears. Journal of the Acoustical Society of America, 25 (5), 975–979.


Christianson, G. B., Sahani, M., et al. (2008). The consequences of response nonlinearities for interpretation of spectrotemporal receptive fields. Journal of Neuroscience, 28 (2), 446–455.
Ciocca, V., & Bregman, A. S. (1987). Perceived continuity of gliding and steady-state tones through interrupting noise. Perception & Psychophysics, 42, 476–484.
Cohen, Y. E., Russ, B. E., et al. (2009). A functional role for the ventrolateral prefrontal cortex in non-spatial auditory cognition. Proceedings of the National Academy of Sciences U S A, 106, 20045–20050.
Conway, A. R., Cowan, A. N., et al. (2001). The cocktail party phenomenon revisited: The importance of working memory capacity. Psychonomic Bulletin & Review, 8, 331–335.
Culling, J. F., & Akeroyd, M. A. (2010). Spatial hearing. In C. J. Plack (Ed.), The Oxford handbook of auditory science: Hearing (Vol. 3, pp. 123–144). Oxford, UK: Oxford University Press.
Culling, J. F., & Summerfield, Q. (1995). Perceptual separation of concurrent speech sounds: Absence of across-frequency grouping by common interaural delay. Journal of the Acoustical Society of America, 98 (2), 785–797.
Cusack, R., & Carlyon, R. P. (2004). Auditory perceptual organization inside and outside the laboratory. In J. G. Neuhoff (Ed.), Ecological psychoacoustics (pp. 15–84). San Diego: Elsevier Academic Press.
Cutting, J. E. (1975). Aspects of phonological fusion. Journal of Experimental Psychology: Human Perception and Performance, 104, 105–120.
Dallos, P. (2008). Cochlear amplification, outer hair cells and prestin. Current Opinion in Neurobiology, 18, 370–376.
Darrow, K. N., Maison, S. F., et al. (2006). Cochlear efferent feedback balances interaural sensitivity. Nature Neuroscience, 9 (12), 1474–1476.
Darwin, C. (1984). Perceiving vowels in the presence of another sound: Constraints on formant perception. Journal of the Acoustical Society of America, 76 (6), 1636–1647.
Darwin, C. J. (1981). Perceptual grouping of speech components different in fundamental frequency and onset-time. Quarterly Journal of Experimental Psychology, 3A 185–207.
Darwin, C. J. (1997). Auditory grouping. Trends in Cognitive Sciences, 1, 327–333.
Darwin, C. J., & Ciocca, V. (1992). Grouping in pitch perception: Effects of onset asynchrony and ear of presentation of a mistuned component. Journal of the Acoustical Society of America, 91, 3381–3390.


Darwin, C. J., & Hukin, R. W. (1997). Perceptual segregation of a harmonic from a vowel by interaural time difference and frequency proximity. Journal of the Acoustical Society of America, 102 (4), 2316–2324.
Darwin, C. J., & Hukin, R. W. (1999). Auditory objects of attention: The role of interaural time differences. Journal of Experimental Psychology: Human Perception and Performance, 25 (3), 617–629.
Dau, T., Kollmeier, B., et al. (1997). Modeling auditory processing of amplitude modulation. I. Detection and masking with narrow-band carriers. Journal of the Acoustical Society of America, 102 (5), 2892–2905.
David, S. V., Mesgarani, N., et al. (2009). Rapid synaptic depression explains nonlinear modulation of spectro-temporal tuning in primary auditory cortex by natural stimuli. Journal of Neuroscience, 29 (11), 3374–3386.
de Cheveigne, A. (2005). Pitch perception models. In C. J. Plack & A. J. Oxenham (Eds.), Pitch (pp. 169–233). New York: Springer-Verlag.
de Cheveigne, A. (2006). Multiple F0 estimation. In D. Wang & G. J. Brown (Eds.), Computational auditory scene analysis: Principles, algorithms, and applications (pp. 45–80). Hoboken, NJ: John Wiley & Sons.
de Cheveigne, A. (2010). Pitch perception. In C. J. Plack (Ed.), The Oxford handbook of auditory science: Hearing (Vol. 3, pp. 71–104). New York: Oxford University Press.
de Cheveigne, A., & Kawahara, H. (2002). YIN, a fundamental frequency estimator for speech and music. Journal of the Acoustical Society of America, 111, 1917–1930.
de Cheveigne, A., McAdams, S., et al. (1995). Identification of concurrent harmonic and inharmonic vowels: A test of the theory of harmonic cancellation and enhancement. Journal of the Acoustical Society of America, 97 (6), 3736–3748.
Dean, I., Harper, N. S., et al. (2005). Neural population coding of sound level adapts to stimulus statistics. Nature Neuroscience, 8 (12), 1684–1689.
Delgutte, B., Joris, P. X., et al. (1999). Receptive fields and binaural interactions for virtual-space stimuli in the cat inferior colliculus. Journal of Neurophysiology, 81, 2833–2851.
Demany, L., & Semal, C. (1990). The upper limit of “musical” pitch. Music Perception, 8, 165–176.

Demany, L., Trost, W., et al. (2008). Auditory change detection: Simple sounds are not memorized better than complex sounds. Psychological Science, 19, 85–91.
Depireux, D. A., Simon, J. Z., et al. (2001). Spectro-temporal response field characterization with dynamic ripples in ferret primary auditory cortex. Journal of Neurophysiology, 85 (3), 1220–1234.

DiCarlo, J. J., & Cox, D. D. (2007). Untangling invariant object recognition. Trends in Cognitive Sciences, 11, 333–341.
Durlach, N. I., Mason, C. R., et al. (2003). Note on informational masking. Journal of the Acoustical Society of America, 113 (6), 2984–2987.
Elgoyhen, A. B., & Fuchs, P. A. (2010). Efferent innervation and function. In P. A. Fuchs (Ed.), The Oxford handbook of auditory science: The ear (pp. 283–306). Oxford, UK: Oxford University Press.
Elhilali, M., Ma, L., et al. (2009). Temporal coherence in the perceptual organization and cortical representation of auditory scenes. Neuron, 61, 317–329.
Eliades, S. J., & Wang, X. (2008). Neural substrates of vocalization feedback monitoring in primate auditory cortex. Nature, 453, 1102–1106.
Escabi, M. A., Miller, L. M., et al. (2003). Naturalistic auditory contrast improves spectrotemporal coding in the cat inferior colliculus. Journal of Neuroscience, 23, 11489–11504.
Fairhall, A. L., Lewen, G. D., et al. (2001). Efficiency and ambiguity in an adaptive neural code. Nature, 412, 787–792.
Field, D. J. (1987). Relations between the statistics of natural images and the response profiles of cortical cells. Journal of the Optical Society of America A, 4, 2379–2394.
Fishman, Y. I., Arezzo, J. C., et al. (2004). Auditory stream segregation in monkey auditory cortex: Effects of frequency separation, presentation rate, and tone duration. Journal of the Acoustical Society of America, 116, 1656–1670.
Fishman, Y. I., Reser, D. H., et al. (1998). Pitch vs. spectral encoding of harmonic complex tones in primary auditory cortex of the awake monkey. Brain Research, 786, 18–30.
Fishman, Y. I., & Steinschneider, M. (2010). Formation of auditory streams. In A. Rees & A. R. Palmer (Eds.), The Oxford handbook of auditory science: The auditory brain (pp. 215–245). Oxford, UK: Oxford University Press.
Formisano, E., Kim, D., et al. (2003). Mirror-symmetric tonotopic maps in human primary auditory cortex. Neuron, 40 (4), 859–869.
Freyman, R. L., Clifton, R. K., et al. (1991). Dynamic processes in the precedence effect. Journal of the Acoustical Society of America, 90, 874–884.
Fritz, J. B., David, S. V., et al. (2010). Adaptive, behaviorally gated, persistent encoding of task-relevant auditory information in ferret frontal cortex. Nature Neuroscience, 13 (8), 1011–1019.
Fritz, J. B., Elhilali, M., et al. (2005). Differential dynamic plasticity of A1 receptive fields during multiple spectral tasks. Journal of Neuroscience, 25 (33), 7623–7635.

Fritz, J. B., Shamma, S. A., et al. (2003). Rapid task-related plasticity of spectrotemporal receptive fields in primary auditory cortex. Nature Neuroscience, 6, 1216–1223.
Gardner, W. G. (1998). Reverberation algorithms. In M. Kahrs & K. Brandenburg (Eds.), Applications of digital signal processing to audio and acoustics (pp. 85–131). Norwell, MA: Kluwer Academic.
Gaver, W. W. (1993). What in the world do we hear? An ecological approach to auditory source perception. Ecological Psychology, 5 (1), 1–29.
Ghazanfar, A. A. (2009). The multisensory roles for auditory cortex in primate vocal communication. Hearing Research, 258, 113–120.
Ghitza, O. (2001). On the upper cutoff frequency of the auditory critical-band envelope detectors in the context of speech perception. Journal of the Acoustical Society of America, 110 (3), 1628–1640.
Giraud, A., Lorenzi, C., et al. (2000). Representation of the temporal envelope of sounds in the human brain. Journal of Neurophysiology, 84 (3), 1588–1598.
Glasberg, B. R., & Moore, B. C. J. (1990). Derivation of auditory filter shapes from notched-noise data. Hearing Research, 47, 103–138.
Goldstein, J. L. (1973). An optimum processor theory for the central formation of the pitch of complex tones. Journal of the Acoustical Society of America, 54, 1496–1516.
Guinan, J. J. (2006). Olivocochlear efferents: Anatomy, physiology, function, and the measurement of efferent effects in humans. Ear and Hearing, 27 (6), 589–607.
Gygi, B., Kidd, G. R., et al. (2004). Spectral-temporal factors in the identification of environmental sounds. Journal of the Acoustical Society of America, 115 (3), 1252–1265.
Hall, D. A., & Plack, C. J. (2009). Pitch processing sites in the human auditory brain. Cerebral Cortex, 19 (3), 576–585.
Hall, D. A., Barrett, D. J. K., Akeroyd, M. A., & Summerfield, A. Q. (2005). Cortical representations of temporal structure in sound. Journal of Neurophysiology, 94 (11), 3181–3191.
Hall, J. W., Haggard, M. P., et al. (1984). Detection in noise by spectro-temporal pattern analysis. Journal of the Acoustical Society of America, 76, 50–56.
Harper, N. S., & McAlpine, D. (2004). Optimal neural population coding of an auditory spatial cue. Nature, 430, 682–686.
Hawley, M. L., Litovsky, R. Y., et al. (2004). The benefit of binaural hearing in a cocktail party: Effect of location and type of interferer. Journal of the Acoustical Society of America, 115 (2), 833–843.

Heffner, H. E., & Heffner, R. S. (1990). Effect of bilateral auditory cortex lesions on sound localization in Japanese macaques. Journal of Neurophysiology, 64 (3), 915–931.
Heinz, M. G., Colburn, H. S., et al. (2001). Evaluating auditory performance limits: I. One-parameter discrimination using a computational model for the auditory nerve. Neural Computation, 13, 2273–2316.
Higgins, N. C., Storace, D. A., et al. (2010). Specialization of binaural responses in ventral auditory cortices. Journal of Neuroscience, 30 (43), 14522–14532.
Hofman, P. M., Van Riswick, J. G. A., et al. (1998). Relearning sound localization with new ears. Nature Neuroscience, 1 (5), 417–421.
Holt, L. L., & Lotto, A. J. (2010). Speech perception as categorization. Attention, Perception, and Psychophysics, 72 (5), 1218–1227.
Houtgast, T. (1989). Frequency selectivity in amplitude-modulation detection. Journal of the Acoustical Society of America, 85, 1676–1680.
Houtsma, A. J. M., & Smurzynski, J. (1990). Pitch identification and discrimination for complex tones with many harmonics. Journal of the Acoustical Society of America, 87 (1), 304–310.
Hsu, A., Woolley, S. M., et al. (2004). Modulation power and phase spectrum of natural sounds enhance neural encoding performed by single auditory neurons. Journal of Neuroscience, 24, 9201–9211.

Hudspeth, A. J. (2008). Making an effort to listen: Mechanical amplification in the ear. Neuron, 59 (4), 530–545.
Humphries, C., Liebenthal, E., et al. (2010). Tonotopic organization of human auditory cortex. NeuroImage, 50 (3), 1202–1211.
Ihlefeld, A., & Shinn-Cunningham, B. (2008). Spatial release from energetic and informational masking in a divided speech identification task. Journal of the Acoustical Society of America, 123 (6), 4380–4392.
Javel, E., & Mott, J. B. (1988). Physiological and psychophysical correlates of temporal processes in hearing. Hearing Research, 34, 275–294.
Jenkins, W. M., & Masterton, R. G. (1982). Sound localization: Effects of unilateral lesions in central auditory system. Journal of Neurophysiology, 47, 987–1016.
Johnson, D. H. (1980). The relationship between spike rate and synchrony in responses of auditory-nerve fibers to single tones. Journal of the Acoustical Society of America, 68, 1115–1122.
Johnsrude, I. S., Penhune, V. B., et al. (2000). Functional specificity in the right human auditory cortex for perceiving pitch direction. Brain, 123 (1), 155–163.

Joris, P. X., Bergevin, C., et al. (2011). Frequency selectivity in Old-World monkeys corroborates sharp cochlear tuning in humans. Proceedings of the National Academy of Sciences U S A, 108 (42), 17516–17520.
Joris, P. X., Schreiner, C. E., et al. (2004). Neural processing of amplitude-modulated sounds. Physiological Review, 84, 541–577.
Kaas, J. H., & Hackett, T. A. (2000). Subdivisions of auditory cortex and processing streams in primates. Proceedings of the National Academy of Sciences U S A, 97, 11793–11799.
Kadia, S. C., & Wang, X. (2003). Spectral integration in A1 of awake primates: Neurons with single and multipeaked tuning characteristics. Journal of Neurophysiology, 89 (3), 1603–1622.
Kanwisher, N. (2010). Functional specificity in the human brain: A window into the functional architecture of the mind. Proceedings of the National Academy of Sciences U S A, 107, 11163–11170.
Kawase, T., Delgutte, B., et al. (1993). Anti-masking effects of the olivocochlear reflex. II. Enhancement of auditory-nerve response to masked tones. Journal of Neurophysiology, 70, 2533–2549.
Kayser, C., Petkov, C. I., et al. (2008). Visual modulation of neurons in auditory cortex. Cerebral Cortex, 18 (7), 1560–1574.
Kidd, G., Arbogast, T. L., et al. (2005). The advantage of knowing where to listen. Journal of the Acoustical Society of America, 118 (6), 3804–3815.
Kidd, G., Mason, C. R., et al. (1994). Reducing informational masking by sound segregation. Journal of the Acoustical Society of America, 95 (6), 3475–3480.
Kidd, G., Mason, C. R., et al. (2003). Multiple bursts, multiple looks, and stream coherence in the release from informational masking. Journal of the Acoustical Society of America, 114 (5), 2835–2845.
Kikuchi, Y., Horwitz, B., et al. (2010). Hierarchical auditory processing directed rostrally along the monkey’s supratemporal plane. Journal of Neuroscience, 30 (39), 13021–13030.
Kluender, K. R., & Jenison, R. L. (1992). Effects of glide slope, noise intensity, and noise duration on the extrapolation of FM glides through noise. Perception & Psychophysics, 51, 231–238.
Klump, R. G., & Eady, H. R. (1956). Some measurements of interaural time difference thresholds. Journal of the Acoustical Society of America, 28, 859–860.
Kulkarni, A., & Colburn, H. S. (1998). Role of spectral detail in sound-source localization. Nature, 396, 747–749.

Kvale, M., & Schreiner, C. E. (2004). Short-term adaptation of auditory receptive fields to dynamic stimuli. Journal of Neurophysiology, 91, 604–612.
Langner, G., Sams, M., et al. (1997). Frequency and periodicity are represented in orthogonal maps in the human auditory cortex: Evidence from magnetoencephalography. Journal of Comparative Physiology, 181, 665–676.
Leaver, A. M., & Rauschecker, J. P. (2010). Cortical representation of natural complex sounds: Effects of acoustic features and auditory object category. Journal of Neuroscience, 30 (22), 7604–7612.
Lerner, Y., Honey, C. J., et al. (2011). Topographic mapping of a hierarchy of temporal receptive windows using a narrated story. Journal of Neuroscience, 31 (8), 2906–2915.
Lewicki, M. S. (2002). Efficient coding of natural sounds. Nature Neuroscience, 5 (4), 356–363.
Liberman, M. C. (1982). The cochlear frequency map for the cat: Labeling auditory-nerve fibers of known characteristic frequency. Journal of the Acoustical Society of America, 72, 1441–1449.
Lippmann, R. P. (1997). Speech recognition by machines and humans. Speech Communication, 22, 1–16.
Litovsky, R. Y., Colburn, H. S., et al. (1999). The precedence effect. Journal of the Acoustical Society of America, 106, 1633–1654.
Lomber, S. G., & Malhotra, S. (2008). Double dissociation of “what” and “where” processing in auditory cortex. Nature Neuroscience, 11 (5), 609–616.
Lorenzi, C., Gilbert, G., et al. (2006). Speech perception problems of the hearing impaired reflect inability to use temporal fine structure. Proceedings of the National Academy of Sciences U S A, 103, 18866–18869.
Lutfi, R. A. (1992). Informational processing of complex sounds. III. Interference. Journal of the Acoustical Society of America, 91, 3391–3400.
Lutfi, R. A. (2008). Human sound source identification. In W. A. Yost & A. N. Popper (Eds.), Springer handbook of auditory research: Auditory perception of sound sources (pp. 13–42). New York: Springer-Verlag.
Machens, C. K., Wehr, M. S., et al. (2004). Linearity of cortical receptive fields measured with natural sounds. Journal of Neuroscience, 24, 1089–1100.
Macken, W. J., Tremblay, S., et al. (2003). Does auditory streaming require attention? Evidence from attentional selectivity in short-term memory. Journal of Experimental Psychology: Human Perception and Performance, 29, 43–51.


Makous, J. C., & Middlebrooks, J. C. (1990). Two-dimensional sound localization by human listeners. Journal of the Acoustical Society of America, 87, 2188–2200.
Marr, D. C. (1982). Vision: A computational investigation into the human representation and processing of visual information. New York: Freeman.
May, B. J., Anderson, M., et al. (2008). The role of broadband inhibition in the rate representation of spectral cues for sound localization in the inferior colliculus. Hearing Research, 238, 77–93.

May, B. J., & McQuone, S. J. (1995). Effects of bilateral olivocochlear lesions on pure-tone discrimination in cats. Auditory Neuroscience, 1, 385–400.
McAdams, S. (1989). Segregation of concurrent sounds. I. Effects of frequency modulation coherence. Journal of the Acoustical Society of America, 86, 2148–2159.
McAlpine, D. (2004). Neural sensitivity to periodicity in the inferior colliculus: Evidence for the role of cochlear distortions. Journal of Neurophysiology, 92, 1295–1311.
McDermott, J. H. (2009). The cocktail party problem. Current Biology, 19, R1024–R1027.
McDermott, J. H., Lehr, A. J., et al. (2010). Individual differences reveal the basis of consonance. Current Biology, 20, 1035–1041.
McDermott, J. H., & Oxenham, A. J. (2008a). Music perception, pitch, and the auditory system. Current Opinion in Neurobiology, 18, 452–463.
McDermott, J. H., & Oxenham, A. J. (2008b). Spectral completion of partially masked sounds. Proceedings of the National Academy of Sciences U S A, 105 (15), 5939–5944.
McDermott, J. H., & Simoncelli, E. P. (2011). Sound texture perception via statistics of the auditory periphery: Evidence from sound synthesis. Neuron, 71, 926–940.
McDermott, J. H., Wrobleski, D., et al. (2011). Recovering sound sources from embedded repetition. Proceedings of the National Academy of Sciences U S A, 108 (3), 1188–1193.
Meddis, R., & Hewitt, M. J. (1991). Virtual pitch and phase sensitivity of a computer model of the auditory periphery: Pitch identification. Journal of the Acoustical Society of America, 89, 2866–2882.
Mershon, D. H., Desaulniers, D. H., et al. (1981). Perceived loudness and visually-determined auditory distance. Perception, 10, 531–543.
Mesgarani, N., David, S. V., et al. (2008). Phoneme representation and classification in primary auditory cortex. Journal of the Acoustical Society of America, 123 (2), 899–909.
Micheyl, C., & Oxenham, A. J. (2010). Objective and subjective psychophysical measures of auditory stream integration and segregation. Journal of the Association for Research in Otolaryngology, 11 (4), 709–724.

Micheyl, C., & Oxenham, A. J. (2010). Pitch, harmonicity and concurrent sound segregation: Psychoacoustical and neurophysiological findings. Hearing Research, 266, 36–51.
Micheyl, C., Tian, B., et al. (2005). Perceptual organization of tone sequences in the auditory cortex of awake macaques. Neuron, 48, 139–148.
Middlebrooks, J. C. (1992). Narrow-band sound localization related to external ear acoustics. Journal of the Acoustical Society of America, 92 (5), 2607–2624.
Middlebrooks, J. C. (2000). Cortical representations of auditory space. In M. S. Gazzaniga (Ed.), The new cognitive neurosciences (2nd ed., pp. 425–436). Cambridge, MA: MIT Press.
Middlebrooks, J. C., & Green, D. M. (1991). Sound localization by human listeners. Annual Review of Psychology, 42, 135–159.
Miller, L. M., Escabi, M. A., et al. (2001). Spectrotemporal receptive fields in the lemniscal auditory thalamus and cortex. Journal of Neurophysiology, 87, 516–527.
Miller, L. M., Escabi, M. A., et al. (2002). Spectrotemporal receptive fields in the lemniscal auditory thalamus and cortex. Journal of Neurophysiology, 87, 516–527.
Miller, R. L., Schilling, J. R., et al. (1997). Effects of acoustic trauma on the representation of the vowel /e/ in cat auditory nerve fibers. Journal of the Acoustical Society of America, 101 (6), 3602–3616.
Moore, B. C. J. (1973). Frequency difference limens for short-duration tones. Journal of the Acoustical Society of America, 54, 610–619.
Moore, B. C. J. (2003). An introduction to the psychology of hearing. San Diego, CA: Academic Press.
Moore, B. C., & Glasberg, B. R. (1996). A revision of Zwicker’s loudness model. Acta Acustica, 82 (2), 335–345.
Moore, B. C. J., Glasberg, B. R., et al. (1986). Thresholds for hearing mistuned partials as separate tones in harmonic complexes. Journal of the Acoustical Society of America, 80, 479–483.
Moore, B. C. J., & Gockel, H. (2002). Factors influencing sequential stream segregation. Acta Acustica, 88, 320–332.
Moore, B. C. J., & Oxenham, A. J. (1998). Psychoacoustic consequences of compression in the peripheral auditory system. Psychological Review, 105 (1), 108–124.
Moshitch, D., Las, L., et al. (2006). Responses of neurons in primary auditory cortex (A1) to pure tones in the halothane-anesthetized cat. Journal of Neurophysiology, 95 (6), 3756–3769.

Page 55 of 62

Audition Neff, D. L. (1995). Signal properties that reduce masking by simultaneous, random-fre quency maskers. Journal of the Acoustical Society of America, 98, 1909–1920. Nelken, I., Bizley, J. K., et al. (2008). Responses of auditory cortex to complex stimuli: Functional organization revealed using intrinsic optical signals. Journal of Neurophysiolo gy, 99 (4), 1928–1941. Olshausen, B. A., & Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607–609. Palmer, A. R., & Russell, I. J. (1986). Phase-locking in the cochlear nerve of the guinea-pig and its relation to the receptor potential of inner hair-cells. Hearing Research, 24, 1–15. Patterson, R. D., Uppenkamp, S., et al. (2002). The processing of temporal pitch and melody information in auditory cortex. Neuron, 36 (4), 767–776. Penagos, H., Melcher, J. R., et al. (2004). A neural representation of pitch salience in non primary human auditory cortex revealed with functional magnetic resonance imaging. Journal of Neuroscience, 24 (30), 6810–6815. Petkov, C. I., Kayser, C., et al. (2006). Functional imaging reveals numerous fields in the monkey auditory cortex. PLoS Biology, 4 (7), 1213–1226. Petkov, C. I., Kayser, C., et al. (2008). A voice region in the monkey brain. Nature Neuro science, 11, 367–374. Petkov, C. I., O’Connor, K. N., et al. (2007). Encoding of illusory continuity in primary au ditory cortex. Neuron, 54, 153–165. Plack, C. J. (2005). The sense of hearing. New Jersey, Lawrence Erlbaum. Plack, C. J., & Oxenham, A. J. (2005). The psychophysics of pitch. In C. J. Plack, A. J. Oxen ham, R. R. Fay, & A. J. Popper (Eds.), Pitch: Neural coding and perception (pp. 7–55). New York: Springer-Verlag. Plack, C. J., Oxenham, A. J., et al. (Eds.) (2005). Pitch: Neural coding and perception. Springer Handbook of Auditory Research. New York: Springer-Verlag. Poeppel, D. (2003). 
The analysis of speech in different temporal integration windows: Cerebral lateralization as “asymmetric sampling in time.” Speech Communication, 41, 245–255. (p. 169)

Poremba, A., Saunders, R. C., et al. (2003). Functional mapping of the primate au

ditory system. Science, 299, 568–572. Pressnitzer, D., Sayles, M., et al. (2008). Perceptual organization of sound begins in the auditory periphery. Current Biology, 18, 1124–1128.

Page 56 of 62

Audition Rajan, R. (2000). Centrifugal pathways protect hearing sensitivity at the cochlea in noisy environments that exacerbate the damage induced by loud sound. Journal of Neuro science, 20, 6684–6693. Rauschecker, J. P., & Tian, B. (2004). Processing of band-passed noise in the lateral audi tory belt cortex of the rhesus monkey. Journal of Neurophysiology, 91, 2578–2589. Rayleigh, L. (1907). On our perception of sound direction. Philosophical Magazine, 3, 456–464. Recanzone, G. H. (2008). Representation of con-specific vocalizations in the core and belt areas of the auditory cortex in the alert macaque monkey. Journal of Neuroscience, 28 (49), 13184–13193. Rhode, W. S. (1971). Observations of the vibration of the basilar membrane in squirrel monkeys using the Mossbauer technique. Journal of the Acoustical Society of America, 49, 1218–1231. Rhode, W. S. (1978). Some observations on cochlear mechanics. Journal of the Acoustical Society of America, 64, 158–176. Riecke, L., van Opstal, J., et al. (2007). Hearing illusory sounds in noise: Sensoryperceptual transformations in primary auditory cortex. Journal of Neuroscience, 27 (46), 12684–12689. (p. 170)

Roberts, B., & Brunstrom, J. M. (1998). Perceptual segregation and pitch shifts of mis tuned components in harmonic complexes and in regular inharmonic complexes. Journal of the Acoustical Society of America, 104 (4), 2326–2338. Rodriguez, F. A., Chen, C., et al. (2010). Neural modulation tuning characteristics scale to efficiently encode natural sound statistics. Journal of Neuroscience, 30, 15969–15980. Romanski, L. M., Tian, B., et al. (1999). Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex. Nature Neuroscience, 2 (12), 1131–1136. Rose, J. E., Brugge, J. F., et al. (1967). Phase-locked response to low-frequency tones in single auditory nerve fibers of the squirrel monkey. Journal of Neurophysiology, 30, 769– 793. Rosen, S. (1992). Temporal information in speech: Acoustic, auditory and linguistic as pects. Philosophical Transactions of the Royal Society, London, Series B, 336, 367–373. Rosenblum, L. D. (2004). Perceiving articulatory events: Lessons for an ecological psy choacoustics. In J. G. Neuhoff (Ed.), Ecological psychoacoustics (pp.: 219–248). San Diego, CA: Elsevier Academic Press. Rothschild, G., Nelken, I., et al. (2010). Functional organization and population dynamics in the mouse primary auditory cortex. Nature Neuroscience, 13 (3), 353–360. Page 57 of 62

Audition Rotman, Y., Bar Yosef, O., et al. (2001). Relating cluster and population responses to nat ural sounds and tonal stimuli in cat primary auditory cortex. Hearing Research, 152, 110– 127. Ruggero, M. A. (1992). Responses to sound of the basilar membrane of the mammalian cochlea. Current Opinion in Neurobiology, 2, 449–456. Ruggero, M. A., & Rich, N. C. (1991). Furosemide alters organ of Corti mechanics: Evi dence for feedback of outer hair cells upon the basilar membrane. Journal of Neuro science, 11, 1057–1067. Ruggero, M. A., Rich, N. C., et al. (1997). Basilar-membrane responses to tones at the base of the chinchilla cochlea. Journal of the Acoustical Society of America, 101, 2151– 2163. Samson, F., Zeffiro, T. A., et al. (2011). Stimulus complexity and categorical effects in hu man auditory cortex: an Activation Likelihood Estimation meta-analysis. Frontiers in Psy chology, 1, 1–23. Scharf, B., Magnan, J., et al. (1997). On the role of the olivocochlear bundle in hearing: 16 Case studies. Hearing Research, 102, 101–122. Schonwiesner, M., & Zatorre, R. J. (2008). Depth electrode recordings show double disso ciation between pitch processing in lateral Heschl’s gyrus. Experimental Brain Research, 187, 97–105. Schonwiesner, M., & Zatorre, R. J. (2009). Spectro-temporal modulation transfer function of single voxels in the human auditory cortex measured with high-resolution fMRI. Pro ceedings of the National Academy of Sciences U S A, 106 (34), 14611–14616. Schreiner, C. E., & Urbas, J. V. (1986). Representation of amplitude modulation in the au ditory cortex of the cat. I. Anterior auditory field. Hearing Research, 21, 227–241. Schreiner, C. E., & Urbas, J. V. (1988). Representation of amplitude modulation in the au ditory cortex of the cat. II. Comparison between cortical fields. Hearing Research, 32, 49– 64. Shackleton, T. M., & Carlyon, R. P. (1994). The role of resolved and unresolved harmonics in pitch perception and frequency modulation discrimination. 
Journal of the Acoustical So ciety of America, 95 (6), 3529–3540. Shamma, S. A., & Klein, D. (2000). The case of the missing pitch templates: How harmon ic templates emerge in the early auditory system. Journal of the Acoustical Society of America, 107, 2631–2644. Shannon, R. V., Zeng, F. G., et al. (1995). Speech recognition with primarily temporal cues. Science, 270 (5234), 303–304.

Page 58 of 62

Audition Shera, C. A., Guinan, J. J., et al. (2002). Revised estimates of human cochlear tuning from otoacoustic and behavioral measurements. Proceedings of the National Academy of Sciences U S A, 99 (5), 3318–3323. Shinn-Cunningham, B. G. (2008). Object-based auditory and visual attention. Trends in Cognitive Sciences, 12 (5), 182–186. Singh, N. C., & Theunissen, F. E. (2003). Modulation spectra of natural sounds and etho logical theories of auditory processing. Journal of the Acoustical Society of America, 114 (6), 3394–33411. Skoe, E., & Kraus, N. (2010). Auditory brainstem response to complex sounds: A tutorial. Ear and Hearing, 31 (3), 302–324. Smith, E. C., & Lewicki, M. S. (2006). Efficient auditory coding. Nature, 439, 978–982. Smith, Z. M., Delgutte, B., et al. (2002). Chimaeric sounds reveal dichotomies in auditory perception. Nature, 416, 87–90. Snyder, J. S., & Alain, C. (2007). Toward a neurophysiological theory of auditory stream segregation. Psychological Bulletin, 133 (5), 780–799. Stevens, S. S. (1955). The measurement of loudness. Journal of the Acoustical Society of America, 27 (5), 815–829. Stilp, C. E., Alexander, J. M., et al. (2010). Auditory color constancy: Calibration to reli able spectral properties across nonspeech context and targets. Attention, Perception, and Psychophysics, 72 (2), 470–480. Sutter, M. L., & Schreiner, C. E. (1991). Physiology and topography of neurons with multi peaked tuning curves in cat primary auditory cortex. Journal of Neurophysiology, 65, 1207–1226. Sweet, R. A., Dorph-Petersen, K., et al. (2005). Mapping auditory core, lateral belt, and parabelt cortices in the human superior temporal gyrus. Journal of Comparative Neurolo gy, 491, 270–289. Talavage, T. M., Sereno, M. I., et al. (2004). Tonotopic organization in human auditory cor tex revealed by progressions of frequency sensitivity. Journal of Neurophysiology, 91, 1282–1296. Tansley, B. W., & Suffield, J. B. (1983). 
Time-course of adaptation and recovery of chan nels selectively sensitive to frequency and amplitude modulation. Journal of the Acousti cal Society of America, 74, 765–775. Terhardt, E. (1974). Pitch, consonance, and harmony. Journal of the Acoustical Society of America, 55, 1061–1069.

Page 59 of 62

Audition Theunissen, F. E., Sen, K., et al. (2000). Spectral-temporal receptive fields of non-linear auditory neurons obtained using natural sounds. Journal of Neuroscience, 20, 2315–2331. Tian, B., & Rauschecker, J. P. (2004). Processing of frequency-modulated sounds in the lat eral auditory belt cortex of the rhesus monkey. Journal of Neurophysiology, 92, 2993– 3013. Tian, B., Reser, D., et al. (2001). Functional specialization in rhesus monkey auditory cor tex. Science, 292, 290–293. Ulanovsky, N., Las, L., et al. (2003). Processing of low-probability sounds by cortical neu rons. Nature Neuroscience, 6 (4), 391–398. van Noorden, L. P. A. S. (1975). Temporal coherence in the perception of tone sequences. Eindhoven, The Netherlands: The Institute of Perception Research, University of Technol ogy. Walker, K. M. M., Bizley, J. K., et al. (2010). Cortical encoding of pitch: Recent results and open questions. Hearing Research, 271 (1–2), 74–87. Wallace, M. N., Anderson, L. A., et al. (2007). Phase-locked responses to pure tones in the auditory thalamus. Journal of Neurophysiology, 98 (4), 1941–1952. Wallach, H., Newman, E. B., et al. (1949). The precedence effect in sound localization. American Journal of Psychology, 42, 315–336. Warren, J. D., Zielinski, B. A., et al. (2002). Perception of sound-source motion by the hu man brain. Neuron, 34, 139–148. Warren, R. M. (1970). Perceptual restoration of missing speech sounds. Science, 167, 392–393. Warren, R. M., Obusek, C. J., et al. (1972). Auditory induction: perceptual synthesis of ab sent sounds. Science, 176, 1149–1151. Watson, C. S. (1987). Uncertainty, informational masking and the capacity of immediate auditory memory. In W. A. Yost & C. S. Watson (Eds.), Auditory processing of complex sounds (pp. 267–277). Hillsdale, NJ: Erlbaum. Wightman, F. (1973). The pattern-transformation model of pitch. Journal of the Acoustical Society of America, 54, 407–416. Wightman, F., & Kistler, D. J. (1989). 
Headphone simulation of free-field listening. II. Psy chophysical validation. Journal of the Acoustical Society of America, 85 (2), 868–878. Winslow, R. L., & Sachs, M. B. (1987). Effect of electrical stimulation of the crossed olivo cochlear bundle on auditory nerve response to tones in noise. Journal of Neurophysiology, 57 (4), 1002–1021. Page 60 of 62

Audition Winter, I. M. (2005). The neurophysiology of pitch. In C. J. Plack, A. J. Oxenham, R. R. Fay, & A. J. Popper (Eds.), Pitch—Neural coding and perception (pp. 99–146). New York: Springer-Verlag. Wong, P. C. M., Skoe, E., et al. (2007). Musical experience shapes human brainstem en coding of linguistic pitch patterns. Nature Neuroscience, 10 (4), 420–422. Woods, T. M., Lopez, S. E., et al. (2006). Effects of stimulus azimuth and intensity on the single-neuron activity in the auditory cortex of the alert macaque monkey. Journal of Neu rophysiology, 96 (6), 3323–3337. Woolley, S. M., Fremouw, T. E., et al. (2005). Tuning for spectro-temporal modulations as a mechanism for auditory discrimination of natural sounds. Nature Neuroscience, 8 (10), 1371–1379. Yates, G. K. (1990). Basilar membrane nonlinearity and its influence on auditory nerve rate-intensity functions. Hearing Research, 50, 145–162. Yin, T. C. T., & Kuwada, S. (2010). Binaural localization cues. In A. Rees & A. R. Palmer (Eds.), The Oxford handbook of auditory science: The auditory brain (pp. 271–302). Ox ford, UK: Oxford University Press. Young, E. D. (2010). Level and spectrum. In A. Rees & A. R. Palmer (Eds.), The Oxford handbook of auditory science: The auditory brain (pp. 93–124). Oxford, UK: Oxford Uni versity Press. Zahorik, P. (2009). Perceptually relevant parameters for virtual listening simulation of small room acoustics. Journal of the Acoustical Society of America, 126, 776–791. Zahorik, P., Bangayan, P., et al. (2006). Perceptual recalibration in human sound localiza tion: Learning to remediate front-back reversals. Journal of the Acoustical Society of America, 120 (1), 343–359. Zahorik, P., & Wightman, F. L. (2001). Loudness constancy with varying sound source dis tance. Nature Neuroscience, 4 (1), 78–83. Zatorre, R. J. (1985). Discrimination and recognition of tonal melodies after unilateral cerebral excisions. Neuropsychologia, 23 (1), 31–41. Zatorre, R. J., & Belin, P. (2001). 
Spectral and temporal processing in human auditory cor tex. Cerebral Cortex, 11, 946–953. Zatorre, R. J., Belin, P., et al. (2002). Structure and function of auditory cortex: Music and speech. Trends in Cognitive Sciences, 6 (1), 37–46.

Josh H. McDermott


Josh H. McDermott, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA


Neural Correlates of the Development of Speech Perception and Comprehension

Angela Friederici and Claudia Männel
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0009

Abstract and Keywords

The development of auditory language perception proceeds from acoustic features via phonological representations to words and their relations in a sentence. Neurophysiological data indicate that infants discriminate acoustic differences relevant for phoneme categories and word stress patterns by the age of 2 and 4 months, respectively. Salient acoustic cues that mark prosodic phrase boundaries (e.g., pauses) are also perceived at about the age of 5 months, and infants learn about the rules according to which phonemes are legally combined (i.e., phonotactics). At the end of their first year of life, children recognize and produce their first words, and electrophysiological studies suggest that they establish brain mechanisms to gain lexical representations similar to those of adults. In their second year of life, children enlarge their lexicon, and electrophysiological data show that 24-month-olds base their processing of semantic relations in sentences on brain mechanisms comparable to those observable in adults. At this age, children are also able to recognize syntactic errors in a sentence, but it takes until 32 months before they display a brain response pattern to syntactic violations similar to that of adults. The development of comprehension of syntactically complex sentences, such as sentences with a noncanonical word order, however, takes several more years before adult-like processes are established.

Keywords: phonemes, word stress, prosody, phonotactics, lexicon, semantics, syntax

Introduction

Language acquisition, with its remarkable speed and high levels of success, remains a mystery. At birth, infants are able to communicate by crying in different ways. From birth on, infants also distinguish the sound of their native language from that of other languages. Following these first language-related steps, there is a fast progression in the development of perceptive and expressive language skills. At about 4 months, babies start to babble, the earliest stage of language production. A mere 12 months after birth, most babies start to speak their first words, and about half a year later, they can even produce short sentences. Finally, at the end of most children's third year of life, they have acquired at least 500 words and know how to combine them into meaningful utterances. Thus, they have mastered the entry into their native language: They have acquired a complex system with the typical sounds of a language, these sounds are combined in different ways to make up a wide vocabulary, and the vocabulary items are related to each other by means of syntactic rules.

Although developmental research has delivered increasing knowledge about language acquisition (e.g., Clark, 2003; Szagun, 2006), many questions remain. Studying how children acquire language is not easily accomplished because a great deal of learning takes place before the child is able to speak and to show overt responses to what he or she actually perceives. It is a methodological challenge to develop ways to investigate whether infants know a particular word before they can demonstrate this by producing it. The measurement of infants' brain responses to language input can help to provide information about speech perception abilities early in life, and, moreover, it allows us to describe the neural basis of language perception and comprehension during development.

Measuring Brain Activity in Early Development

There are several neuroimaging methods that enable the measurement of the brain's reaction to environmental input such as spoken language. The most frequently used measures in infants and young children are event-related brain potentials (ERPs), as registered with electroencephalography (EEG). ERPs reflect the brain's electrical activity in response to a particular stimulus with an excellent temporal resolution, thus covering the high-speed and temporally sequenced sensory and cognitive processes. Each time-locked average response typically appears as a waveform with several positive or negative peaks at particular latencies after stimulus onset; and each peak, or component, has a characteristic scalp distribution. Although ERPs deliver restricted spatial information about the component's distribution in two-dimensional maps, reliable source reconstruction from surface data still poses a methodological problem. The polarity (negative/positive inflection of the waveform relative to baseline) and the latency and the scalp distribution of different components allow us to dissociate perceptual and cognitive processes associated with them. Specifically, changes within the dimensions of the ERP can be interpreted as reflecting a slowing down of a particular cognitive process (reflected in the latency), a reduction in the processing demands or efficiency (amplitude of a positivity or negativity), or alterations/maturation of cortical tissue supporting a particular process (topography). For example, ERP measures allow the investigation of infants' early ability to detect auditory changes and how the timing and appearance of these perceptual processes vary through the first year (Kushnerenko, Ceponiene, Balan, Fellman, & Näätänen, 2002a).

Only recently, magnetoencephalography (MEG) has started to be used for developmental research. MEG measures the magnetic fields associated with the brain's electrical activity. Accordingly, this method also captures information processing in the brain with a high temporal resolution. In contrast to EEG, however, it also provides reliable spatial information about the localization of the currents responsible for the magnetic field sources. For example, an MEG study with newborns revealed infants' instant ability to discriminate speech sounds and, moreover, reliably located this process in the auditory cortices (Kujala et al., 2004). Because movement restrictions limit the use of MEG in developmental research, this method has been primarily applied to sleeping newborns (e.g., Kujala et al., 2004; Sambeth, Ruohio, Alku, Fellman, & Huotilainen, 2008). However, the use of additional head-tracking techniques considerably expands its field of application (e.g., Imada et al., 2006).

A third neuroimaging method, functional near-infrared spectroscopy (fNIRS) or optical topography (OT) (Villringer & Chance, 1997), enables the study of cortical hemodynamic responses in infants. This method relies on the spectroscopic determination of changes in hemoglobin concentrations resulting from increased regional blood flow in the cerebral cortex, which can be assessed through the scalp and skull. Changes in light attenuation at different wavelengths greatly depend on the concentration changes in oxygenated and deoxygenated hemoglobin ([oxy-Hb] and [deoxy-Hb]) in the cerebral cortex. Because hemodynamic responses are only slowly evolving, the temporal resolution of this method is low, but its spatial resolution is relatively informative, depending on the number of channels measured (Obrig & Villringer, 2003; Okamoto et al., 2004; Schroeter, Zysset, Wahl, & von Cramon, 2004). To improve the temporal resolution, event-related NIRS paradigms have been suggested (Gratton & Fabiani, 2001).
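The dependence of light attenuation on [oxy-Hb] and [deoxy-Hb] described above is commonly formalized with the modified Beer-Lambert law, which the chapter itself does not name. The following is a minimal sketch under that assumption: attenuation changes at two wavelengths are treated as a linear mix of the two concentration changes, scaled by the source-detector distance and a differential pathlength factor. The function name, the extinction coefficients, the distance, and the pathlength factor are illustrative placeholders, not tabulated or recommended values.

```python
import numpy as np

def hemoglobin_changes(delta_attenuation, extinction, distance_cm, dpf):
    """Solve the modified Beer-Lambert law for [oxy-Hb] and [deoxy-Hb] changes.

    delta_attenuation: attenuation changes at two wavelengths,
        shape (2,) or (2, n_samples)
    extinction: 2x2 matrix; rows are wavelengths, columns are (oxy-Hb, deoxy-Hb)
    distance_cm: source-detector separation on the scalp
    dpf: differential pathlength factor (assumed equal at both wavelengths)
    """
    effective_path = distance_cm * dpf
    # delta_attenuation = (extinction * effective_path) @ delta_concentration
    return np.linalg.solve(np.asarray(extinction) * effective_path, delta_attenuation)

# Hypothetical numbers, chosen only to exercise the function.
extinction = np.array([[1.5, 3.8],   # shorter wavelength: deoxy-Hb absorbs more
                       [2.5, 1.8]])  # longer wavelength: oxy-Hb absorbs more
true_change = np.array([0.002, -0.001])  # oxy-Hb rises, deoxy-Hb falls
delta_attenuation = (extinction * 3.0 * 6.0) @ true_change
recovered = hemoglobin_changes(delta_attenuation, extinction, 3.0, 6.0)
print(recovered)
```

Two wavelengths are the minimum needed to separate the two chromophores, which is why the system is 2x2. Real analyses would take wavelength-specific extinction coefficients and pathlength factors from published tables.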
The limitations of fNIRS are at the same time its advantages because the spatial characteristics outrank those of EEG and the temporal characteristics are comparable or superior to those of fMRI, so fNIRS simultaneously delivers both kinds of information in moderate resolutions. Furthermore, it is, in contrast to MEG and fMRI, not subject to movement restrictions and thus seems particularly suitable for infant research. For example, fNIRS was used to locate brain responses to vocal and nonvocal sounds and revealed voice-sensitive areas in the infant brain (Grossmann, Oberecker, Koch, & Friederici, 2010). Another advantage of fNIRS is its compatibility with EEG and MEG measures, which deliver complementary high temporal information (e.g., Grossmann et al., 2008).

Another method that registers the metabolic demand due to neural signaling is functional magnetic resonance imaging (fMRI). Here, the resulting changes in oxygenated hemoglobin are measured as blood-oxygen-level-dependent (BOLD) contrast. The temporal resolution of this method is considerably lower than with EEG/MEG, but its spatial resolution is excellent. Thus, fMRI studies provide information about the localization of sensory and cognitive processes, not only in surface layers of the cortex as with fNIRS, but also in deeper cortical and subcortical structures. So far, this measurement has been primarily applied with infants while they were asleep in the scanner (e.g., Dehaene-Lambertz, Dehaene, & Hertz-Pannier, 2002; Dehaene-Lambertz et al., 2010; Perani et al., 2010). The limited number of developmental fMRI studies may be due to both practical issues and methodological considerations. Movement restrictions during brain scanning make it difficult to work with children in the scanner. Moreover, there is an ongoing discussion whether the BOLD signal in children is comparable to the one in adults and whether the adult models applied so far are appropriate for developmental research (see, e.g., Muzik, Chugani, Juhasz, Shen, & Chugani, 2000; Rivkin et al., 2004; Schapiro et al., 2004). To address the latter problem, recent fMRI studies in infants and young children have used age-specific brain templates (Dehaene-Lambertz, Dehaene, & Hertz-Pannier, 2002) and optimized template construction for developmental populations (Fonov et al., 2011; Wilke, Holland, Altaye, & Gaser, 2008).

The decision to use one of these neuroimaging techniques in developmental research is thus determined by practical matters and, in addition, is highly dependent on the kind of information sought, that is, the neuronal correlates of information processing in their temporal or spatial resolution. Ideally, various methods using their complementary abilities should be combined because results from the respective measures all add up to provide insight into the brain bases of language development in its early stages.
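The ERP logic outlined earlier in this section (a time-locked average whose peaks are described by polarity, latency, and amplitude) can be illustrated with a short sketch. It assumes a single-channel recording held in a NumPy array and stimulus onsets given as sample indices; the function names, the synthetic signal, and all parameter values are hypothetical and stand in for a real acquisition and analysis pipeline.

```python
import numpy as np

def average_erp(eeg, onsets, sfreq, tmin=-0.1, tmax=0.5):
    """Average stimulus-locked epochs of one channel after baseline correction."""
    start = int(round(tmin * sfreq))  # negative: samples before stimulus onset
    stop = int(round(tmax * sfreq))
    epochs = []
    for onset in onsets:
        lo, hi = onset + start, onset + stop
        if lo < 0 or hi > len(eeg):
            continue  # skip epochs that run past the recording
        epoch = eeg[lo:hi].astype(float)
        # Subtract the mean of the pre-stimulus interval (the baseline).
        baseline = epoch[:-start].mean() if start < 0 else 0.0
        epochs.append(epoch - baseline)
    times = np.arange(start, stop) / sfreq
    return times, np.mean(epochs, axis=0)

def find_peak(times, erp, window, polarity):
    """Return latency and amplitude of the largest peak of a given polarity."""
    mask = (times >= window[0]) & (times <= window[1])
    idx = np.argmax(polarity * erp[mask])
    return times[mask][idx], erp[mask][idx]

# Synthetic recording: noise plus a negative deflection 200 ms after each onset.
rng = np.random.default_rng(0)
sfreq = 500
eeg = rng.normal(0.0, 1.0, 60000)
onsets = np.arange(500, 59000, 500)
t = np.arange(0, 0.5, 1 / sfreq)
component = -5.0 * np.exp(-((t - 0.2) ** 2) / (2 * 0.03 ** 2))
for onset in onsets:
    eeg[onset:onset + len(component)] += component

times, erp = average_erp(eeg, onsets, sfreq)
latency, amplitude = find_peak(times, erp, window=(0.1, 0.3), polarity=-1)
print(latency, amplitude)
```

Averaging works because activity that is not time-locked to the stimulus tends toward zero across trials, whereas the time-locked component survives. A later latency or a smaller amplitude of the recovered peak is the kind of change the text interprets as slower or less demanding processing.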

Neural Dispositions of Language in the Infant Brain

For about half a century, developmental researchers have used electrophysiological measures to investigate the neural basis of language development (for reviews, see Csibra, Kushnerenko, & Grossmann, 2008; Friederici, 2005; Kuhl, 2004). Methods that allow a better spatial resolution (i.e., MEG, fNIRS, and fMRI) have only recently been applied in language research with infants and young children (for reviews, see Cheour et al., 2004; Gervain et al., 2011; Leach & Holland, 2010; Lloyd-Fox, Blasi, & Elwell, 2010; Minagawa-Kawai, Mori, Hebden, & Dupoux, 2008; Vannest et al., 2009). Although there are only a few studies that have applied the latter methods with infants so far, their findings strongly point to neural dispositions for language in the infant brain.

In an fNIRS study with sleeping newborns, Pena et al. (2003) observed a left hemispheric dominance in temporal areas for normal speech compared with backward speech. In an fMRI experiment with 3-month-olds, Dehaene-Lambertz, Dehaene, and Hertz-Pannier (2002) also found that speech sounds, collapsed across forward and backward speech, compared with silence evoked strong left hemispheric activation in the superior temporal gyrus. This activation included Heschl's gyrus and extended to the superior temporal sulcus and the temporal pole (Figure 9.1). Activation differences between forward and backward speech were observed in the left angular gyrus and the precuneus. An additional right frontal brain activation occurred only in infants that were awake and was interpreted as reflecting attentional factors.

In a second analysis of the fMRI data from 3-month-olds, Dehaene-Lambertz et al. (2006) found that the temporal sequence of left hemispheric activations in the different brain areas was similar to adult patterns, with the activation in the auditory cortex preceding activation both in the most posterior and anterior parts of the temporal cortex and in Broca's area. The reported early left hemisphere specialization has been shown to be speech-specific (Dehaene-Lambertz et al., 2010) and, in addition, appears more pronounced for native relative to non-native language input (Minagawa-Kawai et al., 2011). Specifically, 2-month-olds showed stronger fMRI activation in the left posterior temporal lobe in response to language than music stimuli (Dehaene-Lambertz et al., 2010). In an fNIRS study with 4-month-olds, Minagawa-Kawai et al. (2011) reported stronger responses in left temporal regions for speech relative to three nonspeech conditions, with native stimuli revealing stronger lateralization patterns than non-native stimuli. Left hemispheric superior temporal activation has also been reported as a discriminative response to syllables in a recent MEG study with neonates, 6-month-olds, and 12-month-olds (Imada et al., 2006). Interestingly, 6- and 12-month-olds additionally showed activation patterns in inferior frontal regions, but newborns did not yet do this. Together with the results in 3-month-olds (Dehaene-Lambertz et al., 2006), these findings indicate developmental changes in motor speech areas (Broca's area) at an age when infants produce their first syllable sequences and words.

Figure 9.1 Neural disposition for language in the infant brain. Brain activation of 3-month-old infants in response to speech sounds. A, Averaged brain activation in response to speech sounds (forward speech and backward speech versus rest). B, left panel, Averaged brain activation in awake infants (forward speech versus backward speech). Right panel, Averaged hemodynamic responses to forward speech and backward speech in awake and asleep infants. L, left hemisphere; R, right hemisphere. Reprinted with permission from Dehaene-Lambertz, Dehaene, & Hertz-Pannier, 2002. Copyright © 2002, American Association for the Advancement of Science.


Thus, it appears that early during the development, there is a dominance of the left hemisphere for the processing of speech, and particularly native language stimuli, in which both segmental and suprasegmental information is intact, compared with, for example, backward speech, in which this information is altered. However, the processing of more fine-grained phonological features and language-specific contrasts may lateralize later during the development, once children have advanced from general perceptual abilities to the attunement to their native language (e.g., Minagawa-Kawai, Mori, Naoi, & Kojima, 2007).

In adults, processing of sentence-level prosody (i.e., suprasegmental information) has been shown to predominantly recruit brain areas in the right hemisphere (Meyer, Alter, Friederici, Lohmann, & von Cramon, 2002; Meyer, Steinhauer, Alter, Friederici, & von Cramon, 2004). To investigate the brain basis for prosodic processes in infancy, Sambeth et al. (2008) used MEG and presented sleeping newborns with varying degrees of prosodic information. For normal continuous speech and singing, infants showed pronounced bilateral brain responses, which, however, dramatically decreased when infants were presented with filtered low-prosody speech. Similarly, in two fNIRS studies with newborns, Saito and colleagues observed, first, increased bilateral frontal responses to infant-directed compared with adult-directed speech, with the former featuring more pronounced prosodic information (Saito et al., 2007a). Second, infants only showed this frontal activation pattern for speech with normal prosody, but not for speech with flat prosody (Saito et al., 2007b). An fNIRS study with 3-month-olds, which directly compared normal and flattened speech, revealed activation differences in the right temporal-parietal cortex, suggesting a right hemispheric dominance for the processing of sentential prosody (pitch information) similar to the dominance reported in adults (Homae, Watanabe, Nakano, Asakawa, & Taga, 2006). Surprisingly, at 10 months, infants showed another activation pattern, with flattened speech evoking stronger responses than normal speech in right temporal and temporal-parietal regions and bilateral prefrontal regions, which the authors explained by the additional effort to process unfamiliar pitch contours in brain regions specialized for prosodic information processing and attention allocation (Homae, Watanabe, Nakano, Asakawa, & Taga, 2007). Thus, the combined data on infant prosodic processing suggest that infants are sensitive to prosodic information from early on, but that the brain activation develops from a bilateral toward a more right-lateralized pattern.

Given these early neuroimaging findings, it seems that the functional neural network on which language is based, with a left-hemispheric dominance for speech over nonspeech and right-hemispheric dominance for prosody (Friederici & Alter, 2004; Hickok & Poeppel, 2007), is, in principle, established during the first 3 months of life. However, it appears that neither the specialization of the language-related areas (e.g., Brauer & Friederici, 2007; Minagawa-Kawai, Mori, Naoi, & Kojima, 2007) nor all of their structural connections are fully developed from early on (Brauer, Anwander, & Friederici, 2011; Dubois et al., 2008).


Developmental Stages in Language Acquisition and Their Associated Neural Correlates

The development of auditory language perception proceeds from acoustic features via phonological representations to the representation of words and their relations in a sentence. From a schematic point of view, two parallel developmental paths can be considered: one proceeding from acoustic cues to phonemes, and then to words and their meanings, and the other proceeding from acoustic cues to prosodic phrases, to syntactic phrases and their relations.

With respect to the first developmental path, neurophysiological data indicate that acoustic differences in phonemes and word stress patterns are detected by the age of 2 to 4 months. At the end of their first year, children recognize and produce their first words, and ERP studies suggest that infants have established brain mechanisms necessary to acquire lexical representations in a similar way to adults, although these are still less specific. In their second year, children enlarge their lexicon, and ERP data show that 24-month-olds process semantic relations between nouns and verbs in sentences. These processes resemble those in adults, indicated by children at this age already displaying an adult-like N400 component reflecting semantic processes.

With respect to the second developmental path, developmental studies show that salient acoustic cues which mark prosodic phrase boundaries (e.g., pauses) are also perceived at about the age of 5 months, although it takes some time before less salient cues (e.g., changes in the pitch contour) can be used to identify prosodic boundaries that divide the speech stream into lexical and syntactic units. Electrophysiological data suggest that the processing of prosodic phrase structure, reflected by the closure positive shift (CPS), evolves with the emerging ability to process syntactic phrase structure. At the age of 2 years, children are able to recognize syntactic errors in a sentence, reflected by the P600 component. However, the fast automatic syntactic phrase structure building processes, indicated by the early left anterior negativity (ELAN) in addition to the P600, do not occur before the age of 32 months. The ability to comprehend syntactically complex sentences, such as those with noncanonical word order (e.g., passive sentences, object-first sentences), takes a few more years until adult-like processes are established. Diffusion-tensor imaging data suggest that this progression is dependent on the maturation of the fiber bundles that connect the language-relevant brain areas in the inferior frontal gyrus (Broca’s area) and in the superior temporal gyrus (posterior portion). Figure 9.2 gives an overview of the outlined developmental steps in language acquisition, and the following sections will describe the related empirical evidence in more detail.

From Sounds to Words


Figure 9.2 Developmental stages of language acquisition. Development stages are specified in the top row and their associated ERP components in the bottom row. MMN, mismatch negativity. Modified from Friederici, 2005; Männel & Friederici, 2008.

On the path from sounds to words, infants initially start to process phonological information that makes up the actual speech sounds (phonemes) and the rules according to which these sounds are combined (phonotactic rules). Soon, they process prosodic stress patterns of words, which help them to recognize lexical units in the speech stream. These information types are accessed before the processing of word meaning.

Phoneme Characteristics

As one crucial step in language acquisition, infants have to tackle the basic speech sounds of their native language. The smallest sound units of a language, phonemes, are contrastive from each other, although functionally equivalent. In a given language, a circumscribed set of approximately 40 phonemes can be combined in different ways to form unique words. Thus, the meaning of a word changes when one of its component phonemes is exchanged with another, as in the change from cat to pat.

Electrophysiological studies investigated phoneme discrimination using the mismatch paradigm. In this paradigm, two classes of stimuli are repeatedly presented, with one stimulus occurring relatively frequently (standard) and the other one relatively rarely (deviant). The mismatch negativity (MMN) component is a preattentive electrophysiological response that is evoked by any discriminable change in repetitive auditory stimulation (Näätänen, 1990). Thus, the mismatch response (MMR) in the ERP is the result of the brain’s automatic detection of the deviant among the standards. Several ERP studies have examined phoneme discrimination in infants and reported that the ability to detect acoustic changes in consonant articulation (Dehaene-Lambertz & Dehaene, 1994), consonant duration (Kushnerenko et al., 2001), vowel duration (Leppänen, Pihko, Eklund, & Lyytinen, 1999; Pihko et al., 1999), and vowel type (Cheour et al., 1998) is present between 2 and 4 months of age.

For example, Friederici, Friedrich, and Weber (2002) investigated infants’ ability to discriminate between different vowel lengths in phonemes at the age of 2 months. Infants were presented with two syllables of different duration, /ba:/ (baa) versus /ba/ (ba), in an MMR paradigm. Two separate sessions tested the long syllable /ba:/ (baa) as deviant in a stream of short syllable /ba/ (ba) standards, and short /ba/ (ba) as deviant in a stream of long /ba:/ (baa) standards.
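The logic of the mismatch paradigm can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure of any study cited here: the 15% deviant probability, the rule that two deviants never follow each other, and all function names are assumptions made for the example. It builds a standard/deviant stimulus sequence and computes a difference wave as the averaged deviant response minus the averaged standard response.

```python
import random


def make_oddball_sequence(n_trials, p_deviant=0.15, seed=0):
    """Return a list of 'standard'/'deviant' labels.

    Assumption for this sketch: a deviant is never followed
    directly by another deviant.
    """
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_trials):
        if sequence and sequence[-1] == "deviant":
            sequence.append("standard")
        else:
            is_deviant = rng.random() < p_deviant
            sequence.append("deviant" if is_deviant else "standard")
    return sequence


def average_epochs(epochs):
    """Point-by-point mean over equally long epochs (lists of samples)."""
    n_epochs = len(epochs)
    return [sum(samples) / n_epochs for samples in zip(*epochs)]


def difference_wave(deviant_epochs, standard_epochs):
    """Averaged deviant response minus averaged standard response."""
    deviant_avg = average_epochs(deviant_epochs)
    standard_avg = average_epochs(standard_epochs)
    return [d - s for d, s in zip(deviant_avg, standard_avg)]


if __name__ == "__main__":
    labels = make_oddball_sequence(20)
    print(labels)
    # Two toy deviant epochs and two toy standard epochs, two samples each.
    print(difference_wave([[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [2.0, 1.0]]))
```

In a design with two sessions, as described above, the same computation would be run once per session with the roles of the long and the short syllable exchanged.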

Figure 9.3 Syllable discrimination. ERP responses of 2-month-old infants to standard and deviant syllables and difference waves (deviant-standard) for the long syllable /ba:/ and the short syllable /ba/ in an auditory oddball paradigm. Modified from Friederici, Friedrich, & Weber, 2002.

In Figure 9.3, the ERP difference waves display a positivity with a frontal maximum at about 500 ms post-syllable onset for deviant processing. However, this positivity was only present for the deviancy detection of the long syllable in a stream of short syllables but not vice versa, which can be explained by the greater perceptual saliency of a larger element in the context of smaller elements. In adults, the same experimental setting evoked a pronounced negative deflection at about 200 ms post-stimulus onset in the difference wave, the typical MMN response to acoustically deviating stimuli. Interestingly, in infants, the response varied depending on their state of alertness; children who were in quiet sleep during the experiment showed only a positivity, while children who were awake showed an adult-like MMN in addition to the positivity. From the data, it follows that infants as young as 2 months of age are able to discriminate long syllables from short syllables and that they display a positivity in the ERP as MMR.

Interestingly, language-specific phonemic discrimination is established only later during infants’ development, between the age of 6 and 12 months. Electrophysiological evidence revealed that younger infants aged 6 and 7 months show discrimination of phonemic contrasts that are either relevant or not relevant for their native language, whereas older infants aged 11 and 12 months only display discrimination of the phonemic contrast in their native language (Cheour et al., 1998; Rivera-Gaxiola, Silva-Pereyra, & Kuhl, 2005). Similarly, Minagawa-Kawai, Mori, Naoi, and Kojima (2007) showed in an fNIRS study that infants tune into their native language-specific phoneme contrast at about the age of 6 to 7 months. However, the left dominance of the phoneme-specific response in the temporal regions was observed only in infants aged 13 months and older. These results suggest that phoneme contrasts are initially processed as acoustic rather than linguistic differences until about 12 months, when left hemisphere regions are recruited in a way similar to adults (Minagawa-Kawai, Mori, Furuya, Hayashi, & Sato, 2002).

In infant studies using the mismatch paradigm, the MMR can appear as either a positive or a negative deflection in the ERP. For example, Kushnerenko et al. (2001) presented sleeping newborns with fricatives of different durations and observed negative MMRs, whereas Leppänen et al. (1999) and Pihko et al. (1999) reported positive MMRs in sleeping newborns for syllables with different vowel length. The outcome of ERP responses to auditory change detection seems to be affected by several factors, for example, the infants’ state of alertness (awake or asleep). Furthermore, stimulus discrimination difficulty or saliency seems to have an impact on the discrimination response (Friederici, Friedrich, & Weber, 2002; Morr, Shafer, Kreuzer, & Kurtzberg, 2002). Also, the transition from a positive to a negative MMR has been shown to be an effect of advancing maturation (Kushnerenko et al., 2002b; Morr et al., 2002; Trainor et al., 2003). Despite the differences in the ERP morphology of the detection of phoneme changes, the combined data suggest that infants’ ability to automatically discriminate between different phonemes is present from early on.

Word Stress

Another important phonological feature that infants have to discover and make use of during language acquisition is the rule according to which stress is applied to multisyllabic words. For example, English, like German, is a stress-based language and has a bias toward a stress-initial pattern for bisyllabic words (Cutler & Carter, 1987). French, in contrast, is a syllable-based language that tends to lengthen the word’s last syllable (Nazzi, Iakimova, Bertoncini, Frédonie, & Alcantara, 2006).

Behaviorally, it has been shown that even newborns discriminate differently stressed pseudowords (Sansavini, Bertoncini, & Giovanelli, 1997) and that between 6 and 9 months, infants acquire language-specific knowledge about the stress pattern of possible word candidates (Jusczyk, Cutler, & Redanz, 1993; Höhle, Bijeljac-Babic, Nazzi, Herold, & Weissenborn, 2009; Skoruppa et al., 2009). Interestingly, studies revealed that stress pattern discrimination at 6 months is shaped by language experience, as German-learning but not French-learning infants distinguish between stress-initial and stress-final pseudowords (Höhle et al., 2009). Similarly, at 9 months, Spanish-learning but not French-learning infants show discrimination responses, suggesting that French infants, although they are sensitive to the acoustic differences, do not treat stress as lexically informative (Skoruppa et al., 2009).

Neuroimaging studies that do not require infants’ attention during testing suggest that infants are sensitive to the predominant stress pattern of their target language as early as 4 to 5 months of age (Friederici, Friedrich, & Christophe, 2007; Weber, Hahne, Friedrich, & Friederici, 2004). In an ERP study, Friederici, Friedrich, and Christophe (2007) tested two groups of 4- to 5-month-old German- and French-learning infants for their ability to discriminate between different stress patterns. In a mismatch paradigm, the standard stimuli were bisyllabic pseudowords with stress on the first syllable (baaba), whereas the deviant stimuli had stress on the second syllable (babaa). The data revealed that both groups are able to discriminate between the two types of stress patterns (Figure 9.4). However, results differed with respect to the amplitude of the MMR: Infants learning German showed a larger effect for the language-nontypical iambic pattern (stress on the second syllable), whereas infants learning French demonstrated a larger effect for the language-nontypical trochaic pattern (stress on the first syllable). These results suggest that the respective nontypical stress pattern is considered deviant both within the experiment (i.e., rare stimulus in the set) and with respect to an individual infant’s native language. This finding, in turn, presupposes that infants have established knowledge about the predominant stress pattern of their target language by the age of 5 months.

As behavioral and electrophysiological developmental studies suggest, early syllable identification and stress pattern discrimination support speech segmentation during later acquisition stages, performed by identifying onset and offset boundaries. Accordingly, in a number of behavioral experiments, Nazzi, Dilley, Jusczyk, Shattuck-Hufnagel, and Jusczyk (2005) demonstrated that both the type of initial phoneme and the stress pattern influence word segmentation from fluent speech, with a preference for the predominant patterns of the infants’ native language. Similarly, infants’ word detection is facilitated when words occur at boundary positions and are thus marked by additional prosodic information (Gout, Christophe, & Morgan, 2004; Seidl & Johnson, 2007). Moreover, several studies that measured children’s later language outcome in lexical-semantic and syntactic domains revealed the predictive value of infants’ early ERP responses to phoneme and stress pattern contrasts (Friedrich & Friederici, 2010; Kuhl et al., 2008; Tsao, Liu, & Kuhl, 2004).

Regarding the development of word segmentation abilities, behavioral studies have demonstrated that at the age of 7.5 months, infants learning English are able to segment bisyllabic words with stress on the first syllable from continuous speech but not those with stress on the second syllable (Jusczyk, Houston, & Newsome, 1999). Segmentation of stress-initial words was also reported in 9-month-old Dutch-learning infants for both native and nonnative words, which, however, all followed the same language-specific stress pattern rules (Houston, Jusczyk, Kuijpers, Coolen, & Cutler, 2000; Kuijpers, Coolen, Houston, & Cutler, 1998). In contrast, the ability to segment words with stress on the second syllable was only observed at the age of 10.5 months in English-learning infants (Jusczyk, Houston, & Newsome, 1999). For French-learning infants, Nazzi et al. (2006) found developmental differences between 8 and 16 months for the detection of syllables and bisyllabic words in fluent speech. Bisyllabic words as one unit are only detected at the age of 16 months. Although no segmentation effect was found for 8-month-olds, 12-month-olds segmented individual syllables from the speech stream, with more ease in segmenting the second syllable, which is consistent with the rhythmic features of French.


Figure 9.4 Word stress. ERP responses of 4- to 5-month-olds to standard and deviant stress patterns in an auditory oddball paradigm. A, Grand-average ERPs for French infants with the trochaic stress pattern /baaba/ as standard and deviant (upper row) and the iambic stress pattern /babaa/ as standard and deviant (lower row). B, Grand-average ERPs for German infants with the trochaic stress pattern /baaba/ as standard and deviant (upper row) and the iambic stress pattern /babaa/ as standard and deviant (lower row). Reprinted with permission from Friederici, Friedrich, & Christophe, 2007.

Electrophysiological studies on word segmentation revealed word recognition responses for Dutch-learning 7-month-olds in the ERP to previously familiarized words (Kooijman, Johnson, & Cutler, 2008), whereas behavioral studies observed word segmentation for 9-month-olds, but not yet for 7.5-month-olds (Kuijpers, Coolen, Houston, & Cutler, 1998). Interestingly, detection of words in sentences by German-learning infants was observed even at 6 months, when during familiarization, words were particularly prosodically emphasized (Männel & Friederici, 2010). For the segmentation of the less familiar final-stress pattern, Dutch-learning 10-month-olds still largely relied on the strong syllable to launch words (Kooijman, Hagoort, & Cutler, 2009). Similarly, relating to the behavioral delay of French-learning infants in bisyllabic word segmentation, Goyet, de Schonen, and Nazzi (2010) found, for French 12-month-olds, ERP responses to bisyllabic stress-final words that revealed both whole word and syllable segmentation.

Phonotactics

For successful language learning, infants eventually need to acquire the rules according to which phonemes may be combined to form a word in a given language. As infants become more familiar with actual speech sounds, they gain probabilistic knowledge about particular phonotactic rules. This also includes information about which phonemes or phoneme combinations can legitimately appear at word onsets and offsets. If infants acquire this kind of information early on, it can support the detection of lexical units in continuous speech and thus facilitate the learning of new words.

Behaviorally, it has been shown that phonotactic knowledge about word onsets and offsets is present and used for speech segmentation at the age of 9 months, but is not yet present at 6 months of age (Friederici & Wessels, 1993; Jusczyk, Friederici, Wessels, Svenkerud, & Jusczyk, 1993). In ERP studies, the N400 component can serve as an electrophysiological marker for studying phonotactic knowledge by enabling the comparison of brain responses to nonwords that follow the phonotactic rules of a given language and nonsense words that do not. The N400 component is known to indicate lexical (word form) and semantic (meaning) processes and is interpreted to mark the effort to integrate an event into its current or long-term context, with more pronounced N400 amplitudes indicating lexically and semantically unfamiliar or unexpected events (Holcomb, 1993; Kutas & Van Petten, 1994). Regarding phonotactic processing in adults, ERP studies revealed larger N400 amplitudes for pseudowords (i.e., phonotactically legal but nonexistent in the lexicon) than to real words. In contrast, nonwords (i.e., phonotactically illegal words) did not elicit an N400 response (e.g., Bentin, Mouchetant-Rostaing, Giard, Echallier, & Pernier, 1999; Holcomb, 1993; Nobre & McCarthy, 1994). This suggests that pseudowords trigger search processes for possible lexical entries, but this search fails because pseudowords do not exist in the lexicon. Nonwords, in contrast, do not initiate a similar search response because they are not even treated as possible lexical entries as they already violate the phonotactic rules.
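The three-way distinction between words, pseudowords, and nonwords can be made concrete with a deliberately simplified Python sketch. Everything in it is a toy assumption for illustration: the small English-like onset inventory, the three-item lexicon, the use of spelling in place of phonemes, and the restriction to word onsets. It is not the stimulus selection procedure of any study cited here, and real phonotactic knowledge is probabilistic and also covers word offsets.

```python
# Toy, hypothetical inventory of legal word onsets (spelling stands in
# for phonemes). The empty string allows vowel-initial forms.
LEGAL_ONSETS = {"", "b", "bl", "br", "k", "kl", "p", "pl",
                "s", "st", "str", "t", "tr"}
VOWELS = set("aeiou")
LEXICON = {"bat", "pat", "stop"}  # toy lexicon for the example


def onset(form):
    """Return the consonant cluster before the first vowel letter."""
    for index, letter in enumerate(form):
        if letter in VOWELS:
            return form[:index]
    return form


def classify(form):
    """Label a form as 'word', 'pseudoword', or 'nonword'."""
    if form in LEXICON:
        return "word"          # exists in the lexicon
    if onset(form) in LEGAL_ONSETS:
        return "pseudoword"    # legal onset, but no lexical entry
    return "nonword"           # onset violates the toy phonotactic rules


if __name__ == "__main__":
    for item in ["pat", "plob", "tlob"]:
        print(item, classify(item))
```

Under these assumptions, a form such as "plob" passes the onset check and would count as a possible word candidate, whereas "tlob" is rejected before any lexical search is attempted, which mirrors the interpretation given above.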

Figure 9.5 Phonotactic rules. ERP data of 12-month-olds, 19-month-olds, and adults in response to phonotactically illegal nonwords and phonotactically legal pseudowords in a picture–word priming paradigm. Modified from Friedrich & Friederici, 2005a.

In a developmental ERP study, Friedrich and Friederici (2005a) investigated phonotactic knowledge in 12- and 19-month-old toddlers by measuring brain responses to phonotactically legal pseudowords and phonotactically illegal nonwords. In a picture–word priming paradigm, children were presented with simple colored pictures while simultaneously listening to words that either correctly labeled the picture content or were pseudowords or nonwords. The picture content is assumed to initiate lexical-semantic priming, which results in semantic integration difficulties when the respective labels do not match the pictures, reflected in enhanced N400 amplitudes. As Figure 9.5 illustrates, the ERP responses of 19-month-olds are quite similar to the ones observed in adults because they demonstrate more negative responses to phonotactically legal pseudowords than to phonotactically illegal nonwords. Adults show the typical N400 response to pseudowords, starting at about 400 ms after stimulus onset, whereas in 19-month-olds, the negative deflection to pseudowords is sustained longer. In contrast, data of 12-month-olds do not reveal differential ERP responses to pseudowords and nonwords. From these data it follows that, in contrast to 12-month-olds, who do not yet have this ability, 19-month-olds possess some phonotactic knowledge (indicated by an N400-like response) and therefore treat pseudowords, but not nonwords, as likely referents for picture labels. This implies that nonwords that do not follow the language-specific phonotactic rules are not considered word candidates and, from very early on, are excluded from further word learning.

Phonological Familiarity

Infants’ emerging efforts to map sounds onto objects (or pictures of objects) have been captured in an additional ERP effect. An ERP study with 11-month-olds suggested a differential brain response to known compared with unknown words in the form of a negativity at about 200 ms after word onset, which could be viewed as a familiarity effect (Thierry, Vihman, & Roberts, 2003). Using a picture–word priming paradigm with 12- and 14-month-olds, Friedrich and Friederici (2005b) observed an early frontocentral negativity between 100 and 400 ms for auditory word targets that matched the picture compared with nonmatching words. This early effect was interpreted as a familiarity effect reflecting the fulfillment of a phonological (word) expectation after seeing the picture of an object. At this age, infants seem to have some lexical knowledge, but the specific word form referring to a given object may not yet be sharply defined. This interpretation is supported by the finding that 14-month-olds show an ERP difference between known words and phonetically dissimilar known words, but not between known words and phonetically similar words (Mills et al., 2004). The available data thus indicate that phonological information and semantic knowledge interact at about 14 months of age.

Word Meaning

As described above, the adult N400 component reflects the integration of a lexical element into a semantic context (Holcomb, 1993; Kutas & Van Petten, 1994) and can be used as an ERP template against which the ERPs for lexical-semantic processing during early development are compared.

Mills, Coffey-Corina, and Neville (1997) investigated infants’ processing of words whose meaning they knew or did not know. Infants between 13 and 17 months of age showed a bilateral negativity for unknown words, whereas 20-month-olds showed a left-hemispheric negativity, which was interpreted as a developmental change toward a hemispheric specialization for word processing (see also Mills et al., 2004). In a more recent ERP study, Mills and colleagues tested the effect of word experience (training) and vocabulary size (word production) on lexical processing (Mills, Plunkett, Prat, & Schafer, 2005). In this word-learning paradigm, 20-month-olds acquired novel words either paired with a novel object or without an object. After training, the infant ERPs showed a repetition effect indicated by a reduced N200-500 amplitude to familiar and novel unpaired words, whereas an increased bilaterally distributed N200-500 was found for novel paired words. This finding indicates that the N200-500 is linked to word meaning; however, it is not entirely clear whether the N200-500 reflects semantic processes alone or whether phonological familiarity also plays a role. Assuming that this early ERP effect indeed reflects semantic processing, its early onset may be explained by infants’ relatively small vocabularies. A small vocabulary results in a low number of phonologically possible alternative word forms, allowing the brain to respond soon after hearing a word’s first phonemes (see earlier section on phonological familiarity).

A clear semantic-context N400 effect at the word level has been observed for 14- and 19-month-olds, but not yet for 12-month-olds (Friedrich & Friederici, 2004, 2005a, 2005b). The ERP to words in picture contexts showed a central-parietal, bilaterally distributed negative-going wave between 400 and 1400 ms, which was more negative for words that did not match the picture context than those that did (Figure 9.6). Compared with adults, this N400-like effect reached significance later and lasted longer, which suggests slower lexical-semantic processes in children. There were also small topographic differences of the effect because children showed a stronger involvement of frontal electrode sites than adults. The more frontal distribution could either mean that children’s semantic processes are still more image-based (see frontal distribution in adults for picture instead of word processing; West & Holcomb, 2002) or that children recruit frontal brain regions that, in adults, are associated with attention (Courchesne, 1990) and increased demands on language processing (Brauer & Friederici, 2007). In a recent study, Friedrich and Friederici (2010) found that the emergence of the N400 is not merely age dependent but also related to the infants’ state of behavioral language development. Twelve-month-olds, who obtained a high early word production score, already displayed an N400 semantic priming effect, whereas infants with lower vocabulary rates did not.
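One common way to quantify an effect such as the N400-like negativity described above is to compare the mean amplitude of two averaged waveforms within a latency window. The following Python sketch illustrates that idea only; it is not the analysis pipeline of the cited studies. The 400 to 1400 ms window is taken from the text, whereas the function names, the sampling rate in the usage example, and the toy waveforms are assumptions.

```python
def mean_amplitude(wave, sfreq, t_start_ms, t_end_ms, epoch_start_ms=0.0):
    """Mean of the samples whose latency lies in [t_start_ms, t_end_ms).

    wave: averaged ERP as a list of amplitudes (e.g., in microvolts)
    sfreq: sampling rate in Hz
    epoch_start_ms: latency of the first sample relative to word onset
    """
    step_ms = 1000.0 / sfreq
    selected = [
        value
        for index, value in enumerate(wave)
        if t_start_ms <= epoch_start_ms + index * step_ms < t_end_ms
    ]
    if not selected:
        raise ValueError("the window contains no samples")
    return sum(selected) / len(selected)


def priming_effect(nonmatch_wave, match_wave, sfreq, window=(400.0, 1400.0)):
    """Nonmatching minus matching mean amplitude in the given window.

    A negative value means the nonmatching condition is more negative,
    as expected for an N400-like effect.
    """
    return (mean_amplitude(nonmatch_wave, sfreq, *window)
            - mean_amplitude(match_wave, sfreq, *window))


if __name__ == "__main__":
    # Toy waveforms sampled at 10 Hz (one sample every 100 ms, 0 to 1500 ms).
    match = [0.0] * 16
    nonmatch = [0.0] * 4 + [-2.0] * 10 + [0.0] * 2
    print(priming_effect(nonmatch, match, sfreq=10))
```

Because the effect in children is reported to start later and last longer than in adults, the choice of window matters when comparing age groups; the window is therefore left as a parameter rather than fixed.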
Torkildsen and colleagues examined lexical-semantic processes as indicated by the N400 in 2-year-olds. In the first study, using a picture–word priming paradigm, they found that 20-month-olds showed earlier and larger N400 effects for between-category than within-category violations, pointing to children’s early knowledge about semantic categories (Torkildsen et al., 2006). In the second study, the authors used a unimodal lexical-semantic priming paradigm with semantically related and unrelated word pairs, and demonstrated that 24-month-olds reveal a phonological-lexical familiarity effect for related word pairs and an N400 effect for unrelated word pairs, suggesting that semantic relatedness priming is functional at the end of children’s second year (Torkildsen et al., 2007).

There are few fMRI studies with children investigating lexical-semantic processes at the word level. These fMRI studies suggest that a neural network for the processing of words and their meaning often seen in adults is established by the age of 5 years. For example, one study used a semantic categorization task with 5- to 10-year-old children and observed activation in the left inferior frontal gyrus and the temporal region as well as in the left fusiform gyrus, suggesting a left hemispheric language network similar to that in adults (Balsamo, Xu, & Gaillard, 2006). Another study used a semantic judgment task requiring the evaluation of the semantic relatedness of two auditorily presented words (Chou et al., 2006). During this task, 9- to 15-year-olds showed activation in the temporal gyrus and in the inferior frontal gyri bilaterally. In a recent fMRI language mapping study with 8- to 17-year-old children, de Guibert et al. (2010) applied two auditory lexical-semantic tasks and two visual phonological tasks and observed selective activations in left frontal and temporal regions.

Figure 9.6 Word meaning. ERP data and difference maps (nonmatching–matching) of 12-month-olds, 14-month-olds, 19-month-olds, and adults in response to words matching or not matching the picture content in a picture–word priming paradigm. Modified from Friedrich & Friederici, 2005a, 2005b.

In summary, in the developmental ERP studies on semantic processing at the word level we have introduced, two ERP effects have been observed. First, an early negativity in response to picture-matching words has been found even in 12-month-olds and can be interpreted as a phonological familiarity effect. Second, a later central-parietal negativity for nonmatching words has been observed in 14- and 19-month-olds, an effect referred to as infant N400. The occurrence of a phonological familiarity effect across all age groups suggests that not only 14- and 19-month-olds but also 12-month-olds create lexical expectations from picture contents, revealing that they already possess some lexical-semantic knowledge. However, 12-month-olds do not yet display the N400 semantic expectancy violation effect present in 14-month-olds, which indicates that the neural mechanisms of the N400 mature between 12 and 14 months of age. Furthermore, at the end of their second year, toddlers are sensitive to semantic category relations and semantic relatedness of basic-level words. The finding that the N400 at this age still differs in latency and distribution from the adult N400 suggests that the underlying brain systems are still under development. The fact, however, that an N400 effect is present at this age implies that this ERP component is a useful tool to further investigate semantic processing in young children. In this context, developmental fMRI studies have revealed left-hemisphere activation patterns for lexical-semantic processes that resemble those of adults. Direct developmental comparisons, however, suggest age-related activation increases in left inferior frontal regions and left superior temporal regions, indicating greater lexical control and experience-based gain in lexical representations, respectively (Booth et al., 2004; Cao et al., 2010; Schlaggar et al., 2002).


From Sounds to Sentences

On the path from sounds to sentences, prosodic information plays a central role. Sentential prosody is crucial for the acquisition of syntactic structure because different acoustic cues that in combination mark prosodic phrase boundaries often signal syntactic phrase boundaries. The detection and processing of prosodic phrase boundaries thus facilitate the segmentation of linguistically relevant units from continuous speech and provide an easy entry into later lexical and syntactic learning (see Gleitman & Wanner, 1982).

Sentence-Level Prosody

Intonational phrase boundaries (IPBs) mark the largest units in phrasal prosody, roughly corresponding to syntactic clauses, and are characterized by several acoustic cues, namely, preboundary lengthening, pitch change, and pausing (Selkirk, 1984). Behaviorally, it has been shown that adult listeners make use of prosodic boundaries in the interpretation of spoken utterances (e.g., Schafer, Speer, Warren, & White, 2000). Similarly, developmental studies indicate that infants perceive larger linguistic units in continuous speech based on prosodic boundary cues. Although 6-month-old English-learning infants detect clauses in continuous speech, they cannot yet reliably identify syntactic phrases (Nazzi, Kemler Nelson, Jusczyk, & Jusczyk, 2000; Seidl, 2007; Soderstrom, Nelson, & Jusczyk, 2005; Soderstrom, Seidl, Nelson, & Jusczyk, 2003). In contrast, 9-month-olds demonstrate this ability at both the clause and the phrase level (Soderstrom et al., 2003). Thus, the perception of prosodic cues that, in combination, signal boundaries appears to be essential for structuring the incoming speech signal and enables further speech analyses.

In adult ERP studies, the offset of IPBs is associated with a positive-going deflection with a central-parietal distribution, the closure positive shift (CPS; Pannekamp, Toepel, Alter, Hahne, & Friederici, 2005; Steinhauer, Alter, & Friederici, 1999). This component has been interpreted as an indicator of the closure of prosodic phrases by IPBs. The CPS is not a mere reaction to the acoustically salient pause (lower-level processing) but rather an index of the underlying linguistic process of prosodic structure perception (higher-level processing), because it is still present when the pause is deleted (Steinhauer, Alter, & Friederici, 1999).

To investigate the electrophysiology underlying prosodic processing at early stages of language acquisition, a recent ERP study examined 5-month-olds’ ability to process IPBs with and without a boundary pause (Männel & Friederici, 2009). Infants listened to sentences with two different prosodic realizations determined by their particular syntactic structure: sentences containing an IPB (e.g., Tommi verspricht, # Papa zu helfen [Tommi promises to help Papa]) and sentences without an IPB (e.g., Tommi verspricht Papa zu schlafen [Tommi promises Papa to sleep]). In a first experiment, 5-month-olds showed no CPS in response to IPBs; instead, they demonstrated an obligatory ERP response to sentence continuation after the pause. In a second experiment, in which the boundary pause had been deleted and only preboundary lengthening and pitch change signaled the IPBs, another group of 5-month-olds did not show the obligatory ERP response observed previously. In contrast, adults showed a CPS in addition to obligatory ERP responses, independent of the presence of the boundary pause (see also Steinhauer, Alter, & Friederici, 1999). The developmental comparison indicates that infants are sensitive to salient acoustic cues such as pauses in the speech input and that they process speech interruptions at lower perceptual levels. However, they do not yet show higher-level processing of combined prosodic boundary cues, as reflected by the CPS.

ERP studies in older children examined when, during language learning, the processes associated with the CPS emerge by exploring the relationship between prosodic boundary perception and syntactic knowledge (Männel & Friederici, 2011). ERP studies on the processing of phrase structure violations have revealed a developmental shift between children’s second and third year (Oberecker, Friedrich, & Friederici, 2005; Oberecker & Friederici, 2006; see below). Accordingly, children were tested on IPB processing before this developmental phase, at 21 months, and after this phase, at 3 and 6 years of age. As can be seen from Figure 9.7, 21-month-olds do not yet show a positive shift in response to IPBs, whereas 3- and 6-year-olds do. These results indicate that prosodic structure processing, as indexed by the CPS, does not emerge until some knowledge of syntactic phrase structure has been established.

The combined ERP findings on prosodic processing in infants and children suggest that during early stages of language acquisition, infants initially rely on salient acoustic aspects of prosodic information that are likely contributors to the recognition of prosodic boundaries. Children may initially detect prosodic breaks through lower-level processing mechanisms until a degree of syntactic structure knowledge is formed through continued language experience, which, in turn, reinforces children’s ability to perceive prosodic phrasing at a cognitive level.

The use of prosodic boundary cues for later language learning has been shown in lexical acquisition (Gout, Christophe, & Morgan, 2004; Seidl & Johnson, 2007) and in the acquisition of word order regularities (Höhle, Weissenborn, Schmitz, & Ischebeck, 2001). Thus, from a developmental perspective, the initial analysis and segmentation of larger linguistically relevant units based on prosodic boundary cues seems to be particularly important during language acquisition and likely facilitates bootstrapping into smaller syntactic and lexical units in the speech signal later in children’s development.


Figure 9.7 Sentence-level prosody. ERP data and difference maps (with IPB minus without IPB) of 21-month-olds, 3-year-olds, and 6-year-olds in response to sentences with and without intonational phrase boundaries (IPB). Modified from Männel, 2011.

Sentence-Level Semantics

Sentence processing requires not only the identification of linguistic units but also the maintenance of the related information in working memory and the integration of different information over time. To understand the meaning of a sentence, the listener has to possess semantic knowledge about nouns and verbs as well as their respective relationship (for neural correlates of developmental differences between noun and verb processing, see Li, Shu, Liu, & Li, 2006; Mestres-Misse, Rodriguez-Fornells, & Münte, 2010; Tan & Molfese, 2009). To investigate whether children already process word meaning and semantic relations in sentential context, the semantic violation paradigm can be applied with semantically correct and incorrect sentences such as The king was murdered and The honey was murdered, respectively (Friederici, Pfeifer, & Hahne, 1993; Hahne & Friederici, 2002). This paradigm uses the N400 as an index of semantic integration abilities, with larger N400 amplitudes reflecting greater effort in integrating semantically inappropriate words into their context. The semantic expectation of a possible sentence ending, for example, is violated in The honey was murdered because the verb at the end of the sentence (murdered) does not fit the meaning that was set up by the noun at the beginning (honey). In adult ERP studies, an N400 has been found in response to such semantically unexpected sentence endings (Friederici, Pfeifer, & Hahne, 1993; Hahne & Friederici, 2002).

Friedrich and Friederici (2005c) studied the ERP responses to semantically correct and incorrect sentences in 19- and 24-month-old children. Semantically incorrect sentences contained objects that violated the selection restrictions of the preceding verb, as in The cat drinks the ball in contrast to The child rolls the ball. For both age groups, the sentence endings of semantically incorrect sentences evoked N400-like effects in the ERP, with a maximum at central-parietal electrode sites (Figure 9.8). In comparison to the adult data, the negativities in children started at about the same time (i.e., at around 400 ms post-word onset) but were longer lasting. This suggests that semantically unexpected nouns that violate the selection restrictions of the preceding verb also initiate semantic integration processes in children but that these integration efforts are maintained longer than in adults. The developmental ERP data indicate that even at the age of 19 and 24 months, children are able to process semantic relations between words in sentences in a manner similar to adults.

ERP studies on the processing of sentential lexical-semantic information have also reported N400-like responses to semantically incorrect sentences in older children, namely 5- to 15-year-olds (Atchley et al., 2006; Hahne et al., 2004; Holcomb, Coffey, & Neville, 1992). Similarly, Silva-Pereyra and colleagues found that sentence endings that semantically violated the preceding sentence phrases evoked several anteriorly distributed negative peaks in 3- and 4-year-olds, whereas in 30-month-olds, an anterior negativity occurred between 500 and 800 ms after word onset (Silva-Pereyra, Klarman, Lin, & Kuhl, 2005; Silva-Pereyra, Rivera-Gaxiola, & Kuhl, 2005). Although these studies revealed differential responses to semantically incorrect and correct sentences in young children, the distribution of these negativities did not match the usual central-parietal maximum of the N400 seen in adults.
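The logic of the semantic violation paradigm (average the responses per condition, subtract correct from incorrect, and quantify the negativity in a time window) can be illustrated with a minimal sketch. This is not the analysis pipeline of any study cited here: the function names, the toy amplitudes, the sampling times, and the 400 to 600 ms window are all assumptions chosen for illustration.

```python
from statistics import mean

def average_erp(trials):
    """Average single-trial epochs (lists of microvolt samples) point by point."""
    return [mean(samples) for samples in zip(*trials)]

def difference_wave(incorrect_erp, correct_erp):
    """Subtract the correct-condition ERP from the incorrect-condition ERP."""
    return [i - c for i, c in zip(incorrect_erp, correct_erp)]

def mean_amplitude(wave, times_ms, start_ms, end_ms):
    """Mean amplitude of a wave within an inclusive time window."""
    return mean(v for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms)

# Hypothetical toy data: two trials per condition at one electrode,
# sampled at five time points relative to the onset of the final word.
times_ms = [0, 200, 400, 600, 800]
correct_trials = [[0.0, 1.0, 1.0, 0.5, 0.0], [0.0, 1.0, 3.0, 1.5, 0.0]]
incorrect_trials = [[0.0, 1.0, -2.0, -1.5, 0.0], [0.0, 1.0, -4.0, -2.5, 0.0]]

correct_erp = average_erp(correct_trials)
incorrect_erp = average_erp(incorrect_trials)
n400_effect = difference_wave(incorrect_erp, correct_erp)

# A more negative value indicates a larger N400-like effect (assumed window).
effect = mean_amplitude(n400_effect, times_ms, 400, 600)
print(n400_effect, effect)
```

In this toy example the difference wave is negative only from 400 ms onward. A longer-lasting negativity, as described for the children, would show up as negative values extending into later samples.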

Figure 9.8 Sentence-level lexical-semantic information. ERP data and difference maps (incorrect minus correct) of 19-month-olds, 24-month-olds, and adults in response to the sentence endings of semantically correct and incorrect sentences in a semantic violation paradigm. Modified from Friedrich & Friederici, 2005c.

Despite the different effects reported, the available ERP studies on sentential semantic processing suggest that semantic processes at the sentence level, as reflected by an N400-like response, are, in principle, present at the end of children’s second year of life. However, it takes a few more years before the neural network underlying these processes is established in an adult-like manner.

A recent fMRI study investigating the neural network underlying sentence-level semantic processes in 5- to 6-year-old children and adults provides some evidence for differences between the neural networks recruited in children and adults (Brauer & Friederici, 2007). Activation in children was found bilaterally in the superior temporal gyri and in the inferior and middle frontal gyri for the processing of correct sentences and semantically incorrect sentences. Compared with adults, the children’s language network was less lateralized, was less specialized with respect to different aspects of language processing (semantics versus syntax, see also below), and engaged additional areas in the inferior frontal cortex bilaterally. Another fMRI study examined lexical-semantic decisions for semantically congruous and incongruous sentences in older children, aged 7 to 10 years, and adults (Moore-Parks et al., 2010). Overall, the results suggested that by the end of their first decade, children employ a cortical network in semantic processing similar to that of adults, including activation in left inferior frontal, left middle temporal, and bilateral superior temporal gyri. However, the results also revealed developmental differences, with adults showing greater activation in the left inferior frontal gyrus, left supramarginal gyrus, and left inferior parietal lobule as well as in motor-related regions.

Syntactic Rules

In any language, a well-defined rule system determines the composition of lexical elements, thus giving the sentence its structure. The analysis of syntactic relations between words and phrases is a complicated process, yet children have acquired the basic syntactic rules of their native language by the end of their third year (see Guasti, 2002; Hirsh-Pasek & Golinkoff, 1996; Szagun, 2006). For successful sentence comprehension, two aspects of syntax processing appear to be of particular relevance: first, the structure of each phrase, which has to be built on the basis of word category information; and second, the grammatical relationship between the various sentence elements, which has to be established in order to allow the interpretation of who is doing what to whom.

Figure 9.9 Syntactic rules. ERP data of 24-month-olds, 32-month-olds, and adults in response to syntactically correct and incorrect sentences in a syntactic violation paradigm. Modified from Oberecker, Friedrich, & Friederici, 2005.

Adult ERP and fMRI studies have investigated the neural correlates of syntactic processing during sentence comprehension by focusing on two aspects: phrase structure building, and the establishment of grammatical relations and thereby the sentence’s interpretation. Studies of the former have used the syntactic violation paradigm (e.g., Atchley et al., 2006; Friederici, Pfeifer, & Hahne, 1993). In this paradigm, syntactically correct and syntactically incorrect sentences are presented, with the latter having morphosyntactic, phrase structure, or tense violations. In the ERP response to syntactically incorrect sentences containing phrase structure violations, two components have been observed. The first is the ELAN, an early left anterior negativity, which is interpreted to reflect highly automatic phrase structure building processes (Friederici, Pfeifer, & Hahne, 1993; Hahne & Friederici, 1999). The second is the P600, a later-occurring central-parietal positivity, which is interpreted to indicate processes of syntactic integration (Kaan et al., 2000) and controlled processes of syntactic reanalysis and repair (Friederici, Hahne, & Mecklinger, 1996; Osterhout & Holcomb, 1993). This biphasic ERP pattern in response to phrase structure violations has been observed for both passive and active sentence constructions (Friederici, Pfeifer, & Hahne, 1993; Hahne, Eckstein, & Friederici, 2004; Hahne & Friederici, 1999; Rossi, Gugler, Hahne, & Friederici, 2005).

Several developmental ERP studies have examined at what age children process phrase structure violations and therefore show the syntax-related ERP components ELAN and P600 as observed in adults (Oberecker, Friedrich, & Friederici, 2005; Oberecker & Friederici, 2006). In these experiments, 24- and 32-month-old German children listened to syntactically correct sentences and incorrect sentences that comprised incomplete prepositional phrases. For example, the noun after the preposition was omitted, as in *The lion in the ___ roars versus The lion roars. As illustrated in Figure 9.9, the adult data revealed the expected biphasic ERP pattern in response to the sentences containing a phrase structure violation. The ERP responses of 32-month-old children showed a similar pattern, although both components appeared in later time windows than in the adult data. Interestingly, 24-month-old children also showed a difference between correct and incorrect sentences; however, in this age group, only a P600 but no ELAN occurred.

Recently, Bernal, Dehaene-Lambertz, Millotte, and Christophe (2010) demonstrated that 24-month-old French children compute syntactic structure when listening to spoken sentences. The authors report an early left-lateralized ERP response for word category violations (i.e., when an expected verb was incorrectly replaced by a noun, or vice versa). Silva-Pereyra and colleagues examined the processing of tense violations in sentences in children between 30 and 48 months (Silva-Pereyra et al., 2005; Silva-Pereyra, Rivera-Gaxiola, & Kuhl, 2005). The ERPs to incorrect sentences revealed a late positivity for the older children and a very late-occurring positivity for the 30-month-olds. In a recent series of ERP experiments, Silva-Pereyra, Conboy, Klarman, and Kuhl (2007) studied syntactic processing in 3-year-olds, using natural sentences and sentences without semantic information (so-called jabberwocky sentences), in which content words are replaced by pseudowords. Children were presented with correct sentences and incorrect sentences containing phrase structure violations. For the natural sentences, children showed two positivities in response to the syntactic violations, whereas for the syntactically incorrect jabberwocky sentences, two negativities were observed. This ERP pattern clearly differs from that in adults, who show an ELAN and a P600 in normal and jabberwocky sentences, with a constant amplitude of the ELAN and a reduced P600 for jabberwocky sentences, in which integration is not necessary (Hahne & Jescheniak, 2001; Yamada & Neville, 2007).

Hahne, Eckstein, and Friederici (2004) investigated the processing of phrase structure violations in syntactically more complex, noncanonical sentences (i.e., passive sentences such as The boy was kissed by the girl). In these sentences, the first noun (the boy) is not the actor, which makes the interpretation more difficult than in active sentences. When a syntactic violation occurred in passive sentences, the ELAN-P600 pattern was evoked in 7- to 13-year-old children. Six-year-olds, however, displayed only a late P600.

The combined ERP results point to developmental differences, suggesting that automatic syntactic processes, reflected by the ELAN, are present later during language development than processes reflected by the P600. Moreover, the adult-like ERP pattern is present earlier for active than for passive sentences. This developmental course is in line with behavioral findings indicating that the processing of noncanonical sentences develops late, only after the age of 5 years and, depending on the syntactic structure, only around the age of 7 years (Dittmar, Abbot-Smith, Lieven, & Tomasello, 2008).

The neural network underlying syntactic processes in the developing brain has recently been investigated in an fMRI study with 5- to 6-year-olds using the syntactic violation paradigm (Brauer & Friederici, 2007). Sentences containing a phrase structure violation bilaterally activated the superior temporal gyri and the inferior and middle frontal gyri (similar to correct and semantically incorrect sentences) but, in addition, specifically activated left Broca’s area. Compared with that in adults, this activation pattern was less lateralized, less specific, and more extended. A time course analysis of the perisylvian activation across correct and incorrect sentences also revealed developmental differences. In contrast to that in adults, children’s inferior frontal cortex responded much later than their superior temporal cortex (Figure 9.10). Moreover, in contrast to adults, children displayed a temporal primacy of right-hemispheric over left-hemispheric activation (Brauer, Neumann, & Friederici, 2008), which suggests a strong reliance on right-hemisphere prosodic processes during auditory sentence comprehension in childhood.

In a recent fMRI study with 10- to 16-year-old children, Yeatman, Ben-Shachar, Glover, and Feldman (2010) investigated sentence processing by systematically varying syntactic complexity and observed broad activation patterns in frontal, temporal, temporal-parietal, and cingulate regions. Independent of sentence length, syntactically more complex sentences evoked stronger activation in the left temporal-parietal junction and the right superior temporal gyrus. Interestingly, activation changes in frontal regions correlated with vocabulary and syntax perception measures. Thus, individual differences in activation patterns demonstrate that auditory sentence comprehension is based on a dynamic and distributed network that is modulated by age, language skills, and task demands.
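The idea behind a time-to-peak comparison between regions can be shown with a small sketch. It only illustrates the measure in principle: the sampling interval, the response values, and the region labels are hypothetical toy data, and the procedure actually used by Brauer, Neumann, and Friederici (2008) is not described here and may well differ.

```python
def time_to_peak(bold_signal, sampling_interval_s):
    """Return the latency (in seconds) of the maximum of a sampled response."""
    peak_index = max(range(len(bold_signal)), key=lambda i: bold_signal[i])
    return peak_index * sampling_interval_s

# Hypothetical, normalized responses sampled once per second (toy values only).
superior_temporal = [0.0, 0.2, 0.6, 1.0, 0.7, 0.3, 0.1, 0.0, 0.0]
inferior_frontal = [0.0, 0.0, 0.1, 0.3, 0.5, 0.8, 1.0, 0.6, 0.2]

temporal_peak_s = time_to_peak(superior_temporal, 1.0)
frontal_peak_s = time_to_peak(inferior_frontal, 1.0)

# A positive lag means the frontal response peaks after the temporal response.
frontal_lag_s = frontal_peak_s - temporal_peak_s
print(temporal_peak_s, frontal_peak_s, frontal_lag_s)
```

A larger positive lag in one group than in another would correspond to the kind of delayed inferior frontal response described above for children.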


Figure 9.10 Temporal organization of cortical activation during auditory sentence comprehension. Brain activation of adults and children in sagittal section (x = −50) and horizontal section (z = 2). Data are masked by random-effects activation maps at z = 2.33 and display a color coding for time-to-peak values in active voxels between 3.0 and 8.0 seconds. The lines indicate the cut for the corresponding section. Note the very late response in the inferior frontal cortex in children and their hemispheric differences in this region. Inserted diagrams demonstrate examples of BOLD responses to sentence comprehension in Broca’s area and in Heschl’s gyrus. Reprinted with permission from Brauer, Neumann, & Friederici, 2008.

Conclusion

The results of the reported behavioral and neuroimaging studies broadly cover phonological/prosodic, semantic, and syntactic aspects of language acquisition during the first years of life. In developmental research, ERPs are well established and often the method of choice; however, MEG, NIRS, and fMRI have recently been adapted for use in developmental populations. Because the ERP method delivers information about the neural correlates of different aspects of language processing, it is an excellent tool for the investigation of the various developmental stages in language acquisition. More specifically, a particular ERP component, the MMR, which reflects discrimination not only of acoustic but also of phonological features, can be used to examine very early stages of language acquisition, even in newborns. A further ERP component that indicates lexical and semantic processes in adults, the N400, has been registered in 14-month-olds but has not been found in 12-month-olds, and can be used to investigate phonotactic knowledge, word knowledge, and knowledge of lexical-semantic relations between basic-level words and between verbs and their arguments in sentences. For the syntactic domain, an adult-like biphasic ERP pattern, the ELAN-P600, is not yet present in 24-month-olds but is present in 32-month-old children for the processing of structural dependencies within phrases, thus characterizing the developmental progression of syntax acquisition. Other methods, particularly fMRI, deliver complementary evidence that the neural basis underlying specific aspects of language processing, such as semantics and syntax, is still under development for a few more years before adult-like language processes are achieved.

In summary, neuroimaging methods, in addition to behavioral studies, provide relevant information on various aspects of language processing. Although developmental research is still far from a detailed outline of the exact steps in language acquisition, the use of sophisticated neuroscientific methods with high temporal or spatial resolution allows researchers to study language development from very early on and to gain a more fine-grained picture of the language acquisition process and its neural basis.

References

Atchley, R. A., Rice, M. L., Betz, S. K., Kwasney, K. M., Sereno, J. A., & Jongman, A. (2006). A comparison of semantic and syntactic event related potentials generated by children and adults. Brain & Language, 99, 236–246.
Balsamo, L. M., Xu, B., & Gaillard, W. D. (2006). Language lateralization and the role of the fusiform gyrus in semantic processing in young children. NeuroImage, 31 (3), 1306–1314.
Bentin, S., Mouchetant-Rostaing, Y., Giard, M. H., Echallier, J. F., & Pernier, J. (1999). ERP manifestations of processing printed words at different psycholinguistic levels: Time course and scalp distribution. Journal of Cognitive Neuroscience, 11 (3), 235–260.
Bernal, S., Dehaene-Lambertz, G., Millotte, S., & Christophe, A. (2010). Two-year-olds compute syntactic structure online. Developmental Science, 13 (1), 69–76.

Booth, J. R., Burman, D. D., Meyer, J. R., Gitelman, D. R., Parrish, T. B., & Mesulam, M. M. (2004). Development of brain mechanisms for processing orthographic and phonologic representations. Journal of Cognitive Neuroscience, 16 (7), 1234–1249.
Brauer, J., Anwander, A., & Friederici, A. D. (2011). Neuroanatomical prerequisites for language functions in the maturing brain. Cerebral Cortex, 21, 459–466.
Brauer, J., & Friederici, A. D. (2007). Functional neural networks of semantic and syntactic processes in the developing brain. Journal of Cognitive Neuroscience, 19 (10), 1609–1623.
Brauer, J., Neumann, J., & Friederici, A. D. (2008). Temporal dynamics of perisylvian activation during language processing in children and adults. NeuroImage, 41 (4), 1484–1492.
Cao, F., Khalid, K., Zaveri, R., Bolger, D. J., Bitan, T., & Booth, J. R. (2010). Neural correlates of priming effects in children during spoken word processing with orthographic demands. Brain & Language, 114 (2), 80–89.

Cheour, M., Ceponiene, R., Lehtokoski, A., Luuk, A., Allik, J., Alho, K., et al. (1998). Development of language-specific phoneme representations in the infant brain. Nature Neuroscience, 1, 351–353.
Cheour, M., Imada, T., Taulu, S., Ahonen, A., Salonen, J., & Kuhl, P. K. (2004). Magnetoencephalography is feasible for infant assessment of auditory discrimination. Experimental Neurology, 190, 44–51.
Chou, T. L., Booth, J. R., Burman, D. D., Bitan, T., Bigio, J. D., Lu, D., & Cone, N. E. (2006). Developmental changes in the neural correlates of semantic processing. NeuroImage, 29, 1141–1149.
Clark, E. V. (2003). First language acquisition. Cambridge, UK: Cambridge University Press.
Courchesne, E. (1990). Chronology of postnatal human brain development: Event-related potential, positron emission tomography, myelogenesis, and synaptogenesis studies. In J. W. Rohrbaugh, R. Parasuraman, & R. Johnson (Eds.), Event-related brain potentials: Basic issues and applications (pp. 210–241). New York: Oxford University Press.
Csibra, G., Kushnerenko, E., & Grossmann, T. (2008). Electrophysiological methods in studying infant cognitive development. In C. Nelson & M. Luciana (Eds.), Handbook of developmental cognitive neuroscience, 2nd ed. (pp. 247–262). Cambridge, MA: MIT Press.
Cutler, A., & Carter, D. (1987). The predominance of strong initial syllables in the English vocabulary. Computer Speech and Language, 2, 133–142.
de Guibert, C., Maumet, C., Ferré, J.-C., Jannin, P., Biraben, A., Allaire, C., Barillot, C., & Le Rumeur, E. (2010). FMRI language mapping in children: A panel of language tasks using visual and auditory stimulation without reading or metalinguistic requirements. NeuroImage, 51 (2), 897–909.
Dehaene-Lambertz, G., & Dehaene, S. (1994). Speed and cerebral correlates of syllable discrimination in infants. Nature, 370, 292–295.
Dehaene-Lambertz, G., Dehaene, S., & Hertz-Pannier, L. (2002). Functional neuroimaging of speech perception in infants. Science, 298 (5600), 2013–2015.
Dehaene-Lambertz, G., Hertz-Pannier, L., Dubois, J., Mériaux, S., Roche, A., Sigman, M., et al. (2006). Functional organization of perisylvian activation during presentation of sentences in preverbal infants. Proceedings of the National Academy of Sciences U S A, 103, 14240–14245.
Dehaene-Lambertz, G., Montavont, A., Jobert, A., Allirol, L., Dubois, J., Hertz-Pannier, L., & Dehaene, S. (2010). Language or music, mother or Mozart? Structural and environmental influences on infants’ language networks. Brain & Language, 114 (2), 53–65.

Dittmar, M., Abbot-Smith, K., Lieven, E., & Tomasello, M. (2008). Young German children’s early syntactic competence: A preferential looking study. Developmental Science, 11 (4), 575–582.
Dubois, J., Dehaene-Lambertz, G., Perrin, M., Mangin, J.-F., Cointepas, Y., Duchesnay, E., et al. (2008). Asynchrony of the early maturation of white matter bundles in healthy infants: Quantitative landmarks revealed noninvasively by diffusion tensor imaging. Human Brain Mapping, 29 (1), 14–27.
Fonov, V. S., Evans, A. C., Botteron, K., Almli, C. R., McKinstry, R. C., Collins, D. L., et al. (2011). Unbiased average age-appropriate atlases for pediatric studies. NeuroImage, 54, 313–327.
Friederici, A. D. (2005). Neurophysiological markers of early language acquisition: From syllables to sentences. Trends in Cognitive Sciences, 9, 481–488.
Friederici, A. D., & Alter, K. (2004). Lateralization of auditory language functions: A dynamic dual pathway model. Brain and Language, 89, 267–276.
Friederici, A. D., & Wessels, J. M. (1993). Phonotactic knowledge and its use in infant speech perception. Perception and Psychophysics, 54, 287–295.
Friederici, A. D., Friedrich, M., & Christophe, A. (2007). Brain responses in 4-month-old infants are already language specific. Current Biology, 17 (14), 1208–1211.
Friederici, A. D., Friedrich, M., & Weber, C. (2002). Neural manifestation of cognitive and precognitive mismatch detection in early infancy. NeuroReport, 13, 1251–1254.
Friederici, A. D., Hahne, A., & Mecklinger, A. (1996). Temporal structure of syntactic parsing: Early and late event-related brain potential effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 1219–1248.
Friederici, A. D., Pfeifer, E., & Hahne, A. (1993). Event-related brain potentials during natural speech processing: Effects of semantic, morphological and syntactic violations. Cognitive Brain Research, 1, 183–192.
Friedrich, M., & Friederici, A. D. (2004). N400-like semantic incongruity effect in 19-month-olds: Processing known words in picture contexts. Journal of Cognitive Neuroscience, 16, 1465–1477.
Friedrich, M., & Friederici, A. D. (2005a). Phonotactic knowledge and lexical-semantic priming in one-year-olds: Brain responses to words and nonsense words in picture contexts. Journal of Cognitive Neuroscience, 17 (11), 1785–1802.
Friedrich, M., & Friederici, A. D. (2005b). Lexical priming and semantic integration reflected in the ERP of 14-month-olds. NeuroReport, 16 (6), 653–656.

Friedrich, M., & Friederici, A. D. (2005c). Semantic sentence processing reflected in the event-related potentials of one- and two-year-old children. NeuroReport, 16 (16), 1801–1804.
Friedrich, M., & Friederici, A. D. (2010). Maturing brain mechanisms and developing behavioral language skills. Brain and Language, 114, 66–71.
Gervain, J., Mehler, J., Werker, J. F., Nelson, C. A., Csibra, G., Lloyd-Fox, S., et al. (2011). Near-infrared spectroscopy: A report from the McDonnell infant methodology consortium. Developmental Cognitive Neuroscience, 1 (1), 22–46.
Gleitman, L. R., & Wanner, E. (1982). The state of the state of the art. In E. Wanner & L. Gleitman (Eds.), Language acquisition: The state of the art (pp. 3–48). Cambridge, UK: Cambridge University Press.

Gout, A., Christophe, A., & Morgan, J. L. (2004). Phonological phrase boundaries constrain lexical access. II. Infant data. Journal of Memory and Language, 51, 548–567.
Goyet, L., de Schonen, S., & Nazzi, T. (2010). Words and syllables in fluent speech segmentation by French-learning infants: An ERP study. Brain Research, 1332, 75–89.
Gratton, G., & Fabiani, M. (2001). Shedding light on brain function: The event-related optical signal. Trends in Cognitive Sciences, 5, 357–363.
Grossmann, T., Johnson, M. H., Lloyd-Fox, S., Blasi, A., Deligianni, F., Elwell, C., et al. (2008). Early cortical specialization for face-to-face communication in human infants. Proceedings of the Royal Society B, 275, 2803–2811.
Grossmann, T., Oberecker, R., Koch, S. P., & Friederici, A. D. (2010). Developmental origins of voice processing in the human brain. Neuron, 65, 852–858.
Guasti, M. T. (2002). Language acquisition: The growth of grammar. Cambridge, MA: MIT Press.
Hahne, A., & Friederici, A. D. (1999). Electrophysiological evidence for two steps in syntactic analysis: Early automatic and late controlled processes. Journal of Cognitive Neuroscience, 11, 194–205.
Hahne, A., & Friederici, A. D. (2002). Differential task effects on semantic and syntactic processes as revealed by ERPs. Cognitive Brain Research, 13, 339–356.
Hahne, A., & Jescheniak, J. D. (2001). What’s left if the Jabberwock gets the semantics? An ERP investigation into semantic and syntactic processes during auditory sentence comprehension. Cognitive Brain Research, 11, 199–212.
Hahne, A., Eckstein, K., & Friederici, A. D. (2004). Brain signatures of syntactic and semantic processes during children’s language development. Journal of Cognitive Neuroscience, 16, 1302–1318.

Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8, 393–402.
Hirsh-Pasek, K., & Golinkoff, R. M. (1996). The origins of grammar: Evidence from early language comprehension. Cambridge, MA: MIT Press.
Höhle, B., Bijeljac-Babic, R., Herold, B., Weissenborn, J., & Nazzi, T. (2009). Language specific prosodic preferences during the first half year of life: Evidence from German and French infants. Infant Behavior and Development, 32 (3), 262–274.
Höhle, B., Weissenborn, J., Schmitz, M., & Ischebeck, A. (2001). Discovering word order regularities: The role of prosodic information for early parameter setting. In J. Weissenborn & B. Höhle (Eds.), Approaches to bootstrapping. Phonological, lexical, syntactic and neurophysiological aspects of early language acquisition (Vol. 1, pp. 249–265). Amsterdam: John Benjamins.
Holcomb, P. J. (1993). Semantic priming and stimulus degradation: Implications for the role of the N400 in language processing. Psychophysiology, 30, 47–61.
Holcomb, P. J., Coffey, S. A., & Neville, H. J. (1992). Visual and auditory sentence processing: A developmental analysis using event-related brain potentials. Developmental Neuropsychology, 8, 203–241.
Homae, F., Watanabe, H., Nakano, T., Asakawa, K., & Taga, G. (2006). The right hemisphere of sleeping infant perceives sentential prosody. Neuroscience Research, 54 (4), 276–280.
Homae, F., Watanabe, H., Nakano, T., & Taga, G. (2007). Prosodic processing in the developing brain. Neuroscience Research, 59 (1), 29–39.
Houston, D. M., Jusczyk, P. W., Kuijpers, C., Coolen, R., & Cutler, A. (2000). Cross-language word segmentation by 9-month-olds. Psychonomic Bulletin & Review, 7, 504–509.
Imada, T., Zhang, Y., Cheour, M., Taulu, S., Ahonen, A., & Kuhl, P. K. (2006). Infant speech perception activates Broca’s area: A developmental magnetoencephalography study. NeuroReport, 17, 957–962.
Jusczyk, P. W., Cutler, A., & Redanz, N. J. (1993). Infants’ preference for the predominant stress patterns of English words. Child Development, 64, 675–687.
Jusczyk, P. W., Friederici, A. D., Wessels, J. M. I., Svenkerud, V., & Jusczyk, A. M. (1993). Infants’ sensitivity to the sound patterns of native language words. Journal of Memory and Language, 32, 402–420.
Jusczyk, P. W., Houston, D. M., & Newsome, M. (1999). The beginnings of word segmentation in English-learning infants. Cognitive Psychology, 39 (3–4), 159–207.

Page 29 of 36

Neural Correlates of the Development of Speech Perception and Compre hension Kaan, E., Harris, A., Gibson, E., & Holcomb, P. (2000). The P600 as an index of syntactic integration difficulty. Language and Cognitive Processes, 15, 159–201. Kooijman, V., Hagoort, P., & Cutler, A. (2009). Prosodic structure in early word segmenta tion: ERP evidence from Dutch ten-month-olds. Infancy, 14, 591–612. Kooijman, V., Johnson, E. K., & Cutler, A. (2008). Reflections on reflections of infant word recognition. In A. D. Friederici & G. Thierry (Eds.), Early language development: Bridging brain and behaviour (TiLAR 5, p. 91–114). Amsterdam: John Benjamins. Kuhl, P. K. (2004). Early language acquisition: Cracking the speech code. Nature Reviews Neuroscience, 5, 831–843. Kuhl, P. K., Conboy, B. T., Coffey-Corina, S., Padden, D., Rivera-Gaxiola, M., & Nelson, T. (2008). Phonetic learning as a pathway to language: New data and native language mag net theory expanded (NLM-e). Philosophical Transactions of the Royal Society B, 363, 979–1000. Kuijpers, C. T. L., Coolen, R., Houston, D., Cutler, A. (1998). Using the headturning tech nique to explore cross-linguistic performance differences. Advances in Infancy Research, 12, 205–220. Kujala, A., Huotilainen, M., Hotakainen, M., Lennes, M., Parkkonen, L., Fellman, et al. (2004). Speech-sound discrimination in neonates as measured with MEG. NeuroReport, 15 (13), 2089–2092. Kushnerenko, E., Ceponiene, R., Balan, P., Fellman, V., Huotilainen, M., & Näätänen, R. (2002b). Maturation of the auditory event-related potentials during the first year of life. NeuroReport, 13, 47–51. Kushnerenko, E., Ceponiene, R., Balan, P., Fellman, V., & Näätänen, R. (2002a). Matura tion of the auditory change detection response in infants: A longitudinal ERP study. Neu roReport, 13 (15), 1843–1846. Kushnerenko, E., Cheour, M., Ceponiene, R., Fellman, V., Renlund, M., Soininen, K., et al. (2001). 
Central auditory processing of durational changes in complex speech patterns by newborns: An event-related brain potential study. Developmental Neuropsychology, 19 (1), 83–97. Kutas, M., & van Petten, C. K. (1994). Psycholinguistics electrified: Event-related brain potential investigations. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 83–143). San Diego, CA: Academic Press. Leach, J. L., & Holland, S. K. (2010). Functional MRI in children: Clinical and research ap plications. Pediatric Radiology, 40, 31–49.

Page 30 of 36

Neural Correlates of the Development of Speech Perception and Compre hension Leppänen, P. H. T., Pikho, E., Eklund, K. M., & Lyytinen, H. (1999). Cortical re sponses of infants with and without a genetic risk for dyslexia: II. Group effects. NeuroRe port, 10, 969–973. Cambridge, MA: MIT Press. (p. 191)

Lloyd-Fox, S., Blasi, A., & Elwell, C. E. (2010) Illuminating the developing brain: The past, present and future of functional near infrared spectroscopy. Neuroscience and Biobehav ioural Reviews, 34 (3), 269–284. Li, X. S., Shu, H., Liu, Y. Y., & Li, P. (2006). Mental representation of verb meaning: Behav ioral and electrophysiological evidence. Journal of Cognitive Neuroscience, 18 (10), 1774– 1787. Männel, C., & Friederici, A. D. (2008). Event-related brain potentials as a window to children’s language processing: From syllables to sentences. In I. A. Sekerina, E. M. Fer nandez, & H. Clahsen (Eds.), Developmental psycholinguistics: On-line methods in children’s language processing (LALD 44, p. 29–72). Amsterdam: John Benjamins. Männel, C., & Friederici, A. D. (2009). Pauses and intonational phrasing: ERP studies in 5month-old German infants and adults. Journal of Cognitive Neuroscience, 21 (10), 1988– 2006. Männel, C., & Friederici, A. D. (2010). Prosody is the key: ERP studies on word segmenta tion in 6- and 12-month-old children. Journal of Cognitive Neuroscience, Supplement, 261. Männel, C., & Friederici, A. D. (2011). Intonational phrase structure processing at differ ent stages of syntax acquisition: ERP studies in 2-, 3-, and 6-year-old children. Develop mental Science, 14 (4), 786–798. Mestres-Misse, A., Rodriguez-Fornells, A., & Münte, T. F. (2010). Neural differences in the mapping of verb and noun concepts onto novel words. NeuroImage, 49, 2826–2835. Meyer, M., Alter, K., Friederici, A. D., Lohmann, G., & von Cramon, D. Y. (2002). fMRI re veals brain regions mediating slow prosodic modulations in spoken sentences. Human Brain Mapping, 17 (2), 73–88. Meyer, M., Steinhauer, K., Alter, K., Friederici, A. D., & von Cramon, D. Y. (2004). Brain activity varies with modulation of dynamic pitch variance in sentence melody. Brain and Language, 89 (2), 277–289. Mills, D. L., Coffey-Corina, S. A., & Neville, H. J. (1997). 
Language comprehension and cerebral specification from 13 to 20 months. Developmental Neuropsychology, 13 (3), 397–445. Mills, D. L., Plunkett, K., Prat, C., & Schafer, G. (2005). Watching the infant brain learn words: Effects of vocabulary size and experience. Cognitive Development, 20, 19–31.

Page 31 of 36

Neural Correlates of the Development of Speech Perception and Compre hension Mills, D. L., Prat, C., Zangl, R., Stager, C. L., Neville, H. J., & Werker, J. F. (2004). Lan guage experience and the organization of brain activity to phonetically similar words: ERP evidence from 14- and 20-month-olds. Journal of Cognitive Neuroscience, 16 (8), 1452–1464. Minagawa-Kawai, Y., Mori, K., Furuya, I., Hayashi R., & Sato, Y. (2002). Assessing cere bral representations of short and long vowel categories by NIRS. NeuroReport, 13, 581– 584. Minagawa-Kawai, Y., Mori, K., Hebden, J., & Dupoux, E. (2008). Optical imaging of in fants’ neurocognitive development: Recent advances and perspectives. Developmental Neurobiology, 68 (6), 712–728. Minagawa-Kawai, Y., Mori, K., Naoi, N., & Kojima, S. (2007). Neural attunement process es in infants during the acquisition of a language-specific phonemic contrast. Journal of Neuroscience, 27, 315–321. Minagawa-Kawai, Y., van der Lely, H., Ramus, F., Sato, Y., Mazuka, R., & Dupoux, E. (2011). Optical brain imaging reveals general auditory and language-specific processing in early infant development. Cerebral Cortex, 21 (2), 254–261. Moore-Parks, E. N., Burns, E. L., Bazzill, R., Levy, S., Posada, V., & Muller, R. A. (2010). An fMRI study of sentence-embedded lexical-semantic decision in children and adults. Brain and Language, 114 (2), 90–100. Morr, M. L., Shafer, V. L., Kreuzer, J., & Kurtzberg, D. (2002). Maturation of mismatch negativity in infants and pre-school children. Ear and Hearing, 23, 118–136. Muzik, O., Chugani, D. C., Juhasz, C., Shen, C., & Chugani, H. T. (2000). Statistical para metric mapping: Assessment of application in children. NeuroImage, 12, 538–549. Näätänen, R. (1990). The role of attention in auditory information processing as revealed by event-related potentials and other brain measures of cognitive function. Behavioral and Brain Sciences, 13, 201–288. Nazzi, T., Dilley, L. C., Jusczyk, A. 
M., Shattuck-Hufnagel, S., & Jusczyk, P. W. (2005). Eng lish-learning infants’ segmentation of verbs from fluent speech. Language and Speech, 48, 279–298. Nazzi, T., Iakimova, G., Bertoncini, J., Frédonie, S., & Alcantara, C. (2006). Early segmen tation of fluent speech by infants acquiring French: Emerging evidence for crosslinguistic differences. Journal of Memory and Language, 54, 283–299. Nazzi, T., Kemler Nelson, D. G., Jusczyk, P.W., & Jusczyk, A. M. (2000). Six-month-olds’ de tection of clauses embedded in continuous speech: Effects of prosodic well-formedness. Infancy, 1 (1), 123–147.

Page 32 of 36

Neural Correlates of the Development of Speech Perception and Compre hension Nobre, A. C., & McCarthy, G. (1994). Language-related ERPs: Scalp distributions and modulation by word type and semantic priming. Journal of Cognitive Neuroscience, 6 (33), 233–255. Oberecker, R., & Friederici, A. D. (2006). Syntactic event-related potential components in 24-month-olds’ sentence comprehension. NeuroReport, 17 (10), 1017–1021. Oberecker, R., Friedrich, M., & Friederici, A. D. (2005). Neural correlates of syntactic processing in two-year-olds. Journal of Cognitive Neuroscience, 17, 407–421. Obrig, H., & Villringer, A. (2003). Beyond the visible: Imaging the human brain with light. Journal of Cerebral Blood Flow and Metabolism, 23, 1–18. Okamoto, M., Dan, H., Shimizu, K., Takeo, K., Amita, T. Oda, I., et al. (2004). Multimodal assessment of cortical activation during apple peeling by NIRS and fMRI. NeuroImage, 21, 1275–1288. Osterhout, L., & Holcomb, P. J. (1993). Event-related brain potentials and syntactic anom aly: Evidence on anomaly detection during perception of continuous speech. Language and Cognitive Processes, 8, 413–437. Pannekamp, A., Toepel, U., Alter, K., Hahne, A., & Friederici, A. D. (2005). Prosody-driven sentence processing: An event-related brain potential study. Journal of Cognitive Neuro science, 17, 407–421. Pena, M., Maki, A., Kovacic, D., Dehaene-Lambertz, G., Koizumi, H., Bouquet, F., et al. (2003). Sounds and silence: An optical topography study of language recognition at birth. Proceedings of the National Academy of Sciences U S A, 100 (20), 11702–11705. Perani, D., Saccuman, M. C., Scifo, P., Spada, D., Andreolli, G., Rovelli, R., Baldoli, C., & Koelsch, S. (2010). Functional specializations for music processing in the human newborn brain. Proceedings of the National Academy of Sciences U S A, 107 (10), 4758–4763. Pihko, E., Leppänen, P. H. T., Eklund, K. M., Cheour, M., Guttorm, T. K., & Lyyti nen, H. (1999). 
Cortical responses of infants with and without a genetic risk for dyslexia: I. Age effects. NeuroReport, 10, 901–905. (p. 192)

Rivera-Gaxiola, M., Silva-Pereyra, J., & Kuhl, P. K. (2005). Brain potentials to native- and non-native speech contrasts in seven- and eleven-month-old American infants. Develop mental Science, 8, 162–172. Rivkin, M. J., Wolraich, D., Als, H., McAnulty, G., Butler, S., Conneman, N., et al. (2004). Prolonged T*[2] values in newborn versus adult brain: Implications for fMRI studies of newborns. Magnetic Resonance in Medicine, 51 (6), 1287–1291. Rossi, S., Gugler, M. F., Hahne, A., & Friederici, A. D. (2005). When word category infor mation encounters morphosyntax: An ERP study. Neuroscience Letters, 384, 228–233.

Page 33 of 36

Neural Correlates of the Development of Speech Perception and Compre hension Saito, Y., Aoyama, S., Kondo, T., Fukumoto, R., Konishi, N., Nakamura, K., Kobayashi, M., & Toshima, T. (2007a). Frontal cerebral blood flow change associated with infant-directed speech. Archives of Disease in Childhood. Fetal and Neonatal Edition, 92 (2), F113–F116. Saito, Y., Kondo, T., Aoyama, S., Fukumoto, R., Konishi, N., Nakamura, K., Kobayashi, M., & Toshima, T. (2007b). The function of the frontal lobe in neonates for response to a prosodic voice. Early Human Development, 83 (4), 225–230. Sambeth, A., Ruohio, K., Alku, P., Fellman, V., & Huotilainen, M. (2008). Sleeping new borns extract prosody from continuous speech. Clinical Neurophysiology, 119 (2), 332– 341. Sansavini, A., Bertoncini, J., & Giovanelli, G. (1997). Newborns discriminate the rhythm of multisyllabic stressed words. Developmental Psychology, 33 (1), 3–11. Schafer, A. J., Speer, S. R., Warren, P., & White, S. D. (2000). Intonational disambiguation in sentence production and comprehension. Journal of Psycholinguistic Research, 29, 169–182. Schapiro, M. B., Schmithorst, V. J., Wilke, M., Byars Weber, A., Strawsburg, R. H., & Hol land, S. K. (2004). BOLD fMRI signal increases with age in selected brain regions in chil dren. NeuroReport, 15 (17), 2575–2578. Schlaggar, B. L., Brown, T. T., Lugar, H. L., Visscher, K. M., Miezin, F. M., & Petersen, S. E. (2002). Functional neuroanatomical differences between adults and school-age chil dren in the processing of single words. Science, 296, 1476–1479. Schroeter, M. L., Zysset, S., Wahl, M., & von Cramon, D. Y. (2004). Prefrontal activation due to Stroop interference increases during development: An event-related fNIRS study. NeuroImage, 23, 1317–1325. Seidl, A. (2007). Infants’ use and weighting of prosodic cues in clause segmentation. Jour nal of Memory and Language, 57, 24–48. Seidl, A., & Johnson, E. K. E. (2007). 
Boundary alignment facilitates 11-month-olds’ seg mentation of vowel-initial words from speech. Journal of Child Language, 34, 1–24. Selkirk, E. (1984). Phonology and syntax: The relation between sound and structure. Cambridge, MA: MIT Press. Silva-Pereyra, J., Conboy, B. T., Klarman, L., & Kuhl, P. K. (2007). Grammatical processing without semantics? An event-related brain potential study of preschoolers using jabber wocky sentences. Journal of Cognitive Neuroscience, 19 (6), 1–16. Silva-Pereyra, J., Klarman, L., Lin, L. J., & Kuhl, P. K. (2005). Sentence processing in 30month-old children: An event-related potential study. NeuroReport, 16, 645–648.

Page 34 of 36

Neural Correlates of the Development of Speech Perception and Compre hension Silva-Pereyra, J., Rivera-Gaxiola, M., & Kuhl, P. K. (2005). An event-related brain potential study of sentence comprehension in preschoolers: Semantic and morphosyntactic pro cessing. Cognitive Brain Research, 23, 247–258. Skoruppa, K., Pons, F., Christophe, A., Bosch, L., Dupoux, E., Sebastián-Gallés, N., & Peperkamp, S. (2009). Language-specific stress perception by nine-month-old French and Spanish infants. Developmental Science, 12, 914–919. Soderstrom, M., Nelson, D. G. K., & Jusczyk, P. W. (2005). Six-month-olds recognize claus es embedded in different passages of fluent speech. Infant Behavior & Development, 28, 87–94. Soderstrom, M., Seidl, A., Nelson, D. G. K., & Jusczyk, P. W. (2003). The prosodic boot strapping of phrases: Evidence from prelinguistic infants. Journal of Memory and Lan guage, 49 (2), 249–267. Steinhauer, K., Alter, K., & Friederici, A. D. (1999). Brain potentials indicate immediate use of prosodic cues in natural speech processing. Nature Neuroscience, 2, 191–196. Szagun, G. (2006). Sprachentwicklung beim Kind. Weinheim: Beltz. Tan, A., & Molfese, D. L. (2009). ERP Correlates of noun and verb processing in preschool-age children. Biological Psychology, 8 (1), 46–51. Thierry, G., Vihman, M., & Roberts, M. (2003). Familiar words capture the attention of 11month-olds in less than 250 ms. NeuroReport, 14, 2307–2310. Torkildsen, J. V. K., Sannerud, T., Syversen, G., Thormodsen, R., Simonsen, H. G., Moen, I., et al. (2006). Semantic organization of basic level words in 20-month-olds: An ERP study. Journal of Neurolinguistics, 19, 431–454. Torkildsen, J. V. K., Syversen, G., Simonsen, H. G., Moen, I., Smith, L., & Lindgren, M. (2007). Electrophysiological correlates of auditory semantic priming in 24-month-olds. Journal of Neurolinguistics, 20, 332–351. Trainor, L., Mc Fadden, M., Hodgson, L., Darragh Barlow, J., Matsos, L., & Sonnadara, R. (2003). 
Changes in auditory cortex and the development of mismatch negativity between 2 and 6 months of age. International Journal of Psychophysiology, 51, 5–15. Tsao, F.-M., Liu, H.-M., & Kuhl, P. K. (2004). Speech perception in infancy predicts lan guage development in the second year of life: A longitudinal study. Child Development, 75, 1067–1084. Vannest, J., Karunanayaka, P. R., Schmithorst, V. J., Szaflarski, J. P., & Holland, S. K. (2009). Language networks in children: Evidence from functional MRI studies. American Journal of Roentgenology, 192 (5), 1190–1196.

Page 35 of 36

Neural Correlates of the Development of Speech Perception and Compre hension Villringer, A., & Chance, B. (1997). Noninvasive optical spectroscopy and imaging of hu man brain function. Trends in Neuroscience, 20, 435–442. Weber, C., Hahne, A., Friedrich, M., & Friederici, A. D. (2004). Discrimination of word stress in early infant perception: Electrophysiological evidence. Cognitive Brain Research, 18, 149–161. West, W. C., & Holcomb, P. J. (2002). Event-related potentials during discourse-level se mantic integration of complex pictures. Cognitive Brain Research, 13, 363–375. Wilke, M., Holland, S. K., Altaye, M., & Gaser, C. (2008). Template-O-Matic: A toolbox for creating customized pediatric templates. NeuroImage, 41 (3), 903–913. Yamada, Y., & Neville, H. J. (2007). An ERP study of syntactic processing in English and nonsense sentences. Brain Research, 1130, 167–180. Yeatman, J. D., Ben-Shachar, M., Glover, G. H., & Feldman, H. M. (2010). Individual differ ences in auditory sentence comprehension in children: An exploratory event-related func tional magnetic resonance imaging investigation. Brain & Language, 114 (2), 72–79.

Angela Friederici

Angela D. Friederici, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.

Claudia Männel

Claudia Männel, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany.


Perceptual Disorders

Josef Zihl
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0010

Abstract and Keywords

Perceptual processes provide the basis for mental representation of the visual, auditory, olfactory, gustatory, somatosensory, and social “worlds” as well as for guiding and controlling cognitive, social, and motor activities. All perceptual systems (i.e., vision, audition, somatosensory perception, smell and taste, and social perception) are segregated functional networks and show a parallel-hierarchical type of organization of information processing and encoding. In pathological conditions such as acquired brain injury, perceptual functions and abilities can be variably affected, ranging from the loss of stimulus detection to impaired recognition. Despite the functional specialization of perceptual systems, association of perceptual deficits within sensory modalities is the rule, and disorders of a single perceptual function or ability are rare. This chapter describes cerebral visual, auditory, somatosensory, olfactory, and gustatory perceptual disorders within a neuropsychological framework. Disorders in social perception are also considered because they represent a genuine category of perceptual impairments.

Keywords: vision, audition, somatosensory perception, smell, taste, social perception, cerebral perceptual disorders

Introduction

Perception “is the process or result of becoming aware of objects, relationships, and events by means of the senses,” which includes activities such as detecting, discriminating, identifying, and recognizing. “These activities enable organisms to organize and interpret the stimuli received into meaningful knowledge” (APA, 2007). Perception is constructed in the brain and involves lower and higher level processes that serve simpler and more complex perceptual abilities such as detection, identification, and recognition (Mather, 2006). The behavioral significance of perception lies not only in the processing of stimuli as a basis for mental representation of the visual, auditory, olfactory, gustatory, somatosensory, and social “worlds” but also in the guidance and control of activities. Thus there exists a reciprocal interaction between perception, cognition, and action. For perceptual activities, attention, memory, and executive functions are crucial prerequisites. They form the bases for focusing on stimuli and maintaining attention during stimulus acquisition and processing, storing percepts as experience and concepts, and controlling input and output activities that allow for an optimal, flexible adaptation to extrinsic and intrinsic challenges.

The aim of this chapter is to describe the effect of pathological conditions, particularly acquired brain injury, on the various abilities in the domains of vision, audition, somatosensory perception, smell and taste, and social perception as well as the behavioral consequences and the significance of these disorders for the understanding of brain organization. Perceptual disorders can result from injury to the afferent sensory pathways and/or to their subcortical and cortical processing and coding stages. Peripheral injury usually causes “lower level” dysfunctions (e.g., threshold elevation or difficulties with stimulus localization and sensory discrimination), whereas central injuries cause “higher level” perceptual dysfunctions (e.g., in the domains of identification and recognition). However, peripheral sensory deficits may also be associated with higher perceptual disorders because the affected sensory functions and their interactions represent a crucial prerequisite for more complex perceptual abilities (i.e., detection and discrimination of stimuli build the basis for identification and recognition).

Vision

Visual perception comprises lower level visual abilities (i.e., the visual field, visual acuity, contrast sensitivity, color and form vision, and stereopsis) and higher level visual abilities, in particular visual identification and recognition. Visual perceptual abilities also form the basis for visually guided behavior, such as oculomotor activities, hand and finger movements, and spatial navigation. From its very beginning, visual neuroscience has been concerned with the analysis of the various visual perceptual deficits and the identification of the location of the underlying brain injury. Early clinical reports on patients have already demonstrated the selective loss of visual abilities after acquired brain injury. These observations have suggested a functional specialization of the visual cortex, a concept verified many years later by combined anatomical, electrophysiological, and behavioral evidence (Desimone & Ungerleider, 1989; Grill-Spector & Malach, 2004; Orban, 2008; Zeki, 1993). The primary visual cortical area (striate cortex, Brodmann area 17, visual area 1, or V1) receives its input from the retina via the lateral geniculate nucleus (LGN) and possesses a highly accurate, topographically organized representation of the retina and thus of the visual field. The central visual field occupies a large proportion of the striate cortex; about half of the cortical surface is devoted to the central 10 degrees of the visual field, which is only 1 percent of the visual field (Tootell, Hadjikhani, Mendola, Marrett, & Dale, 1998). In addition, V1 distributes specific visual signals to the other visual areas that are located in the surrounding cortex (for a review, see Bullier, 2003). This anatomical and functional organization enables the visual brain to deal with the processing of global and local features of visual objects and scenes.
The result of processing at distinct levels of complexity at each stage can be flexibly and dynamically integrated into coherent perception (Bartels & Zeki, 1998; Tootell et al., 1998; Zeki, 1993; Zeki & Bartels, 1998). Because of the inhomogeneity of spatial resolution and acuity in the visual field (Anstis, 1974), the field size for processing visual details (e.g., form vision) is much smaller, comprising the inner 9 degrees of the binocular visual field (i.e., macular region; Henderson, 2003). Occipital-parietal, posterior parietal, and prefrontal mechanisms guarantee rapid global context extraction as well as visual spatial working memory (Bar, 2004; Henderson, 2003; Hochstein & Ahissar, 2002).

Ungerleider and Mishkin (1982) have characterized the functional specialization of the visual brain as consisting of two processing streams: The “where” pathway or dorsal route, comprising occipital-parietal visual areas and connections, is specialized in space processing; and the “what” pathway or ventral route, comprising occipital-temporal visual areas and connections, is specialized in object processing. According to Milner and Goodale (2008), information processed in the dorsal pathway is used for the implicit visual guidance of actions, whereas explicit perception is associated with processing in the ventral stream. Because visual perception usually involves both space- and object-based information processing, cooperation and interaction between the two visual streams are required (Goodale & Westwood, 2004). In addition, both routes interact either directly or indirectly via attention involving the inferior parietal cortex (Singh-Curry & Husain, 2009) and working memory involving the prefrontal cortex (Goodale & Westwood, 2004; Oliveri et al., 2001). Eye movements play a crucial role in visual processing and thus in visual perception (for a comprehensive review, see Martinez-Conde, Macknik, & Hubel, 2004).
The posterior thalamus and its reciprocal connections with cortical regions in the occipital, parietal, and frontal lobes and with the limbic neocortex form a cortical-subcortical network subserving intentionally guided and externally triggered attention as well as saccadic eye movements that are involved in visual information processing (e.g., Andersson et al., 2007; Dean, Crowley, & Platt, 2004; Himmelbach, Erb, & Karnath, 2006; Nobre, 2001; Olson et al., 2000; Schiller & Tehovnik, 2001, 2005). Complex visual stimuli (e.g., objects and faces) are coded as specific categories in extrastriate regions in the ventral visual pathway (Grill-Spector, 2003; Sigala, 2004; Wierenga et al., 2009). Top-down processes involving the prefrontal cortex facilitate visual object recognition (Bar, 2003), and hippocampal-dependent memory builds the basis for experience-dependent visual scanning (Smith & Squire, 2008). Yet, it is still unclear how the brain eventually codes complex visual stimuli for accurate identification and recognition; it appears, however, that complex visual stimuli are simultaneously represented in two parallel and hierarchically organized processing systems in the ventral and dorsal visual pathways (Konen & Kastner, 2008).

About 30 percent of patients with acquired brain injury suffer from visual disorders (Clarke, 2005; Rowe et al., 2009; Suchoff et al., 2008). Lower level visual functions and abilities (e.g., visual detection and localization, visual acuity, contrast sensitivity, and color discrimination) may be understood as perceptual architecture, whereas higher level, visual-cognitive capacities (e.g., text processing and recognition) also involve learning and memory processes as well as executive functions. Selective visual disorders after brain injury are the exception rather than the rule because small “strategic” lesions are very rare and visual cortical areas are intensely interconnected. Injury to the visual brain, that is, to visual cortical areas and fiber connections, therefore commonly causes an association of visual disorders.

Visual Field

A homonymous visual field defect is defined as a restriction of the normal visual field caused by injury to the afferent postchiasmatic visual pathway, that is, an interruption in the flow of visual information between the optic chiasm and the striate cortex. Homonymous visual field disorders are characterized by partial or total blindness in corresponding visual field regions of each eye. In the case of unilateral postchiasmatic brain injury, vision may be lost in the left or right hemifield (homonymous left- or right-sided hemianopia), the left or right upper or lower quadrants (homonymous upper or lower quadranopia in the left or right hemifield), or a restricted portion in the parafoveal visual field (paracentral scotoma). The most common type of homonymous visual field disorders is hemianopia (loss of vision in one hemifield), followed by quadranopia (loss of vision in one quadrant) and paracentral scotoma (island of blindness in the parafoveal field region). Visual field defects are either absolute (complete loss of vision, anopia) or relative (depressed vision, amblyopia, hemiachromatopsia). Homonymous amblyopia typically affects the entire hemifield (hemiamblyopia), and homonymous achromatopsia (i.e., the selective loss of color vision) typically affects one hemifield (hemiachromatopsia) or the upper quadrant. Visual field defects differ with respect to visual field sparing. Foveal sparing refers to sparing of the foveal region (1 degree), macular sparing refers to the preservation of the macular region (5 degrees), and macular splitting refers to a sparing of less than 5 degrees (for review, see Harrington & Drake, 1990). In the majority of patients (71.5 percent of 876 cases), field sparing does not exceed 5 degrees. As a rule, patients with small visual field sparing are more disabled, especially with regard to reading.
Stroke represents the most common etiology, but other etiologies such as traumatic brain injury, tumors, multiple sclerosis, and posterior cortical atrophy may also cause homonymous visual field disorders (see Zihl, 2011).

About 80 percent of patients (n = 157) with unilateral homonymous visual field loss suffer from functional impairments in reading (hemianopic dyslexia) and/or in global perception and overview (Zihl, 2011). Homonymous visual field loss causes a restriction of the field of view, which prevents the rapid extraction of the entire spatial configuration of the visual environment. It therefore impairs the top-down and bottom-up interactions that are required for efficient guidance of spatial attention and oculomotor activities during scene perception and visual search. Patients with additional injury to the posterior thalamus or the occipital white matter route (i.e., fiber pathways to the dorsal visual route and pathways connecting occipital, parietal, temporal, and frontal cortical areas) show disorganized oculomotor scanning behavior (Zihl & Hebel, 1997; Mort & Kennard, 2003). The impairments in global perception and visual scanning shown by these patients are more severe than those resulting from visual field loss alone (Zihl, 1995a). Interestingly, about 20 percent show spontaneous substitution of visual field loss by oculomotor compensation and thus enlargement of the field of view; the percentage is even higher in familiar surroundings because patients can make use of their spatial knowledge of the surroundings (Zihl, 2011). In normal subjects, global visual perception is based on the visual field within which they can simultaneously detect and process visual stimuli. The visual field can be enlarged by eye shifts, which typically extend 50 degrees in all directions (Leigh & Zee, 2006). The resulting field of view is thus defined by the extent of the visual field when moving the eyes in global visual perception (see also Pambakian, Mannan, Hodgson, & Kennard, 2004).

Reading is impaired in patients with unilateral homonymous field loss and visual field sparing of less than 5 degrees to the left and less than 8 degrees to the right of the fovea. In reading, the visual brain relies on a gestalt-type visual word-form processing, the “reading span.” It is asymmetrical (larger to the right in left-to-right orthographies) and is essential for the guidance of eye movements during text processing (Rayner, 1998). However, insufficient visual field sparing does not appear to be the only factor causing persistent “hemianopic” dyslexia. The extent of brain injury affecting in particular the occipital white matter seems to be crucial in this regard (Schuett, Heywood, Kentridge, & Zihl, 2008a; Zihl, 1995b). That reading is impaired at the pre-semantic visual sensory level is supported by the outcome of treatment procedures involving practice with nontext material, which have been found to be as effective as word material in reestablishing eye movement reading patterns and improving reading performance (Schuett, Heywood, Kentridge, & Zihl, 2008b).

In the case of bilateral postchiasmatic brain injury, both visual hemifields are affected, resulting in bilateral homonymous hemianopia (“tunnel vision”), bilateral upper or lower hemianopia, bilateral paracentral scotoma, or central scotoma. Patients with bilateral visual field disorders suffer from similar, but typically more debilitating, visual impairments in global visual perception and reading.
A central scotoma is a very dramatic form of homonymous visual field loss because foveal vision is either totally lost or depressed (central amblyopia). The reduction or loss of vision in the central part of the visual field is typically associated with a corresponding loss of visual spatial contrast sensitivity, visual acuity, and form, object, and face perception. The loss of foveal vision also causes a loss of the central reference for optimal fixation and of the straight-ahead direction as well as an impairment of the visual-spatial guidance of saccades and hand-motor responses. As a consequence, patients cannot accurately fixate a visual stimulus and shift their gaze from one stimulus to another, scan a scene or a face, and guide their eye movements during scanning and reading. Patients therefore show severe impairments in locating objects, recognizing objects and faces, finding their way in rooms or places, and reading, and often get lost when scanning a word or a scene (Zihl, 2011).
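
The visual field sparing thresholds for reading quoted above (less than 5 degrees to the left, less than 8 degrees to the right of the fovea) can be written as a simple check. The following is only an illustrative sketch: the function name, parameter names, and the reading of the thresholds as applying to the side of the field loss are assumptions made here, not a clinical procedure from the cited work. As the text notes, insufficient sparing is not the only factor in hemianopic dyslexia, so a result of `False` does not imply unimpaired reading.

```python
# Sparing thresholds in degrees from the fovea, as quoted in the text.
LEFT_SPARING_THRESHOLD_DEG = 5.0
RIGHT_SPARING_THRESHOLD_DEG = 8.0


def sparing_below_reading_threshold(loss_side: str, sparing_deg: float) -> bool:
    """Return True if field sparing falls below the quoted threshold.

    loss_side   -- hypothetical label for the side of the homonymous
                   field loss: "left" or "right"
    sparing_deg -- spared visual field on that side, in degrees from
                   the fovea (must be non-negative)
    """
    thresholds = {
        "left": LEFT_SPARING_THRESHOLD_DEG,
        "right": RIGHT_SPARING_THRESHOLD_DEG,
    }
    if loss_side not in thresholds:
        raise ValueError("loss_side must be 'left' or 'right'")
    if sparing_deg < 0:
        raise ValueError("sparing_deg must be non-negative")
    # Strict comparison: the text says "less than" the threshold.
    return sparing_deg < thresholds[loss_side]
```

The asymmetry of the two constants mirrors the asymmetry of the reading span in left-to-right orthographies; for other orthographies the values would presumably differ.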

Visual Acuity, Spatial Contrast Sensitivity, and Visual Adaptation

After unilateral postchiasmatic brain injury, visual acuity is usually not significantly reduced, except for cases in which the optic tract is involved (Frisén, 1980). After bilateral postchiasmatic injury, visual acuity can be normal, gradually diminished, or totally lost (i.e., form vision is no longer possible) (Symonds & MacKenzie, 1957). This reduction in visual acuity cannot be improved by optical correction.

When spatial contrast sensitivity is reduced, patients usually complain of “blurred” or “foggy” vision despite normal visual acuity, accommodation, and convergence (Walsh, 1985). Impairments of contrast sensitivity have been reported in cerebrovascular diseases (Bulens, Meerwaldt, van der Wildt, & Keemink, 1989; Hess, Zihl, Pointer, & Schmid, 1990); after closed head trauma, encephalitis, and hypoxia (Hess et al., 1990); in Parkinson’s disease (Bulens, Meerwaldt, van der Wildt, & Keemink, 1986; Uc et al., 2005); multiple sclerosis (Gal, 2008); and dementia of the Alzheimer type (Jackson & Owsley, 2003). Bulens et al. (1989) have suggested that impairments of contrast sensitivity for high spatial frequencies mainly occur after occipital injury, whereas impairments of sensitivity for lower spatial frequencies occur after temporal or parietal injury. Depending on the severity of the sensitivity loss, patients have difficulties with depth perception, text processing, face perception, and visual recognition. Because reduction in spatial contrast sensitivity is not necessarily associated with reduced visual acuity, assessing visual acuity alone is not sufficient for detecting impaired spatial contrast sensitivity.

Color Vision

Color vision may be lost in the contralateral hemifield (homonymous hemiachromatopsia) or in the upper quadrant after unilateral occipital-temporal brain injury. Because light sensitivity and form vision are not impaired in the affected hemifield, the loss of color vision is selective (e.g., Short & Graff-Radford, 2001). Patients are usually aware of this disorder and report that the corresponding part of the visual environment appears “pale,” in “black and white,” or “like in an old movie.” In the case of cerebral dyschromatopsia, foveal color vision is affected with or without the concomitant loss of color vision in the peripheral visual field (Koh et al., 2008; Rizzo, Smith, Pokorny, & Damasio, 1993). Patients with cerebral dyschromatopsia find it difficult to discriminate fine color hues. Bilateral occipital-temporal injury causes moderate or severe loss of color vision in the entire visual field, which is called cerebral achromatopsia (Bouvier & Engel, 2006; Heywood & Kentridge, 2003; Meadows, 1974); yet, discrimination of grays (Heywood, Wilson, & Cowey, 1987) and even processing of wavelength differences (Heywood & Kentridge, 2003) may be spared. Consequently, discriminating and sorting of colors and associating color stimuli with their names and with particular objects (e.g., yellow and banana; green and grass) are affected. Patients may report that objects and pictures appear “drained of color,” as “dirty brownish” or “reddish,” or as “black and white.” Cerebral hemiachromatopsia is a rather rare condition. Among 1,020 patients with unilateral homonymous visual field disorders after acquired posterior brain injury, we found thirty cases (3.9 percent) with unilateral hemiachromatopsia and impaired foveal color discrimination; among 130 cases with bilateral occipital injury, sixteen cases (12.3 percent) showed complete cerebral achromatopsia.
Partial cerebral achromatopsia may also occur and may be associated with impaired color constancy (Kennard, Lawden, Morland, & Ruddock, 1995). The ventral occipital-temporal cortex is the critical lesion location of color vision deficits (Bouvier & Engel, 2006; Heywood & Kentridge, 2003). Color vision may also be impaired in (mild) hypoxia (Connolly, Barbur, Hosking, & Moorhead, 2008), multiple sclerosis (Moura et al., 2008), Parkinson’s disease (Müller, Woitalla, Peters, Kohla, & Przuntek, 2002), and dementia of the Alzheimer type (Jackson & Owsley, 2003). Furthermore, color hue discrimination accuracy can be considerably reduced in older age (Jackson & Owsley, 2003).

Spatial Vision

Disorders in visual space perception comprise deficits in visual localization, depth perception, and perception of visual spatial axes. Brain injury can differentially affect retinotopic, spatiotopic, egocentric, and allocentric frames of reference. Visual-spatial disorders typically occur after occipital-parietal and posterior parietal injury; a right-hemisphere injury more frequently causes visual spatial impairments (for comprehensive reviews, see Farah, 2003; Karnath & Zihl, 2003; Landis, 2000).

After unilateral brain injury, moderately defective visual spatial localization is typically found in the contralateral hemifield, but may also be present in the foveal visual field (Postma, Sterken, de Vries, & de Haan, 2000), where it is associated with less accurate saccadic localization. Patients with bilateral posterior brain injury, in contrast, show moderate to severe localization inaccuracy in the entire visual field, which typically affects all visually guided activities, including accurately fixating objects, reaching for objects, and reading and writing (Zihl, 2011). Interestingly, patients with parietal lobe injury can show dissociation between spatial perception deficits and pointing errors (Darling, Bartelt, Pizzimenti, & Rizzo, 2008), indicating that inaccurate pointing cannot always be explained in terms of defective localization but may represent a genuine disorder (optic ataxia; see Caminiti et al., 2010).

Impaired monocular and binocular depth perception (astereopsis) has been observed in patients with unilateral and bilateral posterior brain injury, with bilateral injury causing more severe deficits. Defective depth perception may cause difficulties in pictorial depth perception, walking (downstairs), and reaching for objects or handles (Koh et al., 2008; Miller et al., 1999; Turnbull, Driver, & McCarthy, 2004).
Impaired distance perception, in particular in the peripersonal space, has mainly been observed after bilateral occipital-parietal injury (Berryhill, Fendrich, & Olson, 2009).

Shifts in the vertical and horizontal axes have been reported particularly in patients with right occipital-parietal injury (Barton, Behrmann, & Black, 1998; Bonan, Leman, Legargasson, Guichard, & Yelnik, 2006). Right-sided posterior parietal injury can also cause ipsilateral and contralateral shifts in the visually perceived trunk median plane (Darling, Pizzimenti, & Rizzo, 2003). Occipital injury more frequently causes contralateral shifts in spatial axes, whereas posterior parietal injury also causes ipsilateral shifts. Barton and Black (1998) suggested that the contralateral midline shift of hemianopic patients is “a consequence of the strategic adaptation of attention into contralateral hemispace after hemianopia” (p. 660), that is, that a change in attentional distribution might cause an abnormal bias in line bisection. In a study of 129 patients with homonymous visual field loss, we found the contralateral midline shift in more than 90 percent of cases. However, the line bisection bias was not associated with efficient oculomotor compensation for the homonymous visual field loss. In addition, visual field sparing did not modulate the degree of midline shift. Therefore, the subjective straight-ahead deviation may be explained as a consequence of a systematic, contralesional shift of the egocentric visual midline and may therefore represent a genuine visual-spatial perceptual disorder (Zihl, Sämann, Schenk, Schuett, & Dauner, 2009). This idea is supported by Darling et al. (2003), who reported difficulties in visual perception of the trunk-fixed anterior-posterior axis in patients with left- or right-sided unilateral posterior parietal lesions without visual field defects.

Visual Motion Perception

Processing of direction and speed of visual motion stimuli is a genuine visual ability. However, in order to know how objects move in the world, we must take into account the rotation of our eyes as well as of our head (Bradley, 2004; Snowden & Freeman, 2004). Motion perception also enables recognition of biological movements (Giese & Poggio, 2003) and supports face perception (Roark, Barrett, Spence, Abdi, & O’Toole, 2003). Visual area V5 activity is the most critical basis for generating motion perception (Moutoussis & Zeki, 2008), whereas superior temporal and premotor areas subserve biological motion perception (Saygin, 2007).

The first well-documented case of loss of visual motion perception (cerebral akinetopsia) is L.M. After bilateral temporal-occipital cerebrovascular injury, she completely lost movement vision in all three dimensions, except for detection and direction discrimination of single targets moving at low speed with elevated thresholds. In contrast, all other visual abilities, including the visual field, visual acuity, color vision, stereopsis, and visual recognition, were spared, as was motion perception in the auditory and tactile modalities. Her striking visual-perceptual impairment could not be explained by spatial or temporal processing deficits, impaired contrast sensitivity (Hess, Baker, & Zihl, 1989), or generalized cognitive slowing (Zihl, von Cramon, & Mai, 1983; Zihl, von Cramon, Mai, & Schmid, 1991). L.M. was also unable to search for a moving target among stationary distractor stimuli in a visual display (McLeod, Heywood, Driver, & Zihl, 1989) and could not see biological motion stimuli (McLeod, Dittrich, Driver, Perrett, & Zihl, 1996), including facial movements in speech reading (Campbell, Zihl, Massaro, Munhall, & Cohen, 1997). She could not extract shape from motion and lost apparent motion perception (Rizzo, Nawrot, & Zihl, 1995). Because of her akinetopsia, L.M. was severely handicapped in all activities involving visual motion perception, whereby perception and action were similarly affected (Schenk, Mai, Ditterich, & Zihl, 2000).

Selective impairment of movement vision in terms of threshold elevation for speed and direction has also been reported in the hemifield contralateral to unilateral posterior brain injury for motion types of different complexity, combined and separately (Billino, Braun, Bohm, Bremmer, & Gegenfurtner, 2009; Blanke, Landis, Mermoud, Spinelli, & Safran, 2003; Braun, Petersen, Schoenle, & Fahle, 1998; Plant, Laxer, Barbaro, Schiffman, & Nakayama, 1993; Vaina, Makris, Kennedy, & Cowey, 1998).


Visual Identification and Visual Recognition

Visual agnosia is the inability to identify, recognize, interpret, or comprehend the meaning of visual stimuli even though basic visual functions (i.e., the visual field, visual acuity, spatial contrast sensitivity, color vision, and form discrimination) are intact or at least sufficiently preserved. Visual agnosia either results from defective visual perception (e.g., synthesis of features; apperceptive visual agnosia) or from the loss of the “bridge” between the visual stimulus and its semantic associations (e.g., label, use, history; associative or semantic visual agnosia). However, objects can be recognized in the auditory and tactile modalities, and the disorder cannot be explained by supramodal cognitive or aphasic deficits (modified after APA, 2007). Lissauer (1890) interpreted apperceptive visual agnosia as “mistaken identity” because incorrectly identified objects share global (e.g., size and shape) and/or local properties (e.g., color, texture, form details) with other objects, which causes visual misidentification. Cases with pure visual agnosia seem to be the exception rather than the rule (Riddoch, Johnston, Bracewell, Boutsen, & Humphreys, 2008). Therefore, a valid and unequivocal differentiation between a “genuine” visual agnosia and secondary impairments in visual identification and recognition resulting from other visual deficits is often difficult, in particular concerning the integration of global and local information (Delvenne, Seron, Coyette, & Rossion, 2004; Thomas & Forde, 2006). In a group of 1,216 patients with acquired injury to the visual brain we have found only seventeen patients (about 2.4 percent) with genuine visual agnosia. Visual agnosia is typically caused by bilateral occipital-temporal injury (Barton, 2008a) but may also occur after left- (Barton, 2008b) or right-sided posterior brain injury (Landis, Regard, Bliestle, & Kleihues, 1988).
There also exist progressive forms of visual agnosia in posterior cortical atrophy and in early stages of dementia (Nakachi et al., 2007; Rainville et al., 2006).

Farah (2000) has proposed a useful classification of visual agnosia according to the type of visual material patients find difficult to identify and recognize. Patients with visual object and form agnosia are unable to visually recognize complex objects or pictures. There exist category-specific types of object agnosia, such as for living and nonliving things (Thomas & Forde, 2006), animals or artifacts (Takarae & Levin, 2001). A particular type of visual object agnosia is visual form agnosia. The most elaborated case with visual form agnosia is D.F. (Milner et al., 1991). After extensive damage to the ventral processing stream due to carbon monoxide poisoning, this patient showed a more or less complete loss of form perception, including form discrimination, despite having a visual resolution capacity of 1.7 minutes of arc. Visually guided activities such as pointing to or grasping for an object, however, were spared (Carey, Dijkerman, Murphy, Goodale, & Milner, 2006; James, Culham, Humphrey, Milner, & Goodale, 2003; McIntosh, Dijkerman, Mon-Williams, & Milner, 2004). D.F. also showed profound inability to visually recognize objects, places, and faces, indicating a more global rather than selective visual agnosia. Furthermore, D.F.’s visual disorder may also be explained in terms of an allocentric spatial deficit rather than as a perceptual deficit (Schenk, 2006). As Goodale and Westwood (2004) have pointed out, the proposed ventral-dorsal division in visual information processing may not be as exclusive as assumed, and both routes interact at various stages. However, automatic obstacle avoidance was intact in D.F. while correct grasping was possible for simple objects only (McIntosh et al., 2004), suggesting that the “what” pathway plays no essential role in detecting and localizing objects or in the spatial guidance of walking (Rice et al., 2006).

Further cases of visual form agnosia after carbon monoxide poisoning have been reported by Heider (2000). Despite preserved visual acuity and only minor visual field defects, patients were severely impaired in shape and form discrimination, whereas the perception of color, motion, and stereoscopic depth was relatively unimpaired. Heider (2000) identified a failure in figure–ground segregation and in grouping single elements of a composite visual scene into a “gestalt” as the main underlying deficit.

Global as well as local processing can be affected after right- and left-sided occipital-temporal injury (Rentschler, Treutwein, & Landis, 1994); yet, typically patients find it more difficult to process global features and integrate them into a whole percept (integrative or simultaneous agnosia; Behrmann & Williams, 2007; Saumier, Arguin, Lefebvre, & Lassonde, 2002; Thomas & Forde, 2006). Consequently, patients are unable to report more than one attribute of a single object (Coslett & Lie, 2008). Encoding the spatial arrangements of parts of an object requires a mechanism that is different from that required for encoding the shape of individual parts, with the former selectively compromised in integrative agnosia (Behrmann, Peterson, Moscovitch, & Suzuki, 2006). Integration of multiple object stimuli into a holistic interpretation seems to depend on the spatial distance of local features and elements (Huberle & Karnath, 2006). Yet, shifting fixation and thus also attention to all elements of an object in a regular manner seems not sufficient to “bind” together the different elements of spatially distributed stimuli (Clavagnier et al., 2006).
The integration of multiple visual elements resulting in a conscious perception of their gestalt seems to rely on bilateral structures in the human lateral and medial inferior parietal cortex (Himmelbach, Erb, Klockgether, Moskau, & Karnath, 2009). An alternative explanation for the impairment in global visual perception is shrinkage of the field of attention and thus perception (Michel & Henaff, 2004), which might be elicited by attentional capture (“radical visual capture”) to single, local elements (Takaiwa, Yoshimura, Abe, & Terai, 2003; Dalrymple, Kingstone, & Barton, 2007). The pathological restriction and rigidity of attention impair the integration of multiple visual elements to a gestalt, but the type of capture depends on the competitive balance between global and local salience. The impaired disengaging of attention causes inability to “unlock” attention from the first object or object element to other objects or elements of objects (Pavese, Coslett, Saffran, & Buxbaum, 2002). Interestingly, facial expressions of emotion are less affected in simultanagnosia, indicating that facial stimuli constitute a specific category of stimuli that attract attention more effectively and are possibly processed before attentional engagement (Pegna, Caldara-Schnetzer, & Khateb, 2008). It has been proposed that differences in local relative to more global visual processing can be explained by different processing modes in the dorsal and medial ventral visual pathways at an extrastriate level; these characteristics can also explain category-specific deficits in visual perception (Riddoch et al., 2008). The dual-route organization of visual information has also been applied to local–global perception.
Difficulties with processing of multiple stimulus elements or features (within-object representation) are often referred to as “ventral” simultanagnosia, and impaired processing of multiple spatial stimuli (between-object representation) as “dorsal” simultanagnosia (Karnath, Ferber, Rorden, & Driver, 2000). Dorsal simultanagnosia is one component of the Bálint-Holmes syndrome, which consists of spatial (and possibly temporal) restriction of the field of visual attention and thus visual processing and perception, impaired visual spatial localization and orientation, and defective depth perception (Moreaud, 2003; Rizzo & Vecera, 2002). In addition, patients with severe Bálint’s syndrome find it extremely difficult to shift their gaze voluntarily or on command (oculomotor apraxia or psychic paralysis of gaze) and are unable to direct movement of an extremity in space under visual guidance (optic or visuomotor ataxia). As a consequence, visually guided oculomotor and hand motor activities, visual-constructive abilities, visual orientation, recognition, and reading are severely impaired (Ghika, Ghika-Schmid, & Bogousslavsky, 1998).

In face agnosia (prosopagnosia), recognition of familiar faces, including one’s own face, is impaired or lost. The difficulties prosopagnosic patients have with visual face recognition also manifest in their oculomotor scan path during inspection of a face; global features such as hair or the forehead, for example, are scanned in much more detail than genuine facial features such as the eye or nose (Stephan & Caine, 2009). Other prosopagnosic subjects may show partial processing of facial features, such as the mouth region (Bukach, Le Grand, Kaiser, Bub, & Tanaka, 2008).

Topographical agnosia (topographagnosia) or environmental agnosia refers to defective recognition of familiar environments, in reality and on maps and pictures; however, patients may have fewer difficulties in familiar surroundings and with scenes with clear landmarks, and may benefit from semantic information such as street names (Mendez & Cherrier, 2003).

Agnosia for letters (pure alexia) is a form of acquired dyslexia with defective visual recognition of letters and words while auditory recognition of letters and words and writing are intact.
The underlying disorder may have a pre-lexical, visual-perceptual basis because patients can also exhibit difficulties with nonlinguistic stimuli (Mycroft, Behrmann, & Kay, 2009).

Audition

Auditory perception comprises detection, discrimination, identification, and recognition of sounds, voice, music, and speech. The ability to detect and discriminate attributes of sounds improves with practice (Wright & Zhang, 2009) and thus depends on auditory experience. This might explain interindividual differences in auditory performance, in particular recognition expertise and domain specificity concerning, for example, sounds, voices, and music (Chartrand, Peretz, & Belin, 2008). Another factor that crucially modulates auditory perceptual efficiency is selective attention (Shinn-Cunningham & Best, 2008).

The auditory brain possesses tonotopic maps that show rapid task-related changes to subserve distinct functional roles in auditory information processing, such as pitch versus phonetic analysis (Ozaki & Hashimoto, 2007). This task specificity can be viewed as a form of plasticity that is embedded in a context- and cognition-related frame of reference, whereby attention, learning and memory, and mental imagery can modulate processing (Dahmen & King, 2007; Fritz, Elhilali, & Shamma, 2005; Weinberger, 2007; Zatorre, 2007). The auditory cortex forms internal representations of temporal characteristic structures, which may form the basis for sound segmentation, complex auditory object processing, and also multisensory integration (Wang, Lu, Bendor, & Bartlett, 2008). In the discrimination of speech and nonspeech stimuli, which is based on subtle temporal acoustic features, the middle temporal gyrus, the superior temporal sulcus, the posterior part of the inferior frontal gyrus, and the parietal operculum of the left hemisphere are involved (Zaehle, Geiser, Alter, Jancke, & Meyer, 2008). Environmental sounds are mainly processed in the middle temporal gyri in both hemispheres (Lewis et al., 2004), whereas vocal communication sounds are preferentially coded in the insular region (Bamiou, Musiek, & Luxon, 2003). Music perception is understood as a form of communication in which formal codes (i.e., acoustic patterns) and their auditory representations are employed to elicit a variety of perceptual and emotional experiences (Bharucha, Curtis, & Paroo, 2006). Musical stimuli have also been found to activate specific pathways in several brain areas that are associated with emotional behavior, such as insular and cingulate cortices, amygdala, and prefrontal cortex (Boso, Politi, Barale, & Enzo, 2006). For the representation of auditory scenes and categories within past and actual experiences and contexts, the medial and ventrolateral prefrontal cortex appears to play a particular role (Janata, 2005; Russ, Lee, & Cohen, 2007).

The auditory system also possesses a “where” and a “what” subdivision for processing spatial and nonspatial aspects of acoustic stimuli, which allows detection, localization, discrimination, identification, and recognition of auditory information, including vocal communication sounds (speech perception) and music (Kraus & Nicol, 2005; Wang, Wu, & Li, 2008).

Auditory Perceptual Disorders

Unilateral and bilateral injury to left- or right-sided temporal brain structures can affect spatial and temporal auditory processing capacities (Griffiths et al., 1997; Polster & Rose, 1998) and the perception of environmental sounds (Tanaka, Nakano, & Obayashi, 2002), sound movement (Lewald, Peters, Corballis, & Hausmann, 2009), tunes, prosody, and voice (Peretz et al., 1994), and words (pure word deafness) (Shivashankar, Shashikala, Nagaraja, Jayakumar, & Ratnavalli, 2001). Functional dissociation of auditory perceptual deficits, such as preservation of speech perception and environmental sounds but impairment of melody perception (Peretz et al., 1994), impaired speech perception but intact environmental sound perception (Kaga, Shindo, & Tanaka, 1997), and impaired perception of verbal but spared perception of nonverbal stimuli (Shivashankar et al., 2001), suggests a modular architecture similar to that in the visual cortex (Polster & Rose, 1998).

Auditory Agnosia

Auditory agnosia is defined as the impairment or loss of recognition of auditory stimuli in the absence of defective auditory functions and of language and cognitive disorders that could (sufficiently) explain the recognition disorder. As in visual agnosia, it may be difficult to validly distinguish between genuine and secondary auditory agnosia. It is impossible to clearly differentiate sensory-perceptual from perceptual-cognitive abilities because both domains are required for auditory recognition. For example, patients with intact processing of steady-state patterns but impaired processing of dynamic acoustic patterns may exhibit verbal auditory agnosia (Wang, Peach, Xu, Schneck, & Manry, 2000) or have (additional) difficulties with auditory spatial localization and auditory motion perception (Clarke, Bellmann, Meuli, Assal, & Steck, 2000). Auditory agnosia for environmental sounds may be associated with impaired processing of meaningful verbal information (Saygin, Dick, Wilson, Dronkers, & Bates, 2003) and impaired recognition of music (Kaga, Shindo, Tanaka, & Haebara, 2000); yet, perception of environmental sound (Shivashankar et al., 2001) and music may also be spared even in the case of generalized auditory agnosia (Mendez, 2001). However, there exist examples of pure agnosia for recognizing particular categories of auditory material, such as environmental sounds (Taniwaki, Tagawa, Sato, & Iino, 2000), speech (pure word deafness) (Engelien et al., 1995; Polster & Rose, 1998), and music perception. Musical timbre perception can be affected after left or right temporal lobe injury (Samson, Zatorre, & Ramsay, 2002). Agnosia for music (music agnosia, amusia) and agnosia for other auditory categories are frequently associated but can also dissociate; they typically occur after right unilateral and bilateral temporal lobe injury (Vignolo, 2003). Amusia may affect discrimination and recognition of familiar melodies (Ayotte, Peretz, Rousseau, Bard, & Bojanowski, 2000; Sato et al., 2005).
However, there is evidence for a weaker hemispheric specificity for music perception because cross-hemisphere and fragmented neural substrates underlie local and global musical information processing at least in the melodic and temporal dimensions (Schuppert, Munte, Wieringa, & Altenmüller, 2000).

Somatosensory Perception

The somatosensory system provides information about object surfaces that are in direct contact with the skin (touch) and about the position and movements of body parts (proprioception and kinesthesis). Somatosensory perception thus includes detection and discrimination of (fine) differences in touch stimulation and haptic perception, that is, the perception of shape, size, and identity (recognition) of objects on the basis of touch and kinesthesis. Shape is an important cue for recognizing objects by touch; edges, curvature, and surface areas are associated with three-dimensional shape (Plaisier, Tiest, & Kappers, 2009). Exploratory motor procedures are directly linked to the extraction of specific shape properties (Valenza et al., 2001). Somatosensory information is processed in anterior, lateral, and posterior parietal cortex, but also in frontal, cingulate, temporal, and insular cortical regions (Porro, Lui, Facchin, Maieron, & Baraldi, 2005).

Somatosensory Perceptual Disorders

Impaired haptic perception of (micro) geometrical properties, which may be associated with a failure to recognize objects, has been reported after injury to the postcentral gyrus, including somatosensory areas SI and SII, and the posterior parietal cortex (Bohlhalter, Fretz, & Weder, 2002; Estanol, Baizabal-Carvallo, & Senties-Madrid, 2008). Difficulties in identifying objects using hand manipulation only have been reported after parietal injury (Tomberg & Desmedt, 1999). Impairment of the perception of stimulus shape (morphagnosia) may result from defective processing of spatial orientation in two- and three-dimensional space (Saetti, De Renzi, & Comper, 1999). Tactile object recognition can be impaired without associated disorders in tactile discrimination and manual shape exploration, indicating the existence of “pure” tactile agnosia (Reed, Caselli, & Farah, 1996).

Body Perception Disorders

Disorders in body perception may affect body form and body actions selectively or in combination (Moro et al., 2008). Patients with injury to the premotor cortex may show agnosia for their body (asomatognosia); that is, they describe parts of their body as missing or as having disappeared from body awareness (Arzy, Overney, Landis, & Blanke, 2006). Macrosomatognosia and, less frequently, microsomatognosia have been reported as transient and reversible modifications of body representation during migraine aura (Robinson & Podoll, 2000). Asomatognosia either may involve the body as a whole (Beis, Paysant, Bret, Le Chapelain, & Andre, 2007) or may be restricted to finger recognition (“finger agnosia”; Anema et al., 2008). Body misperception may also result in body illusion, a deranged representation of the body concerning its ownership labeled “somatoparaphrenia” (Vallar & Ronchi, 2009). Distorted body perception may also occur in chronic pain (Lotze & Moseley, 2007).

Olfactory and Gustatory Perception

The significance of the sense of smell is still somewhat neglected. This is surprising given that olfactory processing monitors the intake of airborne agents into the human respiratory system and warns of spoiled food, leaking natural gas, polluted air, and smoke. In addition, it determines to a large degree the flavor and palatability of foods and beverages, enhances life quality, and mediates basic elements of human social relationships and communication, such as in mother–child interactions (Doty, 2009). Olfactory perception implies detection, discrimination, identification, and recognition of olfactory stimuli. Olfactory perception shows selective adaptation; the perceived intensity of a smell drops by 50 percent or more after continuous exposure of about 10 minutes, and recovers again after removal of the smell stimulus (Eckman, Berglund, Berglund, & Lindvall, 1967). Continuous exposure to a particular smell, such as cigarette smoke, causes persistent adaptation to that smell on the person and in the environment.

Smell perception involves the caudal orbitofrontal and medial temporal cortices. Olfactory stimuli are processed in primary olfactory (piriform) cortex and also activate the amygdala bilaterally, regardless of valence. In posterior orbitofrontal cortex, processing of pleasant and unpleasant odors is segregated within medial and lateral segments, respectively, indicating functional heterogeneity. Studies with olfactory stimuli also show that brain regions mediating emotional processing are differentially activated by odor valence, and they provide evidence for a close anatomical coupling between olfactory and emotional processes (Gottfried, Deichmann, Winston, & Dolan, 2002).

Gustation is vital for establishing whether a specific substance is edible and nutritious or poisonous, and for developing preferences for specific foods. According to the well-known taste tetrahedron, four basic taste qualities can be distinguished: sweet, salty, sour, and bitter. A fifth taste quality is umami, a Japanese word for “good taste.” Perceptual taste qualities are based on the pattern of activity across different classes of sensory fibers (i.e., cross-fiber theory; Mather, 2006, p. 44) and distributed cortical processing (Simon, de Araujo, Gutierrez, & Nicolelis, 2006). Taste information is conveyed through the central gustatory pathways to the gustatory cortical area, but is also sent to the reward system and feeding center via the prefrontal cortex, insular cortex, and amygdala (Simon et al., 2006; Yamamoto, 2006). The sensation of eating, or flavor, involves smell and taste as well as interactions between these and other perceptual systems, including temperature, touch, and sight. However, flavor is not a simple summation of different sensations; smell and taste seem to dominate flavor.

Olfactory Perceptual Disorders

Olfactory perception can be impaired in the domains of detection, discrimination, and identification/recognition of smell stimuli. Typically, patients experience hyposmia (a decreased sense of smell), dysgeusia, or anosmia (loss of the sense of smell) (Haxel, Grant, & Mackay-Sim, 2008). However, distinct patterns of olfactory dysfunction have been reported, indicating a differential breakdown in olfactory perception analogous to that in the visual and auditory modalities (Luzzi et al., 2007). Interestingly, a selective inability to recognize favorite foods by smell can also occur despite preserved detection and evaluation of food stimuli as pleasant or familiar (Mendez & Ghajarnia, 2001).

Chronic disorders in olfactory perception and recognition have been reported after (traumatic) brain injury, mainly to ventral frontal cortical structures (Fujiwara, Schwartz, Gao, Black, & Levine, 2008; Haxel, Grant, & Mackay-Sim, 2008; Wermer, Donswijk, Greebe, Verweij, & Rinkel, 2007), in Parkinson’s disease and multiple sclerosis, in mesial temporal epilepsy, and in neurodegenerative diseases, including dementia of the Alzheimer type, frontotemporal dementia, corticobasal degeneration, and Huntington’s disease (Barrios et al., 2007; Doty, 2009; Jacek, Stevenson, & Miller, 2007; Pardini, Huey, Cavanagh, & Grafman, 2009). It should be mentioned, however, that hyposmia and impaired odor identification can also be found in older age (Wilson, Arnold, Tang, & Bennett, 2006), in particular in subjects with cognitive decline. Presbyosmia has been found in particular after 65 years of age, with no difference between males and females, and with only a weak relationship between self-reports of olfactory function and objective olfactory function (Mackay-Sim, Johnston, Owen, & Burne, 2006). Olfactory perceptual changes have also been reported among subjects receiving chemotherapy (Bernhardson, Tishelman, & Rutqvist, 2009), in depression (Pollatos et al., 2007), and in anorexia nervosa (Roessner, Bleich, Banashewski, & Rothenburger, 2005).

Gustatory Perceptual Disorders

Gustatory disorders in the form of quantitatively reduced (hypogeusia) or qualitatively changed (dysgeusia) gustation have been reported after subcortical, inferior collicular stroke (Cerrato et al., 2005), after pontine infarction (Landis, Leuchter, San Millan Ruiz, Lacroix, & Landis, 2006), after left insular and opercular stroke (Mathy, Dupuis, Pigeolet, & Jacquerye, 2003), in multiple sclerosis (Combarros, Miro, & Berciano, 1994), and in diabetes mellitus (Stolbova, Hahn, Benes, Andel, & Treslova, 1999). The anteromedial temporal lobe plays an important role in recognizing taste quality because injury to this structure can cause gustatory agnosia (Small, Bernasconi, Sziklas, & Jones-Gutman, 2005). Gustatory perception also decreases with age (>40 years), a decline that is more pronounced in males than in females (Fikentscher, Roseburg, Spinar, & Bruchmuller, 1977). Smell and taste dysfunctions, including impaired detection, discrimination, and identification of foods, have been frequently reported in patients following (minor) stroke in temporal brain structures (Green, McGregor, & King, 2008). Abnormalities in taste and smell have also been reported in patients with Parkinson’s disease (Shah et al., 2009).

Social Perception

Social perception is an individual’s perception of social stimuli (i.e., facial expression, prosody and gestures, and smells), which allow motives, attitudes, or values to be inferred from the social behavior of other individuals. Social perception and social cognition, but also sensitivity to the social context and social action, belong to particular functional systems in the prefrontal brain (Adolphs, 2003; Adolphs, Tranel, & Damasio, 2003). The amygdala is involved in recognizing facial emotional expressions; the orbitofrontal cortex is important to reward processing; and the insula is involved in representing “affective” states of our own body, such as empathy or pain (Adolphs, 2009). The neural substrates of social perception are characterized by a general pattern of right-hemispheric functional asymmetry (Brancucci, Lucci, Mazzatenta, & Tommasi, 2009). The (right) amygdala is crucially involved in evaluating sad but not happy faces, suggesting that this brain structure plays a specific role in processing negative emotions, such as sadness and fear (Adolphs & Tranel, 2004).

Disorders in Social Perception

Patients with traumatic brain injury may show difficulties with recognizing affective information from the face, voice, bodily movement, and posture (Bornhofen & McDonald, 2008), which may persistently interfere with successful negotiation of social interactions (Ietswaart, Milders, Crawford, Currie, & Scott, 2008). Interestingly, face perception and perception of visual social cues can be affected while the perception of prosody can be relatively spared, indicating a dissociation between visual and auditory social-perceptual abilities (Croker & McDonald, 2005; Green, Turner, & Thompson, 2004; Pell, 1998). Impaired auditory recognition of fear and anger has been reported following bilateral amygdala lesions (Scott et al., 1997). Impairments of social perception, including inaccurate interpretation and evaluation of stimuli signifying reward or punishment in a social context, and failures to translate emotional and social information into task- and context-appropriate action patterns are often observed in subjects with frontal lobe injury. Consequently, patients may demonstrate inadequate social judgments and decision making, social inflexibility, and lack of self-monitoring, particularly in social situations (Rankin, 2007). Difficulties with facial expression perception have also been reported in mood disorders (Venn, Watson, Gallagher, & Young, 2006).

Conclusion and Some Final Comments

The systematic study of individuals with perceptual deficits has substantially contributed to the understanding of the role of perceptual abilities and their underlying sophisticated brain processes, as well as the neural organization of the perceptual modalities. Combined neurobiological, neuroimaging, and neuropsychological evidence supports the view that all perceptual systems are functionally segregated and show a parallel-hierarchical type of organization of information processing and coding. Despite this type of functional organization, pure perceptual disorders are the exception rather than the rule. This somewhat surprising fact can be explained by three main factors: (1) focal brain injury is only rarely restricted to the cortical area in question; (2) the rich, typically reciprocal fiber connections between cortical areas are frequently also affected; and (3) perception may depend on spatiotemporally distributed activity in more than just one cortical area, as is known, for example, in body perception (Berlucchi & Aglioti, 2010). Thus, an association of deficits is more likely to occur. Furthermore, complex perceptual disorders, such as disorders of recognition, may also be caused by impaired lower level perceptual abilities, and it is rather difficult to clearly distinguish between lower and higher level perceptual abilities. In addition, recognition cannot be understood without reference to memory, and it is therefore not surprising that the brain structures underlying visual memory, in particular in the medial temporal lobe, have been suggested to possess perceptual functions as well and can thus be understood as an extension of the ventral visual processing stream (Baxter, 2009; Suzuki, 2009).
Consequently, rather than trying to map perceptual functions onto more or less separate brain structures, a more comprehensive understanding of perception would benefit from the study of the cortical representation of functions crucially involved in defined percepts (Bussey & Saksida, 2007). This also holds true for the perception–action debate, in particular in vision, which is treated as an exploratory activity, that is, a way of acting based on sensorimotor contingencies, as proposed by O’Regan and Noë (2001). According to this approach, the outside visual world serves as its own representation, whereas the experience of seeing occurs as a result of mastering the “governing laws of sensorimotor contingency” and thereby accounts for visual experience and “visual consciousness.” If one applies this approach to the pathology of visual perception, then the question arises as to which visual perceptual disorders would result from the impaired representation of the “outside” visual world, and which from the defective “mastering of the governing laws of sensorimotor contingency.” Would visual perceptual disorders of the first type not be experienced by patients, and thus not represent a disorder and not cause any handicap, because there is no “internal” representation of the outside world in our brains and thus no visual experience?

Modulatory effects of producing action on perception, such that observers become selectively sensitive to similar or related actions, are known from visual imitation learning and social interactions (Schutz-Bosbach & Prinz, 2007), but in both instances, perception of action and, perhaps, motivation to observe and attention directed to the action in question are required. Nevertheless, a more detailed understanding of the bidirectional relationships between perception and action and the underlying neural networks will undoubtedly help us to understand how perception modulates action and vice versa. Possibly, the search for associations and dissociations of perceptions and actions in cases with acquired brain injury, in the framework of common functional representations in terms of sensorimotor contingencies, represents a helpful approach to studying the reciprocal relationships between perception and action. Accurate visually guided hand actions in the absence of visual perception (Goodale, 2008) and impaired eye–hand coordination and saccadic control in optic ataxia as a consequence of impaired visual-spatial processing (Pisella et al., 2009) are examples of such dissociations and associations. Despite some conceptual concerns and limitations, the dual-route model of visual processing proposed by Milner and Goodale (2008) is still of theoretical and practical value (Clark, 2009).
An interesting issue is implicit processing of stimuli in the absence of experience or awareness, such as detection, localization, and even discrimination of simple visual stimuli in hemianopia (“blindsight”; Cowey, 2010; Danckert & Rosetti, 2005); discrimination of letters in visual form agnosia (Aglioti, Bricolo, Cantagallo, & Berlucchi, 1999); discrimination of forms in visual agnosia (Kentridge, Heywood, & Milner, 2004; Yang, Wu, & Shen, 2006); and discrimination of faces in prosopagnosia (Le, Raufaste, Roussel, Puel, & Demonet, 2003). Such results suggest sparing of function in the particular brain structure, but they may also be explained by stimulus processing in structures or areas that also contribute to a particular perceptual function. However, spared processing of stimuli is not identical with perception of the same stimuli. A paradigmatic example of implicit processing of visual stimuli in the absence of the primary visual cortex, blindsight, has helped us to understand the nature of visual processing, but it is still unknown whether it is used or useful in everyday life activities; that is, it may not have any perceptual significance (Cowey, 2010).

Furthermore, cognition plays an important role in perception, in particular attention, memory, and the monitoring of perceptual activities. Therefore, perceptual disorders can also result from, or at least be exaggerated by, cognitive dysfunctions associated with acquired brain injury. The parietal cortex may be one of the brain structures that serve as a bridge between perception, cognition, and action (Gottlieb, 2007). Future research on perceptual disorders should therefore also focus on the effect of injury to brain structures engaged in attention, memory, and executive functions involved in perception, such as the temporal lobe, hippocampus, (posterior) parietal cortex, and prefrontal cortex.
As a result, the framework for interpreting perceptual disorders after brain injury, as well as in other pathological states, could be widened substantially. The search for fundamental requirements for visual perception and the coupling between brain functions underlying perception and cognition may further help to define perceptual dysfunction with sufficient validity and thus contribute to the significance of perception (Pollen, 2008). Research on functional plasticity in subjects with perceptual disorders using experimental practice paradigms may, in addition, contribute to a more comprehensive and integrative understanding of perception in the framework of other functional systems in the brain, which are known to modulate perceptual learning and thus functional plasticity in perceptual systems (Gilbert, Li, & Piech, 2009; Gilbert & Sigman, 2007).

Author Note

Preparation of this chapter has been supported in part by the German Ministry for Education and Research (BMBF grant 01GW0762). I want to thank Susanne Schuett for her very helpful support.

References

Adolphs, R. (2003). Cognitive neuroscience of human social behaviour. Nature Reviews Neuroscience, 4, 165–178.
Adolphs, R. (2009). The social brain: Neural basis of social knowledge. Annual Review of Psychology, 60, 693–716.
Adolphs, R., & Tranel, D. (2004). Impaired judgments of sadness but not happiness following bilateral amygdala damage. Journal of Cognitive Neuroscience, 16, 453–462.
Adolphs, R., Tranel, D., & Damasio, A. R. (2003). Dissociable neural systems for recognizing emotions. Brain and Cognition, 52, 61–69.
Aglioti, S., Bricolo, E., Cantagallo, A., & Berlucchi, G. (1999). Unconscious letter discrimination is enhanced by association with conscious color perception in visual form agnosia. Current Biology, 9, 1419–1422.
Andersson, F., Joliot, M., Perchey, G., & Petit, L. (2007). Eye position-dependent activity in the primary visual area as revealed by fMRI. Human Brain Mapping, 28, 673–680.
Anema, H. A., Kessels, R. P., de Haan, E. H., Kappelle, L. J., Leijten, F. S., van Zandvoort, M. J., & Dijkerman, H. C. (2008). Differences in finger localisation performance of patients with finger agnosia. NeuroReport, 19, 1429–1433.
Anstis, S. M. (1974). A chart demonstrating variations in acuity with retinal position. Vision Research, 14, 579–582.
Arzy, S., Overney, L. S., Landis, T., & Blanke, O. (2006). Neural mechanisms of embodiment: Asomatognosia due to premotor cortex damage. Archives of Neurology, 63, 1022–1025.
Ayotte, J., Peretz, I., Rousseau, I., Bard, C., & Bojanowski, M. (2000). Patterns of music agnosia associated with middle cerebral artery infarcts. Brain, 123, 1926–1938.
Bamiou, D. E., Musiek, F. E., & Luxon, L. M. (2003). The insula (Island of Reil) and its role in auditory processing. Brain Research Reviews, 42, 143–154.
Bar, M. (2003). A cortical mechanism triggering top-down facilitation in visual object recognition. Journal of Cognitive Neuroscience, 15, 600–609.
Bar, M. (2004). Visual objects in context. Nature Reviews Neuroscience, 5, 617–628.
Barrios, F. A., Gonzalez, L., Favila, R., Alonso, M. E., Salgado, P. M., Diaz, R., & Fernandez-Ruiz, J. (2007). Olfaction and neurodegeneration in HD. NeuroReport, 18, 73–76.
Bartels, A., & Zeki, S. (1998). The theory of multistage integration. Proceedings of the Royal Society London B, 265, 2327–2332.
Barton, J. J. S. (2008a). Structure and function in acquired prosopagnosia: Lessons from a series of 10 patients with brain damage. Journal of Neuropsychology, 2, 197–225.
Barton, J. J. S. (2008b). Prosopagnosia associated with a left occipitotemporal lesion. Neuropsychologia, 46, 2214–2224.
Barton, J. J. S., & Black, S. E. (1998). Line bisection in hemianopia. Journal of Neurology, Neurosurgery & Psychiatry, 64, 660–662.
Barton, J. J. S., Behrmann, M., & Black, S. (1998). Ocular search during line bisection: The effects of hemi-neglect and hemianopia. Brain, 121, 1117–1131.
Baxter, M. G. (2009). Involvement of medial temporal structures in memory and perception. Neuron, 61, 667–677.
Behrmann, M., Peterson, M. A., Moscovitch, M., & Suzuki, S. (2006). Independent representation of parts and the relations between them: Evidence from integrative agnosia. Journal of Experimental Psychology: Human Perception & Performance, 32, 1169–1184.
Behrmann, M., & Williams, P. (2007). Impairments in part-whole representations of objects in two cases of integrative visual agnosia. Cognitive Neuropsychology, 24, 701–730.
Beis, J. M., Paysant, J., Bret, D., Le Chapelain, L., & Andre, J. M. (2007). Specular right-left disorientation, finger-agnosia, and asomatognosia in right hemisphere stroke. Cognitive & Behavioral Neurology, 20, 163–169.
Berlucchi, G., & Aglioti, S. M. (2010). The body in the brain revisited. Experimental Brain Research, 200, 25–35.
Bernhardson, B. M., Tishelman, C., & Rutqvist, L. E. (2009). Olfactory changes among patients receiving chemotherapy. European Journal of Oncology Nursing, 13, 9–15.
Berryhill, M. E., Fendrich, R., & Olson, I. R. (2009). Impaired distance perception and size constancy following bilateral occipito-parietal damage. Experimental Brain Research, 194, 381–393.

Bharucha, J. J., Curtis, M., & Paroo, K. (2006). Varieties of musical experiences. Cognition, 100, 131–172.
Billino, J., Braun, D. I., Bohm, K. D., Bremmer, F., & Gegenfurtner, K. R. (2009). Cortical networks for motion perception: Effects of focal brain lesions on perception of different motion types. Neuropsychologia, 47, 2133–2144.
Blanke, O., Landis, T., Mermoud, C., Spinelli, L., & Safran, A. B. (2003). Direction-selective motion blindness after unilateral posterior brain damage. European Journal of Neuroscience, 18, 709–722.
Bohlhalter, S., Fretz, C., & Weder, B. (2002). Hierarchical versus parallel processing in tactile object recognition: A behavioural-neuroanatomical study of apperceptive tactile agnosia. Brain, 125, 2537–2548.
Bonan, I. V., Leman, M. C., Legargasson, J. F., Guichard, J. P., & Yelnik, A. P. (2006). Evolution of subjective visual vertical perturbation after stroke. Neurorehabilitation & Neural Repair, 20, 484–491.
Bornhofen, C., & McDonald, S. (2008). Emotion perception deficits following traumatic brain injury: A review of the evidence and rationale for intervention. Journal of the International Neuropsychological Society, 14, 511–525.
Boso, M., Politi, P., Barale, F., & Enzo, E. (2006). Neurophysiology and neurobiology of the musical experience. Functional Neurology, 21, 187–191.
Bouvier, S. E., & Engel, S. A. (2006). Behavioral deficits and cortical damage loci in cerebral achromatopsia. Cerebral Cortex, 16, 183–191.
Bradley, D. (2004). Object motion: A world view. Current Biology, 14, R892–R894.
Brancucci, A., Lucci, G., Mazzatenta, A., & Tommasi, L. (2009). Asymmetries of the human social brain in the visual, auditory, and chemical modalities. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 364, 895–914.
Braun, D., Petersen, D., Schoenle, P., & Fahle, M. (1998). Deficits and recovery of first- and second-order motion perception in patients with unilateral cortical lesions. European Journal of Neuroscience, 10, 2117–2128.
Bukach, C. M., Le Grand, R., Kaiser, M. D., Bub, D., & Tanaka, J. W. (2008). Preservation of mouth region processing in two cases of prosopagnosia. Journal of Neuropsychology, 2, 227–244.
Bulens, C., Meerwaldt, J. D., van der Wildt, G. J., & Keemink, D. (1986). Contrast sensitivity in Parkinson’s disease. Neurology, 36, 1121–1125.

Bulens, C., Meerwaldt, J. D., van der Wildt, G. J., & Keemink, D. (1989). Spatial contrast sensitivity in unilateral cerebral ischemic lesions involving the posterior visual pathway. Brain, 112, 507–520.
Bullier, J. (2003). Cortical connections and functional interactions between visual cortical areas. In M. Fahle & M. Greenlee (Eds.), The neuropsychology of vision (pp. 23–63). Oxford, UK: Oxford University Press.
Bussey, T. J., & Saksida, L. M. (2007). Memory, perception, and the ventral visual-perirhinal-hippocampal stream: Thinking outside of the boxes. Hippocampus, 17, 898–908.
Caminiti, R., Chafee, M. V., Battaglia-Mayer, A., Averbeck, B. B., Crowe, D. A., & Georgopoulos, A. P. (2010). Understanding the parietal lobe syndrome from a neuropsychological and evolutionary perspective. European Journal of Neuroscience, 31, 2320–2340.
Campbell, R., Zihl, J., Massaro, D., Munhall, K., & Cohen, M. M. (1997). Speechreading in the akinetopsic patient, L.M. Brain, 120, 1793–1803.
Carey, D. P., Dijkerman, H. C., Murphy, K. J., Goodale, M. A., & Milner, A. D. (2006). Pointing to places and spaces in a patient with visual form agnosia. Neuropsychologia, 44, 1584–1594.
Cerrato, P., Lentini, A., Baima, C., Grasso, M., Azzaro, C., Bosco, G., Destefanis, E., Benna, P., Bergui, M., & Bergamasco, B. (2005). Hypogeusia and hearing loss in a patient with an inferior collicular lesion. Neurology, 65, 1840–1841.
Chartrand, J. P., Peretz, I., & Belin, P. (2008). Auditory recognition expertise and domain specificity. Brain Research, 1220, 191–198.
Clark, A. (2009). Perception, action, and experience: Unraveling the golden braid. Neuropsychologia, 47, 1460–1468.
Clarke, G. (2005). Incidence of neurological vision impairment in patients who suffer from an acquired brain injury. International Congress Series, 1282, 365–369.
Clarke, S., Bellmann, A., Meuli, R. A., Assal, G., & Steck, A. J. (2000). Auditory agnosia and auditory spatial deficits following left hemispheric lesions: Evidence for distinct processing pathways. Neuropsychologia, 38, 797–807.
Clavagnier, S., Fruhmann Berger, M., Klockgether, T., Moskau, S., & Karnath, H. O. (2006). Restricted ocular exploration does not seem to explain simultanagnosia. Neuropsychologia, 44, 2330–2336.
Combarros, O., Miro, J., & Berciano, J. (1994). Ageusia associated with thalamic plaque in multiple sclerosis. European Neurology, 34, 344–346.

Connolly, D. M., Barbur, J. L., Hosking, S. L., & Moorhead, I. R. (2008). Mild hypoxia impairs chromatic sensitivity in the mesopic range. Investigative Ophthalmology & Visual Science, 49, 820–827.
Coslett, H. B., & Lie, G. (2008). Simultanagnosia: When a rose is not red. Journal of Cognitive Neuroscience, 20, 36–48.
Cowey, A. (2010). The blindsight saga. Experimental Brain Research, 200, 3–24.
Croker, V., & McDonald, S. (2005). Recognition of emotion from facial expression following traumatic brain injury. Brain Injury, 19, 787–799.
Dahmen, J. C., & King, A. J. (2007). Learning to hear: Plasticity of auditory cortical processing. Current Opinion in Neurobiology, 17, 456–464.
Dalrymple, K. A., Kingstone, A., & Barton, J. J. (2007). Seeing trees OR seeing forests in simultanagnosia: Attentional capture can be local or global. Neuropsychologia, 45, 871–875.
Danckert, J., & Rosetti, Y. (2005). Blindsight in action: What can the different subtypes of blindsight tell us about the control of visually guided actions? Neuroscience & Biobehavioral Reviews, 29, 1035–1046.
Darling, W. G., Bartelt, R., Pizzimenti, M. A., & Rizzo, M. (2008). Spatial perception errors do not predict pointing errors by individuals with brain lesions. Journal of Clinical & Experimental Neuropsychology, 30, 102–119.
Darling, W. G., Pizzimenti, M. A., & Rizzo, M. (2003). Unilateral posterior parietal lobe lesions affect representation of visual space. Vision Research, 43, 1675–1688.
Dean, H. L., Crowley, J. C., & Platt, M. L. (2004). Visual and saccade-related activity in macaque posterior cingulate cortex. Journal of Neurophysiology, 92, 3056–3068.
Delvenne, J. F., Seron, X., Coyette, F., & Rossion, B. (2004). Evidence for perceptual deficits in associative visual (prosop)agnosia: A single case study. Neuropsychologia, 42, 597–612.
Desimone, R., & Ungerleider, L. G. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology (Vol. 2, pp. 267–299). Amsterdam: Elsevier.
Doty, R. L. (2009). The olfactory system and its disorders. Seminars in Neurology, 29, 74–81.
Eckman, G., Berglund, B., Berglund, U., & Lindvall, T. (1967). Perceived intensity of odor as a function of time of adaptation. Scandinavian Journal of Psychology, 8, 177–186.

Engelien, A., Silbersweig, D., Stern, E., Huber, W., Doring, W., Frith, C., & Frackowiak, R. S. (1995). The functional anatomy of recovery from auditory agnosia. A PET study of sound categorization in a neurological patient and normal controls. Brain, 118, 1395–1409.
Estanol, B., Baizabal-Carvallo, J. F., & Senties-Madrid, H. (2008). A case of tactile agnosia with a lesion restricted to the post-central gyrus. Neurology India, 56, 471–473.
Farah, M. (2000). The cognitive neuroscience of vision. Oxford, UK: Blackwell.
Farah, M. (2003). Disorders of visual-spatial perception and cognition. In K. M. Heilman & E. Valenstein (Eds.), Clinical neuropsychology (4th ed., pp. 146–160). New York: Oxford University Press.
Fikentscher, R., Roseburg, B., Spinar, H., & Bruchmuller, W. (1977). Loss of taste in the elderly: Sex differences. Clinical Otolaryngology & Allied Sciences, 2, 183–189.
Frisén, L. (1980). The neurology of visual acuity. Brain, 103, 639–670.
Fritz, J., Elhilali, M., & Shamma, S. (2005). Active listening: Task-dependent plasticity of spectrotemporal receptive fields in primary auditory cortex. Hearing Research, 206, 159–176.
Fujiwara, E., Schwartz, M. L., Gao, F., Black, S. E., & Levine, B. (2008). Ventral frontal cortex functions and quantified MRI in traumatic brain injury. Neuropsychologia, 46, 461–474.
Gal, R. L. (2008). Visual function 15 years after optic neuritis: A final follow-up report from the optic neuritis treatment trial. Ophthalmology, 115, 1079–1082.
Ghika, J., Ghika-Schmid, F., & Bogousslavsky, J. (1998). Parietal motor syndrome: A clinical description in 32 patients in the acute phase of pure parietal stroke studied prospectively. Clinical Neurology & Neurosurgery, 100, 271–282.
Giese, M. A., & Poggio, T. (2003). Neural mechanisms for the recognition of biological movements. Nature Reviews Neuroscience, 4, 179–192.
Gilbert, C. D., Li, W., & Piech, V. (2009). Perceptual learning and adult cortical plasticity. Journal of Physiology, 587, 2743–2751.
Gilbert, C. D., & Sigman, M. (2007). Brain states: Top-down influences in sensory processing. Neuron, 54, 677–696.
Goodale, M. A. (2008). Action without perception in human vision. Cognitive Neuropsychology, 25, 891–919.
Goodale, M. A., & Westwood, D. A. (2004). An evolving view of duplex vision: Separate but interacting cortical pathways for perception and action. Current Opinion in Neurobiology, 14, 203–211.

Gottfried, J. A., Deichmann, R., Winston, J. S., & Dolan, R. J. (2002). Functional heterogeneity in human olfactory cortex: An event-related functional magnetic resonance imaging study. Journal of Neuroscience, 22, 10819–10828.
Gottlieb, J. (2007). From thought to action: The parietal cortex as a bridge between perception, action, and cognition. Neuron, 53, 9–16.
Green, R. E., Turner, G. R., & Thompson, W. F. (2004). Deficits in facial emotion perception in adults with recent traumatic brain injury. Neuropsychologia, 42, 133–141.
Green, T. L., McGregor, L. D., & King, K. M. (2008). Smell and taste dysfunction following minor stroke: A case report. Canadian Journal of Neuroscience Nursing, 30, 10–13.
Griffiths, T. D., Rees, A., Witton, C., Cross, P. M., Shakir, R. A., & Green, G. G. (1997). Spatial and temporal auditory processing deficits following right hemisphere infarction: A psychophysical study. Brain, 120, 85–94.
Grill-Spector, K. (2003). The neural basis of object recognition. Current Opinion in Neurobiology, 13, 159–166.
Grill-Spector, K., & Malach, R. (2004). The human visual cortex. Annual Review of Neuroscience, 27, 649–677.
Harrington, D. O., & Drake, M. V. (1990). The visual fields (6th ed.). St. Louis: Mosby.
Haxel, B. R., Grant, L., & Mackay-Sim, A. (2008). Olfactory dysfunction after head injury. Journal of Head Trauma Rehabilitation, 23, 407–413.
Heider, B. (2000). Visual form agnosia: Neural mechanisms and anatomical foundations. Neurocase, 6, 1–12.
Henderson, J. M. (2003). Human gaze control during real-world scene perception. Trends in Cognitive Sciences, 7, 498–504.
Hess, R. F., Zihl, J., Pointer, S. J., & Schmid, C. (1990). The contrast sensitivity deficit in cases with cerebral lesions. Clinical Vision Sciences, 5, 203–215.
Hess, R. H., Baker, C. L. Jr., & Zihl, J. (1989). The “motion-blind” patient: Low-level spatial and temporal filters. Journal of Neuroscience, 9, 1628–1640.
Heywood, C. A., & Kentridge, R. W. (2003). Achromatopsia, color vision, and cortex. Neurologic Clinics, 21, 483–500.
Heywood, C. A., Wilson, B., & Cowey, A. (1987). A case study of cortical colour blindness with relatively intact achromatic discrimination. Journal of Neurology, Neurosurgery, and Psychiatry, 50, 22–29.
Himmelbach, M., Erb, M., & Karnath, H.-O. (2006). Exploring the visual world: The neural substrate of spatial orienting. NeuroImage, 32, 1747–1759.

Himmelbach, M., Erb, M., Klockgether, T., Moskau, S., & Karnath, H. O. (2009). fMRI of global visual perception in simultanagnosia. Neuropsychologia, 47, 1173–1177.
Hochstein, S., & Ahissar, M. (2002). View from the top: Hierarchies and reverse hierarchies in the visual system. Neuron, 36, 791–804.
Huberle, E., & Karnath, H. O. (2006). Global shape recognition is modulated by the spatial distance of local elements: Evidence from simultanagnosia. Neuropsychologia, 44, 905–911.
Ietswaart, M., Milders, M., Crawford, J. R., Currie, D., & Scott, C. L. (2008). Longitudinal aspects of emotion recognition in patients with traumatic brain injury. Neuropsychologia, 46, 148–159.
Jacek, S., Stevenson, R. J., & Miller, L. A. (2007). Olfactory dysfunction in temporal lobe epilepsy: A case of ictus-related parosmia. Epilepsy & Behavior, 11, 466–470.
Jackson, G. R., & Owsley, C. (2003). Visual dysfunction, neurodegenerative diseases, and aging. Neurologic Clinics, 21, 709–728.
James, T. W., Culham, J., Humphreys, G. K., Milner, A. D., & Goodale, M. A. (2003). Ventral occipital lesions impair object recognition but not object-directed grasping: An fMRI study. Brain, 126, 2463–2475.
Janata, P. (2005). Brain networks that track musical structure. Annals of the New York Academy of Sciences, 1060, 111–124.
Kaga, K., Shindo, M., & Tanaka, Y. (1997). Central auditory information processing in patients with bilateral auditory cortex lesions. Acta Oto-Laryngologica Supplement, 532, 77–82.
Kaga, K., Shindo, M., Tanaka, Y., & Haebara, H. (2000). Neuropathology of auditory agnosia following bilateral temporal lobe lesions: A case study. Acta Oto-Laryngologica, 120, 259–262.
Karnath, H.-O., Ferber, S., Rorden, C., & Driver, J. (2000). The fate of global information in dorsal simultanagnosia. Neurocase, 6, 295–306.
Karnath, H.-O., & Zihl, J. (2003). Disorders of spatial orientation. In T. Brandt, L. Caplan, J. Dichgans, C. Diener, & C. Kennard (Eds.), Neurological disorders: Course and treatment (2nd ed., pp. 277–286). New York: Academic Press.
Kennard, C., Lawden, M., Morland, A. B., & Ruddock, K. H. (1995). Colour identification and colour constancy are impaired in a patient with incomplete achromatopsia associated with prestriate cortical lesions. Proceedings of the Royal Society of London, Series B: Biological Sciences, 260, 169–175.

Page 26 of 37

Perceptual Disorders Kentridge, R. W., Heywood, C. A., & Milner, A. D. (2004). Covert processing of visual form in the absence of area L Neuropsychologia, 42, 1488–1495. Koh, S. B., Kim, B. J., Lee, J., Suh, S. I., Kim, T. K., & Kim, S. H. (2008). Stereopsis and col or vision impairment in patients with right extrastriate cerebral lesions. European Neurol ogy, 60, 174–178. Konen, C. S., & Kastner, S. (2008). Two hierarchically organized neural systems for object information in human visual cortex. Nature Neuroscience, 11, 224–231. Kraus, N., & Nicol, T. (2005). Brainstem origins for cortical “what” and “where” pathways in the auditory system. Trends in Neurosciences, 28, 176–181. Landis, B. N., Leuchter, I., San Millan Ruiz, D., Lacroix, J. S., & Landis, T. (2006). Tran sient hemiageusia in cerebrovascular lateral pontine lesions. Journal of Neurology, Neuro surgery, and Psychiatry, 77, 680–683. Landis, T. (2000). Disruption of space perception due to cortical lesions. Spatial Vision, 13, 179–191. Landis, T., Regard, M., Bliestle, A., & Kleihues, P. (1988). Prosopagnosia and agnosia for noncanonical views. An autopsied case. Brain, 111, 1287–1297. Le, S., Raufaste, E., Roussel, S., Puel, M., & Demonet, J. F. (2003). Implicit face percep tion in a patient with visual agnosia? Evidence from behavioural and eye-tracking analy ses. Neuropsychologia, 41, 702–712. Leigh, R. J., & Zee, D. S. (2006). The neurology of eye movements (4th ed.). Philadelphia: F. A. Davis. Lewald, J., Peters, S., Corballis, M. C., & Hausmann, M. (2009). Perception of stationary and moving sound following unilateral cortectomy. Neuropsychologia, 47, 962–971. Lewis, J. W., Wightman, F. L., Brefczynski, J. A., Phinney, R. E., Binder, J. R., & DeYoe, E. A. (2004). Human brain regions involved in recognizing environmental sounds. Cerebral Cortex, 14, 1008–1021. Lissauer, H. (1890). Ein Fall von Seelenblindheit nebst einem Beitrage zur Theorie dersel ben. 
[A case of mindblindness with a contribution to its theory]. Archiv für Psychiatrie und Nervenkrankheiten, 21, 222–270. Lotze, M., & Moseley, G. L. (2007). Role of distorted body image in pain. Current Rheuma tology Reports, 9, 488–496. Luzzi, S., Snowden, J. S., Neary, D., Coccia, M., Provinciali, L., & Lambon Ralph, M. A. (2007). Distinct patterns of olfactory impairment in Alzheimer’s disease, semantic demen tia, frontotemporal dementia, and corticobasal degeneration. Neuropsychologia, 45, 1823–1831. Page 27 of 37

Perceptual Disorders Mackay-Sim, A., Johnston, A. N., Owen, C., & Burne, T. H. (2006). Olfactory ability in the healthy population: Reassessing presbyosmia. Chemical Senses, 31, 763–771. Martinez-Conde, S., Macknik, S. L., & Hubel, D. H. (2004). The role of fixational eye movements in visual perception. Nature Reviews Neuroscience, 5, 229–239. Mather, G. (2006). Foundations of perception. Hove (UK) and New York: Psychology Press. Mathy, I., Dupois, M. J., Pigeolet, Y., & Jacquerye, P. (2003). Bilateral ageusia after left in sular and opercular ischemic stroke. [French]. Revue Neurologique, 159, 563–567. McIntosh, R. D., Dijkerman, H. C., Mon-Williams, M., & Milner, A. D. (2004). Grasping what is graspable: Evidence from visual form agnosia. Cortex, 40, 695–702. McLeod, P., Dittrich, W., Driver, J., Perrett, D., & Zihl, J. (1996). Preserved and impaired detection of structure from motion by a motion-blind patient. Visual Cognition, 3, 363– 391. McLeod, P., Heywood, C., Driver, J., & Zihl, J. (1989). Selective deficit of visual search in moving displays after extrastriate damage. Nature, 339, 466–467. Meadows, J. C. (1974). Disturbed perception of colours associated with localized cerebral lesions. Brain, 97, 615–632. Mendez, M. F. (2001). Generalized auditory agnosia with spared music recognition in a left-hander: Analysis of a case with a right temporal stroke. Cortex, 37, 139–150. Mendez, M. F., & Cherrier, M. M. (2003). Agnosia for scenes in topographagnosia. Neu ropsychologia, 41, 1387–1395. Mendez, M. F., & Ghajarnia, M. (2001). Agnosia for familiar faces and odors in a patient with right temporal lobe dysfunction. Neurology, 57, 519–521. Michel, F., & Henaff, M. A. (2004). Seeing without the occipito-parietal cortex: Simul tanagnosia as a shrinkage of the attentional visual field. Behavioural Neurology, 15, 3–13. Miller, L. J., Mittenberg, S., Carey, V. M., McMorrow, M. A., Kushner, T. E., & Weinstein, J. M. (1999). 
Astereopsis caused by traumatic brain injury. Archives of Clinical Neuropsy chology, 14, 537–543. Milner, A. D., Perrett, D. I., Johnston, R. S., Benson, P. J., Jordan, T. R., Heeley, D. W., et al. (1991). Perception and action in “visual form agnosia.” Brain, 114, 405–428. Milner, A. D., & Goodale, M. A. (2008). Two visual systems re-reviewed. Neuropsychologia, 46, 774–785. Moreaud, O. (2003). Balint syndrome. Archives of Neurology, 60, 1329–1331.

Page 28 of 37

Perceptual Disorders Moro, V., Urgesi, C., Pernigo, S., Lanteri, P., Pazzaglia, M., & Aglioti, S. M. (2008). The neural basis of body form and body action agnosia. Neuron, 60, 235–246. Mort, D. J., & Kennard, C. (2003). Visual search and its disorders. Current Opinion in Neu rology, 16, 51–57. Moura, A. L., Teixeira, R. A., Oiwa, N. N., Costa, M. F., Feitosa-Santana, C., Calle garo, D., Hamer, R. D., & Ventura, D. F. (2008). Chromatic discrimination losses in multi ple sclerosis patients with and without optic neuritis using the Cambridge Colour Test. Vi sual Neuroscience, 25, 463–468. (p. 209)

Moutoussis, K., & Zeki, S. (2008). Motion processing, directional selectivity, and con scious visual perception in the human brain. Proceedings of the National Academy of the United States of America, 105, 16362–16367. Müller, T., Woitalla, D., Peters, S., Kohla, K., & Przuntek, H. (2002). Progress of visual dys function in Parkinson’s disease. Acta Neurologica Scandinavica, 105, 256–260. Mycroft, R. H., Behrmann, M., & Kay, J. (2009). Visuoperceptual deficits in letter-by-letter reading? Neuropsychologia, 47, 1733–1744. Nakachi, R., Muramatsu, T., Kato, M., Akiyama, T., Saito, F., Yoshino, F., Mimura, M., & Kashima, H. (2007). Progressive prosopagnosia at a very early stage of frontotempolar lobular degeneration. Psychogeriatrics, 7, 155–162. Nobre, A. C. (2001). The attentive homunculus: Now you see it, now you don’t. Neuro science & Biobehavioral Reviews, 25, 477–496. Oliveri, M., Turriziani, P., Carlesimo, G. A., Koch, G., Tomaiuolo, F., Panella M., & Calta girone, G. M. (2001). Parieto-frontal interactions in visual-object and visual-spatial work ing memory: Evidence from transcranial magnetic stimulation. Cerebral Cortex, 11, 606– 618. Olson, C. R., Gettner, S. N., Ventura, V., Carta, R., & Kass, R. E. (2000). Neuronal activity in macaque supplementary eye field during planning of saccades in response to pattern and spatial cues. Journal of Neurophysiology, 84, 1369–1384. Orban, G. A. (2008). Higher order visual processing in macaque extrastriate cortex. Physi ological Reviews, 88, 59–89. O’Regan, J. K., & Noë, A. (2001). A sensorimotor account of visual and visual conscious ness. Behavioral and Brain Sciences, 24, 939–1031. Ozaki, I., & Hashimoto, I. (2007). Human tonotopic maps and their rapid task-related changes studied by magnetic source imaging. Canadian Journal of Neurological Sciences, 34, 146–153.

Page 29 of 37

Perceptual Disorders Pambakian, A. L. M., Mannan, S. K., Hodgson, T. L., & Kennard, C. (2004). Saccadic visual search training: A treatment for patients with homonymous hemianopia. Journal of Neu rology, Neurosurgery, and Psychiatry, 75, 1443–1448. Pardini, M., Huey, E. D., Cavanagh, A. L., & Grafman, J. (2009). Olfactory function in corti cobasal syndrome and frontotemporal dementia. Archives of Neurology, 66, 92–96. Pavese, A., Coslett, H. B., Saffran, E., & Buxbaum, L. (2002). Limitations of attentional orienting: Effects of abrupt visual onsets and offsets on naming two objects in a patient with simultanagnosia. Neuropsychologia, 40, 1097–1103. Pegna, A. J., Caldara-Schnetzer, A. S., & Khateb, A. (2008). Visual search for facial expres sions of emotion is less affected in simultanagnosia. Cortex, 44, 46–53. Pell, M. D. (1998). Recognition of prosody following unilateral brain lesion: influence of functional and structural attributes of prosodic contours. Neuropsychologia, 36, 701–715. Peretz, I., Kolinsky, R., Tramo, M., Labrecque, R., Hublet, C., Demeurisse, G., & Belleville, S. (1994). Functional dissociations following bilateral lesions of auditory cortex. Brain, 117, 1283–1301. Pisella, L., Sergio, L., Blangero, A., Torchin, H., Vighetto, A., & Rosetti, Y. (2009). Optic ataxia and the function of the dorsal stream: Contributions to perception and action. Neu ropsychologia, 47, 3033–3044. Plaisier, M. A., Tiest, W. M., & Kappers, A. M. (2009). Salient features in 3-D haptic shape perception. Attention Perception & Psychophysics, 71, 421–430. Plant, G. T., Laxer, K. D., Barbaro, N. M., Schiffman, J. S., & Nakayama, K. (1993). Im paired visual motion perception in the contralateral hemifield following unilateral posteri or cerebral lesions in humans. Brain, 116, 1303–1335. Pollatos, O., Albrecht, J., Kopietz, R., Linn, J., Schoepf, V., Kleemann, A. M., Schreder, T., Schandry, R., & Wiesmann, M. (2007). 
Reduced olfactory sensitivity in subjects with de pressive symptoms. Journal of Affective Disorders, 102, 101–108. Pollen, D. A. (2008). Fundamental requirements for primary visual perception. Cerebral Cortex, 18, 1991–1998. Polster, M. R., & Rose, S. B. (1998). Disorders of auditory processing: Evidence for modu larity in audition. Cortex, 34, 47–65. Porro, C. A., Lui, F., Facchin, P., Maieron, M., & Baraldi, P. (2005). Percept-related activity in the human somatosensory system: functional magnetic resonance imaging studies. Magnetic Resonance Imaging, 22, 1539–1548.

Page 30 of 37

Perceptual Disorders Postma, A., Sterken, Y., de Vries, L., & de Haan, E. H. E. (2000). Spatial localization in pa tients with unilateral posterior left or right hemisphere lesions. Experimental Brain Re search, 134, 220–227. Rainville, C., Joubert, S., Felician, O., Chabanne, V., Ceccaldi, M., & Peruch, P. (2006). Wayfinding in familiar and unfamiliar environments in a case of progressive topographi cal agnosia. Neurocase, 11, 297–309. Rankin, K. P. (2007). Social cognition in frontal injury. In B. L. Miller & J. L. Cummings (Eds.), The human frontal lobes. Functions and disorders (2nd ed., pp. 345–360). New York, London: The Guilford Press. Rayner, K. (1998). Eye movements in reading and visual information processing: 20 Years of research. Psychological Bulletin, 124, 372–422. Reed, C. L., Caselli, R. J., & Farah, M. J. (1996). Tactile agnosia: Underlying impairment and implications for normal tactile object recognition. Brain, 119, 875–888. Rentschler, I., Treutwein, B., & Landis, T. (1994). Dissociation of local and global process ing in visual agnosia. Vision Research, 34, 963–971. Rice, N. J., McIntosh, R. D., Schindler, I., Mon-Williams, M., Demonet, J. F., & Milner, A. D. (2006). Intact automatic avoidance of obstacles in patients with visual form agnosia. Ex perimental Brain Research, 174, 76–88. Riddoch, M. J., Humphreys, G. W., Akhtar, N., Allen, H., & Bracewell, R. M., & Schofield, A. J. (2008). A tale of two agnosias: Distinctions between form and integrative agnosia. Cognitive Neuropsychology, 25, 56–92. Riddoch, M. J., Johnston, R. A., Bracewell, R. M., Boutsen, L., & Humphreys, G. W. (2008). Are faces special? A case of pure prosopagnosia. Cognitive Neuropsychology, 25, 3–26. Rizzo, M., Nawrot, M., & Zihl, J. (1995). Motion and shape perception in cerebral akine topsia. Brain, 118, 1105–1127. Rizzo, M., Smith, V., Pokorny, J., & Damasio, A. R. (1993). Colour perception profiles in central achromatopsia. Neurology, 43, 995–1001. 
Rizzo, M., & Vecera, S. P. (2002). Psychoanatomical substrates of Balint’s syndrome. Jour nal of Neurology, Neurosurgery, and Psychiatry, 72, 162–178. (p. 210)

Roark, D. A., Barrett, S. E., Spence, M. J., Abdi, H., & O’Toole, A. J. (2003). Psycho

logical and neural perspectives on the role of motion in face recognition. Behavioral & Cognitive Neuroscience Reviews, 2, 15–46. Robinson, D., & Podoll, K. (2000). Macrosomatognosia and microsomatognosia in mi graine art. Acta Neurologica Scandinavica, 101, 413–416.

Page 31 of 37

Perceptual Disorders Roessner, V., Bleich, S., Banaschewski, T., & Rothenberger, A. (2005). Olfactory deficits in anorexia nervosa. European Archives of Psychiatry & Clinical Neuroscience, 255, 6–9. Rowe, F., Brand, D., Jackson, C. A., Price, A., Walker, L., Harrison, S., Eccleston, C., Scott C., Akerman, N., Dodridge, C., Howard, C., Shipman, T., Sperring, U., MacDiarmid, S., & Freeman, C. (2009). Visual impairment following stroke: Do stroke patients require vision assessment? Age and Ageing, 38, 188–193. Russ, B. E., Lee, Y. S., & Cohen, Y. E. (2007). Neural and behavioral correlates of auditory categorization. Hearing Research, 229, 204–212. Saetti, M. C., De Renzi, E., & Comper, M. (1999). Tactile morphagnosia secondary to spa tial deficits. Neuropsychologia, 37, 1087–1100. Samson, S., Zatorre, R. J., & Ramsay, J. O. (2002). Deficits of musical timbre perception af ter unilateral temporal-lobe lesion revealed with multidimensional scaling. Brain, 125, 511–523. Satoh, M., Takeda, K., Murakami, Y., Onouchi, K., Inoue, K., & Kuzuhara, S. (2005). A case of amusia caused by the infarction of anterior portion of bilateral temporal lobes. Cortex, 41, 77–83. Saumier, D., Arguin, M., Lefebvre, C., & Lassonde, M. (2002). Visual object agnosia as a problem in integrating parts and part relations. Brain & Cognition, 48, 531–537. Saygin, A. P. (2007). Superior temporal and premotor brain areas necessary for biological motion perception. Brain, 130, 2452–2461. Saygin, A. P., Dick, F., Wilson, S. M., Dronkers, N. F., & Bates, E. (2003). Neural resources for processing language and environmental sounds: Evidence from aphasia. Brain, 126, 928–945. Schenk, T. (2006). An allocentric rather than perceptual deficit in patient DF. Nature Neu roscience, 9, 1369–1370. Schenk, T., Mai, N., Ditterich, J., & Zihl, J. (2000). Can a motion-blind patient reach for moving objects? European Journal of Neuroscience, 12, 3351–3360. Schiller, P. H., & Tehovnik, E. J. (2001). 
Look and see: How the brain moves your eyes about. Progress in Brain Research, 134, 127–142. Schiller, P. H., & Tehovnik, E. J. (2005). Neural mechanisms underlying target selection with saccadic eye movements. Progress in Brain Research, 149, 157–171. Schuett, S., Heywood, C. A., Kentrigde, R. W., & Zihl, J. (2008a). The significance of visual information processing in reading: Insights from hemianopic dyslexia. Neuropsychologia, 46, 2445–2462.

Page 32 of 37

Perceptual Disorders Schuett, S., Heywood, C. A., Kentrigde, R. W., & Zihl, J. (2008b). Rehabilitation of hemi anopic dyslexia: are words necessary for re-learning oculomotor control? Brain, 131, 3156–3168. Schuppert, M., Munte, T. F., Wieringa, B. M., & Altenmuller, E. (2000). Receptive amusia: Evidence for cross-hemispheric neural networks underlying music processing strategies. Brain, 123, 546–559. Schutz-Bosbach, S., & Prinz, W. (2007). Perceptual resonance: Action-induced modulation of perception. Trends in Cognitive Sciences, 11, 349–355. Scott, S. K., Young, A. W., Calder, A. J., Hellawell, D. J., Aggleton, J. P., & Johnson, M. (1997). Impaired auditory recognition for fear and anger following bilateral amygdala le sions. Nature, 385, 254–257. Shah, M., Deeb, J., Fernando, M., Noyce, A., Visentin, E., Findley, L. J., & Hawkes, C. H. (2009). Abnormality of taste and smell in Parkinson’s disease. Parkinsonism & Related Disorders, 15, 232–237. Shinn-Cunningham, B. G., & Best, V. (2008). Selective attention in normal and impaired hearing. Trends in Amplification, 12, 283–299. Shivashankar, N., Shashikala, H. R., Nagaraja, D., Jayakumar, P. N., & Ratnavalli, E. (2001). Pure word deafness in two patients with subcortical lesions. Clinical Neurology & Neurosurgery, 103, 201–205. Short, R. A., & Graff-Radford, N. R. (2001). Localization of hemiachromatopsia. Neuro case, 7, 331–337. Sigala, N. (2004). Visual categorization and the inferior temporal cortex. Behavioural Brain Research, 149, 1–7. Simon, S. A., de Araujo, I. E., Gutierrez, R., & Nicolelis, M. A. L. (2006). The neural mech anisms of gestation: a distributed processing code. Nature Reviews Neuroscience, 7, 890– 901. Singh-Curry, V., & Husain, M. (2009). The functional role of the inferior parietal lobe in the dorsal and ventral stream dichotomy. Neuropsychologia, 47, 1434–1448. Small, D. M., Bernasconi, N., Bernasconi, A., Sziklas, V., & Jones-Gotman, M. (2005). Gus tatory agnosia. Neurology, 64, 311–317. 
Smith, C. N., & Squire, L. R. (2008). Experience-dependent eye movements reflect hip pocampus-dependent (aware) memory. Journal of Neuroscience, 28, 12825–12833. Snowden, R. J., & Freeman, T. C. (2004). The visual perception of motion. Current Biology, 14, R828–R831.

Page 33 of 37

Perceptual Disorders Stephan, B. C. M., & Caine, D. (2009). Aberrant pattern of scanning in prosopagnosia re flects impaired face processing. Brain and Cognition, 69, 262–268. Stolbova, K., Hahn, A., Benes, B., Andel, M., & Treslova, L. (1999). Gustatometry of dia betes mellitus patients and obese patients. International Tinnitus Journal, 5, 135–140. Suchoff, I. B., Kapoor, N., Ciuffreda, K. J., Rutner, D., Han, E., & Craig, S. (2008). The fre quency of occurrence, types, and characteristics of visual field defects in acquired brain injury: A retrospective analysis. Optometry, 79, 259–265. Suzuki, W. A. (2009). Perception and the medial temporal lobe: Evaluating the current evi dence. Neuron, 61, 657–666. Symonds, C., & MacKenzie, I. (1957). Bilateral loss of vision from cerebral infarction. Brain, 80, 415–455. Takaiwa, A., Yoshimura, H., Abe, H., & Terai, S. (2003). Radical “visual capture” observed in a patient with severe visual agnosia. Behavioural Neurology, 14, 47–53. Takarae, Y., & Levin, D. T. (2001). Animals and artifacts may not be treated equally: differ entiating strong and weak forms of category-specific visual agnosia. Brain & Cognition, 45, 249–264. Tanaka, Y., Nakano, I., & Obayashi T. (2002). Environmental sound recognition after uni lateral subcortical lesions. Cortex, 38, 69–76. Taniwaki, T., Tagawa, K., Sato, F., & Iino, K. (2000). Auditory agnosia restricted to envi ronmental sounds following cortical deafness and generalized auditory agnosia. Clinical Neurology & Neurosurgery, 102, 156–162. Thomas, R., & Forde, E. (2006). The role of local and global processing in the recognition of living and nonliving things. Neuropsychologia, 44, 982–986. Tomberg, C., & Desmedt, J. E. (1999). Failure to recognise objects by active touch (astereognosia) results from lesion of parietal-cortex representation of finger kinaesthe sis. Lancet, 354, 393–394. (p. 211)

Tootell, R. B. H., Hadjikhani, N. K., Mendola, J. D., Marrett, S., & Dale, A. M. (1998). From retinotopy to recognition: fMRI in human visual cortex. Trends in Cognitive Sciences, 2, 174–183. Turnbull, O. H., Driver, J., & McCarthy, R. A. (2004). 2D but not 3D: Pictorial depth deficits in a case of visual agnosia. Cortex, 40, 723–738. Uc, E. Y., Rizzo, M., Anderson, S. W., Quian, S., Rodnitzky, R. L., & Dawson, J. D. (2005). Visual dysfunction in Parkinson disease without dementia. Neurology, 65, 1907–1923.

Page 34 of 37

Perceptual Disorders Ungerleider, L. G., & Mishkin, M. (1982). Two cortical systems. In D. J. Ingle, J. W. Mans field, & M. A. Goodale (Eds.), Advances in the analysis of visual behaviour (pp. 549–596). Cambridge, MA: MIT Press. Vaina, L. M., Makris, N., Kennedy, D., & Cowey, A. (1998). The selective impairment of the perception of first-order motion by unilateral cortical brain damage. Visual Neuroscience, 15, 333–348. Valenza, N., Ptak, R., Zimine, I., Badan, M., Lazeyras, F., & Schnider, A. (2001). Dissociat ed active and passive tactile shape recognition: A case study of pure tactile apraxia. Brain, 24, 2287–2298. Vallar, G., & Ronchi R. (2009). Somatoparaphrenia: A body delusion. A review of the neu ropsychological literature. Experimental Brain Research, 192, 533–551. VandenBos, G. R. (Ed.). (2007). APA dictionary of psychology. Washington, DC: American Psychological Association. Venn, H. R., Watson, S., Gallagher, P., & Young, A. H. (2006). Facial expression percep tion: An objective outcome measure for treatment studies in mood disorders? Internation al Journal of Neuropsychopharmacology, 9, 229–245. Vignolo, L. A. (2003). Music agnosia and auditory agnosia: Dissociations in stroke pa tients. Annals of the New York Academy of Sciences, 999, 50–57. Walsh, Th. J. (1985). Blurred vision. In Th. J. Walsh (Ed.) Neuro-ophthalmology: Clinical signs and symptoms (pp. 343–385). Philadelphia: Lea & Febiger. Wang, E., Peach, R. K., Xu, Y., Schneck, M., & Manry, C. (2000). Perception of dynamic acoustic patterns by an individual with unilateral verbal auditory agnosia. Brain & Lan guage, 73, 442–455. Wang, X., Lu, T., Bendor, D., & Bartlett, E. (2008). Neural coding of temporal information in auditory thalamus and cortex. Neuroscience, 157, 484–494. Wang, W. J., Wu, X. H., & Li, L. (2008). The dual-pathway model of auditory signal pro cessing. Neuroscience Bulletin, 24, 173–182. Weinberger, N. M. (2007). 
Auditory associative memory and representational plasticity in the primary auditory cortex. Hearing Research, 229, 54–68. Wermer, M. J., Donswijk, M., Greebe, P., Verweij, B. H., & Rinkel, G. J. (2007). Anosmia af ter aneurysmal subarachnoid hemorrhage. Neurosurgery, 61, 918–922. Wierenga, C. E., Perlstein, W. M., Benjamin, M., Leonard, C. M., Rothi, L. G., Conway, T., Cato, M. A., Gopinath, K., Briggs, R., & Crosson, B. (2009). Neural substrates of object identification: Functional magnetic resonance imaging evidence that category and visual

Page 35 of 37

Perceptual Disorders attribute contribute to semantic knowledge. Journal of the International Neuropsychologi cal Society, 15, 169–181. Wilson, R. S., Arnold, S. E., Tang, Y., & Bennett, D. A. (2006). Odor identification and de cline in different cognitive domains in old age. Neuroepidemiology, 26, 61–67. Wright, B. A., & Zhang, Y. (2009). A review of the generalization of auditory learning. Philosophical Transactions of the Royal Society of London—Series B: Biological Sciences, 364, 301–311. Yamamoto, T. (2006). Neural substrates fort he processing of cognitive and affective as pects of taste in the brain. Archives of Histology & Cytology, 69, 243–255. Yang, J., Wu, M., & Shen, Z. (2006). Preserved implicit form perception and orientation adaptation in visual form agnosia. Neuropsychologia, 44, 1833–1842. Zaehle, T., Geiser, E., Alter, K., Jancke, L., & Meyer, M. (2008). Segmental processing in the human auditory dorsal stream. Brain Research, 1220, 179–190. Zatorre, R. J. (2007). There’s more to auditory cortex than meets the ear. Hearing Re search, 229, 24–30. Zeki, S. (1993). A vision of the brain. Oxford, UK: Blackwell Scientific. Zeki, S., & Bartels, A. (1998). The autonomy of the visual systems and the modularity of conscious vision. Proceedings of the Royal Society London B, 353, 1911–1914. Zihl, J. (1995a). Visual scanning behavior in patients with homonymous hemianopia. Neu ropsychologia, 33, 287–303. Zihl, J. (1995b). Eye movement patterns in hemianopic dyslexia. Brain, 118, 891–912. Zihl, J. (2011). Rehabilitation of cerebral visual disorders (2nd ed.). Hove, UK: Psychology Press. Zihl, J., & Hebel, N. (1997). Patterns of oculomotor scanning in patients with unilateral posterior parietal or frontal lobe damage. Neuropsychologia, 35, 893–906. Zihl, J., Sämann, Ph., Schenk, T., Schuett S., & Dauner, R. (2009). On the origin of line bi section error in hemianopia. Neuropsychologia, 47, 2417–2426. Zihl, J., von Cramon, D., & Mai, N. (1983). 
Selective disturbance of movement vision after bilateral brain damage. Brain, 106, 313–334. Zihl, J., von Cramon, D., Mai, N., & Schmid, C. (1991). Disturbance of movement vision af ter bilateral posterior brain damage. Further evidence and follow up observations. Brain, 114, 2235–2352.

Page 36 of 37

Perceptual Disorders

Josef Zihl

Josef Zihl is research group leader and head of the outpatient clinic for neuropsychology, Max Planck Institute of Psychiatry.


Varieties of Auditory Attention

Claude Alain, Stephen R. Arnott, and Benjamin J. Dyson
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0011

Abstract and Keywords

Research on attention is one of the major areas of investigation within psychology, neurology, and cognitive neuroscience. There are many areas of active investigation that aim to understand the brain networks and mechanisms that support attention, in addition to the relationship between attention and other cognitive processes like working memory, vigilance, and learning. This chapter focuses on auditory attention, with a particular emphasis on studies that have examined the neural underpinnings of sustained, selective, and divided attention. The chapter begins with a brief discussion regarding the possible role of attention in the formation and perception of sound objects as the underlying units of selection. The similarities and differences in neural networks supporting various aspects of auditory attention, including selective attention, sustained attention, and divided attention are then discussed. The chapter concludes with a description of the neural networks involved in the control of attention and a discussion of future directions.

Keywords: attention, perception, cognitive neuroscience, working memory

Modern research in psychology and neuroscience has shown that attention is not a unitary phenomenon and that attentional processes may vary as a function of the sensory modality and the task at hand. In audition, one can think of attention in terms of modes or types of processes that might be engaged during everyday listening situations, namely sustained attention, selective attention, and divided attention. Each plays an important role in solving complex listening situations that are often illustrated colloquially using the cocktail party example, although in most everyday situations, all three modes of attention can be called on, depending on the context or situation. Imagine, for example, waiting to be seated at a restaurant. While waiting, it is sustained attention that enables us to monitor the auditory environment for a particular event (i.e., the maître d' to call out your name). During the wait, we may also start to selectively attend to an interesting conversation occurring within the general dining cacophony, thereby leading to a division of our attentional resources between the conversation and the maître d'. Common to all three aspects of attention (i.e., sustained, selective, divided) are processes that allow us to switch from one auditory source to another as well as to switch from one attentional mode to another. Auditory attention is both flexible and dynamic in that it can be driven by external events such as loud sounds (e.g., a plate smashing on the floor) as well as by internal goal-directed actions that enable listeners to prioritize and selectively process task-relevant sounds at a deeper level (e.g., what the maître d' is saying to the restaurant manager about the reservation), often at the expense of other, less relevant stimuli. This brings up another important issue in research related to auditory attention, and that is the role of bottom-up, data-driven attentional capture (e.g., one’s initial response to a fire alarm) in relation to top-down, controlled attention (e.g., the voluntary control of attention often associated with goal-directed behavior). This distinction between exogenous and endogenous factors appears inherent to all three modes of auditory attention.

Owing to the fact that current theories of attention have been developed primarily to account for visual scene analysis, as well as the fact that early work on attention demonstrated greater equity between vision and audition, there has been a tendency to assume that general principles derived from research on visual attention can also be applied to situations that involve auditory attention. Despite these links, analogies between auditory and visual attention may be misleading. For instance, in vision, it appears that attention can be allocated to a particular region of retinal space. However, the same does not necessarily apply to audition, in the sense that as far as it is known, there is no topographic representation of auditory space in the human brain.
Although evidence suggests that we can allocate auditory attention to various locations of retinal space, we can also attend to sounds that are outside of our sight (e.g., Tark & Curtis, 2009), which makes auditory attention particularly important for monitoring changes that occur outside our visual field. However, it is important to point out that we do not necessarily have to actively monitor our auditory environment in order to notice occasional or peculiar changes in sounds occurring outside our visual field. Indeed, there is ample evidence from scalp recordings of event-related potentials (ERPs) showing that infrequent changes in the ongoing auditory scene are automatically detected and can trigger attention to them (Näätänen et al., 1978; Picton et al., 2000; Winkler et al., 2009). These changes in the auditory environment may convey important information that could require an immediate response, such as a car horn that is increasing in intensity. In that respect, audition might be thought of as being at the service of vision, especially in situations that require the localization of sound sources in the environment (Arnott & Alain, 2011; Kubovy & Van Valkenburg, 2001). Such considerations set the scene for the following discussion, in which a number of central questions pertaining to attention and auditory cognitive neuroscience will be tackled. What are the similarities and differences in the brain areas associated with exogenous and endogenous auditory attention? What is the neural network that enables us to switch attention between modalities or tasks? Are there different networks for switching attention between auditory spatial locations and objects? If so, are these the same as those used in the visual modality? What are the neural networks supporting the engagement and disengagement of auditory attention?

This chapter focuses on the psychological and neural mechanisms supporting auditory attention. We review studies on auditory attention, with a particular emphasis on human studies, although we also consider animal research when relevant. Our review is by no means exhaustive but rather aims to provide a theoretical framework upon which future research can be generated. We begin with a brief discussion regarding the limits of attention and explore the possible mechanisms involved in the formation of auditory objects and the role of attention in sound object formation. We then discuss the similarities and differences in neural networks supporting various aspects of auditory attention, including selective attention, sustained attention, and divided attention. We conclude by describing the neural networks involved in the control of attention and discuss future directions.

What Are We Attending To?

Although the auditory environment usually comprises a myriad of sounds from various physical sources, only a subset of those sounds “enter” our awareness. Continuing with our earlier analogy, for instance, we may start to listen to a nearby conversation while the maître d speaks with other patrons. The sounds from both conversations may overlap in time, yet we can follow and switch from one conversation to the other with seeming effortlessness: each conversation appears as a separate auditory stream, with our attention alternating between them at will.

During the past 30 years, researchers have identified numerous cues that help listeners sort the incoming acoustic data into distinct sound sources, hereafter referred to as auditory objects or streams. For instance, sounds with common spatial locations, onsets, and spectral profiles usually originate from the same physical source and therefore are usually assigned to the same perceptual object (Alain & Bernstein, 2008; Bregman, 1990; Carlyon, 2004). The sounds that surround us often change in a predictable manner such that a certain sound may lead us to expect the next one. While listening to the maître d, we anticipate the upcoming sounds that indicate that our table is ready. Knowledge and experience with the auditory environment are particularly helpful in solving complex listening situations in which sounds from one source may partially overlap and mask those from another source.

In everyday listening situations, our auditory scene changes constantly, and observers must be able to keep track of multiple sound objects. Once an auditory scene has been parsed into its component objects, selectively attending to one stream (e.g., shouts from the kitchen) while ignoring all the other talkers (e.g., maître d) and background noise becomes crucial for effective communication, especially in acoustically adverse or “cocktail party” environments (Cherry, 1953; Moray & O’Brien, 1967).
Some visual models have characterized attention in spatial or frequency terms, likening attention to a “spotlight” or “filter” that moves around, applying processing resources to whatever falls within a selected spatial region (e.g., Brefczynski & DeYoe, 1999; LaBerge, 1983; McMains & Somers, 2004). Other models discuss resource allocation on the basis of perceptual objects, in which attending to a particular object enhances processing of all features of that object (e.g., Chen & Cave, 2008). Recent models of auditory attention have been consistent with the latter conception of attention in that the underlying units of selection are discrete objects or streams, and that attending to one component of an auditory object facilitates the processing of other properties of the same object (Alain & Arnott, 2000; Shinn-Cunningham, 2008).

In the visual domain, the object-based account of attention was proposed to explain why observers are better at processing visual features that belong to the same visual object than when those visual features are distributed between different objects (Duncan, 1984; Egly et al., 1994). For instance, Duncan (1984) showed that performance declined when individuals were required to make a single judgment about each of two visually overlapping objects (e.g., the size of one object and the texture of the other object), compared with when those two judgments had to be made about a single object. The robustness of the findings was demonstrated not only in behavioral experiments (Baylis & Driver, 1993) but also in neuroimaging studies employing functional magnetic resonance imaging (fMRI; e.g., O’Craven et al., 1999; Yantis & Serences, 2003) and ERPs (e.g., Valdes-Sosa et al., 1998). Such findings indicated that attention is preferentially allocated to a visual object such that the processing of all features belonging to an attended object is facilitated. Alain and Arnott (2000) proposed an analogous object-based account for audition in which listeners’ attention is allocated to auditory objects derived from the ongoing auditory scene according to perceptual grouping principles (Bregman, 1990). More recently, Shinn-Cunningham (2008) has drawn a parallel between object-based auditory and visual attention, proposing that perceptual objects form the basic units of auditory attention.
Central to the object-based account of auditory attention is the notion that several perceptual objects may be simultaneously accessible for selection and that interactions between object formation and selective attention determine how competing sources interfere with perception (Alain & Arnott, 2000; Shinn-Cunningham, 2008). Although there is some behavioral evidence supporting the notion that objects can serve as an organizational principle in auditory memory (Dyson & Ishfaq, 2008; Hecht et al., 2008), further work is required to understand how sound objects are represented in memory when these objects occur simultaneously. Moreover, the object-based account of auditory attention posits that perceptual objects form the basic unit for selection rather than individual acoustic features of the stimuli such as pitch and location (Mondor & Terrio, 1998). However, in many studies, it is often difficult to distinguish feature- and object-based attention effects because sound objects are usually segregated from other concurrent sources using simple acoustic features such as pitch and location (e.g., the voice and location of the maître d in the restaurant). Therefore, how do we know that attention is allocated to a sound object rather than its defining acoustic features? One way to distinguish between feature- and object-based attentional accounts is to pit the grouping of stimuli into perceptual objects against the physical proximity of features (e.g., Alain et al., 1993; Alain & Woods, 1993, 1994; Arnott & Alain, 2002a; Bregman & Rudnicky, 1975; Driver & Baylis, 1989).


Figure 11.1 A, Schemata of the stimuli presented in three different conditions. In the evenly spaced (ES) condition, tones were equally spaced along the frequency axis (tone 1 = 1,048, tone 2 = 1,482, and tone 3 = 2,096 Hz). In the clustering easy (CE) condition, the two task-irrelevant tones (i.e., tone 2 = 1,482 and tone 3 = 1,570 Hz) were clustered based on frequency to promote the segregation of the task-relevant frequency (i.e., tone 1). In the clustering hard (CH) condition, the task-relevant frequency was clustered with the middle distractors (tone 1 = 1,400 Hz). Arrows indicate the frequency to be attended in each condition. Targets (defined as longer or louder than other stimuli) are shown by asterisks. B, Response time to infrequent duration targets embedded in the attended stream. C, Group mean difference wave between event-related brain potentials elicited by the same stimuli when they were attended and unattended in all three conditions from the midline frontal scalp region. The negativity is plotted upward. Adapted from Alain & Woods, 1994.

Figure 11.1A shows an example in the auditory modality where the physical similarity between three different streams of sounds was manipulated to promote perceptual grouping while at the same time decreasing the physical distance between task-relevant and task-irrelevant stimuli. In that study, Alain and Woods (1993) presented participants with rapid transient pure tones that varied along three different frequencies in random order. In the example shown in Figure 11.1A, participants were asked to focus their attention on the lowest pitch sound in order to detect slightly longer (Experiment 1) or louder (Experiment 2) target stimuli while ignoring the other sounds. In one condition, the tones composing the sequence were evenly spaced along the frequency domain, whereas in another condition, the two task-irrelevant frequencies were grouped together by making the extreme sound (lowest or highest, depending on the condition) more similar to the middle pitch tone. Performance improved when the two task-irrelevant sounds were grouped together even though this meant having more distractors closer in pitch to the task-relevant stimuli (see Figure 11.1B). Figure 11.1C shows the effects of perceptual grouping on the selective attention effects on ERPs, which were isolated in the difference wave between the ERPs elicited by the same sounds when they were task relevant and when they were task irrelevant. Examining the neural activity that occurred in the brain during these types of tasks suggested that perceptual grouping enhances selective attention effects in auditory cortices, primarily along the Sylvian fissure (Alain et al., 1993; Alain & Woods, 1994; Arnott & Alain, 2002a). Taken together, these studies show that perceptual grouping can override physical similarity effects during selective listening, and suggest that sound objects form the basic unit for attentional selection.
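The difference wave mentioned above is a point-by-point subtraction of two averaged ERPs elicited by the same sounds under different attention conditions. The following is a minimal sketch of that arithmetic only; the function name and the sample values are hypothetical and invented for illustration, and they are not data from the studies cited.

```python
def difference_wave(attended, unattended):
    """Subtract the unattended ERP from the attended ERP, sample by sample.

    Both inputs are averaged ERPs (in microvolts) sampled at the same time points.
    """
    if len(attended) != len(unattended):
        raise ValueError("ERPs must have the same number of samples")
    return [a - u for a, u in zip(attended, unattended)]


# Hypothetical averaged ERPs at five time points (values invented for illustration)
attended = [0.0, -1.5, -4.0, -2.0, 0.5]
unattended = [0.0, -1.0, -2.5, -1.5, 0.5]

diff = difference_wave(attended, unattended)
print(diff)  # negative values indicate a larger negativity when the sounds were attended
```

Because the physical stimuli are identical in the two conditions, whatever remains after the subtraction can be attributed to the change in task relevance rather than to the acoustics.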

Figure 11.2 A, Schematic representation of harmonic complex tones (each horizontal line represents a pure tone) used in studies that have examined the role of attention on concurrent sound segregation. Participants were presented with a harmonic complex that had all tonal elements in tune (fusion) or included a mistuned harmonic. In the active listening task, participants indicated whether they heard one sound or two sounds (i.e., a buzz plus another sound with a pure tone quality). In the passive listening condition, participants watched a muted movie of their choice with subtitles. B, Auditory event-related potentials (ERPs) to complex harmonic tones were measured over the right frontal-central scalp region (FC2). The difference wave reveals the object-related negativity (ORN), an ERP component that indexes the perception of concurrent sound objects. Note the similarity in ORN amplitude during active listening (participants indicated whether they heard one sound or two concurrent sound objects) and passive listening (participants watched a muted subtitled movie of their choice). During active listening, the ORN is followed by a positive wave (P400) thought to be related to the perceptual decision. Adapted with permission from Alain, Arnott, & Picton, 2001.


One important assumption of the object-based account of auditory attention is that sound objects are created and segregated independently of attention and that selection for further processing takes place after this initial partition of the auditory scene into its constituent objects. Although there is evidence to suggest that sound segregation may occur independently of listeners’ attention, there are also some findings that suggest otherwise (Shamma et al., 2011). The role of attention in perceptual organization has been investigated for sounds that occur concurrently as well as for sounds that are sequentially grouped into distinct perceptual streams. In the case of concurrent sound segregation (Figure 11.2), the proposal that concurrent sound segregation is not under volitional control was confirmed in ERP studies using passive listening (Alain et al., 2001a, 2002; Dyson & Alain, 2004) as well as active listening paradigms that varied auditory (Alain & Izenberg, 2003) or visual attentional demands (Dyson et al., 2005). For sequential sound segregation, the results are more equivocal.

Figure 11.3 A, Schematic representation of the ABA paradigm often used to investigate auditory stream segregation. The loss of rhythmic information is indicative of stream segregation. B, Likelihood of reporting hearing two concurrent streams of sounds as a function of the frequency separation between tone A and tone B. C, Scalp-recorded auditory event-related potentials using sequences similar to those shown in A. Note the similarities in auditory event-related potentials (ERPs) recorded during the attend and ignore conditions. The effects of frequency separation on ERPs recorded during active and passive listening were not statistically different, suggesting that encoding of ΔF (difference in frequency), which determines streaming, is little affected by attention. Adapted with permission from Snyder, Alain, & Picton, 2006.

In most studies, the effect of attention on sequential sound organization has been investigated by mixing two sets of sound sequences that differ in terms of some acoustic feature (e.g., in the frequency range of two sets of interleaved pure tones). In a typical frequency paradigm, sounds are presented in patterns of “ABA—ABA—”, in which “A” and “B” are tones of different frequencies and “—” is a silent interval (Figure 11.3A). The greater the stimulation rate and the feature separation, the more likely, and the more rapidly, listeners are to report hearing two separate streams of sounds (i.e., one of A’s and another of B’s), with this type of perceptual organization or stream segregation taking several seconds to build up. Using similar sequences as those shown in Figure 11.3A, Carlyon and colleagues found that the buildup of stream segregation was affected by auditory (Carlyon et al., 2001) and visual (Cusack et al., 2004) attention, and they proposed that attention may be needed for stream segregation to occur. Also consistent with this hypothesis are findings from neuropsychological studies in which patients with unilateral neglect following a brain lesion show impaired buildup in streaming relative to age-matched controls when stimuli are presented to the neglected side (Carlyon et al., 2001).

In contrast to the pro-attention studies reviewed above, there is also evidence to suggest that attention may not be required for sequential perceptual organization to occur. For instance, patients with unilateral neglect who are unaware of sounds presented to their neglected side experience the “scale illusion” (Deutsch, 1975), which can occur only if the sounds from the left and right ears are grouped together (Deouell et al., 2008). Such findings are difficult to reconcile with a model invoking a required role of attention in stream segregation and suggest that some organization must be taking place outside the focus of attention, as others have suggested previously (e.g., Macken et al., 2003; McAdams & Bertoncini, 1997; Snyder et al., 2006). This apparent discrepancy could be reconciled by assuming that sequential stream segregation relies on multiple levels of representations (Snyder et al., 2009), some of which may be more sensitive to volitional control (Gutschalk et al., 2005; Snyder et al., 2006).
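The ABA pattern with its silent interval can be sketched as a simple sequence of tone frequencies. This is a minimal illustration under stated assumptions: the function name, the 500-Hz A tone, and the use of semitones to express the frequency separation are choices made here for the example, not details taken from the cited experiments.

```python
def aba_sequence(freq_a_hz, delta_semitones, n_triplets):
    """Build an ABA- sequence as a list of tone frequencies in Hz.

    The B tone lies delta_semitones above the A tone; None marks the
    silent interval that closes each ABA triplet.
    """
    freq_b_hz = freq_a_hz * 2 ** (delta_semitones / 12)
    triplet = [freq_a_hz, freq_b_hz, freq_a_hz, None]
    return triplet * n_triplets


# Assumed, illustrative parameters: A = 500 Hz, B one octave (12 semitones) higher
sequence = aba_sequence(500.0, 12, n_triplets=2)
print(sequence)

# When the sequence splits perceptually, listeners report one stream of A's and one of B's
a_stream = [f for f in sequence if f == 500.0]
b_stream = [f for f in sequence if f is not None and f != 500.0]
print(len(a_stream), len(b_stream))
```

Heard as a single stream, the triplets produce a galloping rhythm; heard as two streams, the A tones and the B tones each form a steady, isochronous sequence, which is why the loss of the galloping rhythm is taken as a sign of segregation.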
In summary, evidence from ERP studies suggests that perceptual organization of acoustic features into sound objects can occur independently of attention (e.g., Alain & Izenberg, 2003; Alain & Woods, 1994; Snyder et al., 2006; Sussman et al., 2007). However, it is very likely that attention facilitates perceptual organization and that selective attention may determine which stream of sounds is in the foreground and which is in the background (i.e., figure–ground segmentation) (Sussman et al., 2005). These views are compatible with the object-based account of auditory attention in which primitive perceptual processes sort the incoming acoustic data into its constituent sources, allowing selective processes to work on the basis of meaningful objects (Alain & Arnott, 2000).

Mechanisms of Auditory Attention

As mentioned earlier, selective attention enables us to prioritize information processing such that only a subset of the vast sensory world (and internal thought) receives more in-depth analysis. In the last decade, there has been a growing interest in three important mechanisms that could serve to optimize the contrast between sounds of interest and those that are “task irrelevant.” These are enhancement (also referred to as gain), the sharpening of receptive fields for task-relevant information, and the possible suppression of task-irrelevant information (for a discussion of these ideas related to stimulus repetition, see Grill-Spector et al., 2006). The enhancement and suppression mechanisms were originally proposed to account for visual attention (e.g., Hillyard et al., 1998), and such models posit feature-specific enhancements in regions that are sensitive to the attended features as well as suppression in regions that are sensitive to the unattended (task-irrelevant) features.

Attention as an Enhancement and Suppression Mechanism to Enhance Figure–Ground Segregation

In the auditory attention literature, the notion of a gain mechanism was introduced early on (e.g., Hillyard et al., 1973), although it was not originally described as a feature-specific process. For instance, early studies in nonhuman primates showed that the firing rate of auditory neurons increased when sounds occurred at the attended location (Benson & Hienz, 1978) or when attention was directed toward auditory rather than visual stimuli (Hocherman et al., 1976), consistent with a mechanism that “amplifies” or enhances the representation of task-relevant stimuli. Electroencephalography (EEG; Hillyard et al., 1973; Woods et al., 1994) and magnetoencephalography (MEG; Ross et al., 2010; Woldorff et al., 1993) studies provide further evidence for neural enhancement during auditory selective attention. For instance, the amplitude of the N1 wave (negative wave at ∼100 ms after sound onset) from scalp-recorded auditory evoked potentials is larger when sounds are task-relevant and fall within an “attentional spotlight” compared with when the same sounds are task-irrelevant (Alho et al., 1987; Giard et al., 2000; Hansen & Hillyard, 1980; Hillyard et al., 1973; Woldorff, 1995; Woods et al., 1994; Woods & Alain, 2001).

Figure 11.4 Schematic representation of an experiment in which participants were presented with streams of sounds defined by the conjunction of pitch and location. In a typical experiment, participants were asked to listen to low pitch sounds in the left ear in order to detect occasional stimuli (i.e., target) that slightly differed from the standard stimuli along a third dimension (e.g., duration or intensity).

The notion of feature-specific enhancement and suppression as the principal mechanisms to enhance figure–ground segregation implies that selective attention would “amplify” neural activity in specific cortical areas that respond preferentially to particular stimulus attributes. Because the task-relevant stream of sounds in most neuroimaging studies of auditory selective attention is usually defined by the pitch or the location of the stimuli, the feature-specific enhancement and suppression hypothesis can be evaluated by comparing neural activity for identical stimuli when they are task relevant and task irrelevant. Such comparisons have revealed different ERP-selective attention effects for pitch and location attributes (Degerman et al., 2008; Woods & Alain, 2001). Figure 11.4 shows a schematic diagram of a typical experiment in which participants are presented with multidimensional stimuli. The deployment of stimuli constructed by the orthogonal combination of two frequencies (i.e., 250 and 4000 Hz) and two locations (left and right ear) has proved helpful for investigating feature-specific attention effects (e.g., attend to high tones) as well as object-based attention effects that rely on the conjunction of sound features (e.g., attend to high tones in the left ear). Using such paradigms under feature-specific conditions, Woods and colleagues have shown that auditory selective attention modulates activity in frequency-specific regions of auditory cortex (Woods et al., 1994; Woods & Alain, 2001). The fact that attention enhances the amplitude of the auditory evoked response from a tonotopically organized generator provides strong support for models that posit attentional gain of sensory processing (Hillyard et al., 1987; Woldorff et al., 1993). In addition to enhanced responses to features deemed task relevant by virtue of the task instructions, object-based (or conjunctive) attentional effects have also been observed, including neural facilitation for objects expressed both during and after featural processing (e.g., Woods et al., 1994; Woods & Alain, 1993, 2001). For example, in complex listening situations in which target sounds are defined by a combination of features, nontarget sounds that share either frequency or location features with the targets have also shown attention-related effects that differ in amplitude distribution (Woods et al., 1994; Woods & Alain, 1993, 2001). Differences in amplitude distribution indicate that attention modulates neural activity arising from different cortical fields related to the processing of different sound features.
Such findings are also consistent with the dual-pathway model of auditory scene analysis (Rauschecker & Scott, 2009; Rauschecker & Tian, 2000) in which sound identity and sound location are preferentially processed along ventral (what) and dorsal (where) pathway streams. There is also evidence that such gain mechanisms play an important role during sustained selective attention to a single speech stream embedded in a multiple-talker environment (Kerlin et al., 2010).

Although the evidence for feature-based and object-based enhancement is compelling, the evidence for suppression of auditory stimuli occurring outside the attentional spotlight is more equivocal. There are some findings consistent with an active suppression mechanism. For example, the amplitude of the P2 wave (positive deflection at ∼180 ms after sound onset) from the auditory ERPs elicited by task-irrelevant sounds is larger during intramodal attention than during intermodal (auditory-visual) attention (Degerman et al., 2008; Michie et al., 1993). Although the enhanced P2 amplitude during intramodal attention may reflect an active suppression mechanism, there is no evidence that the suppression is feature specific. Moreover, one cannot rule out the possibility that during intermodal attention, participants’ attention may have wandered to the auditory stimuli, thereby modulating the amplitude of ERPs (i.e., increased negativity) to the so-called ‘unattended’ stimuli. Therefore, the higher P2 amplitude observed during selective attention tasks may not reflect suppression, but instead may simply indicate attention effects during the control baseline condition. In a more recent study, Munte et al. (2010) measured auditory evoked responses from two locations, each containing a spoken story and bandpass-filtered noise. Participants were told to focus their attention on a designated story/location.
Consistent with prior research, the N1 elicited by the speech probe was found to be larger at the attended than at the unattended location. More importantly, the noise probes from the task-relevant story’s location showed a more positive frontal ERP response at about 300 ms than did the probes at the task-irrelevant location. Although the enhanced positivity may be indicative of a suppression mechanism, there is a possibility that it reflects novelty or target-related activity, which may comprise a positive wave that peaks about the same time. Moreover, using a similar design, but measuring the effect of sustained selective attention in the EEG power spectrum, Kerlin et al. (2010) found no evidence of suppression for the task-irrelevant stream of speech.

If suppression of task-irrelevant sounds does occur, what underlying mechanisms support it? It is known that the auditory system includes descending efferent pathways that are particularly important for modulating neural activity at the peripheral level. Nevertheless, it remains unclear whether such suppression would be an appropriate strategy in everyday listening situations. Although one could argue that a feature-specific suppression mechanism would help to prevent information overload, such a suppression mechanism could also have a negative impact in the sense that important information could be missed or undetected. Perhaps a more parsimonious account would be a facilitatory process that enhances representation of task-relevant information in sensory and short-term memory, while representations of task-irrelevant information would simply decay progressively without necessarily postulating active suppression. Indeed, increasing attentional demands reduces the amplitude of the mismatch negativity, an ERP component that reflects sensory memory, but has little impact on the amplitude of the N1 and P2 waves, which reflect sensory registration (Alain & Izenberg, 2003).
Thus, attention appears to play an important role in maintaining and enhancing sensory representations such that conditions that prevent listeners from attending to task-irrelevant sounds cause a processing deficit in detecting changes in the unattended stream of sounds (depending both on sensory memory and a comparison process between recent and incoming stimuli) (e.g., Alain & Woods, 1997; Arnott & Alain, 2002b; Trejo et al., 1995; Woldorff et al., 1991). This is akin to the model proposed by Cowan (1988, 1993), in which attention plays an important role in keeping “alive” sensory representations for further and more in-depth processing.


Attention Enhances Perception by Sharpening of Tuning Curve

Figure 11.5 A, Schematic representation of the stimuli embedded in notched noise around 1,000 Hz. B, Global field power showing the strength of the N100 auditory evoked response elicited by stimuli without masking noise (no mask) or as a function of width of the notched noise. Note that the greater the width of the notched noise, the larger the N100. C, As selective attention is made increasingly possible by the increase in noise width, the N100 peak arrives earlier and is more sharply tuned. Adapted with permission from Kauramaki et al., 2007.

In addition to enhancement and suppression mechanisms, the effects of attention may be mediated by a mechanism that selectively sharpens the receptive fields of neurons representing task-relevant sounds, a mechanism that may also enhance figure–ground separation. Kauramaki et al. (2007) used the notched noise technique, in which a pure tone is embedded within noise that has a spectral notch centered on the pure tone; the width of the notch is varied parametrically to ease target detection (Figure 11.5). Kauramaki et al. measured the N1 wave elicited by the pure tone during selective attention and found a decrease in attention effects when the width of the notched noise was decreased. However, the shape of the function was significantly different from the multiplicative one expected on the basis of a simple gain model of selective attention (see also Okamoto et al., 2007). According to Kauramaki et al. (2007), auditory selective attention in humans cannot be explained by a gain model, whereby only the neural activity level is increased. Rather, selective attention additionally enhances auditory cortex frequency selectivity. This effect of selective attention on frequency tuning evolves rapidly, within a few seconds after attention switching, and appears to occur for neurons in nonprimary auditory cortices (Ahveninen et al., 2011).

In summary, attentional effects on sensory response functions include an increase in gain and sharpening of tuning curves that appears to be specific to the task-relevant feature. This feature-specific attention effect is also accompanied by a more general suppression response, although the evidence for such suppression in the auditory domain remains equivocal. Both gain and suppression mechanisms, as well as sharper receptive field tuning, may enhance figure–ground segregation, thereby easing the monitoring and selection of task-relevant information.
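The contrast between a pure gain account and a sharpening account can be made concrete with a toy tuning curve. The sketch below is an illustration only: the Gaussian shape on a log-frequency axis, the function names, and all parameter values are assumptions made for the example, not a model taken from the studies cited above.

```python
import math


def tuning_response(freq_hz, best_freq_hz, bandwidth_oct, gain=1.0):
    """Response of a hypothetical neuron with Gaussian tuning on a log-frequency axis."""
    distance_oct = math.log2(freq_hz / best_freq_hz)
    return gain * math.exp(-0.5 * (distance_oct / bandwidth_oct) ** 2)


def selectivity(peak, off_response):
    """Ratio of the response at the best frequency to the response at a flanking frequency."""
    return peak / off_response


best, off = 1000.0, 1500.0  # assumed best frequency and an off-frequency probe

baseline_peak = tuning_response(best, best, bandwidth_oct=0.5)
baseline_off = tuning_response(off, best, bandwidth_oct=0.5)

# Pure gain: all responses are multiplied by the same factor, so selectivity is unchanged
gain_peak = tuning_response(best, best, bandwidth_oct=0.5, gain=1.5)
gain_off = tuning_response(off, best, bandwidth_oct=0.5, gain=1.5)

# Sharpening: a narrower bandwidth keeps the peak but lowers off-frequency responses
sharp_peak = tuning_response(best, best, bandwidth_oct=0.25)
sharp_off = tuning_response(off, best, bandwidth_oct=0.25)

print(selectivity(baseline_peak, baseline_off))
print(selectivity(gain_peak, gain_off))
print(selectivity(sharp_peak, sharp_off))
```

Under these assumptions, multiplicative gain raises the response everywhere without changing the ratio between on-frequency and off-frequency responses, whereas sharpening increases that ratio. This is the sense in which a change in frequency selectivity cannot be reduced to a change in overall activity level.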

Neural Network of Auditory Attention

The development of positron emission tomography (PET) and fMRI has allowed researchers to make major strides in identifying the brain areas that play an important role in auditory attention. In the next sections, we briefly review the brain areas involved in sustained, selective (intramodal and intermodal), and divided attention in an effort to draw commonalities as well as important differences in the neural networks supporting the various types of auditory attention.

Sustained Attention

The process of monitoring the auditory environment for a particular event (e.g., perhaps you are still waiting for the maître d to call out your name) has been studied in a number of ways in order to reveal the neural architecture supporting sustained attention. One example is the oddball paradigm, in which participants are asked to respond to infrequent target sounds in a train of nontarget stimuli. In oddball tasks, the target stimuli differ from the standard sounds along a particular dimension (e.g., pitch, duration, intensity, location). The detection of the task-relevant stimuli is typically accompanied by a parietal-central positive wave of the ERP, the P300 or P3b. Moreover, fMRI studies have identified several brain areas that show increased hemodynamic activity during the detection of these oddball targets relative to the nontarget stimuli, including the auditory cortex bilaterally, parietal cortex and prefrontal cortex, supramarginal gyrus, frontal operculum, and insular cortex bilaterally (Linden et al., 1999; Yoshiura et al., 1999). The increased fMRI signals for target versus nontarget conditions are consistent over various stimuli (i.e., auditory versus visual stimuli) and response modalities (i.e., button pressing for targets versus silently counting the targets) and can be regarded as specific for target detection in both the auditory modality and the visual modality (Linden et al., 1999). Interestingly, the amount of activity in the anterior cingulate and bilateral lateral prefrontal cortex, temporal-parietal junction, postcentral gyri, thalamus, and cerebellum was positively correlated with an increase in time between targets (Stevens et al., 2005). The fact that these effects were only observed for target and not novel stimuli suggests that the activity in these areas indexes the updating of a working memory template for the target stimuli or strategic resource allocation processes (Stevens et al., 2005).
Another task that requires sustained attention is the n-back working memory task, in which participants indicate whether the incoming stimulus matches the one occurring one, two, or three positions earlier. Alain and colleagues (2008) used a variant of this paradigm and asked their participants to press a button only when the incoming stimulus sound matched the previous one (1-back) in terms of identity (i.e., same sound category such as animal sounds, human sounds, or musical instrument sounds) or location. Distinct patterns of neural activity were observed for sustained attention and transient target-related activity. Figure 11.6 shows sustained task-related activity during sound identity and sound location processing after taking into account transient target-related activity. The monitoring of sound attributes recruited many of the areas previously mentioned for the oddball task, including auditory, parietal, and prefrontal cortices (see also Martinkauppi et al., 2000; Ortuno et al., 2002). Interestingly, changes in task instruction modulated activity in this attentional network, with greater activity in ventral areas, including the anterior temporal lobe and inferior frontal gyrus, when participants’ attention was directed toward sound identity and greater activity in dorsal areas, including the inferior parietal lobule, superior parietal cortex, and superior frontal gyrus, when participants’ attention was directed toward sound location.
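The matching rule of the n-back task can be made concrete with a short Python sketch. The representation of each sound as a (category, location) pair, the function name, and the example trial list are all assumptions made for illustration; they do not reproduce the stimuli or procedure of Alain et al. (2008).

```python
def n_back_targets(stimuli, n=1, key=lambda sound: sound):
    """Return the indices of trials that match the trial `n` positions earlier.

    `key` selects the attribute to compare, so the same stimulus list can be
    scored for sound identity or for sound location.
    """
    return [
        i
        for i in range(n, len(stimuli))
        if key(stimuli[i]) == key(stimuli[i - n])
    ]


# Hypothetical trial list: each sound is a (category, location) pair.
sounds = [
    ("animal", "left"),
    ("animal", "right"),
    ("human", "right"),
    ("music", "left"),
    ("music", "left"),
]

# 1-back targets when attending to identity versus location.
identity_targets = n_back_targets(sounds, n=1, key=lambda sound: sound[0])
location_targets = n_back_targets(sounds, n=1, key=lambda sound: sound[1])

if __name__ == "__main__":
    print("identity targets:", identity_targets)  # [1, 4]
    print("location targets:", location_targets)  # [2, 4]
```

The same physical sequence yields different targets depending on the attended attribute, which is what allows task instruction to be varied while the stimuli are held constant.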

Figure 11.6 A, Schematic of the n-back task used to investigate sustained working memory for sound identity and sound location. B, Task differences in sustained activity during a working memory task. Warm colors indicate greater sustained activity during working memory for sound identity, and cool colors indicate greater sustained activity during working memory for sound location. The activation maps are displayed on the cortical surface using surface mapping (SUMA). IFG, inferior frontal gyrus; IPL, inferior parietal lobule; STG, superior temporal gyrus. Adapted with permission from Alain et al., 2008.

Although such data again suggest the relationship between specific regions or pathways within the brain and specific perceptual and cognitive function, it is important to consider the extent to which clear delineations can be made. For example, although the inferior parietal lobule (IPL) is clearly involved in spatial analysis and may play an important role in monitoring or updating sound source location in working memory, there is also plenty of evidence demonstrating its involvement in nonspatial processing (see Arnott et al., 2004). In fact, some of this activity may be accounted for in terms of an action–perception dissociation (Goodale & Milner, 1992) in which the dorsal auditory pathway brain regions are important for acting on objects and sounds in the environment (Arnott & Alain, 2011). For example, in a task requiring listeners to categorize various sounds as being “material” (i.e., malleable sheets of paper, Styrofoam, aluminium foil, or plastic being crumpled in a person’s hands), “human” (i.e., nonverbal vocalizations including coughing, yawning, snoring, and throat clearing), or “noise” (i.e., randomized material sounds), Arnott et al. (2008) found an increased blood-oxygen-level-dependent (BOLD) effect along a dorsal region, the left intraparietal sulcus (IPS), in response to the material sounds. A very similar type of activation was also reported by Lewis and colleagues when listeners attended to hand-related (i.e., tool) sounds compared with animal vocalizations (Lewis, 2006; Lewis et al., 2005). Both groups proposed that such sounds triggered a “mental mimicry” of the motor production sequences that most likely would have produced the sounds, with the left hemispheric activation reflecting the right-handed dominance of the participants. This explanation finds support in the fact that area hAIP, a region shown to be active not only during real hand-grasping movements but also during imagined movements, as well as passive observation of people grasping three-dimensional objects, is located proximally at the junction of the anterior IPS and inferior postcentral sulcus (Culham et al., 2006). Additionally, it is noteworthy that the IPS is an area of the IPL known to make use of visual input, and that it is particularly important for integrating auditory and visual information (Calvert, 2001; Macaluso et al., 2004; Meienbrock et al., 2007), especially with respect to guiding and controlling action in space (e.g., Andersen et al., 1997; Sestieri et al., 2006).
As noted earlier, knowledge about auditory material properties is largely dependent on prior visual experiences, at least for normally sighted individuals. Thus, it is plausible that the IPS auditory material–property activity reflects the integration of auditory input with its associated visual knowledge.

The above data remind us that successful interaction with the complex and dynamic acoustic environment that we ultimately face involves the coordination and integration of attentional demands from other modalities such as vision, the timely initiation of task-appropriate action, and the maintenance of attentional processes over long periods of time. Nevertheless, there are cases in which less important signals must be sacrificed for more important signals, and the brain regions associated with selective attention are those to which we now turn.

Selective Attention

Auditory selective attention was originally studied using dichotic listening situations in which two different streams of speech sounds were presented simultaneously in both ears (Broadbent, 1962; Cherry, 1953; for a review, see Driver, 2001). In such situations, participants were asked to shadow (repeat) the message presented in one ear while ignoring the speech sounds presented at the irrelevant location (i.e., the other ear). This intramodal attention (i.e., between streams of sounds) involves sustained attention to a particular stream of sounds in the midst of others, usually defined by its most salient features, such as pitch and location (e.g., Hansen & Hillyard, 1983). This form of selective attention differs from intermodal attention, whereby participants are presented with streams of auditory and visual stimuli and alternately focus on either auditory or visual stimuli in order to detect infrequent target stimuli. The neural networks supporting intramodal and intermodal selective attention are examined next in turn.

Intramodal Selective Attention

Intramodal auditory selective attention tasks engage a network of frontal, temporal, and parietal regions (Hill & Miller, 2009; Lipschutz et al., 2002), and activity in these areas appears to be sensitive to task demands. For instance, using identical stimuli, Hill and Miller (2009) found greater activity in dorsal brain regions when listeners were told to attend to the location of a particular talker in a multiple-talker situation, whereas more ventral activity was observed when participants attended to the pitch (voice) of the talker. Once again, this dissociation between spatial and nonspatial auditory attention is consistent with the general trend of showing greater activity in ventral (what) and dorsal (where) brain regions in auditory tasks that require sound identification or sound localization, respectively (Alain et al., 2001b, 2008; Arnott et al., 2004, 2005; Degerman et al., 2006; Leung & Alain, 2011; Maeder et al., 2001; Rama et al., 2004).

Recently, there has been increased interest in examining the effects of attention on auditory cortical activity. This interest is motivated in part by the notion that the auditory cortex is not a single entity but rather comprises many cortical fields that appear to be differentially sensitive to sound frequency and sound location, and by the analogous discovery of feature-specific enhancement and suppression of visual neurons during visual selective attention. Although the effects of attention on neural activity in auditory cortical areas are not disputed, the effects of attention on the primary auditory cortex remain equivocal. For instance, some fMRI studies do not find evidence for attention effects on primary auditory cortex in Heschl’s gyrus (Hill & Miller, 2009; Petkov et al., 2004), whereas others report enhancements in frequency-sensitive regions, although the attention effects are not necessarily restricted to them (Paltoglou et al., 2009).
The lack of attention effects on the BOLD signal from the primary cortex does not mean that neural activity in primary auditory cortex is not modulated by attention; it may be too weakly or differentially modulated such that the BOLD effect cannot capture it. As we have already seen, studies that have used another imaging technique such as EEG or MEG provide evidence suggesting that selective attention amplifies neural activity in frequency-sensitive regions (Woods et al., 1994; Woods & Alain, 2001) as well as in or near primary areas (Woldorff & Hillyard, 1991; Woldorff et al., 1993). In addition, single-unit research in mammals has shown that attention can modulate the neural firing rate of neurons in primary auditory cortex (Benson & Hienz, 1978; Hocherman et al., 1976).

Moreover, intramodal selective attention to location (i.e., left or right ear) has been shown to increase BOLD activity in the right middle frontal gyrus regardless of the location of attentional focus (Lipschutz et al., 2002). In contrast, brain regions including the middle and inferior frontal cortex, frontal eye fields (FEFs), and the superior temporal cortex in the contralateral hemisphere did show attention-related activity according to which ear was being attended to (Lipschutz et al., 2002). Activation in the superior temporal cortex extended through the temporal-parietal junction to the inferior parietal cortex, including the IPS (Lipschutz et al., 2002). The activation in the human homologue of FEFs during auditory spatial attention has been reported in several studies (e.g., Lipschutz et al., 2002; Tzourio et al., 1997; Zatorre et al., 1999), but the results should be interpreted with caution because these studies did not control for eye movements, which could partly account for the activation in the FEFs. In a more recent study, Tark and Curtis (2009) found FEF activity during an audiospatial working memory task even for sounds that were presented behind the head, to which it was impossible to make saccades. Their findings are consistent with the proposal that the FEF plays an important role in processing and maintaining sound location (Arnott & Alain, 2011).

In addition to enhanced activity in cortical areas during intramodal auditory selective attention, there is also evidence from fMRI that selective attention modulates activity in the human inferior colliculus (Rinne et al., 2008). The inferior colliculus is a midbrain nucleus of the ascending auditory pathway with diverse internal and external connections. The inferior colliculus also receives descending projections from the auditory cortex, suggesting that cortical processes affect inferior colliculus operations. Enhanced fMRI activity in the basal ganglia has also been observed during auditory selective attention to speech sounds (Hill & Miller, 2009). There is also some evidence that selective attention may modulate neural activity at the peripheral level via descending projections. However, the effects of attention on the peripheral and efferent auditory pathways remain equivocal.
For example, although some studies report attention effects on the peripheral auditory system as measured with evoked otoacoustic emissions (Giard et al., 1994; Maison et al., 2001), other studies do not (Avan & Bonfils, 1992; Michie et al., 1996; Timpe-Syverson & Decker, 1999).

Intermodal Auditory Attention

In intermodal attention studies, the effects of attention on auditory processing are assessed by comparing neural activity to auditory stimuli when participants perform a demanding task in another modality (usually visual), with activity elicited by the same stimuli when attention is directed toward the auditory stimuli. Rinne et al. (2007) found enhanced activity in auditory areas during a cross-modal attention task, with the intermodal attention effects extending to both the posterior and superior temporal gyrus. However, there was no difference in parietal cortex or prefrontal cortex when attention was directed to auditory versus visual (i.e., picture) stimuli. Similarly, Kawashima et al. (1999) reported comparable activation in the right prefrontal cortex during visual and auditory attention to speech sounds. However, they did find a difference in parietal cortex activation between attention to auditory and visual stimuli (Kawashima et al., 1999), suggesting that the parietal cortex may play a different role during auditory and visual attention. Interestingly, electrophysiological investigations of the chinchilla cochlea demonstrate that as the attentional demands to the visual system increase (as in the case of an increasingly difficult visual discrimination task), there is a corresponding decrease in the sensitivity of the cochlea that appears mediated by efferent projections to the outer hair cells (Delano et al., 2007). Accordingly, it seems plausible that intermodal attention could theoretically alter auditory processing at the very earliest stages of auditory processing (i.e., at sensory transduction).

Further underscoring the need to consider the interaction of attentional demands between modalities, there is evidence to suggest that during intermodal selective attention tasks, attention to auditory stimuli may alter visual cortical activity. The attentional repulsion effect is one example of this. Traditionally understood as a purely visual phenomenon, attentional repulsion refers to the perceived displacement of a vernier stimulus in a direction that is opposite to that of a brief peripheral visual cue (Suzuki & Cavanagh, 1997). Observers in these behavioral tasks typically judge two vertical lines placed above and below one another to be offset in a direction that is opposite to the location of a briefly presented irrelevant visual stimulus. Under the assumption that the repulsion effect exerts its effect in early retinotopic areas (i.e., visual cortex; Pratt & Turk-Browne, 2003; Suzuki & Cavanagh, 1997), Arnott and Goodale (2006) sought to determine whether peripheral auditory events could also elicit the repulsion effect. In keeping with the notion that sounds can alter occipital activity, peripheral sounds were also found to elicit the repulsion effect.

More direct evidence for enhancement of activity in occipital cortex comes from an fMRI study in which selective attention to sounds was found to activate visual cortex (Cate et al., 2009). In that particular study, the occipital activations appeared to be specific to attended auditory stimuli given that the same sounds failed to produce occipital activation when they were not being attended to (Cate et al., 2009).
Moreover, there is some evidence that auditory attention, but not passive exposure to sounds, activates peripheral regions of visual cortex when participants attended to sound sources outside the visual field. Functional connections between auditory cortex and visual cortex subserving the peripheral visual field appear to underlie the generation of auditory occipital activations (Cate et al., 2009). This activation may reflect the priming of visual regions to process soon-to-appear objects associated with unseen sound sources and provides further support for the idea that the auditory “where” subsystem may be in the service of the visual-motor “where” subsystem (Kubovy & Van Valkenburg, 2001). In fact, the functional overlap between the auditory cortical spatial network and the visual orientation network is quite striking, as we have recently shown, suggesting that the auditory spatial network and visual orienting network share a common neural substrate (Arnott & Alain, 2011). Along the same line, auditory selective attention to speech modulates activity in the visual word form areas (Yoncheva et al., 2009), suggesting a high level of interaction between sensory systems even at the relatively early stages of processing.

Throughout this discussion of intermodal attention, one should also keep in mind that even though auditory information can certainly be obtained in the absence of other sensory input (e.g., sounds perceived in the dark, or with the eyes closed), for the vast majority of neurologically intact individuals, any given auditory experience is often experienced in the presence of other sensory (especially visual) input. Thus, there is good reason to expect that in such cases, the neural architecture of auditory processing may be interwoven with that of other sensory systems, especially in instances in which the two are dependent on one another. Once again, neuroimaging data derived from the experience of auditory material property information are useful in this regard. Unlike auditory localization, where a sound’s position can be constructed from interaural timing and intensity differences, the acoustic route to the material properties of any given object depends entirely on previous associations between sound and information from other senses (e.g., hearing the sound that an object makes as one watches someone or something come into contact with that object, or hearing the sound that an object makes as one touches it). Building on research showing that visual material processing appears to be accomplished in ventromedial brain areas that include the collateral sulcus and parahippocampal gyrus (Cant & Goodale, 2007), Arnott and colleagues used fMRI to investigate the brain regions involved in auditory material processing (Arnott et al., 2008). Relative to control sounds, audio recordings of various materials being manipulated in someone’s hands (i.e., paper, plastic, aluminium foil, and Styrofoam) were found to elicit greater hemodynamic activity in the medial region of the right parahippocampus both in neurologically intact individuals and in a cortically blind individual. Most interestingly, a concomitant visual material experiment in which the sighted participants viewed pictures of objects rendered in different materials (e.g., plastic, wood, marble, foil) was found to elicit right parahippocampal activity in an area just lateral to the auditory-evoked region.
These results fit well with animal neuroanatomy in that the most medial aspect of the monkey parahippocampus (i.e., area TH) has connections with auditory cortex (Blatt et al., 2003; Lavenex et al., 2004; Suzuki et al., 2003), whereas the region situated immediately adjacent to area TH (i.e., area TFm in the macaque or TL in the rhesus monkey) has strong inputs from areas processing visual information, receiving little if any input from auditory areas (Blatt et al., 2003).

The data from both the intramodal and intermodal attentional literature remind us again that attention may have a variety of neural expressions, at both relatively early (e.g., occipital) and late (e.g., parahippocampal) stages of processing, and depends to a large extent on specific task demands (Griffiths et al., 2004). In this respect, attention demonstrates itself as a pervasive and powerful influence on brain functioning.

Divided Attention

Divided attention between two concurrent streams of sounds (one in each hemispace) or between the auditory and visual modalities has been associated with enhanced activity in the precuneus, IPS, FEFs, and middle frontal gyrus compared with focused attention to either one modality or location (Santangelo et al., 2010). Moreover, Lipschutz et al. (2002) found comparable activation during selective and divided attention, suggestive of a common neural network. Bimodal divided attention (i.e., attending to auditory and visual stimuli) has also been associated with enhanced activity in the posterior dorsolateral prefrontal cortex (Johnson et al., 2007). Importantly, the same area was not significantly active when participants focused their attention on either visual or auditory stimuli or when they were passively exposed to bimodal stimuli (Johnson & Zatorre, 2006). The importance of the dorsolateral prefrontal cortex (DLPFC) during bimodal divided attention is further supported by evidence from a repetitive transcranial magnetic stimulation (rTMS) study (Johnson et al., 2007). In that particular study, performance during bimodal divided attention was hindered by temporarily disrupting the function of the DLPFC using rTMS compared with control site stimulation.

Deployment of Auditory Attention

Our ability to attend to a particular sound object or sound location is not instantaneous and may require a number of cognitive alterations. We may need to disengage from what we are doing, switch our attention to a different sensory modality, focus on a different spatial location or object, and then engage our selection mechanisms. It has been generally established that focusing attention on a particular input modality, or a particular “what” or “where” feature, modulates cortical activity such that task-relevant representations are enhanced at the expense of irrelevant ones (Alain et al., 2001b; Degerman et al., 2006; Johnson & Zatorre, 2005, 2006). Although the object-based model can adequately account for many findings that involve focusing attention on a task-relevant stream, there is a growing need to better understand how attention is deployed toward an auditory object within an auditory scene and whether the mechanisms at play when directing auditory attention toward spatial and nonspatial cues also apply when the auditory scene comprises multiple sound objects. Although substantial research has been carried out on sustained and selective attention, fewer studies have examined the deployment of attention, especially in the auditory modality.

One popular paradigm to assess the deployment of attention consists of presenting an informational cue before either a target sound or streams of sounds in which the likely incoming targets are embedded (Green et al., 2005; Stormer et al., 2009). The mechanisms involved in the control of attention can be assessed by comparing brain activity during the cue period (i.e., the interval between the cue and the onset of the target or stream of sounds) in conditions in which attention is directed to a particular feature (e.g., location or pitch) that defines either the likely incoming target or the streams of sounds to be attended.
Such a design has enabled researchers to identify a frontal-parietal network in the deployment of attention to either a particular location (Hill & Miller, 2009; Salmi et al., 2007a, 2009; Wu et al., 2007) or sound identity (Hill & Miller, 2009). A similar frontal-parietal network has been reported during attention switching between locations in both the auditory and visual modalities (Salmi et al., 2007b; Shomstein & Yantis, 2004; Smith et al., 2010), between sound locations (left vs. right ear), or between sound identities (male vs. female voice) (Shomstein & Yantis, 2006). The network that mediates voluntary control of auditory spatial and nonspatial attention encompasses several brain areas that vary among studies and tasks and include, but are not limited to, the inferior and superior frontal gyrus, dorsal precentral sulcus, IPS, superior parietal lobule, and auditory cortex. Some of the areas (e.g., anterior cingulate, FEFs, superior parietal lobule) involved in orienting auditory spatial attention are similar to those observed during the orientation of visual spatial attention, suggesting that control of spatial attention may be supported by a combination of supramodal and modality-specific brain mechanisms (Wu et al., 2007). The activations in this network vary as a function of the feature to be attended, with location recruiting the parietal cortex to a greater extent and attention to pitch recruiting the inferior frontal gyrus. These findings resemble studies of auditory working memory for sound location and sound identity (Alain et al., 2008; Arnott et al., 2005; Rama et al., 2004).

In addition to brain areas that appear to be specialized in terms of processing isolated featural information in individual modalities, other sites have been identified whose role may be considered more integrative in nature. For example, the left IPS has been found to be equally active when attention is directed to sound identity or sound location (Hill & Miller, 2009). Moreover, the left IPS is also activated by tool- or hand-manipulated sounds, as previously discussed. This suggests that the IPS may be an integrative center that coordinates attention regardless of which class of features is the focus of attention (Hill & Miller, 2009).

Figure 11.7 Blood oxygenation level–dependent effects showing brain areas that are activated by bottom-up, data-driven stimuli designed to capture a participant’s attention (i.e., loud deviant sounds) as well as those reflecting top-down controlled processes engaged by a spatial cuing task. Note the overlap between the frontal eye field (FEF), the temporal-parietal junction (TPJ), and the superior parietal lobule (SPL) during bottom-up and top-down controlled auditory spatial attention. Cb, cerebellum; CG/medFG, cingulate/medial frontal gyrus; IFG/MFG, inferior frontal gyrus/middle frontal gyrus; IPS, intraparietal sulcus; OC, occipital cortex; PMC, premotor cortex. Reprinted from Brain Research, 1286, Juha Salmi, Teemu Rinne, Sonja Koistinen, Oili Salonen, and Kimmo Alho, “Brain networks of bottom-up triggered and top-down controlled shifting of auditory attention,” 155–164, Copyright 2008, with permission from Elsevier.

Additionally, the neural networks involved in the endogenous control of attention differ from those engaged by salient auditory oddball or novel stimuli designed to capture a participant’s attention in an exogenous fashion. Salmi et al. (2009) used fMRI to measure brain activity elicited by infrequently occurring loudness deviation tones (LTDs) while participants were told to focus their attention on one auditory stream (e.g., left ear) and to ignore sounds presented in the other ear (i.e., right ear). The LTDs occurred in both streams and served as a means of assessing involuntary (i.e., bottom-up) attentional capture. The authors found impaired performance when the targets were preceded by LTDs in the unattended location, and this effect coincided with enhanced activity in the ventromedial prefrontal cortex (VMPFC), possibly related to evaluation of the distracting event (Figure 11.7). Together, these fMRI studies reveal a complex neural network involved in the deployment of auditory attention.

In a recent study, Gamble and Luck (2011) measured auditory ERPs while listeners were presented with two clearly distinguishable sound objects occurring in the left and right hemispace simultaneously. Participants indicated whether a predefined target was present or absent. They found an increased negativity between 200 and 400 ms that was maximal at anterior electrodes contralateral to the target location, which was followed by a posterior contralateral positivity. These results suggest that auditory attention can be quickly deployed to the sound object location. More important, these findings suggest that scalp recordings of ERPs may provide a useful tool for studying the deployment of auditory attention in real-life situations in which multiple sound objects are simultaneously present in the environment. This is an important issue to address given that auditory perception often occurs in a densely cluttered, rapidly changing acoustic environment, where multiple sound objects compete for attention.

Deployment of Attention in Time

Although the data clearly provide evidence for neural modulation with respect to auditory spatial attention, it has been argued that the role of location in audition is less critical than in vision (Woods et al., 2001), and that in contrast to the high spatial resolution of the visual system, the auditory system shows similarly acute sensitivity with respect to the temporal domain (Welch & Warren, 1980). Ventriloquism and sound-induced visual temporal illusions (Shams et al., 2000; Recanzone, 2003) are good examples of this property. Sanders and Astheimer (2008) showed that listeners can selectively direct their attention to specific time points that differ by as little as 500 ms, and that doing so improves target detection, affects baseline neural activity preceding stimulus presentation, and modulates auditory evoked potentials at a perceptually early stage (Figure 11.8). Rimmele, Jolsvai, and Sussman (2011) set up spatial and temporal expectation using a moving auditory stimulus. They found that implicitly directing attention to a specific moment in time modulated the amplitude of auditory ERPs, independently from spatially directing attention. These studies show that listeners can flexibly allocate temporally selective attention over short intervals (for a more extensive review, see Jones, 1976; Jones & Boltz, 1989).

Figure 11.8 A, Schematic of the paradigm used to investigate the deployment of auditory attention in time. B, Accuracy in detecting targets at the designated time. C, The first row shows the scalp-recorded auditory evoked potential at the midline central site (i.e., Cz) for the whole epoch. The amplitude of the contingent negative variation (CNV) increased as the designated attended time increased. The second row shows the transient N1 and P2 waves at the midline frontal site (i.e., Fz) elicited by the stimuli when they occurred at the attended time (thick line). Note that the N1 amplitude was larger when the stimulus occurred at the attended time than when attention was allocated to a different time. Adapted with permission from Sanders & Astheimer, 2008.

Future Directions

Over the past decade, we have seen a significant increase in research activity regarding the mechanisms supporting the varieties of auditory attention. Attention to auditory material engages a broadly distributed neural network that varies as a function of task demands, including selective, divided, and sustained attention. An important goal for future research will be to clarify the role of auditory cortical areas as well as those beyond auditory cortices (e.g., parietal cortex) in auditory attention. This may require a combination of neuroimaging techniques such as EEG, TMS, and fMRI, as well as animal studies using microstimulation and selective-deactivation (e.g., cooling) techniques combined with behavioral measures.

Current research using relatively simple sounds (e.g., pure tones) suggests that selective attention may involve facilitation and suppression of task-relevant and task-irrelevant stimuli, respectively. However, further research is needed to determine the extent to which attentional mechanisms derived from paradigms using relatively simple stimuli account for the processes involved in more complex and realistic listening situations often illustrated using the cocktail party example. Speech is a highly familiar stimulus, and our auditory system has had the opportunity to learn about speech-specific properties (e.g., f0, formant transitions) that may assist listeners while they selectively attend to speech stimuli (Rossi-Katz & Arehart, 2009). For instance, speech sounds activate schemata that may interact with more primitive mechanisms, thereby influencing how incoming acoustic data are perceptually organized and selected for further processing. It is unlikely that such schemata play an equivalent role in the processing of pure tones, so the relationship between bottom-up and top-down contributions in the deployment of attention may differ according to the naturalism of the auditory environment used. Lastly, spoken communication is a multimodal and highly interactive process whereby visual input can help listeners identify speech in noise and can also influence what is heard. Hence, it is also important to examine the role of visual information during selective attention to speech sounds.

In short, it is clear that auditory attention plays a central (and perhaps even primary) role in guiding our interaction with the external world. However, in reviewing the literature, we have noted how auditory attention is intimately connected with other fundamental issues such as multimodal integration, the relationship between perception and action-based processing, and how mental representations are maintained across both space and time. In advancing the field, it will be important not to ignore the complexity of the problem, such that our understanding of the neural substrates that underlie auditory attention reflects this core mechanism at its most ecologically valid expression.

References

Ahveninen, J., Hamalainen, M., Jaaskelainen, I. P., Ahlfors, S. P., Huang, S., Lin, F. H., Raij, T., Sams, M., Vasios, C. E., & Belliveau, J. W. (2011). Attention-driven auditory cortex short-term plasticity helps segregate relevant sounds from noise. Proceedings of the National Academy of Sciences U S A, 108, 4182–4187.
Alain, C., Achim, A., & Richer, F. (1993). Perceptual context and the selective attention effect on auditory event-related brain potentials. Psychophysiology, 30, 572–580.
Alain, C., & Arnott, S. R. (2000). Selectively attending to auditory objects. Frontiers in Bioscience, 5, D202–D212.
Alain, C., Arnott, S. R., Hevenor, S., Graham, S., & Grady, C. L. (2001). “What” and “where” in the human auditory system. Proceedings of the National Academy of Sciences U S A, 98, 12301–12306.
Alain, C., Arnott, S. R., & Picton, T. W. (2001). Bottom-up and top-down influences on auditory scene analysis: Evidence from event-related brain potentials. Journal of Experimental Psychology: Human Perception and Performance, 27, 1072–1089.
Alain, C., & Bernstein, L. J. (2008). From sounds to meaning: The role of attention during auditory scene analysis. Current Opinion in Otolaryngology & Head and Neck Surgery, 16, 485–489.

Alain, C., He, Y., & Grady, C. (2008). The contribution of the inferior parietal lobe to auditory spatial working memory. Journal of Cognitive Neuroscience, 20, 285–295.
Alain, C., & Izenberg, A. (2003). Effects of attentional load on auditory scene analysis. Journal of Cognitive Neuroscience, 15, 1063–1073.
Alain, C., Schuler, B. M., & McDonald, K. L. (2002). Neural activity associated with distinguishing concurrent auditory objects. Journal of the Acoustical Society of America, 111, 990–995.
Alain, C., & Woods, D. L. (1993). Distractor clustering enhances detection speed and accuracy during selective listening. Perception & Psychophysics, 54, 509–514.
Alain, C., & Woods, D. L. (1994). Signal clustering modulates auditory cortical activity in humans. Perception & Psychophysics, 56, 501–516.
Alain, C., & Woods, D. L. (1997). Attention modulates auditory pattern memory as indexed by event-related brain potentials. Psychophysiology, 34, 534–546.
Alho, K., Tottola, K., Reinikainen, K., Sams, M., & Naatanen, R. (1987). Brain mechanism of selective listening reflected by event-related potentials. Electroencephalography and Clinical Neurophysiology, 68, 458–470.
Andersen, R. A., Snyder, L. H., Bradley, D. C., & Xing, J. (1997). Multiple representation of space in the posterior parietal cortex and its use in planning movements. Annual Review of Neuroscience, 20, 303–330.
Arnott, S. R., & Alain, C. (2002a). Effects of perceptual context on event-related brain potentials during auditory spatial attention. Psychophysiology, 39, 625–632.
Arnott, S. R., & Alain, C. (2002b). Stepping out of the spotlight: MMN attenuation as a function of distance from the attended location. NeuroReport, 13, 2209–2212.
Arnott, S. R., & Alain, C. (2011). The auditory dorsal pathway: Orienting vision. Neuroscience and Biobehavioral Reviews, 35 (10), 2162–2173.
Arnott, S. R., Binns, M. A., Grady, C. L., & Alain, C. (2004). Assessing the auditory dual-pathway model in humans. NeuroImage, 22, 401–408.
Arnott, S. R., Cant, J. S., Dutton, G. N., & Goodale, M. A. (2008). Crinkling and crumpling: An auditory fMRI study of material properties. NeuroImage, 43, 368–378.
Arnott, S. R., Grady, C. L., Hevenor, S. J., Graham, S., & Alain, C. (2005). The functional organization of auditory working memory as revealed by fMRI. Journal of Cognitive Neuroscience, 17, 819–831.
Arnott, S. R., & Goodale, M. A. (2006). Distorting visual space with sound. Vision Research, 46, 1553–1558.

Avan, P., & Bonfils, P. (1992). Analysis of possible interactions of an attentional task with cochlear micromechanics. Hearing Research, 57, 269–275.
Baylis, G. C., & Driver, J. (1993). Visual attention and objects: Evidence for hierarchical coding of location. Journal of Experimental Psychology: Human Perception and Performance, 19, 451–470.
Benson, D. A., & Hienz, R. D. (1978). Single-unit activity in the auditory cortex of monkeys selectively attending left vs. right ear stimuli. Brain Research, 159, 307–320.
Blatt, G. J., Pandya, D. N., & Rosene, D. L. (2003). Parcellation of cortical afferents to three distinct sectors in the parahippocampal gyrus of the rhesus monkey: An anatomical and neurophysiological study. Journal of Comparative Neurology, 466, 161–179.
Brefczynski, J. A., & DeYoe, E. A. (1999). A physiological correlate of the “spotlight” of visual attention. Nature Neuroscience, 2, 370–374.
Bregman, A. S. (1990). Auditory scene analysis: The perceptual organization of sounds. London, UK: MIT Press.
Bregman, A. S., & Rudnicky, A. I. (1975). Auditory segregation: Stream or streams? Journal of Experimental Psychology: Human Perception and Performance, 1, 263–267.
Broadbent, D. E. (1962). Attention and the perception of speech. Scientific American, 206, 143–151.
Calvert, G. A. (2001). Crossmodal processing in the human brain: Insights from functional neuroimaging studies. Cerebral Cortex, 11, 1110–1123.
Cant, J. S., & Goodale, M. A. (2007). Attention to form or surface properties modulates different regions of human occipitotemporal cortex. Cerebral Cortex, 17, 713–731.
Carlyon, R. P. (2004). How the brain separates sounds. Trends in Cognitive Sciences, 8, 465–471.
Carlyon, R. P., Cusack, R., Foxton, J. M., & Robertson, I. H. (2001). Effects of attention and unilateral neglect on auditory stream segregation. Journal of Experimental Psychology: Human Perception and Performance, 27, 115–127.
Cate, A. D., Herron, T. J., Yund, E. W., Stecker, G. C., Rinne, T., Kang, X., Petkov, C. I., Disbrow, E. A., & Woods, D. L. (2009). Auditory attention activates peripheral visual cortex. PLoS One, 4, e4645.
Chen, Z., & Cave, K. R. (2008). Object-based attention with endogenous cuing and positional certainty. Perception & Psychophysics, 70, 1435–1443.
Cherry, E. C. (1953). Some experiments on the recognition of speech with one and with two ears. Journal of the Acoustical Society of America, 25, 975–979.

Cowan, N. (1988). Evolving conceptions of memory storage, selective attention, and their mutual constraints within the human information-processing system. Psychological Bulletin, 104, 163–191.
Cowan, N. (1993). Activation, attention, and short-term memory. Memory & Cognition, 21, 162–167.
Culham, J. C., Cavina-Pratesi, C., & Singhal, A. (2006). The role of parietal cortex in visuomotor control: what have we learned from neuroimaging? Neuropsychologia, 44, 2668–2684.
Cusack, R., Deeks, J., Aikman, G., & Carlyon, R. P. (2004). Effects of location, frequency region, and time course of selective attention on auditory scene analysis. Journal of Experimental Psychology: Human Perception and Performance, 30, 643–656.
Degerman, A., Rinne, T., Salmi, J., Salonen, O., & Alho, K. (2006). Selective attention to sound location or pitch studied with fMRI. Brain Research, 1077, 123–134.
Degerman, A., Rinne, T., Sarkka, A. K., Salmi, J., & Alho, K. (2008). Selective attention to sound location or pitch studied with event-related brain potentials and magnetic fields. European Journal of Neuroscience, 27, 3329–3341.
Delano, P. H., Elgueda, D., Hamame, C. M., & Robles, L. (2007). Selective attention to visual stimuli reduces cochlear sensitivity in chinchillas. Journal of Neuroscience, 27, 4146–4153.
Deouell, L. Y., Deutsch, D., Scabini, D., & Knight, R. T. (2008). No disillusions in auditory extinction: Perceived a melody comprised of unperceived notes. Frontiers in Human Neuroscience, 1, 1–6.
Deutsch, D. (1975). Two-channel listening to musical scales. Journal of the Acoustical Society of America, 57, 1156–1160.
Driver, J. (2001). A selective review of selective attention research from the past century. British Journal of Psychology, 92, 53–78.
Driver, J., & Baylis, G. C. (1989). Movement and visual attention: The spotlight metaphor breaks down. Journal of Experimental Psychology: Human Perception and Performance, 15, 448–456.
Duncan, J. (1984). Selective attention and the organization of visual information. Journal of Experimental Psychology: General, 113, 501–517.
Dyson, B. J., & Alain, C. (2004). Representation of concurrent acoustic objects in primary auditory cortex. Journal of the Acoustical Society of America, 115, 280–288.
Dyson, B. J., Alain, C., & He, Y. (2005). Effects of visual attentional load on low-level auditory scene analysis. Cognitive, Affective, & Behavioral Neuroscience, 5, 319–338.

Dyson, B. J., & Ishfaq, F. (2008). Auditory memory can be object-based. Psychonomic Bulletin & Review, 15, 409–412.
Egly, R., Driver, J., & Rafal, R. D. (1994). Shifting visual attention between objects and locations: evidence from normal and parietal lesion subjects. Journal of Experimental Psychology: General, 123, 161–177.
Gamble, M. L., & Luck, S. J. (2011). N2ac: An ERP component associated with the focusing of attention within an auditory scene. Psychophysiology, 48, 1057–1068.
Giard, M. H., Collet, L., Bouchet, P., & Pernier, J. (1994). Auditory selective attention in the human cochlea. Brain Research, 633, 353–356.
Giard, M. H., Fort, A., Mouchetant-Rostaing, Y., & Pernier, J. (2000). Neurophysiological mechanisms of auditory selective attention in humans. Frontiers in Bioscience, 5, D84–D94.
Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15, 20–25.
Green, J. J., Teder-Salejarvi, W. A., & McDonald, J. J. (2005). Control mechanisms mediating shifts of attention in auditory and visual space: A spatio-temporal ERP analysis. Experimental Brain Research, 166, 358–369.
Griffiths, T. D., Warren, J. D., Scott, S. K., Nelken, I., & King, A. J. (2004). Cortical processing of complex sound: A way forward? Trends in Neurosciences, 27, 181–185.
Grill-Spector, K., Henson, R., & Martin, A. (2006). Repetition and the brain: Neural models of stimulus-specific effects. Trends in Cognitive Sciences, 10, 14–23.
Gutschalk, A., Micheyl, C., Melcher, J. R., Rupp, A., Scherg, M., & Oxenham, A. J. (2005). Neuromagnetic correlates of streaming in human auditory cortex. Journal of Neuroscience, 25, 5382–5388.
Hansen, J. C., & Hillyard, S. A. (1980). Endogenous brain potentials associated with selective auditory attention. Electroencephalography and Clinical Neurophysiology, 49, 277–290.
Hansen, J. C., & Hillyard, S. A. (1983). Selective attention to multidimensional auditory stimuli. Journal of Experimental Psychology: Human Perception and Performance, 9, 1–19.
Hecht, L. N., Abbs, B., & Vecera, S. P. (2008). Auditory object-based attention. Visual Cognition, 16, 1109–1115.
Hill, K. T., & Miller, L. M. (2009). Auditory attentional control and selection during cocktail party listening. Cerebral Cortex, 20, 583–590.
Hillyard, S. A., Hink, R. F., Schwent, V. L., & Picton, T. W. (1973). Electrical signs of selective attention in the human brain. Science, 182, 177–180.

Hillyard, S. A., Vogel, E. K., & Luck, S. J. (1998). Sensory gain control (amplification) as a mechanism of selective attention: Electrophysiological and neuroimaging evidence. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 353, 1257–1270.
Hillyard, S. A., Woldorff, M., Mangun, G. R., & Hansen, J. C. (1987). Mechanisms of early selective attention in auditory and visual modalities. Electroencephalography and Clinical Neurophysiology Supplement, 39, 317–324.
Hocherman, S., Benson, D. A., Goldstein, M. H., Jr., Heffner, H. E., & Hienz, R. D. (1976). Evoked unit activity in auditory cortex of monkeys performing a selective attention task. Brain Research, 117, 51–68.
Johnson, J. A., Strafella, A. P., & Zatorre, R. J. (2007). The role of the dorsolateral prefrontal cortex in bimodal divided attention: two transcranial magnetic stimulation studies. Journal of Cognitive Neuroscience, 19, 907–920.
Johnson, J. A., & Zatorre, R. J. (2005). Attention to simultaneous unrelated auditory and visual events: behavioral and neural correlates. Cerebral Cortex, 15, 1609–1620.
Johnson, J. A., & Zatorre, R. J. (2006). Neural substrates for dividing and focusing attention between simultaneous auditory and visual events. NeuroImage, 31, 1673–1681.
Jones, M. R. (1976). Time, our lost dimension: Toward a new theory of perception, attention, and memory. Psychological Review, 83, 323–355.
Jones, M. R., & Boltz, M. (1989). Dynamic attending and responses to time. Psychological Review, 96, 459–491.
Kauramaki, J., Jaaskelainen, I. P., & Sams, M. (2007). Selective attention increases both gain and feature selectivity of the human auditory cortex. PLoS One, 2, e909.
Kawashima, R., Imaizumi, S., Mori, K., Okada, K., Goto, R., Kiritani, S., Ogawa, A., & Fukuda, H. (1999). Selective visual and auditory attention toward utterances: A PET study. NeuroImage, 10, 209–215.
Kerlin, J. R., Shahin, A. J., & Miller, L. M. (2010). Attentional gain control of ongoing cortical speech representations in a “cocktail party.” Journal of Neuroscience, 30, 620–628.
Kubovy, M., & Van Valkenburg, D. (2001). Auditory and visual objects. Cognition, 80, 97–126.
LaBerge, D. (1983). Spatial extent of attention to letters and words. Journal of Experimental Psychology: Human Perception and Performance, 9, 371–379.
Lavenex, P., Suzuki, W. A., & Amaral, D. G. (2004). Perirhinal and parahippocampal cortices of the macaque monkey: Intrinsic projections and interconnections. Journal of Comparative Neurology, 472, 371–394.

Leung, A. W., & Alain, C. (2011). Working memory load modulates the auditory “What” and “Where” neural networks. NeuroImage, 55, 1260–1269.
Lewis, J. W. (2006). Cortical networks related to human use of tools. Neuroscientist, 12, 211–231.
Lewis, J. W., Brefczynski, J. A., Phinney, R. E., Janik, J. J., & DeYoe, E. A. (2005). Distinct cortical pathways for processing tool versus animal sounds. Journal of Neuroscience, 25, 5148–5158.
Linden, D. E., Prvulovic, D., Formisano, E., Vollinger, M., Zanella, F. E., Goebel, R., & Dierks, T. (1999). The functional neuroanatomy of target detection: An fMRI study of visual and auditory oddball tasks. Cerebral Cortex, 9, 815–823.
Lipschutz, B., Kolinsky, R., Damhaut, P., Wikler, D., & Goldman, S. (2002). Attention-dependent changes of activation and connectivity in dichotic listening. NeuroImage, 17, 643–656.
Macaluso, E., George, N., Dolan, R., Spence, C., & Driver, J. (2004). Spatial and temporal factors during processing of audiovisual speech: A PET study. NeuroImage, 21, 725–732.
Macken, W. J., Tremblay, S., Houghton, R. J., Nicholls, A. P., & Jones, D. M. (2003). Does auditory streaming require attention? Evidence from attentional selectivity in short-term memory. Journal of Experimental Psychology: Human Perception and Performance, 29, 43–51.
Maeder, P. P., Meuli, R. A., Adriani, M., Bellmann, A., Fornari, E., Thiran, J. P., Pittet, A., & Clarke, S. (2001). Distinct pathways involved in sound recognition and localization: a human fMRI study. NeuroImage, 14, 802–816.
Maison, S., Micheyl, C., & Collet, L. (2001). Influence of focused auditory attention on cochlear activity in humans. Psychophysiology, 38, 35–40.
Martinkauppi, S., Rama, P., Aronen, H. J., Korvenoja, A., & Carlson, S. (2000). Working memory of auditory localization. Cerebral Cortex, 10, 889–898.
McAdams, S., & Bertoncini, J. (1997). Organization and discrimination of repeating sound sequences by newborn infants. Journal of the Acoustical Society of America, 102, 2945–2953.
McMains, S. A., & Somers, D. C. (2004). Multiple spotlights of attentional selection in human visual cortex. Neuron, 42, 677–686.
Meienbrock, A., Naumer, M. J., Doehrmann, O., Singer, W., & Muckli, L. (2007). Retinotopic effects during spatial audiovisual integration. Neuropsychologia, 45, 531–539.
Michie, P. T., LePage, E. L., Solowij, N., Haller, M., & Terry, L. (1996). Evoked otoacoustic emissions and auditory selective attention. Hearing Research, 98, 54–67.

Michie, P. T., Solowij, N., Crawford, J. M., & Glue, L. C. (1993). The effects of between-source discriminability on attended and unattended auditory ERPs. Psychophysiology, 30, 205–220.
Mondor, T. A., & Terrio, N. A. (1998). Mechanisms of perceptual organization and auditory selective attention: The role of pattern structure. Journal of Experimental Psychology: Human Perception and Performance, 24, 1628–1641.
Moray, N., & O’Brien, T. (1967). Signal-detection theory applied to selective listening. Journal of the Acoustical Society of America, 42, 765–772.
Munte, T. F., Spring, D. K., Szycik, G. R., & Noesselt, T. (2010). Electrophysiological attention effects in a virtual cocktail-party setting. Brain Research, 1307, 78–88.
Näätänen, R., Gaillard, A. W., & Mantysalo, S. (1978). Early selective-attention effect on evoked potential reinterpreted. Acta Psychologica (Amsterdam), 42, 313–329.
O’Craven, K. M., Downing, P. E., & Kanwisher, N. (1999). fMRI evidence for objects as the units of attentional selection. Nature, 401, 584–587.
Okamoto, H., Stracke, H., Wolters, C. H., Schmael, F., & Pantev, C. (2007). Attention improves population-level frequency tuning in human auditory cortex. Journal of Neuroscience, 27, 10383–10390.
Ortuno, F., Ojeda, N., Arbizu, J., Lopez, P., Marti-Climent, J. M., Penuelas, I., & Cervera, S. (2002). Sustained attention in a counting task: normal performance and functional neuroanatomy. NeuroImage, 17, 411–420.
Paltoglou, A. E., Sumner, C. J., & Hall, D. A. (2009). Examining the role of frequency specificity in the enhancement and suppression of human cortical activity by auditory selective attention. Hearing Research, 257, 106–118.
Petkov, C. I., Kang, X., Alho, K., Bertrand, O., Yund, E. W., & Woods, D. L. (2004). Attentional modulation of human auditory cortex. Nature Neuroscience, 7, 658–663.
Picton, T. W., Alain, C., Otten, L., Ritter, W., & Achim, A. (2000). Mismatch negativity: Different water in the same river. Audiology & Neuro-otology, 5, 111–139.
Pratt, J., & Turk-Browne, N. B. (2003). The attentional repulsion effect in perception and action. Experimental Brain Research, 152, 376–382.
Rama, P., Poremba, A., Sala, J. B., Yee, L., Malloy, M., Mishkin, M., & Courtney, S. M. (2004). Dissociable functional cortical topographies for working memory maintenance of voice identity and location. Cerebral Cortex, 14, 768–780.
Rauschecker, J. P., & Scott, S. K. (2009). Maps and streams in the auditory cortex: Nonhuman primates illuminate human speech processing. Nature Neuroscience, 12, 718–724.

Rauschecker, J. P., & Tian, B. (2000). Mechanisms and streams for processing of “what” and “where” in auditory cortex. Proceedings of the National Academy of Sciences U S A, 97, 11800–11806.

Recanzone, G. H. (2003). Auditory influences on visual temporal rate perception. Journal of Neurophysiology, 89, 1078–1093.
Rimmele, J., Jolsvai, H., & Sussman, E. (2011). Auditory target detection is affected by implicit temporal and spatial expectations. Journal of Cognitive Neuroscience, 23, 1136–1147.
Rinne, T., Balk, M. H., Koistinen, S., Autti, T., Alho, K., & Sams, M. (2008). Auditory selective attention modulates activation of human inferior colliculus. Journal of Neurophysiology, 100, 3323–3327.
Rinne, T., Kirjavainen, S., Salonen, O., Degerman, A., Kang, X., Woods, D. L., & Alho, K. (2007). Distributed cortical networks for focused auditory attention and distraction. Neuroscience Letters, 416, 247–251.
Ross, B., Hillyard, S. A., & Picton, T. W. (2010). Temporal dynamics of selective attention during dichotic listening. Cerebral Cortex, 20, 1360–1371.
Rossi-Katz, J., & Arehart, K. H. (2009). Message and talker identification in older adults: Effects of task, distinctiveness of the talkers’ voices, and meaningfulness of the competing message. Journal of Speech, Language, and Hearing Research, 52, 435–453.
Salmi, J., Rinne, T., Degerman, A., & Alho, K. (2007a). Orienting and maintenance of spatial attention in audition and vision: An event-related brain potential study. European Journal of Neuroscience, 25, 3725–3733.
Salmi, J., Rinne, T., Degerman, A., Salonen, O., & Alho, K. (2007b). Orienting and maintenance of spatial attention in audition and vision: multimodal and modality-specific brain activations. Brain Structure and Function, 212, 181–194.
Salmi, J., Rinne, T., Koistinen, S., Salonen, O., & Alho, K. (2009). Brain networks of bottom-up triggered and top-down controlled shifting of auditory attention. Brain Research, 1286, 155–164.
Sanders, L. D., & Astheimer, L. B. (2008). Temporally selective attention modulates early perceptual processing: Event-related potential evidence. Perception & Psychophysics, 70, 732–742.
Santangelo, V., Fagioli, S., & Macaluso, E. (2010). The costs of monitoring simultaneously two sensory modalities decrease when dividing attention in space. NeuroImage, 49, 2717–2727.

Sestieri, C., Di Matteo, R., Ferretti, A., Del Gratta, C., Caulo, M., Tartaro, A., Olivetti Belardinelli, M., & Romani, G. L. (2006). “What” versus “where” in the audiovisual domain: an fMRI study. NeuroImage, 33, 672–680.
Shamma, S. A., Elhilali, M., & Micheyl, C. (2011). Temporal coherence and attention in auditory scene analysis. Trends in Neurosciences, 34, 114–123.
Shams, L., Kamitani, Y., & Shimojo, S. (2000). Illusions. What you see is what you hear. Nature, 408, 788.
Shinn-Cunningham, B. G. (2008). Object-based auditory and visual attention. Trends in Cognitive Sciences, 12, 182–186.
Shomstein, S., & Yantis, S. (2004). Control of attention shifts between vision and audition in human cortex. Journal of Neuroscience, 24, 10702–10706.
Shomstein, S., & Yantis, S. (2006). Parietal cortex mediates voluntary control of spatial and nonspatial auditory attention. Journal of Neuroscience, 26, 435–439.
Smith, D. V., Davis, B., Niu, K., Healy, E. W., Bonilha, L., Fridriksson, J., Morgan, P. S., & Rorden, C. (2010). Spatial attention evokes similar activation patterns for visual and auditory stimuli. Journal of Cognitive Neuroscience, 22, 347–361.
Snyder, J. S., Alain, C., & Picton, T. W. (2006). Effects of attention on neuroelectric correlates of auditory stream segregation. Journal of Cognitive Neuroscience, 18, 1–13.
Snyder, J. S., Carter, O. L., Hannon, E. E., & Alain, C. (2009). Adaptation reveals multiple levels of representation in auditory stream segregation. Journal of Experimental Psychology: Human Perception and Performance, 35, 1232–1244.
Stevens, M. C., Calhoun, V. D., & Kiehl, K. A. (2005). fMRI in an oddball task: Effects of target-to-target interval. Psychophysiology, 42, 636–642.
Stormer, V. S., Green, J. J., & McDonald, J. J. (2009). Tracking the voluntary control of auditory spatial attention with event-related brain potentials. Psychophysiology, 46, 357–366.
Sussman, E. S., Bregman, A. S., Wang, W. J., & Khan, F. J. (2005). Attentional modulation of electrophysiological activity in auditory cortex for unattended sounds within multistream auditory environments. Cognitive, Affective, and Behavioral Neuroscience, 5, 93–110.
Sussman, E. S., Horváth, J., Winkler, I., & Orr, M. (2007). The role of attention in the formation of auditory streams. Perception & Psychophysics, 69, 136–152.
Suzuki, K., Takei, N., Toyoda, T., Iwata, Y., Hoshino, R., Minabe, Y., & Mori, N. (2003). Auditory hallucinations and cognitive impairment in a patient with a lesion restricted to the hippocampus. Schizophrenia Research, 64, 87–89.

Suzuki, S., & Cavanagh, P. (1997). Focused attention distorts visual space: An attentional repulsion effect. Journal of Experimental Psychology: Human Perception and Performance, 23, 443–463.
Tark, K. J., & Curtis, C. E. (2009). Persistent neural activity in the human frontal cortex when maintaining space that is off the map. Nature Neuroscience, 12, 1463–1468.
Timpe-Syverson, G. K., & Decker, T. N. (1999). Attention effects on distortion-product otoacoustic emissions with contralateral speech stimuli. Journal of the American Academy of Audiology, 10, 371–378.
Trejo, L. J., Ryan-Jones, D. L., & Kramer, A. F. (1995). Attentional modulation of the mismatch negativity elicited by frequency differences between binaurally presented tone bursts. Psychophysiology, 32, 319–328.
Tzourio, N., Massioui, F. E., Crivello, F., Joliot, M., Renault, B., & Mazoyer, B. (1997). Functional anatomy of human auditory attention studied with PET. NeuroImage, 5, 63–77.
Valdes-Sosa, M., Bobes, M. A., Rodriguez, V., & Pinilla, T. (1998). Switching attention without shifting the spotlight: Object-based attentional modulation of brain potentials. Journal of Cognitive Neuroscience, 10, 137–151.
Welch, R. B., & Warren, D. H. (1980). Immediate perceptual response to intersensory discrepancy. Psychological Bulletin, 88, 638–667.
Winkler, I., Denham, S. L., & Nelken, I. (2009). Modeling the auditory scene: Predictive regularity representations and perceptual objects. Trends in Cognitive Sciences, 13, 532–540.
Woldorff, M. G. (1995). Selective listening at fast stimulus rates: so much to hear, so little time. Electroencephalography and Clinical Neurophysiology Supplement, 44, 32–51.
Woldorff, M. G., Gallen, C. C., Hampson, S. A., Hillyard, S. A., Pantev, C., Sobel, D., & Bloom, F. E. (1993). Modulation of early sensory processing in human auditory cortex during auditory selective attention. Proceedings of the National Academy of Sciences U S A, 90, 8722–8726.

Woldorff, M. G., Hackley, S. A., & Hillyard, S. A. (1991). The effects of channel-selective attention on the mismatch negativity wave elicited by deviant tones. Psychophysiology, 28, 30–42.
Woldorff, M. G., & Hillyard, S. A. (1991). Modulation of early auditory processing during selective listening to rapidly presented tones. Electroencephalography and Clinical Neurophysiology, 79, 170–191.
Woods, D. L., & Alain, C. (1993). Feature processing during high-rate auditory selective attention. Perception & Psychophysics, 53, 391–402.
Woods, D. L., & Alain, C. (2001). Conjoining three auditory features: An event-related brain potential study. Journal of Cognitive Neuroscience, 13, 492–509.
Woods, D. L., Alain, C., Diaz, R., Rhodes, D., & Ogawa, K. H. (2001). Location and frequency cues in auditory selective attention. Journal of Experimental Psychology: Human Perception and Performance, 27, 65–74.
Woods, D. L., Alho, K., & Algazi, A. (1994). Stages of auditory feature conjunction: An event-related brain potential study. Journal of Experimental Psychology: Human Perception and Performance, 20, 81–94.
Wu, C. T., Weissman, D. H., Roberts, K. C., & Woldorff, M. G. (2007). The neural circuitry underlying the executive control of auditory spatial attention. Brain Research, 1134, 187–198.
Yantis, S., & Serences, J. T. (2003). Cortical mechanisms of space-based and object-based attentional control. Current Opinion in Neurobiology, 13, 187–193.
Yoncheva, Y. N., Zevin, J. D., Maurer, U., & McCandliss, B. D. (2010). Auditory selective attention to speech modulates activity in the visual word form area. Cerebral Cortex, 20, 622–632.
Yoshiura, T., Zhong, J., Shibata, D. K., Kwok, W. E., Shrier, D. A., & Numaguchi, Y. (1999). Functional MRI study of auditory and visual oddball tasks. NeuroReport, 10, 1683–1688.
Zatorre, R. J., Mondor, T. A., & Evans, A. C. (1999). Auditory attention to space and frequency activates similar cerebral systems. NeuroImage, 10, 544–554.

Claude Alain

Claude Alain is Senior Scientist and Assistant Director, Rotman Research Institute, Baycrest Centre; Professor, Department of Psychology & Institute of Medical Sciences, University of Toronto.

Stephen R. Arnott

Stephen R. Arnott, Rotman Research Institute, Baycrest Centre for Geriatric Care.

Benjamin J. Dyson

Benjamin J. Dyson, Department of Psychology, Ryerson University.

Spatial Attention

Jeffrey R. Nicol
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0012

Abstract and Keywords

Spatial attention facilitates adaptive interaction in the environment by enhancing the perceptual processing associated with selected locations or objects and suppressing processing associated with nonselected stimuli. This chapter presents spatial attention as a dichotomy of visual processes, mechanisms, and neural networks. Specifically, distinctions are made between space- and object-based models of attentional selection, overt and covert orienting, reflexive and voluntary control of the allocation of resources, and dorsal and ventral frontoparietal neural networks. The effects of spatial attention on neurophysiological activity in subcortical and visual cortical areas are reviewed, as are the findings from behavioral studies examining the effects of spatial attention on early (i.e., low-level) visual perception.

Keywords: space-based selection, object-based selection, reflexive control, voluntary control, covert orienting, overt orienting, eye movements, neural networks of attention, visual perception

When our eyes open to view the environment that surrounds us, we are immediately confronted with a massive amount of information. This situation presents a problem to the visual system because it is limited in the amount of information that it can process, and hence be aware of, at one time (e.g., Broadbent, 1958). Adaptive interaction with the environment, therefore, requires a neural mechanism that effectively facilitates the selection of behaviorally relevant information for enhanced processing while simultaneously suppressing the processing associated with irrelevant information. This mechanism, which permits differential processing of spatially and temporally contiguous sources of information, is referred to as selective attention (e.g., Johnston & Dark, 1986), and it broadly describes “those processes that enable an observer to recruit resources for processing selected aspects of the retinal image more fully than nonselected aspects” (Palmer, 1999, p. 532). Although a considerable body of research has demonstrated effects of spatial attention in the auditory sensory modality (e.g., Spence & Driver, 1994) and cross-modally (see Spence & Driver, 2004), the majority of research has focused on the visual modality. Accordingly, the present chapter represents a selective review of seminal and recent studies that have examined visual spatial attention. Spatial attention is defined here as the focusing of limited-capacity visual processing resources on a specific location in space for the purpose of selective perceptual enhancement of the information at that location.

Theories, Models, and Metaphors of Spatial Attention

Given that mechanisms of spatial attention are charged with the critical task of selecting behaviorally relevant information for further processing, the question becomes: What does spatial attention select? In other words: What are the units of selection? For decades, researchers of attentional selection have been largely polarized into two factions. Broadly speaking, there are those that support space-based models and those that support object-based models of attentional selection.

Space-Based Models

Space-based theories of attention posit that space is the primary unit of selection. General support for space-based approaches comes from studies showing that performance is negatively affected when a target stimulus is flanked by spatially contiguous distractors (i.e., within 1 degree of visual angle), but not when distractors are more disparate from the target (Eriksen & Eriksen, 1974; Eriksen & Hoffman, 1972). Common space-based theories use metaphors such as a spotlight, a zoom lens, or a gradient to describe the process of attentional selection.

According to the spotlight account (e.g., Posner, 1978; Posner, Snyder, & Davidson, 1980), attention operates like a beam of light that enhances processing of stimuli that occupy regions that fall within its boundary, and inhibits, or attenuates, processing of stimuli at locations outside its boundary. The spotlight metaphor of attention was developed to account for findings that emerged from Posner and colleagues’ (Posner, 1978; Posner, Nissen, & Ogden, 1978; Posner et al., 1980) seminal work using the spatial cueing task.

In a typical spatial cueing task, observers fixate on a centrally presented stimulus and are presented with a cue that directs attention to a specific spatial location. Following the cue, a target appears, and observers are required to either detect the onset of the target as quickly as possible or perform a target discrimination as quickly and as accurately as possible. Generally, three different cue types are used: valid cues that indicate where the target will appear, invalid cues that direct attention away from the target location, and neutral cues that alert the observer to the ensuing target onset in the absence of directional information (see Jonides & Mack, 1984, and Wright, Richard, & McDonald, 1995, in regard to methodological issues surrounding neutral cues).
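Performance in a task of this kind is commonly summarized as a benefit and a cost relative to the neutral baseline. The following is a minimal sketch of that arithmetic, assuming that RTs are available for each cue type. The function name and the RT values are hypothetical, invented purely for illustration, and are not taken from any of the studies cited here.

```python
from statistics import mean


def cueing_benefit_and_cost(rt_ms):
    """Return (benefit, cost) in ms from lists of RTs keyed by cue type.

    benefit = neutral RT - valid RT   (positive when valid cues speed responses)
    cost    = invalid RT - neutral RT (positive when invalid cues slow responses)
    """
    valid = mean(rt_ms["valid"])
    neutral = mean(rt_ms["neutral"])
    invalid = mean(rt_ms["invalid"])
    return neutral - valid, invalid - neutral


# Hypothetical RTs in milliseconds (illustrative values only).
example_rts = {
    "valid": [250, 270, 260],
    "neutral": [290, 300, 310],
    "invalid": [330, 350, 340],
}

benefit, cost = cueing_benefit_and_cost(example_rts)
print(f"benefit = {benefit:.0f} ms, cost = {cost:.0f} ms")
```

Both scores are expressed against the neutral condition, which is why the methodological concerns about neutral cues noted above bear directly on how costs and benefits are interpreted.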
Cost–benefit analyses of reaction time (RT) reveal that, relative to the neutral cueing condition, targets are detected faster when they appear at the cued location and slower when they appear at the uncued location (Posner et al., 1978). In these analyses, the perceptual cost associated with invalid cues is determined by subtracting mean neutral RT from mean invalid RT, and the perceptual benefit associated with valid cues is determined by subtracting mean valid RT from mean neutral RT. Perceptual discriminations are also more accurate for targets that appear at the cued than at the uncued location (Posner et al., 1980). The spotlight account of the data forwarded by Posner and colleagues contends that detection was more efficient at the cued location because attentional resources were allocated to that location before the onset of the target, whereas detection was less efficient at the uncued location because of the additional time it took to shift attention to the actual location of the target.

An important aspect of Posner and colleagues’ spotlight account is the notion that spatial attention must disengage from its origin before it can shift to another location (e.g., Posner & Petersen, 1990; Posner, Petersen, Fox, & Raichle, 1988). In fact, they propose that shifts of spatial attention involve three distinct processes, each of which is associated with a specific neural substrate. First, an area of the parietal lobe called the temporal-parietal junction (TPJ) activates to permit disengagement of spatial attention from the current orientation. Next, a midbrain structure called the superior colliculus determines the target location and initiates the attentional shift toward that location. Finally, an area of the thalamus called the pulvinar activates to facilitate engagement of attention at the new spatial location. Perhaps the most convincing evidence in support of the disengage function comes from research demonstrating that perceptual discriminations improve when the central fixation stimulus is removed just before the onset of peripherally presented targets (e.g., Mackeben & Nakayama, 1993; Pratt & Nghiem, 2000), a finding referred to as the “gap effect” (Saslow, 1967).

In contrast to the spotlight metaphor, which assumes that the size of the attentional beam is fixed, the zoom lens theory (Eriksen & St. James, 1986; Eriksen & Yeh, 1985) suggests that the spatial extent of attention can be focused or diffuse, depending on the demands of the task.
Support for that claim comes from an experiment by LaBerge (1983) that required observers to categorize five-letter words, the middle letter of five-letter words, or the middle letter of five-letter nonwords in a speeded RT task. Within each condition, critical probe trials were periodically presented. On those trials, observers categorized a probe stimulus that appeared in one of the five letter positions (+ signs occupied the remaining four positions). In the letter conditions, RTs were fastest when the probe appeared in the middle (i.e., attended) position, and slowest when it appeared at the first and fifth positions (i.e., the data formed a V-shaped RT function). In the word condition, however, RT was not affected by the position of the probe (i.e., the RT function was flat). The findings support the zoom lens theory by demonstrating that the focus of attention can indeed vary in size, in accordance with the demands of the task.

Zoom lens theory also proposes that the resolution of the visual system that is afforded by spatial attention is determined by the variable scope of the lens: focusing the lens yields higher resolution within a narrow spatial area, whereas widening the lens yields lower resolution over a broader spatial area (e.g., Eriksen & Yeh, 1985). Consistent with that proposal, Eriksen and St. James (1986) showed that RTs to targets increased as a function of the number of spatial pre-cues presented. In the experiment, between one and four contiguous locations of an eight-item circular array were spatially cued, followed by the presentation of the target and distractor items. The results showed that target discriminations became slower as the number of cued items in the display increased, suggesting that attentional resources are “watered down” as they are expanded over larger spatial areas (Eriksen & St. James, 1986, p. 235). Two other findings from that study are also worth noting: the entire array used in the experiment only subtended 1.5 degrees of visual angle, so the reported effects emerged when attention was quite focused; and the cue-to-target stimulus onset asynchrony (SOA) manipulation revealed that performance was asymptotic beyond 100 ms, suggesting that the zoom lens required approximately that much time to make adjustments within spatial areas of that size.

The space-based theories discussed above assume that the attentional field is indivisible (e.g., Eriksen & Yeh, 1985; Posner et al., 1980). Several studies, however, have found evidence to the contrary (e.g., Bichot, Cave, & Pashler, 1999; Gobell, Tseng, & Sperling, 2004; Kramer & Hahn, 1995). For example, Kramer and Hahn (1995) presented observers with displays of two targets with two spatially intervening distractors. On each trial, two boxes were used to pre-cue the target positions, and observers were required to report whether the letters that appeared within the boxes were the same or different. To avoid an attentional capture confound associated with abrupt onsets, targets and distractors were initially presented as square figure eights; then, shortly after the cues were presented, segments of the figure eights were removed to produce letters. On half of the trials, the distractors primed a same response, and on the other half, they primed a different response. The researchers found that both RT and discrimination accuracy were unaffected by the presence of the intervening distractors and concluded that “attention can be flexibly deployed and maintained on multiple locations” (Kramer & Hahn, 1995, p. 384).

Object-Based Models

The finding that the attentional field can be divided into multiple noncontiguous regions of space represents a major obstacle for space-based models of attention. According to object-based models, however, perceptual grouping mechanisms make it possible to divide the attentional field across spatially disparate or partially occluded stimuli (Driver & Baylis, 1989; Duncan, 1984; Kahneman, Treisman, & Gibbs, 1992; Kramer & Jacobson, 1991). Object-based models assert that attentional selection is determined by the number of objects that are present in the visual field and emphasize the influence of gestalt grouping factors on the distribution of attention (Duncan, 1984; Neisser, 1967).

One of the original object-based models contended that attentional selection is a two-stage process (Neisser, 1967). In the first stage, the visual field is preattentively parsed into perceptual units (i.e., into objects) according to the gestalt principles of grouping (e.g., similarity, proximity). Then, in the second stage, the object is analyzed in detail by focal attention. Proponents of object-based theories of selection generally argue that once attention is directed to an object, all parts or features of that object are automatically selected regardless of spatial location (e.g., Duncan, 1984; Egly, Rafal, & Driver, 1994; Kahneman & Treisman, 1984). As such, all parts of the same object are processed in a parallel fashion, whereas different objects are processed serially (Treisman, Kahneman, & Burkell, 1983).


In a classic demonstration of object-based attentional selection, Duncan (1984) presented observers with two spatially overlapping objects, each with two attributes: a line that was either dotted or dashed and tilted to the left or right, and a box that was either large or small, with a gap on the left or right side. Observers were told which two attributes they would need to report before each trial. The results showed that discriminations were less accurate when observers reported two attributes of different objects (i.e., one attribute of the line and one attribute of the box) than two attributes of the same object (i.e., both attributes of the line or the box). Given that the objects appeared in the same location, a space-based account of attentional selection would predict that performance would not differ across the same- and different-object conditions. Thus, the results clearly support an object-based theory of spatial attention.

Object-based attention is also supported by studies showing that attention automatically spreads across selected objects. In one such study, Egly et al. (1994) presented observers with two rectangular placeholders oriented lengthwise on either side of fixation (or oriented sideways above and below fixation). In each trial, one rectangle was cued by a brief bolding of one end, followed by the presentation of a target disk at the cued or uncued end of the cued rectangle, or at the end of the uncued rectangle that was adjacent to the cued location on the cued rectangle. Critically, the uncued end of the cued rectangle and the uncued end of the uncued rectangle were equidistant from the cued location. Despite equal spacing across the two uncued conditions, observers were faster to respond to targets that appeared at the uncued end of the cued object than at the uncued end of the uncued object, suggesting that attention had automatically spread across the cued object (Egly et al., 1994).
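The logic of the Egly et al. (1994) comparison can be expressed as two difference scores: a space-based cueing effect (uncued versus cued location within the cued object) and an object-based effect (uncued location on the uncued object versus the equidistant uncued location on the cued object). The sketch below is only illustrative. The condition labels, the function name, and the mean RTs are hypothetical and are not the values reported in the original study.

```python
def space_and_object_effects(mean_rt_ms):
    """Return (space_effect, object_effect) in ms.

    space_effect  = RT at uncued end of cued object - RT at cued location
    object_effect = RT at uncued object - RT at uncued end of cued object
    The two uncued locations are assumed to be equidistant from the cue,
    so a positive object_effect cannot be attributed to distance alone.
    """
    space_effect = mean_rt_ms["uncued_same_object"] - mean_rt_ms["cued"]
    object_effect = (
        mean_rt_ms["uncued_different_object"] - mean_rt_ms["uncued_same_object"]
    )
    return space_effect, object_effect


# Hypothetical mean RTs in milliseconds (illustrative values only).
example_means = {
    "cued": 320,
    "uncued_same_object": 360,
    "uncued_different_object": 375,
}

space_effect, object_effect = space_and_object_effects(example_means)
print(f"space-based effect = {space_effect} ms")
print(f"object-based effect = {object_effect} ms")
```

A purely space-based account predicts an object-based effect of zero under these assumptions, because distance from the cue is held constant across the two uncued conditions.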

Control of Spatial Attention

Spatial attention can be controlled reflexively by external stimuli that capture attention, or voluntarily by internally generated goals (cf. Klein, 2004). These two types of visual orienting are typically referred to as exogenously driven and endogenously driven attentional control, respectively (Posner, 1980). Similarly, a stimulus-driven shift of attention is said to be “pulled” to a peripheral location by the onset of a salient stimulus, and a goal-directed shift of attention is said to be “pushed” to a peripheral location following the cognitive interpretation of a centrally presented cue (e.g., Corbetta & Shulman, 2002).

One of the earliest studies to comprehensively examine the idea that both automatic and voluntary mechanisms can guide the allocation of spatial attention resources was conducted by Jonides (1981). He tested the independence of these two mechanisms by presenting observers with either a centrally presented directional cue (i.e., an arrow at fixation) or a peripherally presented location cue (i.e., an arrow adjacent to a potential target position in the periphery) and then asking them to perform a visual search for one of two targets in a circular array of eight letters. Across three experiments, Jonides showed that RTs to targets following central cues, but not peripheral cues, are slowed when mental resources are simultaneously consumed in a working memory task (i.e., holding a sequence of numbers in mind); cueing effects persist when observers are asked to ignore peripheral cues, but not when they are asked to ignore central cues; and cueing effects are modulated as a function of the relative proportion of central cues that observers expect to be presented with, but not as a function of the relative proportion of peripheral cues that they expect to be presented with. Together, the findings were taken as evidence that exogenously and endogenously driven spatial attention, activated by peripheral and central cues respectively, “differ in the extent to which they engage attention automatically” (Jonides, 1981, p. 200).

Although concluding that reflexive or voluntary control processes can independently guide the allocation of spatial attention, Jonides (1981) nevertheless assumed that the modes of orienting were parts of the same unitary attentional mechanism. According to that account, exogenously and endogenously driven attentional controls differ simply in the process by which they propel spatial attention to a specific spatial location or object in a region of space. Other researchers, however, have argued that automatic and voluntary orienting are distinct attentional mechanisms (cf. Klein & Shore, 2000).

The two-mechanism model of spatial attention put forward by Nakayama and Mackeben (1989) consists of a relatively primitive fast-acting transient component that is guided by stimulus-driven, or bottom-up, processes and affects perception at early stages of cortical processing, and a more recently evolved sustained component that is guided by goal-directed, or top-down, processes. According to the model, the transient component is responsible for generating rapid attentional shifts to a cued location, whereas the sustained component is needed to hold attention at that location (Nakayama & Mackeben, 1989).
In an empirical test of their two-mechanism model, Nakayama and Mackeben (1989) instructed observers to perform either a simple search (i.e., orientation distinguished the target from the distractors) or a conjunctive search (i.e., target orientation and color distinguished it from distractors) for a target amid an eight-by-eight display array. In the sustained attention condition, observers’ attention was directed to the valid target location by a location cue that remained visible for the duration of each trial and appeared at the same position in the array across all trials. Sustained attention was also investigated by informing observers of the valid target location without presenting them with a physical cue. In both scenarios, performance on cued (or informed) trials was compared with performance when no cue was presented. In the transient attention condition, spatial attention was directed to the valid target location by a spatially unpredictable location cue with an onset just before the display of the search array. The results showed that sustained attentional control facilitated performance in the conjunctive search task, but not in the simple search task (also see Treisman & Gelade, 1980), and that transient attentional control facilitated performance when the cue preceded the display array by 50 to 200 ms, but impaired performance at SOAs longer than 200 ms.

A two-mechanism model of spatial attention was also supported by the results of a study showing that reflexive and voluntary orientations have different time courses with respect to their respective facilitative and inhibitory effects on perception. Muller and Rabbitt (1989) presented observers with a central or a peripheral cue that directed their attention to one of four target locations. Following the cue, a target stimulus appeared at the cued or an uncued location, and observers were required to discriminate whether it was the same or different from a previously presented stimulus. Peripheral cues, which activate reflexive orienting, produced maximal facilitation of target discriminations at the cued location at short (100–175 ms) SOAs and also improved performance at the uncued location at longer (400–725 ms) SOAs. In contrast, central cues, which activate voluntary orienting, facilitated performance at the cued location maximally at SOAs between 275 and 400 ms. Based on these results, the researchers proposed that spatial orienting is composed of a fast-acting mechanism that briefly facilitates, but then inhibits, attentional processing at peripherally cued locations, and a slower-acting mechanism, activated only by central cues, that facilitates attentional processing for a more sustained interval (Muller & Rabbitt, 1989).

It is noteworthy that the results from both studies that were just reviewed are consistent with the findings from an earlier study by Posner and Cohen (1984) showing that an inhibition of return (IOR) effect occurs when spatial attention is directed exogenously in response to peripheral cues, but not when it is directed endogenously by central cues. In the study, observers were presented with three placeholder boxes along the horizontal meridian of the screen. In the peripheral cueing procedure, one of the peripheral boxes brightened briefly, followed by brief brightening of the center box (i.e., to reorient spatial attention away from the initially cued location). The central cueing procedure was similar, except an arrow stimulus appeared in the center box, followed by the brightening of that box 600 ms later (if no target had appeared in a peripheral box 450 ms following the onset of the central cue).
In line with previous research (e.g., Posner et al., 1978, 1980), peripheral cues and central cues produced reliable cueing effects: target detection was most efficient when the target appeared at the cued location. However, although the facilitative effect of the cue persisted across the entire range of cue–target SOAs when central cues were used, target detection was actually inhibited at the cued location at SOAs beyond 200 ms when peripheral cues were used. Posner and Cohen (1984) called the effect IOR and suggested that it reflected an evolved mechanism that promoted efficient visual searching of the environment (i.e., by assigning inhibitory tags to previously examined spatial locations). The finding that the IOR effect occurs when attention is controlled exogenously, but not when it is under endogenous control (but see Lupianez et al., 2004 for an exception), clearly supports a two-mechanism model of spatial attention.

Perhaps the strongest evidence supporting the distinction between exogenous and endogenous attentional orienting comes from Klein and colleagues’ research (e.g., Briand & Klein, 1987; Klein, 1994; Klein & Hansen, 1990; also see Funes, Lupianez, & Milliken, 2007, for a double dissociation using a spatial Stroop task) showing a double dissociation between the two components of spatial attention. In one study, Briand and Klein (1987) dissociated the two mechanisms by examining the effect of spatial cueing on the likelihood that observers would make illusory conjunction errors. Spatial attention was directed to the left or right of fixation by peripheral or central cues. In each trial, a pair of letters appeared at the cued or uncued location and observers were required to report whether the target letter R was present or absent. The critical comparison concerned the difference in performance when the target was absent and the letter pair promoted an illusory conjunction of the target (i.e., PQ), as opposed to when the target was absent and the letter pair did not promote an illusory conjunction of the target (i.e., PB). The conditions were referred to as conjunction and feature search conditions, respectively. The results revealed that search type interacted with spatial attention when exogenous peripheral cues were used, but not when endogenous central cues were used. Thus, the possibility of making an illusory conjunction error impaired performance when spatial attention was controlled exogenously, but not when it was controlled endogenously. Briand and Klein (1987) concluded that central and peripheral cues engage different attentional systems, and that feature integration processes are only performed by the exogenously driven component of spatial attention.

In another study, Klein (1994) dissociated the two mechanisms of spatial attention by examining the effect of spatial cueing on nonspatial (i.e., perceptual motor) expectancies. Covert spatial attention was directed by peripheral or central cues to a box on the left or right side of fixation, and on each trial the cued or uncued box either increased or decreased in size. For half the observers an increase in target size was far more likely to occur, and for the other half a decrease in target size was far more likely to occur. The results indicated that nonspatial expectancies interacted with spatial attention when endogenous central cues were used, but not when exogenous peripheral cues were used. Specifically, performance was impaired by the occurrence of an unexpected event at the uncued location when central cues were employed, whereas the effect of exogenous attention on performance was the same for expected and unexpected events.
Klein (1994) concluded that, when taken together with evidence from the Briand and Klein (1987) study, the results demonstrated that exogenously and endogenously controlled attention recruit qualitatively different attentional mechanisms. In other words, “it is not just the vehicle, but the passenger, that might differ with endogenous versus exogenous control” (Klein, 1994, p. 169).

Eye Movements and Spatial Attention

Orienting of spatial attention to specific locations or objects can be performed covertly or overtly. Covert shifts of spatial attention involve internal movement of the “mind’s eye,” whereas overt shifts of spatial attention involve observable eye movements. A considerable amount of research has been conducted in an attempt to understand the nature of the relationship between eye movements and spatial attention (e.g., Goldberg & Wurtz, 1972; Klein, 1994; Posner, 1980; Remington, 1980; Rizzolatti, Riggio, Dascola, & Umilta, 1987; Shepherd, Findlay, & Hockey, 1986). Specifically, a number of studies have examined whether the processes are controlled by the same mechanism, whether they are completely independent mechanisms, or whether they interact with each other as interdependent mechanisms.

Early single-cell recording research showed that programming an eye movement caused increased firing rates in cells in the superior colliculus whose receptive fields were at the target location, well before the actual eye movement had begun (Goldberg & Wurtz, 1972). These findings supported the notion that a relationship exists between eye movements and spatial attention and that the superior colliculus may be the neural substrate that manages that relationship. However, since then, several studies have demonstrated that the perceptual costs and benefits produced by spatial cueing manipulations can occur in the absence of eye movements (e.g., Bashinski & Bacharach, 1980; Eriksen & Hoffman, 1972; Posner et al., 1978). The fact that the effects of spatial attention can occur in the absence of eye movements suggests that attention shifts and eye movements are mediated by different neural structures. In fact, Posner (1980) concluded that when considered together, the behavioral, electrophysiological, and single-cell recording findings are evidence that “eliminates the idea that attention and eye movements are identical systems” (p. 13).

Although spatial attention and eye movements are not identical (Posner, 1980), the oculomotor readiness hypothesis (OMRH) (Klein, 1980) and premotor theory (Rizzolatti, Riggio, Dascola, & Umilta, 1987) nevertheless propose that they are two processes of a unitary mechanism. Both theories contend that endogenously controlled covert shifts of attention prepare and facilitate the execution of eye movements to a target location. Or as Rizzolatti et al. (1987) simply put it: Eye movements follow attention shifts. In this view, the preparation of an eye movement to a target location is the endogenous orienting mechanism (Klein, 2004). Thus, covert endogenous spatial attention is simply an intended, but unexecuted, eye movement (Rizzolatti et al., 1987). Critically, the theories predict that covert shifts of spatial attention facilitate eye movements when they are in the same direction, and that sensory processing should be enhanced at the location of a programmed eye movement.
Despite representing the dominant view concerning the nature of the relationship between spatial attention and eye movements (Palmer, 1999), the research has yielded equivocal results concerning the two critical predictions that emerge from the OMRH and premotor theory.

Klein (1980; see also Hunt & Kingstone, 2003; Klein & Pontefract, 1994) conducted two experiments to test the predictions of his OMRH. In one experiment, observers were presented with a central cue, and then, depending on the type of trial, they either made an eye movement or a detection response to a target at the cued or uncued location (i.e., the type of response was determined by the physical characteristics of the target). Most trials required a detection response, and performance in these trials suggested that the participants had indeed shifted their attention in response to the cue. However, in contrast to what would be predicted by the OMRH and premotor theory (Rizzolatti et al., 1987), eye movements to a target at the cued location were no faster than they were to a target at the uncued location (Klein, 1980). That result is not consistent with the OMRH prediction that covert shifts of spatial attention facilitate eye movements when they are in the same direction.

In his second experiment, Klein (1980) instructed some observers to perform a leftward eye movement on every trial and others to perform a rightward eye movement on every trial, regardless of whether the target appeared on the left or right of central fixation. Critically, on some trials, instead of executing an eye movement, observers simply needed to make a detection response to the target. Again, the results were inconsistent with the OMRH and premotor theory. Instead of RTs being facilitated when detection targets appeared in the location of the prepared eye movement, as the OMRH and premotor theory would predict, RTs were unaffected by the spatial compatibility of the target and the direction of the eye movement. These findings and others (e.g., Hunt & Kingstone, 2003; Posner, 1980; Remington, 1980) refute the notion that covert spatial attention shifts are unexecuted eye movements, and rather suggest that covert spatial attention and overt eye movements are independent mechanisms (Klein, 1980; Klein & Pontefract, 1994).

Other extant research, however, does suggest that covert spatial attention and overt eye movements are interdependent. For example, the results of a study conducted by Shepherd et al. (1986) showed that spatial attention and eye movements form an asymmetrical relationship, such that an attention shift does not require a corresponding eye movement, but an eye movement necessarily involves a shift in the focus of attention. They examined covert and overt attentional orienting independently by manipulating the validity of a central cue and by instructing observers to either prepare and execute, or prepare and inhibit, an eye movement to the target location, respectively. In a blocked design, observers were required to remain fixated (i.e., fixate condition) or make an eye movement (i.e., move condition) following the presentation of either a valid (80 percent), uninformative (50 percent), or invalid (20 percent) central cue, and then respond to the detection of a target at the cued or uncued location. Not surprisingly, the results showed a significant reduction in detection RT when the move condition was coupled with the valid spatial cue.
Interestingly, a benefit to RT was also found in the move condition when the cue was invalid, suggesting that eye movements have a more dominant effect on performance than endogenously controlled attention (Shepherd et al., 1986). Most germane to the OMRH (Klein, 1980) and premotor theory (Rizzolatti et al., 1987), however, was the large benefit to RT that was found in the move condition when the cue was uninformative. Accordingly, the authors concluded that making an eye movement “involves some allocation of attention to the target position before the movement starts” (Shepherd et al., 1986, p. 486). This demonstration of the interdependence between spatial attention and eye movements is at least in partial support of the OMRH (Klein, 1980) and premotor theory (Rizzolatti et al., 1987).

The findings from a more recent study conducted by Hoffman and Subramanian (1995) also support the OMRH (Klein, 1980) and premotor theory (Rizzolatti et al., 1987) by showing that a close relationship exists between spatial attention and eye movements. In their first experiment, observers were presented with four placeholders around fixation (i.e., above and below, and on either side) followed by an uninformative central cue indicating the placeholder toward which they should prepare an eye movement (but providing no prediction about the target location). Shortly after the directional cue was removed, a tone was presented to cue observers to initiate the planned eye movement. When the tone was presented or immediately afterward, a letter was presented in each placeholder (i.e., three distractors and one target), and observers were required to make a two-alternative forced-choice judgment regarding the identity of the target. The results revealed that target discriminations were approximately 20 percent better when the target appeared in the location of the intended eye movement compared with when the target appeared at one of the uncued locations. Because this finding indicates that observers attended to the location of the intended eye movement, even though they were aware that the cue did not validly predict the target location, the researchers concluded that spatial attention and eye movements are indeed related, such that eye movements to a given target location are preceded by a shift of spatial attention to that location (Hoffman & Subramanian, 1995).

In their second experiment, Hoffman and Subramanian (1995) dissociated spatial attention and eye movements by requiring observers to prepare eye movements in the same direction on each trial, before presenting them with a valid central cue. Thus, on some trials, the intended eye movement and the target location matched, and on other trials they mismatched. The results showed that regardless of whether the cue directed attention to the target location or not (i.e., whether the cue was valid or not), target discriminations were highly accurate when the target location and the intended eye movement matched, and target discriminations were poor when the target location and the intended eye movement mismatched. The findings from this experiment indicate that spatial attention cannot be easily directed to a different location than an intended eye movement and suggest that an “obligatory” relationship exists between spatial attention and eye movements such that covert spatial orienting precedes overt orienting (Hoffman & Subramanian, 1995). Thus, the results of both experiments reported by Hoffman and Subramanian (1995; see also Shepherd et al., 1986) support the OMRH (Klein, 1980) and premotor theory (Rizzolatti et al., 1987).
Taken together, the studies reviewed above indicate that spatial attention and eye move ments are interdependent mechanisms: Although a shift of attention can be made in the absence of an eye movement, an eye movement cannot be executed without a preceding shift of spatial attention. Moreover, because spatial attention appears to play a critical role in initiating eye movements, rather than vice versa, it can be concluded that “atten tion is the primary mechanism of visual selection, with eye movements playing an impor tant but secondary role” (Palmer, 1999, p. 570).

Neural Sources of Spatial Attention

The advent of modern neuroimaging techniques has afforded researchers great insight into the effects of spatial attention on neural activity in humans (e.g., Corbetta, Miezin, Dobmeyer, Shulman, & Petersen, 1993; Desimone & Duncan, 1995; Kanwisher & Wojciulik, 2000). Subcortical and cortical sources have been implicated in spatial attention.

Subcortical Networks of Spatial Attention

One subcortical substrate, the superior colliculus, plays a key role in reflexive shifts of attention (it is also involved in localization of stimuli and eye movements) (Wright & Ward, 2008). Another area called the frontal eye field (FEF) is particularly important for executing voluntary shifts of attention (Paus, 1996). The critical connection between the FEF and attention shifts would be expected given the interdependent relationship between spatial attention and eye movements that was illustrated in the previous section. Some researchers suggest that the superior colliculus and FEF make up a subcortical network that interacts to control the allocation of spatial attention. Specifically, it has been suggested that the FEF may serve to inhibit reflexive attention shifts generated by the superior colliculus (cf. Wright & Ward, 2008).

Areas of the thalamus are also important subcortical regions associated with spatial attention. One area, called the pulvinar nucleus, is critically involved in covert spatial orienting (e.g., Robinson & Petersen, 1992). In a single-cell recording study, Petersen, Robinson, and Morris (1987) measured activity of neurons in the dorsomedial part of the lateral pulvinar (Pdm) of trained rhesus monkeys while they fixated centrally and made speeded responses to targets appearing at peripherally cued or uncued locations. First, the researchers confirmed that the Pdm is related to spatial attention, and is independent of eye movements, by observing enhanced activity in that area when the monkeys covertly attended to the target location. Having established that the Pdm is related to spatial attention, Petersen et al. next examined changes in attentional performance that resulted from pharmacological alteration of the Pdm by GABA-related drugs (i.e., muscimol, a GABA agonist, and bicuculline, a GABA antagonist). The results showed that pharmacological alteration of this part of the brain did in fact alter performance in the spatial cueing task: Injections of muscimol, which increases inhibition by increasing GABA effectiveness, impaired the monkeys’ ability to execute contralateral attention shifts, and injections of bicuculline, which decreases inhibition by decreasing GABA effectiveness, facilitated the monkeys’ ability to shift attention to the contralateral field.
Given that these modulations of attentional performance were produced by drug injections to the Pdm, it is reasonable to conclude that the Pdm is implicated in spatial attention (Petersen et al., 1987). A study by O’Connor, Fukui, Pinsk, and Kastner (2002) that will be reviewed later in the chapter shows that activity in another area of the thalamus, called the lateral geniculate nucleus (LGN), is also modulated by spatial attention.

Cortical Networks of Spatial Attention

In addition to the subcortical sources discussed above, multiple cortical sources are critically involved in the allocation of spatial attention resources. A number of different attentional control processes involve various areas of the parietal cortex (e.g., Corbetta, 1998; Kanwisher & Wojciulik, 2000). For example, Yantis, Schwarzbach, Serences, et al. (2002) used event-related functional magnetic resonance imaging (fMRI) to examine changes in brain activity that occurred when observers made covert attention shifts between two peripheral target locations. In particular, the researchers were interested in determining whether the activation of parietal cortex is associated with transient or sustained attentional control. The scans revealed that while shifts of spatial attention produced sustained contralateral activation in extrastriate cortex, they produced only transient increases in activation in the posterior parietal cortex (Yantis et al., 2002). Accordingly, the authors concluded, “activation of the parietal cortex is associated with a discrete signal to shift spatial attention, and is not the source of a signal to continuously maintain the current attentive state” (Yantis et al., 2002, p. 995).

Activation of the parietal cortex is associated with top-down attentional control that is specifically related to the processing of the spatial pre-cue. That was demonstrated in an event-related fMRI study that Hopfinger, Buonocore, and Mangun (2000) designed to dissociate neural activity associated with cue-related attentional control from the selective sensory processing associated with target perception. Valid central cues were used to direct spatial attention to one of the black-and-white checkerboard targets that were presented on either side of fixation. The task required observers to covertly attend to the cued checkerboard and report whether or not it contained some gray checks. The fMRI scans revealed that the cues activated a network for voluntary attentional control comprising the inferior parietal (particularly the intraparietal sulcus [IPS]), superior temporal, and superior frontal cortices. Moreover, the scans showed contralateral cue-related activation in areas of extrastriate cortex that represented the spatial location of the ensuing target. Taken together, these findings indicate that a neural network comprising areas of parietal, temporal, and frontal cortex is associated with top-down attentional control, and that this network in turn modulates activity in areas of visual cortex where the target is expected to appear.

Although one area of the parietal lobule, the IPS, is involved in generating and sustaining voluntary attention toward a cued location in the absence of sensory stimulation (Friedrich, Egly, Rafal, & Beck, 1998; Hopfinger et al., 2000; Yantis et al., 2002), another area, TPJ, is activated by unexpected stimulus onsets, particularly at unattended locations (e.g., Serences et al., 2005; Shulman et al., 2003).
Thus, the neuroimaging evidence from the studies reviewed above indicates that several distinct areas of the parietal cortex are critically involved in a number of components of spatial attention (Serences et al., 2005, p. 1000).

Abundant findings from neuroimaging research also indicate that the parietal cortex interconnects with the frontal cortex to form a complex cortical network for spatial attention (e.g., Corbetta, 1998; Corbetta & Shulman, 2002; Kastner & Ungerleider, 2000). Positron emission tomography (PET) was used in a classic study by Corbetta et al. (1993) to investigate the brain areas involved in the voluntary orienting of spatial attention. Spatial attention was directed endogenously by informative central cues and by instructions to covertly shift attention to the most probable target location. The behavioral findings showed the expected pattern of costs and benefits to RT (i.e., targets were detected faster at cued, and slower at uncued, locations), confirming the effectiveness of the cue in orienting attention. The PET scans revealed significant activation of superior (i.e., dorsal) areas in both the parietal and frontal cortex during performance of the spatial cueing task.

Dorsal and Ventral Frontoparietal Networks

The behavioral studies reviewed in the previous section on the control of spatial attention provide compelling evidence that it is governed by two independent, and perhaps interacting, mechanisms. One mechanism shifts reflexively, provides transient attentional effects on perception, and is under exogenous (i.e., stimulus-driven) control. The other mechanism is associated with voluntary shifts, provides sustained attentional effects on perception, and is under endogenous (i.e., goal-directed) control. Moreover, given the assertion made by researchers such as Klein (1994; Klein & Shore, 2000) that these two attentional mechanisms are specialized to process different types of visual information, one might reasonably assume that they are each associated with distinct areas of the brain. In fact, in an influential review of the literature, Corbetta and Shulman (2002) suggested that spatial attention is associated with two “partially segregated networks of brain areas that carry out different attentional functions” (p. 201). On one hand, they propose that control of goal-directed (i.e., voluntary) spatial attention is accomplished in a bilateral neural network that sends top-down signals from parts of the superior frontal cortex (i.e., FEF) to areas of the parietal cortex (i.e., IPS). On the other hand, they propose that control of stimulus-driven (i.e., reflexive) spatial attention takes place in a right-lateralized neural network involving temporal-parietal cortex (i.e., TPJ) and inferior frontal cortex. Corbetta and Shulman (2002) refer to these two streams as the dorsal frontoparietal network and ventral frontoparietal network, respectively.

The dorsal and ventral frontoparietal networks play separate roles in the control of attention.
The dorsal network performs attentional processing associated with voluntary control of attention (e.g., Corbetta et al., 2000; Hopfinger et al., 2000; Yantis et al., 2002), and the ventral network is more involved in attentional processing associated with the onset and detection of salient and behaviorally relevant stimuli in the environment (Corbetta & Shulman, 2002; Marois, Chun, & Gore, 2000). While Corbetta and Shulman (2002) contend that the two networks are partially segregated, they also posit that the systems interact. Specifically, they suggest that one of the important functions of the ventral network is to interrupt processing in the dorsal stream when a behaviorally relevant stimulus is detected (Corbetta & Shulman, 2002). This “circuit breaker” role of the ventral stream would facilitate the disengagement and reorientation of attention (Corbetta & Shulman, 2002).

Interaction between the dorsal and ventral frontoparietal networks may also be required for the spatial localization of salient stimuli. The ventral network is specialized to detect the onset of salient stimuli, but localizing such stimuli in space probably requires assistance from the dorsal network. Consistent with that idea, an influential theory of visual processing postulated by Ungerleider and Mishkin (1982) contends that the ventral pathway of the visual system determines “what is out there,” whereas the dorsal pathway of the visual system determines “where it is” (see also Ungerleider & Mishkin, 1992). Results from a recent study by Fox, Corbetta, Snyder, Vincent, and Raichle (2006) provided some support for Corbetta and Shulman’s (2002) contention that the exchange of information between the dorsal and ventral networks is accomplished by interconnections between the right IPS and right TPJ. However, that study also found that the correlation of neural activity between the right IPS and right TPJ was no stronger than it was between several other areas across the two networks (e.g., FEF/TPJ, IPS/VFC), and the authors also conceded that spatial reorienting behaviors have been shown to persist even in patients with IPS lesions (Fox et al., 2006). Thus, the signal from the ventral network must be able to access the dorsal stream in areas other than the IPS.

Neurophysiological Effects of Spatial Attention

Spatially attended stimuli are perceived differently than unattended stimuli. Research has revealed a number of ways that attention modulates the neural activity that gives rise to our subjective experience of the visual world. Given the information processing constraints of the visual system (e.g., Desimone & Duncan, 1995), attention is needed to serve as a selection mechanism that actively enhances processing of some stimuli at the expense of others (e.g., Petersen et al., 1987). In a classic study of spatial attention, Moran and Desimone (1985) investigated how attentional mechanisms filter (i.e., select) wanted from unwanted stimuli. They recorded the activity of single cells in visual cortex of monkeys trained to perform a match-to-sample task. The monkeys covertly attended to either an effective stimulus (i.e., one that elicits a response from the cell) or an ineffective stimulus and were required to determine whether sequentially presented stimuli at the attended location were the same or different. When the effective and the ineffective stimuli were both inside the cell’s receptive field, the response was determined by the attended stimulus: the cell responded strongly when the monkey attended to the effective stimulus, and it responded poorly when the monkey attended to the ineffective stimulus. Thus, when multiple stimuli fall within the receptive field of a single cell, the response is determined by the attended stimulus. Indeed, in V4 the consequence of ignoring an effective stimulus inside the receptive field was a reduction in the cell’s response by more than half (Moran & Desimone, 1985).
In contrast, when the ineffective stimulus was moved outside the cell’s receptive field, and the effective stimulus was left inside, the cell’s response to the effective stimulus was the same whether the animal attended to the ineffective stimulus outside the cell’s receptive field or attended to the effective stimulus inside the cell’s receptive field. That pattern of data led the researchers to conclude that attentional mechanisms do not serve to enhance responses to attended stimuli; rather, the neural basis of spatial attention is to attenuate processing of irrelevant information “as if the receptive field has contracted around the attended stimulus” (Moran & Desimone, 1985, p. 783).

Mangun and Hillyard (1991) investigated the neural bases of the ubiquitous perceptual benefits of spatial attention (e.g., more efficient and accurate detections and discriminations) using event-related potentials (ERPs) (i.e., changes in the electrophysiological activity in the brain time-locked to the presentation of an external stimulus). Central cues were used to direct spatial attention toward or away from subsequently presented peripheral targets. Observers detected or made choice discriminations about targets that appeared at covertly attended or unattended locations while the electrical activity from cortical neurons was measured from their scalps. Visual onsets produce predictable early responses over visual cortex called the P1 and N1 waveform components (e.g., Eason, 1981; Van Voorhis & Hillyard, 1977). The P1 is the first major positive deflection, occurring between 70 and 100 ms after the presentation of a visual stimulus, and the N1 is the first major negative component that is more broadly distributed and occurs about 150 to 200 ms after the presentation of a visual stimulus (Mangun & Hillyard, 1991). The recordings revealed that the P1 was larger for attended than unattended targets in both the detection and the discrimination tasks; however, the amplitude of the N1 component only differed for attended and unattended targets in the discrimination task. These findings led Mangun and Hillyard (1991; Hillyard & Mangun, 1987) to conclude that spatial attention facilitates a sensory gain control mechanism that enhances processing associated with sensory signals emanating from attended locations. Interestingly, the observed attention-related dissociation between the P1 and N1 waveform components has since been replicated, and has been interpreted as evidence that the costs and benefits produced by spatial cues may reflect the activity of qualitatively different neural mechanisms (cf. Luck, 1995).

Similar attention-related increases in neural activity have been shown using fMRI. In a study by O’Connor et al. (2002), observers were presented with high- or low-contrast flickering checkerboards on either side of fixation while they were in the scanner.
There were two viewing conditions in the experiment: In the attended condition, observers covertly attended to the checkerboard on the left or right of fixation and responded when they detected random changes in target luminance; in the unattended condition, instead of shifting attention, observers counted letters that were presented at fixation. The results showed increased fMRI signal change in the LGN and visual cortex in the attended condition compared with the unattended condition.

In the same study, O’Connor et al. (2002) also investigated the effect of spatial attention on processing of nonselected, or unattended, information. They assumed that the spread of spatial attention would be determined (i.e., constrained) by the relative amount of cognitive effort that was required to perform a perceptual task at fixation. Specifically, because of the limited resource capacity of the attention system, they predicted that the amount of processing devoted to an unattended stimulus would be determined by the amount of resources not consumed by the attended stimulus. To test their prediction, observers were required to perform either a demanding (high-load) or easy (low-load) task at fixation while ignoring checkerboard stimuli presented in the periphery. As expected, the results showed a decrease in activation across the visual cortex in the demanding condition compared with the easy condition. Thus, activation of the LGN and visual cortex is enhanced when observers voluntarily attend to a peripheral stimulus, and it is attenuated when the same stimulus is voluntarily ignored (O’Connor et al., 2002). The researchers concluded that spatial attention facilitates visual perception by “enhancing neural responses to an attended stimulus relative to those evoked by the same stimulus when ignored” (Kastner, McMains, & Beck, 2009, p. 206).

Effects of Spatial Attention on Early Visual Perception

Abundant research has shown that spatial attention enhances performance in tasks that are based on low-level visual perception (see Carrasco, 2006, for a review). Behavioral studies have traditionally investigated the relationship between spatial attention and early vision using basic dimensions such as contrast sensitivity, spatial sensitivity, and temporal sensitivity as indices of early visual processing.

Effects of Spatial Attention on Contrast Sensitivity

Spatial attention increases contrast sensitivity (see Carrasco, 2006, for a review). Cameron, Tai, and Carrasco (2002) examined the effect of covert spatial attention on contrast sensitivity in an orientation discrimination task. Spatial attention was manipulated by using an informative (i.e., 100 percent valid) peripheral cue that appeared at one of eight locations in an imaginary circular array around the fixation point, or a neutral cue that appeared at fixation. Following the onset of the cue, observers were briefly presented with a tilted sine wave grating (i.e., alternating fuzzy black lines in a Gaussian envelope) at varying stimulus contrasts. The researchers found that observers’ contrast thresholds were lower at the attended than the unattended location. In other words, the contrast needed for observers to perform the task at a given level of accuracy was different at attended and unattended locations: They attained a threshold level of performance in the orientation discrimination task at a lower contrast when targets appeared at the attended than the unattended location.

Similar findings were reported by Pestilli and Carrasco (2005) in a study that manipulated spatial attention using pure exogenous cues (i.e., peripheral and uninformative). They presented a peripheral cue (50 percent valid with two locations) or a neutral cue (i.e., at fixation) followed by the brief presentation of a tilted sine wave grating on either side of fixation. Shortly after the removal of the target gratings (which were presented at varying contrasts), a centrally presented stimulus indicated to the observer whether the orientation of the left or right target grating was to be reported.
Thus, trials were divided into three equally probable types: On valid trials the peripheral cue and the response cue were spatially congruent; on invalid trials the peripheral cue and the response cue were spatially incongruent; and on neutral trials the response cue, which followed the centrally presented neutral cue, was equally likely to point to the left or right target. This elegant experimental design permitted the researchers to evaluate the effect of attention on contrast sensitivity at cued and uncued locations. The results showed both a benefit and a cost of spatial attention. That is, relative to the neutral condition, contrast sensitivity was enhanced at the attended location and was impaired at the unattended location. Pestilli and Carrasco (2005) concluded that there is a processing tradeoff associated with spatial attention such that the benefit to perception at the attended location means that fewer cortical resources are available for perceptual processing at unattended spatial locations.
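
The cost-benefit logic of such a design can be made concrete with a short computation. The Python sketch below is purely illustrative and is not the analysis reported by Pestilli and Carrasco (2005): the accuracy values are invented, the 75 percent accuracy criterion is an assumption, and the function names `threshold_contrast` and `sensitivity` are hypothetical. It estimates a contrast threshold for each cue condition by linear interpolation, takes sensitivity as the reciprocal of the threshold, and expresses benefit and cost relative to the neutral condition.

```python
from typing import Dict, List, Sequence


def threshold_contrast(contrasts: Sequence[float],
                       accuracy: Sequence[float],
                       criterion: float = 0.75) -> float:
    """Return the contrast at which accuracy reaches `criterion`.

    Uses linear interpolation between neighboring data points and assumes
    `contrasts` is sorted in ascending order.
    """
    points = list(zip(contrasts, accuracy))
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 <= criterion <= a1 and a1 > a0:
            return c0 + (criterion - a0) * (c1 - c0) / (a1 - a0)
    raise ValueError("criterion is not bracketed by the data")


def sensitivity(threshold: float) -> float:
    """Contrast sensitivity, defined here as the reciprocal of threshold."""
    return 1.0 / threshold


# Invented proportions correct at each stimulus contrast (illustration only).
contrasts: List[float] = [0.02, 0.04, 0.08, 0.16]
accuracy_by_cue: Dict[str, List[float]] = {
    "valid":   [0.55, 0.75, 0.90, 0.98],
    "neutral": [0.52, 0.65, 0.85, 0.97],
    "invalid": [0.50, 0.58, 0.75, 0.95],
}

thresholds = {cue: threshold_contrast(contrasts, acc)
              for cue, acc in accuracy_by_cue.items()}
sens = {cue: sensitivity(t) for cue, t in thresholds.items()}

# Benefit and cost are both measured against the neutral baseline.
benefit = sens["valid"] - sens["neutral"]
cost = sens["neutral"] - sens["invalid"]

for cue in ("valid", "neutral", "invalid"):
    print(cue, round(thresholds[cue], 3), round(sens[cue], 2))
print("benefit:", round(benefit, 2), "cost:", round(cost, 2))
```

With these made-up numbers, the valid-cue threshold falls below the neutral threshold and the invalid-cue threshold rises above it, which is the qualitative pattern of benefit and cost described above. Published studies typically fit a full psychometric function rather than interpolating between data points.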

Effects of Spatial Attention on Spatial Sensitivity

An extensive amount of research has demonstrated that spatial attention improves spatial resolution (e.g., Balz & Hock, 1997; Tsal & Shalev, 1996; Yeshurun & Carrasco, 1998, 1999; see Carrasco & Yeshurun, 2009, for a review). In a study conducted by Yeshurun and Carrasco (1999), which used peripheral cues to direct covert spatial attention, performance was better at attended than unattended locations in a variety of spatial resolution tasks that required observers to localize a spatial gap, discriminate between dotted and dashed lines, or determine the direction of vernier line displacements. In another study (Yeshurun & Carrasco, 1998), these researchers showed that spatial attention modulates spatial resolution via signal enhancement (i.e., as opposed to attenuating noise or changing decisional criteria). Observers performed a two-interval forced-choice texture-segregation task in which they reported whether the first or second display contained a unique texture patch. On peripheral cue trials, a cue in one interval validly predicted the location of the target, but not the interval that contained the target, and a cue in another interval simply appeared at a nontarget location. Neutral cues were physically distinct from the peripheral cues, and they provided no information concerning where the target would be in either interval. Both peripheral and neutral cues were presented at a range of eccentricities from the fovea. Interestingly, but in line with the researchers’ prediction, performance was better in cued than neutral trials at all target eccentricities except those at, or immediately adjacent to, the fovea.
According to Yeshurun and Carrasco (1998), this counterintuitive pattern of results occurred because spatial attention caused the already small spatial filters at the fovea to become so small that their resolution was beyond what the texture segregation task required. In other words, by enhancing spatial resolution, attention impaired task performance at central retinal locations. Justifiably, it was concluded that one way in which attention enhances spatial resolution is via signal enhancement (Yeshurun & Carrasco, 1998).

Effects of Spatial Attention on Temporal Sensitivity

Research has also revealed counterintuitive effects of spatial attention on temporal sensitivity. Yeshurun and Levy (2003) investigated the effect of spatial attention on temporal sensitivity using a temporal gap detection task. Observers judged whether a briefly presented target disc was on continuously, or contained a brief offset (i.e., a temporal gap). Targets appeared following valid peripheral cues or after a physically distinct neutral cue that diffused spatial attention across the entire horizontal meridian. The results indicated that temporal sensitivity was actually worse on valid peripheral cue trials than on diffuse neutral cue trials. To account for the counterintuitive finding that spatial attention impairs temporal resolution, Yeshurun and Levy (2003) suggested that spatial attention produces an inhibitory interaction that activates the parvocellular visual pathway and inhibits the magnocellular visual pathway. To the extent that spatial resolution depends on processing in the parvocellular pathway and temporal resolution depends on processing in the magnocellular pathway, an inhibitory interaction of this nature could explain why spatial attention enhances spatial resolution and degrades temporal resolution.

The effect of spatial attention on temporal sensitivity was further investigated in a study by Hein, Rolke, and Ulrich (2006) that employed both peripheral and central cues in separate temporal order judgment (TOJ) tasks. Two target dots, side by side, were presented asynchronously at either the attended or unattended location, and observers reported which target appeared first. Consistent with the results reported by Yeshurun and Levy (2003), when peripheral cues were used, TOJ performance was worse at the attended than the unattended location. However, when central cues were used, performance was more accurate at the attended than the unattended location. To account for the qualitative difference in performance across the two cue types, Hein et al. (2006) adhered to the idea that automatic and voluntary shifts of spatial attention affect different stages of the visual system (see also Briand & Klein, 1987). Specifically, they suggested that automatic shifts of attention influence early stages of processing and impair the temporal resolution of the visual system, whereas voluntary shifts of attention influence processing at higher levels and improve the temporal resolution of the visual system (Hein et al., 2006).

According to Titchener (1908), “the object of attention comes to consciousness more quickly than the objects that we are not attending to” (p. 251). He called this the “law of prior entry.” Shore, Spence, and Klein (2001; Shore & Spence, 2005; Spence, Shore, & Klein, 2001) investigated the prior entry effect of spatial attention using a visual temporal order task. Peripheral exogenous cues or central endogenous cues were presented, followed by the asynchronous onset of one target line segment on either side of fixation.
On one half of the trials, the target at the cued location appeared first, and on the other half of the trials, the target at the uncued location appeared first. One target was a vertical line, and the other was a horizontal line, and observers were required to report the one they perceived first. The response and the cue were orthogonal in an attempt to reduce the tendency of subjects to simply report that the target at the cued location appeared first. The prior entry effect was observed in response to both exogenous and endogenous cues; that is, observers were more likely to perceive and report the target at the cued location first, even when it was slightly preceded by the onset of the target at the uncued location. Thus, attended stimuli are perceived sooner than unattended stimuli (e.g., Shore et al., 2001; Stelmach & Herdman, 1991).

Spatial attention has a similar effect on the perception of temporal offsets. In a study by Downing and Treisman (1997), subjects were exogenously cued to one side or the other of fixation by the transient brightening of one of two target placeholders (there was also an endogenous component to the cue because it validly predicted the location of the target event on two-thirds of the trials). Following the presentation of the cue, one of two target dots offset at either the cued or uncued location. When the target offset occurred at the cued location, observers were faster to respond relative to when the offset occurred at the uncued location. That effect of cue validity shows that “attention facilitates the detection of offsets at least as much as detection of onsets” (Downing & Treisman, 1997, p. 770).

Seemingly at odds with the results indicating that perception of stimulus onsets and offsets is sped up at spatially attended locations, research has also shown that attention prolongs the perceived duration of briefly presented stimuli (e.g., Enns, Brehaut, & Shore, 1999). Mattes and Ulrich (1998) investigated the effect of attention on perceived duration by assuming that more attention is allocated to a spatial cue as it becomes increasingly valid. In a blocked design, observers were presented with central cues that validly predicted the target location on 90 percent, 70 percent, or 50 percent of trials. Observers were explicitly aware of the cue validity in each block, and their task was to judge whether the presentation of the target stimulus was of short, medium, or long duration. As expected, the results indicated that as cue validity increased, so did mean ratings of perceived duration. Thus, endogenous spatial attention, activated by central cues, prolongs the perceived duration of a briefly presented target stimulus (Mattes & Ulrich, 1998). That result was subsequently replicated and extended in a study by Enns et al. (1999) showing that the illusion of prolonged duration was independent of the prior entry effect. In other words, attended stimuli do not seem to last longer simply because they also seem to have their onset sooner (Enns et al., 1999).

Although the effects of spatial attention on many dimensions of basic vision have been well established in the extant research, for centuries attention researchers have contemplated the effect of spatial attention on the perceived intensity and perceptual clarity of a stimulus (Helmholtz, 1886; James, 1890; Wundt, 1912). In other words: Does attention alter the appearance of a stimulus?
Despite the long-standing interest, however, only re cently have psychophysical procedures been developed to permit attention researchers to evaluate the question with empirical data (e.g., Carrasco, Ling, & Read, 2004; Liu, Abrams, & Carrasco, 2009). Carrasco et al. (2004) approached the issue by investigating the effect of transient atten tion on perceived contrast. Uninformative peripheral cues were used to direct spatial at tention to the left or right side of fixation, or a neutral cue was presented at fixation. On each trial, observers were presented with a sine wave grating on either side of fixation and were required to perform an orientation discrimination on the target grating that ap peared higher in contrast. One of the two targets, the standard, was always presented at a fixed contrast (i.e., near threshold), and the other target, the test, was presented at var ious contrasts above and below that of the standard. To determine the effect of attention on perceived contrast, the authors determined the point of subjective equality between the test and the standard target. The results indicated that when the test target was cued, it was perceived as being higher in contrast than it really was. In other words, when the test was cued, its contrast was subjectively perceived as being equal to the standard even when the actual contrast was lower than the standard. Based on this find ing, the authors concluded “that attention changes the strength of a stimulus by increas

Page 20 of 30

Spatial Attention ing its ‘effective contrast’ or salience” (Carrasco et al., 2004, p. 308; see also Gobell & Carrasco, 2005). Recently, Liu et al. (2009) showed that voluntary spatial attention also enhances subjec tive contrast. They presented observers with valid central directional cues to neutral cen tral cues to orient observers to one of two rapid serial visual presentation (RSVP) streams of letters, where they were instructed to detect the rare occurrence of specific target. At the end of the RSVP stream, a sine wave grating was briefly presented at each location. On trials when the target was present in the cued or uncued RSVP stream, observers made one type of response, but if the target was not present, observers reported the ori entation of the target grating that was higher in contrast. Similar to the Carrasco et al. (2004) study summarized above, one target grating, the standard, was presented at a fixed contrast in all trials, and the other target grating, the test, was presented at a vary ing range of contrasts, above and below that of the standard. The critical finding support ed what Carrasco et al. (2004) found using peripheral uses: In order for the pairs to be perceived as subjectively equal in contrast, the test target needed to be presented at a lower contrast than the standard when it was at the attended location, and needed to be at a higher contrast than the standard target when it was at the unattended location (Liu et al., 2009). In sum, automatic and voluntary spatial attention both alter the appearance of stimuli by enhancing perceived contrast.
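
The point-of-subjective-equality (PSE) logic used in these appearance studies can be illustrated with a small sketch. This is not the analysis reported by Carrasco et al. (2004) or Liu et al. (2009). It is a minimal illustration that assumes the PSE is read off as the test contrast at which the test grating is chosen as "higher in contrast" on 50 percent of trials, estimated here by simple linear interpolation. The function name and all of the numbers are hypothetical and chosen only to show the direction of the effect.

```python
def pse_by_interpolation(contrasts, p_test_higher):
    """Estimate the point of subjective equality (PSE).

    contrasts      -- test contrasts that were presented
    p_test_higher  -- proportion of trials on which the test was
                      chosen as higher in contrast than the standard

    The PSE is taken as the contrast where the proportion crosses 0.5,
    found by linear interpolation between neighbouring data points.
    """
    pairs = sorted(zip(contrasts, p_test_higher))
    for (c0, p0), (c1, p1) in zip(pairs, pairs[1:]):
        if p0 <= 0.5 <= p1 and p1 != p0:
            return c0 + (0.5 - p0) * (c1 - c0) / (p1 - p0)
    raise ValueError("proportions never cross 0.5")


# Hypothetical (made-up) data, for illustration only.
standard_contrast = 0.22
test_contrasts = [0.10, 0.16, 0.22, 0.28, 0.34]
p_neutral = [0.10, 0.30, 0.50, 0.70, 0.90]    # neutral cue
p_test_cued = [0.20, 0.50, 0.70, 0.85, 0.95]  # test location cued

pse_neutral = pse_by_interpolation(test_contrasts, p_neutral)
pse_cued = pse_by_interpolation(test_contrasts, p_test_cued)

print(f"PSE, neutral cue: {pse_neutral:.3f}")
print(f"PSE, test cued:   {pse_cued:.3f}")
```

With these invented proportions, the PSE in the cued condition falls below the standard contrast, which is the pattern the text describes: a cued test grating is judged equal to the standard even though its physical contrast is lower. Published studies typically fit a full psychometric function rather than interpolating, so the sketch should be read as a conceptual aid only.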

Summary

Much of our current knowledge of the visual system conceptualizes it as being, in many ways, dichotomous (e.g., cells in the central visual pathways are mainly magnocellular and parvocellular, and visual information is processed in the putative “where/how” dorsal pathway and the “what” ventral pathway). The research reviewed in the present chapter presented visual spatial attention in much the same way.

Spatial attention is a selective mechanism; it determines which sensory signals control behavior, and it constrains the rate of information processing (Maunsell, 2009). Abundant research has attempted to determine the units of attentional selection. On the one hand, space-based models generally liken spatial attention to a mental spotlight (e.g., Eriksen & Eriksen, 1974; Posner, 1978) that selects and enhances perception of stimuli falling within a region of space. On the other hand, object-based models assert that spatial attention selects and enhances perception of perceptually grouped stimuli, irrespective of spatial overlap, occlusion, or disparity (e.g., Duncan, 1984; Neisser, 1967). Rather than one theory or the other being correct, it appears that the attentional system relies on both spatial and grouping factors in the selection process. Indeed, recent evidence suggests that relatively minor changes to the stimuli used in a spatial cueing task can promote space- or object-based selection (Nicol, Watter, Gray, & Shore, 2009).

Control over the allocation of spatial attention resources is also dichotomous. Indeed, exogenously controlled spatial attention is deployed rapidly and automatically in response to salient stimuli that appear at peripheral locations (e.g., Jonides, 1981; Muller & Rabbitt, 1989; Nakayama & Mackeben, 1989), whereas endogenously controlled attention is deployed voluntarily based on the interpretation of centrally presented, informative stimuli (e.g., Jonides, 1981; Muller & Rabbitt, 1989; Nakayama & Mackeben, 1989). In addition to reflecting distinct responses to relevant stimuli, these two attentional control mechanisms probably also process visual information in fundamentally distinct ways (e.g., Briand & Klein, 1987; Klein, 1994).

Another dichotomy germane to spatial attention concerns overt and covert orienting. Although it is clear that covert attention shifts and eye movements are not identical (e.g., Posner, 1980), it is uncertain precisely how the two are related. Some evidence argues for their independence (e.g., Klein & Pontefract, 1994), but according to the premotor theory (Rizzolatti et al., 1987), the dominant view concerning the relationship between spatial attention and eye movements (Palmer, 1999), the two mechanisms are actually interdependent. To be sure, the research indicates that although attention shifts can occur in the absence of an eye movement, an eye movement cannot be executed before an attentional shift (e.g., Hoffman & Subramanian, 1995; Klein, 2004; Shepherd et al., 1986).

Spatial attention can either enhance or degrade performance in tasks that measure low-level visual perception. Spatial attention enhances contrast sensitivity (cf. Carrasco, 2006) and spatial sensitivity (cf. Carrasco & Yeshurun, 2009), likely by effectively shrinking the receptive field size of cells with receptive fields at the attended location (Moran & Desimone, 1985), or possibly by biasing activity of cells with smaller receptive fields at the attended location (Yeshurun & Carrasco, 1999). In contrast, spatial attention degrades temporal sensitivity (Yeshurun & Levy, 2003; but see Hein et al., 2006, and Nicol et al., 2009, for exceptions). One possible explanation for this counterintuitive finding is that spatial attention induces an inhibitory interaction that favors parvocellular over magnocellular activity (Yeshurun & Levy, 2003).

The final dichotomy of spatial attention reviewed in the present chapter pertained to the partially segregated, but interacting, dorsal and ventral frontoparietal neural networks. In their influential model, Corbetta and Shulman (2002) proposed that control of goal-directed (i.e., voluntary) spatial attention is accomplished in a pathway comprising dorsal frontoparietal substrates, and control of stimulus-driven (i.e., automatic) spatial attention takes place in a right-lateralized pathway comprising ventral frontoparietal areas. Their model further proposes that the ventral network serves as a “circuit breaker” of the dorsal stream to facilitate the disengagement and reorientation of attention when a behaviorally relevant stimulus is detected (Corbetta & Shulman, 2002).

In conclusion, spatial attention permits us to selectively enhance processing of behaviorally relevant stimuli and to attenuate processing of irrelevant stimuli (e.g., Yantis et al., 2002). By enabling us to allocate resources toward a specific stimulus, spatial attention prevents us from being overwhelmed by the massive stimulation that continually bombards the visual system. Visual perception is, for the most part, enhanced by spatial attention. For example, spatial attention enhances perceptual clarity by enhancing the perceived contrast of a stimulus (Carrasco et al., 2004). Eye movements and spatial attention are not identical mechanisms, but they are related (Rizzolatti et al., 1987): Although a shift of spatial attention can be made in the absence of an overt eye movement, eye movements cannot be executed without a preceding attentional shift.

Throughout this chapter, spatial attention has been presented as a dichotomous neural mechanism. Spatial attention can select either spatial locations (e.g., Posner, 1980) or objects (e.g., Duncan, 1984) for further processing. The neural sources of spatial attention are both subcortical and cortical, the most critical of which are perhaps the two partially separate, but interacting, pathways that make up the dorsal and ventral frontoparietal networks (cf. Corbetta & Shulman, 2002). Finally, the allocation of spatial attention resources can be controlled automatically by stimulus-driven factors or voluntarily by top-down processes, and these two mechanisms of attentional control have distinct effects on perception and behavior (e.g., Jonides, 1981).

References

Balz, G. W., & Hock, H. S. (1997). The effect of attentional spread on spatial resolution. Vision Research, 37, 1499–1510.
Bashinski, H. S., & Bacharach, V. R. (1980). Enhancement of perceptual sensitivity as the result of selectively attending to spatial locations. Perception & Psychophysics, 28, 241–248.
Bichot, N. P., Cave, K. R., & Pashler, H. (1999). Visual selection mediated by location: Feature-based selection of noncontiguous locations. Perception & Psychophysics, 61, 403–423.
Briand, K. A., & Klein, R. M. (1987). Is Posner’s “beam” the same as Treisman’s “glue”? On the relation between visual orienting and feature integration theory. Journal of Experimental Psychology: Human Perception and Performance, 13, 228–241.
Broadbent, D. E. (1958). Perception and communication. London: Pergamon Press.
Cameron, E. L., Tai, J. C., & Carrasco, M. (2002). Covert attention affects the psychometric function of contrast sensitivity. Vision Research, 42, 949–967.
Carrasco, M. (2006). Covert attention increases contrast sensitivity: Psychophysical, neurophysiological and neuroimaging studies. Progress in Brain Research, 154, 33–70.
Carrasco, M., Ling, S., & Read, S. (2004). Attention alters appearance. Nature Neuroscience, 7, 308–313.
Carrasco, M., & Yeshurun, Y. (2009). Covert attention effects on spatial resolution. Progress in Brain Research, 176, 65–86.
Corbetta, M. (1998). Frontoparietal cortical networks for directing attention and the eye to visual locations: Identical, independent, or overlapping systems? Proceedings of the National Academy of Sciences USA, 95, 831–838.
Corbetta, M., Akbudak, E., Conturo, T. E., Snyder, A. Z., Ollinger, J. M., Drury, H. A., et al. (1998). A common network of functional areas for attention and eye-movements. Neuron, 21, 761–773.
Corbetta, M., Kincade, M. J., Ollinger, J. M., McAvoy, M. P., & Shulman, G. L. (2000). Voluntary orienting is dissociated from target detection in human posterior parietal cortex. Nature Neuroscience, 3, 292–297.
Corbetta, M., Miezin, F. M., Dobmeyer, S., Shulman, G. L., & Petersen, S. E. (1993). Attentional modulation of neural processing of shape, color, and velocity in humans. Science, 248, 1556–1559.
Corbetta, M., Miezin, F. M., Shulman, G. L., & Petersen, S. E. (1993). A PET study of visuospatial attention. Journal of Neuroscience, 13, 1202–1226.
Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3, 201–215.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222.
Downing, P. E., & Treisman, A. M. (1997). The line-motion illusion: Attention or impletion? Journal of Experimental Psychology: Human Perception and Performance, 23, 768–779.
Driver, J., & Baylis, G. C. (1989). Movement and visual attention: The spotlight metaphor breaks down. Journal of Experimental Psychology: Human Perception and Performance, 15, 448–456.
Duncan, J. (1984). Selective attention and the organization of visual information. Journal of Experimental Psychology: General, 113, 501–517.
Eason, R. G. (1981). Visual evoked potential correlates of early neural filtering during selective attention. Bulletin of the Psychonomic Society, 18, 203–206.
Egly, R., Driver, J., & Rafal, R. D. (1994). Shifting visual attention between objects and locations: Evidence from normal and parietal lesion subjects. Journal of Experimental Psychology: General, 123, 161–177.
Enns, J. T., Brehaut, J. C., & Shore, D. I. (1999). The duration of a brief event in the mind’s eye. Journal of General Psychology, 126, 355–372.
Eriksen, B. A., & Eriksen, C. W. (1974). Effects of noise letters upon the identification of a target letter in a nonsearch task. Perception & Psychophysics, 16, 143–149.
Eriksen, C. W., & Hoffman, J. E. (1972). Some characteristics of selective attention in visual perception determined by vocal reaction time. Perception & Psychophysics, 11, 169–171.
Eriksen, C. W., & St. James, J. D. (1986). Visual attention within and around the field of focal attention: A zoom lens model. Perception & Psychophysics, 40, 225–240.
Eriksen, C. W., & Yeh, Y. (1985). Allocation of attention in the visual field. Journal of Experimental Psychology: Human Perception and Performance, 11, 583–597.
Fox, M. D., Corbetta, M., Snyder, A. Z., Vincent, J. L., & Raichle, M. E. (2006). Spontaneous neuronal activity distinguishes human dorsal and ventral attention systems. Proceedings of the National Academy of Sciences, 103, 10046–10051.
Friedrich, F. J., Egly, R., Rafal, R. D., & Beck, D. (1998). Spatial attention deficits in humans: A comparison of superior parietal and temporal-parietal junction lesions. Neuropsychology, 12, 193–207.
Funes, M. J., Lupianez, J., & Milliken, B. (2007). Separate mechanisms recruited by exogenous and endogenous spatial cues: Evidence from the spatial Stroop paradigm. Journal of Experimental Psychology: Human Perception and Performance, 33, 348–362.
Gobell, J., & Carrasco, M. (2005). Attention alters the appearance of spatial frequency and gap size. Psychological Science, 16, 644–651.
Gobell, J. L., Tseng, C. H., & Sperling, G. (2004). The spatial distribution of visual attention. Vision Research, 44, 1273–1296.
Goldberg, M. E., & Wurtz, R. H. (1972). Activity of superior colliculus cells in behaving monkey. I. Visual receptive fields of single neurons. Journal of Neurophysiology, 35, 542–559.
Hein, E., Rolke, B., & Ulrich, R. (2006). Visual attention and temporal discrimination: Differential effects of automatic and voluntary cueing. Visual Cognition, 13, 29–50.
Helmholtz, H. V. (1866). Treatise on physiological optics (3rd ed., Vols. 2 & 3; J. P. C. Southall, Ed. and Trans.). Rochester, NY: Optical Society of America.
Hillyard, S. A., & Mangun, G. R. (1987). Sensory gating as a physiological mechanism for visual selective attention. In R. Johnson, R. Parasuraman, & J. W. Rohrbaugh (Eds.), Current trends in event-related potential research (pp. 61–67). Amsterdam: Elsevier.
Hoffman, J. E., & Subramanian, B. (1995). The role of visual attention in saccadic eye movements. Perception & Psychophysics, 57, 787–795.
Hopfinger, J. B., Buonocore, M. H., & Mangun, G. R. (2000). The neural mechanisms of top-down attentional control. Nature Neuroscience, 3, 284–291.
Hunt, A., & Kingstone, A. (2003). Covert and overt voluntary attention: Linked or independent? Cognitive Brain Research, 18, 102–115.
James, W. (1890). The principles of psychology. New York: Henry Holt.
Johnston, W. A., & Dark, V. J. (1986). Selective attention. Annual Review of Psychology, 37, 43–75.
Jonides, J. (1981). Voluntary versus automatic control over the mind’s eye’s movement. In J. Long & A. Baddeley (Eds.), Attention and performance VIII (pp. 259–276). Hillsdale, NJ: Erlbaum.
Jonides, J., & Mack, R. (1984). On the cost and benefit of cost and benefit. Psychological Bulletin, 96, 29–44.
Kahneman, D., & Treisman, A. (1984). Changing views of attention and automaticity. In R. Parasuraman & D. R. Davies (Eds.), Varieties of attention (pp. 29–62). New York: Academic Press.
Kahneman, D., Treisman, A., & Gibbs, B. J. (1992). The reviewing of object files: Object-specific integration of information. Cognitive Psychology, 24, 175–219.
Kanwisher, N., & Wojciulik, E. (2000). Visual attention: Insights from brain imaging. Nature Reviews Neuroscience, 1, 91–100.
Kastner, S., McMains, S. A., & Beck, D. M. (2009). Mechanisms of selective attention in the human visual system: Evidence from neuroimaging. In M. S. Gazzaniga (Ed.), The cognitive neurosciences (4th ed.). Cambridge, MA: MIT Press.
Kastner, S., & Ungerleider, L. G. (2000). Mechanisms of visual attention in the human cortex. Annual Review of Neuroscience, 23, 315–341.
Klein, R. M. (1994). Perceptual-motor expectancies interact with covert visual orienting under endogenous but not exogenous control. Canadian Journal of Experimental Psychology, 48, 151–166.
Klein, R. M. (2004). On the control of visual orienting. In M. Posner (Ed.), Cognitive neuroscience of attention (pp. 29–43). New York: Guilford Press.
Klein, R. M., & Hansen, E. (1990). Chronometric analysis of spotlight failure in endogenous visual orienting. Journal of Experimental Psychology: Human Perception and Performance, 16, 790–801.
Klein, R. M., & Pontefract, A. (1994). Does oculomotor readiness mediate cognitive control of visual attention? Revisited! In C. Umilta (Ed.), Attention and performance XV (pp. 333–350). Hillsdale, NJ: Erlbaum.
Klein, R. M., & Shore, D. I. (2000). Relations among modes of visual orienting. In S. Monsell & J. Driver (Eds.), Attention and performance XVIII (pp. 195–208). Hillsdale, NJ: Erlbaum.
Kramer, A. F., & Hahn, S. (1995). Splitting the beam: Distribution of attention over noncontiguous regions of the visual field. Psychological Science, 6, 381–386.
Kramer, A. F., & Jacobson, A. (1991). Perceptual organization and focused attention: The role of objects and proximity in visual processing. Perception & Psychophysics, 50, 267–284.
LaBerge, D. (1983). Spatial extent of attention to letters and words. Journal of Experimental Psychology: Human Perception and Performance, 9, 371–379.
Liu, T., Abrams, J., & Carrasco, M. (2009). Voluntary attention enhances contrast appearance. Psychological Science, 20, 354–362.
Luck, S. (1995). Multiple mechanisms of visual-spatial attention: Recent evidence from human electrophysiology. Behavioural Brain Research, 71, 113–123.
Lupianez, J., Decraix, C., Sieroff, E., Chokron, S., Milliken, B., & Bartolomeo, P. (2004). Independent effects of endogenous and exogenous spatial cueing: Inhibition of return at endogenously attended target locations. Experimental Brain Research, 159, 447–457.
Mackeben, M., & Nakayama, K. (1993). Express attentional shifts. Vision Research, 33, 85–90.
Mangun, G. R., & Hillyard, S. A. (1991). Modulations of sensory-evoked brain potentials indicate changes in perceptual processing during visual-spatial priming. Journal of Experimental Psychology: Human Perception and Performance, 17, 1057–1074.
Marois, R., Chun, M. M., & Gore, J. C. (2000). Neural correlates of the attentional blink. Neuron, 28, 299–308.
Mattes, S., & Ulrich, R. (1998). Directed attention prolongs the perceived duration of a brief stimulus. Perception & Psychophysics, 60, 1305–1317.
Maunsell, J. H. R. (2009). The effect of attention on the responses of individual visual neurons. In M. S. Gazzaniga (Ed.), The cognitive neurosciences (4th ed.). Cambridge, MA: MIT Press.
Moran, J., & Desimone, R. (1985). Selective attention gates visual processing in the extrastriate cortex. Science, 229, 782–784.
Muller, H. J., & Rabbitt, P. M. A. (1989). Reflexive and voluntary orienting of visual attention: Time course of activation and resistance to interruption. Journal of Experimental Psychology: Human Perception and Performance, 15, 315–330.
Nakayama, K., & Mackeben, M. (1989). Sustained and transient components of focal visual attention. Vision Research, 29, 1631–1647.
Neisser, U. (1967). Cognitive psychology. Englewood Cliffs, NJ: Prentice-Hall.
Nicol, J. R., Watter, S., Gray, K., & Shore, D. I. (2009). Object-based perception mediates the effect of exogenous attention on temporal resolution. Visual Cognition, 17, 555–573.
O’Connor, D. H., Fukui, M. M., Pinsk, M. A., & Kastner, S. (2002). Attention modulates responses in the human lateral geniculate nucleus. Nature Neuroscience, 5, 1203–1209.
Palmer, S. E. (1999). Vision science: Photons to phenomenology. Cambridge, MA: MIT Press.
Paus, T. (1996). Localization and function of the human frontal eye-field: A selective review. Neuropsychologia, 34, 475–483.
Pestilli, F., & Carrasco, M. (2005). Attention enhances contrast sensitivity at cued and impairs it at uncued locations. Vision Research, 45, 1867–1875.
Petersen, S. E., Robinson, D. L., & Morris, D. J. (1987). Contributions of the pulvinar to visual spatial attention. Neuropsychologia, 25, 97–105.
Posner, M. I. (1978). Chronometric explorations of mind. Hillsdale, NJ: Erlbaum.
Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32A, 3–25.
Posner, M. I., & Cohen, Y. (1984). Components of visual orienting. In H. Bouma & D. G. Bouwhuis (Eds.), Attention and performance X (pp. 521–556). Hillsdale, NJ: Erlbaum.
Posner, M. I., Nissen, M. J., & Ogden, W. C. (1978). Attended and unattended processing modes: The role of set for spatial location. In H. L. Pick & J. J. Saltzman (Eds.), Modes of perceiving and processing information (pp. 137–157). Hillsdale, NJ: Erlbaum.
Posner, M. I., & Petersen, S. E. (1990). The attention system of the human brain. Annual Review of Neuroscience, 13, 25–42.
Posner, M. I., Petersen, S. E., Fox, P. T., & Raichle, M. E. (1988). Localization of cognitive operations in the human brain. Science, 240, 1627–1631.
Posner, M. I., Snyder, C. R. R., & Davidson, B. J. (1980). Attention and the detection of signals. Journal of Experimental Psychology: General, 109, 160–174.
Pratt, J., & Nghiem, T. (2000). The role of the gap effect in the orienting of attention: Evidence for attentional shifts. Visual Cognition, 7, 629–644.
Remington, R. W. (1980). Attention and saccadic eye movements. Journal of Experimental Psychology: Human Perception and Performance, 6, 726–744.
Rizzolatti, G., Riggio, L., Dascola, I., & Umilta, C. (1987). Reorienting attention across the horizontal and vertical meridians: Evidence in favour of a premotor theory of attention. Neuropsychologia, 25, 31–40.
Robinson, D. L., & Petersen, S. (1992). The pulvinar and visual salience. Trends in Neurosciences, 15, 127–132.
Saslow, M. G. (1967). Effects of components of displacement-step stimuli upon latency for saccadic eye movements. Journal of the Optical Society of America, 57, 1024–1029.
Serences, J. T., Shomstein, S., Leber, A. B., Golay, X., Egeth, H. E., & Yantis, S. (2005). Coordination of voluntary and stimulus-driven attentional control in human cortex. Psychological Science, 16, 214–222.
Shepherd, M., Findlay, J. M., & Hockey, R. J. (1986). The relationship between eye movements and spatial attention. Quarterly Journal of Experimental Psychology, 38A, 475–491.
Shore, D. I., & Spence, C. (2005). Prior entry. In L. Itti, G. Rees, & J. K. Tsotsos (Eds.), Neurobiology of attention (pp. 89–95). Amsterdam: Elsevier.
Shore, D. I., Spence, C., & Klein, R. M. (2001). Visual prior entry. Psychological Science, 12, 205–212.
Shulman, G. L., McAvoy, M. P., Cowan, M. C., Astafiev, S. V., Tansy, A. P., d’Avossa, G., et al. (2003). Quantitative analysis of attention and detection signals during visual search. Journal of Neurophysiology, 90, 3384–3397.
Spence, C., & Driver, J. (1994). Covert spatial orienting in audition: Exogenous and endogenous mechanisms. Journal of Experimental Psychology: Human Perception and Performance, 20, 555–574.
Spence, C., & Driver, J. (2004). Crossmodal space and crossmodal attention. London: Oxford University Press.
Spence, C., Shore, D. I., & Klein, R. M. (2001). Multisensory prior entry. Journal of Experimental Psychology: General, 130, 799–832.
Stelmach, L. B., & Herdman, C. M. (1991). Directed attention and the perception of temporal order. Journal of Experimental Psychology: Human Perception and Performance, 17, 539–550.
Titchener, E. B. (1908). Lectures on the elementary psychology of feeling and attention. New York: Macmillan.
Treisman, A., & Gelade, G. (1980). A feature integration theory of attention. Cognitive Psychology, 12, 97–136.
Treisman, A., Kahneman, D., & Burkell, J. (1983). Perceptual objects and the cost of filtering. Perception & Psychophysics, 33, 527–532.
Tsal, Y., & Shalev, L. (1996). Inattention magnifies perceived length: The attentional receptive field hypothesis. Journal of Experimental Psychology: Human Perception and Performance, 22, 233–243.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. G. Ingle, M. A. Goodale, & R. J. Q. Mansfield (Eds.), Analysis of visual behaviour (pp. 549–586). Cambridge, MA: MIT Press.
Van Voorhis, S. T., & Hillyard, S. A. (1977). Visual evoked potentials and selective attention to points in space. Perception & Psychophysics, 22, 54–62.
Wright, R. D., Richard, C. M., & McDonald, J. J. (1995). Neutral location cues and cost/benefit analysis of visual attention shifts. Canadian Journal of Experimental Psychology, 49, 540–548.
Wright, R. D., & Ward, L. M. (2008). Orienting of attention. New York: Oxford University Press.
Wundt, W. (1912). An introduction to psychology (R. Pintner, Trans.). London: Allen & Unwin.
Yantis, S., Schwarzbach, J., Serences, J. T., Carlson, R. L., Steinmetz, M. A., Pekar, J. J., & Courtney, S. M. (2002). Transient neural activity in human parietal cortex during spatial attention shifts. Nature Neuroscience, 5, 995–1002.
Yeshurun, Y., & Carrasco, M. (1998). Attention improves or impairs visual performance by enhancing spatial resolution. Nature, 396, 72–75.
Yeshurun, Y., & Carrasco, M. (1999). Spatial attention improves performance in spatial resolution tasks. Vision Research, 39, 293–306.
Yeshurun, Y., & Levy, L. (2003). Transient attention degrades temporal resolution. Psychological Science, 14, 225–231.

Jeffrey R. Nicol

Jeffrey R. Nicol is Assistant Professor of Psychology, Affiliate of the Northern Centre for Research on Aging and Communication (NCRAC), Nipissing University.


Attention and Action

George Alvarez
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0013

Abstract and Keywords

At every moment, we face choices: Is it time to work, or to play? Should I listen to this lecture, or check my e-mail? Should I pay attention to what my significant other is saying, or do a mental inventory of what work I need to accomplish today? Should I keep my hands on the wheel, or change the radio station? Without any change to the external environment, it is possible to select a subset of these possibilities for further action. The process of selection is called attention, and it operates in many domains, from selecting our higher level goals, to selecting the sensory information on which we focus, to selecting what actions we perform. This chapter focuses on the relationship between visual attention (selecting visual inputs) and action (selecting and executing movements of the body). As a case study, the authors focus on visual-spatial attention, the act of choosing to attend to a particular location in the visual field, and its relationship to eye-movement control. Visual attention appears to select the targets for eye movements because attention to a location necessarily precedes an eye movement to that location. Moreover, there is a great deal of overlap in the neural mechanisms that control spatial attention and eye movements, and the neural mechanisms that are specialized for spatial attention or eye movements are highly intertwined. This link between spatial attention and eye movements strongly supports the idea that a computational goal of the visual attention system is to select targets for action, and suggests that many of the design properties of the spatial attention system might be optimized for the control of eye movements. Whether this relationship will hold broadly between other forms of attention (e.g., goal selection, auditory selection, tactile selection) and other forms of action (e.g., hand movements and locomotion) is an important topic of contemporary and future research.
Keywords: attention, action, visual attention, visual-spatial attention, eye movements, spatial attention

Let us start with a fairly ordinary situation (for me at least): you are hungry, and you decide to get a snack from the refrigerator. You open the door with your left hand and arm, and you look inside. Because you just went shopping you have a fairly stocked refrigerator, so you have a lot of food to choose from. You are thinking that a healthy snack is in order, and decide to look for carrots. How will you find the carrots? Most likely you will “tune your visual system” to look for orange things, and all of a sudden you will become aware of the location of your orange juice, your carrots, some sharp cheddar cheese, and any other orange things in sight. You know that you keep your juice on the top shelf, and your carrots near the bottom, so you further tune your visual system to look for orange things near the bottom. Now you’ve zeroed in on those carrots and are ready to make your move and grab them. You reach, and with extreme precision you position your hand and fingers in exactly the right shape required to grasp the carrots, and begin snacking.

I occasionally have poured juice into my cereal (by mistake), most likely because the milk and juice containers appeared somewhat similar in shape. But have you ever accidentally grabbed a carton of eggs instead of carrots? Have you ever reached for a bag of carrots as if they had a handle like a gallon of milk? Neither have I, and that is probably because we all have vision and action systems that have, for the most part, selected the visual inputs and the physical actions that meet our current goals. This deceptively simple act of grabbing the bag of carrots in your refrigerator is in fact an extremely impressive behavior that requires the activity of a large network of brain regions (possibly most of them!). These different regions are specialized for setting up your goal states, processing incoming visual information, and executing actions. But none of this would work if the system did not have “selective mechanisms” that enable us to select a subset of visual inputs (the carrots and not the juice), and a subset of possible actions (the carrot-reach and not the juice-reach) that match our current goal state (getting a snack and not a beverage). Not only do we have these selective mechanisms, but they also tend to work in a coordinated and seamless fashion.

The overarching purpose of this chapter is to introduce the concept of attentional selection and the role that it plays in visual perception and action. To begin, the chapter provides a general definition of attention, followed by a specific introduction to visual attention. Next, the link between visual attention and eye movements is reviewed as a case study exploring the relationship between visual attention and action. By taking you through definitions, theoretical considerations, and evidence linking visual attention and action, this chapter is aimed at three specific goals: (1) to provide a nuanced view of attention as a process that operates at multiple levels of cognition, (2) to summarize empirical evidence demonstrating that at least one form of visual attention (visual-spatial attention) interacts with the action system, and (3) to provide a theoretical framework for understanding how and why visual attention and action interact.

Defining Attention (think of attention as a verb instead of a noun)

A generally accepted definition of attention has proved elusive, possibly because any simple definition of attention in one domain (say vision) will apply to attention in another domain (say action, or even goal setting). Take the core of what James offered as a definition of attention: “Focalization, concentration … withdrawal from some things in order to deal effectively with others” (James, 1890). Pashler (1998) offers a similar definition, in which the core aspects of attention are (1) we can selectively process some stimuli more than others (selection), (2) we are limited in the number of things we can process simultaneously (capacity limitation), and (3) sustained processing involves effort or exertion (effort). Such a definition will apply to a wide range of cognitive processes. For instance, there are many possible things you can pay attention to in your visual field at this moment. Among other things, I currently have a keyboard, a coffee mug, a bottle of wine, and some knives in sight. The process of selecting a subset of those visual inputs to focus on is called attention. There are also many possible actions you can take at any given moment. I can move my eyes to focus on my coffee mug and pick it up, I can continue typing, or I can get up and find a corkscrew in my kitchen. The process of selecting a subset of those possible actions is also called attention. Finally, which one of these acts of selection I execute depends on my goal state. Do I want to finish my work tonight? What is the most effective approach for accomplishing this task? Should I keep typing, or reach for the coffee mug? Or would it serve me better in the long run to have a glass of wine and listen to some music, and start again tomorrow? What’s on TV? Choosing to focus on one of these trains of thought and set up stable goals is also called attention.

At first it is confusing that we refer to all of these different acts by the same term (attention by James’ definition, and others’). These different acts of selection do not seem like they could possibly be the same thing in the mind and brain. Of course, that’s because they are not the same thing in either. The difficulty here is that we tend to think about the term “attention” as a noun, and that’s the wrong way to think about it.
Think of the word “attention” as a verb instead, and it is a little easier to understand how and why the term attention accurately applies to so many situations. Now attention is more like the word “walk,” which can be true of people, rats, and horses. They are all the same because they all walk, but they are not all the same thing in any physical sense. Likewise, selecting visual inputs, selecting actions, and selecting goals are all forms of attention because they involve a form of selection, but they are not all the same thing. What they share in common is the process of selection: There are many competing representations (visual inputs, action plans, goal states), and attention is the process of selecting a subset of those competing representations. On this view, attention is a process that operates in multiple domains. Thus, whenever you hear the term “attention,” you should ask, “Selection in what domain? What are the competing representations that make selection necessary?” With that starting point you can then ask questions about the mechanisms by which a particular act of selection is achieved, the units of that selection process, and the consequences of selection.

This view of attention does not require that the mechanisms of selection in each domain are completely independent, or that there cannot be general mechanisms of attention. Ultimately selection operates over distinct competing representations in each domain (goals compete with other goals, visual inputs compete with other visual inputs, action representations compete with other action representations). However, it is likely that goal selection (e.g., to continue working) interacts with visual selection (e.g., to focus on the coffee mug), and that both forms of selection interact with action selection (e.g., to reach for the handle). The competition happens mostly within a domain, so the process of selection also happens mostly within a domain. But each process operates within an integrated cognitive system, and the selective processes in one domain will influence selective processes in the others. This chapter reviews evidence that the processes of visual selection and action selection, although separable, are tightly coupled in this way.

In the following section we introduce visual attention, including limits on visual attention, mechanisms of visual attention, and theoretical proposals for why we need visual attention, which is a useful starting point for understanding how visual attention interacts with the control of action.

Introduction to Visual Attention

Understanding that attention is the process of selecting among competing representations, we can now focus on attentional selection within a particular domain: vision. Visual attention is the process of selecting a subset of the incoming visual information. Visual attention can be controlled automatically by environmental factors, or intentionally based on the current goals, and with effort it can be continuously maintained on specific objects or events. Although there are many unanswered questions about the mechanisms and consequences of visual attention, two things about visual attention are abundantly clear: (1) you cannot pay attention to everything in your visual field simultaneously, and (2) you can tune your attention in various ways to select a particular subset of the incoming visual information.

Limits of Visual Attention

The limits on our ability to pay attention are sometimes startling, particularly in cases of inattentional blindness (Mack & Rock, 1998; Most, Scholl, Clifford, & Simons, 2005; Simons & Chabris, 1999) and change blindness (J. K. O’Regan, Rensink, & Clark, 1999; Rensink, O’Regan, & Clark, 1997; Simons & Rensink, 2005). For instance, in one study Simons and Chabris (1999) had participants watch a video in which two teams (white shirts, black shirts) tossed a basketball between players on the same team. The task was to count the number of times players on the white-shirt team tossed the ball to another player on the same team. Halfway through the video, a man in a black gorilla suit walked into the scene from off camera, passing through the middle of the action right between players, stopping in the middle of the screen to thump his chest, and then continuing off screen to the left. Remarkably, half of the participants failed to notice the gorilla, even though it was plainly visible for several seconds! Indeed, when shown a replay of the video, participants were often skeptical that the event actually occurred during their initial viewing. The original video for this experiment can be viewed at http://www.dansimons.com/.

Similar results have been found in conditions with much higher stakes. For instance, when practicing landing an airplane in a simulator with a heads-up display in which control panels are superimposed on the cockpit window, many pilots failed to see the appearance of an unexpected vehicle on the runway (Haines, 1991). Attending to the control panel superimposed over the runway appeared to prevent noticing completely visible and critical information. Similarly, traffic accident reports often include accounts of drivers “not seeing” clearly visible obstacles (McLay, Anderson, Sidaway, & Wilder, 1997). Such occurrences are typically interpreted as attentional lapses: It seems that we have a severely limited ability to perceive, understand, and act on information that falls outside the current focus of attention.

Change-blindness studies provide further evidence for our limited ability to attend to visual information (J. K. O’Regan, Deubel, Clark, & Rensink, 2000; J. K. O’Regan et al., 1999; Rensink et al., 1997; Simons, 2000; Simons & Levin, 1998; Simons & Rensink, 2005). In a typical change-blindness study, observers watch as two pictures are presented in alternation, with a blank screen between them. The two pictures are identical, except for one thing that is changing, and the task is simply to identify what is changing. Even when the difference between images is a substantial visual change (e.g., a large object repeatedly disappearing and reappearing), observers often fail to notice the change for several seconds. For demonstrations, go to http://www.dansimons.com/ or http://visionlab.harvard.edu/Members/George/demo-GradualChange.html. These studies provide a dramatic demonstration that, although our subjective experience is that we “see the whole visual field,” we are unable to detect a local change to our environment unless we focus attention on the thing that changes (Rensink et al., 1997; Scholl, 2000). Indeed, with attention, such changes are trivially easy to notice in standard change-blindness tasks, making the failure to detect such changes before they are localized all the more impressive.

Thus, it appears that we need attention to keep track of information in a dynamic visual environment.
We fail to notice new visual information, or changes to old visual information, unless we selectively attend to that information. If we could pay attention to all of the incoming visual information at once, we would not observe these dramatic inattentional blindness and change-blindness phenomena.

How We Tune Our Attention

Fortunately the visual system is equipped with a variety of mechanisms that control the allocation of attention, effectively tuning our attention toward salient and task-relevant information in the visual field. The mechanisms of attentional control are typically divided into stimulus-driven and goal-driven mechanisms (for a review, see Yantis, 1998). Stimulus-driven attentional shifts occur when there is an abrupt change in the environment, particularly the appearance of a new object (Yantis & Jonides, 1984), or more generally when certain dynamic events occur in the environment (Franconeri, Hollingworth, & Simons, 2005; Franconeri & Simons, 2005). Our attention also tends to be guided toward salient portions of the visual field, such as a location that differs from its surround in terms of color or other features (Itti & Koch, 2000).


In addition to these stimulus-driven mechanisms, there are also a variety of goal-driven mechanisms that enable us to choose which visual information to select, including object-based attention, feature-based attention, and location-based attention.

Research on object-based attention suggests that attention can select discrete objects, spreading through them and constrained by their boundaries, while suppressing information that does not belong to the selected object. There is a wide range of empirical support for this theory. In the classic demonstration of object-based attention, attention is cued to part of an object where the speed and accuracy of perceptual processing is enhanced. Critically, performance is enhanced at uncued locations that are part of the same object, relative to uncued locations equally distant from the cue, but within part of another object (Atchley & Kramer, 2001; Avrahami, 1999; Egly, Driver, & Rafal, 1994; Z. J. He & Nakayama, 1995; Lamy & Tsal, 2000; Marino & Scholl, 2005; Vecera, 1994). Similarly, divided attention studies have shown that, when the task is to report or compare two target features, performance is better when the target features lie within the boundaries of the same object compared with when the features are spatially equidistant but appear in different objects (Ben-Shahar, Scholl, & Zucker, 2007; Duncan, 1984; Kramer, Weber, & Watson, 1997; Lavie & Driver, 1996; Valdes-Sosa, Cobo, & Pinilla, 1998; Vecera & Farah, 1994). The fact that these “same object” advantages occur suggests that visual attention automatically selects entire objects.
Many other paradigms provide evidence supporting this hypothesis, including studies using functional imaging (e.g., O’Craven, Downing, & Kanwisher, 1999), visual search (e.g., Mounts & Melara, 1999), negative priming (e.g., Tipper, Driver, & Weaver, 1991), inhibition of return (e.g., Reppa & Leek, 2003; Tipper et al., 1991), object reviewing (e.g., Kahneman, Treisman, & Gibbs, 1992; Mitroff, Scholl, & Wynn, 2004), attentional capture (e.g., Hillstrom & Yantis, 1994), visual illusions (e.g., Cooper & Humphreys, 1999), and patient studies (e.g., Humphreys, 1998; Ward, Goodrich, & Driver, 1994).

Feature-based attention is the ability to tune attention to a particular feature (e.g., red) such that all items in the visual field containing that feature are simultaneously selected. Early research provided evidence for feature-based attention using the visual search paradigm (Bacon & Egeth, 1997; Egeth, Virzi, & Garbart, 1984; Kaptein, Theeuwes, & Van der Heijden, 1995; Zohary & Hochstein, 1989). For example, Bacon and Egeth (1997) asked observers to search for a red X among red Os and black Xs. When participants were instructed to attend to only the red items, the time to find the target was independent of the number of black Xs. This finding suggests that attention could be limited to only the red items in the display, with black items filtered out. This example fits with the intuition that if your friend were one of a few people wearing a red shirt at a St. Patrick’s Day party, she would be relatively easy to find in the crowd. Other evidence that attention can be tuned to specific stimulus features comes from inattentional blindness studies (Most & Astur, 2007; Most et al., 2001, 2005). For example, Most and colleagues (2001) asked observers to attentively track a set of moving objects, and asked whether they noticed the appearance of an unexpected object on one of the trials.
The likelihood of noticing the unexpected object depended systematically on its similarity to the attended objects. For example, when observers attended to white objects, they were most likely to notice the unexpected object if it was white, less likely if it was gray, and least likely if it was black. Similarly, the time to find the target in a visual search task depends on the similarity between targets and distractors (Duncan & Humphreys, 1989; Konkle, Brady, Alvarez, & Oliva, 2010), as does the likelihood of a distractor capturing attention (Folk, Remington, & Johnston, 1992). Other research has shown that attention can be tuned to particular orientations or spatial frequencies (Rossi & Paradiso, 1995), as well as color and motion direction (Saenz, Buracas, & Boynton, 2002, 2003). These effects of feature-based attention appear to operate over the full visual field, such that attending to a particular feature selects all objects in the visual field that share the same feature, even for stimuli that are irrelevant to the task (Saenz et al., 2002).

Finally, location-based attention is the process of focusing attention on a particular location in space (e.g., “to the left”), so that information in the attended location is selected and surrounding information is ignored or suppressed. You likely know from personal experience that it is possible to attend to visual information in the periphery, and early researchers documented the same observation (Helmholtz, 1962; James, 1890). Early empirical support for the idea that attention can be moved independent of the eyes came from a series of experiments by Posner and colleagues (Posner, 1980; Posner, Snyder, & Davidson, 1980). In the standard Posner cueing paradigm, observers keep their eyes focused on a fixation point at the center of the screen. The task is to respond to the presentation of a flash of light by pressing a key, but it is uncertain where that flash of light will appear. Observers are given a hint via a cue that indicates the most likely peripheral location where the flash will appear.
On most trials the flash appears in the cued location (valid trials), so there is an incentive to focus on the cued location. However, on some trials, the flash actually appears in the uncued location (invalid trials). Posner found that responses are faster and more accurate when the flash appears in the cued location than when it appears in the uncued location, even when the eyes remain focused at the central fixation point. This suggests that observers are able to shift their attention away from where their eyes are looking and focus on locations in the periphery. Other research has explored the spatial extent of the selected region (Engel, 1971; Eriksen & St. James, 1986; Intriligator & Cavanagh, 2001), the two-dimensional and three-dimensional shapes of the selected region (Downing & Pinker, 1985; LaBerge, 1983; LaBerge & Brown, 1986), how attention is shifted from one location to another (Remington & Pierce, 1984; Shulman, Remington, & McLean, 1979), whether attention can be split (Awh & Pashler, 2000; McMains & Somers, 2004), and the number of locations that can be selected at once (Alvarez & Franconeri, 2007; Franconeri, Alvarez, & Enns, 2007).
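The logic of the cueing paradigm can be made concrete with a small sketch. The Python example below computes the "validity effect," the difference in mean reaction time between invalid and valid trials. The trial data, the function names, and the millisecond values are all invented for illustration; they are not results from Posner's experiments.

```python
from statistics import mean

# Hypothetical trials as (cue condition, reaction time in ms).
# These numbers are made up purely to illustrate the computation.
trials = [
    ("valid", 250), ("valid", 262), ("valid", 245), ("valid", 255),
    ("invalid", 301), ("invalid", 290),
]

def mean_rt(trials, condition):
    """Mean reaction time (ms) across trials of one cue condition."""
    return mean(rt for cond, rt in trials if cond == condition)

def validity_effect(trials):
    """Invalid minus valid mean RT.

    A positive value means responses were faster at the cued location,
    which is the pattern described in the text.
    """
    return mean_rt(trials, "invalid") - mean_rt(trials, "valid")

print(f"valid:   {mean_rt(trials, 'valid')} ms")
print(f"invalid: {mean_rt(trials, 'invalid')} ms")
print(f"validity effect: {validity_effect(trials)} ms")
```

Valid trials outnumber invalid trials in this toy data set, mirroring the design feature that gives observers an incentive to attend to the cued location.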

Theories of Visual Attention

Our capacity to attend is quite limited, and there exist a variety of mechanisms for controlling the allocation of attention so that it will be directed toward the most relevant subset of the visual field. But why is our ability to attend so limited? There are many theories of visual attention, but for the purposes of this chapter, two classes of theory are perhaps the most relevant. Capacity-limitation theories of visual attention assume that the brain is a finite system (fair assumption), and consequently is limited in the amount of information it can process at once. On this view, we need attention to choose which inputs into the visual system will be fully processed, and which inputs will not be fully processed. An alternative framework, the attention-for-action framework, assumes that a particular body part can only execute a single action at once (another fair assumption), and that attention is needed to select a target for that action. On this view, visual attention is limited because the action system is limited. Although not necessarily mutually exclusive, these frameworks have been presented in opposition. In this section, I summarize these theoretical frameworks and conclude (1) that visual attention is clearly required to handle information processing constraints (supporting a capacity-limitation view) and (2) that attention is not necessarily limited because action is limited. This is in line with the view that attention is a process that operates at multiple levels (goal selection, action selection, visual selection), and highlights the point that the process of selection in one domain need not be constrained by the process of selection in another domain. However, as described in the following section, there is clearly a tight link between action and at least one form of visual attention, namely visual-spatial attention.

Capacity Limitations and Competition for Representation

One class of attentional theory assumes that there is simply too much information flooding into the visual system to fully process all of it at once. Therefore, attentional mechanisms are required to select which inputs to process at any given moment. This information overload begins at the earliest stages of visual processing, when a massive amount of information is registered. The retina has about 6.4 million cones and 110 to 125 million rods (Østerberg, 1935), which is a fairly high-resolution sampling of the visual field (equivalent to a 6.4-megapixel camera in the fovea). A million or so axons from each eye then project information to the brain for further processing (Balazsi, Rootman, Drance, Schulzer, & Douglas, 1984; Bruesch & Arey, 2004; Polyak, 1941; Quigley, Addicks, & Green, 1982). These initial measurements must be passed on to higher-level processing mechanisms that enable the visual system to recognize and localize objects in the visual field. However, simultaneously processing all of these inputs in order to identify every object in the visual field at once would be highly computationally demanding. Intuitively, such massively parallel processing is beyond the capacity of the human cognitive system (Broadbent, 1958; Neisser, 1967). This intuition is also supported by computational analyses of visual processing (Tsotsos, 1988), which suggest that such parallel processing is not viable within the constraints of the human visual system. Thus, the capacity limitations on higher-level cognitive processes, such as object identification and memory encoding, require a mechanism for selecting which subset of the incoming information should gain access to these higher-level processes.
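To get a feel for the scale of the bottleneck described above, the figures quoted in the text can be combined into a rough photoreceptor-to-axon ratio. The short Python sketch below does only that arithmetic. The constant and function names are illustrative, and treating "a million or so" axons as exactly one million is a simplifying assumption.

```python
# Figures quoted in the text (per eye).
CONES = 6.4e6                 # about 6.4 million cones
RODS_RANGE = (110e6, 125e6)   # 110 to 125 million rods
AXONS_PER_EYE = 1e6           # "a million or so"; assumed exactly 1e6 here

def photoreceptors_per_axon(rods):
    """Rough number of photoreceptors feeding each optic-nerve axon."""
    return (CONES + rods) / AXONS_PER_EYE

low, high = (photoreceptors_per_axon(rods) for rods in RODS_RANGE)
print(f"Roughly {low:.0f} to {high:.0f} photoreceptors per axon")
```

Under these assumptions, the signal leaving each eye is already compressed by about two orders of magnitude before any cortical processing begins. This is a back-of-the-envelope average only; it ignores the fact that convergence differs greatly between the fovea and the periphery.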
Of course, there is good behavioral evidence that the visual system is not fully processing all of the incoming information simultaneously, including the inattentional blindness and change blindness examples described above. In your everyday life, you may have experienced these phenomena while driving down the road and noticing a pedestrian that seemed to “appear out of nowhere,” or missing the traffic light turning green even though you were looking in the general direction of the light. Or you may have experienced not being able to find something that was in plain sight, like your keys on a messy desk. It seems safe to say that we are not always aware of all of the information coming into our visual system at any given moment. Take a look at Figure 13.1, which depicts a standard visual search task used to explore this in a controlled laboratory setting. Focus your eyes on the “x” at the center of the display, and then look for the letter T, which is located somewhere in the display, tilted on its side in either the counterclockwise (⊢) or clockwise (⊣) direction. Try not to be fooled by the shapes that are almost Ts. Why does it take so long to find the T? Is it that you “can’t see it?” Not really, because now that you know where it is (far right, just below halfway down), you can see it perfectly well even while continuing to focus your eyes at the x at the center.

Figure 13.1 Standard visual search display. Focus your eyes on the “x” at the center of the display, and then look for the letter T, which is located somewhere in the display, tilted on its side in either the counterclockwise (⊢) or clockwise (⊣) direction.

How can we understand our failure to “see” things that are plainly visible? One possibility is that the visual system has many specialized neurons that code for different properties of objects (e.g., some neurons code for color, others code for shape), and that attention is required to combine these different features into a single coherent object representation (Treisman & Gelade, 1980). Attending to multiple objects at once would result in confusion between the features of one object and the features of the other attended objects. On this account, attention can only perform this feature integration operation correctly if it operates over just one object at a time. Thus, you cannot see things that are plainly visible because at any given moment you can only accurately see the current object of attention.

An alternative theory, biased competition (Desimone, 1998; Desimone & Duncan, 1995), places the capacity limit at the level of neuronal representation. In this model, when multiple objects fall within the receptive field of a neuron in visual cortex, those objects compete for the response of the neuron. Neurons are tuned such that they respond more for some input features than for other input features (e.g., a particular neuron might fire strongly for shape A and not at all for shape B). How would such a neuron respond when multiple inputs fall within its receptive field simultaneously (e.g., if both shape A and shape B fell within the receptive field)? One could imagine that you would observe an average response, somewhere between the strong response to the preferred stimulus and a weak response to the nonpreferred stimulus. However, it turns out that the response of the neuron depends on which object is attended, with stronger responses when the preferred stimulus is attended and weaker responses when the nonpreferred stimulus is attended. Thus, attention appears to bias this neuronal competition in favor of the attended stimulus. Both stimulus-driven and goal-driven mechanisms can bias the representation of the neuron in favor of one object over the other, and the biasing signal (“tuning mechanism”) can be feature based or space based. This biasing signal enhances the response to the attended object and suppresses the response to the unattended object. The capacity limit in this case is at the level of the individual neuron, which cannot represent two objects at once, and therefore attentional mechanisms are required to determine which object is represented. This theory has been supported by a variety of physiological evidence in monkeys (for a review, see Desimone, 1998) and is consistent with behavioral evidence in humans (Carlson, Alvarez, & Cavanagh, 2007; Motter & Simoni, 2007; Scalf & Beck, 2010; Torralbo & Beck, 2008).

An analysis of the complexity of the problems the visual system must solve rules out the possibility that a human-sized visual system could fully process all of the incoming visual information at once.
The limits on processing capacity can be conceived as limits on specific computations (e.g., feature binding), or in terms of neuronal competition (e.g., competing for the response of a neuron). On either view, it is clear that visual selection is required, at least in part if not in full, by capacity limitations that arise within the stream of visual processing owing to the architecture of the visual system.
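The biased-competition idea described above can be illustrated with a toy calculation. The Python sketch below treats the response to a pair of stimuli as a weighted average of the responses to each stimulus alone, with attention increasing the weight of the attended stimulus. This weighted-average form, the `bias` parameter, and the firing rates are simplifying assumptions chosen for illustration; they are not a model taken from the studies cited in this chapter.

```python
def pair_response(r_pref, r_nonpref, attend=None, bias=4.0):
    """Toy response of one neuron to two stimuli in its receptive field.

    r_pref, r_nonpref: responses (spikes/s) to each stimulus shown alone.
    attend: None, "pref", or "nonpref" (which stimulus is attended).
    bias: hypothetical multiplier on the weight of the attended stimulus.
    """
    w_pref = bias if attend == "pref" else 1.0
    w_nonpref = bias if attend == "nonpref" else 1.0
    return (w_pref * r_pref + w_nonpref * r_nonpref) / (w_pref + w_nonpref)

# Invented firing rates: strong for the preferred shape, weak for the other.
r_a, r_b = 50.0, 10.0

print(pair_response(r_a, r_b))                    # no attentional bias
print(pair_response(r_a, r_b, attend="pref"))     # pulled toward 50
print(pair_response(r_a, r_b, attend="nonpref"))  # pulled toward 10
```

With no bias the pair response is the plain average of the two individual responses. Attending to either stimulus shifts the response toward what the neuron would produce if that stimulus were presented alone, which is the qualitative pattern the text describes.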

Attention for Action

An alternative class of theories holds that attention is limited because action is limited. Because of physical constraints, your body parts can only move in one direction at a time: Each hand can only move in one direction, each eye can only move in one direction, and so on. According to the attention-for-action view, the purpose of visual attention is to select the most relevant visual information for action and to suppress irrelevant visual information. Consequently, you are limited in the number of things you can pay attention to at once because your body is limited in the number of things it can act on at once. In its strongest form, the attention-for-action theory proposes that attentional selection limits are not set by information processing constraints: Attention is limited because action is limited (Allport, 1987, 1989, 1993; Neumann, 1987, 1990; Van der Heijden, 1992).

The most appealing aspect of the attention-for-action theory is the most intuitive part: It makes sense that we would pay attention to things while we perform actions on them. Even without any data, I am convinced that I attend to the handle of my coffee mug just before I pick it up, and of course eye-movement studies support this intuition (Land & Hayhoe, 2001). However, the satisfaction of this part of the attention-for-action theory does not imply that attention is limited because action is limited. It is possible for attention and action to be tightly coupled, but for attentional limitations to be determined by factors other than the limitations on action, such as the architecture of the visual system (e.g., neurons that encompass large regions of the visual field consisting of multiple objects). I will support a view in which visual attention selects the targets for action, but in which the limitations on visual attention are not set by the limitations on action.

Where Attention and Action Diverge

To push the claim that attention is limited because action is limited, we can think about the limits on visual attention and ask if they match with limits on physical action. For instance, feature-based visual attention is capable of globally selecting a feature such as “red” or “moving to the right,” such that all objects or regions in the visual field sharing that feature receive enhanced processing over objects or regions that do not have the selected feature (Bacon & Egeth, 1997; Rossi & Paradiso, 1995; Saenz et al., 2002, 2003; Serences & Boynton, 2007; Treue & Martinez Trujillo, 1999). Consequently, in some situations feature-based attention results in selecting dozens of objects, which far exceeds the action limit of one (or a few) objects at once. Perhaps one could argue that such a selected group constitutes a single “object of perception” (whereby within-group scrutiny requires additional selective mechanisms), but because this single perceptual group can be distributed across the entire visual field and interleaved with other “unselected” information, it cannot be considered a potential object of action (which is necessarily localized in space). Thus, action limits cannot explain the limits on feature-based attention.

One might argue that attention-for-action theory does not apply to feature-based attention, but rather that it applies only to spatial attention. This is a fair point because action occurs within the spatial domain. However, visual-spatial attention also shows many constraints that do not match obvious action limits. Most notably, although action is limited to a single location at any given moment, it is possible to attend to multiple objects or locations simultaneously, in parallel (Awh & Pashler, 2000; McMains & Somers, 2004; Pylyshyn & Storm, 1988; for a review of multifocal attention, see Cavanagh & Alvarez, 2005).
For instance, a common task for exploring the limits on spatial attention is the multiple-object-tracking task. In this task, observers view a set of identical moving objects. A subset is highlighted as targets to be tracked, and then all items appear identical and continue moving. The task is to keep track of the target items, but because there are no distinguishing features of the objects, participants must continuously attend to the objects in order to track them. Observers are able to keep track of one to eight objects concurrently, depending on the speed and spacing between the items (Alvarez & Franconeri, 2007), and the selection appears to be multifocal, selecting targets but not the space between targets (Intriligator & Cavanagh, 2001; Pylyshyn & Storm, 1988). The upper bound on spatial attention of at least eight objects is far beyond the action limit of a single location. One might argue that multiple locations can be attended because it is possible to plan complex actions that involve acting on multiple locations sequentially. However, on this view, there is no reason for there to be any limit on the number of locations that could be attended because in theory an infinite sequence of actions could be planned. Thus, action limitations seem to be unrelated to the limit on the number of attentional foci that can be deployed.

Other constraints on spatial attention are also difficult to explain within the attention-for-action framework. In particular, there appear to be important low-level, anatomical constraints on attentional processing. For instance, the spatial resolution of attention is coarser in the upper visual field than in the lower visual field (He, Cavanagh, & Intriligator, 1996; Intriligator & Cavanagh, 2001), and it is easier to maintain sustained focal attention along the horizontal meridian than along the vertical meridian (Mackeben, 1999). It is difficult to understand these limitations in terms of action constraints, unless we assume that our hands are less able to act on the upper visual field (perhaps because of gravity) and less able to act on the vertical meridian (it seems awkward to perform actions above and below fixation). An alternative explanation is that the upper/lower asymmetry is likely to be associated with visual areas in which the lower visual field is overrepresented. In monkeys, there is an upper/lower asymmetry that increases from relatively modest in V1 (Tootell, Switkes, Silverman, & Hamilton, 1988) to much more pronounced in higher visual areas like MT (Maunsell & Van Essen, 1987) and parietal cortex (Galletti, Fattori, Kutz, & Gamberini, 1999). The horizontal/vertical meridian asymmetry is likely to be linked to the relatively lower density of ganglion cells along the vertical meridian relative to the horizontal meridian (Curcio & Allen, 1990; Perry & Cowey, 1985) and possibly to the accelerated decline of cone density with eccentricity along the vertical meridian relative to the horizontal meridian (Curcio, Sloan, Packer, Hendrickson, & Kalina, 1987).
There is some disagreement as to whether these asymmetries should be considered effects of attentional processing or lower-level perceptual constraints (Carrasco, Talgar, & Cameron, 2001). However, other effects, such as interference effects between targets and distractors, or targets and other targets, are clearly limitations on attentional processing and not visual perception (Alvarez & Cavanagh, 2005; Carlson et al., 2007). One particularly dramatic demonstration of a visual field effect on attentional processing is the hemifield independence observed in attentive tracking (Alvarez & Cavanagh, 2005). In this attentive tracking task, observers kept their eyes focused at the center of the display and attentively tracked moving targets in two of the four quadrants of the peripheral visual field. Surprisingly, observers could keep track of twice as many targets when they appeared in separate halves of the visual field (e.g., in the top left and top right quadrants) as when they appeared in the same half of the visual field (e.g., in the top right and bottom right quadrants). It was as if the attentional processes required to track a moving object could operate independently in the left and right visual hemifields. This degree of independence is surprising since attentional selection is often considered to be a high-level cognitive process, and hemifield representations are characteristic of lower-level visual areas (cf. Bullier, 2004).

It is tempting to link these sorts of hemifield effects with hemispheric control of action by assuming that contralateral control of the body is linked to hemifield constraints on attentional selection (e.g., where the right hemisphere controls reaching and attention to the left visual field, and the left hemisphere controls reaching and attention to the right visual field). However, there are also quadrantic limits on attentional selection (Carlson et al., 2007) that are not amenable to such an account. Specifically, attended targets interfere with each other to a greater extent when they both appear within the same quadrant of the visual field than when they are equally far apart but appear in separate quadrants of the visual field. It is difficult to understand these limits on spatial attention within an attention-for-action framework because there are no visual field quadrantic effects on body movement. Instead, a capacity limit account, particularly within the competition-for-representation framework, provides a more natural explanation for these visual field effects. For instance, by taking known anatomy into account, we can find some important clues as to the cause of the quadrantic deficit. Visual effects that are constrained by the horizontal and vertical meridians can be directly linked to extrastriate areas V2 and V3 (Horton & Hoyt, 1991), which maintain noncontiguous representations of the four quadrants of the visual field. Increasing the distance between the lower-level cortical representations of each target appears to decrease the amount of attentional interference between them. One possibility is that cortical distance is correlated with the degree of overlap between the receptive fields of neurons at two attended locations (i.e., the degree of competition for representation). On this account, the release from interference over the meridians suggests that receptive fields of neurons located on these intra-areal borders may not extend across quadrants of the visual field.
This account is purely speculative, but for our purpose the important point is that there are quadrant-level attentional interference effects, and anatomical constraints offer some potential explanations, whereas limits on action do not.

These are just a few examples to illustrate that visual attention is not limited only because action is limited. Many constraints on visual-spatial attention are not mirrored by constraints on the action system, including limits on the number of attentional foci that can be maintained at once and visual field constraints on attentional selection. Such constraints can be understood in terms of capacity limits in the form of competition for representation, by taking the architecture of the human visual system into account. Thus, to a great extent, the mechanisms and limitations on visual-spatial attention can be understood separately and independently from the limitations on action. This should not be taken to mean that the process of spatial attention is somehow completely unrelated to the action system, or that the action system does not constrain visual selection. Visual attention and action are component processes of an integrated cognitive system, and as described in the following section, the two systems are tightly coupled.

Spatial Attention and Eye Movements

Although we have argued that the limits on the capacity of attention are largely independent of the limits on action, there is clearly an important role for visual attention in the selection of the targets for action. This tight coupling between visual selection and action selection is most apparent in the interaction between visual-spatial attention and eye movements. There is strong evidence that before you move your eyes to a location, visual attention first selects that location, as if attention selects the targets of eye movements. Indeed, theories such as the premotor theory of attention (Rizzolatti, Riggio, Dascola, & Umilta, 1987) propose that the mechanisms of visual attention are the same mechanisms that control eye movements. However, even within the domain of vision, attention is a process that operates at many levels. It is likely that some aspects of visual attention do not perfectly overlap with the eye movement system, and the evidence suggests some degree of separation between the mechanisms of attention and eye movements. Nevertheless, the idea that there is a high degree of overlap between the mechanisms of visual-spatial attention and eye movements is well supported by both behavioral and neurophysiological evidence.

Covert Visual-Spatial Attention

One logical possibility for the relationship between visual-spatial selection and eye movements is that the focus of attention is locked into alignment with the center of fixation (at the fovea). That is, you might always be attending where your eyes are fixating. However, evidence that attention can be shifted covertly away from fixation rules out this possibility (Eriksen & Yeh, 1985; J. M. Henderson, 1991; Posner, 1980). For instance, Posner's cueing studies suggest that observers are able to shift their attention away from where their eyes are looking, and focus on locations in the periphery. This ability to shift attention away from the eyes is known as covert attention.

As described above, other research has shown that it is possible to attend to multiple objects simultaneously (Pylyshyn & Storm, 1988). It is impossible to perform this task using eye movements alone because the targets move randomly and independently of each other. In other words, you can only look directly at one object at a time, and yet it is possible to track multiple objects simultaneously (anywhere from two to eight, depending on the speed and spacing of the items; Alvarez & Franconeri, 2007). Also, this task cannot be performed by grouping items into a single object (Yantis, 1992) because each item moves randomly and independently of the other items. Consequently, even if the items are perceptually grouped, the vertices must be independently selected and tracked to perform the task. Thus, it would appear that we have a multifocal, covert attention system that can select and track multiple moving objects at once, independent of eye position (Cavanagh & Alvarez, 2005).


Behavioral Evidence Linking Visual-Spatial Attention and Eye Movements

Figure 13.2 Schematic of the paradigm employed by McConkie and Rayner (1975) for examining the span of perception. Observers read some passage of text (a), while their eyes are tracked (fixation position denoted by the *). Outside of a window around the fixation, all letters are changed to x. When the window is large (b), observers don't notice, but when the window is small (c), they do notice. In general, the span of perception is asymmetrical, with a larger span ahead of the direction of eye movements (d).

It is clear that attention can be shifted covertly away from the center of fixation, but visual-spatial attention and eye movements nevertheless appear tightly coupled. Specifically, it appears that attention to a location precedes eye movements to that location. Early studies exploring eye movements during reading found evidence suggesting that attention precedes saccadic eye movements. McConkie and Rayner (1975) developed a clever paradigm called the "moving window" or "gaze-contingent display" to investigate perceptual processing during reading. They had observers read text (Figure 13.2a), and changed all of the letters away from fixation into the letter x. You might think this would be easily noticeable to the readers, but it was possible to change most of the letters to x without readers noticing. To accomplish this, researchers created a "window" around fixation where the text always had normal letters, and as observers moved their eyes, the window moved with them. Observers did not notice this alteration of the text when the window was large (Figure 13.2b), but they did notice when the window was small (Figure 13.2c). Thus, by manipulating the size of the moving window, it is possible to determine the span of perception during reading (the number of letters readers notice as they read). Interestingly, the span of perception is highly asymmetrical around the point of fixation, with three to four letters to the left of fixation, and fourteen to fifteen letters to the right of fixation (McConkie & Rayner, 1976) (Figure 13.2d). What does this have to do with the relationship between attention and eye movements? One interpretation of this finding is that during reading, eye movements to the right are preceded by a shift of attention to the right.
This interpretation assumes that attention enhances the perception of letters, which is consistent with the known enhancing effects of attention (Carrasco, Ling, & Read, 2004; Carrasco, Williams, & Yeshurun, 2002; Titchener, 1908; Yeshurun & Carrasco, 1998). This interpretation is further supported by the finding that readers of Hebrew text, which is read from right to left, show the opposite pattern: More letters are perceived to the left of fixation than to the right of fixation (Pollatsek, Bolozky, Well, & Rayner, 1981). This asymmetry of the perceptual span does not appear to rely on extensive reading practice: In general, the perceptual span appears to be asymmetrical in the direction of eye movements, even when reading in an atypical direction (Inhoff, Pollatsek, Posner, & Rayner, 1989) or when scanning objects in the visual field in a task that does not involve reading (Henderson, Pollatsek, & Rayner, 1989).
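The moving-window manipulation described above can be illustrated with a short sketch. This is not McConkie and Rayner's implementation; the function name `moving_window`, its parameters, and the choice to leave spaces and punctuation unmasked are assumptions made only for illustration. The window is specified separately to the left and right of fixation so that an asymmetrical span (for example, about 4 letters to the left and 15 to the right) can be represented.

```python
def moving_window(text: str, fixation: int, left: int, right: int,
                  mask: str = "x") -> str:
    """Mask letters outside a window around the fixated character.

    `fixation` is the index of the fixated character; `left` and `right`
    are the number of characters kept visible on each side (hypothetical
    parameters). Spaces and punctuation are left intact, which is an
    assumption of this sketch.
    """
    masked = []
    for i, ch in enumerate(text):
        inside = fixation - left <= i <= fixation + right
        # Only letters outside the window are replaced by the mask.
        masked.append(ch if inside or not ch.isalpha() else mask)
    return "".join(masked)


if __name__ == "__main__":
    line = "the quick brown fox"
    # Small window: fixating the "q" with 1 character visible to the left
    # and 2 to the right.
    print(moving_window(line, fixation=4, left=1, right=2))
```

In a real gaze-contingent display the function would be called again on every new eye-position sample, so that the window follows the eyes; varying `left` and `right` until readers begin to notice the masking gives an estimate of the perceptual span.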

Figure 13.3 Schematic of the paradigm employed by Hoffman and Subramaniam (1995) for examining the link between saccade location and attention. A location is cued to designate where the eyes should move, but the eye movement is not executed right away. Then a tone is played, indicating that the eyes should now move to the target location. Before the eyes have a chance to move, letters briefly appear. The task is to determine whether one of those letters is a T or an L, and the target letter can appear at any of the four possible locations. However, the letter is detected more accurately when it happens to appear where the eye movement is planned (i.e., the eventual location of the eyes).

Saccadic eye movements are ballistic movements of the eyes from one location to another. Direct evidence for attention preceding saccadic eye movements comes from studies showing enhanced perceptual processing at saccade target locations before saccade execution (Chelazzi et al., 1995; Crovitz & Daves, 1962; J. M. Henderson, 1993; Hoffman & Subramaniam, 1995; Schneider & Deubel, 1995; Shepherd, Findlay, & Hockey, 1986). The assumption of such studies is that faster, more accurate perceptual processing is a hallmark of visual-spatial attention. In one study, Hoffman and Subramaniam (1995) used a dual-task paradigm in which they presented a set of four possible saccade target locations. For the first task, an arrow cue pointed to one of the four locations, and observers were instructed to prepare a saccade to the cued location (Figure 13.3). However, observers did not execute the saccade until they heard a beep. The critical question was whether attention would obligatorily be focused on the saccade target location, rather than on the other locations. Because attention presumably enhances perceptual processing at attended locations, it is possible to use a secondary task to probe whether attention is allocated to the saccade target location. If perceptual processing were enhanced at the saccade target location relative to other locations, then it would appear that attention was allocated to the saccade target location. For the second task, four letters were briefly presented, with one letter appearing in each of the four possible saccade target locations (see Figure 13.3). One of those letters was the target letter (either a T or an L), and the other letters were distractor letters, Es and Fs. The task was to identify whether a T or L was present, and the T or L could appear at any position, independent of the saccade target location. Because the letter could appear anywhere, there was no incentive to look for the target letter in the saccade target position, and yet observers were more accurate in detecting the target letter when it appeared at the saccade target location. In a follow-up experiment, it was shown that observers could not attend to one location and simultaneously move the eyes to a different location. Thus, the allocation of attention to the saccade target location is in fact obligatory, occurring even when conditions favor attending to a location other than the saccade target location.

Several studies have taken a similar approach, demonstrating that it does not appear possible to make an eye movement without first attentionally selecting the saccade target location. Shepherd, Findlay, and Hockey (1986) showed observers two boxes, one to the left and one to the right of fixation. A central arrow cued one of the locations, and observers were required to make a saccade to the cued location. Shortly before or after the saccade, a probe stimulus (a small square) appeared, and observers had to press a key as quickly as possible. Critically, observers knew that the probe would most likely appear in the box opposite the saccade target location, providing an incentive to attend to the box opposite the saccade target if possible. Nevertheless, response times were faster when the saccade target location and the probe location were the same, relative to when they appeared in different boxes. This was true even when the probe appeared before the saccade occurred. It seems difficult or impossible to move the eyes in one direction while attending to a location in the opposite direction. Thus, although it is possible to shift attention without moving the eyes, it does not seem possible to shift the eyes without shifting attention in that direction first.
This tight coupling between visual-spatial attention and eye movements is not only true for saccadic eye movements but also appears to hold for smooth pursuit eye movements (Khurana & Kowler, 1987; van Donkelaar, 1999; van Donkelaar & Drew, 2002). Smooth pursuit eye movements are the smooth, continuous eye movements used to maintain fixation on a moving object. Indirect evidence for the interaction between attention and smooth pursuit eye movements comes from studies requiring observers to smoothly track a moving target with their eyes, and then make a saccade toward a stationary peripheral target. Such saccades are faster (Krauzlis & Miles, 1996; Tanaka, Yoshida, & Fukushima, 1998) and more accurate (Gellman & Fletcher, 1992) to locations ahead of the pursuit direction relative to saccades toward locations behind the pursuit direction. These results are consistent with the notion that attention is allocated ahead of the direction of ongoing smooth pursuit. Van Donkelaar (1999) provides more direct evidence in favor of this interpretation. Observers were required to keep their eyes focused on a smoothly moving target, while simultaneously monitoring for the appearance of a probe stimulus. The probe was a small circle that could be flashed ahead of the moving target (in the direction of pursuit) or behind the moving target (in the wake of pursuit). Critically, participants were required to continue smooth pursuit throughout the trial, and to press a key upon detecting the probe without breaking smooth pursuit. The results showed that responses were significantly faster for probes ahead of the direction of pursuit. This suggests that attentional selection of locations ahead of the direction of pursuit is required to maintain smooth pursuit eye movements. In support of this interpretation, subsequent experiments showed that probe detection is fastest at the target location and just ahead of the target location, with peak enhancement ahead of eye position depending on pursuit speed (greater speed, further ahead; van Donkelaar & Drew, 2002).

In summary, the behavioral evidence shows enhanced perceptual processing before eye movements at the eventual location of the eyes. Because attention is known to enhance perceptual processing in terms of both speed and accuracy, this behavioral evidence supports a model in which visual-spatial attention selects the targets of eye movements. However, the behavioral evidence alone cannot distinguish between a model in which visual-spatial attention and eye-movement control are in fact a single mechanism and one in which they are two separate but tightly interconnected mechanisms. One approach to directly address this question is to investigate the neural mechanisms of visual-spatial attention and eye-movement control.

Neurophysiological Evidence Linking Visual-Spatial Attention and Eye Movements

Research on the neural basis of visual-spatial attention and eye movements strongly suggests that there is a great deal of overlap between the neural mechanisms that control attention and the neural mechanisms that control eye movements.

Functional Magnetic Resonance Imaging in Humans

Much of the research on the neural substrates of visual-spatial attention and eye movements in humans has been conducted with fMRI. The approach of such studies is to localize the brain regions that are active when observers make covert shifts of visual-spatial attention and compare them to the brain regions that are active when observers move their eyes (Beauchamp, Petit, Ellmore, Ingeholm, & Haxby, 2001; Corbetta et al., 1998). If shifting attention and making eye movements activate the same brain regions, it would suggest that visual-spatial attention and eye movements employ the same neural mechanisms.

In one study, Beauchamp et al. (2001) had observers perform either a covert attention task or an eye-movement task. In the covert attention task, observers kept their eyes fixated on a point at the center of the screen and attended to a target that jumped from location to location in the periphery. In the eye-movement task, observers moved their eyes, attempting to keep their eyes focused on the location of the target as it jumped from position to position. In each condition, the rate at which the target changed location was varied from 0.2 times per second to 2.5 times per second. Activation for both tasks was compared with a control task in which the target remained stationary at fixation. Relative to this control task, several areas were more active during both the covert attention task and the eye-movement task, including the precentral sulcus (PreCS), the intraparietal sulcus (IPS), and the lateral occipital sulcus (LOS). Activity in the PreCS appeared to have two distinct foci, and the superior PreCS has been identified as the possible homolog of the monkey frontal eye fields (FEFs) (Paus, 1996), which play a central role in the control of eye movements.

Although these brain regions were active during both the covert attention task and the eye-movement task, the level of activity was significantly greater during the eye-movement task. Based on this result, Beauchamp et al. (2001) proposed the intriguing hypothesis that a shift of visual-spatial attention is simply a subthreshold activation of the oculomotor control regions. Moderate, subthreshold activity in oculomotor control regions will cause a covert shift of spatial attention without triggering an eye movement, whereas higher, suprathreshold activity in those same regions will cause an eye movement. Of course, the resolution of fMRI does not allow us to determine that the exact same neurons involved in shifting attention are involved with eye movements. A plausible alternative is that separate populations of neurons underlie shifts of attention and the control of eye movements, but that these populations are located in roughly the same anatomical regions (see Corbetta et al., 2001, for a discussion of this possibility). However, if attention and eye movements had separate underlying populations of neurons, the degree of overlap between the two networks would suggest a functional purpose, perhaps to enable communication between the systems. Thus, on either view, there is strong evidence for shared or tightly interconnected neural mechanisms for visual-spatial attention and eye movements.

Recall that the behavioral evidence presented above suggests that a shift of attention necessarily precedes an eye movement. Thus, even if eye movements required neural mechanisms completely separate from shifts of attention, the brain regions involved in the eye-movement task should include both the attention regions (which behavioral work suggests must be activated to make an eye movement) and those involved specifically with making eye movements (if there are any). How do we know that all of these active regions are not just part of the attention system? The critical question is not only whether the brain regions for attention and eye movements overlap but also whether there is any nonoverlap between them because the nonoverlap may represent eye-movement-specific neural mechanisms. Physiological studies in monkeys provide some insight into this question.
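The subthreshold-activation hypothesis of Beauchamp et al. (2001), discussed above, can be restated as a toy decision rule. The sketch below is only an illustration of the verbal hypothesis, not a model taken from that study: the activity scale, the two threshold values, and the names `Outcome` and `oculomotor_outcome` are all hypothetical, and the lower threshold separating "no shift" from "covert shift" is an added assumption.

```python
from enum import Enum


class Outcome(Enum):
    NONE = "no shift"
    COVERT_SHIFT = "covert shift of attention"
    SACCADE = "eye movement"


def oculomotor_outcome(activity: float,
                       attention_threshold: float = 0.3,
                       saccade_threshold: float = 0.7) -> Outcome:
    """Map activity in oculomotor control regions to a behavioral outcome.

    Threshold values are arbitrary placeholders on a 0-1 scale.
    """
    if activity >= saccade_threshold:
        # Suprathreshold activity: the eyes move (per the behavioral
        # evidence, attention is assumed to have shifted first).
        return Outcome.SACCADE
    if activity >= attention_threshold:
        # Moderate, subthreshold activity: attention shifts, eyes stay put.
        return Outcome.COVERT_SHIFT
    return Outcome.NONE


if __name__ == "__main__":
    for level in (0.1, 0.5, 0.9):
        print(level, oculomotor_outcome(level).value)
```

On this single-mechanism reading, covert and overt shifts differ only in degree. The alternative raised in the text, separate but co-located neural populations, would instead require two activity variables rather than one.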

Neurophysiological Studies in Monkeys

Both cortical and subcortical brain regions are involved in planning and executing saccadic eye movements, including the FEF (Bizzi, 1968; Bruce & Goldberg, 1985; Schall, Hanes, Thompson, & King, 1995; Schiller & Tehovnik, 2001), the supplementary eye field (Schlag & Schlag-Rey, 1987), the dorsolateral prefrontal cortex (Funahashi, Bruce, & Goldman-Rakic, 1991), the parietal cortex (Mountcastle, Lynch, Georgopoulos, Sakata, & Acuna, 1975; D. L. Robinson, Goldberg, & Stanton, 1978), the pulvinar nucleus of the thalamus (Petersen, Robinson, & Keys, 1985), and the superior colliculus (SC) (Robinson, 1972; Wurtz & Goldberg, 1972a, 1972b).


Research employing direct stimulation of neurons in the FEF and SC supports the hypothesis that the same neural mechanisms that control eye movements also control covert shifts of spatial attention. For instance, Moore and Fallah (2004) trained monkeys to detect a target (a brief dimming) in the peripheral visual field. It was found that subthreshold stimulation of FEF cells improved performance at peripheral locations where the monkey's eyes would move if suprathreshold stimulation were applied. Similarly, Cavanaugh and Wurtz (2004) have shown that change detection is improved by subthreshold microstimulation of SC, and this improvement was only observed when the change occurred at the location where the eyes would move if suprathreshold stimulation were applied. Thus, it appears that the cells in the FEF and SC that control the deployment of eye movements also control covert shifts of spatial attention.

Although compelling, these microstimulation studies are still limited by the fact that microstimulation likely influences several neurons within the region of the electrode. Thus, it remains possible that eye movements and attention are controlled by different cells, but these cells are closely intertwined. Indeed, there is evidence supporting the possibility that some neurons code for eye movements and not spatial selection, whereas other neurons code for spatial selection and not eye movements. For instance, Sato and Schall (2003) found that some neurons within the FEF appear to code for the locus of spatial attention, whereas others code only for the saccade location. Moreover, Ignashchenkova et al. (2004) measured single-unit activity in the SC and found that visual and visuomotor neurons were active during covert shifts of attention, but that purely motor neurons were not. Thus, both the FEF and SC appear to have neurons that are involved with both attention and eye movements, as well as some purely motor eye-movement neurons.

Summary

What is the relationship between attention and action? To answer this question we must first clarify, at least a little bit, what we mean by the term "attention." In this chapter the term has been used to refer to the process of selection that occurs at multiple levels of representation within the cognitive system, including goal selection, visual selection, and action selection. An overview of visual attention was presented, highlighting the capacity limitations on visual processing and the attentional mechanisms by which we cope with these capacity limitations, effectively tuning the system to the most important subset of the incoming information. There has been some debate about whether the limits on visual attention are due to information processing constraints or to limitations on physical action. The idea that visual attention is limited by action is undermined by the mismatch in limitations between visual attention and physical action. Moreover, there is ample evidence demonstrating that the architecture of the visual system alone, independent of action, requires mechanisms of attention and constrains how those mechanisms operate. In other words, attention is clearly required to manage information processing constraints within the visual system, irrespective of limitations on physical action. This does not rule out the possibility that action limitations impose additional constraints on visual attention, but it does argue against the idea that action limitations alone impose constraints on visual attention.

Despite the claim that attention is not limited only because action is limited, it is clear that visual-spatial attention and action are tightly coupled. Indeed, in the case study of spatial attention and eye movements, the attention system and action system are highly overlapping both behaviorally and neurophysiologically, and where they do not overlap completely they are tightly intertwined. In this case the overlap makes perfect sense because attending to a location in space and moving the eyes to a position in space require similar computations (i.e., specifying a particular spatial location relative to some reference frame). It is interesting to consider whether other mechanisms of visual selection (feature based, object based) are as intimately tied to action, or whether the coupling is primarily between spatial attention and action. Further, the current chapter focused on eye movements, but reaching movements and locomotion are other important forms of action that must be considered as well. Detailed models of the computations involved in these different forms of attentional selection and the control of action would predict that, like the case of spatial attention and eye movements, visual selection and action mechanisms are likely to overlap to the extent that shared computations and representations are required by each. However, it is likely that the relationship between attention and action will vary across different forms of attention and action. For instance, there is some evidence that the interaction between visual-spatial attention and hand movements might be more flexible than the interaction between spatial attention and eye movements. Specifically, spatial attention appears to select the targets for both eye movements and hand movements. However, it is possible to shift attention away from the eventual hand location before the hand is moved, but it is not possible to shift attention away from the eventual eye position before the eye has moved (Deubel & Schneider, 2003).

Our understanding of visual-spatial attention and action has been greatly expanded by considering the relationship between these cognitive systems. Further exploring similar questions within other domains of attention (e.g., goal selection, other mechanisms of visual selection, auditory selection, tactile selection) and other domains of action (e.g., reaching, locomotion, complex coordinated actions) is likely to provide an even greater understanding of attention and action, and of how they operate together within an integrated cognitive system.

References

Allport, D. A. (1987). Selection for action: Some behavioral and neurophysiological considerations of attention and action. In H. Heuer & A. F. Sanders (Eds.), Perspectives on perception and action (pp. 395–419). Cambridge, MA: Erlbaum.

Allport, D. A. (1989). Visual attention. In M. I. Posner (Ed.), Foundations of cognitive science (pp. 631–682). Cambridge, MA: MIT Press.

Allport, D. A. (1993). Attention and control. Have we been asking the wrong questions? A critical review of twenty-five years. In D. E. Meyer & S. Kornblum (Eds.), Attention and performance XIV: Synergies in experimental psychology, artificial intelligence, and cognitive neuroscience (pp. 183–218). Cambridge, MA: MIT Press.

Alvarez, G. A., & Cavanagh, P. (2005). Independent resources for attentional tracking in the left and right visual hemifields. Psychological Science, 16 (8), 637–643.

Alvarez, G. A., & Franconeri, S. L. (2007). How many objects can you track? Evidence for a resource-limited attentive tracking mechanism. Journal of Vision, 7 (13–14), 1–10.

Atchley, P., & Kramer, A. F. (2001). Object-based attentional selection in three-dimensional space. Visual Cognition, 8, 1–32.

Avrahami, J. (1999). Objects of attention, objects of perception. Perception & Psychophysics, 61 (8), 1604–1612.

Awh, E., & Pashler, H. (2000). Evidence for split attentional foci. Journal of Experimental Psychology: Human Perception and Performance, 26 (2), 834–846.

Bacon, W. J., & Egeth, H. E. (1997). Goal-directed guidance of attention: Evidence from conjunctive visual search. Journal of Experimental Psychology: Human Perception and Performance, 23 (4), 948–961.

Balazsi, A. G., Rootman, J., Drance, S. M., Schulzer, M., & Douglas, G. R. (1984). The effect of age on the nerve fiber population of the human optic nerve. American Journal of Ophthalmology, 97 (6), 760–766.

Beauchamp, M. S., Petit, L., Ellmore, T. M., Ingeholm, J., & Haxby, J. V. (2001). A parametric fMRI study of overt and covert shifts of visuospatial attention. NeuroImage, 14 (2), 310–321.

Ben-Shahar, O., Scholl, B. J., & Zucker, S. W. (2007). Attention, segregation, and textons: Bridging the gap between object-based attention and texton-based segregation. Vision Research, 47 (6), 845–860.

Bizzi, E. (1968). Discharge of frontal eye field neurons during saccadic and following eye movements in unanesthetized monkeys. Experimental Brain Research, 6, 69–80.

Broadbent, D. (1958). Perception and communication. London: Pergamon.

Bruce, C. J., & Goldberg, M. E. (1985). Primate frontal eye fields. I. Single neurons discharging before saccades. Journal of Neurophysiology, 53 (3), 603–635.

Bruesch, S. R., & Arey, L. B. (2004). The number of myelinated and unmyelinated fibers in the optic nerve of vertebrates. Journal of Comparative Neurology, 77 (3), 631–665.

Bullier, J. (2004). Communications between cortical areas of the visual system. In L. M. Chalupa & J. S. Werner (Eds.), The visual neurosciences (pp. 522–540). Cambridge, MA: MIT Press.

Carlson, T. A., Alvarez, G. A., & Cavanagh, P. (2007). Quadrantic deficit reveals anatomical constraints on selection. Proceedings of the National Academy of Sciences USA, 104 (33), 13496–13500.

Carrasco, M., Ling, S., & Read, S. (2004). Attention alters appearance. Nature Neuroscience, 7 (3), 308–313.

Carrasco, M., Talgar, C. P., & Cameron, E. L. (2001). Characterizing visual performance fields: Effects of transient covert attention, spatial frequency, eccentricity, task and set size. Spatial Vision, 15 (1), 61–75.

Carrasco, M., Williams, P. E., & Yeshurun, Y. (2002). Covert attention increases spatial resolution with or without masks: Support for signal enhancement. Journal of Vision, 2 (6), 467–479.

Cavanagh, P., & Alvarez, G. A. (2005). Tracking multiple targets with multifocal attention. Trends in Cognitive Sciences, 9 (7), 349–354.

Cavanaugh, J., & Wurtz, R. H. (2004). Subcortical modulation of attention counters change blindness. Journal of Neuroscience, 24 (50), 11236–11243.

Chelazzi, L., Biscaldi, M., Corbetta, M., Peru, A., Tassinari, G., & Berlucchi, G. (1995). Oculomotor activity and visual spatial attention. Behavioural Brain Research, 71 (1–2), 81–88.

Cooper, A., & Humphreys, G. (1999). A new, object-based visual illusion. Paper presented at the Psychonomic Society, Los Angeles.

Corbetta, M., Akbudak, E., Conturo, T. E., Snyder, A. Z., Ollinger, J. M., Drury, H. A., et al. (1998). A common network of functional areas for attention and eye movements. Neuron, 21 (4), 761–773.

Crovitz, H. F., & Daves, W. (1962). Tendencies to eye movement and perceptual accuracy. Journal of Experimental Psychology, 63, 495–498.

Curcio, C. A., & Allen, K. A. (1990). Topography of ganglion cells in human retina. Journal of Comparative Neurology, 300 (1), 5–25.
Curcio, C. A., Sloan, K. R., Jr., Packer, O., Hendrickson, A. E., & Kalina, R. E. (1987). Dis tribution of cones in human and monkey retina: Individual variability and radial asymme try. Science, 236 (4801), 579–582.

Page 23 of 32

Attention and Action Desimone, R. (1998). Visual attention mediated by biased competition in extrastriate visu al cortex. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 353 (1373), 1245–1255. Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. An nual Review of Neuroscience, 18, 193–222. Deubel, H., & Schneider, W. X. (2003). Delayed saccades, but not delayed manual aiming movements, require visual attention shifts. Annals of the New York Academy of Sciences, 1004, 289–296. Downing, C. J., & Pinker, S. (1985). The spatial structure of visual attention. In M. Posner & O. Martin (Eds.), Attention and performance XI (pp. 171–187). Hillsdale, NJ: Erlbaum. Duncan, J. (1984). Selective attention and the organization of visual information. Journal of Experimental Psychology General, 113 (4), 501–517. Duncan, J., & Humphreys, G. W. (1989). Visual search and stimulus similarity. Psychologi cal Review, 96 (3), 433–458. Egeth, H. E., Virzi, R. A., & Garbart, H. (1984). Searching for conjunctively defined tar gets. Journal of Experimental Psychology Human Perception and Performance, 10 (1), 32– 39. Egly, R., Driver, J., & Rafal, R. D. (1994). Shifting visual attention between objects and lo cations: evidence from normal and parietal lesion subjects. Journal of Experimental Psy chology General, 123 (2), 161–177. Engel, F. L. (1971). Visual conspicuity, directed attention and retinal locus. Vision Re search, 11 (6), 563–576. Eriksen, C. W., & St. James, J. D. (1986). Visual attention within and around the field of fo cal attention: a zoom lens model. Perception & Psychophysics, 40 (4), 225–240. Eriksen, C. W., & Yeh, Y. Y. (1985). Allocation of attention in the visual field. Journal of Ex perimental Psychology: Human Perception and Performance, 11 (5), 583–597. Folk, C. L., Remington, R. W., & Johnston, J. C. (1992). Involuntary covert orienting is con tingent on attentional control settings. 
Journal of Experimental Psychology Human Per ception and Performance, 18 (4), 1030–1044. Franconeri, S. L., Alvarez, G. A., & Enns, J. T. (2007). How many locations can be selected at once? Journal of Experimental Psychology: Human Perception and Performance, 33 (5), 1003–1012. Franconeri, S. L., Hollingworth, A., & Simons, D. J. (2005). Do new objects capture atten tion? Psychological Science, 16 (4), 275–281.

Page 24 of 32

Attention and Action Franconeri, S. L., & Simons, D. J. (2005). The dynamic events that capture visual atten tion: A reply to Abrams and Christ (2005). Perception & Psychophysics, 67 (6), 962–966. Funahashi, S., Bruce, C. J., & Goldman-Rakic, P. S. (1991). Neuronal activity related to saccadic eye movements in the monkey’s dorsolateral prefrontal cortex. Journal of Neuro physiology, 65 (6), 1464–1483. Galletti, C., Fattori, P., Kutz, D. F., & Gamberini, M. (1999). Brain location and visual topography of cortical area V6A in the macaque monkey. European Journal of Neuro science, 11 (2), 575–582. Gellman, R. S., & Fletcher, W. A. (1992). Eye position signals in human saccadic process ing. Experimental Brain Research, 89 (2), 425–434. Haines, R. F. (1991). A breakdown in simultaneous information processing. In G. Orbrecht & L. Stark (Eds.), Presbyopia research (pp. 171–175). New York: Plenum. (p. 270)

He, S., Cavanagh, P., & Intriligator, J. (1996). Attentional resolution and the locus of visual awareness. Nature, 383 (6598), 334–337. He, Z. J., & Nakayama, K. (1995). Visual attention to surfaces in three-dimensional space. Proceedings of the National Academy of Sciences U S A, 92 (24), 11155–11159. Helmholtz, H. V. (1962). Treatise on physiological optics (Vol. 3). New York: Dover. Henderson, J. M. (1991). Stimulus discrimination following covert attentional orienting to an exogenous cue. Journal of Experimental Psychology Human Perception and Perfor mance, 17 (1), 91–106. Henderson, J. M. (1993). Visual attention and saccadic eye movements. In G. d’Ydewalle & J. Van Rensbergen (Eds.), Perception and cognition: Advances in eye movement re search (pp. 37–50). Amsterdam: North-Holland. Henderson, J. M., Pollatsek, A., & Rayner, K. (1989). Covert visual attention and ex trafoveal information use during object identification. Perception & Psychophysics, 45 (3), 196–208. Hillstrom, A. P., & Yantis, S. (1994). Visual motion and attentional capture. Perception & Psychophysics, 55 (4), 399–411. Hoffman, J. E., & Subramaniam, B. (1995). The role of visual attention in saccadic eye movements. Perception & Psychophysics, 57 (6), 787–795. Horton, J. C., & Hoyt, W. F. (1991). Quadrantic visual field defects: A hallmark of lesions in extrastriate (V2/V3) cortex. Brain, 114 (4), 1703–1718. Humphreys, G. W. (1998). Neural representation of objects in space: A dual coding ac count. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 353 (1373), 1341–1351. Page 25 of 32

Attention and Action Ignashchenkova, A., Dicke, P. W., Haarmeier, T., & Thier, P. (2004). Neuron-specific contri bution of the superior colliculus to overt and covert shifts of attention. Nature Neuro science, 7 (1), 56–64. Inhoff, A. W., Pollatsek, A., Posner, M. I., & Rayner, K. (1989). Covert attention and eye movements during reading. Quarterly Journal of Experimental Psychology A, 41 (1), 63– 89. Intriligator, J., & Cavanagh, P. (2001). The spatial resolution of visual attention. Cognitive Psychology, 43 (3), 171–216. Itti, L., & Koch, C. (2000). A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research, 40 (10–12), 1489–1506. James, W. (1890). Principles of psychology. New York: Holt. Kahneman, D., Treisman, A., & Gibbs, B. J. (1992). The reviewing of object files: Objectspecific integration of information. Cognitive Psychology, 24 (2), 175–219. Kaptein, N. A., Theeuwes, J., & Van der Heijden, A. H. (1995). Search for a conjunctively defined target can be selectively limited to a color-defined subset of elements. Journal of Experimental Psychology: Human Perception and Performance, 21, 1053–1069. Khurana, B., & Kowler, E. (1987). Shared attentional control of smooth eye movement and perception. Vision Research, 27 (9), 1603–1618. Konkle, T., Brady, T. F., Alvarez, G., & Oliva, A. (2010). Conceptual distinctiveness sup ports detailed visual long-term memory for real-world objects. Journal of Experimental Psychology: General, 139 (3), 558–578. Kramer, A. F., Weber, T. A., & Watson, S. E. (1997). Object-based attentional selection: Grouped arrays or spatially invariant representations? Comment on Vecera and Farah (1994). Journal of Experimental Psychology: General, 126 (1), 3–13. Krauzlis, R. J., & Miles, F. A. (1996). Initiation of saccades during fixation or pursuit: Evi dence in humans for a single mechanism. Journal of Neurophysiology, 76 (6), 4175–4179. LaBerge, D. (1983). 
Spatial extent of attention to letters and words. Journal of Experimen tal Psychology: Human Perception & Performance, 9 (3), 371–379. LaBerge, D., & Brown, V. (1986). Variations in size of the visual field in which targets are presented: An attentional range effect. Perception & Psychophysics, 40 (3), 188–200. Lamy, D., & Tsal, Y. (2000). Object features, object locations, and object files: Which does selective attention activate and when? Journal of Experimental Psychology Human Per ception and Performance, 26 (4), 1387–1400. Land, M. F., & Hayhoe, M. (2001). In what ways do eye movements contribute to every day activities? Vision Research, 41 (25–26), 3559–3565. Page 26 of 32

Attention and Action Lavie, N., & Driver, J. (1996). On the spatial extent of attention in object-based visual se lection. Perception & Psychophysics, 58 (8), 1238–1251. Mack, A., & Rock, I. (1998). Inattentional blindness. Cambridge, MA: MIT Press. Mackeben, M. (1999). Sustained focal attention and peripheral letter recognition. Spatial Vision, 12 (1), 51–72. Marino, A. C., & Scholl, B. J. (2005). The role of closure in defining the “objects” of objectbased attention. Perception & Psychophysics, 67 (7), 1140–1149. Maunsell, J. H., & Van Essen, D. C. (1987). Topographic organization of the middle tempo ral visual area in the macaque monkey: Representational biases and the relationship to callosal connections and myeloarchitectonic boundaries. Journal of Comparative Neurolo gy, 266 (4), 535–555. McConkie, G. W., & Rayner, K. (1975). The span of the effective stimulus during fixation in reading. Perception & Psychophysics, 17, 578–586. McConkie, G. W., & Rayner, K. (1976). Asymmetry of the perceptual span in reading. Bul letin of the Psychonomic Society, 8, 365–368. McLay, R. W., Anderson, D. J., Sidaway, B., & Wilder, D. G. (1997). Motorcycle accident re construction under Daubert. Journal of the National Academy of Forensic Engineering, 14, 1–18. McMains, S. A., & Somers, D. C. (2004). Multiple spotlights of attentional selection in hu man visual cortex. Neuron, 42 (4), 677–686. Mitroff, S. R., Scholl, B. J., & Wynn, K. (2004). Divide and conquer: How object files adapt when a persisting object splits into two. Psychological Science, 15 (6), 420–425. Moore, T., & Fallah, M. (2004). Microstimulation of the frontal eye field and its effects on covert spatial attention. Journal of Neurophysiology, 91 (1), 152–162. Most, S. B., & Astur, R. (2007). Feature-based attentional set as a cause of traffic acci dents. PVIS, 15 (2), 125–132. Most, S. B., Scholl, B. J., Clifford, E. R., & Simons, D. J. (2005). 
What you see is what you set: Sustained inattentional blindness and the capture of awareness. Psychological Re view, 112 (1), 217–242. (p. 271)

Most, S. B., Simons, D. J., Scholl, B. J., Jimenez, R., Clifford, E., & Chabris, C. F.

(2001). How not to be seen: The contribution of similarity and selective ignoring to sus tained inattentional blindness. Psychological Science, 12 (1), 9–17. Motter, B. C., & Simoni, D. A. (2007). The roles of cortical image separation and size in active visual search performance. Journal of Vision, 7 (2), 6–15. Page 27 of 32

Attention and Action Mountcastle, V. B., Lynch, J. C., Georgopoulos, A., Sakata, H., & Acuna, C. (1975). Posteri or parietal association cortex of the monkey: Command functions for operations within extrapersonal space. Journal of Neurophysiology, 38 (4), 871–908. Mounts, J. R., & Melara, R. D. (1999). Attentional selection of objects or features: evi dence from a modified search task. Perception & Psychophysics, 61 (2), 322–341. Neisser, U. (1967). Cognitive psychology. New York: Appleton-Century-Crofts. Neumann, O. (1987). Beyond capacity: A functional view of attention. In H. Heuer & A. F. Sanders (Eds.), Perspectives on perception and action (pp. 361–394). Hillsdale, NJ: Erl baum. Neumann, O. (1990). Visual attention and action. In O. Neumann & W. Prinz (Eds.), Rela tionships between perception and action: Current approaches (pp. 227–267). Berlin: Springer. O’Craven, K. M., Downing, P. E., & Kanwisher, N. (1999). fMRI evidence for objects as the units of attentional selection. Nature, 401 (6753), 584–587. O’Regan, J. K., Deubel, H., Clark, J. J., & Rensink, R. A. (2000). Picture changes during blinks: Looking without seeing and seeing without looking. Visual Cognition, 7 (1), 191– 211. O’Regan, J. K., Rensink, R. A., & Clark, J. J. (1999). Change-blindness as a result of “mud splashes.” Nature, 398 (6722), 34–34. Østerberg, G. A. (1935). Topography of the layer of rods and cones in the human retina. Acta Ophthalmologica, 13 (Suppl 6), 1–97. Pashler, H. (1998). The psychology of attention. Cambridge, MA: MIT Press. Paus, T. (1996). Location and function of the human frontal eye-field: A selective review. Neuropsychologia, 34 (6), 475–483. Perry, V. H., & Cowey, A. (1985). The ganglion cell and cone distributions in the monkey’s retina: Implications for central magnification factors. Vision Research, 25 (12), 1795– 1810. Petersen, S. E., Robinson, D. L., & Keys, W. (1985). Pulvinar nuclei of the behaving rhesus monkey: Visual responses and their modulation. 
Journal of Neurophysiology, 54 (4), 867– 886. Pollatsek, A., Bolozky, S., Well, A. D., & Rayner, K. (1981). Asymmetries in the perceptual span for Israeli readers. Brain Language, 14 (1), 174–180. Polyak, S. L. (1941). The retina. Chicago: University of Chicago Press.

Page 28 of 32

Attention and Action Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32 (1), 3–25. Posner, M. I., Snyder, C. R., & Davidson, B. J. (1980). Attention and the detection of sig nals. Journal of Experimental Psychology, 109 (2), 160–174. Pylyshyn, Z. W., & Storm, R. W. (1988). Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Spatial Vision, 3 (3), 179–197. Quigley, H. A., Addicks, E. M., & Green, W. R. (1982). Optic nerve damage in human glau coma. III. Quantitative correlation of nerve fiber loss and visual field defect in glaucoma, ischemic neuropathy, papilledema, and toxic neuropathy. Archives of Ophthalmology, 100 (1), 135–146. Remington, R., & Pierce, L. (1984). Moving attention: Evidence for time-invariant shifts of visual selective attention. Perception & Psychophysics, 35 (4), 393–399. Rensink, R. A., O’Regan, J. K., & Clark, J. J. (1997). To see or not to see: The need for at tention to perceive changes in scenes. Psychological Science, 8 (5), 368–373. Reppa, I., & Leek, E. C. (2003). The modulation of inhibition of return by object-internal structure: Implications for theories of object-based attentional selection. Psychonomic Bulletin & Review, 10 (2), 493–502. Rizzolatti, G., Riggio, L., Dascola, I., & Umilta, C. (1987). Reorienting attention across the horizontal and vertical meridians: evidence in favor of a premotor theory of attention. Neuropsychologia, 25 (1A), 31–40. Robinson, D. A. (1972). Eye movements evoked by collicular stimulation in the alert mon key. Vision Research, 12 (11), 1795–1808. Robinson, D. L., Goldberg, M. E., & Stanton, G. B. (1978). Parietal association cortex in the primate: Sensory mechanisms and behavioral modulations. Journal of Neurophysiolo gy, 74, 698–721. Rossi, A. F., & Paradiso, M. A. (1995). Feature-specific effects of selective visual attention. Vision Research, 35 (5), 621–634. Saenz, M., Buracas, G. T., & Boynton, G. M. (2002). 
Global effects of feature-based atten tion in human visual cortex. Nature Neuroscience, 5 (7), 631–632. Saenz, M., Buracas, G. T., & Boynton, G. M. (2003). Global feature-based attention for mo tion and color. Vision Research, 43 (6), 629–637. Sato, T. R., & Schall, J. D. (2003). Effects of stimulus-response compatibility on neural se lection in frontal eye field. Neuron, 38 (4), 637–648. Scalf, P. E., & Beck, D. M. (2010). Competition in visual cortex impedes attention to multi ple items. Journal of Neuroscience, 30 (1), 161–169. Page 29 of 32

Attention and Action Schall, J. D., Hanes, D. P., Thompson, K. G., & King, D. J. (1995). Saccade target selection in frontal eye field of macaque. I. Visual and premovement activation. Journal of Neuro science, 15 (10), 6905–6918. Schiller, P. H., & Tehovnik, E. J. (2001). Look and see: How the brain moves your eyes about. Progress in Brain Research, 134, 127–142. Schlag, J., & Schlag-Rey, M. (1987). Evidence for a supplementary eye field. Journal of Neurophysiology, 57 (1), 179–200. Schneider, W. X., & Deubel, H. (1995). Visual attention and saccadic eye movements: Evi dence for obligatory and selective spatial coupling. In J. M. Findlay, R. Kentridge & R. Walker (Eds.), Eye-movement research: Mechanisms, processes, and applications (pp. 317–324). New York: Elsevier. Scholl, B. J. (2000). Attenuated change blindness for exogenously attended items in a flicker paradigm. Visual Cognition, 7 (1/2/3), 377–396. Serences, J. T., & Boynton, G. M. (2007). Feature-based attentional modulations in the ab sence of direct visual stimulation. Neuron, 55 (2), 301–312. Shepherd, M., Findlay, J. M., & Hockey, R. J. (1986). The relationship between eye move ments and spatial attention. Quarterly Journal of Experimental Psychology A, 38 (3), 475– 491. Shulman, G. L., Remington, R. W., & McLean, J. P. (1979). Moving attention through visual space. Journal of Experimental Psychology, 5 (3), 522–526. Simons, D. J. (2000). Current approaches to change blindness. Visual Cognition, 1/2/3, 1–15. (p. 272)

Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: Sustained inattentional blind ness for dynamic events. Perception, 28 (9), 1059–1074. Simons, D. J., & Levin, D. T. (1998). Failure to detect changes to people in real-world in teraction. Psychonomic Bulletin and Review, 5, 644–649. Simons, D. J., & Rensink, R. (2005). Change blindness: Past, present, and future. Trends in Cognitive Sciences, 9 (1), 16–20. Tanaka, M., Yoshida, T., & Fukushima, K. (1998). Latency of saccades during smooth-pur suit eye movement in man: Directional asymmetries. Experimental Brain Research, 121 (1), 92–98. Tipper, S. P., Driver, J., & Weaver, B. (1991). Object-centred inhibition of return of visual attention. Quarterly Journal of Experimental Psychology A, 43 (2), 289–298. Titchener, E. B. (1908). Lectures on the elementary psychology of feeling and attention. New York: Macmillan. Page 30 of 32

Attention and Action Tootell, R. B., Switkes, E., Silverman, M. S., & Hamilton, S. L. (1988). Functional anatomy of macaque striate cortex. II. Retinotopic organization. Journal of Neuroscience, 8 (5), 1531–1568. Torralbo, A., & Beck, D. M. (2008). Perceptual-load-induced selection as a result of local competitive interactions in visual cortex. Psychological Science, 19 (10), 1045–1050. Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12 (1), 97–136. Treue, S., & Martinez Trujillo, J. C. (1999). Feature-based attention influences motion pro cessing gain in macaque visual cortex. Nature, 399 (6736), 575–579. Tsotsos, J. K. (1988). A “complexity level” analysis of immediate vision. International Jour nal of Computer Vision, 1 (4), 303–320. Valdes-Sosa, M., Cobo, A., & Pinilla, T. (1998). Transparent motion and object-based at tention. Cognition, 66 (2), B13–B23. Van der Heijden, A. H. (1992). Selective attention in vision. London: Routledge. van Donkelaar, P. (1999). Spatiotemporal modulation of attention during smooth pursuit eye movements. NeuroReport, 10 (12), 2523–2526. van Donkelaar, P., & Drew, A. S. (2002). The allocation of attention during smooth pursuit eye movements. Progress in Brain Research, 140, 267–277. Vecera, S. P. (1994). Grouped locations and object-based attention: Comment on Egly, Dri ver, and Rafal (1994). Journal of Experimental Psychology: General, 123, 316–320. Vecera, S. P., & Farah, M. J. (1994). Does visual attention select objects or locations? Jour nal of Experimental Psychology General, 123 (2), 146–160. Ward, R., Goodrich, S., & Driver, J. (1994). Grouping reduces visual extinction: Neuropsy chological evidence for weight-linkage in visual selection. Visual Cognition, 1, 101–130. Wurtz, R. H., & Goldberg, M. E. (1972a). Activity of superior colliculus in behaving mon key. 3. Cells discharging before eye movements. Journal of Neurophysiology, 35 (4), 575– 586. Wurtz, R. 
H., & Goldberg, M. E. (1972b). Activity of superior colliculus in behaving mon key. IV. Effects of lesions on eye movements. Journal of Neurophysiology, 35 (4), 587–596. Yantis, S. (1992). Multielement visual tracking: Attention and perceptual organization. Cognitive Psychology, 24, 295–340. Yantis, S. (1998). Control of visual attention. In H. Pashler (Ed.), Attention (pp. 223–256). East Sussex, UK: Psychology Press.

Page 31 of 32

Attention and Action Yantis, S., & Jonides, J. (1984). Abrupt visual onsets and selective attention: Evidence from visual search. Journal of Experimental Psychology: Human Perception and Perfor mance, 10 (5), 601–621. Yeshurun, Y., & Carrasco, M. (1998). Attention improves or impairs visual performance by enhancing spatial resolution. Nature, 396 (6706), 72–75. Zohary, E., & Hochstein, S. (1989). How serial is serial processing in vision? Perception, 18 (2), 191–200.

George Alvarez

George A. Alvarez, Department of Psychology, Harvard University, Cambridge, MA


Visual Control of Action

Melvyn A. Goodale
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0014

Abstract and Keywords

The visual control of skilled actions, such as reaching and grasping, requires fundamentally different computations from those mediating our perception of the world. These differences in the computational requirements for vision for action and vision for perception are reflected in the organization of the two prominent visual streams of processing that arise from primary visual cortex in the primate brain. Although the ventral stream projecting to inferotemporal cortex mediates the visual processing underlying our visual experience of the world, the dorsal stream projecting to the posterior parietal cortex mediates the visual control of skilled actions. Specialized visual-motor modules have emerged in the posterior parietal cortex for the visual control of eye, hand, and arm movements. Although the identification of goal objects and the selection of an appropriate course of action depend on the perceptual machinery of the ventral stream and associated cognitive modules in the temporal and frontal lobes, the execution of the subsequent goal-directed action is mediated by dedicated online control systems in the dorsal stream and associated motor areas. Ultimately then, both streams work together in the production of goal-directed actions.

Keywords: action, visual-motor control, dorsal stream, ventral stream, two visual streams, grasping, reaching

Introduction

The visual control of movement is a central feature of almost all our daily activities from playing tennis to picking up our morning cup of coffee. But even though vision is essential for all these activities, only recently have vision scientists turned their attention to the study of visual-motor control. For most of the past 100 years, researchers have instead concentrated their efforts on working out how vision constructs our perception of the world. Psychophysics, not the study of visual-motor control, has been the dominant methodology (Goodale, 1983). Indeed, with the notable exception of eye movements, which have typically been regarded as an information-seeking adjunct to visual perception, little attention has been paid to the way in which vision is used to program and control our actions, particularly the movements of our hands and limbs. Nevertheless, in the past few decades, considerable progress has been made on this front. Enormous strides, for example, have been made in our understanding of the visual control of locomotion (see Patla, 1997; Warren & Fajen, 2004). But in this brief review, I focus largely on the visual control of reach-to-grasp movements, a class of behavior that is exceptionally well developed in humans, and one that exemplifies the importance of vision in the control of action. I begin by introducing research that has examined the visual cues that play a critical role in the control of reach-to-grasp movements. I then move on to a discussion of work on the neural substrates of that control, reviewing evidence that the visual pathways mediating visual-motor control are quite distinct from those supporting visual perception.

Visual Control of Reach-to-Grasp Movements

Humans are capable of reaching out and grasping objects with great dexterity, and vision plays a critical role in this important skill. Think for a moment about what happens when you perform the deceptively simple act of reaching out and picking up the cup of coffee sitting on your desk. After identifying your cup among all the other objects on your desk, you begin to reach out toward the cup, choosing a trajectory that avoids the telephone and the computer monitor. At the same time, your fingers begin to conform to the shape of the cup’s handle well before your hand makes contact with the cup. As your fingers curl around the handle, the initial forces that are generated as you lift the cup are finely tuned to its anticipated weight, and to your (implicit) predictions about the friction coefficients and compliance of the material from which the cup is made. Visual information is crucial at every stage of this behavior, but the cues that are used and the neural mechanisms that are engaged are quite different for each of the components involved.

Pioneering work by Jeannerod (1981, 1984, 1986, 1988) led to the idea that the reaching component of a grasping movement is relatively independent from the formation of the grip itself. Jeannerod showed that when a person reaches out to grasp an object, the size of the opening between the fingers and thumb is positively correlated with the size of the object: The bigger the object, the wider the grasp. This relationship can be clearly seen at the point of maximum grip aperture, which is achieved well before contact is made with the object (Figure 14.1). The velocity of the movement toward the object, however, is typically little affected by the size of the goal object. Instead, the peak velocity of the reach is more closely correlated with the distance of the object from the reaching hand: the farther the object, the faster the reaching movement. These results suggest that the reach and grip components of a manual prehension movement are generated by independent visual-motor channels, albeit ones that are temporally coupled.¹ This so-called dual-channel hypothesis has become the dominant model of human prehension.

Figure 14.1 Graph showing grip aperture (the distance between the index finger and thumb) changing over time as an individual reaches out to pick up objects of three different sizes. Notice that the maximum grip aperture, which is achieved about 70% of the way through the grasp, is strongly correlated with object size, even though the hand opens much wider in flight before closing down on the goal object. Adapted with permission from Jakobson & Goodale, 1991.
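The two kinematic measures discussed above, maximum grip aperture and peak reach velocity, can be illustrated with a short sketch. This is only an illustration under stated assumptions: it supposes that thumb, index-finger, and wrist markers have been sampled as 3D positions (in millimetres) at a fixed time step, and the function names, the data layout, and the toy numbers are all hypothetical rather than taken from the chapter or from any particular motion-capture system.

```python
import math

def grip_aperture(thumb, index):
    """Distance between thumb and index-finger markers at each sample."""
    return [math.dist(t, i) for t, i in zip(thumb, index)]

def speed(positions, dt):
    """Finite-difference speed of one marker between successive samples."""
    return [math.dist(a, b) / dt for a, b in zip(positions, positions[1:])]

def summarize_reach(thumb, index, wrist, dt):
    """Return the grip and reach measures for a single trial (assumed layout)."""
    aperture = grip_aperture(thumb, index)
    mga = max(aperture)
    mga_sample = aperture.index(mga)
    return {
        # Grip component: expected to scale with object size.
        "max_grip_aperture": mga,
        # Fraction of the movement elapsed when maximum aperture occurs.
        "mga_time_fraction": mga_sample / (len(aperture) - 1),
        # Reach component: expected to scale with object distance.
        "peak_wrist_speed": max(speed(wrist, dt)),
    }

# Toy trial (hypothetical values): the hand travels 300 mm along x
# while the aperture opens and then closes along y.
xs = [0, 50, 150, 250, 300]
apertures = [20, 60, 100, 110, 70]
thumb = [(x, 0, 0) for x in xs]
index = [(x, a, 0) for x, a in zip(xs, apertures)]
wrist = [(x, -50, 0) for x in xs]

print(summarize_reach(thumb, index, wrist, dt=0.1))
```

Under the dual-channel account, one would expect `max_grip_aperture` to vary mainly with object size across trials and `peak_wrist_speed` mainly with object distance; the sketch computes the measures only and does not test that claim.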

According to Jeannerod’s (1981) dual-channel account, the kinematics of the reach com ponent, whereby the hand is transported to the goal, are largely determined by visual cues that are extrinsic to the goal object, such as its distance and location with respect to the grasping hand. In contrast, the kinematics of the grasp component reflect the size, shape, and other intrinsic properties of the goal object. Even though later studies (e.g., Chieffi & Gentilucci, 1993; Jakobson & Goodale, 1991) showed that the visual control of the reach and grip components may be more intimately related than Jeannerod had origi nally proposed, there is broad consensus that the two components show a good deal of functional independence and (as will be discussed later) are mediated by relatively inde pendent neural circuitry. Jeannerod’s (1981) dual-channel hypothesis has not gone unchallenged. Smeets and Bren ner (1999, 2001, 2009), for example, have proposed instead that the movements of each finger of the grasping hand are programmed and controlled independently. According to this account, when a person reaches out to grasp an object with a precision grip, the in dex finger is directed to one side of the object and the thumb to the other. The apparent scaling of grip aperture to object size is nothing more than an emergent property of the fact that the two digits are moving (independently) toward their respective end points. Page 3 of 40

Visual Control of Action Moreover, because both digits are attached to the same limb, the so-called reach (or transport) component is simply the joint movement (p. 275) of the two digits toward the object. Simply put, it is location rather than size that drives grasping—and there is no need to separate grasping into transport and grip components, each sensitive to a differ ent set of visual cues. Smeets and Brenner’s (1999, 2001) double-pointing hypothesis has the virtue of being parsimonious. Nevertheless, it is not without its critics (e.g., Dubrowski, Bock, Carnahan, & Jungling, 2002; Mon-Williams & McIntosh, 2000; Mon-Williams & Tresilian, 2001; van de Kamp & Zaal, 2007). Van de Kamp and Zaal, for example, showed that when one side of an object was perturbed as a person reached out to grasp it, the trajectories of both digits were adjusted in flight, a result that would not be predicted by Smeets and Brenner’s model but one that is entirely consistent with Jeannerod’s (1981) dual-channel hypothesis. Even more important (as we shall see later), the organization of the neural substrates of grasping as revealed by neuroimaging and neuropsychology can be more easily explained by the dual-channel than the double-pointing hypothesis. Most studies of grasping, including those discussed above, have used rather unnatural situations in which the goal object is the only object present in the workspace. In the real world, of course, the workspace is usually cluttered with other objects, some of which could be potential obstacles for a goal-directed movement. Nevertheless, when people reach out to grasp an object, their hand and arm rarely collide with other objects in the workspace. The ease with which this is accomplished belies the fact that a sophisticated obstacle avoidance system must be at work—a system that encodes possible obstructions to a goal-directed movement and incorporates this information into the motor plan. 
The few investigations that have examined obstacle avoidance have revealed an efficient system that is capable of altering the spatial and temporal trajectories of goal-directed reaching and grasping movements to avoid other objects in the workspace in a fluid manner (e.g., Castiello, 2001; Jackson, Jackson, & Rosicky, 1995; Tresilian, 1998; Vaughan, Rosenbaum, & Meulenbroek, 2001). There has been some debate as to whether the nongoal objects are always being treated as obstacles or instead as potential targets for action (e.g., Tipper, Howard, & Jackson, 1997) or even as frames of reference for the control of the movement (e.g., Diedrichsen, Werner, Schmidt, & Trommershauser, 2004; Obhi & Goodale, 2005). By positioning nongoal objects in different locations in the workspace with respect to the target, however, it is possible to show that most often these objects are being treated as obstacles and that when individuals reach out for the goal, the trial-to-trial adjustments of their trajectories are remarkably sensitive to the position of obstacles both in depth and in the horizontal plane, as well as to their height (Chapman & Goodale, 2008, 2010). Moreover, the system behaves conservatively, moving the trajectory of the hand and arm away from nontarget objects, even when those objects are unlikely to interfere with the target-directed movement.

Part of the reason for the rapid growth of research into the visual control of manual prehension has been the development of reliable technologies for recording hand and limb movements in three dimensions. Expensive movie film was replaced with inexpensive videotape in the 1980s—and over the past 25 years, the use of accurate recording devices based on active or passive infrared markers, ultrasound, magnetism, instrumented gloves, and an array of other technologies has grown enormously. This has made possible the development of what might be termed “visual-motor psychophysics,” in which investigators are exploring the different visual cues that are used in the programming and control of grasping.

One of the most powerful sets of cues used by the visual-motor system in mediating grasping comes from binocular vision (e.g., Servos, Goodale, & Jakobson, 1992). Several studies, for example, have shown that covering one eye has clear detrimental effects on grasping (e.g., Keefe & Watt, 2009; Loftus, Servos, Goodale, Mendarozqueta, & Mon-Williams, 2004; Melmoth & Grant, 2006; Servos, Goodale, & Jakobson, 1992; Watt & Bradshaw, 2000). People reach more slowly, show longer periods of deceleration, and execute more online adjustments of both their trajectory and their grip during the closing phase of the grasp. Not surprisingly, then, adults with stereo deficiencies from amblyopia have been shown to exhibit slower and less accurate grasping movements (Melmoth, Finlay, Morgan, & Grant, 2009). Interestingly, however, individuals who have lost an eye are still able to grasp objects as accurately as normally sighted individuals who are using both eyes. It turns out that they do this by making use of monocular retinal motion cues generated by exaggerated head movements (Marotta, Perrot, Nicolle, Servos, & Goodale, 1995). The use of these self-generated motion cues appears to be learned: the longer the time between loss of the eye and testing, the more likely it is that these individuals will make unusually large vertical and lateral head movements during the execution of the grasp (Marotta, Perrot, Nicolle, & Goodale, 1995).
Computation of the required distance for the grasp has been shown to depend more on vergence than on retinal disparity cues, whereas the scaling of the grasping movement and the final placement of the fingers depend more on retinal disparity than on vergence (Melmoth, Storoni, Todd, Finlay, & Grant, 2007; Mon-Williams & Dijkerman, 1999). Similarly, motion parallax contributes more to the computation of reach distance than it does to the formation of the grasp, although motion parallax becomes important only when binocular cues are no longer available (Marotta, Kruyer, & Goodale, 1998; Watt & Bradshaw, 2003). (It should be noted that the differential contributions that these cues make to the reach and grasp components, respectively, are much more consistent with Jeannerod’s (1981) dual-channel hypothesis than they are with Smeets and Brenner’s (1999) double-pointing account.) But even when one eye is covered and the head immobilized, people are still able to reach out and grasp objects reasonably well, suggesting that static monocular cues can be used to program and control grasping movements. Marotta and Goodale (1998, 2001), for example, showed that pictorial cues, such as height in the visual scene and familiar size, can be exploited to program and control grasping—but the contributions from these static monocular cues are usually overshadowed by binocular information from vergence or retinal disparity, or both.


The role of the shape and orientation of the goal object in determining the formation of the grasp is poorly understood. It is clear that the posture of the grasping hand is sensitive to these features (e.g., Cuijpers, Smeets, & Brenner, 2004; Goodale et al., 1994b; van Bergen, van Swieten, Williams, & Mon-Williams, 2007), but there have been only a few systematic investigations of how information about object shape and orientation is used to configure the hand during the planning and execution of grasping movements (e.g., Cuijpers, Smeets, & Brenner, 2004; Lee, Crabtree, Norman, & Bingham, 2008; Louw, Smeets, & Brenner, 2007; van Mierlo, Louw, Smeets, & Brenner, 2009).

But understanding the cues that are used to program and control a grasping movement is only part of the story. To reach and grasp an object, one presumably has to direct one’s attention to that object as well as to other objects in the workspace that could be potential obstacles or alternative goals. Research on the deployment of overt and covert attention in reaching and grasping tasks has accelerated over the past two decades, and it has become clear that when vision is unrestricted, people shift their gaze toward the goal object (e.g., Ballard, Hayhoe, Li, & Whitehead, 1992; Johansson, Westling, Bäckström, & Flanagan, 2001) and to those locations on the object where they intend to place their fingers, particularly the points on the object where more visual feedback is required to position the fingers properly (e.g., Binsted, Chua, Helsen, & Elliott, 2001; Brouwer, Franz, & Gegenfurtner, 2009). In cluttered workspaces, people also tend to direct their gaze to obstacles that they might have to avoid (e.g., Johansson et al., 2001). Even when gaze is maintained elsewhere in the scene, there is evidence that attention is shifted covertly to the goal and is bound there until the movement is initiated (Deubel, Schneider, & Paprotta, 1998).
In a persuasive account of the role of attention in reaching and grasping, Baldauf and Deubel (2010) have argued that the planning of a reach-to-grasp movement requires the formation of what they call an “attentional landscape,” in which the locations of all the objects and features in the workspace that are relevant for the intended action are encoded. Interestingly, their model implies parallel rather than sequential deployment of attentional resources to multiple locations, a distinct departure from how attention is thought to operate in more perceptual-cognitive models of attention.

Finally, and importantly for the ideas that I discuss later in this review, it should be noted that the way in which different visual cues are weighted for the control of skilled movements is typically quite different from the way they are weighted for perceptual judgments. For example, Knill (2005) found that participants gave significantly more weight to binocular compared with monocular cues when they were asked to place objects on a slanted surface in a virtual display compared with when they were required to make explicit judgments about the slant. Similarly, Servos (2000) demonstrated that even though people relied much more on binocular than monocular cues when they grasped an object, their explicit judgments about the distance of the same object were no better under binocular than under monocular viewing conditions. These and other, even more dramatic dissociations that I review later underscore the fundamental differences between how vision is used for action and for perceptual report. In the next section, I offer a speculative account of the origins of vision before moving on to discuss the neural organization of the pathways supporting vision for action on the one hand and vision for perception on the other.


Neural Substrates of Vision for Action

Visual systems first evolved, not to enable animals to see, but rather to provide distal sensory control of their movements—so that they could direct movements with respect to objects that were some distance from the body. Vision as “sight” is a relative newcomer on the evolutionary landscape, but its emergence has enabled animals to carry out complex cognitive operations on visual representations of the world. Thus, vision in humans and nonhuman primates (and perhaps other animals as well) serves two distinct but interacting functions: (1) the perception of objects and their relations, which provides a visual foundation for the organism’s cognitive life and its conscious experience of the world, and (2) the control of actions directed at (or with respect to) those objects, in which separate motor outputs are programmed and controlled online. These competing demands on vision have shaped the organization of the visual pathways in the primate brain, particularly within the visual areas of the cerebral cortex.


Figure 14.2 The two streams of visual processing in human cerebral cortex. The retina sends projections to the dorsal part of the lateral geniculate nucleus (LGNd), which projects in turn to primary visual cortex. Within the cerebral cortex, the ventral stream arises from early visual areas and projects to the inferotemporal cortex. The dorsal stream also arises from early visual areas but projects instead to the posterior parietal cortex. Recently, it has been shown that the posterior parietal cortex also receives visual input from the pulvinar via projections to MT and V3, as well as from the interlaminar layers of LGNd via projections to MT (middle temporal area) and V3. The pulvinar receives projections from both the retina and from the superior colliculus (SC). The approximate locations of the two streams are shown on a three-dimensional reconstruction of the pial surface of the brain. The two streams involve a series of complex interconnections that are not shown. Adapted with permission from Goodale & Westwood, 2004.

Beyond the primary visual cortex in the primate cerebral cortex, visual information is conveyed to a bewildering number of extrastriate areas (Van Essen, 2001). Despite the complexity of the interconnections between these different areas, two broad “streams” of projections from primary visual cortex have been identified in the macaque monkey brain: a ventral stream projecting eventually to the inferotemporal cortex and a dorsal stream projecting to the posterior parietal cortex (Ungerleider & Mishkin, 1982) (Figure 14.2). Although some caution must be exercised in generalizing from monkey to human (Sereno & Tootell, 2005), recent neuroimaging evidence suggests that the visual projections from early visual areas to the temporal and parietal lobes in the human brain also involve a separation into ventral and dorsal streams (Culham & Valyear, 2006; Grill-Spector & Malach, 2004).

Traditional accounts of the division of labor between the two streams (e.g., Ungerleider & Mishkin, 1982) focused on the distinction between object vision and spatial vision. This distinction between what and where resonated not only with psychological accounts of perception that emphasized the role of vision in object recognition and spatial attention, but also with nearly a century of neurological thought about the functions of the temporal and parietal lobes in vision (Brown & Schäfer, 1888; Ferrier & Yeo, 1884; Holmes, 1918).

In the early 1990s, however, the what-versus-where story began to unravel as new evidence emerged from work with both monkeys and neurological patients. It became apparent that a purely perceptual account of ventral-dorsal function could not explain these findings. The only way to make sense of them was to consider the nature of the outputs served by the two streams—and to work out how visual information is eventually transformed into motor acts.

In 1992, Goodale and Milner proposed a reinterpretation of the Ungerleider and Mishkin (1982) account of the functional distinction between the two visual streams. According to the Goodale-Milner model, the dorsal stream plays a critical role in the real-time control of action, transforming moment-to-moment information about the location and disposition of objects into the coordinate frames of the effectors being used to perform the action (Goodale & Milner, 1992; Milner & Goodale, 2006, 2008). The ventral stream (together with associated cognitive networks outside the ventral stream) helps to construct the rich and detailed representations of the world that allow us to identify objects and events, attach meaning and significance to them, and establish their causal relations. Such operations are essential for accumulating and accessing a visual knowledge base about the world. Thus, it is the ventral stream that provides the perceptual foundation for the offline control of action, projecting action into the future and incorporating stored information from the past into the control of current actions. In contrast, processing in the dorsal stream does not generate visual percepts; it generates skilled actions (in part by modulating more ancient visual-motor modules in the midbrain and brainstem; see Goodale, 1996).
Some of the most compelling evidence for the division of labor proposed by Goodale and Milner (1992) has come from studies of the visual deficits observed in patients with damage to either the dorsal or ventral stream. It has been known for a long time, for example, that patients with lesions in the dorsal stream, particularly lesions of the superior regions of the posterior parietal cortex that invade the territory of the intraparietal sulcus and/or the parieto-occipital sulcus, can have problems using vision to direct a grasp or aiming movement toward the correct location of a visual target placed in different positions in the visual field, particularly the peripheral visual field. This deficit is often described as optic ataxia (following Bálint, 1909; Bálint & Harvey, 1995). But the failure to locate an object with the hand should not be construed as a problem in spatial vision; many of these patients, for example, can describe the relative position of the object in space quite accurately, even though they cannot direct their hand toward it (Perenin & Vighetto, 1988). Moreover, sometimes the deficit will be seen in one hand but not the other. (It should be pointed out, of course, that these patients typically have no difficulty using input from other sensory systems, such as proprioception or audition, to guide their movements.) Some of these patients are unable to use visual information to rotate their hand, scale their grip, or configure their fingers properly when reaching out to pick up an object, even though they have no difficulty describing the orientation, size, or shape of objects in that part of the visual field (Goodale et al., 1994; Jakobson, Archibald, Carey, & Goodale, 1991; Figure 14.3A). Clearly, a “disorder of spatial vision” (Holmes, 1918; Ungerleider & Mishkin, 1982) fails to capture this range of visual-motor impairments. Instead, this pattern of deficits suggests that the posterior parietal cortex plays a critical role in the visual control of skilled actions (for a more detailed discussion, see Milner & Goodale, 2006).

Figure 14.3 Graphs showing the size of the aperture between the index finger and thumb during object-directed grasping and manual estimates of object width for RV, a patient with optic ataxia, and DF, a patient with visual form agnosia. A, RV was able to indicate the size of the objects reasonably well (individual trials marked as open diamonds), but her maximal grip aperture in flight was not well tuned. She simply opened her hand as wide as possible on every trial. B, In contrast, DF showed excellent grip scaling, opening her hand wider for the 50-mm wide object than for the 25-mm wide object. DF’s manual estimates of the width of the two objects, however, were grossly inaccurate and showed enormous variability from trial to trial.

The opposite pattern of deficits and spared abilities can be seen in patients with visual agnosia. Take the case of patient DF, who developed a profound visual form agnosia following carbon monoxide poisoning (Goodale et al., 1991; Milner et al., 1991). Although magnetic resonance imaging (MRI) showed evidence of diffuse damage consistent with hypoxia, most of the damage was evident in ventrolateral regions of the occipital cortex, with V1 remaining largely spared. Even though DF’s “low-level” visual abilities are reasonably intact, she can no longer recognize everyday objects or the faces of her friends and relatives; nor can she identify even the simplest of geometric shapes. (If an object is placed in her hand, of course, she has no trouble identifying it by touch.) Remarkably, however, DF shows strikingly accurate guidance of her hand movements when she attempts to pick up the very objects she cannot identify. Thus, when she reaches out to grasp objects of different sizes, her hand opens wider mid-flight for larger objects than it does for smaller ones, just like it does in people with normal vision (Figure 14.3B). Similarly, she rotates her hand and wrist quite normally when she reaches out to grasp objects in different orientations, and she places her fingers correctly on the surface of objects with different shapes (Goodale et al., 1994). At the same time, she is quite unable to distinguish between any of these objects when they are presented to her in simple discrimination tests. She even fails in manual “matching” tasks, in which she is asked to show how wide an object is by opening her index finger and thumb a corresponding amount. DF’s spared visual-motor skills are not limited to grasping. She can step over obstacles during locomotion as well as controls, even though her perceptual judgments about the height of these obstacles are far from normal. Contrary to what would be predicted from the what-versus-where hypothesis, then, a profound loss of form perception coexists in DF with a preserved ability to use form in guiding a broad range of actions. Such a dissociation, of course, is consistent with the idea that there are separate neural pathways for transforming incoming visual information for the perceptual representation of the world and for the control of action. Presumably, it is the former and not the latter that is compromised in DF (for more details, see Goodale & Milner, 2004; Milner & Goodale, 2006).

But where exactly is the damage in DF’s brain? As already mentioned, an early structural MRI showed evidence of extensive bilateral damage in the ventrolateral occipital cortex. A more recent high-resolution MRI scan confirmed this damage but revealed that the lesions were focused in a region of the lateral occipital cortex (area LO) that we now know is involved in the visual recognition of objects, particularly their geometric structure (James, Culham, Humphrey, Milner, & Goodale, 2003). It would appear that this selective damage to area LO has disrupted DF’s ability to perceive the form of objects. These lesions have not interfered with her ability to use visual information about form to shape her hand when she reaches out and grasps objects—presumably because the visual-motor networks in her dorsal stream are largely spared.

Since the original work on DF, other patients with ventral stream damage have been identified who show strikingly similar dissociations between vision for perception and vision for action.
Patient SB, who suffered severe bilateral damage to his ventral stream early in life, shows remarkably preserved visual-motor skills (he plays table tennis and can ride a motorcycle) despite having profound deficits in his ability to identify objects, faces, colors, visual texture, and words (Dijkerman, Lê, Démonet, & Milner, 2004; Lê, Cardebat et al., 2002). Recently, another patient, who sustained bilateral damage to the ventral stream following a stroke, was tested on several of the same tests that we gave to DF more than a decade ago. Remarkably, this new patient (JS) behaved almost identically to DF: in other words, despite his inability to perceive the shape and orientation of objects, he was able to use these same object features to program and control grasping movements directed at those objects (Karnath, Rüter, Mandler, & Himmelbach, 2009).

Finally, it is worth noting that if one reads the early clinical reports of patients with visual form agnosia, one can find a number of examples of what appear to be spared visual-motor skills in the face of massive deficits in form perception. Campion (1987), for example, reports that patient RC, who showed a profound visual form agnosia after carbon monoxide poisoning, “could negotiate obstacles in the room, reach out to shake hands and manipulate objects or [pick up] a cup of coffee.”

Thus, the pattern of visual deficits and spared abilities in DF (and in SB, JS, and other patients with visual form agnosia) is in many ways the mirror image of that observed in the optic ataxia patients described earlier. DF, who has damage in her ventral stream, can reach out and grasp objects whose form and orientation she does not perceive, whereas patients with optic ataxia, who have damage in their dorsal stream, are unable to use vision to guide their reaching or grasping movements to objects whose form and orientation they perceive. This “double dissociation” cannot be easily accommodated within the traditional what-versus-where account but is entirely consistent with the division of labor between perception and action proposed by Goodale and Milner (1992). It should be noted that the perception–action model is also supported by a wealth of anatomical, electrophysiological, and lesion studies in the monkey too numerous to review here (for recent reviews, see Andersen & Buneo, 2003; Cohen & Andersen, 2002; Milner & Goodale, 2006; Tanaka, 2003). But perhaps some of the most convincing evidence for the perception–action proposal has come from functional magnetic resonance imaging (fMRI) studies of the dorsal and ventral streams in the human brain.

Neuroimaging Evidence for Two Visual Streams

As the organization of the human visual system beyond V1 began to be revealed with the advent of fMRI (Menon et al., 1992; Ogawa et al., 1992), it soon became apparent that there was a remarkable correspondence between the layout of extrastriate visual areas in monkeys and humans, including the separation of these areas into dorsal and ventral streams (Tootell, Tsao, & Vanduffel, 2003; Van Essen et al., 2001). In the ventral stream, regions have been identified that seem to be selectively responsive to different categories of visual stimuli. Early on, an area was isolated within the ventrolateral part of the occipital cortex (area LO) that appears to be involved in object recognition (for review, see Grill-Spector, 2003). As mentioned earlier, DF has bilateral lesions in the ventral stream that include area LO in both hemispheres.

Not surprisingly therefore, an fMRI investigation of activity in DF’s brain revealed no differential activation for line drawings of common objects (vs. scrambled versions) anywhere in DF’s remaining ventral stream, mirroring her poor performance in identifying the objects depicted in the drawings (James et al., 2003) (Figure 14.4). Again, this strongly suggests that area LO is essential for form perception, generating the geometrical structure of objects by combining information about edges and surfaces that has already been extracted from the visual array by low-level visual feature detectors.

In addition to LO, other ventral stream areas have been identified that code for faces, human body parts, and places or scenes (for review, see Milner & Goodale, 2006). Although there is a good deal of debate about whether these areas are really category specific (e.g., Downing, Chan, Peelen, Dodds, & Kanwisher, 2006; Kanwisher, 2006) or instead are particular nodes in a highly distributed system (e.g., Cant, Arnott, & Goodale, 2009; Cant & Goodale, 2007; Haxby et al., 2001; Op de Beeck, Haushofer, & Kanwisher, 2008), the neuroimaging work continues to provide strong support for the idea that the ventral stream plays the major role in constructing our perceptual representation of the world. Indeed, processing within ventral stream areas, such as area LO, exhibits exactly the characteristics that one might expect to see in such a system. For example, LO shows selective activation for objects irrespective of whether the objects are defined by differences in motion, texture, or luminance contrast (Grill-Spector, Kushnir, Edelman, Itzchak, & Malach, 1998). Moreover, LO also appears to code the overall geometric shape of an object rather than simply its local contours (Kourtzi & Kanwisher, 2001). Although there is evidence that area LO shows some sensitivity to changes in object viewpoint (Grill-Spector et al., 1999), at least part of area LO appears to be largely insensitive to such changes and treats different views of the same object as equivalent (James, Humphrey, Gati, Menon, & Goodale, 2002; Valyear, Culham, Sharif, Westwood, & Goodale, 2006). Taken together, the neuroimaging work on the human ventral stream reinforces the idea that this set of pathways plays a fundamental role in constructing our perceptual representations of the world.

Figure 14.4 Neuroimaging in DF’s ventral stream. A, A right lateral view of DF’s brain, with the lesion in area LO marked in blue (the lesion is also in the left hemisphere). B, fMRI activation for line drawings (vs. scrambled drawings) plotted on a horizontal section through DF’s brain at the level of the red line on panel A. DF shows no selective activation for line drawings either in area LO or in neighboring areas. C, A control subject shows robust activation to the same drawings. The activation in the control subject’s brain, which has been mathematically morphed onto DF’s brain, coincides well with her LO lesions. Adapted with permission from James et al., 2003.

Just as was the case for visual-perceptual areas in the ventral stream, the development of fMRI has led to the discovery in the human dorsal stream of visual-motor areas that appear to be largely homologous with those in the monkey brain (for reviews, see Castiello, 2005; Culham & Kanwisher, 2001; Culham & Valyear, 2006). Early on, an area in the intraparietal sulcus of the posterior parietal cortex was identified that appeared to be activated when subjects shifted their gaze (or their covert attention) to visual targets. This area is thought by many investigators to be homologous with an area on the lateral bank of the intraparietal sulcus (area LIP) in the monkey that has been similarly associated with the visual control of eye movements and attention (for reviews, see Andersen & Buneo, 2003; Bisley & Goldberg, 2010), although in the human brain it is located more medially in the intraparietal sulcus (Culham, Cavina-Pratesi, & Singhal, 2006; Grefkes & Fink, 2005; Pierrot-Deseilligny, Milea, & Muri, 2004).

A region in the anterior part of the intraparietal sulcus has been identified that is consistently activated when people reach out and grasp visible objects in the scanner (Binkofski et al., 1998; Culham, 2004; Culham et al., 2003). This area has been called human AIP (hAIP) because it is thought to be homologous with a region in the anterior intraparietal sulcus of the monkey that has also been implicated in the visual control of grasping (for review, see Sakata, 2003). Several more recent studies have also shown that hAIP is differentially activated during visually guided grasping (e.g., Cavina-Pratesi, Goodale, & Culham, 2007; Frey, Vinton, Norlund, & Grafton, 2005). Importantly, area LO in the ventral stream is not activated when subjects reach out and grasp objects (Cavina-Pratesi, Goodale, & Culham, 2007; Culham, 2004; Culham et al., 2003), suggesting that this object-recognition area in the ventral stream is not required for the programming and control of visually guided grasping and that hAIP and associated networks in the posterior parietal cortex (in association with premotor and motor areas) can do this independently. This conclusion is considerably strengthened by the fact that patient DF, who has large bilateral lesions of area LO, shows robust differential activation in area hAIP for grasping (compared with reaching), similar to that seen in healthy subjects (James et al., 2003) (Figure 14.5).

Figure 14.5 Neuroimaging in DF’s dorsal stream. Even though her cerebral cortex shows evidence of widespread degenerative change as a result of hypoxia, there is still robust differential activation for grasping versus reaching in a region of the intraparietal sulcus that corresponds to hAIP. On individual trials in the brain scanner, DF was instructed to grasp the shape presented on the rotating drum or, in a control condition, to simply reach out and touch it with her knuckles. Adapted with permission from James et al., 2003.

There is preliminary fMRI evidence to suggest that when binocular information is available for the control of grasping, dorsal stream structures can mediate this control without any additional activity in ventral stream areas such as LO. But when only monocular vision is available, and reliance on pictorial cues becomes more critical, activation increases in LO along with increased activation in hAIP (Verhagen, Dijkerman, Grol, & Toni, 2008). This observation is consistent with the psychophysical work reviewed earlier showing that binocular vision plays the major role in the programming and control of manual prehension—and helps to explain why DF has great difficulty grasping objects under monocular viewing conditions (Marotta, Behrmann, & Goodale, 1997).

But what about the visual control of reaching? As we saw earlier, the lesions associated with the misreaching that defines optic ataxia have been typically found in the posterior parietal cortex, including the intraparietal sulcus, and sometimes extending into the inferior or superior parietal lobules (Perenin & Vighetto, 1988). More recent quantitative analyses of the lesion sites associated with misreaching have revealed several key foci in the parietal cortex, including the medial occipital-parietal junction, the superior occipital gyrus, the intraparietal sulcus, and the superior parietal lobule as well as parts of the inferior parietal lobule (Karnath & Perenin, 2005). As it turns out, these lesion sites map nicely onto the patterns of activation found in a recent fMRI study of visually guided reaching that showed reach-related activation both in a medial part of the intraparietal sulcus (near the intraparietal lesion site identified by Karnath and Perenin) and in the medial occipital-parietal junction (Prado, Clavagnier, Otzenberger, Scheiber, & Perenin, 2005). A more recent study found that the reach-related focus in the medial intraparietal sulcus was equally active for reaches with and without visual feedback, whereas an area in the superior parietal occipital cortex (SPOC) was particularly active when visual feedback was available (Filimon, Nelson, Huang, & Sereno, 2009).
This suggests that the medial intraparietal region may reflect proprioceptive more than visual control of reaching, whereas the SPOC may be more involved in visual control. Both these areas have also been implicated in the visual control of reaching in the monkey (Andersen & Buneo, 2003; Fattori, Gamberini, Kutz, & Galletti, 2001; Snyder, Batista, & Andersen, 1997).

There is evidence to suggest that SPOC may play a role in some aspects of grasping, particularly wrist rotation (Grol et al., 2007; Monaco, Sedda, Fattori, Galletti, & Culham, 2009), a result that mirrors recent findings on homologous areas in the medial occipital-parietal cortex of the monkey (Fattori, Breveglieri, Amoroso, & Galletti, 2004; Fattori et al., 2010). But at the same time, it seems clear from the imaging data that more anterior parts of the intraparietal sulcus, such as hAIP, play a unique role in visually guided grasping and appear not to be involved in the visual control of reaching movements. Moreover, patients with lesions of hAIP have deficits in grasping but retain the ability to reach toward objects (Binkofski et al., 1998), whereas other patients with lesions in more medial and posterior areas of the parietal lobe, including the SPOC, show deficits in reaching but not grip scaling (Cavina-Pratesi, Ietswaart, Humphreys, Lestou, & Milner, 2010). The identification of areas in the human posterior parietal cortex for the visual control of reaching that are anatomically distinct from those implicated in the visual control of grasping, particularly the scaling of grip aperture, lends additional support to Jeannerod’s (1981) proposal that the transport and grip components of reach-to-grasp movements are programmed and controlled relatively independently. None of these observations, however, can be easily accommodated within the double-pointing hypothesis of Smeets and Brenner (1999).

As mentioned earlier, not only are we adept at reaching out and grasping objects, but we are also able to avoid obstacles that might potentially interfere with our reach. Although to date there is no neuroimaging evidence about where in the brain the location of obstacles is coded, there is persuasive neuropsychological evidence for a dorsal stream locus for this coding. Thus, unlike healthy control subjects, patients with optic ataxia from dorsal stream lesions do not automatically alter the trajectory of their grasp to avoid obstacles located to the left and right of the path of their hand as they reach out to touch a target beyond the obstacles, even though they certainly see the obstacles and can indicate the midpoint between them (Schindler et al., 2004). Conversely, patient DF shows normal avoidance of the obstacles in the same task, even though she is deficient at indicating the midpoint between the two obstacles (Rice et al., 2006).

But where is the input to all these visual-motor areas in the dorsal stream coming from? Although it is clear that V1 has prominent projections to the motion processing area MT and other areas that provide input to dorsal stream networks, it has been known for a long time that humans (and monkeys) with large bilateral lesions of V1 are still capable of performing many visually guided actions despite being otherwise blind with respect to the controlling stimuli (for review, see Milner & Goodale, 2006; Weiskrantz, 1997). These residual visual abilities, termed “blindsight” by Sanders et al. (1974), presumably depend on projections that must run outside of the geniculostriate pathway, such as those going from the eye to the superior colliculus, the interlaminar layers of the dorsal lateral geniculate nucleus, or even directly to the pulvinar (for review, see Cowey, 2010).
Some of these extra-geniculate projections may also reach visual-motor networks in the dorsal stream. It has recently been demonstrated, for example, that a patient with a complete lesion of V1 in the right hemisphere was still capable of avoiding obstacles in his blind left hemifield while reaching out to touch a visual target in his sighted right field (Striemer, Chapman, & Goodale, 2009). The avoidance of obstacles in this kind of task, as we have already seen, appears to be mediated by visual-motor networks in the dorsal stream (Rice et al., 2006; Schindler et al., 2004). Similarly, there are indications that such patients show some grip scaling when they reach out and grasp objects placed in their blind field (Perenin & Rossetti, 1996). This residual ability also presumably depends on dorsal stream networks that are being accessed by extra-geniculostriate pathways. There is increasing evidence that projections from the superior colliculus to the pulvinar, and from there to MT and area V3, may be the relay whereby visual inputs reach the visual-motor networks in the dorsal stream (e.g., Lyon, Nassi, & Callaway, 2010). Some have even suggested that a direct projection from the eye to the pulvinar, and then to MT, might be responsible (Warner, Goldshmit, & Bourne, 2010). But whatever the pathways might be, it is clear that the visual-motor networks in the dorsal stream that are known to mediate grasping and obstacle avoidance during reaching are receiving visual input that bypasses V1.


In summary, the neuropsychological and neuroimaging data that have been amassed over the past 25 years suggest that vision for action and vision for perception depend on different and relatively independent visual pathways in the primate brain. In short, the visual signals that give us the percept of our coffee cup sitting on the breakfast table are not the same ones that guide our hand as we reach out to pick it up!

Although I have focused almost entirely on the role of the dorsal stream in the control of action, it is important to emphasize that the posterior parietal cortex also plays a critical role in the deployment of attention, as well as in other high-level cognitive tasks, such as numeracy and working memory. Even so, a strong argument can be made that these functions of the dorsal stream (and associated networks in premotor cortex and more inferior parietal areas) grew out of a pivotal role that the dorsal stream plays in the control of eye movements and goal-directed limb movements (for more on these issues, see Moore, 2006; Nieder & Dehaene, 2009; Rizzolatti & Craighero, 1998; Rizzolatti, Riggio, Dascola, & Umiltá, 1987).

Different Neural Computations for Perception and Action

Although the evidence from a broad range of empirical studies points to the fact that there are two relatively independent visual pathways in the primate cerebral cortex, the question remains as to why two separate systems evolved in the first place. Why couldn’t one “general purpose” visual system handle both vision for perception and vision for action? The answer to this question lies in the differences in the computational requirements of vision for perception on the one hand and vision for action on the other. Consider the coffee cup example introduced earlier. To be able to grasp the cup successfully, the visual-motor system has to deal with the actual size of the cup and its orientation and position with respect to the hand you intend to use to pick it up. These computations need to reflect the real metrics of the world, or at the very least, make use of learned “look-up tables” that link neurons coding a particular set of sensory inputs with neurons that code the desired state of the limb (Thaler & Goodale, 2010). The time at which these computations are performed is equally critical. Observers and goal objects rarely stay in a static relationship with one another and, as a consequence, the egocentric location of a target object can often change radically from moment to moment. In other words, the required coordinates for action need to be computed at the very moment the movements are performed.
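To make the phrase “real metrics of the world” concrete, the short sketch below works through the standard geometry that relates an object's physical width to the visual angle it subtends at a given viewing distance. It is an illustration only: the function names, the 8 cm cup, and the two viewing distances are hypothetical values chosen for the example, and nothing here is meant as a claim about how dorsal stream networks actually carry out the computation.

```python
import math


def visual_angle_deg(size_m: float, distance_m: float) -> float:
    """Visual angle (degrees) subtended by a frontoparallel object."""
    return math.degrees(2.0 * math.atan(size_m / (2.0 * distance_m)))


def physical_size_m(angle_deg: float, distance_m: float) -> float:
    """Invert the relation: recover physical width from angle and distance."""
    return 2.0 * distance_m * math.tan(math.radians(angle_deg) / 2.0)


# Hypothetical example: a cup 8 cm wide viewed from 40 cm and from 80 cm.
cup_width_m = 0.08
for distance_m in (0.40, 0.80):
    angle = visual_angle_deg(cup_width_m, distance_m)
    recovered = physical_size_m(angle, distance_m)
    print(f"distance={distance_m:.2f} m  angle={angle:.2f} deg  "
          f"recovered width={recovered:.3f} m")
```

The retinal (angular) size of the cup changes with distance, but combining it with an accurate distance estimate returns the same physical width each time, which is the quantity a grasp has to be scaled to.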


Figure 14.6 The effect of a size-contrast illusion on perception and action. A, The traditional Ebbinghaus illusion in which the central circle in the annulus of larger circles is typically seen as smaller than the central circle in the annulus of smaller circles, even though both central circles are actually the same size. B, The same display, except that the central circle in the annulus of larger circles has been made slightly larger. As a consequence, the two central circles now appear to be the same size. C, A three-dimensional (3D) version of the Ebbinghaus illusion. Participants are instructed to pick up one of the two 3D disks placed either on the display shown in panel A or the display shown in panel B. D, Two trials with the display shown in panel B, in which the participant picked up the small disk on one trial and the large disk on another. Even though the two central disks were perceived as being the same size, the grip aperture in flight reflected the real, not the apparent, size of the disks. Adapted with permission from Aglioti et al., 1995.

In contrast to vision for action, vision for perception does not need to deal with the absolute size of objects or their egocentric locations. In fact, very often such computations would be counterproductive because our viewpoint with respect to objects does not remain constant, even though our perceptual representations of those objects do show constancy. Indeed, one can argue that it would be better to encode the size, orientation, and location of objects relative to each other. Such a scene-based frame of reference permits a perceptual representation of objects that transcends particular viewpoints, while preserving information about spatial relationships (as well as relative size and orientation) as the observer moves around. The products of perception also need to be available over a much longer time scale than the visual information used in the control of action. We may need to recognize objects we have seen minutes, hours, days, or even years before. To achieve this, the coding of the visual information has to be somewhat abstract, transcending particular viewpoints and viewing conditions. By working with perceptual representations that are object or scene based, we are able to maintain the constancies of size, shape, color, lightness, and relative location, over time and across different viewing conditions. Although there is much debate about the way in which this information is coded, it is pretty clear that it is the identity of the object and its location within the scene, not its disposition with respect to the observer, that is of primary concern to the perceptual system. In fact, current perception, combined with stored information about previously encountered objects, not only facilitates object recognition but also contributes to the control of goal-directed movements when we are working in offline mode (i.e., controlling our movements, not in real time, but rather on the basis of the memory of goal objects that are no longer visible and their remembered locations in the world).

The differences in the metrics and frames of reference used by vision for perception and vision for action have been demonstrated in normal observers in experiments that have made use of pictorial illusions, particularly size-contrast illusions. Aglioti, DeSouza, and Goodale (1995), for example, showed that the scaling of grip aperture in flight was remarkably insensitive to the Ebbinghaus illusion, in which a target disk surrounded by smaller circles appears to be larger than the same disk surrounded by larger circles. They found that maximum grip aperture was scaled to the real, not the apparent, size of the target disk (Figure 14.6). A similar dissociation between grip scaling and perceived size was reported by Haffenden and Goodale (1998), under conditions where participants had no visual feedback during the execution of grasping movements made to targets presented in the context of an Ebbinghaus illusion. Although grip scaling escaped the influence of the illusion, the illusion did affect performance in a manual matching task, a kind of perceptual report, in which participants were asked to open their index finger and thumb to indicate the perceived size of a disk.
(This measure is akin to the typical magnitude estimation paradigms used in conventional psychophysics, but with the virtue that the manual estimation makes use of the same effector that is used in the grasping task.) To summarize, then, the aperture between the finger and thumb was resistant to the illusion when the vision-for-action system was engaged (i.e., when the participant grasped the target) and sensitive to the illusion when the vision-for-perception system was engaged (i.e., when the participant estimated its size).

This dissociation between what people do and what they say they see underscores the differences between vision for action and vision for perception. The obligatory size-contrast effects that give rise to the illusion (in which different elements of the array are compared) presumably play a crucial role in scene interpretation, a central function of vision for perception. But the execution of a goal-directed act, such as manual prehension, requires computations that are centered on the target itself, rather than on the relations between the target and other elements in the scene. In fact, the true size of the target for calibrating the grip can be computed from the retinal-image size of the object coupled with an accurate estimate of distance. Computations of this kind, which do not take into account the relative difference in size between different objects in the scene, would be expected to be quite insensitive to the kinds of pictorial cues that distort perception when familiar illusions are presented.

The initial demonstration by Aglioti et al. (1995) that grasping is refractory to the Ebbinghaus illusion engendered a good deal of interest among researchers studying vision and motor control, and there have been numerous investigations of the effects (or not) of pictorial illusions on visual-motor control. Some investigators have replicated the original observations of Aglioti et al. with the Ebbinghaus illusion (e.g., Amazeen & DaSilva, 2005; Fischer, 2001; Kwok & Braddick, 2003), and others have observed a similar insensitivity of grip scaling to the Ponzo illusion (Brenner & Smeets, 1996; Jackson & Shaw, 2000), the horizontal-vertical illusion (Servos, Carnahan, & Fedwick, 2000), the Müller-Lyer illusion (Dewar & Carey, 2006), and the diagonal illusion (Stöttinger & Perner, 2006; Stöttinger, Soder, Pfusterschmied, Wagner, & Perner, 2010). Others have reported that pictorial illusions affect some aspects of motor control but not others (e.g., Biegstraaten et al., 2007; Daprati & Gentilucci, 1997; Gentilucci et al., 1996; Glazebrook, de Grave, Brenner, & Smeets, 2005; van Donkelaar, 1999). And a few investigators have found no dissociation whatsoever between the effects of pictorial illusions on perceptual judgments and the scaling of grip aperture (e.g., Franz et al., 2000; Franz, Bülthoff & Fahle, 2003).

Demonstrating that actions such as grasping are sometimes sensitive to illusory displays is not by itself a refutation of the idea of two visual systems. One should not be surprised that visual perception and visual-motor control can interact in the normal brain. Ultimately, after all, perception has to affect our actions or the brain mechanisms mediating perception would never have evolved! The real surprise, at least for monolithic accounts of vision, is that there are clear instances when visually guided action is apparently unaffected by pictorial illusions, which, by definition, affect perception. But from the standpoint of the duplex perception–action model, such instances are to be expected (see Goodale, 2008; Milner & Goodale, 2006, 2008).
Nevertheless, the fact that action has been found to be affected by pictorial illusions in some instances has led a number of authors to argue that the earlier studies demonstrating a dissociation had not adequately matched action and perception tasks for various input, attentional, and output demands (e.g., Smeets & Brenner, 2001; Vishton & Fabre, 2003), and that when these factors are taken into account, the apparent differences between perceptual judgments and motor control could be resolved without invoking the idea of two visual systems. Other authors, notably Glover (2004), have argued that action tasks involve multiple stages of processing from purely perceptual to more “automatic” visual-motor control. According to his planning/control model, illusions would be expected to affect the early but not the late stages of a grasping movement (Glover 2004; Glover & Dixon 2001a, 2001b).

Some of these competing accounts, such as Glover’s (2004) planning/control model, can be viewed simply as modifications of the original perception–action model, but there are a number of other studies in which the results cannot easily be reconciled with the two visual systems model, and it remains a real question as to why actions appear to be sensitive to illusions in some experiments but not in others. But as it turns out, there are several reasons why grip aperture might appear to be sensitive to illusions under certain testing conditions, even when it is not. In some cases, notably the Ebbinghaus illusion, the flanker elements can be treated as obstacles, influencing the posture of the fingers during the execution of the grasp (de Grave et al., 2005; Haffenden, Schiff, & Goodale, 2001; Plodowski & Jackson, 2001). In other words, the apparent effect of the illusion on grip scaling in some experiments might simply reflect the operation of visual-motor mechanisms that treat the flanker elements of the visual arrays as obstacles to be avoided. Another critical variable is the timing of the grasp with respect to the presentation of the stimuli. When targets are visible during the programming of a grasping movement, maximum grip aperture is usually not affected by size-contrast illusions, whereas when vision is occluded before the command to initiate programming of the movement is presented, a reliable effect of the illusion on grip aperture is typically observed (Westwood, Heath, & Roy, 2000; Westwood & Goodale, 2003; Fischer, 2001; Hu & Goodale, 2000). As discussed earlier, vision for action is designed to operate in real time and is not normally engaged unless the target object is visible during the programming phase, when (bottom-up) visual information can be immediately converted into the appropriate motor commands. The observation that (top-down) memory-guided grasping is affected by the illusory display reflects the fact that the stored information about the target’s dimensions was originally derived from the earlier operation of vision for perception (for a more detailed discussion of these and related issues, see Bruno, Bernadis, & Gentilucci, 2008; Goodale, Westwood, & Milner, 2004).

Nevertheless, some have argued that if the perceptual and grasping tasks are appropriately matched, then grasping can be shown to be as sensitive to size-contrast illusions as psychophysical judgments (Franz, 2001; Franz et al., 2000). Although this explanation, at least on the face of it, is a compelling one, it cannot explain why Aglioti et al. (1995) and Haffenden and Goodale (1998) found that when the relative sizes of the two target objects in the Ebbinghaus display were adjusted so that they appeared to be perceptually identical, the grip aperture that participants used to pick up the two targets continued to reflect the physical difference in their size.
An experiment by Ganel, Tanzer, and Goodale (2008b) provides evidence that is even more difficult to explain away by appealing to a failure to match testing conditions and other task-related variables. In this experiment, which used a version of the Ponzo illusion, a real difference in size was pitted against a perceived difference in size in the opposite direction (Figure 14.7). The results were remarkably clear. Despite the fact that people believed that the shorter object was the longer one (or vice versa), their in-flight grip aperture reflected the real, not the illusory, size of the target objects (Figure 14.8). In other words, on the same trials in which participants erroneously decided that one object was the longer (or shorter) of the two, the anticipatory opening between their fingers reflected the real direction and magnitude of size differences between the two objects. Moreover, the subjects in this experiment showed the same differential scaling to the real size of the objects whether the objects were shown on the illusory display or on the control display. Not surprisingly, when subjects were asked to use their finger and thumb to estimate the size of the target objects rather than pick them up, their manual estimates reflected the apparent, not the real, size of the targets. Overall, these results underscore once more the profound difference in the way visual information is transformed for action and perception. Importantly, too, the results are difficult to reconcile with any argument that suggests that grip aperture is sensitive to illusions, and that the absence of an effect found in many studies is simply a consequence of differences in the task demands (Franz, 2001; Franz et al., 2000).

One exceptionally interesting (and controversial) finding with respect to differences in the computations used by vision for perception and vision for action is the recent demonstration that grip scaling, unlike manual estimates of object size, does not appear to obey Weber’s law (Ganel, Chajut, & Algom, 2008a). In other words, when people estimated the size of an object (either by adjusting a comparison line on a computer screen or by making a manual estimate), the Just Noticeable Difference (JND) increased with physical size in accord with Weber’s law; but when they reached out and picked up the object, the JND, as indicated by differences in grip aperture, was unaffected by variations in the size of the object. This surprising finding would appear to suggest that Weber’s law is violated for visually guided actions, reflecting a fundamental difference in the way that object size is computed for action and for perception.
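The contrast between the two patterns can be stated compactly. Under Weber’s law the JND is a constant fraction of stimulus magnitude, so it grows in proportion to object size; the pattern described for grasping corresponds instead to a JND that stays flat as size changes. The sketch below simply tabulates those two idealized predictions. The Weber fraction, the fixed JND, and the object sizes are hypothetical numbers chosen for illustration and are not values reported by Ganel et al. (2008a).

```python
import math


def weber_jnd(size_mm: float, weber_fraction: float = 0.05) -> float:
    """JND predicted by Weber's law: a fixed fraction of stimulus size."""
    return weber_fraction * size_mm


def constant_jnd(size_mm: float, jnd_mm: float = 1.5) -> float:
    """Idealized size-independent JND (the pattern described for grasping)."""
    return jnd_mm


# Hypothetical object lengths in millimetres.
sizes_mm = [20.0, 40.0, 60.0]
for size in sizes_mm:
    print(f"size={size:5.1f} mm  "
          f"Weber JND={weber_jnd(size):.2f} mm  "
          f"constant JND={constant_jnd(size):.2f} mm")

# Doubling the size doubles the Weber-law JND but leaves the constant JND alone.
print(math.isclose(weber_jnd(40.0), 2.0 * weber_jnd(20.0)))
print(constant_jnd(40.0) == constant_jnd(20.0))
```

Plotting JND against object size would therefore give a rising line through the origin for perceptual estimates and a horizontal line for grip aperture, which is the qualitative dissociation at issue.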

Figure 14.7 Stimuli and experimental design of the Ganel et al. (2008b) study. A, The experimental paradigm and the version of the Ponzo illusion used. B, The arrangement of the objects on incongruent trials in which the real size and the illusory size were pitted against one another. In this example, object 1 is perceived in most cases as shorter than object 2 (due to the illusory context), although it is actually longer. The real difference in size can be clearly seen in C, where the two objects are placed next to one another (for illustrative purposes) on the nonillusory control display. Adapted with permission from Ganel et al., 2008b.


Figure 14.8 Maximum grip aperture and perceptual estimates of length for objects placed on the illusory display (A) and control display (B). Only incongruent trials in which participants made erroneous decisions about real size are shown for the grip aperture and estimates with the illusory display. As Panel A shows, despite the fact that participants erroneously perceived the physically longer object to be the shorter one (and vice versa), the opening between their finger and thumb during the grasping movements reflected the real difference in size between the objects. This pattern of results was completely reversed when they made perceptual estimates of the length of the objects. With the control display (B), both grip aperture and manual estimates went in the same direction. Adapted with permission from Ganel et al., 2008b.

Of course, this finding (as well as the fact that actions are often resistant to size-contrast illusions) fits well with Smeets and Brenner’s (1999, 2001) double-pointing hypothesis. They would argue that the visual-motor system does not compute the size of the object but instead computes the two locations on the surface of the object where the digits will be placed. According to their double-pointing hypothesis, size is irrelevant to the planning of these trajectories, and thus variation in size will not affect the accuracy with which the finger and thumb are placed on either side of the target object. In short, Weber’s law is essentially irrelevant (Smeets & Brenner, 2008). The same argument applies to grasping movements made in the context of size-contrast illusions: Because grip scaling is simply an epiphenomenon of the independent finger trajectories, grip aperture seems to be impervious to the effects of the illusion. Although, as discussed earlier, Smeets and Brenner’s account has been challenged, it has to be acknowledged that their double-pointing model offers a convincing explanation of these findings (even if the neuropsychological and neuroimaging data are more consistent with Jeannerod’s, 1981, two-visual-motor channel account of reach-to-grasp movements).

Even so, there are some behavioral observations that cannot be accommodated by the Smeets and Brenner (1999, 2001) model. For example, as discussed earlier, if a delay is introduced between viewing the target and initiating the grasp, the scaling of the anticipatory grip aperture is much more likely to be sensitive to size-contrast illusions (Fischer, 2001; Hu & Goodale, 2000; Westwood & Goodale, 2003; Westwood, Heath, & Roy, 2000). Moreover, if a similar delay is introduced in the context of the Ganel et al. (2008a) experiments just described, grip aperture now obeys Weber’s law. These results cannot be easily explained by the Smeets and Brenner model without conceding that, with delay, grip scaling is no longer a consequence of programming individual digit trajectories, but instead reflects the perceived size of the target object. Nor can the Smeets and Brenner model explain what happens when unpracticed finger postures (e.g., the thumb and ring finger) are used to pick up objects in the context of a size-contrast illusion. In contrast to skilled grasping movements, grip scaling with unpracticed awkward grasping is quite sensitive to the illusory difference in size between the objects (Gonzalez, Ganel, Whitwell, Morrissey, & Goodale, 2008). Only with practice does grip aperture begin to reflect the real size of the target objects. Smeets and Brenner’s model cannot account for this result without positing that individual control over the digits occurs only after practice. Finally, recent neuropsychological findings with patient DF suggest that she could use action-related information about object size (presumably from her intact dorsal stream) to make explicit judgments regarding the length of an object that she was about to pick up (Schenk & Milner, 2006), suggesting that size, rather than two separate locations, is implicitly coded during grasping.

At this point, the difference between the perception–action model (Goodale & Milner, 1992; Milner & Goodale, 2006) and the (modified) Smeets and Brenner account begins to blur. Both accounts posit that real-time control of skilled grasping depends on visual-motor transformations that are quite distinct from those involved in the control of delayed or unpracticed grasping movements. The difference in the two accounts turns on the nature of the control exercised over skilled movements performed in real time.
But note that even if Smeets and Brenner are correct that the trajectories of the individual digits are programmed individually on the basis of spatial information that ignores the size of the object, this would not obviate the idea of two visual systems, one for constructing our perception of the world and one for controlling our actions in that world. Indeed, the virtue of the perception–action model is that it accounts not only for the dissociations outlined above between the control of action and psychophysical report in normal observers in a number of different settings, but it also accounts for a broad range of neuropsychological, neurophysiological, and neuroimaging data (and is completely consistent with Jeannerod’s dual-channel model of reach-to-grasp movements).

It is worth noting that dissociations between perceptual report and action have been reported for other classes of responses as well. For example, Tavassoli and Ringach (2010) found that eye movements in a visual tracking task responded to fluctuations in the velocity of the moving target that were perceptually invisible to the subjects. Moreover, the perceptual errors were independent of the accuracy of the pursuit eye movements. These results are in conflict with the idea that the motor control of pursuit eye movements and the perception of motion are based on the same motion signals and are affected by shared sources of noise. Similar dissociations have been observed between saccadic eye movements and perceptual report. Thus, by exploiting the illusory mislocalization of a flashed target induced by visual motion, de’Sperati and Baud-Bovy (2008) showed that fast but not slow saccades escaped the effects of the illusion and were directed to the real rather than the apparent location of the target. This result underscores the fact that the control of action often depends on processes that unfold much more rapidly than those involved in perceptual processing (e.g., Castiello & Jeannerod, 1991). Indeed, as has already been discussed, visual-motor control may often be mediated by fast feedforward mechanisms, in contrast to conscious perception, which requires (slower) feedback to earlier visual areas, including V1 (Lamme, 2001).

Interactions Between the Two Streams

When the idea of a separate vision-for-action system was first proposed 20 years ago, the emphasis was on the independence of this system from vision for perception. But clearly the two systems must work closely together in the generation of purposive behavior. One way to think about the interaction between the two streams (an interaction that takes advantage of the complementary differences in their computational constraints) is in terms of a “tele-assistance” model (Goodale & Humphrey, 1998). In tele-assistance, a human operator who has identified a goal object and decided what to do with it communicates with a semi-autonomous robot that actually performs the required motor act on the flagged goal object (Pook & Ballard, 1996). In terms of this tele-assistance metaphor, the perceptual machinery in the ventral stream, with its rich and detailed representations of the visual scene (and links with cognitive systems), would be the human operator. Processes in the ventral stream participate in the identification of a particular goal and flag the relevant object in the scene, perhaps by means of an attention-like process. Once a particular goal object has been flagged, dedicated visual-motor networks in the dorsal stream (in conjunction with related circuits in premotor cortex, basal ganglia, and brainstem) are then activated to transform the visual information about the object into the appropriate coordinates for the desired motor act. This means that in many instances a flagged object in the scene will be processed in parallel by both ventral and dorsal stream mechanisms, each transforming the visual information in the array for different purposes. In other situations, where the visual stimuli are particularly salient, visual-motor mechanisms in the dorsal stream will operate without any immediate supervision by ventral stream perceptual mechanisms.

Of course, the tele-assistance analogy is far too simplified. For one thing, the ventral stream by itself cannot be construed as an intelligent operator that can make assessments and plans. Clearly, there has to be some sort of top-down executive control (almost certainly engaging prefrontal mechanisms) that can initiate the operation of attentional search and thus set the whole process of planning and goal selection in motion (for review, see Desimone & Duncan, 1995; Goodale & Haffenden, 2003). Reciprocal interactions between prefrontal/premotor areas and the areas in the posterior parietal cortex undoubtedly play a critical role in recruiting specialized dorsal stream structures, such as LIP, which appear to be involved in the control of both voluntary eye movements and covert shifts of spatial attention in monkeys and humans (Bisley & Goldberg, 2010; Corbetta, Kincade, & Shulman, 2002).
In terms of the tele-assistance metaphor, area LIP can be seen as acting like a video camera on the robot scanning the visual scene, and thereby providing new inputs that the ventral stream can process and pass on to frontal systems that assess their potential importance. In practice, of course, the video camera/LIP system does not scan the environment randomly: It is constrained to a greater or lesser degree by top-down information about the nature of the potential targets and where those targets might be located, information that reflects the priorities of the operator/organism that are presumably elaborated in prefrontal systems.

What happens next goes beyond even these speculations. Before instructions can be transmitted to the visual-motor control systems in the dorsal stream, the nature of the action required needs to be determined. This means that praxis systems, perhaps located in the left hemisphere, need to “instruct” the relevant visual-motor systems. After all, objects such as tools demand a particular kind of hand posture. Achieving this requires not only that the tool be identified, presumably using ventral stream mechanisms (Valyear & Culham, 2010), but also that the actions required to achieve that posture be selected via a link to these praxis systems. At the same time, the ventral stream (and related cognitive apparatus) has to communicate the locus of the goal object to these visual-motor systems in the dorsal stream. One way that this ventral-dorsal transmission could happen is via recurrent projections from foci of activity in the ventral stream back downstream to primary visual cortex and other adjacent visual areas. Once a target has been “highlighted” on these retinotopic maps, its location could then finally be forwarded to the dorsal stream for action (for a version of this idea, see Lamme & Roelfsema, 2000). Moreover, LIP itself, by virtue of the fact that it would be “pointing” at the goal object, could also provide the requisite coordinates, once it has been cued by recognition systems in the ventral stream.
When the particular disposition and location of the object with respect to the actor have been computed, that information has to be combined with the postural requirements of the appropriate functional grasp for the tool, which, as I have already suggested, are presumably provided by praxis systems that are in turn cued by recognition mechanisms in the ventral stream. At the same time, the initial fingertip forces that should be applied to the tool (or any object, for that matter) are based on estimations of its mass, surface friction, and compliance that are derived from visual information (e.g., Gordon, Westling, Cole, & Johansson, 1993). Once contact is made, somatosensory information can be used to fine-tune the applied forces, but the specification of the initial grip and lift forces must be derived from learned associations between the object’s visual appearance and prior experience with similar objects or materials (Buckingham, Cant, & Goodale, 2009). This information presumably can be provided only by the ventral visual stream in conjunction with stored information about past interactions.

Again, it must be emphasized that all of this is highly speculative. Nevertheless, whatever complex interactions might be involved, it is clear that goal-directed action is unlikely to be mediated by a simple serial processing system. Multiple iterative processing is almost certainly required, involving a constant interplay among different control systems at different levels of processing (for a more detailed discussion of these and related issues, see Milner & Goodale, 2006). A full understanding of the contrasting (and complementary) roles of the ventral and dorsal streams in this complex network will come only when we can specify the neural and functional interconnections between the two streams (and other brain areas) and the nature of the information they exchange.

References

Aglioti, S., DeSouza, J., & Goodale, M. A. (1995). Size-contrast illusions deceive the eyes but not the hand. Current Biology, 5, 679–685.
Amazeen, E. L., & DaSilva, F. (2005). Psychophysical test for the independence of perception and action. Journal of Experimental Psychology: Human Perception and Performance, 31, 170–182.
Andersen, R. A., & Buneo, C. A. (2003). Sensorimotor integration in posterior parietal cortex. Advances in Neurology, 93, 159–177.
Ballard, D. H., Hayhoe, M. M., Li, F., & Whitehead, S. D. (1992). Hand–eye coordination during sequential tasks. Philosophical Transactions of the Royal Society London B Biological Sciences, 337, 331–338.
Bálint, R. (1909). Seelenlähmung des “Schauens,” optische Ataxie, räumliche Störung der Aufmerksamkeit. Monatsschrift für Psychiatrie und Neurologie, 25, 51–81.
Bálint, R., & Harvey, M. (1995). Psychic paralysis of gaze, optic ataxia, and spatial disorder of attention. Cognitive Neuropsychology, 12, 265–281.
Baldauf, D., & Deubel, H. (2010). Attentional landscapes in reaching and grasping. Vision Research, 50, 999–1013.
Biegstraaten, M., de Grave, D. D. J., Brenner, E., & Smeets, J. B. J. (2007). Grasping the Müller-Lyer illusion: Not a change in perceived length. Experimental Brain Research, 176, 497–503.
Binkofski, F., Dohle, C., Posse, S., Stephan, K. M., Hefter, H., Seitz, R. J., & Freund, H. J. (1998). Human anterior intraparietal area subserves prehension: A combined lesion and functional MRI activation study. Neurology, 50, 1253–1259.
Binsted, G., Chua, R., Helsen, W., & Elliott, D. (2001). Eye–hand coordination in goal-directed aiming. Human Movement Science, 20, 563–585.
Bisley, J. W., & Goldberg, M. E. (2010). Attention, intention, and priority in the parietal lobe. Annual Review of Neuroscience, 33, 1–21.
Brenner, E., & Smeets, J. B. (1996). Size illusion influences how we lift but not how we grasp an object. Experimental Brain Research, 111, 473–476.
Brouwer, A. M., Franz, V. H., & Gegenfurtner, K. R. (2009). Differences in fixations between grasping and viewing objects. Journal of Vision, 9, 18.1–24.
Brown, S., & Schäfer, E. A. (1888). An investigation into the functions of the occipital and temporal lobes of the monkey’s brain. Philosophical Transactions of the Royal Society of London, 179, 303–327.
Bruno, N., Bernardis, P., & Gentilucci, M. (2008). Visually guided pointing, the Müller-Lyer illusion, and the functional interpretation of the dorsal-ventral split: Conclusions from 33 independent studies. Neuroscience and Biobehavioral Reviews, 32, 423–437.
Buckingham, G., Cant, J. S., & Goodale, M. A. (2009). Living in a material world: How visual cues to material properties affect the way that we lift objects and perceive their weight. Journal of Neurophysiology, 102, 3111–3118.
Campion, J. (1987). Apperceptive agnosia: The specification and description of constructs. In G. W. Humphreys & M. J. Riddoch (Eds.), Visual object processing: A cognitive neuropsychological approach (pp. 197–232). London: Erlbaum.
Cant, J. S., Arnott, S. R., & Goodale, M. A. (2009). fMR-adaptation reveals separate processing regions for the perception of form and texture in the human ventral stream. Experimental Brain Research, 192, 391–405.
Cant, J. S., & Goodale, M. A. (2007). Attention to form or surface properties modulates different regions of human occipitotemporal cortex. Cerebral Cortex, 17, 713–731.
Castiello, U. (2001). The effects of abrupt onset of 2-D and 3-D distractors on prehension movements. Perception and Psychophysics, 63, 1014–1025.
Castiello, U. (2005). The neuroscience of grasping. Nature Reviews Neuroscience, 6, 726–736.
Castiello, U., & Jeannerod, M. (1991). Measuring time to awareness. NeuroReport, 2, 797–800.
Cavina-Pratesi, C., Goodale, M. A., & Culham, J. C. (2007). fMRI reveals a dissociation between grasping and perceiving the size of real 3D objects. PLoS One, 2, e424.
Cavina-Pratesi, C., Ietswaart, M., Humphreys, G. W., Lestou, V., & Milner, A. D. (2010). Impaired grasping in a patient with optic ataxia: Primary visuomotor deficit or secondary consequence of misreaching? Neuropsychologia, 48, 226–234.
Chapman, C. S., & Goodale, M. A. (2008). Missing in action: The effect of obstacle position and size on avoidance while reaching. Experimental Brain Research, 191, 83–97.
Chapman, C. S., & Goodale, M. A. (2010). Seeing all the obstacles in your way: The effect of visual feedback and visual feedback schedule on obstacle avoidance while reaching. Experimental Brain Research, 202, 363–375.
Chieffi, S., & Gentilucci, M. (1993). Coordination between the transport and the grasp components during prehension movements. Experimental Brain Research, 94, 471–477.
Cohen, Y. E., & Andersen, R. A. (2002). A common reference frame for movement plans in the posterior parietal cortex. Nature Reviews Neuroscience, 3, 553–562.
Corbetta, M., Kincade, M. J., & Shulman, G. L. (2002). Two neural systems for visual orienting and the pathophysiology of unilateral spatial neglect. In H.-O. Karnath, A. D. Milner, & G. Vallar (Eds.), The cognitive and neural bases of spatial neglect (pp. 259–273). Oxford, UK: Oxford University Press.
Cowey, A. (2010). The blindsight saga. Experimental Brain Research, 200, 3–24.
Cuijpers, R. H., Smeets, J. B., & Brenner, E. (2004). On the relation between object shape and grasping kinematics. Journal of Neurophysiology, 91, 2598–2606.
Culham, J. C. (2004). Human brain imaging reveals a parietal area specialized for grasping. In N. Kanwisher & J. Duncan (Eds.), Attention and performance XX: Functional brain—Imaging of human cognition (pp. 417–438). Oxford, UK: Oxford University Press.
Culham, J. C., Cavina-Pratesi, C., & Singhal, A. (2006). The role of parietal cortex in visuomotor control: What have we learned from neuroimaging? Neuropsychologia, 44, 2668–2684.
Culham, J. C., Danckert, S. L., DeSouza, J. F. X., Gati, J. S., Menon, R. S., & Goodale, M. A. (2003). Visually-guided grasping produces activation in dorsal but not ventral stream brain areas. Experimental Brain Research, 153, 158–170.
Culham, J. C., & Kanwisher, N. G. (2001). Neuroimaging of cognitive functions in human parietal cortex. Current Opinion in Neurobiology, 11, 157–163.
Culham, J. C., & Valyear, K. F. (2006). Human parietal cortex in action. Current Opinion in Neurobiology, 16, 205–212.
Daprati, E., & Gentilucci, M. (1997). Grasping an illusion. Neuropsychologia, 35, 1577–1582.
de Grave, D. D., Biegstraaten, M., Smeets, J. B., & Brenner, E. (2005). Effects of the Ebbinghaus figure on grasping are not only due to misjudged size. Experimental Brain Research, 163, 58–64.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222.
de’Sperati, C., & Baud-Bovy, G. (2008). Blind saccades: An asynchrony between seeing and looking. Journal of Neuroscience, 28, 4317–4321.
Deubel, H., Schneider, W. X., & Paprotta, I. (1998). Selective dorsal and ventral processing: Evidence for a common attentional mechanism in reaching and perception. Visual Cognition, 5, 81–107.
Dewar, M. T., & Carey, D. P. (2006). Visuomotor “immunity” to perceptual illusion: A mismatch of attentional demands cannot explain the perception-action dissociation. Neuropsychologia, 44, 1501–1508.
Diedrichsen, J., Werner, S., Schmidt, T., & Trommershäuser, J. (2004). Immediate spatial distortions of pointing movements induced by visual landmarks. Perception and Psychophysics, 66, 89–103.
Dijkerman, H. C., Lê, S., Démonet, J. F., & Milner, A. D. (2004). Visuomotor performance in a patient with visual agnosia due to an early lesion. Brain Research: Cognitive Brain Research, 20, 12–25.
Downing, P. E., Chan, A. W., Peelen, M. V., Dodds, C. M., & Kanwisher, N. (2006). Domain specificity in visual cortex. Cerebral Cortex, 16, 1453–1461.
Dubrowski, A., Bock, O., Carnahan, H., & Jungling, S. (2002). The coordination of hand transport and grasp formation during single- and double-perturbed human prehension movements. Experimental Brain Research, 145, 365–371.
Fattori, P., Breveglieri, R., Amoroso, K., & Galletti, C. (2004). Evidence for both reaching and grasping activity in the medial parieto-occipital cortex of the macaque. European Journal of Neuroscience, 20, 2457–2466.
Fattori, P., Gamberini, M., Kutz, D. F., & Galletti, C. (2001). “Arm-reaching” neurons in the parietal area V6A of the macaque monkey. European Journal of Neuroscience, 13, 2309–2313.
Fattori, P., Raos, V., Breveglieri, R., Bosco, A., Marzocchi, N., & Galletti, C. (2010). The dorsomedial pathway is not just for reaching: Grasping neurons in the medial parieto-occipital cortex of the macaque monkey. Journal of Neuroscience, 30, 342–349.
Ferrier, D., & Yeo, G. F. (1884). A record of experiments on the effects of lesion of different regions of the cerebral hemispheres. Philosophical Transactions of the Royal Society of London, 175, 479–564.
Filimon, F., Nelson, J. D., Huang, R. S., & Sereno, M. I. (2009). Multiple parietal reach regions in humans: Cortical representations for visual and proprioceptive feedback during on-line reaching. Journal of Neuroscience, 29, 2961–2971.
Fischer, M. H. (2001). How sensitive is hand transport to illusory context effects? Experimental Brain Research, 136, 224–230.
Franz, V. H. (2001). Action does not resist visual illusions. Trends in Cognitive Sciences, 5, 457–459.
Franz, V. H., Bülthoff, H. H., & Fahle, M. (2003). Grasp effects of the Ebbinghaus illusion: Obstacle avoidance is not the explanation. Experimental Brain Research, 149, 470–477.
Franz, V. H., Gegenfurtner, K. R., Bülthoff, H. H., & Fahle, M. (2000). Grasping visual illusions: No evidence for a dissociation between perception and action. Psychological Science, 11, 20–25.
Frey, S. H., Vinton, D., Norlund, R., & Grafton, S. T. (2005). Cortical topography of human anterior intraparietal cortex active during visually guided grasping. Brain Research: Cognitive Brain Research, 23, 397–405.
Ganel, T., Chajut, E., & Algom, D. (2008a). Visual coding for action violates fundamental psychophysical principles. Current Biology, 18, R599–R601.
Ganel, T., Tanzer, M., & Goodale, M. A. (2008b). A double dissociation between action and perception in the context of visual illusions: Opposite effects of real and illusory size. Psychological Science, 19, 221–225.
Gentilucci, M., Chieffi, S., Daprati, E., Saetti, M. C., & Toni, I. (1996). Visual illusion and action. Neuropsychologia, 34, 369–376.
Glazebrook, C. M., Dhillon, V. P., Keetch, K. M., Lyons, J., Amazeen, E., Weeks, D. J., & Elliott, D. (2005). Perception-action and the Müller-Lyer illusion: Amplitude or end point bias? Experimental Brain Research, 160, 71–78.
Glover, S. (2004). Separate visual representations in the planning and control of action. Behavioral and Brain Sciences, 27, 3–24; discussion 24–78.
Glover, S., & Dixon, P. (2001a). Motor adaptation to an optical illusion. Experimental Brain Research, 137, 254–258.
Glover, S., & Dixon, P. (2001b). The role of vision in the on-line correction of illusion effects on action. Canadian Journal of Experimental Psychology, 55, 96–103.
Gonzalez, C. L. R., Ganel, T., Whitwell, R. L., Morrissey, B., & Goodale, M. A. (2008). Practice makes perfect, but only with the right hand: Sensitivity to perceptual illusions with awkward grasps decreases with practice in the right but not the left hand. Neuropsychologia, 46, 624–631.
Goodale, M. A. (1983). Vision as a sensorimotor system. In T. E. Robinson (Ed.), Behavioral approaches to brain research (pp. 41–61). New York: Oxford University Press.
Goodale, M. A. (1995). The cortical organization of visual perception and visuomotor control. In S. Kosslyn & D. N. Osherson (Eds.), An invitation to cognitive science. Vol. 2. Visual cognition and action (2nd ed., pp. 167–214). Cambridge, MA: MIT Press.
Goodale, M. A. (1996). Visuomotor modules in the vertebrate brain. Canadian Journal of Physiology and Pharmacology, 74, 390–400.
Goodale, M. A. (2008). Action without perception in human vision. Cognitive Neuropsychology, 25, 891–919.
Goodale, M. A., & Haffenden, A. M. (1998). Frames of reference for perception and action in the human visual system. Neuroscience and Biobehavioral Reviews, 22, 161–172.
Goodale, M. A., & Haffenden, A. M. (2003). Interactions between dorsal and ventral streams of visual processing. In A. Siegel, R. Andersen, H.-J. Freund, & D. Spencer (Eds.), Advances in neurology: The parietal lobe (Vol. 93, pp. 249–267). Philadelphia: Lippincott-Raven.
Goodale, M. A., Meenan, J. P., Bülthoff, H. H., Nicolle, D. A., Murphy, K. S., & Racicot, C. I. (1994). Separate neural pathways for the visual analysis of object shape in perception and prehension. Current Biology, 4, 604–610.
Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15, 20–25.
Goodale, M. A., & Milner, A. D. (2004). Sight unseen: An exploration of conscious and unconscious vision. Oxford, UK: Oxford University Press.
Goodale, M. A., Milner, A. D., Jakobson, L. S., & Carey, D. P. (1991). A neurological dissociation between perceiving objects and grasping them. Nature, 349, 154–156.
Goodale, M. A., Westwood, D. A., & Milner, A. D. (2004). Two distinct modes of control for object-directed action. Progress in Brain Research, 144, 131–144.
Gordon, A. M., Westling, G., Cole, K. J., & Johansson, R. S. (1993). Memory representations underlying motor commands used during manipulation of common and novel objects. Journal of Neurophysiology, 69, 1789–1796.
Grefkes, C., & Fink, G. R. (2005). The functional organization of the intraparietal sulcus in humans and monkeys. Journal of Anatomy, 207, 3–17.
Grill-Spector, K. (2003). The neural basis of object perception. Current Opinion in Neurobiology, 13, 159–166.
Grill-Spector, K., Kushnir, T., Edelman, S., Avidan, G., Itzchak, Y., & Malach, R. (1999). Differential processing of objects under various viewing conditions in the human lateral occipital complex. Neuron, 24, 187–203.
Grill-Spector, K., Kushnir, T., Edelman, S., Itzchak, Y., & Malach, R. (1998). Cue-invariant activation in object-related areas of the human occipital lobe. Neuron, 21, 191–202.
Grill-Spector, K., & Malach, R. (2004). The human visual cortex. Annual Review of Neuroscience, 27, 649–677.
Grol, M. J., Majdandžić, J., Stephan, K. E., Verhagen, L., Dijkerman, H. C., Bekkering, H., Verstraten, F. A., & Toni, I. (2007). Parieto-frontal connectivity during visually guided grasping. Journal of Neuroscience, 27, 11877–11887.
Haffenden, A., & Goodale, M. A. (1998). The effect of pictorial illusion on prehension and perception. Journal of Cognitive Neuroscience, 10, 122–136.
Haffenden, A. M., Schiff, K. C., & Goodale, M. A. (2001). The dissociation between perception and action in the Ebbinghaus illusion: Nonillusory effects of pictorial cues on grasp. Current Biology, 11, 177–181.
Haxby, J. V., Gobbini, M. I., Furey, M. L., Ishai, A., Schouten, J. L., & Pietrini, P. (2001). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293, 2425–2430.
Holmes, G. (1918). Disturbances of visual orientation. British Journal of Ophthalmology, 2, 449–468.
Hu, Y., & Goodale, M. A. (2000). Grasping after a delay shifts size-scaling from absolute to relative metrics. Journal of Cognitive Neuroscience, 12, 856–868.
Hu, Y., Osu, R., Okada, M., Goodale, M. A., & Kawato, M. (2005). A model of the coupling between grip aperture and hand transport during human prehension. Experimental Brain Research, 167, 301–304.
Jackson, S. R., Jackson, G. M., & Rosicky, J. (1995). Are non-relevant objects represented in working memory? The effect of non-target objects on reach and grasp kinematics. Experimental Brain Research, 102, 519–530.
Jackson, S. R., & Shaw, A. (2000). The Ponzo illusion affects grip-force but not grip-aperture scaling during prehension movements. Journal of Experimental Psychology: Human Perception and Performance, 26, 418–423.
Jakobson, L. S., Archibald, Y. M., Carey, D. P., & Goodale, M. A. (1991). A kinematic analysis of reaching and grasping movements in a patient recovering from optic ataxia. Neuropsychologia, 29, 803–809.
Jakobson, L. S., & Goodale, M. A. (1991). Factors affecting higher-order movement planning: A kinematic analysis of human prehension. Experimental Brain Research, 86, 199–208.
James, T. W., Culham, J., Humphrey, G. K., Milner, A. D., & Goodale, M. A. (2003). Ventral occipital lesions impair object recognition but not object-directed grasping: An fMRI study. Brain, 126, 2463–2475.
James, T. W., Humphrey, G. K., Gati, J. S., Menon, R. S., & Goodale, M. A. (2002). Differential effects of viewpoint on object-driven activation in dorsal and ventral streams. Neuron, 35, 793–801.
Jeannerod, M. (1981). Intersegmental coordination during reaching at natural visual objects. In J. Long & A. Baddeley (Eds.), Attention and performance IX (pp. 153–168). Hillsdale, NJ: Erlbaum.
Jeannerod, M. (1984). The timing of natural prehension movements. Journal of Motor Behavior, 16, 235–254.
Jeannerod, M. (1986). The formation of finger grip during prehension: A cortically mediated visuomotor pattern. Behavioural Brain Research, 19, 99–116.
Jeannerod, M. (1988). The neural and behavioural organization of goal-directed movements. Oxford, UK: Clarendon Press.
Johansson, R., Westling, G., Bäckström, A., & Flanagan, J. R. (2001). Eye–hand coordination in object manipulation. Journal of Neuroscience, 21, 6917–6932.
Karnath, H. O., & Perenin, M.-T. (2005). Cortical control of visually guided reaching: Evidence from patients with optic ataxia. Cerebral Cortex, 15, 1561–1569.
Karnath, H. O., Rüter, J., Mandler, A., & Himmelbach, M. (2009). The anatomy of object recognition—Visual form agnosia caused by medial occipitotemporal stroke. Journal of Neuroscience, 29, 5854–5862.
Keefe, B. D., & Watt, S. J. (2009). The role of binocular vision in grasping: A small stimulus-set distorts results. Experimental Brain Research, 194, 435–444.
Knill, D. C. (2005). Reaching for visual cues to depth: The brain combines depth cues differently for motor control and perception. Journal of Vision, 5, 103–115.
Kourtzi, Z., & Kanwisher, N. (2001). Representation of perceived object shape by the human lateral occipital complex. Science, 293, 1506–1509.
Kwok, R. M., & Braddick, O. J. (2003). When does the Titchener circles illusion exert an effect on grasping? Two- and three-dimensional targets. Neuropsychologia, 41, 932–940.
Lamme, V. A. F. (2001). Blindsight: The role of feedforward and feedback corticocortical connections. Acta Psychologica, 107, 209–228.
Lamme, V. A. F., & Roelfsema, P. R. (2000). The distinct modes of vision offered by feedforward and recurrent processing. Trends in Neurosciences, 23, 571–579.
Lê, S., Cardebat, D., Boulanouar, K., Hénaff, M. A., Michel, F., Milner, D., Dijkerman, C., Puel, M., & Démonet, J.-F. (2002). Seeing, since childhood, without ventral stream: A behavioural study. Brain, 125, 58–74.
Lee, Y. L., Crabtree, C. E., Norman, J. F., & Bingham, G. P. (2008). Poor shape perception is the reason reaches-to-grasp are visually guided online. Perception and Psychophysics, 70, 1032–1046.
Loftus, A., Servos, P., Goodale, M. A., Mendarozqueta, N., & Mon-Williams, M. (2004). When two eyes are better than one in prehension: Monocular viewing and end-point variance. Experimental Brain Research, 158, 317–327.
Louw, S., Smeets, J. B., & Brenner, E. (2007). Judging surface slant for placing objects: A role for motion parallax. Experimental Brain Research, 183, 149–158.
Lyon, D. C., Nassi, J. J., & Callaway, E. M. (2010). A disynaptic relay from superior colliculus to dorsal stream visual cortex in macaque monkey. Neuron, 65, 270–279.
Marotta, J. J., Behrmann, M., & Goodale, M. A. (1997). The removal of binocular cues disrupts the calibration of grasping in patients with visual form agnosia. Experimental Brain Research, 116, 113–121.
Marotta, J. J., & Goodale, M. A. (1998). The role of learned pictorial cues in the programming and control of grasping. Experimental Brain Research, 121, 465–470.
Marotta, J. J., & Goodale, M. A. (2001). The role of familiar size in the control of grasping. Journal of Cognitive Neuroscience, 13, 8–17.
Marotta, J. J., Kruyer, A., & Goodale, M. A. (1998). The role of head movements in the control of manual prehension. Experimental Brain Research, 120, 134–138.
Marotta, J. J., Perrot, T. S., Nicolle, D., & Goodale, M. A. (1995). The development of adaptive head movements following enucleation. Eye, 9(3), 333–336.
Marotta, J. J., Perrot, T. S., Servos, P., Nicolle, D., & Goodale, M. A. (1995). Adapting to monocular vision: Grasping with one eye. Experimental Brain Research, 104, 107–114.
Melmoth, D. R., Finlay, A. L., Morgan, M. J., & Grant, S. (2009). Grasping deficits and adaptations in adults with stereo vision losses. Investigative Ophthalmology and Visual Science, 50, 3711–3720.
Melmoth, D. R., & Grant, S. (2006). Advantages of binocular vision for the control of reaching and grasping. Experimental Brain Research, 171, 371–388.
Melmoth, D. R., Storoni, M., Todd, G., Finlay, A. L., & Grant, S. (2007). Dissociation between vergence and binocular disparity cues in the control of prehension. Experimental Brain Research, 183, 283–298.
Menon, R. S., Ogawa, S., Kim, S. G., Ellermann, J. M., Merkle, H., Tank, D. W., & Ugurbil, K. (1992). Functional brain mapping using magnetic resonance imaging: Signal changes accompanying visual stimulation. Investigative Radiology, Suppl 2, S47–S53.
Milner, A. D., & Goodale, M. A. (2006). The visual brain in action (2nd ed.). Oxford, UK: Oxford University Press.
Milner, A. D., & Goodale, M. A. (2008). Two visual systems re-viewed. Neuropsychologia, 46, 774–785.
Milner, A. D., Perrett, D. I., Johnston, R. S., Benson, P. J., Jordan, T. R., Heeley, D. W., Bettucci, D., Mortara, F., Mutani, R., Terazzi, E., & Davidson, D. L. W. (1991). Perception and action in “visual form agnosia.” Brain, 114, 405–428.
Monaco, S., Sedda, A., Fattori, P., Galletti, C., & Culham, J. C. (2009). Functional magnetic resonance adaptation (fMRA) reveals the involvement of the dorsomedial stream in wrist orientation for grasping. Society for Neuroscience Abstracts, 307, 1.
Mon-Williams, M., & Dijkerman, H. C. (1999). The use of vergence information in the programming of prehension. Experimental Brain Research, 128, 578–582.
Mon-Williams, M., & McIntosh, R. D. (2000). A test between two hypotheses and a possible third way for the control of prehension. Experimental Brain Research, 134, 268–273.
Mon-Williams, M., & Tresilian, J. R. (2001). A simple rule of thumb for elegant prehension. Current Biology, 11, 1058–1061.
Moore, T. (2006). The neurobiology of visual attention: Finding sources. Current Opinion in Neurobiology, 16, 159–165.
Nieder, A., & Dehaene, S. (2009). Representation of number in the brain. Annual Review of Neuroscience, 32, 185–208.
Obhi, S. S., & Goodale, M. A. (2005). The effects of landmarks on the performance of delayed and real-time pointing movements. Experimental Brain Research, 167, 335–344.
Ogawa, S., Tank, D. W., Menon, R., Ellermann, J. M., Kim, S. G., Merkle, H., & Ugurbil, K. (1992). Intrinsic signal changes accompanying sensory stimulation: Functional brain mapping with magnetic resonance imaging. Proceedings of the National Academy of Sciences USA, 89, 5951–5955.
Op de Beeck, H. P., Haushofer, J., & Kanwisher, N. G. (2008). Interpreting fMRI data: Maps, modules and dimensions. Nature Reviews Neuroscience, 9, 123–135.
Patla, A. E. (1997). Understanding the roles of vision in the control of human locomotion. Gait and Posture, 5, 54–69.
Perenin, M.-T., & Rossetti, Y. (1996). Grasping without form discrimination in a hemianopic field. NeuroReport, 7, 793–797.
Perenin, M.-T., & Vighetto, A. (1988). Optic ataxia: A specific disruption in visuomotor mechanisms. I. Different aspects of the deficit in reaching for objects. Brain, 111, 643–674.
Pierrot-Deseilligny, C. H., Milea, D., & Muri, R. M. (2004). Eye movement control by the cerebral cortex. Current Opinion in Neurology, 17, 17–25.
Plodowski, A., & Jackson, S. R. (2001). Vision: Getting to grips with the Ebbinghaus illusion. Current Biology, 11, R304–R306.
Pook, P. K., & Ballard, D. H. (1996). Deictic human/robot interaction. Robotics and Autonomous Systems, 18, 259–269.
Prado, J., Clavagnier, S., Otzenberger, H., Scheiber, C., & Perenin, M.-T. (2005). Two cortical systems for reaching in central and peripheral vision. Neuron, 48, 849–858.
Rice, N. J., McIntosh, R. D., Schindler, I., Mon-Williams, M., Démonet, J. F., & Milner, A. D. (2006). Intact automatic avoidance of obstacles in patients with visual form agnosia. Experimental Brain Research, 174, 176–188.
Rizzolatti, G., & Craighero, L. (1998). Spatial attention: Mechanisms and theories. In M. Sabourin, F. Craik, & M. Robert (Eds.), Advances in psychological science: Vol. 2. Biological and cognitive aspects (pp. 171–198). East Sussex, UK: Psychology Press.
Rizzolatti, G., Riggio, L., Dascola, I., & Umiltà, C. (1987). Reorienting attention across the horizontal and vertical meridians: Evidence in favor of a premotor theory of attention. Neuropsychologia, 25, 31–40.
Sakata, H. (2003). The role of the parietal cortex in grasping. Advances in Neurology, 93, 121–139.
Sanders, M. D., Warrington, E. K., Marshall, J., & Weiskrantz, L. (1974). “Blindsight”: Vision in a field defect. Lancet, 20, 707–708.
Schenk, T., & Milner, A. D. (2006). Concurrent visuomotor behaviour improves form discrimination in a patient with visual form agnosia. European Journal of Neuroscience, 24, 1495–1503.
Schindler, I., Rice, N. J., McIntosh, R. D., Rossetti, Y., Vighetto, A., & Milner, A. D. (2004). Automatic avoidance of obstacles is a dorsal stream function: Evidence from optic ataxia. Nature Neuroscience, 7, 779–784.
Sereno, M. I., & Tootell, R. B. (2005). From monkeys to humans: What do we now know about brain homologies? Current Opinion in Neurobiology, 15, 135–144.
Servos, P., Carnahan, H., & Fedwick, J. (2000). The visuomotor system resists the horizontal-vertical illusion. Journal of Motor Behavior, 32, 400–404.
Servos, P., Goodale, M. A., & Jakobson, L. S. (1992). The role of binocular vision in prehension: A kinematic analysis. Vision Research, 32, 1513–1521.
Smeets, J. B., & Brenner, E. (1999). A new view on grasping. Motor Control, 3, 237–271.
Smeets, J. B., & Brenner, E. (2001). Independent movements of the digits in grasping. Experimental Brain Research, 139, 92–100.
Smeets, J. B., & Brenner, E. (2008). Grasping Weber’s law. Current Biology, 18, R1089–R1090.
Smeets, J. B., Brenner, E., & Martin, J. (2009). Grasping Occam’s razor. Advances in Experimental Medicine and Biology, 629, 499–522.
Snyder, L. H., Batista, A. P., & Andersen, R. A. (1997). Coding of intention in the posterior parietal cortex. Nature, 386, 167–170.
Stöttinger, E., & Perner, J. (2006). Dissociating size representation for action and for conscious judgment: Grasping visual illusions without apparent obstacles. Consciousness and Cognition, 15, 269–284.
Stöttinger, E., Soder, K., Pfusterschmied, J., Wagner, H., & Perner, J. (2010). Division of labour within the visual system: Fact or fiction? Which kind of evidence is appropriate to clarify this debate? Experimental Brain Research, 202, 79–88.
Striemer, C., Chapman, C. S., & Goodale, M. A. (2009). “Realtime” obstacle avoidance in the absence of primary visual cortex. Proceedings of the National Academy of Sciences USA, 106, 15996–16001.
Tanaka, K. (2003). Columns for complex visual object features in the inferotemporal cortex: Clustering of cells with similar but slightly different stimulus selectivities. Cerebral Cortex, 13, 90–99.
Tavassoli, A., & Ringach, D. L. (2010). When your eyes see more than you do. Current Biology, 20, R93–R94.
Thaler, L., & Goodale, M. A. (2010). Beyond distance and direction: The brain represents target locations non-metrically. Journal of Vision, 10, 3.1–27.
Tipper, S. P., Howard, L. A., & Jackson, S. R. (1997). Selective reaching to grasp: Evidence for distractor interference effects. Visual Cognition, 4, 1–38.
Tootell, R. B. H., Tsao, D., & Vanduffel, W. (2003). Neuroimaging weighs in: Humans meet macaques in “primate” visual cortex. Journal of Neuroscience, 23, 3981–3989.
Tresilian, J. R. (1998). Attention in action or obstruction of movement? A kinematic analysis of avoidance behavior in prehension. Experimental Brain Research, 120, 352–368.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press.
Valyear, K. F., & Culham, J. C. (2010). Observing learned object-specific functional grasps preferentially activates the ventral stream. Journal of Cognitive Neuroscience, 22, 970–984.
Valyear, K. F., Culham, J. C., Sharif, N., Westwood, D., & Goodale, M. A. (2006). A double dissociation between sensitivity to changes in object identity and object orientation in the ventral and dorsal visual streams: A human fMRI study. Neuropsychologia, 44, 218–228.
van Bergen, E., van Swieten, L. M., Williams, J. H., & Mon-Williams, M. (2007). The effect of orientation on prehension movement time. Experimental Brain Research, 178, 180–193.
van de Kamp, C., & Zaal, F. T. J. M. (2007). Prehension is really reaching and grasping. Experimental Brain Research, 182, 27–34.
van Donkelaar, P. (1999). Pointing movements are affected by size-contrast illusions. Experimental Brain Research, 125, 517–520.
Van Essen, D. C., Lewis, J. W., Drury, H. A., Hadjikhani, N., Tootell, R. B., Bakircioglu, M., & Miller, M. I. (2001). Mapping visual cortex in monkeys and humans using surface-based atlases. Vision Research, 41, 1359–1378.
van Mierlo, C. M., Louw, S., Smeets, J. B., & Brenner, E. (2009). Slant cues are processed with different latencies for the online control of movement. Journal of Vision, 9, 25.1–8.
Vaughan, J., Rosenbaum, D. A., & Meulenbroek, R. G. (2001). Planning reaching and grasping movements: The problem of obstacle avoidance. Motor Control, 5, 116–135.
Verhagen, L., Dijkerman, H. C., Grol, M. J., & Toni, I. (2008). Perceptuo-motor interactions during prehension movements. Journal of Neuroscience, 28, 4726–4735.

Vilaplana, J. M., Batlle, J. F., & Coronado, J. L. (2004). A neural model of hand grip formation during reach to grasp. 2004 IEEE International Conference on Systems, Man, and Cybernetics, 1–7, 542–546.
Vishton, P. M., & Fabre, E. (2003). Effects of the Ebbinghaus illusion on different behaviors: One- and two-handed grasping; one- and two-handed manual estimation; metric and comparative judgment. Spatial Vision, 16, 377–392.
Warner, C. E., Goldshmit, Y., & Bourne, J. A. (2010). Retinal afferents synapse with relay cells targeting the middle temporal area in the pulvinar and lateral geniculate nuclei. Frontiers in Neuroanatomy, 4, 8.
Warren, W. H., & Fajen, B. R. (2004). From optic flow to laws of control. In L. M. Vaina, S. A. Beardsley, & S. K. Rushton (Eds.), Optic flow and beyond (pp. 307–337). Norwell, MA: Kluwer Academic.
Watt, S. J., & Bradshaw, M. F. (2000). Binocular cues are important in controlling the grasp but not the reach in natural prehension movements. Neuropsychologia, 38, 1473–1481.
Watt, S. J., & Bradshaw, M. F. (2003). The visual control of reaching and grasping: Binocular disparity and motion parallax. Journal of Experimental Psychology: Human Perception and Performance, 29, 404–415.
Weiskrantz, L. (1997). Consciousness lost and found: A neuropsychological exploration. Oxford, UK: Oxford University Press.

Westwood, D. A., & Goodale, M. A. (2003). Perceptual illusion and the real-time control of action. Spatial Vision, 16, 243–254.
Westwood, D. A., Heath, M., & Roy, E. A. (2000). The effect of a pictorial illusion on closed-loop and open-loop prehension. Experimental Brain Research, 134, 456–463.

Notes: (1) Several solutions to the temporal coupling problem have been proposed (e.g., Hu, Osu, Okada, Goodale, & Kawato, 2005; Mon-Williams & Tresilian, 2001; Vilaplana, Batlle, & Coronado, 2004).

Melvyn A. Goodale

Melvyn A. Goodale, The Brain and Mind Institute, The University of Western Ontario


Development of Attention

M. Rosario Rueda
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0015

Abstract and Keywords

Functions of attention include achievement and maintenance of a state of alertness, selection of information from sensory input, and regulation of responses when dominant or well-learned behavior is not appropriate. These functions have been associated with activation of separate networks of brain areas. In this chapter, the developmental course of the attention networks during infancy and childhood and the neural mechanisms underlying their maturation are reviewed. Alerting is active early in infancy, although the ability to endogenously maintain the level of alertness develops through late childhood. The ability to orient to external stimulation is also present from quite early in life, and it is mostly the aspects of orienting related to the control of disengagement and voluntary orientation that improve during childhood. Executive attention starts developing by the end of the first year of life, showing major maturational changes during the preschool years. The efficiency of all three functions is subject to important individual differences, which may be due to both genetic endowment and educational and social experiences. In the final section, I discuss evidence indicating that efficiency of attention can be improved through training during childhood. Attention training has the potential to benefit aspects of behavior central to education and socialization processes.

Keywords: development, attention, cognitive development, brain development, attention networks, alerting, orienting, executive control

Attention has been a matter of interest to researchers since the emergence of psychology as an experimental science. Titchener (1909) gave attention a central role in cognition, considering it “the heart of the psychological enterprise.” Some years earlier, William James had provided an insightful definition of attention, which has become one of the most popular to this date: “Everyone knows what attention is. It is the taking possession of the mind in clear and vivid form of one out of what seem several simultaneous objects or trains of thought. Focalization, concentration and consciousness are of its essence. It implies withdrawal from some things in order to deal effectively with others …” (James, 1890).


Despite its subjective nature, this definition contains important insights about the various aspects that would be studied by scientists interested in attention in the times to follow. For instance, James’ definition suggests that we are limited in the amount of information we can consciously process at a time, and he attributed to the attention system the function of selecting the relevant out of all the stimulation available to our sensory systems. What is relevant to a particular individual depends on the current goals and intentions of that individual. Therefore, although implicitly, it is also suggested that attention is intrinsically connected to volition. Additionally, words like “concentration” and “focalization” point to the effortful and resource-consuming nature of the continuous monitoring between the flow of information and goals of the individual that is required in order to know what stimulation is important to attend to at each particular moment.

After James’ insightful ideas on attention, the research on inner states and cognitive processes was for the most part neglected owing to an exaggerated emphasis on “observable” behavior during most of the first half of the twentieth century. However, when the interest in cognition was reinstituted during and after World War II, all the multiple components of the attentional construct contained in James’ definition, such as selection, orienting, and executive control, were further developed by many authors, including Donald Hebb (1949), Colin Cherry (1953), Donald Broadbent (1958), Daniel Kahneman (1973), and Michael Posner (1978, 1980), among others.

In this chapter, I first review a variety of constructs studied within the field of attention. Then, I discuss the anatomy of brain networks related to attention as revealed by imaging studies. Afterward, I present evidence on the developmental time course of the attention networks during infancy and childhood. Next, I discuss how individual differences in attentional efficiency have been studied over the past years as well as the influence of both constitutional and experiential factors on the development of attention. Finally, I present recent research focused on optimizing attentional capacities during development.

Constructs of Attention and Marker Tasks

The cognitive approach to attention provides a variety of models and conceptual frameworks for studying its development across the lifespan and underlying neural mechanisms. Perhaps the most general question is whether to think of attention as one thing or as a number of somewhat separate issues. Hebb (1949) argued that all stimuli had two effects: (1) providing information about the nature of the stimulating event after following the major sensory pathways in the brain; and (2) keeping the cortex tuned in the waking state through the reticular activating system pathway. A classic distinction in the field is to divide attention by considering separately the intensive and selective aspects (Kahneman, 1973). Alerting effects can be contrasted with selective attention, which involves committing processing resources to some particular event. These aspects can in turn be separated from the role of attention in cognitive control, which is needed when situations call for a careful and volitional control of thoughts and behavior as opposed to when responses to stimulation can be automatically triggered (Norman & Shallice, 1986; Posner & Snyder, 1975). These three aspects of attention are further reviewed below.

Attention as State

The attention state of an organism varies according to changes in both internal and external conditions. Generally, alertness reflects the state of the organism for processing information and is an important condition in all tasks. Intrinsic or tonic alertness refers to a state of general wakefulness, which clearly changes over the course of the day from sleep to waking and within the waking state from sluggish to highly alert. Sustained attention and vigilance have been defined as the ability to maintain a certain level of arousal and alertness that allows responsiveness to external stimulation (Posner, 1978; Posner & Boies, 1971). Tonic alertness requires mental effort and is subject to top-down control of attention (Posner, 1978); thus, the attention state is likely to diminish after a period of maintenance. In all tasks involving long periods of processing, the role of changes of state may be important. Thus, vigilance or sustained attention effects probably rest at least in part on changes in the tonic alerting system.

The presentation of an external stimulus also increases the state of alertness. Preparation from warning cues (phasic alertness) can be measured by comparing the speed and accuracy of response to stimulation with and without warning signals (Posner, 2008). Warning cues appear to accelerate processes of response selection (Hackley & Valle-Inclán, 1998), which commonly results in increased response speed. Depending on the conditions, this effect can be automatic, as occurs with an auditory accessory event that does not predict a target, but it can also be partly due to voluntary actions based on information conveyed by the cue about the time of the upcoming target. The reduction of reaction time (RT) following a warning signal is accompanied by vast changes in the physiological state of the organism. These changes are thought to bring on the suppression of ongoing activity in order to prepare the system for a rapid response. This often happens before the response has been sufficiently contrasted, as with short intervals between warning cue and target, leading to reduced accuracy in performance (Morrison, 1982; Posner, 1978).

Attention as Selectivity

Attention is also an important mechanism for conscious perception. When examining a visual scene, there is the general feeling that all information about it is available. However, important changes can occur in the scene without being noticed by the observer, provided they take place away from the focus of attention. For instance, changes in the scene are often completely missed if cues that are normally effective in producing a shift of attention, such as luminance changes or movements, are suppressed (Rensink, O’Regan, & Clark, 1997), a phenomenon called change blindness. Something similar happens in the auditory modality. A classic series of studies conducted by Colin Cherry (1953) in which different stimuli were presented simultaneously to the two ears of individuals showed that information presented to the unattended ear could go totally unnoticed.

To explain the role of attention in the selection of information, Broadbent (1958) developed a model that considers that attention acts as a filter, which selects a channel of entry and sends information to a perceptual processing system of limited capacity. Later on, Hillyard and colleagues studied the mechanisms of attentional selection. Using event-related potentials (ERP), they showed that attended stimuli generate early ERP components of larger amplitude than unattended ones, suggesting that attention facilitates perceptual processing of attended information (Hillyard, 1985; Mangun & Hillyard, 1987). Attention is thus considered a mechanism that allows filtering out irrelevant information and gives priority to relevant information for conscious processing.

Much of the selection of external stimulation is achieved by orienting to the source of information. This orientation can be carried out overtly, as when head and eye movements are directed toward the source of input, or it can be carried out covertly, as when only attention is oriented to the stimulation. Posner (1980) developed a cueing paradigm to study orienting of attention. In this task, a visual target appears at one of two possible locations, either to the right or left of fixation, and participants are asked to detect its presence by pressing a key. Before the target, a cue is presented at one of the two possible locations. When the preceding cue appears at the same location as the subsequent target (valid cue), a reduction in RT is observed compared with when there is no cue or it is displayed at fixation (neutral cue). This allows measuring benefits in RT due to moving attention to the location of the target before its appearance. On the contrary, when the cue is presented at a different location from the target (invalid cue), an RT cost is observed with respect to neutral cues, given that attention must be disengaged from the wrong location and moved to where the target appeared. The same result is obtained even when insufficient time for a saccadic movement is allowed between cue and target (i.e., less than 200 milliseconds). This shows that attention can be oriented independently of eye and head movements and that covert shifts of attention also enhance the speed of responding. However, how free covert attention is from the eye movement systems is still a debated matter (see Klein, 2004).

Another important distinction concerning orienting of attention is related to whether attention shifts are triggered by external stimulation or voluntarily generated by the individual. In Posner’s paradigm, this is studied by manipulating the nature, location, and percentage of valid cues. Cues that consist of luminance changes at peripheral locations are thought to capture attention automatically. On the contrary, cues that are presented at fixation and must be interpreted before attention is moved, for example, an arrow pointing left or right, are considered to cause attention shifts that are endogenously generated. In addition, if cues reliably predict the location of targets, as when they are valid in more than 50 percent of the trials, they are more likely to induce voluntary shifts of attention. The distinction between endogenous and exogenous orienting of attention is important because they appear to be two relatively independent mechanisms with separate underlying neuroanatomy (Corbetta & Shulman, 2002). In developmental terms, this distinction is important to understanding the development of orienting from birth to early adolescence, as will be discussed later.
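The benefit and cost measures of the cueing paradigm described above reduce to differences between mean RTs in the valid, neutral, and invalid cue conditions. The following is a minimal sketch of that arithmetic, not a procedure taken from the cited studies; the function name `cueing_effects` is hypothetical and the RT values are invented for illustration.

```python
from statistics import mean

def cueing_effects(rt_valid, rt_neutral, rt_invalid):
    """Return (benefit, cost, validity_effect) in ms from per-trial RTs.

    Hypothetical helper: each argument is a list of RTs (ms) for one cue condition.
    """
    m_valid = mean(rt_valid)
    m_neutral = mean(rt_neutral)
    m_invalid = mean(rt_invalid)
    benefit = m_neutral - m_valid    # speed-up when attention is already at the target location
    cost = m_invalid - m_neutral     # slow-down from disengaging and moving attention
    validity_effect = m_invalid - m_valid
    return benefit, cost, validity_effect

# Illustrative RTs in ms (invented, not data from any study)
benefit, cost, validity = cueing_effects(
    rt_valid=[300, 310, 290],
    rt_neutral=[330, 340, 320],
    rt_invalid=[370, 380, 360],
)
print(benefit, cost, validity)
```

Positive values of `benefit` and `cost` correspond to the pattern described in the text: faster responses after valid cues and slower responses after invalid cues, both relative to the neutral condition.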


Studies using functional magnetic resonance imaging (fMRI) and cellular recording have demonstrated that brain areas that are activated by cues, such as the superior parietal lobe and temporal-parietal junction, play a key role in modulating activity within primary and extrastriate visual systems when attentional orienting occurs (Corbetta & Shulman, 2002; Desimone & Duncan, 1995). This shows that the attentional system achieves selection by modulating the functioning of sensory systems. Attention can thus be considered a domain-general system that regulates the activation of domain-specific sensory processing systems.

Inhibition is also an important mechanism of attentional orienting. In cognitive tasks, inhibition is inferred from an increase in RT or an increased error rate. In orienting tasks, when a peripheral cue is presented more than half a second before a target, inhibition takes place at the location of the cue and RT at that location increases, an effect called inhibition of return (IOR; Posner & Cohen, 1984). IOR is thought to fulfill an important function because it prevents reexamination of locations that have already been explored (Klein, 2000). Because alertness is also increased with changing environmental conditions, it seems that orienting and alerting bias the organism for novelty and change.

Attention as Executive Control

As with orienting to external sources of stimulation, attention can be directed internally to coordinate memories, thoughts, and emotions. The phenomenon called negative priming is an example of selection at the level of memory representations. Negative priming consists of increased RT to stimuli that have been previously ignored. This effect can be accounted for by an inhibitory process that acts on the representation of the ignored information, allowing the system to focus on information relevant to current actions (Houghton & Tipper, 1994). In addition, a mechanism similar to inhibition of return has been described by Fuentes (2004) in the domain of representations in semantic memory. Negative priming can be observed for targets preceded by semantically related primes with sufficiently long intervals between the two events. It is as if inhibiting the representation of a concept extends to semantically related representations, making them harder to reactivate shortly after if needed. Likewise, attention has been proposed as a central mechanism for the control of working memory representations (Baddeley, 1993).

Attention is also an important mechanism for action monitoring, particularly when selection has to be achieved in a voluntary effortful mode. Posner and Snyder (1975) first argued for the central role of attention in cognitive control. In well-practiced tasks, action coordination does not require attentional control because responses can be automatically triggered by the stimulation. However, attention is needed in a variety of situations in which automatic processing is either not available or is likely to produce inappropriate responses. Years later, Norman and Shallice (1986) developed a cognitive model for distinguishing between automatic and controlled modes of processing. According to their model, psychological processing systems rely on a number of hierarchically organized schemas of action and thought used for routine actions. These schemas are automatically triggered and contain well-learned responses or sequences of actions. However, a different mode of operation involving the attention system is required when situations call for more carefully elaborated responses. These are situations that involve (1) novelty, (2) error correction or troubleshooting, (3) some degree of danger or difficulty, or (4) overcoming strong habitual responses or tendencies.

A way to study cognitive control in the lab consists of inducing conflict between responses by instructing people to execute a subdominant response while suppressing a dominant tendency. A basic measure of conflict interference is provided by the Stroop task (Stroop, 1935). The original form of this task requires subjects to look at words denoting colors and to report the color of ink the words are written in instead of reading them. Presenting incongruent perceptual and semantic information (e.g., the word “blue” written with red ink) induces conflict and produces a delay in response time compared with when the two sources of information match. The Flanker task is another widely used method to study conflict resolution. In this task, the target is surrounded by irrelevant stimulation that can either match or conflict with the response required by the target (Eriksen & Eriksen, 1974). As with the Stroop task, resolving interference from distracting incongruent stimulation delays RT. Cognitive tasks involving conflict have been extensively used to measure the efficiency with which control of action is exerted. The extent of interference is usually measured by subtracting average RT in nonconflict conditions from that of conditions involving conflict. The idea is that additional involvement of the attention system is required to detect conflict and resolve it by inhibiting the dominant but inappropriate response (Botvinick, Braver, Barch, Carter, & Cohen, 2001; Posner & DiGirolamo, 1998). Thus, larger interference scores are interpreted as indicative of less efficiency of cognitive control.

Another important form of attention regulation is related to the ability to detect and correct errors. Detection and monitoring of errors has been studied using ERP (Gehring, Gross, Coles, Meyer, & Donchin, 1993). A large negative deflection over midline frontal channels, called the error-related negativity (ERN), is often observed about 100 ms after the commission of an error. This effect has been associated with an attention-based self-regulatory mechanism (Dehaene, Posner, & Tucker, 1994) and provides a means to examine the emergence of this function during infancy and childhood (Berger, Tzur, & Posner, 2006).

Posner’s Model of Attention

All three aspects of attention considered above are simultaneously involved in much of our behavior. However, the distinction between alerting, orienting, and executive control has proved useful to understanding the neural basis of attention. Over the past decades, Posner and colleagues have developed a neurocognitive model of attention. Posner proposes a division of attention into three distinct brain networks. One of these involves changes of state and is called alerting. The other two are closely involved with selection and are called orienting and executive attention (Posner, 1995; Posner & Fan, 2008; Posner & Petersen, 1990; Posner, Rueda, & Kanske, 2007). The alerting network deals with the intensive aspect of attention related to how the organism achieves and maintains the alert state. The orienting network deals with selective mechanisms operating on sensory input. Finally, the executive network is involved in the regulation of thoughts, feelings, and behavior.

The three attention networks in Posner’s model are not considered completely independent. For instance, as mentioned earlier, alerting signals produce faster but more inaccurate responses, a result that Posner interpreted as an inhibitory interaction between the alerting and executive networks (Posner, 1978). On the other hand, alerting appears to facilitate the orienting of attention by speeding up attention shifts. Also, by focusing on targets, orienting aids in filtering out irrelevant information, thus enhancing the function of executive attention (see Callejas, Lupiáñez, & Tudela, 2004, for further discussion on attention network interactions). However, despite their interacting nature, the three attention networks have been shown to have relatively independent neuroanatomies and appear to involve distinct neuromodulatory mechanisms (Posner, Rueda, & Kanske, 2007; Posner & Fan, 2008; see also Table 15.1).

Several years ago, Fan et al. (2002) developed an experimental task to study the functioning of the three attentional networks, called the attention network task (ANT). The task is based on traditional experimental paradigms to study the functions of alerting (preparation cues), orienting (orienting cues), and executive control (flanker task) (Figure 15.1). Completion of the task allows calculation of three scores related to the efficiency of the attention networks. The alerting score is calculated by subtracting RT to trials with double cue from RT to trials with no cue. This provides a measure of the benefit in performance by having a signal that informs about the imminent appearance of the target and by using this information to get ready to respond. The orienting score provides a measure of how much benefit is obtained in responding when information is given about the location of the upcoming target. It is calculated by subtracting RT to spatial cue trials from that of central cue trials. Finally, the executive attention score indicates the amount of interference experienced in performing the task when stimulation conflicting with the target is presented in the display. It is calculated by subtracting RT to congruent trials from RT to incongruent trials. Larger scores indicate more interference from distractors and therefore less efficiency of conflict resolution mechanisms (executive attention).

Posner’s view of attention as an organ system with its own functional anatomy (Posner & Fan, 2008) provides a model of great heuristic power. Connecting cognitive and neural levels of analysis aids in answering many complex issues related to attention, such as the maturational processes underlying its development and factors that are likely to influence maturation, such as genes and experience. In the next section, the anatomy of the three attention networks in Posner’s model is briefly described.
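The three ANT scores described above are each a subtraction between mean RTs in two trial types. A minimal sketch of that calculation follows, under several assumptions that are not part of the original task description: trials are represented as hypothetical `(cue, flanker, rt)` tuples, the condition labels are invented, means are taken over all trials supplied (error trials and outliers are assumed to have been excluded beforehand), and the RT values are illustrative only.

```python
from statistics import mean

def ant_scores(trials):
    """Compute alerting, orienting, and executive scores (ms).

    `trials` is a list of (cue, flanker, rt) tuples; this layout and the
    labels used below are assumptions made for the example.
    """
    def mean_rt(index, value):
        # Mean RT over all trials whose field at `index` equals `value`
        return mean(t[2] for t in trials if t[index] == value)

    return {
        # RT(no cue) minus RT(double cue)
        "alerting": mean_rt(0, "none") - mean_rt(0, "double"),
        # RT(central cue) minus RT(spatial cue)
        "orienting": mean_rt(0, "central") - mean_rt(0, "spatial"),
        # RT(incongruent) minus RT(congruent); larger means more interference
        "executive": mean_rt(1, "incongruent") - mean_rt(1, "congruent"),
    }

# Illustrative trials (invented RTs in ms), one per cue x flanker combination
trials = [
    ("none", "congruent", 560), ("none", "incongruent", 660),
    ("double", "congruent", 520), ("double", "incongruent", 620),
    ("central", "congruent", 530), ("central", "incongruent", 630),
    ("spatial", "congruent", 480), ("spatial", "incongruent", 580),
]
scores = ant_scores(trials)
print(scores)
```

Because each score is a simple difference, it inherits the noise of both conditions it compares, so in practice the scores are usually computed per participant over many trials per condition.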

Neuroanatomy of Attention

The emergence of the field of cognitive neuroscience constituted a turning point in the study of the relationship between brain and cognition from which the study of attention has benefited greatly. Functional neuroimaging has allowed many cognitive tasks to be analyzed in terms of the brain areas they activate (Posner & Raichle, 1994). Studies of attention have been among the most often examined in this way (Corbetta & Shulman, 2002; Driver, Eimer, & Macaluso, 2007; Posner & Fan, 2008; Wright & Ward, 2008), and perhaps the areas of activation have been more consistent for the study of attention than for any other cognitive system (Hillyard, Di Russo, & Martinez, 2006; Posner et al., 2007; Raz & Buhle, 2006). A summary of the anatomy and neurotransmitters involved in the three networks is shown in Table 15.1.

Alerting Network

Research in the past years has shown that structures of the right parietal and right frontal lobes as well as a number of midbrain neural modulators such as norepinephrine and dopamine are involved in alertness (Sturm et al., 1999). Arousal of the central nervous system involves input from brainstem systems that modulate activation of the cortex. Primary among these is the locus coeruleus, which is the source of the brain’s norepinephrine. It has been demonstrated that the influence of warning signals operates via this brain system because drugs that block it also prevent the changes in the alert state that lead to improved performance after a warning signal is provided (Coull, Nobre, & Frith, 2001; Marrocco & Davidson, 1998).


Table 15.1 Marker Tasks, Anatomy, Neurochemistry, and Genetics of Attention Networks

Network | Marker Tasks | Anatomy | Neurotransmitters | Genes
Alerting | Warning signals (phasic); CPT, tasks of sustained attention (tonic) | Locus coeruleus; parietal and frontal cortex (right: tonic; left: phasic) | Noradrenaline | MAOA, ADRA2A, NET
Orienting | Dual tasks (selectivity); cueing task (orienting); visual search | Superior colliculus; superior parietal lobe; temporal-parietal junction; inferior frontal cortex; frontal eye fields | Acetylcholine | CHRNA4, CHRNA7
Executive attention | Conflict tasks; inhibition (go/no-go) | Anterior cingulate cortex; prefrontal cortex | Dopamine | COMT, DRD4, DAT1, DBH


The involvement of frontal and parietal regions of the right hemisphere is supported by studies showing that lesions in those areas impair patients’ ability to maintain the alert state in the absence of warning signals (Dockree et al., 2004; Sturm et al., 1999). Imaging studies have also supported the involvement of right frontal-parietal structures in the endogenous maintenance of alertness (Coull, Frith, Frackowiak, & Grasby, 1996; Dockree et al., 2004). However, the neural basis for tonic alertness may differ from that involved in phasic changes of alertness following warning cues. Warning signals provide a phasic change in level of alertness over millisecond intervals. This change involves widespread variation in autonomic signals, such as heart rate (Kahneman, 1973), and cortical changes, such as the contingent negative variation (CNV; Walter, 1964). Several studies have shown that the CNV is generated by activation in the frontal lobe, with specific regions depending on the type of task being used (Cui et al., 2000; Tarkka & Basile, 1998). When using fixed cue–target intervals, warning signals are informative of when a target will occur, thus producing a preparation in the time domain. Under these conditions, warning cues appear to activate frontal-parietal structures in the left hemisphere, instead of the right (Coull, Frith, Büchel, & Nobre, 2000; Nobre, 2001). Using the ANT, Fan et al. (2005) observed cortical frontal-parietal activation that was stronger in the left hemisphere following warning cues, along with activation in the thalamus.

Orienting Network

The orienting system for visual events has been associated with posterior brain areas including the superior parietal lobe, the temporal-parietal junction, and the frontal eye fields. Lesions of the parietal lobe and superior temporal lobe have been consistently related to difficulties in orienting (Karnath, Ferber, & Himmelbach, 2001). Activation of cortical areas has been specifically associated with operations of disengagement from the current focus of attention. Moving attention from one location to another involves the superior colliculus, whereas engaging attention requires thalamic areas such as the pulvinar nucleus (Posner & Raichle, 1994). Also, Corbetta and Shulman (2002) reviewed a series of imaging studies and showed that partially segregated networks appear to be involved in endogenous and exogenous orientation of attention. Goal-directed or top-down selection involves activation of a dorsal network that includes intraparietal and frontal cortices, whereas stimulus-driven (exogenous) attention activates a ventral network consisting of the temporal-parietal junction and inferior frontal cortex, mostly in the right hemisphere. The ventral network is involved in orienting attention in a reflexive mode to salient events and has the capacity to overcome the voluntary orientation associated with the dorsal system.

The function of the orienting network appears to be modulated by acetylcholine (ACh). It has been shown that lesions of cholinergic systems in the basal forebrain in monkeys interfere with orienting attention (Voytko et al., 1994). In addition, administration of a muscarinic antagonist, scopolamine, appears to delay orientation to spatial cues but not to cues that only have an alerting effect (Davidson, Cutrell, & Marrocco, 1999). Further evidence shows that the parietal cortex is the site where this modulation takes place. Injections of scopolamine directly in parietal areas containing cells that respond to spatial cues affect the ability to shift attention to the cued location. Systemic injections of scopolamine have a smaller effect on covert orienting of attention than do local injections in the parietal area (Davidson & Marrocco, 2000). These observations in the monkey have also been confirmed by similar studies in the rat (Everitt & Robbins, 1997) and by studies with nicotine in humans (Newhouse, Potter, & Singh, 2004).

Executive Attention Network

We know from adult brain imaging studies that Stroop tasks activate the anterior cingulate cortex (ACC). In a meta-analysis of imaging studies, the dorsal section of the ACC was activated in response to cognitive conflict tasks such as variants of the Stroop task, whereas the ventral section appeared to be mostly activated by emotional tasks and emotional states (Bush, Luu, & Posner, 2000). The two divisions of the ACC also seem to interact in a mutually exclusive way. For instance, when the cognitive division is activated, the affective division tends to be deactivated, and vice versa, suggesting the possibility of reciprocal effortful and emotional controls of attention (Drevets & Raichle, 1998). Also, resolving conflict from incongruent stimulation in the flanker task activates the dorsal portion of the ACC together with other regions of the lateral prefrontal cortex (Botvinick, Nystrom, Fissell, Carter, & Cohen, 1999; Fan, Flombaum, McCandliss, Thomas, & Posner, 2003). Different parts of the ACC appear to be well connected to a variety of other brain regions, including limbic structures and parietal and frontal areas (Posner, Sheese, Odludas, & Tang, 2006).

Support for the voluntary exercise of self-regulation comes from studies that examine either the instruction to control affect or the connections involved in the exercise of that control. For example, the instruction to avoid arousal during processing of erotic events (Beauregard, Levesque, & Bourgouin, 2001) or to ward off emotion when looking at negative pictures (Ochsner, Bunge, Gross, & Gabrieli, 2002) produces a locus of activation in midfrontal and cingulate areas. In addition, if people are required to select an input modality, the cingulate shows functional connectivity to the selected sensory system (Crottaz-Herbette & Menon, 2006). Similarly, when involved with emotional processing, the cingulate shows a functional connection to limbic areas (Etkin, Egner, Peraza, Kandel, & Hirsch, 2006). These findings support the role of cingulate areas in the control of cognition and emotion.

As with the previous networks, pharmacological studies conducted with monkeys and rats have aided our understanding of the neurochemical mechanisms affecting efficiency of the executive attention network. In this case, data suggest that dopamine (DA) is the important neurotransmitter for executive control. Blocking DA in the dorsolateral prefrontal cortex (DLPFC) of rhesus monkeys causes deficits on tasks involving inhibitory control (Brozoski, Brown, Rosvold, & Goldman, 1979). Additionally, activation of mesocortical dopaminergic neurons in rats enhances activity in the prefrontal cortex (McCulloch, Savaki, McCulloch, Jehle, & Sokoloff, 1982), as does expression of DA receptors in the anterior cingulate cortex (Stanwood, Washington, Shumsky, & Levitt, 2001).


Development of Attention Networks

Each of the functions of attention considered in the neurocognitive model just described is present to some degree in infancy, but each undergoes a long developmental process. In the next sections, the development of these functions is traced from birth to adolescence.

Infancy and Toddlerhood

Attention in infancy is less developed than later in life, and the functions of alerting, orienting, and executive control in particular are less independent during infancy. In their volume Attention in Early Development, Ruff and Rothbart (1996) extensively reviewed the development of attention across early childhood. They suggest that both reactive and self-regulatory systems of attention are at play during the first years of life. Initially, reactive attention is involved in more automatic engagement and orienting processes. Then, by the end of the first year of life, attention can be more voluntarily controlled. Across the toddler and preschool years, the self-regulatory system increasingly assumes control of attentional processes, allowing for a more flexible and goal-oriented control of attentional resources. In this section, I first discuss components of attention related to the state of engagement and selectivity and then consider attention in relation to self-regulation.

The early life of the infant is concerned with changes in state. Sleep dominates at birth, and the waking state is relatively rare at first. The newborn infant spends nearly three-fourths of the time sleeping (Colombo & Horowitz, 1987). There is a dramatic change in the percentage of time in the waking state over the first 3 months of life. By the 12th postnatal week, the infant has become able to maintain the alert state during much of the daytime hours, although this ability still depends heavily on external sensory stimulation, much of it provided by the caregiver.

Newborns show head and eye movements toward novel stimuli, but the control of orienting is initially largely in the hands of caregiver presentations. Eye movements are preferentially directed toward moving stimuli and have been shown to depend on properties of the stimulus, for example, how closely it resembles a human face (Johnson & Morton, 1991).
Much of the response to external stimuli involves orienting toward the source of stimulation. The orienting response is a relatively automatic or involuntary response in reaction to moderately intense changes in stimulation (Sokolov, 1963). It depends on the physical characteristics of the stimulation, such as novelty and intensity. With the orienting response, the organism is alerted and prepared to learn more about the event in order to respond appropriately. Orienting responses are accompanied by a deceleration in heart rate that is sustained during the attention period (Richards & Casey, 1991). In babies, other physical reactions involve decreases in motor activity, sucking, and respiration (Graham, Anthony, & Ziegler, 1983). The heart rate deceleration is observed after 2 months of age and increases in amplitude until about 9 months. Subsequently, the amplitude decreases until it approximates the adult response, which consists of a small deceleration in response to novel stimulation (Graham et al., 1983). Orienting responses of greater magnitude lead to more sustained periods of focused attention, which likely increase babies’ opportunities to explore objects and scenes. Infants also become more able to recognize objects, partly because of maturation of temporal-parietal structures and increases in neural transmission. As this happens, the novelty of objects, and hence their capacity to alert, diminishes, causing shorter periods of sustained attention. Reductions of automatic orientation may then facilitate the emergence of self-initiated voluntary attention (Ruff & Rothbart, 1996).

The most frequent method of studying orienting in infancy involves the use of eye-movement tracking devices. As in adults, there is a close relation between the direction of gaze and the infant’s attention. The attention system can be driven to particular locations by external input from birth (Richards & Hunter, 1998); however, orientation to the source of stimulation continues to improve in precision over many years. Infants’ eye movements often fall short of the target, and peripheral targets are often foveated by a series of head and eye movements. Although not as easy to track, the covert system likely follows a similar trajectory. One strategy to examine covert orienting consists of presenting brief cues that do not produce an eye movement followed by targets that do. Using this strategy, it has been shown that the speed of the eye movement to the target is enhanced by the cue, and this enhancement improves over the first year of life (Butcher, 2000). In more complex situations, for example, when there are competing targets, the improvement may go on for longer periods (Enns & Cameron, 1987).
The orienting network appears to be fully functional by around 6 months of age. During the first months of life there is a progressive development of the connection between visual processing pathways and parietal systems involved in attentional control. This maturation allows for visual orientation to be increasingly under attentional control. The dorsal visual pathway, primarily involved in processing spatial properties of objects and locations, matures earlier compared with the ventral visual pathway, involved in object identification. This explains why infants show preference for novel objects instead of novel locations (Harman, Posner, Rothbart, & Thomas-Thrapp, 1994). Preference for novel locations is in fact shown from very early on. Inhibition of return is shown by newborns for very salient objects such as lights (Valenza, Simion, & Umiltà, 1994), and sometime later, at about 6 months of age, for more complex objects (Harman et al., 1994).

Gaining control over disengaging attention is also necessary to be able to shift between objects or locations. Infants in the first 2 or 3 months of life often have a hard time disengaging from salient objects and events and might become distressed before they are able to move away from the target. By 4 months, however, infants become more able to look away from central displays (Johnson, Posner, & Rothbart, 1991).

From then on, the latency to turn from central to peripheral displays decreases substantially with age (Casey & Richards, 1988). Before disengaging attention, the heart rate begins to accelerate back to preattentive levels, and infants become more distractible. After termination of attention, there seems to be a refractory period of about 3 seconds during which orienting to a novel event in the previous location or a nearby one is inhibited (Casey & Richards, 1991), a process that might be related to inhibition of return. The ability to disengage gaze is achieved before the capacity to disengage attention. The voluntary disengagement of attention requires further inhibitory skills that appear to emerge later on, at about 18 months of age (Ruff & Rothbart, 1996).

Orienting to sensory input is a major mechanism for regulation of distress. Decrements in heart rate that occur with orienting responses are likely to have a relaxing effect in infants. In fact, it has been reported that greater orienting skill in the laboratory is associated with lower temperamental negative emotion and greater soothability as reported by parents (Johnson et al., 1991). Additional evidence of the regulatory function of attention is provided by caregivers’ attempts to distract their infants by bringing their attention to other stimuli. As infants orient, they are often quieted, and their distress appears to diminish. In one study, 3- to 6-month-old infants were first shown a sound and light display; about 50 percent of the infants became distressed by the stimulation, but then strongly oriented to interesting visual and auditory soothing events when these were presented. While the infants oriented, facial and vocal signs of distress disappeared. However, as soon as the orienting stopped, the infants’ distress returned to almost exactly the levels shown before presentation of the soothing object, even when the quieted period lasted for as long as 1 minute (Harman, Rothbart, & Posner, 1997). The authors have speculated that an internal system involving the amygdala holds a computation of the initial level of distress, so that this initial level returns if the infant’s orientation to the novel event is lost.
Late infancy is the time when self-regulation develops. At about the end of the first year, executive attention-related frontal structures come into play, and this allows for a progressive increase in the duration of orientation based on goals and intentions. For instance, periods of focused attention during free play increase steadily after the first year of life (Ruff & Lawson, 1990). With the development of voluntary attention, young children become increasingly more sensitive to others’ line of regard, establishing a basis for joint attention and social influences on selectivity. Increasingly, infants are able to gain control of their own emotions and other behaviors, and this transition marks the emergence of the executive attention system.

Perhaps the earliest evidence of activation of the executive attention network is at about 7 months of age. As discussed earlier, an important form of self-regulation is related to the ability to detect errors, which has been linked to activation of the ACC. One study examined the ability of infants of 7 to 9 months to detect errors (Berger, Tsur, & Posner, 2006). In this study, infants observed a scenario in which one or two puppets were hidden behind a screen. A hand was seen to reach behind the screen and either add or remove a puppet. When the screen was removed, there was either the correct number of puppets or an incorrect number. Wynn (1992) found that infants of 7 months looked longer when the number was in error than when it was correct. Whether the increased looking time involved the same executive attention circuitry that is active in adults when they detect errors was simply unknown. Berger and colleagues replicated the Wynn study but used a 128-channel electroencephalogram (EEG) to determine the brain activity that occurred during error trials in comparison with that found when the infant viewed a correct solution. They found that the same EEG component over the same electrode sites differed between correct and erroneous displays both in infants and adults. This suggests that a brain anatomy similar to that found in adult studies is involved in infants’ ability to detect errors. Of course, activating this anatomy for observing an error is not the same as what occurs in adults, who actually slow down after an error and adjust their performance. However, it suggests that even very early in life, the anatomy of the executive attention system is at least partly functional.

Later in the first year of life, there is evidence of further development of executive functions, which may depend on executive attention. One example is Adele Diamond’s work using the “A not B” task and the reaching task. These two marker tasks involve inhibition of an action that is strongly elicited by the situation. In the “A not B” task, the experimenter shifts the location of a hidden object from location A to location B, after the infant’s retrieving from location A had been reinforced as correct in the previous trials (Diamond, 1991). In the reaching task, visual information about the correct route to a toy is put in conflict with the cues that normally guide reaching. The normal tendency is to reach for an object directly along the line of sight. In the reaching task, a toy is placed under a transparent box in front of the child. The opening of the box is on one of the lateral sides instead of the front side. In this situation, the infant can reach the toy only if the tendency to reach directly along the line of sight is inhibited.
Important changes in the performance of these tasks are observed from 6 to 12 months (Diamond, 2006). Comparison of performance between monkeys with brain lesions and human infants on the same marker tasks suggests that the tasks are sensitive to the development of the prefrontal cortex, and maturation of this brain area seems to be critical for the development of this form of inhibition.

Another task that reflects the executive system involves anticipatory looking in a visual sequence task. In the visual sequence task, stimuli are placed in front of the infant in a fixed and predictable sequence of locations. The infant’s eyes are drawn reflexively to the stimuli because they are designed to be attractive and interesting. After a few trials, some infants will begin to anticipate the location of the next target by correctly moving their eyes in anticipation of the target. Anticipatory looks are thought to reflect the development of a more voluntary attention system that might depend in part on the orienting network and also on the early development of the executive network. It has been shown that anticipatory looking occurs with infants as young as 3½ to 4 months (Clohessy, Posner, & Rothbart, 2001; Haith, Hazan, & Goodman, 1988). However, there are also important developments that occur during infancy (Pelphrey et al., 2004) and later (Garon, Bryson, & Smith, 2008). Learning more complex sequences of stimuli, such as sequences in which a location is followed by one of two or more different locations, depending on the location of previous stimuli within the sequence (e.g., location 1, then location 2, then location 1, then location 3, and so on), requires the monitoring of context and, in adult studies, has been shown to depend on lateral prefrontal cortex (Keele, Ivry, Mayr, Hazeltine, & Heuer, 2003). Infants of 4 months do not learn to go to locations where there is conflict as to which location is the correct one. The ability to respond when such conflict occurs is not present until about 18 to 24 months of age (Clohessy et al., 2001). At 3 years, the ability to respond correctly when there is conflict in the sequential looking task correlates with the ability to resolve conflict in a spatial conflict task (Rothbart, Ellis, Rueda, & Posner, 2003). These findings support the slow development of the executive attention network during the first and second years of life.

The visual sequence task is related to other features that reflect executive attention. One of these is the cautious reach toward novel toys. Rothbart and colleagues found that the slow, cautious reach of infants of 10 months predicted higher levels of effortful control as measured by parent report at 7 years of age (Rothbart, Ahadi, Hersey, & Fisher, 2001). Infants of 7 months who show higher levels of correct anticipatory looking in the visual sequence task also show longer inspection before reaching toward novel objects and slower reaching toward the object (Sheese, Rothbart, Posner, Fraundorf, & White, 2008). This suggests that successful anticipatory looking at 7 months is one feature of self-regulation. In addition, infants with higher levels of correct anticipatory looking also showed evidence for higher levels of emotionality in a distressing task and more evidence of efforts to self-regulate their emotional reactions. Thus, even at 7 months, the executive attention system is showing some properties of self-regulation, even though it is not yet sufficiently developed to resolve the simple conflicts used in the visual sequence task or the task of reaching away from the line of sight in the transparent box task.

An important question about early development of executive attention is its relation to the orienting network.
The findings to date suggest that orienting is playing some of the regulatory roles in early infancy that are later exercised by the executive network. I argued that the orienting network seems to have a critical role in regulation of emotion by the caregiver as early as 4 months. It has been recently shown that orienting as measured from the Infant Behavior Questionnaire at 7 months is not correlated with effortful control as measured in the same infants at 2 years (Sheese, Voelker, Posner, & Rothbart, 2009). However, orienting did show some relation with early regulation of emotional responding of the infants because it was positively related to positive affect and negatively related to negative affect. After toddlerhood, emotional control by orienting might experience a transition toward control by executive attention because a negative relationship between effortful control and negative affect has been repeatedly found in preschool-aged and older children (Rothbart & Rueda, 2005).

In 2001, Colombo presented a summary of attentional functions in infancy, which included alertness, spatial orienting, object-oriented attention, and endogenous attention (Colombo, 2001). This division is similar to the network approach, but divides orienting into space and features and includes the functions of interstimulus shifts and sustained attention as part of endogenous attention. Colombo argues that alerting reaches the mature state at about 4 months, orienting by 6 to 7 months, and endogenous attention by 4 to 5 years. This schedule follows the same order as the one discussed here, but as discussed in the next section, all these functions continue developing during childhood.

Childhood

Figure 15.1 The child ANT task. A warning tone is presented on half of the trials (no tone is presented on the other half). After a brief interval, a cue consisting of an asterisk is presented on 2/3 of the trials; it appears in the same location as the subsequent target (valid cue) 50% of the time and in the opposite location (invalid cue) the remaining 50%. Finally, a target is presented consisting of a colorful fish pointing either right or left. The target appears either above or below the fixation cross and is flanked by two fish on each side. The flanking fish may point in the same direction as the target (congruent trials) or in the opposite direction (incongruent trials), equally often. In successive trials, participants are instructed to discriminate the direction of the target fish as rapidly and accurately as possible, and usually both reaction time (RT) and accuracy of the response are registered.
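The trial structure described in the caption can be laid out as a fully crossed design. The following is a minimal sketch, not the published task code: the function name `build_child_ant_trials` and all field names are hypothetical, and it assumes that crossing three equally weighted cue levels (valid, invalid, none) reproduces the stated proportions (a cue on 2/3 of trials, half of them valid).

```python
from itertools import product


def build_child_ant_trials():
    """Enumerate one fully crossed block of child ANT trial types.

    Assumption (not from the source): every factor level is equally
    weighted, so three cue levels give a cue on 2/3 of the trials,
    split 50/50 between valid and invalid.
    """
    tones = [True, False]                     # warning tone on half of the trials
    cues = ["valid", "invalid", "none"]       # asterisk cue on 2/3 of the trials
    flankers = ["congruent", "incongruent"]   # flanking fish agree or disagree
    positions = ["above", "below"]            # target location relative to fixation
    directions = ["left", "right"]            # direction the target fish points
    return [
        {
            "tone": tone,
            "cue": cue,
            "flanker": flanker,
            "position": position,
            "direction": direction,
        }
        for tone, cue, flanker, position, direction in product(
            tones, cues, flankers, positions, directions
        )
    ]


trial_types = build_child_ant_trials()
```

One such block contains 48 trial types (2 × 3 × 2 × 2 × 2). In an actual session these would presumably be shuffled and repeated, a detail the caption does not specify.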

In the preschool years, children become more able to follow instructions and perform RT tasks. To study the development of attention functions across childhood, a child-friendly version of the ANT was developed (Rueda, Fan, et al., 2004). This version is structurally similar to the adult version but uses fish instead of arrows as target stimuli. This allows contextualization of the task in a game in which the goal is to feed the middle fish (target), or simply to make it happy, by pressing a key corresponding to the direction in which it points. After the response is made, feedback consisting of an animation of the middle fish is provided, which is intended to support the child’s motivation to complete the task.

Using the child ANT, the development of attention networks has been traced during the primary school period into early adolescence. In a first study, this task was used with children aged 6 to 10 years and in adults (Rueda, Fan, et al., 2004). Results showed separate developmental trajectories for each attention function. Alerting scores showed stability across early and middle childhood, but children’s scores were higher than the scores obtained by adults, suggesting further development of alerting during late childhood. Orienting scores showed no differences across ages, suggesting an early development of this network. However, in this study invalid cues were not used; thus, the load of operations of disengagement and reallocation of attention was rather low. Finally, flanker interference scores indexing executive attention efficiency showed considerable reduction from age 6 to 7 years. However, interference, as calculated with both RT and percentage of errors, remained about the same from age 7 years to adulthood, suggesting that early childhood is the period of major development of executive attention.

Recently, we have conducted a developmental study with a slightly modified version of the child ANT (see Figure 15.1). Callejas et al. (2004) suggested a modification of the ANT consisting of separating the presentation of alerting and orienting signals and including invalid orienting cues, as well as presenting alerting signals in the auditory modality. This modification has two potential advantages over the original ANT: (1) It allows measurement of alerting and orienting effects separately, and (2) it provides orienting scores with greater load of disengagement and reorienting operations. We modified the child ANT according to these suggestions and conducted a study with groups of 4- to 7-year-olds, 7- to 10-year-olds, 10- to 13-year-olds, and adults (Abundis, Checa, & Rueda, 2013). Again, data showed separate developmental courses for the three functions. Executive attention scores showed a progressive development across early and middle childhood and no differences between the oldest children and adults. Similarly, alerting scores were larger for the youngest groups (4–7 years and 7–10 years), with no differences between 10- to 13-year-olds and adults.
With invalid cues included, the orienting network followed a different developmental trajectory compared with the previous study. We observed larger orienting scores for all age groups compared with adults, which suggests that the development of operations of disengagement and reorienting of attention extends over late childhood. In this study, we also registered electrophysiological activation during performance of the task. Data revealed modulation of distinct ERP components associated with each network. Alerting cues produced early positive and negative components over frontal leads followed by the expected CNV. Consistent with the behavioral data, the two younger groups did not show the early frontal effect, indicating that activation of frontal structures related to response preparation is delayed in younger children with respect to older children and adults. Compared with valid orienting cues, invalid cues elicited larger P3 over parietal channels, an effect that was larger for children than adults and suggests their need for greater engagement of parietal structures in order to disengage and reorient attention. Finally, compared with congruent conditions, incongruent flankers produced a larger negativity around 300 ms in channels along the frontal midline in adults. This effect was more sustained for children and had a broader left-lateralized anterior distribution than in adults. Again, this suggests that the executive attention network has not reached its highest level of efficiency during early childhood; thus, additional frontal structures need to be activated during longer periods of time in order to reduce interference from flankers.
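The behavioral network scores discussed in this section are subtraction scores between task conditions. The sketch below illustrates that logic under stated assumptions: it uses mean RT on correct trials (published ANT analyses may instead use medians or other exclusion rules), and the function names, field names, and demo values are hypothetical rather than taken from any of the studies cited.

```python
from statistics import mean


def mean_correct_rt(trials, condition):
    """Mean RT (ms) over correct trials that satisfy `condition`.

    Raises statistics.StatisticsError if no trial qualifies.
    """
    return mean(t["rt"] for t in trials if t["correct"] and condition(t))


def network_scores(trials):
    """Subtraction scores for the three attention networks (illustrative)."""
    # Alerting: RT without warning tone minus RT with warning tone.
    alerting = (mean_correct_rt(trials, lambda t: not t["tone"])
                - mean_correct_rt(trials, lambda t: t["tone"]))
    # Orienting: RT after invalid cues minus RT after valid cues.
    orienting = (mean_correct_rt(trials, lambda t: t["cue"] == "invalid")
                 - mean_correct_rt(trials, lambda t: t["cue"] == "valid"))
    # Executive (flanker interference): incongruent minus congruent RT.
    executive = (mean_correct_rt(trials, lambda t: t["flanker"] == "incongruent")
                 - mean_correct_rt(trials, lambda t: t["flanker"] == "congruent"))
    return {"alerting": alerting, "orienting": orienting, "executive": executive}


# Made-up demo data; the last trial is an error and is excluded from all means.
demo_trials = [
    {"tone": True,  "cue": "valid",   "flanker": "congruent",   "rt": 500, "correct": True},
    {"tone": False, "cue": "valid",   "flanker": "congruent",   "rt": 560, "correct": True},
    {"tone": True,  "cue": "invalid", "flanker": "incongruent", "rt": 640, "correct": True},
    {"tone": False, "cue": "invalid", "flanker": "incongruent", "rt": 700, "correct": True},
    {"tone": True,  "cue": "none",    "flanker": "congruent",   "rt": 540, "correct": True},
    {"tone": False, "cue": "none",    "flanker": "incongruent", "rt": 660, "correct": False},
]

demo_scores = network_scores(demo_trials)
```

With these subtractions, a larger score means a larger RT difference between conditions, so a larger executive score reflects more interference from incongruent flankers.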


One of the strengths of the child ANT is that it is a theoretically grounded task that combines experimental strategies widely used to study alerting (e.g., warning signals), orienting (e.g., spatial cues), and attention control (e.g., flanker conflict) within the same task. However, much of the developmental research on attention has been conducted separately for each function. The main findings of this research are reviewed next.

Alerting

Several studies have examined developmental changes in phasic alertness between preschoolers, older children, and adults. Increasing age is generally associated with larger reductions in RT in response to warning cues. Developmental differences in response preparation may relate to the speed and maintenance of preparation while expecting the target. It has been shown that young children (5-year-olds) need more time than older children (8-year-olds) and adults to get full benefit from a warning cue (Berger, Jones, Rothbart, & Posner, 2000), and they also seem to be less able to sustain the optimal level of alertness over time (Morrison, 1982). Using the child ANT, Mezzacappa (2004) observed a trend toward higher alerting scores (difference between RT in trials with and without warning cues) with age in a sample of 5- to 7-year-old children. Increasing age was associated with larger reductions in RT in response to warning cues. Older children also show lower rates of omissions overall, which indicates greater ability to remain vigilant during the task period. The fact that alertness is subject to more fluctuations in younger children can in part explain age differences in processing speed because alertness is thought to speed the processing of subsequent events.

Sustained attention is frequently measured by examining variations in performance in a task along a relatively extended period of time, as in the continuous performance task (CPT). Variations in the level of alertness can be observed by examining the percentage of correct and/or omitted responses to targets or by means of indexes of perceptual sensitivity (d’) over time. With young children, the percentage of individuals that are able to complete a particular task can also be indicative of maturational differences in the ability to sustain attention.
In a study conducted with preschoolers, only 30 to 50 percent of 3- to 4-year-olds were able to complete the task, whereas the percentage rose to 70 percent for 4- to 4½-year-olds and close to 100 percent from that age up (Levy, 1980). Using the CPT, Danis et al. (2008) found considerable increases in the ability to maintain and regain attention after a distraction period between 2½ and 3½ years of age, and more consistent control of attention after 4½ years of age. However, even though the largest development of vigilance seems to occur during the preschool period, children continue to show larger declines in performance in CPT over time compared with adults through middle and late childhood, especially under more difficult task conditions. For instance, 7- to 9-year-old children show a larger decline in sensitivity (d’) and hits over time compared with adults in an auditory version of the CPT, which is thought to be more challenging than the visual version of the task (Curtindale, Laurie-Rose, Bennett-Murphy, & Hull, 2007). Likewise, while performing a CPT with degraded stimuli, a steady increase of d’ and rate of hits with age has been observed, reaching the adult level by about age 13 years (Lin, Hsiao, & Chen, 1999).
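The sensitivity index d’ used in these CPT studies is conventionally computed as the difference between the z-transformed hit rate and false-alarm rate. The sketch below shows that standard signal detection formula applied to successive time blocks. It rests on assumptions not stated in the text: the log-linear correction (adding 0.5 to counts) is one common way to avoid infinite z-scores, and the function name and the response counts are made up for illustration.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Perceptual sensitivity d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (add 0.5 to each count, 1 to each
    total) keeps both rates strictly between 0 and 1.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts per time block of a CPT:
# (hits, misses, false alarms, correct rejections).
blocks = [(18, 2, 2, 18), (16, 4, 3, 17), (13, 7, 4, 16)]
sensitivity_over_time = [d_prime(*block) for block in blocks]
```

In this made-up example d’ falls from block to block, which is the kind of decline over time that the studies above describe as larger in children than in adults.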

Developmental changes in alertness during childhood and early adolescence appear to relate to continuous maturation of frontal systems during this period. One way to examine brain mechanisms underlying changes in alertness is through studying the CNV, an electrophysiological index associated with activation in the right ventral and medial frontal areas of the brain (Segalowitz & Davies, 2004). In adolescents as well as adults, the CNV has been shown to relate to performance in various measures of intelligence and executive functions as well as functional capacity of the frontal cortex (Segalowitz, Unsal, & Dywan, 1992). Various studies have shown that the amplitude of the CNV increases with age, especially during middle childhood. Jonkman (2006) found that the CNV amplitude is significantly smaller for 6- to 7-year-old children compared with adults, but no differences were observed between 9- and 10-year-olds and adults. Moreover, the difference in CNV amplitude between children and adults seems to be restricted to early components of the CNV observed over right frontal-central channels (Jonkman, Lansbergen, & Stauder, 2003), which suggests a role of maturation of the frontal alerting network.

Orienting and Selectivity

Aspects of the attention system that increase precision and voluntary control of orienting continue developing throughout childhood and adolescence. For the most part, infant studies examine overt forms of orienting. By the time children are able to follow instructions and respond to stimulation by pressing keys, both overt and covert orienting can be measured. Several studies, mostly using Posner’s cuing paradigm, have examined the development of orienting during childhood. Despite a progressive increase in orienting speed to valid cues during childhood (Schul, Townsend, & Stiles, 2003), data generally show no age differences in the orienting benefit effect between young children (5–6 years of age), older children (8–10 years), and adults (Enns & Brodeur, 1989), regardless of whether the effect is measured in covert or overt orienting conditions (Wainwright & Bryson, 2002). However, there seems to be an age-related decrease in the orienting cost (Enns & Brodeur, 1989; Schul et al., 2003; Wainwright & Bryson, 2002). In addition, the effect of age when disengagement and reorienting to an uncued location are needed appears to be larger under endogenous orienting conditions (e.g., longer intervals between cue and target) (Schul et al., 2003; Wainwright & Bryson, 2005). This suggests that aspects of orienting related to the control of disengagement and voluntary orientation, which depend on the dorsal frontoparietal network in the Corbetta and Shulman (2002) model, improve with age during childhood. In a study in which endogenous orienting was examined in children aged 6 to 14 years and adults, all groups but the youngest children showed larger orienting effects (difference in RT to targets appearing at cued vs. uncued locations) with longer cue–target intervals (Wainwright & Bryson, 2005). This indicates that young children seem to have problems endogenously adjusting the scope of their attentional focus. 
This idea was also suggested by Enns and Girgus (1985), who found that attentional focusing, as well as the ability to effectively divide or switch attention between stimuli, improves with age from 5 to 8 to 10 years and into adulthood.

A similar developmental pattern emerges when orienting and selectivity are studied in the auditory modality. Coch et al. (2005) developed a task to measure sustained selective auditory attention in young children. They used a dichotic listening task in which participants were asked to selectively focus attention on one auditory stream consisting of a narration of a story while ignoring a different stream containing another story. A picture related to the to-be-attended story is visually presented to the child in order to facilitate the task. ERPs are then recorded to probes embedded in the attended and unattended channels. Adults show increased P1 and N1 amplitudes to probes presented in the attended stream. Six- to 8-year-old children and preschoolers also show an attention-related modulation of ERP components, which is more sustained in children than in adults (Coch et al., 2005; Sanders, Stevens, Coch, & Neville, 2006). Differences in the topographic distribution of the effect between children and adults also suggest that brain mechanisms related to sustained selectivity continue developing beyond middle childhood. Additionally, further development is necessary when attention has to be disengaged from the attended channel and moved to a different one, as is the case in the visual modality. An important improvement in performance between 8 and 11 years of age has been reported when children are asked to disengage attention from the attended channel and reallocate it to the other channel in the dichotic listening task (Pearson & Lane, 1991).
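
The orienting measures discussed in this section can be made concrete with a short sketch. It assumes a cuing task with valid, invalid, and neutral cue trials, and computes the benefit (neutral minus valid), the cost (invalid minus neutral), and the overall orienting effect (uncued minus cued locations) from median correct RTs. The function name, the trial format, the use of a neutral baseline, and the RT values are hypothetical choices for illustration and do not reproduce the analysis of any cited study.

```python
from statistics import median


def cueing_scores(trials):
    """Summarize a cuing task.

    trials: iterable of (cue_type, rt_ms, correct), with cue_type in
    {'valid', 'invalid', 'neutral'}. Every cue type needs at least one
    correct trial, otherwise median() raises StatisticsError.
    """
    rts = {"valid": [], "invalid": [], "neutral": []}
    for cue_type, rt_ms, correct in trials:
        if correct:  # only correct responses enter the RT summary
            rts[cue_type].append(rt_ms)
    med = {cue: median(values) for cue, values in rts.items()}
    return {
        "benefit": med["neutral"] - med["valid"],           # gain from a valid cue
        "cost": med["invalid"] - med["neutral"],            # penalty for reorienting
        "orienting_effect": med["invalid"] - med["valid"],  # uncued minus cued
    }


# Hypothetical trials from one participant.
trials = [
    ("valid", 420, True), ("valid", 440, True), ("valid", 460, True),
    ("neutral", 470, True), ("neutral", 480, True), ("neutral", 490, True),
    ("invalid", 520, True), ("invalid", 540, True), ("invalid", 560, True),
    ("invalid", 900, False),  # error trial, excluded from the medians
]
scores = cueing_scores(trials)
print(scores)
```

Separating benefit from cost matters for the developmental argument above: the reviewed data suggest that the benefit is stable across age, whereas the cost, which reflects disengagement and reorienting, decreases with age.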

Executive Attention

Children are able to perform simple conflict tasks in which their RT can be measured from age 2 years on, although the tasks need to be adapted to appear child friendly. One such adaptation is the spatial conflict task (Gerardi-Caulton, 2000), which induces conflict between the identity and the location of objects. It is a touch-screen task in which pictures of the houses of two animals (e.g., a duck and a cat) are presented at the bottom left and bottom right of the screen. On each trial, one of the two animals appears on either the left or the right side of the screen, and the child is required to touch the house corresponding to the animal. Location is the dominant aspect of the stimulus, although instructions require responding according to its identity. Thus, conflict trials, in which the animal appears on the side of the screen opposite to its house, usually result in slower responses and larger error rates than nonconflict trials (when the animal appears on the side of its house). Between 2 and 4 years of age, children progressed from an almost complete inability to carry out the task to relatively good performance. Whereas 2-year-old children tended to perseverate on a single response, 3-year-olds performed at high accuracy levels, although, like adults, they responded more slowly and with reduced accuracy on conflict trials (Gerardi-Caulton, 2000; Rothbart, Ellis, Rueda, & Posner, 2003).

Another way to study action monitoring consists of examining the detection and correction of errors. While performing the spatial conflict task, 2½- and 3-year-old children showed longer RT following erroneous trials than following correct ones, indicating that children were noticing their errors and using them to guide performance in the next trial. However, no evidence of slowing following an error was found at 2 years of age (Rothbart et al., 2003). A similar result with a different time frame was found when using a version of the Simon Says game. 
In this task, children are asked to execute a response when a command is given by one stuffed animal and to inhibit a response commanded by a second animal (Jones, Rothbart, & Posner, 2003). Children 36 to 38 months of age were unable to inhibit their response and did not show the slowing-after-error effect, but at 39 to 41 months of age, children showed both the ability to inhibit and the slowing of RT following errors. These results suggest that between 30 and 39 months of age, children greatly develop their ability to detect and correct erroneous responses and that this ability may relate to the development of inhibitory control.

Data collected with the ANT and reported earlier suggested that the development of conflict resolution continues during the preschool and early childhood periods. Nonetheless, studies in which the difficulty of the conflict task is increased by other demands, such as switching rules or holding more information in working memory, have shown further development of conflict resolution between late childhood and adulthood. For example, Davidson et al. (2006) manipulated memory load, inhibitory demand, and rule switching (cognitive flexibility) in a spatial conflict task. They found that the cost resulting from the need for inhibitory control was larger for children than for adults. Also, even under low memory load conditions, the switching cost was still larger for 13-year-old children than for adults. The longer developmental course observed with this task might be due to the recruitment of frontal brain areas in addition to those involved in executive attention.

Other studies have used ERP to examine the brain mechanisms that underlie the development of executive attention. In one of these studies, a flanker task was used to compare conflict resolution in three groups of children aged 5 to 6, 7 to 9, and 10 to 12 years, and a group of adults (Ridderinkhof & van der Molen, 1995). Developmental differences were examined in two ERP components, one related to response preparation (LRP) and another one related to stimulus evaluation (P3). 
The authors found differences between children and adults in the latency of the LRP, but not in the latency of the P3 peak, suggesting that developmental differences in the ability to resist interference are mainly related to response competition and inhibition, but not to stimulus evaluation.

As discussed earlier, brain responses to errors are also informative of the efficiency of the executive attention system. The amplitude of the ERN seems to reflect detection of the error as well as its salience in the context of the task and therefore is subject to individual differences in affective style or motivation. Generally, larger ERN amplitudes are associated with greater engagement in the task and/or greater efficiency of the error-detection system (Santesso, Segalowitz, & Schmidt, 2005; Tucker, Hartry-Speiser, McDougal, Luu, & deGrandpre, 1999). Developmentally, the amplitude of the ERN shows a progressive increase during childhood into late adolescence (Segalowitz & Davies, 2004), with young children (age 7–8 years) being less likely to show the ERN to errors than older children and adults, at least when performing a flanker task.

Another evoked potential, the N2, is also modulated by the requirement for executive control (Kopp, Rist, & Mattler, 1996) and has been associated with a source of activation in the ACC (van Veen & Carter, 2002). In a flanker task, such as the fish version of the child ANT, adults show larger N2 for incongruent trials over the mid-frontal leads (Rueda, Posner, Rothbart, & Davis-Stober, 2004b). Four-year-old children also show a larger negative deflection for the incongruent condition compared with the congruent one at mid-frontal electrodes. However, compared with adults, this congruency effect was larger and extended over a longer period of time. Later in childhood, developmental studies have shown a progressive decrease in the amplitude and latency of the N2 effect with age (Davis, Bruce, Snyder, & Nelson, 2004; Johnstone, Pleffer, Barry, Clarke, & Smith, 2005; Jonkman, 2006). The reduction of the amplitude appears to relate to the increase in efficiency of the system and not to the overall amplitude decrease that is observed with age (Lamm, Zelazo, & Lewis, 2006). Also, the effects are more widely distributed for young children, and they become more focalized with age (Jonkman, 2006; Rueda et al., 2004b). Source localization analyses indicate that, compared with adults, children need additional activations to adequately explain the distribution (Jonkman, Sniedt, & Kemner, 2007).

The focalization of signals in adults compared with children is consistent with neuroimaging studies, in which children appear to activate the same network of areas as adults when performing similar tasks, but the average volume of activation appears to be remarkably greater in children than in adults (Casey, Thomas, Davidson, Kunz, & Franzen, 2002; Durston et al., 2002). Altogether, these data suggest that the brain circuitry underlying executive functions becomes more focal and refined as it gains in efficiency. This maturational process involves not only greater anatomical specialization but also a reduction in the time these systems need to resolve each of the processes implicated in the task. This is consistent with recent data showing that the network of brain areas involved in attentional control shows increased segregation of short-range connections but increased integration of long-range connections with maturation (Fair et al., 2007). Segregation of short-range connectivity may be responsible for greater local specialization, whereas integration of long-range connectivity likely increases efficiency by improving coordinated responses between different processing networks.
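
Two behavioral measures recur throughout this section: the conflict effect (slower responses on incongruent or conflict trials than on congruent ones) and slowing after errors. The sketch below shows one common way to compute both from an ordered list of trials. The function names, the trial format, and the RT values are hypothetical, and the simple "after error minus after correct" definition of post-error slowing is an assumption chosen for illustration; other definitions exist.

```python
from statistics import mean


def conflict_effect(trials):
    """Mean correct RT on incongruent trials minus congruent trials.

    trials: list of (is_congruent, rt_ms, correct) in presentation order.
    """
    congruent = [rt for is_con, rt, ok in trials if ok and is_con]
    incongruent = [rt for is_con, rt, ok in trials if ok and not is_con]
    return mean(incongruent) - mean(congruent)


def post_error_slowing(trials):
    """Mean correct RT after an error minus mean correct RT after a correct trial."""
    after_error, after_correct = [], []
    for previous, current in zip(trials, trials[1:]):
        _, rt, correct = current
        if not correct:
            continue  # only correct responses on the current trial are summarized
        (after_correct if previous[2] else after_error).append(rt)
    return mean(after_error) - mean(after_correct)


# Hypothetical session: (is_congruent, rt_ms, correct)
session = [
    (True, 600, True),
    (False, 700, True),
    (False, 650, False),  # error
    (True, 720, True),    # trial following an error
    (True, 610, True),
    (False, 690, True),
    (True, 590, False),   # error
    (False, 800, True),   # trial following an error
    (True, 620, True),
]
print(conflict_effect(session), post_error_slowing(session))
```

A positive post-error slowing score corresponds to the pattern reported for 2½- and 3-year-olds in the spatial conflict task, while a score near zero corresponds to the absence of slowing reported at 2 years of age.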

Figure 15.2 Schematic representation of the developmental time course of attention networks. The alerting and orienting networks appear to mature largely during infancy and early childhood, although both networks continue developing up to late childhood, showing improvements in the endogenous control of processes related to preparation and selectivity. The executive attention network appears to undergo a more protracted maturation, emerging at about the end of the first year of life and continuing during childhood into adolescence.

In summary, evidence shows that the attention networks have different developmental courses across childhood. The different developmental courses are represented in Figure 15.2. Shortly after birth, infants show increasing levels of alertness, although alertness is highly dependent on exogenous stimulation. Then, preparation from warning signals shows a progressive development during the first years of life, whereas the ability to endogenously sustain attention improves up to late childhood. Orienting also shows progressive development of increasingly complex functions. Infants are able to orient to external cues by age 4 months. From there, children’s orientation becomes increasingly precise and less dependent on exogenous stimulation. Endogenous disengagement and reorienting of attention progress up to late childhood and early adolescence. The earliest signs of executive attention appear by the end of the first year of life. From there on, there is a progressive development, especially during the preschool period, of the ability to inhibit dominant responses and suppress irrelevant stimulation. However, with increasingly complex conditions, as when other executive functions (e.g., working memory, planning) come into play, executive attention shows further development during childhood and adolescence.

Individual Differences in Attentional Efficiency

The case for studying the emergence of attention is strengthened by the fact that cognitive measures of attention efficiency in laboratory tasks have been linked to aspects of children’s behavior in naturalistic settings. For example, it has been shown that the efficiency of executive attention as measured with the child ANT is related to the ability to accommodate to social norms, like smiling when receiving a disappointing gift (Simonds, Kieras, Rueda, & Rothbart, 2007). Additionally, Eisenberg and her colleagues have shown that children with good attentional control tend to deal with anger by using nonhostile verbal methods rather than overt aggressive methods (Eisenberg, Fabes, Nyman, Bernzweig, & Pinuelas, 1994). Efficiency of attention is also related to peer-reported measures of unsocial behavior in the classroom and increased risk for social rejection (Checa, Rodriguez-Bailon, & Rueda, 2008). In that same study, a positive relation between attentional efficiency and schooling competence was reported, including measures of academic achievement and skills important for school success, such as rule following and tolerance to frustration.

The relationship between poor attention, school maladjustment, and low academic achievement seems to be consistent across ages and cultures. Mechanisms of the executive attention network are likely to play a role in this relationship. We have recently found that brain activation recorded while children perform the flanker task is a significant predictor of mathematics grades in school (Checa & Rueda, 2011). Also, in a study conducted with a flanker task and ERPs, children who committed more errors on incongruent trials showed smaller amplitudes of the ERN. This result suggests that the brains of these children are less sensitive to the commission of errors. 
Moreover, the amplitude of the ERN was predicted by individual differences in social behavior, in that children with poorer social sensitivity as assessed by a self-report personality questionnaire showed ERNs of smaller amplitude (Santesso et al., 2005). On the other hand, empathy appears to show a positive relation with amplitude of the ERN (Santesso & Segalowitz, 2009). Altogether, these data suggest that attentional flexibility is required to link affect, internalized social norms, and action in everyday life situations (Rueda, Checa, & Rothbart, 2010).

Studies of this sort are important for establishing links between biology and behavior. Knowing the neural substrates of attention also provides a tool for examining which aspects of the attention functions are subject to genetic influence, as well as how the efficiency of this system may be influenced by experience.

Genes

Describing the development of a specific neural network is only one step toward a biological understanding. It is also important to know the genetic and environmental influences that together build the neural network.

Some features or psychological functions may be more subject to genetic influences than others. The degree of heritability of each attention network in Posner’s model was examined in a twin study using the scores provided by the ANT as phenotypes (Fan, Wu, Fossella, & Posner, 2001). The study showed that the executive attention network and, to a lesser degree, the alerting network show evidence of heritability. This suggested that genetic variation contributes to individual differences at least in these two functions.

Links between specific neural networks of attention and chemical modulators allow investigating the genetic basis of normal attention (Fossella et al., 2002; Green et al., 2008). Information on the neuromodulators that influence the function of the attention networks has been used to search for genes related to these modulators. Thus, since the sequencing of the human genome in the year 2001 (Venter et al., 2001), many studies have shown that genes influence individuals’ attention capacity (see Posner, Rothbart, & Sheese, 2007; see also Table 15.1). Polymorphisms in genes related to the norepinephrine and dopamine systems have been associated with individual differences in the efficiency of alerting and executive attention (Fossella et al., 2002). For example, it has been found that scores of the executive attention network are specifically related to variations in the dopamine receptor D4 (DRD4) gene and the monoamine oxidase-A (MAOA) gene (Fossella et al., 2002), as well as other dopaminergic genes such as the dopamine transporter 1 (DAT1) gene (Rueda, Rothbart, McCandliss, Saccomanno, & Posner, 2005) and the catechol-O-methyltransferase (COMT) gene (Diamond, Briand, Fossella, & Gehlbach, 2004). Moreover, individuals carrying the alleles associated with better performance showed greater activation in the anterior cingulate gyrus while performing the ANT (Fan, Fossella, Sommer, Wu, & Posner, 2003). 
On the other hand, polymorphisms of genes influencing the function of cholinergic receptors have been associated with attentional orienting as measured with both RT and ERP (Parasuraman, Greenwood, Kumar, Fossella, et al., 2005; Winterer et al., 2007).

Experience

Genetic data could wrongly lead to the impression that attention is not susceptible to the environment and cannot be enhanced or harmed by experience. However, this conclusion would be at odds with evidence on the extraordinary plasticity of the human nervous system, especially during development (see Posner & Rothbart, 2007). There is some evidence suggesting that susceptibility to the environment might even be embedded in genetic endowment because some genetic polymorphisms, often under positive selection, appear to make children more susceptible to environmental factors such as parenting (Sheese, Voelker, Rothbart, & Posner, 2007).

In recent years, much evidence has been provided in favor of the susceptibility of systems of self-regulation to the influence of experience. One piece of evidence comes from studies showing vulnerability of attention to environmental aspects such as parenting and socioeconomic status (SES; Bornstein & Bradley, 2003). Noble, McCandliss, and Farah (2007) assessed how well SES predicts a wide range of cognitive abilities in children. These investigators found that SES accounts for portions of variance, particularly in language but also in other higher-order functions including executive attention. Parental level of education is also an important environmental factor, highly predictive of the family SES. A recent study has shown that children whose parents have lower levels of education appear to have more difficulty filtering out irrelevant information, as shown by ERP, than those with highly educated parents (Stevens, Lauinger, & Neville, 2009). All these data indicate that children’s experiences can shape the functional efficiency of brain networks and also suggest that providing children with the appropriate experiences may constitute a good method to enhance attentional skills.

Optimizing Attention Development

Several studies have shown that different intervention methods lead to significant improvements in attentional efficiency. Several years ago, in collaboration with Michael Posner and Mary Rothbart at the University of Oregon, we designed a set of computer exercises aimed at training attention and tested a 5-day training intervention with children between 4 and 6 years of age (Rueda et al., 2005). Before and after training, the children performed the child ANT while their brain activation was recorded with an EEG system. Children in the intervention group showed clear evidence of improvement in the executive attention network after training, in comparison with a control group who viewed interactive videos matched to the duration of the intervention. The frontal negative ERP typically observed in conflict tasks showed a more adult-like pattern (i.e., shorter delay and progressively more posterior scalp distribution) in trained children compared with controls, suggesting that the training altered the brain mechanisms of conflict resolution in the direction of maturation. The beneficial effect of training attention also transferred to nontrained measures of fluid intelligence. Recently, we extended the number of exercises and sessions in our training program and replicated the benefits of training in brain activation and intelligence with a group of 5-year-old children (Rueda, Checa & Combita, 2012). In this study, trained children showed a faster and more efficient activation of the brain circuitry involved in executive attention, an effect that was still observed at 2 months’ follow-up. An important question related to intervention is whether it has the potential to overcome the influence of negative experience or unfavorable constitutional conditions. Although further data on this question are undoubtedly needed, current evidence indicates that training may be an important tool, especially for children with greater risk for experiencing attentional difficulties.

Consistent with our results, other studies have shown beneficial effects of cognitive training on attention and other forms of executive functions during development. For instance, auditory selective attention was improved by training with a computerized program designed to promote oral language skills in both language-impaired and typically developing children (Stevens, Fanning, Coch, Sanders, & Neville, 2008). Klingberg and colleagues have shown that working memory training has benefits and shows some degree of transfer to aspects of attention (Thorell, Lindqvist, Nutley, Bohlin, & Klingberg, 2009). The Klingberg group has also shown evidence that training can affect various levels of brain function, including activation (Olesen, Westerberg, & Klingberg, 2004) and changes in the dopamine D1 receptor system (McNab et al., 2009) in areas of the cerebral cortex involved in the trained function.

There is also some evidence that curricular interventions directly carried out in the classroom can lead to improvements in children’s cognitive control. Diamond et al. (2007) tested the influence of a specific curriculum on preschoolers’ control abilities and found beneficial effects as measured by various conflict tasks. A somewhat indirect but probably no less beneficial form of fostering attention in school could be provided by multilingual education. 
There is growing evidence indicating that bilingual individuals perform better on executive attention tasks than monolinguals (Bialystok, 1999; Costa, Hernandez, & Sebastian-Galles, 2008). The idea is that people using multiple languages on a regular basis might train executive attention because of the need to suppress one language while using the other.

Although all this evidence shows promising results about the effectiveness of interventions and particular educational methods to promote attentional skills, questions on various aspects of training remain to be answered. In future studies, it will be important to address questions such as whether genetic variation and other constitutionally based variables influence the extent to which the executive attention network can be modified by experience, and whether there are limits to the ages at which training can be effective. Additionally, further research will be needed to examine whether the beneficial effects of these interventions transfer to abilities relevant for schooling competence.

Summary and Conclusions

The emergence of the field of cognitive neuroscience constituted a turning point in the study of the relationship between brain and cognition, from which the study of attention benefited greatly. Attention has been related to a variety of constructs including the state of alertness, selectivity, and executive control. These various functions have been addressed in cognitive studies conducted for the most part during the second half of the twentieth century. Since then, imaging studies have shown that each function is associated with activation of a particular network of brain structures. It has also been determined that the function of each attention network appears to be modulated by particular neurochemical mechanisms. Using Posner’s neuroanatomical model as a theoretical framework, I have reviewed developmental studies conducted with infants and children. Evidence shows that maturation of each attention network follows a particular trajectory that extends from birth to late childhood and, in the case of executive control, may continue during adolescence. Apart from the progressive competence acquired with maturation, efficiency of attentional functions is subject to important differences among individuals of the same age. Evidence shows that individual differences in attentional efficiency depend on constitutional as well as environmental factors, or the combination of both. Importantly, these individual differences appear to contribute substantially to central aspects in the life of children, including social-emotional development and school competence. For the future, it will be important to understand the mechanisms by which genes and experience influence the organization and efficiency of the attention networks and whether there exist sensitive periods when intervention to foster attention may have the greatest benefits.

Author Note

This work was supported by grants from the Spanish Ministry of Science and Innovation, refs. PSI2008-02955 and PSI2011-27746.

References

Abundis, A., Checa, P., Castellanos, C., & Rueda, M. R. (2013). Electrophysiological correlates of the development of attention networks in childhood. Manuscript submitted for publication.
Baddeley, A. D. (1993). Working memory or working attention? In A. D. Baddeley & L. Weiskrantz (Eds.), Attention, selection, awareness and control (pp. 152–170). Oxford, UK: Clarendon Press.
Beauregard, M., Levesque, J., & Bourgouin, P. (2001). Neural correlates of conscious self-regulation of emotion. Journal of Neuroscience, 21 (18), 6993–7000.
Berger, A., Jones, L., Rothbart, M. K., & Posner, M. I. (2000). Computerized games to study the development of attention in childhood. Behavior Research Methods, Instruments & Computers, 32 (2), 297–303.
Berger, A., Tzur, G., & Posner, M. I. (2006). Infant brains detect arithmetic errors. Proceedings of the National Academy of Sciences, 103 (33), 12649–12653.
Bialystok, E. (1999). Cognitive complexity and attentional control in the bilingual mind. Child Development, 70 (3), 636.
Bornstein, M. H., & Bradley, R. H. (2003). Socioeconomic status, parenting, and child development. Mahwah, NJ: Erlbaum.
Botvinick, M. M., Braver, T. S., Barch, D. M., Carter, C. S., & Cohen, J. D. (2001). Conflict monitoring and cognitive control. Psychological Review, 108 (3), 624–652.
Botvinick, M., Nystrom, L. E., Fissell, K., Carter, C. S., & Cohen, J. D. (1999). Conflict monitoring versus selection-for-action in anterior cingulate cortex. Nature, 402 (6758), 179–181.
Broadbent, D. E. (1958). Perception and communication. New York: Pergamon.
Brozoski, T. J., Brown, R. M., Rosvold, H. E., & Goldman, P. S. (1979). Cognitive deficit caused by regional depletion of dopamine in prefrontal cortex of rhesus monkey. Science, 205, 929–932.
Bush, G., Luu, P., & Posner, M. I. (2000). Cognitive and emotional influences in anterior cingulate cortex. Trends in Cognitive Sciences, 4 (6), 215–222.
Butcher, P. R. (2000). Longitudinal studies of visual attention in infants: The early development of disengagement and inhibition of return. Meppel, The Netherlands: Aton.
Callejas, A., Lupiáñez, J., & Tudela, P. (2004). The three attentional networks: On their independence and interactions. Brain and Cognition, 54 (3), 225–227.
Casey, B. J., & Richards, J. E. (1988). Sustained visual attention in young infants measured with an adapted version of the visual preference paradigm. Child Development, 59, 1514–1521.
Casey, B. J., & Richards, J. E. (1991). A refractory period for the heart rate response in infant visual attention. Developmental Psychobiology, 24, 327–340.
Casey, B., Thomas, K. M., Davidson, M. C., Kunz, K., & Franzen, P. L. (2002). Dissociating striatal and hippocampal function developmentally with a stimulus-response compatibility task. Journal of Neuroscience, 22 (19), 8647–8652.
Checa, P., Rodriguez-Bailon, R., & Rueda, M. R. (2008). Neurocognitive and temperamental systems of self-regulation and early adolescents’ school competence. Mind, Brain and Education, 2 (4), 177–187.
Checa, P., & Rueda, M. R. (2011). Behavioral and brain measures of executive attention and school competence in late childhood. Developmental Neuropsychology, 36 (8), 1018–1032.
Cherry, C. E. (1953). Some experiments on the recognition of speech with one and two ears. Journal of the Acoustical Society, 25, 975–979.
Clohessy, A. B., Posner, M. I., & Rothbart, M. K. (2001). Development of the functional visual field. Acta Psychologica, 106 (1–2), 51–68.
Coch, D., Sanders, L. D., & Neville, H. J. (2005). An event-related potential study of selective auditory attention in children and adults. Journal of Cognitive Neuroscience, 17 (4), 605–622.
Colombo, J. (2001). The development of visual attention in infancy. Annual Review of Psychology, 52, 337–367.
Colombo, J., & Horowitz, F. D. (1987). Behavioral state as a lead variable in neonatal research. Merrill-Palmer Quarterly, 33, 423–438.
Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3 (3), 201–215.
Costa, A., Hernandez, M., & Sebastian-Galles, N. (2008). Bilingualism aids conflict resolution: Evidence from the ANT task. Cognition, 106 (1), 59–86.
Coull, J. T., Frith, C. D., Büchel, C., & Nobre, A. C. (2000). Orienting attention in time: Behavioral and neuroanatomical distinction between exogenous and endogenous shifts. Neuropsychologia, 38 (6), 808–819.
Coull, J. T., Frith, C. D., Frackowiak, R. S. J., & Grasby, P. M. (1996). A fronto-parietal network for rapid visual information processing: A PET study of sustained attention and working memory. Neuropsychologia, 34 (11), 1085–1095.
Coull, J. T., Nobre, A. C., & Frith, C. D. (2001). The noradrenergic α2 agonist clonidine modulates behavioural and neuroanatomical correlates of human attentional orienting and alerting. Cerebral Cortex, 11 (1), 73–84.
Crottaz-Herbette, S., & Menon, V. (2006). Where and when the anterior cingulate cortex modulates attentional response: Combined fMRI and ERP evidence. Journal of Cognitive Neuroscience, 18 (5), 766–780.
Cui, R. Q., Egkher, A., Huter, D., Lang, W., Lindinger, G., & Deecke, L. (2000). High resolution spatiotemporal analysis of the contingent negative variation in simple or complex motor tasks and a non-motor task. Clinical Neurophysiology, 111 (10), 1847–1859.
Curtindale, L., Laurie-Rose, C., Bennett-Murphy, L., & Hull, S. (2007). Sensory modality, temperament, and the development of sustained attention: A vigilance study in children and adults. Developmental Psychology, 43 (3), 576–589.
Danis, A., Pêcheux, M.-G., Lefèvre, C., Bourdais, C., & Serres-Ruel, J. (2008). A continuous performance task in preschool children: Relations between attention and performance. European Journal of Developmental Psychology, 5 (4), 401–418.

Davidson, M. C., Amso, D., Anderson, L. C., & Diamond, A. (2006). Development of cognitive control and executive functions from 4 to 13 years: Evidence from manipulations of memory, inhibition, and task switching. Neuropsychologia, 44 (11), 2037–2078.
Davidson, M. C., Cutrell, E. B., & Marrocco, R. T. (1999). Scopolamine slows the orienting of attention in primates to cued visual targets. Psychopharmacology, 142 (1), 1–8.
Davidson, M. C., & Marrocco, R. T. (2000). Local infusion of scopolamine into intraparietal cortex slows covert orienting in rhesus monkeys. Journal of Neurophysiology, 83 (3), 1536–1549.
Davis, E. P., Bruce, J., Snyder, K., & Nelson, C. A. (2004). The X-trials: Neural correlates of an inhibitory control task in children and adults. Journal of Cognitive Neuroscience, 15, 532–443.
Dehaene, S., Posner, M. I., & Tucker, D. M. (1994). Localization of a neural system for error detection and compensation. Psychological Science, 5 (5), 303–305.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222.
Diamond, A. (1991). Neuropsychological insights into the meaning of object concept development. In S. Carey & R. Gelman (Eds.), The epigenesis of mind: Essays on biology and cognition (pp. 67–110). Hillsdale, NJ: Erlbaum.
Diamond, A. (2006). The early development of executive functions. In E. Bialystok & F. I. M. Craik (Eds.), Lifespan cognition: Mechanisms of change (pp. 70–95). New York: Oxford University Press.
Diamond, A., Barnett, W. S., Thomas, J., & Munro, S. (2007). Preschool program improves cognitive control. Science, 318 (5855), 1387–1388.
Diamond, A., Briand, L., Fossella, J., & Gehlbach, L. (2004). Genetic and neurochemical modulation of prefrontal cognitive functions in children. American Journal of Psychiatry, 161 (1), 125–132.
Dockree, P. M., Kelly, S. P., Roche, R. A. P., Reilly, R. B., Robertson, I. H., & Hogan, M. J. (2004). Behavioural and physiological impairments of sustained attention after traumatic brain injury. Cognitive Brain Research, 20 (3), 403–414.
Drevets, W. C., & Raichle, M. E. (1998). Reciprocal suppression of regional cerebral blood flow during emotional versus higher cognitive processes: Implications for interactions between emotion and cognition. Cognition & Emotion, 12 (3), 353–385.
Driver, J., Eimer, M., & Macaluso, E. (2007). Neurobiology of human spatial attention: Modulation, generation and integration. In N. Kanwisher & J. Duncan (Eds.), Attention and performance XX: Functional brain imaging of visual cognition (pp. 267–300). New York: Oxford University Press.
Durston, S., Thomas, K. M., Yang, Y., Ulug, A. M., Zimmerman, R. D., & Casey, B. (2002). A neural basis for the development of inhibitory control. Developmental Science, 5 (4), F9–F16.
Eisenberg, N., Fabes, R. A., Nyman, M., Bernzweig, J., & Pinuelas, A. (1994). The relations of emotionality and regulation to children’s anger-related reactions. Child Development, 65 (1), 109–128.
Enns, J. T., & Brodeur, D. A. (1989). A developmental study of covert orienting to peripheral visual cues. Journal of Experimental Child Psychology, 48 (2), 171–189.
Enns, J. T., & Cameron, S. (1987). Selective attention in young children: The relations between visual search, filtering, and priming. Journal of Experimental Child Psychology, 44, 38–63.
Enns, J. T., & Girgus, J. S. (1985). Developmental changes in selective and integrative visual attention. Journal of Experimental Child Psychology, 40, 319–337.
Eriksen, B. A., & Eriksen, C. W. (1974). Effects of noise letters upon the identification of a target letter in a nonsearch task. Perception & Psychophysics, 16 (1), 143–149.
Etkin, A., Egner, T., Peraza, D. M., Kandel, E. R., & Hirsch, J. (2006). Resolving emotional conflict: A role for the rostral anterior cingulate cortex in modulating activity in the amygdala. Neuron, 51 (6), 871–882.
Everitt, B. J., & Robbins, T. W. (1997). Central cholinergic systems and cognition. Annual Review of Psychology, 48.
Fair, D. A., Dosenbach, N. U. F., Church, J. A., Cohen, A. L., Brahmbhatt, S., Miezin, F. M., et al. (2007). Development of distinct control networks through segregation and integration. PNAS Proceedings of the National Academy of Sciences of the United States of America, 104 (33), 13507–13512.
Fan, J., Flombaum, J. I., McCandliss, B. D., Thomas, K. M., & Posner, M. I. (2003). Cognitive and brain consequences of conflict. NeuroImage, 18 (1), 42–57.
Fan, J., Fossella, J., Sommer, T., Wu, Y., & Posner, M. I. (2003). Mapping the genetic variation of executive attention onto brain activity. Proceedings of the National Academy of Sciences U S A, 100 (12), 7406–7411.
Fan, J., McCandliss, B. D., Fossella, J., Flombaum, J. I., & Posner, M. I. (2005). The activation of attentional networks. NeuroImage, 26 (2), 471–479.

Fan, J., McCandliss, B. D., Sommer, T., Raz, A., & Posner, M. I. (2002). Testing the efficiency and independence of attentional networks. Journal of Cognitive Neuroscience, 14 (3), 340–347.
Fan, J., Wu, Y., Fossella, J., & Posner, M. I. (2001). Assessing the heritability of attentional networks. BMC Neuroscience, 2, 14.
Fossella, J., Sommer, T., Fan, J., Wu, Y., Swanson, J. M., Pfaff, D. W., et al. (2002). Assessing the molecular genetics of attention networks. BMC Neuroscience, 3, 14.
Fuentes, L. J. (2004). Inhibitory processing in the attentional networks. In M. I. Posner (Ed.), Cognitive neuroscience of attention (pp. 29–44). New York: Guilford Press.
Garon, N., Bryson, S. E., & Smith, I. M. (2008). Executive function in preschoolers: A review using an integrative framework. Psychological Bulletin, 134, 31–60.
Gehring, W. J., Gross, B., Coles, M. G. H., Meyer, D. E., & Donchin, E. (1993). A neural system for error detection and compensation. Psychological Science, 4, 385–390.
Gerardi-Caulton, G. (2000). Sensitivity to spatial conflict and the development of self-regulation in children 24–36 months of age. Developmental Science, 3 (4), 397–404.
Graham, F. K., Anthony, B. J., & Ziegler, B. L. (1983). The orienting response and developmental processes. In D. Siddle (Ed.), Orienting and habituation: Perspectives in human research (pp. 371–430). New York: Wiley.
Green, A. E., Munafo, M. R., DeYoung, C. G., Fossella, J. A., Fan, J., & Gray, J. R. (2008). Using genetic data in cognitive neuroscience: From growing pains to genuine insights. Nature Reviews Neuroscience, 9, 710–720.
Hackley, S. A., & Valle-Inclán, F. (1998). Automatic alerting does not speed late motoric processes in a reaction-time task. Nature, 391 (6669), 786–788.
Haith, M. M., Hazan, C., & Goodman, G. S. (1988). Expectation and anticipation of dynamic visual events by 3.5 month old babies. Child Development, 59, 467–469.
Harman, C., Posner, M. I., Rothbart, M. K., & Thomas-Thrapp, L. (1994). Development of orienting to objects and locations in human infants. Canadian Journal of Experimental Psychology, 48, 301–318.
Harman, C., Rothbart, M. K., & Posner, M. I. (1997). Distress and attention interactions in early infancy. Motivation and Emotion, 21 (1), 27–43.
Hebb, D. O. (1949). Organization of behavior. New York: John Wiley & Sons.
Hillyard, S. A. (1985). Electrophysiology of human selective attention. Trends in Neurosciences, 8 (9), 400–405.

Hillyard, S. A., Di Russo, F., & Martinez, A. (2006). The imaging of visual attention. In J. Duncan & N. Kanwisher (Eds.), Attention and performance XX: Functional brain imaging of visual cognition (pp. 381–388). Oxford, UK: Oxford University Press.
Houghton, G., & Tipper, S. P. (1994). A dynamic model of selective attention. In D. Dagenbach & C. T. (Eds.), Inhibitory mechanisms in attention, memory and language (pp. 53–113). Orlando, FL: Academic Press.
James, W. (1890). The principles of psychology. New York: H. Holt.
Johnson, M. H., & Morton, J. (1991). Biology and cognitive development: The case of face recognition. Oxford, UK: Blackwell.
Johnson, M. H., Posner, M. I., & Rothbart, M. K. (1991). Components of visual orienting in early infancy: Contingency learning, anticipatory looking, and disengaging. Journal of Cognitive Neuroscience, 3, 335–344.
Johnstone, S. J., Pleffer, C. B., Barry, R. J., Clarke, A. R., & Smith, J. L. (2005). Development of inhibitory processing during the Go/NoGo Task: A behavioral and event-related potential study of children and adults. Journal of Psychophysiology, 19 (1), 11–23.
Jones, L. B., Rothbart, M. K., & Posner, M. I. (2003). Development of executive attention in preschool children. Developmental Science, 6 (5), 498–504.
Jonkman, L. M. (2006). The development of preparation, conflict monitoring and inhibition from early childhood to young adulthood: A Go/Nogo ERP study. Brain Research, 1097 (1), 181–193.
Jonkman, L. M., Lansbergen, M., & Stauder, J. E. A. (2003). Developmental differences in behavioral and event-related brain responses associated with response preparation and inhibition in a go/nogo task. Psychophysiology, 40 (5), 752–761.
Jonkman, L., Sniedt, F., & Kemner, C. (2007). Source localization of the Nogo-N2: A developmental study. Clinical Neurophysiology, 118 (5), 1069–1077.
Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice Hall.
Karnath, H. O., Ferber, S., & Himmelbach, M. (2001). Spatial awareness is a function of the temporal not the posterior parietal lobe. Nature, 411, 950–953.
Keele, S. W., Ivry, R. B., Mayr, U., Hazeltine, E., & Heuer, H. (2003). The cognitive and neural architecture of sequence representation. Psychological Review, 110, 316–339.
Klein, R. M. (2000). Inhibition of return. Trends in Cognitive Sciences, 4, 138–147.
Klein, R. M. (2004). On the control of visual orienting. In M. I. Posner (Ed.), Cognitive neuroscience of attention (pp. 29–44). New York: Guilford Press.

Kopp, B., Rist, F., & Mattler, U. (1996). N200 in the flanker task as a neurobehavioral tool for investigating executive control. Psychophysiology, 33, 282–294.
Lamm, C., Zelazo, P. D., & Lewis, M. D. (2006). Neural correlates of cognitive control in childhood and adolescence: Disentangling the contributions of age and executive function. Neuropsychologia, 44 (11), 2139–2148.
Levy, F. (1980). The development of sustained attention (vigilance) in children: Some normative data. Journal of Child Psychology and Psychiatry, 21 (1), 77–84.
Lin, C. C. H., Hsiao, C. K., & Chen, W. J. (1999). Development of sustained attention assessed using the Continuous Performance Test among children 6–15 years of age. Journal of Abnormal Child Psychology, 27 (5), 403–412.
Mangun, G. R., & Hillyard, S. A. (1987). The spatial allocation of visual attention as indexed by event-related brain potentials. Human Factors, 29 (2), 195–211.
Marrocco, R. T., & Davidson, M. C. (1998). Neurochemistry of attention. In R. Parasuraman (Ed.), The attentive brain (pp. 35–50). Cambridge, MA: MIT Press.
McCulloch, J., Savaki, H. E., McCulloch, M. C., Jehle, J., & Sokoloff, L. (1982). The distribution of alterations in energy metabolism in the rat brain produced by apomorphine. Brain Research, 243, 67–80.
McNab, F., Varrone, A., Farde, L., Jucaite, A., Bystritsky, P., Forssberg, H., et al. (2009). Changes in cortical dopamine D1 receptor binding associated with cognitive training. Science, 323 (5915), 800–802.
Mezzacappa, E. (2004). Alerting, orienting, and executive attention: Developmental properties and sociodemographic correlates in an epidemiological sample of young, urban children. Child Development, 75 (5), 1373–1386.
Morrison, F. J. (1982). The development of alertness. Journal of Experimental Child Psychology, 34 (2), 187–199.
Newhouse, P. A., Potter, A., & Singh, A. (2004). Effects of nicotinic stimulation on cognitive performance. Current Opinion in Pharmacology, 4 (1), 36–46.
Noble, K. G., McCandliss, B. D., & Farah, M. J. (2007). Socioeconomic gradients predict individual differences in neurocognitive abilities. Developmental Science, 10 (4), 464–480.
Nobre, A. C. (2001). Orienting attention to instants in time. Neuropsychologia, 39 (12), 1317–1328.
Norman, D. A., & Shallice, T. (1986). Attention to action: Willed and automatic control of behavior. In R. J. Davison, G. E. Schwartz, & D. Shapiro (Eds.), Consciousness and self-regulation (pp. 1–18). New York: Plenum Press.

Ochsner, K. N., Bunge, S. A., Gross, J. J., & Gabrieli, J. D. (2002). Rethinking feelings: An fMRI study of the cognitive regulation of emotion. Journal of Cognitive Neuroscience, 14 (8), 1215–1229.
Olesen, P. J., Westerberg, H., & Klingberg, T. (2004). Increased prefrontal and parietal activity after training of working memory. Nature Neuroscience, 7 (1), 75–79.
Parasuraman, R., Greenwood, P. M., Kumar, R., Fossella, J., et al. (2005). Beyond heritability: Neurotransmitter genes differentially modulate visuospatial attention and working memory. Psychological Science, 16, 200–207.
Pearson, D. A., & Lane, D. M. (1991). Auditory attention switching: A developmental study. Journal of Experimental Child Psychology, 51 (2), 320–334.
Pelphrey, K. A., Reznick, J. S., Goldman, B. D., Sasson, N., Morrow, J., Donahoe, A., et al. (2004). Development of visuospatial short-term memory in the second half of the first year. Developmental Psychology, 40 (5), 836–851.
Posner, M. I. (1978). Chronometric explorations of mind. Hillsdale, NJ: Erlbaum.
Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32 (1), 3–25.
Posner, M. I. (1995). Attention in cognitive neuroscience: An overview. In M. S. Gazzaniga (Ed.), The cognitive neurosciences (pp. 615–624). Cambridge, MA: MIT Press.
Posner, M. I. (2008). Measuring alertness. Annals of the New York Academy of Sciences, 1129 (Molecular and Biophysical Mechanisms of Arousal, Alertness, and Attention), 193–199.
Posner, M. I., & Boies, S. J. (1971). Components of attention. Psychological Review, 78 (5), 391–408.
Posner, M. I., & Cohen, Y. (1984). Components of visual orienting. In H. Bouma & D. Bouwhuis (Eds.), Attention and performance X (pp. 531–556). London: Erlbaum.
Posner, M. I., & DiGirolamo, G. J. (1998). Executive attention: Conflict, target detection, and cognitive control. Cambridge, MA: MIT Press.
Posner, M. I., & Fan, J. (2008). Attention as an organ system. In J. R. Pomerantz (Ed.), Topics in integrative neuroscience (pp. 31–61). New York: Cambridge University Press.
Posner, M. I., & Petersen, S. E. (1990). The attention system of human brain. Annual Review of Neuroscience, 13, 25–42.
Posner, M. I., & Raichle, M. E. (1994). Images of mind. New York: Scientific American Library; Dist. W.H. Freeman.

Posner, M. I., & Rothbart, M. K. (2007). Educating the human brain. Washington, DC: American Psychological Association.
Posner, M. I., Rothbart, M. K., & Sheese, B. E. (2007). Attention genes. Developmental Science, 10 (1), 24–29.
Posner, M. I., Rueda, M. R., & Kanske, P. (2007). Probing the mechanisms of attention. In J. T. Cacioppo, J. G. Tassinary, & G. G. Berntson (Eds.), Handbook of psychophysiology (3rd ed., pp. 410–432). Cambridge, UK: Cambridge University Press.
Posner, M. I., Sheese, B. E., Odludas, Y., & Tang, Y. (2006). Analyzing and shaping human attentional networks. Neural Networks, 19 (9), 1422–1429.
Posner, M. I., & Snyder, C. R. R. (1975). Attention and cognitive control. In R. Solso (Ed.), Information processing and cognition: The Loyola Symposium (pp. 55–85). Hillsdale, NJ: Erlbaum.
Raz, A., & Buhle, J. (2006). Typologies of attentional networks. Nature Reviews Neuroscience, 7 (5), 367–379.
Rensink, R. A., O’Regan, J. K., & Clark, J. J. (1997). To see or not to see: The need for attention to perceive changes in scenes. Psychological Science, 8 (5), 368–373.
Richards, J. E., & Casey, B. J. (1991). Heart rate variability during attention phases in young infants. Psychophysiology, 28 (1), 43–53.
Richards, J. E., & Hunter, S. K. (1998). Attention and eye movements in young infants: Neural control and development. In J. E. Richards (Ed.), Cognitive neuroscience of attention. Mahwah, NJ: LEA.
Ridderinkhof, K. R., & van der Molen, M. W. (1995). A psychophysiological analysis of developmental differences in the ability to resist interference. Child Development, 66 (4), 1040–1056.
Rothbart, M. K., Ahadi, S. A., Hersey, K. L., & Fisher, P. (2001). Investigations of temperament at three to seven years: The Children’s Behavior Questionnaire. Child Development, 72 (5), 1394–1408.
Rothbart, M. K., Ellis, L. K., Rueda, M., & Posner, M. I. (2003). Developing mechanisms of temperamental effortful control. Journal of Personality, 71 (6), 1113–1143.
Rothbart, M. K., & Rueda, M. R. (2005). The development of effortful control. In U. Mayr, E. Awh, & S. W. Keele (Eds.), Developing individuality in the human brain. A tribute to Michael I. Posner (pp. 167–188). Washington, DC: American Psychological Association.
Rueda, M. R., Checa, P., & Combita, L. M. (2012). Enhanced efficiency of the executive attention network after training in preschool children: Immediate changes and effects after two months. Developmental Cognitive Neuroscience, 2S, S192–S204.
Rueda, M. R., Checa, P., & Rothbart, M. K. (2010). Contributions of attentional control to social emotional and academic development. Early Education and Development, 21 (5), 744–764.
Rueda, M., Fan, J., McCandliss, B. D., Halparin, J. D., Gruber, D. B., Lercari, L. P., et al. (2004). Development of attentional networks in childhood. Neuropsychologia, 42 (8), 1029–1040.
Rueda, M. R., Posner, M. I., & Rothbart, M. K. (2004a). Attentional control and self-regulation. In R. F. Baumeister & K. D. Vohs (Eds.), Handbook of self-regulation: Research, theory, and applications (pp. 283–300). New York: Guilford Press.
Rueda, M. R., Posner, M. I., Rothbart, M. K., & Davis-Stober, C. P. (2004b). Development of the time course for processing conflict: An event-related potentials study with 4 year olds and adults. BMC Neuroscience, 5 (39), 1–13.
Rueda, M. R., Rothbart, M. K., McCandliss, B. D., Saccomanno, L., & Posner, M. I. (2005). Training, maturation, and genetic influences on the development of executive attention. Proceedings of the National Academy of Sciences U S A, 102 (41), 14931–14936.
Ruff, H. A., & Lawson, K. R. (1990). Development of sustained, focused attention in young children during free play. Developmental Psychology, 26 (1), 85–93.
Ruff, H. A., & Rothbart, M. K. (1996). Attention in early development: Themes and variations. New York: Oxford University Press.
Sanders, L. D., Stevens, C., Coch, D., & Neville, H. J. (2006). Selective auditory attention in 3- to 5-year-old children: An event-related potential study. Neuropsychologia, 44 (11), 2126–2138.
Santesso, D. L., & Segalowitz, S. J. (2009). The error-related negativity is related to risk taking and empathy in young men. Psychophysiology, 46 (1), 143–152.
Santesso, D. L., Segalowitz, S. J., & Schmidt, L. A. (2005). ERP correlates of error monitoring in 10-year olds are related to socialization. Biological Psychology, 70 (2), 79–87.
Schul, R., Townsend, J., & Stiles, J. (2003). The development of attentional orienting during the school-age years. Developmental Science, 6 (3), 262–272.
Segalowitz, S. J., & Davies, P. L. (2004). Charting the maturation of the frontal lobe: An electrophysiological strategy. Brain and Cognition, 55 (1), 116–133.
Segalowitz, S. J., Unsal, A., & Dywan, J. (1992). Cleverness and wisdom in 12-year-olds: Electrophysiological evidence for late maturation of the frontal lobe. Developmental Neuropsychology, 8, 279–298.

Sheese, B. E., Rothbart, M. K., Posner, M. I., Fraundorf, S. H., & White, L. K. (2008). Executive attention and self-regulation in infancy. Infant Behavior & Development, 31 (3), 501–510.
Sheese, B. E., Voelker, P., Posner, M. I., & Rothbart, M. K. (2009). Genetic variation influences on the early development of reactive emotions and their regulation by attention. Cognitive Neuropsychiatry, 14 (4–5), 332–355.
Sheese, B. E., Voelker, P. M., Rothbart, M. K., & Posner, M. I. (2007). Parenting quality interacts with genetic variation in dopamine receptor D4 to influence temperament in early childhood. Development and Psychopathology, 19 (4), 1039–1046.
Simonds, J., Kieras, J. E., Rueda, M., & Rothbart, M. K. (2007). Effortful control, executive attention, and emotional regulation in 7–10-year-old children. Cognitive Development, 22 (4), 474–488.
Sokolov, E. N. (1963). Perception and the conditioned reflex. Oxford, UK: Pergamon.
Stanwood, G. D., Washington, R. A., Shumsky, J. S., & Levitt, P. (2001). Prenatal cocaine exposure produces consistent developmental alteration in dopamine-rich regions of the cerebral cortex. Neuroscience, 106, 5–14.
Stevens, C., Fanning, J., Coch, D., Sanders, L., & Neville, H. (2008). Neural mechanisms of selective auditory attention are enhanced by computerized training: Electrophysiological evidence from language-impaired and typically developing children. Brain Research, 1205, 55–69.
Stevens, C., Lauinger, B., & Neville, H. (2009). Differences in the neural mechanisms of selective attention in children from different socioeconomic backgrounds: An event-related brain potential study. Developmental Science, 12 (4), 634–646.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18, 643–662.
Sturm, W., de Simone, A., Krause, B. J., Specht, K., Hesselmann, V., Radermacher, I., et al. (1999). Functional anatomy of intrinsic alertness: Evidence for a fronto-parietal-thalamic-brainstem network in the right hemisphere. Neuropsychologia, 37 (7), 797–805.
Tarkka, I. M., & Basile, L. F. H. (1998). Electric source localization adds evidence for task-specific CNVs. Behavioural Neurology, 11 (1), 21–28.
Thorell, L. B., Lindqvist, S., Nutley, S. B., Bohlin, G., & Klingberg, T. (2009). Training and transfer effects of executive functions in preschool children. Developmental Science, 12 (1), 106–113.
Titchener, E. B. (1909). Experimental psychology of the thought processes. New York: Macmillan.

Tucker, D. M., Hartry-Speiser, A., McDougal, L., Luu, P., & deGrandpre, D. (1999). Mood and spatial memory: Emotion and right hemisphere contribution to spatial cognition. Biological Psychology, 50, 103–125.
Valenza, E., Simion, F., & Umiltà, C. (1994). Inhibition of return in newborn infants. Infant Behavior & Development, 17, 293–302.
van Veen, V., & Carter, C. (2002). The timing of action-monitoring processes in the anterior cingulate cortex. Journal of Cognitive Neuroscience, 14, 593–602.
Venter, J. C., et al. (2001). The sequence of the human genome. Science, 291, 1304–1351.
Voytko, M. L., Olton, D. S., Richardson, R. T., Gorman, L. K., Tobin, J. R., & Price, D. L. (1994). Basal forebrain lesions in monkeys disrupt attention but not learning and memory. Journal of Neuroscience, 14 (1), 167–186.
Wainwright, A., & Bryson, S. E. (2002). The development of exogenous orienting: Mechanisms of control. Journal of Experimental Child Psychology, 82 (2), 141–155.
Wainwright, A., & Bryson, S. E. (2005). The development of endogenous orienting: Control over the scope of attention and lateral asymmetries. Developmental Neuropsychology, 27 (2), 237–255.
Walter, W. G. (1964). Contingent negative variation: An electric sign of sensorimotor association and expectancy in the human brain. Nature, 203, 380–384.
Winterer, G., et al. (2007). Association of attentional network function with exon 5 variations of the CHRNA4 gene. Human Molecular Genetics, 16, 2165–2174.
Wright, R. D., & Ward, L. E. (2008). Orienting of attention. New York: Oxford University Press.
Wynn, K. (1992). Addition and subtraction by human infants. Nature, 358, 749–750.

M. Rosario Rueda

M. Rosario Rueda, Departamento de Psicología Experimental, Universidad de Granada, Spain

Attentional Disorders

Laure Pisella, A. Blangero, Caroline Tilikete, Damien Biotti, Gilles Rode, Alain Vighetto, Jason B. Mattingley, and Yves Rossetti
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0016

Abstract and Keywords

While reviewing Bálint’s syndrome and its different interpretations, a new framework of the anatomical-functional organization of the posterior parietal cortex (dorsal visual stream) is proposed in this chapter based on recent psychophysical, neuropsychological, and neuroimaging data. In particular, the authors identify two main aspects that exclude eye and hand movement disorders as specific deficit categories (optic ataxia). The first aspect is spatial attention, and the second is visual synthesis. Ipsilesional attentional bias (and contralesional extinction) may be caused by the dysfunction of the saliency map of the superior parietal lobule (SPL) directly (by its lesion) or indirectly (by a unilateral lesion creating an inter-hemispheric imbalance between the two SPL). Simultanagnosia may appear as concentric shrinking of spatial attention due to bilateral SPL damage. Other symptoms such as spatial disorganization (constructional apraxia) and impaired spatial working memory or visual remapping may be attributable to the second, nonlateralized aspect. Even if described initially in patients with left-sided neglect, these deficits occur in the entire space and only after right inferior parietal lobule (IPL) damage. These two aspects might also correspond to the ventral and dorsal networks of exogenous and endogenous attention, respectively.

Keywords: Bálint’s syndrome, extinction, simultanagnosia, neglect, optic ataxia, constructional apraxia, endogenous/exogenous attention, visual remapping, saliency maps, dorsal/ventral network of attention, dorsal/ventral visual streams

Introduction

Figure 16.1 Schematic representation of the putative lesions underlying the main syndromes pertaining to the initial Bálint’s syndrome and observed after unilateral or bilateral parietal damage. Note that this model is schematic and deals with only two parietal systems (like Corbetta & Shulman, 2002): a symmetrical dorsal one (empty circles) labeled SPL-IPS similarly organized in the right and left cortical hemispheres (RH and LH) and a right-hemispheric ventral one (gray oval) that we labeled right IPL. A cross symbolizes damage to these systems. Optic ataxia and extinction would result from damage of the SPL-IPS system symmetrically (but see in the text nonlocalizationist accounts of extinction from the hemispheric balance hypothesis). Full neglect syndrome would result from damage to both the SPL and IPL in the right hemisphere or from damage restricted to the right IPL when, in the acute phase, it also induces inhibition of the right SPL (Corbetta et al., 2005; this effect is symbolized by an arrow and a cross with dashed lines in the right SPL). When the balance between the left and right SPLs is reestablished, only constructional apraxia may persist after a restricted lesion of the right IPL (Russell et al., 2010). Simultanagnosia would result from bilateral damage restricted to the superior parietal lobules, whereas extent of the damage to right IPL after stroke or posterior cortical atrophy (Benson’s syndrome is consecutive to larger bilateral parietal-temporal site of neuronal degeneration) would lead to more severe simultanagnosia with additional deficit of visual synthesis and revisiting behavior.

In this chapter, we focus on the role of the posterior parietal cortex (PPC) in visual attention and use the disorder of Bálint’s syndrome (Bálint, 1909) as an illustrative example. The PPC lies between the occipital-parietal sulcus (POS) and the post-Rolando sulcus. It includes a superior parietal lobule (SPL) above and an inferior parietal lobule (IPL) below the intraparietal sulcus (IPS). After reviewing the main interpretations (Andersen & Buneo 2002; Colby & Goldberg, 1999; Milner & Goodale 1995; Ungerleider & Mishkin 1982) of the functional role of the dorsal (occipital-parietal) stream of visual processing and the classic categorizations of Bálint’s syndrome, we propose a new framework for understanding the anatomical-functional organization of the PPC in the right hemisphere (Figure 16.1). In particular, we propose that the core function of the PPC is attention, and that it affects “vision for action” only as a consequence. Koch and Ullman (1985) postulated the existence of a topographic feature map that codes the “saliency” of all information within the visual field. By definition, the amount of activity at a given location within the saliency map represents the relative “conspicuity” or relevance of the corresponding location in the visual field. Interestingly, such a saliency map has been described by electrophysiologists within the macaque PPC. For instance, single-unit recordings from the lateral intraparietal (LIP) area suggest that the primate PPC contains a relatively sparse representation of space, with only those stimuli that are most salient or behaviorally relevant being strongly represented (Figure 16.2; Gottlieb et al., 1998). Area LIP provides one potential neural substrate for the levels of visual space representation proposed by Niebur and Koch (1997), according to whom visual perception does not correspond to a basic retinotopic representation of the visual input, but is instead a complex, prioritized interpretation of the environment (Pisella & Mattingley 2004). The autonomous, sequential selection of salient regions to be explored, overtly or covertly, is postulated to follow the firing rate of populations of neurons in the saliency map, starting with the most salient location and sampling the visual input in decreasing order of salience (Niebur & Koch 1997). The saliency map would thus be a representational level from which the pattern of attentional or motor exploration of the visual world is achieved.
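The sequential selection described above, which starts at the most salient location and then samples in decreasing order of salience, can be illustrated with a minimal sketch. This is a toy illustration under stated assumptions, not the model of Niebur and Koch (1997): the map is a small hand-made grid of arbitrary activity values, the function name `scan_order` is hypothetical, and suppressing each selected location so that it is not chosen again is an assumed simplification.

```python
from typing import List, Tuple


def scan_order(saliency: List[List[float]], n_fixations: int) -> List[Tuple[int, int]]:
    """Return the first n_fixations (row, col) locations, most salient first."""
    # Work on a copy so the caller's map is left untouched.
    remaining = [row[:] for row in saliency]
    n_cells = sum(len(row) for row in remaining)
    visited: List[Tuple[int, int]] = []
    for _ in range(min(n_fixations, n_cells)):
        # Winner-take-all: select the location with the highest remaining activity.
        r, c = max(
            ((i, j) for i, row in enumerate(remaining) for j in range(len(row))),
            key=lambda rc: remaining[rc[0]][rc[1]],
        )
        visited.append((r, c))
        # Assumed simplification: suppress the selected location entirely
        # so that the next selection moves on to the next most salient one.
        remaining[r][c] = float("-inf")
    return visited


# Toy 2 x 3 saliency map; the values are arbitrary activity levels.
saliency_map = [
    [0.1, 0.9, 0.2],
    [0.4, 0.3, 0.7],
]
print(scan_order(saliency_map, 3))  # [(0, 1), (1, 2), (1, 0)]
```

In this toy map the location with activity 0.9 is selected first, followed by 0.7 and 0.4, which mirrors the idea that exploration follows the ranking of activity in the map rather than the raw retinotopic layout.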

Dorsal and Ventral Visual Streams: Historical Review of Their Different Functional Interpretations

It is usually acknowledged that the central processing of visual information follows two parallel cortical pathways. From the early occipital areas, an inferotemporal pathway (or ventral stream) processes the objects for visual recognition and identification, and a parietal pathway (or dorsal stream) makes a spatial (metric) analysis of visual inputs and prepares goal-directed movements. By oversimplification, one often considers that the ventral stream is the pathway of “vision for perception” and the dorsal stream the pathway of “vision for action” (Milner & Goodale 1995), overlooking the fact that an extensive frontal-parietal network underlies the visual-spatial and attentional processes necessary for visual perception and awareness (Driver & Mattingley, 1998; Rees, 2001). Initially, a “what” versus “where” dissociation depicted the putative function of these two streams (Figure 16.3; Ungerleider & Mishkin 1982). Most subsequent theoretical formulations, however, have emphasized a dissociation between detecting the presence of a visual stimulus (“conscious,” “explicit,” “perceptual”) and describing its attributes (“what,” “semantic”), on the one hand, and performing simple, automatic motor responses toward a visual stimulus (“unconscious,” “implicit,” “visuomotor,” “pragmatic,” “how”) on the other (see Milner & Goodale, 1995; Jeannerod & Rossetti,

Page 3 of 56

Attentional Disorders

Figure 16.2 Saliency map. Gottlieb et al. (1998) recorded the activity of single neurons in the lateral intraparietal (LIP) of macaques. The receptive field of each neuron was first assessed in a passive task in which visual stimuli were flashed during eye fixation. The responses of the cells were then assessed when stimuli were brought into their receptive field by a saccade. A circular array of eight stimuli remained stably presented from the beginning of the experi ment, so that when the monkey made a saccade to ward the center of the array, the receptive field of the neurons matched the location of one of the array elements. In this first condition, the same stimulus that activated the cell strongly when flashed in the passive condition entered the receptive field but elicited far less neuronal response. In a variant task, only seven symbols of the array were present initially on the screen; the eighth (missing) symbol appeared anew in the receptive field of the cell before to make the saccade toward the center of the array. This time, the neuron responded intensely, showing that its ac tivity was critically dependent on the abrupt onset of the stimulus, which rendered it salient. In another variant, when the monkey maintained a peripheral fixation, a cue appeared that matched one stimulus of the array. Then the monkey made a first saccade to the center of the array (bringing one array stimu lus into the receptive field) and a second saccade to the cued array element. When the cued element was brought into the receptive field under study by the first saccade, the neuron discharge started around the first saccade and continued until after the second saccade (see panel a). In contrast, when an uncued element of the array was brought into the receptive field under study, the neuron did not respond to the irrelevant array stimulus, even though it entered the receptive field by means of the first saccade (see panel b).

Page 4 of 56

Attentional Disorders 1993, Rossetti & Revonsuo, 2000). (p. 322) This evolution in the functional interpreta tion of the dorsal stream emerged from a large body of psychophysical data on vi sual and visual-motor func tions, which became fertile ground for speculating on Figure 16.3 Famous use of the landmark test to the neural basis of percep characterize the function of the dorsal stream and tion–action dissociations distinguish it from the ventral stream function. In (e.g., Milner & Goodale, this test, an object has to be localized with respect to 1995, 2008; Rossetti, 1998; others lying on both sides of it (landmark test). After a lesion to the dorsal visual stream (black area of the Rossetti & Pisella, 2002). monkey brain on the right), the monkeys are unable When human neuropsycholo to find where the food is based on such a spatial gy is considered, the con landmark (hole the closest to the landmark) but re trast between optic ataxia mained able to find the food based on an associated texture to identify (test on the left). The reverse pat (OA) on the one hand and ac tern is observed after a lesion to the ventral visual tion-blindsight and visual ag stream (black area of the monkey brain on the left). nosia on the other repre The perceptual version of the line bisection test (Bisi ach et al., 1998, Fink et al., 2000, Fierro et al., 2000) sents the core argument for is also a landmark test. the suggested perception–ac Lesion study in monkey from Ungerleider & Mishkin, tion division between the 1982. ventral and dorsal anatomi cal streams (Milner & Goodale 1995, 2008; Rossetti & Pisella, 2002). The ventral stream is fed almost exclusively by V1 and organized in a serial hierarchical fashion, whereas the dorsal stream is also fed by the supe rior colliculus and is organized in a more parallel fashion, including numerous shortcuts (Nowak & Bullier, 1997; Rossetti, Pisella, & Pélisson, 2000). 
Consequently, after a lesion in V1 (action-blindsight), only the dorsal stream remains active (Girard et al., 1991, 1992; see also review in Danckert & Rossetti, 2005). Conversely, OA is a neurological condition encountered after damage to the PPC or dorsal stream. Behaviorally, action-blindsight and OA seem to be two sides of the same coin: In action-blindsight, patients remain able to guide actions toward stimuli that they do not see, whereas in OA, patients are impaired in guiding actions toward visual stimuli that they do see. Similarly, after lesions to the inferior temporal lobe (visual agnosia), patient D.F. could not recognize objects but could nevertheless perform simple reach-and-grasp movements that fitted the object's location and its visual properties of size and orientation (Carey et al., 1996; Goodale et al., 1991). These neuropsychological dissociations have provided decisive elements for considering the PPC as a vision-for-action pathway (Milner & Goodale, 1995), reinforced by the work of Andersen and Buneo (2002) indicating that the specialized spatial maps of the PPC are involved in the planning of different actions (eye, hand, fingers).

As reviewed later in this chapter, however, the neurological consequences of lesions of the dorsal stream are not restricted to OA, and even OA patients show deficits in the perceptual domain. Bálint-Holmes syndrome, which arises from bilateral lesions of the PPC extended toward the inferior parietal lobule (IPL) and temporal-parietal junction (TPJ) in the right hemisphere, includes OA but also two symptoms affecting visual perception: unilateral spatial neglect and simultanagnosia. In these patients, the ventral stream is preserved bilaterally, but visual perception is limited. In patients with neglect, perceptual awareness is restricted to the ipsilesional side of space or objects. In patients with simultanagnosia, perceptual awareness is restricted to a reduced central field of view, or to just one object among others, even when the objects are superimposed in the same location of space (e.g., in the case of overlapping figures). After a lesion to the IPL and TPJ in the left hemisphere, patients may present with Gerstmann’s syndrome, which will not be a focus of this chapter but which also includes a visual perceptual symptom. In addition to dysgraphia, dyscalculia, and finger agnosia, these patients are impaired at determining whether a letter is correctly oriented (left-right confusion). This also suggests a role of the left hemisphere dorsal stream in building oriented (canonical) perceptual representations. Accordingly, a neuroimaging study by Konen and Kastner (2008) revealed object-selective responses in visual areas of both the dorsal and ventral streams. Foci of activity in the lateral occipital cortex (ventral stream) and in the IPS (dorsal stream) have been shown to represent objects independent of viewpoint and size, whereas the responses are viewpoint and size specific in the occipital cortex. Such symptoms, pertaining to Gerstmann’s or Bálint’s syndrome, affect the perceptual domain after a lesion to the PPC and thereby highlight a necessary interaction of dorsal and ventral streams for visual perception.

Figure 16.4 Cortical visual representation magnifies central vision. The ventral stream includes multiple specialized representations of central vision through the prism of the processing of different elementary visual attributes, while areas of the dorsal stream mainly represent peripheral vision. Adapted with permission from Wade et al., 2002.

According to an alternative characterization of the functional distinction between ventral and dorsal streams, the ventral stream includes multiple areas specialized for different elementary visual attributes (e.g., color, texture, shape) mainly processed in central vision, whereas areas of the dorsal stream mainly represent peripheral vision or the entire visual field (Figure 16.4; Wade et al., 2002). An interaction between the ventral and dorsal visual streams for perception therefore appears necessary as soon as one takes into account the structural organization of the retina (and in particular the distribution of cones and rods), which allows a detailed analysis only in the restricted area corresponding to the fovea (Figure 16.5). If the eyes are fixed, the ventral stream will provide information about only a small central part of the visual scene (see Figure 16.5), even though the subjective experience of most normally sighted individuals is of a coherent, richly detailed world where everything is apprehended simultaneously. This subjective sensation of an instantaneous, global, and detailed perception of the visual scene is an illusion, as hinted at by the phenomenon of “change blindness” in healthy subjects (reviewed in Pisella & Mattingley, 2004), and is made possible by mechanisms of “active vision” (Figure 16.6) relying on the PPC (Berman & Colby, 2009).

Role of the Dorsal Stream in Visual Perception: Active Vision

Figure 16.5 The structural organization of the retina (and in particular the distribution of cones and rods) allows a detailed analysis only in the restricted area corresponding to the fovea. If the eyes are fixed, only a small central part of the visual scene will be perceived with good acuity.

There are three mechanisms of active vision that rely on the PPC (visual attention, saccadic eye movements, and visual remapping), as we outline below.

Figure 16.6 A complex triad of processes underlying oculomotor mechanisms, spatial representations, and visual attention are in play for visual perception.

First, shifts of attention without eye displacements (known as covert attention) have been shown to improve spatial resolution and contrast sensitivity in peripheral vision (Carrasco et al., 2000; Yeshurun & Carrasco, 1998). Attention can be defined as the neural processes that allow us to prioritize information of interest from a cluttered visual scene while suppressing irrelevant detail. Note that a deficit of visual attention can be expressed as a spatial deficit (omissions of relevant information in contralesional peripheral visual space, i.e., eye-centered space) or a temporal one (increased time needed for relevant information to be selected by attention and further processed; Husain, 2001). So a spatial bias and a temporal processing impairment may be two sides of the same coin. The level of interest or relevance of a stimulus can be provided by physical properties that allow it to be distinguished easily from the background (bottom-up or exogenous attentional selection) or by its similarity to a specific target defined by the current task or goal (top-down or endogenous attentional selection).

Second, rapid eye movements (known as saccades or overt attention) can bring a new part of the visual scene to the fovea, allowing it to be analyzed with optimal visual acuity. Active ocular exploration of a visual scene (three saccades per second on average) during free viewing has been demonstrated using video-oculography by Yarbus (1967). The global image of the visual scene is thus built up progressively through active ocular exploration, with each new saccade bringing a new component of the visual scene to the fovea to be analyzed precisely by the retina and the visual cortex. Most visual areas in the occipital and posterior inferotemporal cortex (ventral stream) are organized with a retinal topography, with a large magnification of the representation of central vision relative to peripheral vision (see Figure 16.4). At the level of these retinocentric visual representations, the components of the visual scene that are successively brought to the fovea by ocular exploration overwrite each other at each new ocular fixation. These different components are thus analyzed serially by the ventral stream as unlocalized snapshots. A spatial linkage of these different details is necessary for global visual perception.

This spatial linkage (or visual synthesis) of multiple snapshots is the third mechanism necessary to perceive a large, stable, and coherent visual environment. Brain areas more or less directly related to saccadic eye movements (frontal and parietal eye fields, superior colliculus) demonstrate “oculocentric” (instead of retinocentric) spatial representations acting as “spatial buffers”: The location of successively explored components is stored and displaced in spatial register with the new eye position after each saccade (Figure 16.7; Colby et al., 1995). The PPC is specifically involved in the “remapping” (or “updating”) mechanisms compensating for the continuous changes of eye position (Heide et al., 1995), which are crucial to building these dynamic oculocentric representations. Transsaccadic oculocentric remapping processes have been explored using computational models (e.g., Anderson & Van Essen, 1987) and demonstrated using specific psychophysical paradigms like the double-step saccade task (Figure 16.8), in which the retinal location of a stimulus has to be maintained in memory and recoded with respect to a new ocular location owing to an intervening saccade. Such paradigms have been used in monkey electrophysiology (e.g., Colby et al., 1995), in human neuroimaging (Medendorp et al., 2003; Merriam et al., 2003, 2007), after chronic lesions to the PPC (Duhamel et al., 1992; Heide et al., 1995; Khan et al., 2005a, 2005b; Pisella et al., 2011), and during transcranial magnetic stimulation (TMS) of the PPC (Morris et al., 2007; Van Koningsbruggen et al., 2010).
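The geometry of the double-step saccade task can be made concrete with a small worked sketch. It assumes that positions are expressed as two-dimensional vectors in degrees relative to the initial fixation point, and it reduces remapping to a vector subtraction. The function names and the numerical values are hypothetical and serve only to illustrate the computation the task is designed to probe, not how the PPC implements it.

```python
from typing import Tuple

Vector = Tuple[float, float]  # (horizontal, vertical) in degrees; assumed convention


def remap_second_target(retinal_b: Vector, first_saccade: Vector) -> Vector:
    """Motor vector of the second saccade (A -> B).

    `retinal_b` is the retinal vector of target B as seen from the initial
    fixation point (FP -> B); `first_saccade` is the displacement FP -> A.
    Remapping subtracts the first displacement from the stored retinal vector.
    """
    return (retinal_b[0] - first_saccade[0], retinal_b[1] - first_saccade[1])


def landing_without_remapping(retinal_b: Vector, first_saccade: Vector) -> Vector:
    """Where the eye would land (B') if the stored retinal vector were
    executed unchanged from position A."""
    return (first_saccade[0] + retinal_b[0], first_saccade[1] + retinal_b[1])


# Hypothetical layout: A is 10 degrees right of FP; B is 10 right and 8 up.
fp_to_a = (10.0, 0.0)
fp_to_b = (10.0, 8.0)

print(remap_second_target(fp_to_b, fp_to_a))        # required motor vector A -> B
print(landing_without_remapping(fp_to_b, fp_to_a))  # erroneous endpoint B'
```

With these illustrative numbers, a correctly remapped second saccade is purely vertical, whereas executing the unmodified retinal vector from A would overshoot horizontally by the amplitude of the first saccade.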

Figure 16.7 Visual remapping in the lateral intraparietal (LIP) area in monkey. Monkey electrophysiology has described dynamic oculocentric representations in which the neuronal response can outlast the duration of a visual stimulus of interest within the retinotopic receptive field, and this “memory” activity can be transferred to another neuron to recode the location of the (extinguished) stimulus with respect to the new ocular position. Such neuronal activity has been described in oculomotor centers also known to be crucial for attention, that is, the superior colliculus (Mays & Sparks, 1980), the frontal eye field (Goldberg & Bruce, 1990; Tian et al., 2000), and the LIP area. The role of the LIP area appears crucial for visual remapping because it also contains neurons whose activity starts in anticipation of a saccade that will bring the location of the extinguished visual stimulus into their receptive field (review in Colby et al., 1995; Gottlieb et al., 1998). From Colby et al., 1995, with permission.

Consistent with these putative mechanisms of active vision, the perceptual symptoms of Bálint-Holmes syndrome can be seen as arising from deficits in attentive vision, ocular exploration, and global apprehension of visual scenes. One common component of the syndrome is a limited capacity of attentive vision such that patients fail to perceive at any one time the totality of items forming a visual scene (leading to “piecemeal” perception of the environment and a local perceptual bias). Descriptions or copies of images are accordingly composed of elements of the original figure without perception of the whole scene. Patients not only explore working space in a disorganized fashion but also may return to scrutinize the same item repeatedly. This “revisiting behavior” may lead, for example, to patients counting more dots than are actually present in a test display (personal observation of the behavior of patients with Bálint’s syndrome after posterior cortical atrophy). This revisiting behavior has been reported during ocular exploration of visual scenes in patients with unilateral neglect and ascribed to a spatial working memory deficit (Husain et al., 2001). Indeed, a study has demonstrated that patients with parietal neglect show, in addition to their characteristic right attentional bias, a specific deficit of working memory for location and not for color or shape (Figure 16.9; Pisella et al., 2004). Pisella and Mattingley (2004) ascribed this revisiting behavior and spatial working memory deficit to a loss of “visual remapping” or “visual synthesis” due to parietal lobe damage, which would contribute to a severe visual neglect syndrome.

This association of (1) spatially restricted attention and (2) a visual synthesis deficit offers a consistent framework for understanding the different expressions of posterior parietal lobe lesions. Attention will be restricted to central vision in bilateral lesions or to the ipsilesional (eye-centered) visual field in unilateral lesions, affecting visual perception and action in peripheral vision. An additional deficit of visual synthesis might occur in the entire visual field, producing a disorganization of active vision mechanisms and thereby increasing the severity of the patient’s restricted field of view (unilateral in neglect and bilateral in simultanagnosia).


Bálint-Holmes Syndrome

Figure 16.8 Double-step saccade task. A, Example of a double-step stimulus with the two targets, A and B, being flashed successively while the gaze is directed to a central fixation point (FP). When both saccades are performed after all of the targets have disappeared, the motor vector of the second saccade (A → B) is different from the retinal vector of the second target (FP → B or A → B’). However, in this condition, the saccade toward position B is achieved correctly both in humans and in animals. There is thus a need to postulate remapping mechanisms allowing the oculomotor system to anticipate the new retinal position B by integrating the displacement on the retina produced by the first saccade toward position A. (Redrawn from Heide & Kömpf, 1997.) B, Results of the double-step task in patients with a left or right parietal lesion (in the posterior parietal cortex [PPC]) compared with patients with a right frontal lesion (in the frontal eye field [FEF] or prefrontal cortex [PFC]) and controls. The plot represents the mean absolute error of final eye position (FEP) after double-step trials, plotted separately for the four different stimulus conditions of the study (R-R, centripetal double-step within the right hemifield; L-L, centripetal double-step within the left hemifield; R-L, double-step between hemifields starting with target on the right; L-R, double-step between hemifields starting with target on the left). Double-steps with retinospatial dissonance (upper panel) necessitate remapping processes, whereas double-steps with no retinospatial dissonance do not need remapping processes. Significant errors relative to control performance are indicated by asterisks. Note that patients with a parietal lesion exhibit errors specific to double-steps with retinospatial dissonance (contrary to patients with a frontal lesion): patients with a right PPC lesion (black bars) are impaired for both between-hemifield double-steps and the L-L stimulus, whereas patients with a left PPC lesion are impaired only for between-hemifield stimuli. Standard deviations of the control group are indicated by vertical lines. Results in patients with a left prefrontal cortex lesion are not shown in this diagram because they were not significantly different from the control group. (Reproduced from Heide et al., 1995, with permission.)


The syndrome reported by Rezső Bálint in 1909 and later, in other terms, by Holmes (1918) is a clinical entity that variously combines a set of complex spatial behavior disorders following bilateral damage to the occipital-parietal junction (dorsal stream). Both the diversity of terminology used in the literature and the bias in clinical descriptions, which often reflect the authors’ particular opinions on underlying mechanisms, add to the difficulty of describing and comprehending this rare and devastating syndrome. Despite these flaws, Bálint-Holmes syndrome can be identified at bedside examination and allows a robust prediction of lesion localization. Initially, Bálint’s syndrome was described as a triad, composed of the following:
• Optische Ataxie, a defect of visually guided hand movements characterized by spatial errors when the patient uses the contralesional hand and/or reaches for objects in peripheral vision in the contralesional visual field
• Räumliche Störung der Aufmerksamkeit, described as a lateralized spatial attention disorder in which attention in extrapersonal space is oriented to the right of the body midline and in which stimuli lying to the left of fixation are neglected, corresponding to what is now called unilateral neglect
• Seelenlähmung des Schauens, described as an extreme restriction of visual attention such that only one object is seen at a time, which corresponds to what Luria later called “disorder of simultaneous perception” (1959), following the term “simultanagnosia” coined by Wolpert in 1924. This set of symptoms has often been translated as “psychic paralysis of gaze” to highlight that although the patients exhibit no visual field defect and no oculomotor paralysis, they manifest no attention for visual events appearing in peripheral vision.


Figure 16.9 Spatial working memory deficit in the entire visual space in parietal neglect. Added to the attentional left-right gradient, a deficient spatial working memory for the whole visual space is evidenced by the difference between conditions of change detection with (1 second of interstimulus interval; black lines) and without (white lines) delay, in parietal neglect only. Note that the location change always occurred in one object only, within a vertical quadrant (= column of the grid), as illustrated. Adapted with permission from Pisella et al., 2004.

The related “visual disorientation” syndrome described a few years later by Holmes (1918; Smith & Holmes, 1916) in soldiers with bilateral parietal lesions (see also Inouye, 1900, cited by Rizzo & Vecera, 2002) highlighted a particular oculomotor disorder: wandering of gaze in search of peripheral objects and difficulty maintaining fixation. This eye movement disorganization, later labeled gaze ataxia or apraxia (reviewed in Vallar, 2007), was accompanied by a complete deficit for visually guided reach-to-grasp movements, even when performed in central vision.

The comparison between Bálint’s and Holmes’ descriptions of the consequences of lesions to the posterior parietal cortex brings out three differences that still correspond to debated questions and that constitute the plan of this chapter.

First, the lateralized aspect of the behavioral deficit for left visual space in Bálint’s syndrome (corresponding to räumliche Störung der Aufmerksamkeit) does not appear in Holmes’ description of his patients, suggesting that the component known nowadays as unilateral neglect might be easily dissociated from the two other components of the triad by its specific right-hemispheric localization. Visual extinction is often considered a minimal form of neglect, and a prevalence of right-hemispheric lesions is reported for visual extinction as well as for neglect (Becker & Karnath, 2007). Nevertheless, their dissociation has been shown clinically (Karnath et al., 2003) and experimentally: Contralesional visual extinction is symmetrically linked to the function of the right or left superior parietal lobules (Hilgetag et al., 2001; Pascual-Leone, 1994), whereas some aspects of neglect (especially those concerned with visual scanning; Ashbridge et al., 1997; Ellison et al., 2004; Fierro et al., 2000; Fink et al., 2000; Muggleton et al., 2008) are distinct from visual extinction and specifically linked to the right inferior parietal cortex (and superior temporal cortex). Furthermore, the recent literature tends to highlight the contribution of nonlateralized deficits to the neglect syndrome, that is, deficits specifically linked to right-hemispheric damage but not spatially restricted to the contralesional visual space, like sustained attention or spatial working memory (Husain, 2001, 2008; Husain & Rorden, 2003; Malhotra et al., 2005, 2009; Pisella et al., 2004; Robertson, 1989). The dissociation between lateralized and nonlateralized deficits following parietal damage is re-explored and redefined in this chapter, notably in the section devoted to neglect, discussed with respect to visual extinction.

A second difference between Bálint and Holmes is their conflicting interpretation of the visual-manual deficits of OA. Optische Ataxie was interpreted by Bálint, and further by Garcin et al. (1967) and Perenin and Vighetto (1988), as a specific interruption of visual projections to the hand motor center. For Holmes, these visual-manual errors simply reflected a general visual disorientation considered as basically visual (resulting from a retinal or an extraocular muscle position sense deficit). The debate between a visual-spatial or a specifically visual-manual nature of OA deficits has been renewed by recent data showing subclinical saccadic and attentional deficits in patients with OA (reviewed in Pisella et al., 2009). In this chapter, we explore how these attentional deficits can be distinguished from those of neglect patients.

A third difference is that Holmes presented the eye movement troubles as causing the deficit, whereas Bálint considered them consequences of higher level attentional deficits (Seelenlähmung des Schauens, most often translated in English as “psychic paralysis of gaze”). This question corresponds to the conflicting “intentional” and “attentional” functional views of the parietal cortex (Andersen & Buneo, 2002; Colby & Goldberg, 1999). For the former, the PPC is anatomically segregated into regions permitting the planning of different types of movements (reach, grasp, saccade), whereas for the latter, the different functional regions represent locations of interest in external and internal space with respect to different reference frames.


Figure 16.10 Unilateral neglect. Left, Unilateral neglect may behaviorally manifest as a failure to consider the left side of the body (e.g., while shaving), of objects (e.g., while preparing an apple pie; Rode et al., 2007a), or of mental images (e.g., while mentally exploring a map of France; Rode et al., 1995, 2007b). Right, Unilateral neglect syndrome is more classically diagnosed using a series of paper-and-pencil tests, which should include a drawing from memory (e.g., daisy), a drawing copy (e.g., the bicycle), a cancellation task in which each object of a visual scene must be individuated as if it were to be counted (e.g., line cancellation), the midline judgment in its motor (line bisection) or perceptual (landmark test) version, and a spontaneous writing test on a white sheet.

This chapter reviews historical and recent arguments that have been advanced in the context of these three debates, with a first section on neglect, a second on OA, and a third on the less well-defined third component of Bálint’s syndrome: psychic paralysis of gaze. Visual extinction is discussed in each of these three sections. This clinical review of attentional disorders should allow us, through the patients’ behavioral descriptions, to start to delineate theoretically and neurologically the concepts of selective versus sustained attention, object versus spatial attention, saliency versus oculomotor maps, and exogenous versus endogenous attention. A conceptual segregation, different from Bálint’s, of these symptoms consecutive to lesions of the PPC will be progressively introduced and finally proposed based on recent advances in the understanding of posterior parietal functional organization made through psychophysical assessments of patients and neuroimaging data (see Figure 16.1).

Unilateral Neglect

Basic Arguments for Dissociation Between Neglect and Extinction

Unilateral neglect is defined by a set of symptoms in which the left part of space or objects is not explicitly behaviorally considered. This lateralized definition (left-sided neglect instead of contralesional neglect), and its global distinction from contralesional visual extinction, is acknowledged as soon as one considers that only the neglect symptoms (and not the extinction symptoms) appear in the most complex, typically human, types of visual-spatial behavior, such as ecological tasks (Figure 16.10, left panel; e.g., making up or shaving only the right side of the face, preparing an apple pie, mentally representing a well-known space, art) or paper-and-pencil tasks (see Figure 16.10, right panel; e.g., drawing, copying, enumerating, evaluating lengths between left and right spatial positions, writing on a white sheet).

Figure 16.11 Percentage of correct detection of left-sided targets. Histograms are plotted as a function of trial type (single versus double presentation) for each of the seven conditions illustrated below (each panel shows the relative locations of visual stimuli—targets and fixation cross): fixate left, fixate center, and fixate right; fixate right-left target at 10.8° or 21.6°; fixate left-right target at 10.8° or 21.6°. The cross disappeared 50 ms before target onset. The pattern of performance suggests extinction in eye-centered coordinates. Reprinted with permission from Mattingley et al. Copyright © 2000 Routledge.

However, it is more classical to distinguish unilateral neglect from visual extinction based on a more experimental subtlety, the latter being a failure to report an object located contralesionally only when it competes for attentional resources with an object located ipsilesionally. Critically, report of contralesional items presented in isolation should be normal (or better than for simultaneous stimuli) in visual extinction. Visual extinction is thus revealed in conditions of brief presentation and attentional competition, whereas visual neglect refers to loss of awareness of a contralesional stimulus even when it is presented alone and permanently, in free gaze conditions. Note that the use of the term “contralesionally” rather than “in the contralesional visual field” in the above definitions reflects the lack of specification of the reference frame in which these deficits can occur. Even if visual extinction is usually tested between visual fields and a strictly eye-centered pattern has been described (Figure 16.11; Mattingley et al., 2000), Làdavas (1990) has reported cases in which an item in the ipsilesional field can be extinguished by a more eccentric one (i.e., allocentric extinction). Note that this is still consistent with the notion of a left-right gradient of salience or competitive strength across the entire visual field (Figure 16.12; see Figure 16.9). In cross-modal extinction between a visual stimulus near the hand and a tactile stimulus on the other hand, a limb-centered pattern revealed by crossing the hands has allowed investigators to dissociate egocentric and limb-centered spatial coordinates (Bartolomeo et al., 2004). Neglect and extinction have in common this possible occurrence in multiple reference frames: object-based versus space-based, allocentric versus egocentric (in body space or external space: eye-centered; Egly et al., 1994; Hillis & Caramazza, 1991; Ota et al., 2001; Riddoch et al., 2010; Vuilleumier & Rafal, 2000; but see the review by Driver, 1999, for interpretation as a common attentional reference frame), and in perceptual (in multiple sensory modalities) or motor domains (motor extinction: Hillis et al., 2006; motor neglect: Mattingley et al., 1998; see reviews in Heilman, 2004; Husain et al., 2000).

Figure 16.12 Effect of adaptation to a rightward prismatic deviation (10°, same as in Rossetti et al., 1998) on the attentional left-right gradient in a patient with extinction. The attentional gradient was measured by reaction time (RT, in milliseconds) to detect visual targets presented at varied eccentricities (pixels) in the right and left visual fields. When the target was detected, the subject had to respond by lifting the finger from a tactile screen. Two sessions were performed before (pre) and two sessions after (post) prismatic adaptation. The ANOVA showed no significant main effect of prisms (pre versus post: F(1,154) = 2.3; p = 0.12), a significant main effect of visual field (left versus right: F(1,154) = 94; p < 0.01), and a trend toward an interaction between prisms and visual field (p = 0.10), which allows us to perform planned comparisons: pre versus post was significant in the right visual field (p < 0.05) and not in the left visual field (p = 0.94). The effect, therefore, appears as a decrease of the ipsilesional hyperattentional bias, resulting in a flatter, more balanced attentional gradient.


Figure 16.13 A patient with a right focal posterior parietal infarct. Data indicated that the patient had neither clinical neglect (assessed by classic neuropsychological paper-and-pencil tests like line bisection, drawing, and cancellation) nor clinical extinction (tested with simultaneous dots flashed in the right and left visual fields during fixation of a central cross; left panel, dark gray bars). However, a deficit of covert attention in the contralesional visual field was revealed as an extinction pattern when letters were flashed instead of dots (middle panel, light gray bars): the deficit of detection of the letter presented in the left visual field appeared only when it was in competition with a letter in the right visual field (bilateral presentation trials). The contralesional attentional deficit worsened and manifested as a neglect pattern when letter identification (right panel, black bars) was required instead of simple detection (left, right, or both): the report of the letter presented in the left visual field was affected not only in bilateral presentations but also in single presentations.

Although the literature has treated extinction as a single phenomenon (contrary to neglect), this may not be the case, which may explain why, as for neglect, the anatomical site of visual extinction is still debated. Both the tests used for diagnosis and the conditions in which they are administered are crucial. For example, it seems that the processes differ depending on whether the stimuli are presented at short or wide eccentricities (Riddoch et al., 2010). Extinction is classically tested with stimuli presented bilaterally at relatively short eccentricity (perhaps to ensure reasonable report of single items in the contralesional visual field; Riddoch et al., 2010) and has been shown to benefit from similarity-based grouping (Mattingley et al., 1997). However, the reverse effect of similarity has been observed with items presented at wide eccentricities. Riddoch et al. (2010) argue that at far separations participants need to select both items serially, with attention switched from one location to another. This may also be the case when one increases task difficulty (Figure 16.13) or attentional demand. In such conditions, extinction might then arise from a limitation in short-term visual memory rather than from competition for selection between stimuli that are available in parallel. Given that Pisella et al. (2004) have shown that spatial working memory deficit is a nonlateralized component of parietal neglect, this makes the frontier between patients considered as exhibiting extinction or neglect rather confusing.

As a consequence, visual extinction is often considered a mild form of visual neglect. Di Pellegrino et al. (1998) have demonstrated in a left-sided extinction patient that a stimulus presented in the left visual field can be extinguished by a right-sided one even if the left-sided stimulus was presented several hundred milliseconds before the right one. A similar default contralesional disadvantage, even when there is no initial right-sided stimulus for attention to engage upon, has been revealed through a temporal order judgment paradigm by Rorden et al. (1997). Visual extinction, after a lesion to the SPL-IPS network, is therefore better defined as a deficit of visual saliency for stimuli in the contralesional visual field than as a deficit appearing only in the condition of attentional competition from ipsilesional visual stimulation. As illustrated in Figure 16.13, extinction and neglect, as tested with dots in a simple environment, can therefore be considered the gradual expressions of the same attentional deficit (ipsilesional lateralized bias) because they can both be expressed by the same patient, depending on the task difficulty. In sum, these experimental conditions using dots are unsatisfactory to distinguish between visual extinction and neglect, whereas their clinical distinction in terms of handicap and recovery and with paper-and-pencil tasks is almost intuitive (as mentioned above). Alternatively, the presence of nonlateralized components, in addition to the common lateralized bias of attention, could be used as a criterion to distinguish “clinical” unilateral neglect from visual extinction.

Lateralized and Nonlateralized Deficits Within the Neglect Syndrome

The clinical prevalence of spatial neglect for a right-hemispheric lesion is supported by converging arguments for a specialization of the right hemisphere for spatial cognition and attention throughout the whole visual space, as already suggested by Kinsbourne (1993). Such representation of the whole space in the right IPL is crucially used for complex visual-spatial tasks that have to be performed throughout the visual field because they require a visual synthesis of detailed snapshots (Malhotra et al., 2009), such as the “landmark” tests that require the middle between two objects to be mentally defined in order to judge whether a landmark object is in the middle or closer to the right or to the left object (illustrated in Figure 16.3; Bisiach et al., 1998; Fierro et al., 2000; Fink et al., 2000; Ungerleider & Mishkin 1992). Accordingly, the recent literature on neglect has highlighted that after right-hemispheric damage to IPL (Mort et al., 2003; Vallar & Perani, 1986), neglect patients present with deficits of visual space exploration and integration that are not restricted to the contralesional hemifield (Husain, 2008; Husain et al., 2001; Husain & Rorden, 2003; Kennard et al., 2005; Pisella et al., 2004; Wojciulik et al., 2001). Pisella and Mattingley (2004) have postulated that this specific representation of the whole space within the human IPL may be a privileged map in which oculocentric remapping processes operate, thereby allowing coherent visual synthesis. Human studies have revealed implication of both occipital (Merriam et al., 2007) and parietal (Heide et al., 1995; Medendorp et al., 2003; Morris et al., 2007) lobes in visual remapping, with a dominance of the right PPC (Heide & Kömpf, 1997; Heide et al., 2001; Kennard et al., 2005; Malhotra et al., 2009; Mannan et al., 2005; Pisella et al., 2011; van Koningsbruggen et al., 2010). Heide et al. (1995) have provided a neuropsychological assessment of the brain regions specifically involved in remapping mechanisms using the double-step saccadic paradigm with four combinations of saccade directions in different groups of stroke patients. Patients with left PPC lesions were impaired when the initial saccade was made toward the right, followed by a second saccade toward the left (right-left condition), but not in the condition of two successive rightward saccades. Patients with right PPC lesions were impaired in left-right and left-left conditions, and also in the right-left condition (only the right-right combination was correctly performed; see Figure 16.8). As reviewed in Pisella and Mattingley (2004), this asymmetry in remapping impairment matches the clinical consequences of lesions of the human IPL and is probably due to an asymmetry of visual space representation between the two hemispheres. Later, Heide and Kömpf (1997) wrote about their study (Heide et al., 1995): “our data confirm the key role of the PPC in the analysis of visual space with a dominance of the right hemisphere” (p. 166), and provided new information on the lesions and symptoms of their patient groups: the focus of PPC lesions was located “in the inferior parietal lobule along the border between the angular and supramarginal gyrus, extending cranially toward the intraparietal sulcus, caudally to the temporo-parietal junction, and posteriorly into the angular gyrus” (p. 158). Compatible with this lesion site, patients of the right PPC lesion group in the study of Heide et al. (1995) presented with symptoms of hemineglect. Furthermore, their deficit in the double-step saccade task did correlate with patients’ impairment in copying Rey’s complex figure (Figure 16.14), but not with other tests measuring severity of left hemineglect (Heide & Kömpf, 1997). Accordingly, Pisella and Mattingley (2004) have suggested that an impairment of remapping processes may contribute to a series of symptoms that pertain to unilateral visual neglect syndrome and that are unexplained by the attentional hypothesis alone (the ipsilesional attentional bias), such as revisiting, spatial transpositions, and disorganization in the whole visual field. The severity of neglect might depend on two dissociated factors: 1) the strength of the ipsilesional attentional bias, and 2) the severity of nonlateralized attentional deficits.
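The remapping demand of the double-step saccade paradigm discussed above can be illustrated with a small numerical sketch. It assumes the standard textbook description of the task (both targets are flashed before the first saccade, so the retinal vector of the second target must be updated by the first saccade vector); the function names and the target positions are hypothetical and are not taken from Heide et al. (1995).

```python
from typing import Tuple

# A 2-D vector in degrees of visual angle: (horizontal, vertical).
# Positive horizontal values are rightward. Illustrative convention only.
Vec = Tuple[float, float]


def remapped_second_saccade(t1_retinal: Vec, t2_retinal: Vec) -> Vec:
    """Motor vector of the second saccade when remapping is intact.

    Both targets are coded relative to the initial fixation point. Once the
    eye has landed on T1, T2 lies at (T2 - T1) relative to the new gaze
    position, so the stored retinal vector must be updated accordingly.
    """
    return (t2_retinal[0] - t1_retinal[0], t2_retinal[1] - t1_retinal[1])


def unremapped_second_saccade(t2_retinal: Vec) -> Vec:
    """Motor vector if the original retinal vector of T2 is reused
    without any update (hypothetical complete failure of remapping)."""
    return t2_retinal


# Hypothetical right-left condition: first saccade 10 deg rightward,
# second target 4 deg right of the initial fixation point.
t1 = (10.0, 0.0)
t2 = (4.0, 0.0)
intact = remapped_second_saccade(t1, t2)     # leftward saccade of 6 deg
impaired = unremapped_second_saccade(t2)     # erroneous rightward saccade
```

In this toy right-left trial, the correct second saccade is leftward even though the second target was flashed in the right visual field, which is why the condition probes remapping rather than simple retinotopic localization.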


Figure 16.14 Spatial transpositions on the copy of Rey’s figure following right posterior parietal lesion. In his copy, the patient with neglect (bottom panel) not only omits most elements of the left side of the figure but also inappropriately adds to the right side some elements pertaining to the left side (Rode et al., 2007a). The patient with constructional apraxia without neglect (upper panel) copies almost all the figure components but exhibits errors in their relative localization (Heide & Kömpf, 1997). Another patient with constructional apraxia and neglect (unpublished) had to search for the target (circle), which normally easily “pops up” among the distracters (squares). The lines represent the continuous eye position recorded until the patient provided his response (target present or absent). As shown by the ocular tracking, both patients with constructional apraxia and neglect showed much revisiting behavior during their visual search, with lack of exploration of the left part of the visual scene exhibited in addition in the neglect patient.

Another attempt to account for the prevalence of visual neglect following right-hemispheric lesion has highlighted the specialization of the right hemisphere in “nonspatial” processes that are critical for visual selection, such as sustained attention (or arousal; Robertson, 1989). In this respect, the review of Husain (2001) on the nonspatial temporal deficits associated with neglect is of prime interest. Husain et al. (1997) have shown that lesions of either the frontal lobe (4), the parietal lobe (3), or the basal ganglia (1) causing an ipsilesional attentional bias are also associated with a limited-capacity visual processing system causing abnormal visual processing over time between letters presented at the same location in space (lengthening of the “attentional blink” in central vision tested by the Rapid Serial Visual Presentation paradigm; Broadbent & Broadbent, 1987; Raymond et al., 1992). As reviewed in Husain (2001), a later study by di Pellegrino et al. (1998) suggested that this deficit in time and the spatial bias were two sides of the same coin by showing in a patient with left-sided visual extinction that the attentional blink was lengthened when stimuli were presented in the left visual field but within normal range when stimuli were presented in the right visual field. In conclusion, the temporal deficits of attention in central vision (such as lengthening of the attentional blink) can appear as a deficit of spatial attention and do not seem to be related specifically to neglect (because they also occur in patients with extinction), nor anatomically to the posterior parietal lobe. In contrast, the deficit of visual spatial working memory (visual synthesis) described in neglect but also more recently in constructional apraxia has been ascribed specifically to right IPL lesions (Pisella et al., 2004; Russell et al., 2010; see Figure 16.9). In this context, the crucial contribution of visual remapping impairment in severe neglect (and not in left-sided extinction) proposed by Pisella and Mattingley (2004) is not questioned by the studies that have shown that the spatial working memory deficit of patients with neglect also appears in a vertical display (Ferber & Danckert, 2006; Malhotra et al., 2005). Indeed, the right IPL is conceived to be able to remap (and establish relationships between) locations throughout the whole visual field (Pisella et al., 2004, 2011; see Figure 16.9). The lateralized remapping impairments revealed by Heide et al. (1995) but also more recently by van Koningsbruggen et al. (2010) result from the combination of the right IPL specialization for space and the ipsilesional bias of attention, which additionally may delay or decrease the representation of contralesional visual stimuli. In contrast, remapping impairments may be expressed as deficient visual synthesis and revisiting behavior without lateralized spatial bias in several nonlateralized syndromes like constructional apraxia (see Figure 16.14), which appears as a persisting visual-spatial disorder following right parietal damage when the lateralized bias of neglect has resolved (Russell et al., 2010).

A Recent View of Posterior Parietal Cortex Organization and Unilateral Neglect Syndrome

The standard view of the PPC and Bálint’s syndrome in the context of the predominant model of Milner and Goodale (1995) was that the most superior part of the PPC (dorsal stream) was devoted to action (with OA as illustrative example of specific visual-motor deficit) and the most inferior part of the PPC was more intermediate between vision for action and vision for perception, with unilateral neglect as illustrative example. Recent converging evidence tends to distinguish behaviorally but also anatomically the lateralized and nonlateralized components of unilateral neglect, linking the well-known lateralized bias of neglect to the dysfunction of the superior parietal-frontal network and the newly defined deficits nonlateralized in space to the inferior parietal-frontal network and TPJ. Indeed, the consequences of PPC lesions in humans suggest that, within the PPC, symmetrical and asymmetrical (right-hemispheric dominant) visual-spatial maps coexist. In humans, TMS applied unilaterally over the SPL causes, symmetrically across hemispheres, contralesional visual extinction (Hilgetag et al., 2001; Pascual-Leone et al., 1994): in bilateral crossed-hemifield visual presentation of two simultaneous objects, only the ipsilesional one is reported. The spatial representations of the SPL, whose damage potentially induces contralesional OA and contralesional visual extinction, concern egocentric (eye-centered) localization in the contralesional visual field (Blangero et al., 2010a; see Figure 16.11) and do not exhibit right-hemispheric dominance (Blangero et al., 2010a; Hilgetag et al., 2001; Pascual-Leone et al., 1994). In contrast, hemineglect for the right space is rare and usually is found in people who have an unusual right-hemispheric lateralization of language. In accordance with the rightward bias in midline judgment characteristic of more classical left-sided visual neglect after a unilateral (right) lesion, brain imaging (Fink et al., 2000) and TMS (Fierro et al., 2000) studies have revealed that this landmark (allocentric localization: perceptual line bisection) task activates in humans a specialized and lateralized network including the right IPL and the left cerebellum. The right IPL (IPL is used here to distinguish it from SPL and designates a large functional region in the right hemisphere that also includes the superior temporal gyrus and the TPJ) is also specifically involved in processes such as sustaining and reorienting attention to spatial locations in the whole visual field, useful in visual search tasks (Ashbridge et al., 1997; Corbetta et al., 2000, 2005; Ellison et al., 2004; Malhotra et al., 2009; Mannan et al., 2005; Muggleton et al., 2008; Schulman et al., 2007). Because neglect patients by definition (1) have a deficit of attention for contralesional space and (2) fail in visual scanning tasks of line bisection and cancellation, it seems that visual neglect syndrome is a combination of left visual extinction (produced by damage of the right SPL) and visual synthesis deficits in the entire visual field caused by damage to the right IPL (Pisella & Mattingley, 2004). Accordingly, Pisella et al. (2004) have shown a combination of a left-right attentional gradient and a spatial working memory deficit in the entire visual space in neglect consecutive to parietal damage (see Figure 16.9).
One can further speculate (model in Figure 16.1) that bilateral lesions of the SPL in humans may cause Bálint’s symmetrical shrinkage of attention, Seelenlähmung des Schauens, later called simultanagnosia (Luria, 1959; Wolpert, 1924), in which the patient reports only one object among two in a symmetrical way, that is, not systematically the right or the left one. Accordingly, the symptoms of simultanagnosia may simply appear as a bilateral visual extinction (Humphreys et al., 1994), without more severe spatial disorganization. This might correspond to the differential severity between the handicap consecutive to bilateral SPL lesion after a stroke, in which the patients exhibit bilateral OA and subclinical simultanagnosia only displayed as a shrinking of visual attention (e.g., patient AT: Michel & Henaff, 2004; patient IG, personal observation), and the larger handicap consecutive to posterior cortical atrophy (Benson’s syndrome). In Benson’s syndrome, simultanagnosia may be worsened by an extension of the damage toward the right IPL, thereby affecting the maps in which the whole space is represented and in which the remapping mechanisms may specifically operate in order to integrate visual information collected via multiple snapshots into a global concept. This extension of neural damage toward the right IPL would also be the crucial parameter to explain the different severity between left-sided extinction (without clinical neglect syndrome) after a unilateral SPL lesion and neglect syndrome, including extinction, deficit in landmark tests, and remapping impairments or defect of visual synthesis (see Figure 16.1). The observation of a young patient with an extremely focal lesion of the right IPL caused by a steel nut penetrating his brain during an explosion (Patterson & Zangwill, 1944, Case 1) is a direct argument for the model we propose in this chapter (see Figure 16.1).
The right IPL is the region most commonly associated with visual neglect (Mort et al., 2003). Patterson & Zangwill (1944) describe a patient with left-sided extinction and “a complex disorder affecting perception, appreciation and reproduction of spatial relationships in the central visual field of vision” (p. 337). This “piecemeal approach” was associated with a lack of “any real grasp of the object as a whole” (p. 342) that “could be defined as a fragmentation of the visual contents with deficient synthesis” (p. 356). This defect of visual synthesis in central vision was qualitatively similar to the consequences of bilateral lesion of the posterior parietal lobe, which causes simultanagnosia, a nonlateralized extinction restricting visual perception to seeing only one object at a time, even though another overlapping object may occupy the same location in space (Humphreys et al., 1994; Husain, 2001; Luria, 1959). It is striking that the behavioral impact of this focal unilateral lesion was almost equivalent to a full Bálint’s syndrome with ipsilesional biases and extreme restriction of attention. In addition, a defect of establishing spatial relationships and integration of visual snapshots as a whole was described (that has been related to spatial working memory or visual remapping impairments in the whole visual field; Driver & Husain, 2002; Husain et al., 2001; Pisella et al., 2004; Pisella & Mattingley, 2004). According to neuropsychological and neuroimaging observations, Corbetta et al. (2005) developed a model based on the notion that the right inferior parietal-frontal network is activated by unpredicted visual events throughout the whole visual field and that the superior parietal-frontal eye field network of attention influences the top-down stimulus–response selection in the contralesional visual space. In their model, the lesion of the right inferior parietal-frontal network, via a “circuit-breaking signal,” decreases activity in the ipsilateral superior parietal-frontal eye field network and consequently biases the activity in the occipital visual cortex toward the ipsilesional visual field. In other words, this model predicts that the lesion of the right IPL would, in addition to directly damaging the inferior parietal-frontal network, indirectly decrease functional activity in the right SPL and thereby cause ipsilesional attentional bias and left-sided visual extinction (arrow from the right IPL to the right SPL on the model of Figure 16.1).

The presence of lateralized attentional bias in the patient described by Patterson and Zangwill (1944) may alternatively be understood within the general framework of interhemispheric balance. It appears that any lesion affecting directly or indirectly the parietal-frontal networks of attention in an asymmetrical way will cause an imbalance expressed as a lateralized bias. According to this more general assumption, a lateralized bias (left-right gradient of attention) is observed, for example, after a lesion to the basal ganglia but without spatial working memory deficit, which is more specific to lesions of the right IPL (Pisella et al., 2004; see Figure 16.9). Even less specifically, a spatial bias to direct attention (or a limited-capacity visual processing system) may presumably occur because of unopposed contralesional hemisphere activity.

To sum up, in our model (see Figure 16.1), visual extinction is a sensitive test to reveal the ipsilesional attentional bias, which is the lateralized component of neglect. This ipsilesional attentional bias (or left-right attentional gradient) is caused by the dysfunction of the SPL directly (lesion of the SPL) or indirectly (the lesion of the right IPL causes an imbalance between the right and the left SPL; Corbetta et al., 2005), or by a lesion elsewhere causing similar imbalance in the dorsal parietal-frontal networks between the right and the left hemispheres. It seems that prismatic adaptation, and most other treatments of neglect, improve this lateralized component (common to neglect and extinction patients, whatever the lesion causing the attentional gradient; see Figure 16.12; but see Striemer et al., 2008), probably by acting on this imbalance (Luauté et al., 2006; Pisella et al., 2006b). This is also an explanation of the paradoxical improvement of neglect (following right-hemisphere damage) by subsequent damage of the left hemisphere (Sprague effect; see Weddell, 2004, for a recent reference). The other (nonlateralized) components of neglect rely on the right IPL and its general function of visual-spatial synthesis and may be more resistant to treatments (but see Rode et al., 2006; Schindler et al., 2009).


Optic Ataxia

Reaching Errors Might Be Explained by Attentional Deficit

The basic feature of OA is a deficit for reach and grasp to a visual target in peripheral vision that cannot be considered purely visual (the patient can see and describe the visual world, and binocular vision is unaffected), proprioceptive (the patient can match joint angles between the two arms), or motor (the patient can move the two arms freely and can usually reach and grasp in central vision) (Blangero et al., 2007; Garcin et al., 1967; Perenin & Vighetto, 1988).

The two bilateral OA patients who have been tested most extensively in the last decade (IG and AT; see Khan et al., 2005b, for detailed description of their lesion) initially presented with associated Bálint’s symptoms (simultanagnosia but no neglect) causing deficits in central vision. For example, patient AT showed a limited deficit when grasping unfamiliar objects in central vision during the early stage of her disease, when more concomitant Bálint’s symptoms were present (Jeannerod et al., 1994). Subsequent studies of grasping in these patients used only peripheral object presentations (e.g., Milner et al., 2001, 2003). The reaching studies conducted in these two bilateral patients (after they recovered from initial Bálint’s symptoms) also typically showed that accuracy was essentially normal in central vision, whereas errors increased dramatically with target eccentricity (Milner et al., 1999, 2003; Rossetti et al., 2005). In light of these observations, we suggest that patients with bilateral lesions suffering from Bálint’s syndrome may be impaired in central vision owing to additional visual-spatial deficits such as simultanagnosia. As reported by IG herself soon after her stroke, simultanagnosia prevents “the concomitant viewing of the hand and the target,” which deprives motor execution of any control and can cause reach-and-grasp errors in central vision.
Accordingly, patients with reaching deficits in central vision are shown to be more accurate when they are deprived of visual feedback of the hand (open-loop condition) during the execution of the movement (Jackson et al., 2005; Jakobson et al., 1991; Buxbaum & Coslett, 1998). This demonstrates that in these patients there is a clear interference between visual information from the target guiding the action and visual information from the hand position providing online feedback, which is resolved by removing the visual feedback of the hand. Such simultanagnosia, described by Bálint (1909) as difficulty looking at an object other than the one the patient was fixating, can also be invoked to explain the “magnetic misreaching” behavior (Carey et al., 1997), in which the patient can only reach toward where he is fixating, and which we have had the opportunity to observe in patient CF, in the acute phase when Bálint’s syndrome was exhibited.

For patient IG, this perceptual deficit in the acute phase is mentioned in the Pisella et al. (2000) study, where it is written that online reaching control was tested only when IG had recovered from her initial simultanagnosia. Even if clinical signs of simultanagnosia had resolved after a few months (the patient had retrieved the full perceptual view of her hand, whereas she initially had perceived only two fingers at the same time, and could report bilateral stimulation even when tested on a computer with simultaneous presentation of small dots), a complaint remained, for example, about pouring water into a glass without touching it. For patient AT, who exhibited a larger lesion extending further toward the occipital lobule and IPL, a full report of the remaining perceptual deficits has been provided by Michel and Henaff (2004) and summed up as a concentric shrinking of the attentional field, impairing tasks, such as mazes, in which global attention is required.


Figure 16.15 Field effect and hand effect. A, Clinical examination of optic ataxia patients (Vighetto, 1980). The clinician stands behind the patient and asks him to fixate straight ahead. The clinician then successively presents in the two fields target objects to be grasped with the two hands. This patient with right posterior parietal cortex (PPC) damage exhibits a gross deficit when reaching to left-sided objects (contralesional field effect) with his left hand (contralesional hand effect). Once the object has been missed, he exhibits exploratory movements comparable to blind subjects. This poor visual-motor performance can be contrasted with the ability of the patient to describe the object and his normal ability to reach to central targets. B, Histograms of the mean absolute errors when pointing to visual targets in the dark (Blangero et al., 2007). Columns represent the means and standard deviations of the end point errors (in millimeters) for each of four combinations of hemifields and pointing hands for the two patients Can and OK presenting unilateral optic ataxia. The dotted line shows the mean normal performance in the same condition. C, Illustration of the four conditions of online motor control tested and of the performance of patient CF with left optic ataxia (values: percentage of movements corrected in response to the target jump in the four conditions of hand and jump direction). The movements were made by the healthy or by the ataxic hand, and the target could jump from central vision toward either the patient’s healthy or ataxic visual field.
Reprinted from Cortex, 44(5), Blangero et al., “A hand and a field effect in on-line motor control in unilateral optic ataxia,” 560–568, Copyright (2008), with permission from Elsevier.

After unilateral focal lesions of the SPL, cases of “pure” OA have been described, that is, in absence of clinical perceptual and oculomotor symptoms (e.g., Garcin et al., 1967), even in the acute stage. These descriptions have provided arguments for a pure vision-for-action pathway involving the SPL-IPS network (Milner & Goodale, 1995). When OA is observed after a unilateral lesion, the symptoms predominate in the contralesional peripheral visual field (field effect), usually combined with a hand effect, the use of the contralesional hand causing additional errors throughout the whole visual field, especially when vision of the hand is not provided (Figure 16.15A, B; Blangero et al., 2007; Brouchon et al., 1986; Vighetto, 1980). This combination of hand and field effect is observed both for reaching to stationary targets and for online motor control in response to a target jump (Figure 16.15C; Blangero et al., 2008); the impairment of manual automatic corrections is linked both to the deficient updating of target location when it jumps in the contralesional visual field (field effect) and to the deficient monitoring of the contralesional hand location within the contralesional space at movement start and during ongoing movement (hand effect). This combination of hand and field effects has been considered characteristic of a deficit in visual-manual transformation. However, recent experimental investigation of the field effect has revealed that this deficit specifically corresponds to the impairment of a visual-spatial (not only visual-manual) transformation defining a contralesional target location in an eye-centered reference frame (Blangero et al., 2010a; Dijkerman et al., 2006; Gaveau et al., 2008; Khan et al., 2005a). That is, there is a visual-spatial module coding locations in the eye-centered reference frame commonly used for eye and hand movements within the SPL (Figure 16.16; Gaveau et al., 2008). This common module could thus also be involved in covert (attentional) orienting shifts with perceptual consequences. Accordingly, a lateralized bias of attention has been demonstrated in a Posner task by Striemer et al. (2007).
Michel and Henaff (2004) have reported that target jumps were perceived without an apparent motion in a bilateral OA patient (AT), and reaction times to discriminate with a mouse left or right jumping of a target seen in central vision are delayed for contralesional directions (e.g., with a right PPC lesion, reaction times to target jumps to the left were longer than to target jumps to the right and longer than control performance; unpublished observation).

Recent studies have revealed that even pure cases of unilateral OA appear to systematically demonstrate a lateralized contralesional deficit of covert attention (reviewed in Pisella et al., 2007, 2009). This attentional deficit has to be specifically explored because it can be subclinical (see Figure 16.13). An attentional deficit or ipsilesional bias can be revealed in conditions of increased attentional load (request of identification of the letter rather than simple detection) or attentional competition (with presence of an ipsilesional item or of flankers within the same visual field). These increases of task demand were always at the disadvantage of the performance within the contralesional visual field in unilateral OA patients, as in patients with extinction or neglect. The similarity is also highlighted by the fact that OA can be expressed in space or in time (Rossetti et al., 2003), as already mentioned for attentional deficit (reviewed in Husain, 2001). The last decade of study of OA has described the deficit in space (errors in eye-centered reference frame increasing with visual eccentricity; reviewed in Blangero et al., 2010a) and in time (Milner et al., 1999, 2001; Pisella et al., 2000; reviewed in Pisella et al., 2006a; Rossetti & Pisella, 2002; Rossetti et al., 2003).
The time effect can be summed up as a paradoxical improvement of visual-manual guidance in delayed offline conditions, and even by guidance based on past representations (more stable than eye-centered ones) of targets (“grasping the past”; Milner et al., 2001). This description does not seem, finally, too far from what Husain (2001) concluded about attentional function in the context of visual extinction and neglect: that it may consist in keeping track of object features across space and time. Note that location in space is a crucial feature for reaching as well as for feature binding in the context of perception and more complex action.

To sum up, evidence of a lateralized deficit of covert attention in OA does not allow us to claim a pure visual-motor deficit and to maintain a functional dissociation between OA and (subclinical) visual extinction after lesions to the SPL.

Dissociation Between the Attentional Deficits in Neglect and Optic Ataxia

At this stage of the chapter, we have claimed that both neglect and OA exhibit a deficit of attentional orienting toward contralesional space emerging from damage to the symmetrical SPL-IPS network. The specific right-hemispheric localization of neglect would rely on the association of a nonlateralized deficit consecutive to right IPL damage. Interestingly, these common and different components can be revealed using the Posner paradigm (Posner, 1980), a well-known means of testing attentional orienting in space. In the Posner spatial-cueing task, participants keep their eyes on a central fixation point and respond as fast as possible to targets presented in the right or the left visual field. Targets can be preceded by cues presented either on the same side (valid trials) or on the opposite side (invalid trials). An increased cost in responding to a target presented on the side opposite the cue (invalid trials) is attributed to a covert orienting shift toward the cue and the necessary disengagement of attention from this cue to detect the target (Posner, 1980).
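The contrasts usually drawn from such a spatial-cueing task can be made concrete with a minimal sketch. The reaction times below are invented purely for illustration (they are not data from any study cited here), and the function name `cueing_summary` and the dictionary layout are assumptions of this example, not part of any published analysis. For each target side, the cost of an invalid cue is taken as the mean reaction time in invalid trials minus the mean in valid trials; the left-right difference in valid trials gives a simple index of a lateralized bias.

```python
from statistics import mean

# Hypothetical reaction times (ms), invented for illustration only.
# Keys are (target_side, cue_validity).
example_rts = {
    ("left", "valid"): [400, 420],
    ("left", "invalid"): [450, 470],
    ("right", "valid"): [380, 400],
    ("right", "invalid"): [420, 440],
}


def cueing_summary(rts):
    """Return mean-RT contrasts for a two-sided spatial-cueing task."""
    m = {condition: mean(values) for condition, values in rts.items()}
    return {
        # Cost of an invalid cue for targets on each side (invalid minus valid).
        "validity_cost_left": m[("left", "invalid")] - m[("left", "valid")],
        "validity_cost_right": m[("right", "invalid")] - m[("right", "valid")],
        # Left-right difference on valid trials (asymmetry without reorienting).
        "left_minus_right_valid": m[("left", "valid")] - m[("right", "valid")],
    }


summary = cueing_summary(example_rts)
print(summary)
```

In this toy data set the invalid-cue cost is of similar size on both sides, whereas a lateralized engagement deficit would appear mainly as a large left-right difference in valid trials.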


Figure 16.16 Similar pattern of errors for saccade and reach in bilateral optic ataxia. A, Comparison of the pattern of errors between pointing and reaching to stationary peripheral visual targets in bilateral optic ataxia (patient IG). Individual pointing (left panels) and saccadic (right panels) trajectories toward peripheral targets for one control subject (upper panels) and bilateral optic ataxia patient IG (lower panels). In the pointing task, subjects were asked to maintain their gaze fixed (black cross) while pointing to central (black dot) or peripheral (gray dots) targets. The control subject accurately reached target positions estimated in peripheral vision, whereas IG’s pointing behavior showed a pathological hand movement hypometria, which increased with target eccentricity (Milner et al., 2003; Rossetti et al., 2005). In the saccade task, subjects were instructed to move their eyes from a fixation point (“0”) toward one of three peripheral targets (gray dots). The control subject presented a well-known undershoot (hypometria) of his primary saccade, which increases with target eccentricity; IG presented a pathological increase of hypometria of primary saccades with target eccentricity, which appears similar to her pointing hypometria for peripheral targets but at further eccentricities (the eccentricities are provided in centimeters, but in both reach and saccade conditions illustrated here, the targets are similarly positioned along a frontal-parallel line at reaching distance). It must be noted that subsequent corrective saccades (not shown) eventually correct this hypometria of primary saccades in patients with optic ataxia as in controls (Gaveau et al., 2008). B, Comparison of hand and eye online visual-motor control in bilateral optic ataxia (patient IG). This figure describes the performance of a control subject and a patient with bilateral optic ataxia (IG) in two experimental conditions. First, static objects were presented in central vision, and all subjects were able to reach appropriately to grasp them in either position C or position R. Second, when the object was quickly moved at the time of movement onset, controls were able to alter their ongoing trajectory and reach for the final location of the object. The patient with bilateral optic ataxia was specifically impaired in this condition and produced a serial behavior. She performed a whole movement to the first location of the object (C), which she followed with a secondary movement toward the second location (R). The time of target grasping was subsequently delayed with respect to stationary targets and control performance (Gréa et al., 2002; see also Pisella et al., 2000). Similarly, her saccadic behavior (on the right; Gaveau et al., 2008) consisted of two corrective saccades in addition to the primary saccade. The first “corrective” saccade was directed to the initial target location (A) with no reaction time increase (i.e., as if the target had not been displaced); then a second, late corrective saccade achieved visual capture of the target with a delay with respect to stationary targets and control performance. This pathological behavior (indicated by a black arrow, compared with the control performance indicated by an empty arrow) reveals a core deficit of online integration of visual target location.


Figure 16.17 Differential pattern of performance in the Posner paradigm between patients with optic ataxia (OA) and patients with unilateral spatial neglect (NSU). In these parietal patients, the common deficit is that targets on the contralesional side are detected more slowly than those on the ipsilesional side, consistent with a pathological attentional gradient. This left-right difference is the main effect in patients with optic ataxia (Striemer et al., 2007). A deficit in disengaging attention (in invalid trials) has been highlighted as a specific additional characteristic of neglect: For these patients, the need to generate sequential orienting toward the left then the right side appears even more problematic than the need to orient a saccade in a direction opposite to their pathological rightward attentional gradient. Indeed, data replotted from Posner et al. (1984) show that reaction times to respond to targets presented in the ipsilesional visual field in invalid trials (ipsi-invalid) are almost 100 ms longer than reaction times to respond to targets presented in the contralesional visual field in valid trials (contra-valid).

In parietal patients, targets on the contralesional side are detected more slowly than those (p. 338) (p. 339) on the ipsilesional side (Friedrich et al., 1998; Posner et al., 1984), consistent with a pathological attentional gradient. As can be seen in Figure 16.17, a deficit in disengaging attention from a right (ipsilesional) cue in the invalid condition after right parietal damage has been highlighted as a specific additional characteristic of neglect (right hyperattention; Bartolomeo & Chokron, 1999, 2002). Note, however, in Figure 16.17 that a cost to disengage attention from a left (contralesional) cue is also observed in the other condition of spatial invalidity of the cue in neglect patients: When the cue appears on the left side and the target on the right side, neglect patients are about 100 ms slower than in the valid condition with the target presented on the left side, although the difference between leftward and rightward orienting in valid trials is only about 20 ms (Posner et al., 1984; redrawn in Figure 16.17). This means that for these patients with parietal neglect, the need to generate sequential left-right orienting appears more problematic than the need to orient attention in a direction opposite to their pathological rightward attentional gradient (illustrated in Figure 16.12). This reflects the combination of a rightward attentional gradient and a general (nonlateralized) double-step orienting deficit, with an interaction between the two deficits causing the disengagement cost from right cues to be higher than the disengagement cost from left cues. We have suggested that remapping mechanisms, crucial for double-step orienting, can account for this general disengagement cost in neglect patients and further for the observation of ipsilesional neglect patterns after left cueing (Pisella & Mattingley, 2004). Crucially, this hypothesis implies that remapping processes might work for covert shifts of attention as well as for overt shifts of the eyes.

Contrary to neglect patients, who exhibit a major disengagement deficit leading to the longest reaction times in the two invalid conditions (Posner et al., 1984), Striemer et al. (2007) have suggested that OA patients exhibit only an engagement deficit toward the contralesional visual field in a Posner task and no specific disengagement deficit. Consequently, responses to ipsilesional items are never impaired, even in invalid trials, whereas responses to contralesional items are slow in both valid and invalid trials (see Figure 16.17). Striemer et al. (2007) have therefore suggested damage of the saliency maps representing the contralesional visual field after an SPL-IPS lesion in a patient with OA. In other words, OA patients show the same deficit as neglect patients in orienting attention toward the contralesional visual field but no additional deficit in the double-step orienting needed in invalid trials. Accordingly, patients with OA exhibit no signs of visual synthesis deficit. Moreover, OA patients have shown preserved visual remapping of saccades in the context of a pointing experiment (Khan et al., 2005a). This absence of visual synthesis impairment fits with the absence of disengagement cost in invalid trials.


Figure 16.18 Performance of two patients with left optic ataxia in letter discrimination tasks in the visual periphery with a central (endogenous) cue versus a peripheral (exogenous) cue. The cues are always valid. The endogenous cue consists of a central arrow, which indicates a direction, and a color (always the green location). The exogenous cue consists of a flash of the peripheral green circle on one side. In both conditions, the letter E is flashed for 250 ms in regular or inverted orientation at the green location at 8° of visual eccentricity, surrounded by flankers that also change from “8” to “2” or “5” symbols for 250 ms. Then everything is masked by the reappearance of the “8” symbols (first initial image). The patients maintain central fixation throughout the experiment (eye position is recorded). Both patients show a clear deficit in their left (contralesional) visual field in the task of endogenous covert attention shifting, whereas in the covert exogenous task, patient CF shows no deficit, and Mme P also shows similar performance in both visual fields. This task is an adaptation of the paradigm of Deubel and Schneider, 1996.

Luo et al. (1998) and later Bartolomeo et al. (2000) showed that neglect patients are more specifically impaired in exogenous (stimulus-driven) conditions of attentional orienting and remain able to shift attention leftward when instructed (i.e., voluntarily, goal directed). This framework can be used to explain the superiority of the disengaging costs with respect to the bias resulting from the pathological attentional gradient in neglect patients. When the Posner task is undertaken by participants during functional magnetic resonance imaging (Corbetta et al., 2000), symmetrical activations (contralateral to stimulus presentation) are observed in the SPL, and specific activation of the right IPL is additionally observed in invalid trials in either visual field. Based on this pattern of activation, Corbetta and (p. 340) Shulman (2002) have proposed this distinction between goal-directed (endogenous) and stimulus-driven (exogenous) systems. This model was suggested by the stimulus-driven nature of the invalid trials (detecting events at unpredicted locations could only be stimulus-driven by definition) activating representation of the entire visual space within the IPL. If exogenous attention relies on the right IPL, according to neuroimaging and neuropsychology of spatial neglect, then the symmetrical SPL-IPS network could subtend endogenous attention shifts. Accordingly, the tests of attentional shifting we have used on several OA patients (OK: Blangero et al., 2010b, Pisella et al., 2009; CF and Mme P: see Figure 16.18) involved endogenous cues (central arrow). Patients were instructed to undertake a letter-discrimination task at a cued location within the visual periphery. They failed to identify the letter flashed at the cued location in their contralesional (ataxic) visual field. Is there a double dissociation between OA and neglect in this respect? This remains to be investigated on a large scale. We have recently started to test OA patients in an exogenous version of the same letter-discrimination task (adapted from Deubel & Schneider, 1996), in which the cue was a flash of the peripheral location where the letter would be presented. The two patients who are still available for testing (CF and Mme P) have exhibited less contralesional deficit in this exogenous covert attention task than in the endogenous version. In the endogenous version (right panel of Figure 16.18), the percentage of correct letter discrimination was at chance level in the contralesional visual field and at almost 80 percent in the ipsilesional visual field during the task. In the exogenous version (left panel of Figure 16.18), there was no longer any difference between performance in the left and right visual fields. This possible clinical dissociation within the PPC between exogenous and endogenous covert attention might therefore be (p. 341) another difference between the attentional deficits of neglect and OA patients. Whether this constitutes a systematic neuropsychological difference between the consequences of SPL-IPS and right IPL lesions needs to be further investigated. Note that such a dissociation between exogenous and endogenous attention has also been reported within the frontal lobe (Vecera & Rizzo, 2006): Contrary to frontal neglect cases, their patient exhibited a general impairment in orienting attention endogenously, although he could use peripheral cues to direct attention.

To sum up, the Posner task reveals that OA patients exhibit attentional deficits different from those exhibited by neglect patients because neglect appears as a combination of lateralized deficits (common with OA patients) and deficits specific to the right IPL lesion that need to be further characterized. Another issue that needs to be further investigated is whether the attentional deficits of OA patients are merely associated with their visual-motor impairment or are causal. The same issue is developed in the following section in relation to attentional and eye movement impairments.

How Can We Now Define Psychic Paralysis of Gaze?

Bálint’s description of Seelenlähmung des Schauens included difficulty both with finding visual targets with the eyes (wandering of gaze) and with being “captured” by a visual target once fixated (visual grasp reflex). Patients stare open-eyed, with gaze locked to the place they are fixating, and they may only be able to disrupt such sticky fixation after a blink. When patients are asked to move their eyes to a target suddenly appearing in the peripheral field, they may generate no movement, or they may initiate wandering eye movements that consist in erratic and usually hypometric displacement of the eyes in space, ending with incidental acquisition of the target. Patients do not seem to perceive visual targets located away from a small area, which is usually the area of foveation, despite preserved visual fields. They exhibit a reduction of the “useful field of vision,” operationally defined as the field of space that can be attended while keeping central fixation (Rizzo & Vecera, 2002). This can be tested by asking patients either to direct their eyes or their hand to, or to name, objects presented extrafoveally. Generally, responses are more likely to be given after verbal encouragement, a finding that indicates that the deficit is not a consequence of a reduction of the visual fields but rather of attention scanning for noncentral events.

The three elements of Bálint’s triad (1909) have been subjected to numerous interpretations. As emphasized by de Renzi (1989), Holmes (1918) added to his description of patients with parietal lesions a deficit of oculomotor functions, which had been excluded by Bálint for his patient. This oculomotor deficit described by Holmes (1918) has often been incorporated into the description of Bálint’s syndrome and confounded with the Seelenlähmung des Schauens, most often translated in English as psychic paralysis of gaze. As de Renzi (1989) pointed out, psychic paralysis of gaze appears to be an erroneous translation of Bálint’s description.
A first alternative translation proposed by Hecaen and De Ajuriaguerra (1954), psychic paralysis of visual fixation, also known as spasm of fixation (or inability to look toward a peripheral target), suggested that it can be dissociated from intrinsic oculomotor disorders, as already argued by Bálint, but also from the two attentional disturbances of Bálint’s syndrome: a general impairment of attention that corresponded to a selective perception of foveal stimuli and a lateralized component of the attention deficit, which can now be described as unilateral neglect. Husain and Stein’s translation (1988) of Bálint’s description of the psychic paralysis of gaze corresponds to a restriction of the patient’s “field of view, or we can call it the psychic field of vision.” De Renzi (1989) only distinguished two types of visual disorders: one corresponding to the lateralized deficit known as unilateral neglect, and the other to a nonlateralized restriction of visual attention. These interpretations of the Bálint-Holmes syndrome excluded the presence of intrinsic oculomotor deficits.

By contrast, Rizzo and Vecera (2002; Rizzo, 1993), like Holmes (1918), included in the posterior parietal dysfunction the oculomotor deficits, which they distinguished from the psychic paralysis of gaze assimilated to spasm of fixation or ocular apraxia. To these two ocular symptoms they associated a spatial disorder of attention corresponding to simultanagnosia (Wolpert, 1924), unilateral neglect, and a concentric restriction of the attentive field. Evaluation of the spatial disorder of attention using a motor response was therefore considered to be possibly affected by concurrent visual-motor deficits (OA and ocular apraxia). Verbal response was considered to more directly probe conscious or attentive perception.
This framework resembles the theory developed by Milner and Goodale (1995) postulating a neural dissociation between perception and action, with attention positioned together with perception. In this context of the dominant view of the dorsal stream being devoted to vision for action, Bálint’s syndrome has been described as a set of visual-motor deficits: deficit of arm (reach) and hand (grasp) movements and psychic paralysis of gaze as a deficit of eye movements. The attentional disorders that were initially associated with the psychic paralysis of gaze have been implicitly grouped together with the lateralized attentional deficits assimilated to unilateral neglect and supposed to all rely on the IPL and TPJ, a ventrodorsal pathway intermediate between the dorsal and the ventral stream. (p. 342)

Altogether these apparent subtleties have given rise to several conceptualizations of the oculomotor, perceptual, and attentional aspects of the syndrome, which reflect the difficulty of assigning these oculomotor disorders to a well-defined dysfunction, as well as the variety of eye movement problems observed from case to case. The first subsection deals with the assimilation of psychic paralysis of gaze to simultanagnosia, and the second subsection reviews in detail the oculomotor impairments that have been described after a lesion to the PPC.

Simultanagnosia

Figure 16.19 One main characteristic of simultanagnosia is the lack of global perception. For example, looking at a painting by Arcimboldo, patients with simultanagnosia report the vegetables but not the face. This deficit also prevents them from performing labyrinth tasks (B) and from judging whether a dot is on line 1 or 2 (A) or whether a dot is within or without an undefined closed shape (C). Adapted with permission from Michel & Hénaff, 2004.

Simultanagnosia, a term initially coined by Wolpert (1924), defines a deficit in which patients see only one object at a time (Figure 16.19). This aspect was reminiscent of the description of Bálint’s patient who was not able to perceive the light of a match while focusing on a cigarette until he felt a burning sensation, a symptom that Bálint (1909) included within the component he called Seelenlähmung des Schauens. Bálint (1909) mentioned that this limited capacity of attentive vision for only one object at a time does not depend on the size of the object. This is a distinction from a visual field deficit. Moreover, in patients with simultanagnosia, description or copy of complex figures is laborious and slow, and patients focus serially on details, apprehending portions of the pictures with a piecemeal approach but failing to switch attention from local details to global structures. Copy of images is accordingly composed of tiny elements of the original figure without perception of the whole scene (Wolpert, 1924). Luria (1959) assessed visual perception in a simultanagnosic patient with tachistoscopic presentation of two overlapping triangles in the configuration of the Star of David. When the two triangles were drawn in the same color, the patient reported a star, but when one was drawn in red and the other in blue, the patient reported seeing only one triangle (never two, and never a star).

The description of isolated cases of simultanagnosia (Luria, 1959; Wolpert, 1924), without lateralized bias (unilateral visual neglect) and with OA or impaired visual capture being considered consecutive to a dissociated visual-motor component, has tended to confirm Bálint’s classification. However, as mentioned above, a causal relationship between attentional and visual-motor deficits is a possibility, even in patients with subclinical attentional disorder. Michel and Hénaff (2004) have provided a comprehensive examination of a patient with bilateral PPC lesions whose initial Bálint’s syndrome had been reduced 20 years after onset to bilateral OA and a variety of attentional deficits that could all be interpreted as a concentric shrinking of the attentional field. Spatial disorder of attention (Bálint, 1909), restriction (de Renzi, 1989) or shrinkage of the attentional field (Michel & Henaff, 2004), or disorder of simultaneous perception (Luria, 1959), are equivalent terms to designate this complex symptom, which can be viewed as a limitation of visual-spatial attentional resources following bilateral lesions of the parietal-occipital junction. It is tempting to view both the visual grasp reflex and the wandering of gaze as direct consequences of the shrinking of the attentional field.
It is quite obvious that if one does not perceive any visual target in the periphery, one will not capture this target with the eyes. Hence the gaze should remain anchored on a target once acquired (visual grasp reflex). Then, if one is forced to find a specific target in the visual scene, one would have to make blind exploratory movements, as would be the case in tunnel vision. The shrinking of the visual (p. 343) field would therefore also account for the wandering of gaze. However, a competition between visual objects or features for attentional resources must be integrated into the spatial interpretation in order to account for the perception of one object among others in these patients, independent of its size (Bálint, 1909) and even if the two objects are presented at the same location (Luria, 1959). These space-based and object-based aspects of simultanagnosia could be interpreted in a more general framework of attention (see a similar view for object-based and space-based neglect in Driver, 1999) as follows: The feature currently being processed extinguishes, or impairs processing of, the other feature(s).

Oculomotor Deficits Not Linked to Underlying Attentional Deficit?

Gaze apraxia is characterized by severe abnormalities in the generation of eye movements in response to visual targets in space, in the absence of ocular motor palsy, ascertained by full reflexive eye movements. Eye movement recordings usually show several abnormalities, such as prolonged latency, fragmentation and hypometria of saccades, fixation drift, and absence of smooth pursuit (Girotti et al., 1982; Michel et al., 1963). The pattern of oculomotor scanning is highly abnormal during scene exploration (Zihl, 2000). Both accuracy of fixation and saccadic localization are impaired, and the spatial-temporal organization of eye displacements does not fit with the spatial configuration of the scene to be analyzed (Tyler, 1968). Saccadic behavior is abnormal in rich (natural) environment arrays, which require continuous selection between concurrent stimuli, but it may be normal in a simplified context, for example, when the task is to direct the eyes to a peripheral light-emitting diode in the dark (Guard et al., 1984).

The issue of whether the parietal lobe is specifically involved in oculomotor processes per se should first be assessed by testing such simple saccades to isolated single dots. Such a paradigm has been tested in patients with a specific lesion of a parietal eye field, unilateral neglect, or OA, as reviewed below. Pierrot-Deseilligny and Müri (1997) have observed increased reaction times and hypometria for contradirectional reflexive saccades after a lesion to the IPL. They have therefore postulated the existence of an oculomotor parietal region (parietal eye field; see also Müri et al., 1996) whose lesion specifically affects the processes of planning and triggering of contradirectional reflexive saccades. These authors mention that the increase in reaction time for contradirectional saccades is smaller when the fixation point vanishes about 200 ms before the presentation of the target in the contralesional visual field (“gap paradigm”) than in a condition of overlap between the two visual locations. This effect suggests that the deficit may be linked to visual extinction or a deficit of attentional disengagement from the cross initially fixated.
Patients with unilateral neglect have also been reported to exhibit late and hypometric leftward saccades (e.g., Girotti et al., 1983; Walker & Findlay, 1997), whereas their saccadic accuracy has been reported as normal elsewhere (Behrmann et al., 1997). Finally, Niemeier and Karnath (2000) have shown that hypometria can be observed casually only for reflexive saccades triggered in response to left peripheral target presentation, leftward and rightward saccades being equivalent in amplitude in conditions of free ocular exploration of visual scenes. By contrast, the strategy of exploration is clearly impaired in patients with unilateral neglect (e.g., Ishiai, 2002). No available result seems to rule out that these temporal and strategic deficits in neglect patients may result from a deficient allocation of attention to visual targets in the left periphery.

Finally, patients with pure OA arising from damage of the SPL are, by definition, impaired for visual-manual reach-and-grasp guidance within their peripheral visual field, without primary visual, proprioceptive, and motor deficits (Garcin et al., 1967; Jeannerod, 1986); this definition is supposed to also exclude oculomotor deficits. The slight impairments of saccadic eye movements detected from clinical tests in some OA patients have not been considered sufficient to account for their major misreaching deficit (e.g., Rondot et al., 1977; Vighetto & Perenin, 1981). Indeed, they correspond to only one part of the misreaching deficit: the contribution of the deficit in localizing the target, which commonly affects saccade and reach (field effect), and not the hand effect, which is specific to the reach (Gaveau et al., 2008; see Figure 16.16, stationary targets). In addition, a deficit in monitoring hand location in peripheral vision (at the start of and during reaching execution) has to be added to get the whole deficit. A simpler explanation could be that the misreaching arises from only two deficits: one in monitoring visual information in eye-centered coordinates (impaired localization of the target and impaired visual guidance of the hand in the contralesional visual field) and the other in monitoring proprioceptive information in eye-centered coordinates (typical (p. 344) hand effect as impaired proprioceptive guidance of the ataxic hand; Blangero et al., 2007).

Further investigations are necessary to evaluate whether the typical mislocalization of the visual target in eye-centered coordinates (field effect; Blangero et al., 2010a) is functionally linked to a deficit or delay in making the attentional selection of the peripheral target, as suggested by their systematic co-occurrence when the subclinical attentional deficit is searched for. Striemer et al. (2009) have suggested that this was not the case, but McIntosh et al. (2011) have revealed a very good correlation between perceptual and visual-motor delays with a more satisfactory design to compare perceptual and motor deficits.

To sum up, the most consistent observation following a unilateral lesion of the posterior parietal cortex is an increase of latency for saccades in the contralesional direction or, in the case of a bilateral lesion, a poverty of eye movements that may culminate in a condition often referred to as spasm of fixation, or visual grasp reflex. As mentioned above, parietal patients may exhibit a defect in shifting spatial attention (Verfaellie et al., 1990) exogenously or endogenously. This may be central for the deficit of “attentional disengagement from fixated objects” (Rizzo & Vecera, 2002) and thereby for the visual grasp reflex to occur. Second, patients are usually able to move their eyes spontaneously or on verbal command, but they are impaired in performing visually guided saccades. The more attention and complex visual processing the eye movement requires, the less likely it is to be performed.
In this line, visual search behavior is particularly vulnerable. The variability and instability of the oculomotor deficit after a parietal lesion can be highlighted, for single saccades as well as for exploratory scanning, as an argument for an underlying attentional deficit. Accordingly, inactivation of LIP area, considered the parietal saccadic region in monkeys, affects visual search but not systematically saccades to single targets (Li et al., 1999; Wardak et al., 2002). We therefore tend to propose that when oculomotor deficits are observed following a posterior parietal lesion, they stem from an attentional disorder. This view stands in contrast to the premotor theory of attention (Rizzolatti et al., 1987), which postulates that spatial attention emerges from (a more basic process of) motor preparation of eye movements, functionally and anatomically. In our view, the causal effect is solely due to a functional coupling between attention and motor preparation. Indeed, attentional selection and motor selection rely on dissociated neural substrates (Blangero et al., 2010b; Khan et al., 2009) even if they tightly interact with each other. Moreover, our view is not theoretical or evolutionist. The direction of the causal effect is not seen as an absolute hierarchy between two systems but is rather proposed only in the case of a posterior parietal lesion. Other functional links may emerge from different lesion locations (e.g., the frontal cortex).

Conclusion and Perspectives

Bálint-Holmes syndrome provides insights into the functional roles assigned to the neuronal populations of the dorsal (occipital-parietal-frontal) stream. These functions include spatial perception, gating and directing spatial attention, dynamic spatial representations in multiple reference frames, and spatial coding of eye and hand movements in the immediate extrapersonal space. We propose to identify two main aspects of the posterior parietal symptomatology, which exclude oculomotor disorders as a specific category of deficit. The wandering of gaze and the visual grasp reflex appear to depend on the concentric shrinking of the attentional field called simultanagnosia. This first component may appear as unilateral visual extinction in the case of a unilateral lesion. Other symptoms, such as disorganized visual exploration with typical revisiting behavior and the disappearance of previously viewed components of a visual scene during its exploration or copy, may be attributable to the second aspect, namely an impairment of visual synthesis. Even if this latter component (spatial working memory or visual remapping) has been described in patients with left neglect after a parietal lesion (Heide & Kömpf, 1997; Husain et al., 2001), the deficit appears to occur across the entire space (Pisella et al., 2004, Figure 16.9) and to be specific not to the neglect syndrome but rather to an anatomical localization in the right IPL (because it is also advocated for constructional apraxia; Russell et al., 2010).
It is tempting to conclude from this chapter that, schematically, all deficits that occur after lesions to the superior parietal lobule (extinction, simultanagnosia, and OA) can be understood in the framework of visual attention (defined as the mechanisms that allow the brain to increase spatial resolution in the visual periphery, and whose impairment is expected to be expressed as deficits in time or in space), whereas all deficits that follow a lesion extending toward the IPL include deficits of visual synthesis (spatial representations and remapping mechanisms). However, before reaching this conclusion, further investigations are needed to confirm two main assumptions that we have been forced to make to establish our model in Figure 16.1. First, remapping processes for covert shifts of attention should exist within the right IPL. Prime et al. (2008) have shown that only TMS of the right PPC, and not of the left PPC, disrupts spatial working memory, both across saccades and in static conditions. Visual remapping across overt eye movements and spatial working memory therefore appear similar in terms of anatomical network. However, they operate at different time scales and may constitute related but different processes. Indeed, in contrast with the double-step saccade literature (Duhamel et al., 1992; Heide et al., 1995), which has claimed that patients with neglect are impaired in remapping contralesional (leftward) saccades, Vuilleumier et al. (2007) have shown impairment in a perceptual spatial working memory task across saccades after a first rightward saccade in neglect patients. Interestingly, a more complex account of parietal attentional function has been proposed recently by Riddoch et al. (2010), who functionally distinguish several subregions within the ventral right-hemispheric attentional network of the dorsal stream: the supramarginal and angular gyri of the IPL, but also the TPJ and the superior temporal gyrus.
This should provide neural substrates for a possible distinction between visual remapping of a goal stimulus for a goal-directed (saccadic or pointing) response and a perceptual response based on a spatial representation of the configuration of several salient objects in the visual scene. This would also provide a more complex framework to account for the multiple dissociations that have been reported within the neglect syndrome (e.g., between near and far space, body and external space, cancellation and bisection tasks).

Finally, the possible causal link between attentional deficit and OA needs to be further investigated, as should that between attentional deficit and eye movement deficits following a lesion to the PPC. The view developed here is that the parietal cortex contains attentional prioritized representations of all possible targets (salient object locations; Gottlieb et al., 1998) that feed oculomotor maps (e.g., by communicating the next location to reach) but are not oculomotor maps themselves. More specifically, further studies are needed to explain how an attentional deficit can cause a visual-motor deficit, expressed spatially as hypometric errors in eye-centered coordinates that increase with target eccentricity (Blangero et al., 2010a; Gaveau et al., 2008) or temporally as a delay of visual capture. Early psychophysical studies have demonstrated that low-energy targets do elicit long-latency, hypometric saccadic eye movements (Pernier et al., 1969; Prablanc & Jeannerod, 1974). An explanation of this phenomenon in terms of spatial map organization at different subcortical and cortical levels is necessary to establish a link between attention and visual-motor metric errors.

References

Andersen, R. A., & Buneo, C. A. (2002). Intentional maps in posterior parietal cortex. Annual Review of Neuroscience, 25, 189–220.
Anderson, C., & Van Essen, D. (1987). Shifter circuits: A computational strategy for dynamic aspects of visual processing. Proceedings of the National Academy of Sciences U S A, 84, 6297–6301.
Ashbridge, E., Walsh, V., & Cowey, A. (1997). Temporal aspects of visual search studied by transcranial magnetic stimulation. Neuropsychologia, 35 (8), 1121–1131.
Bálint, R. (1909). “Seelenlähmung des Schauens, optische Ataxie, räumliche Störung der Aufmerksamkeit.” Monatsschrift für Psychiatrie und Neurologie, 25, 51–81.
Bartolomeo, P. (2000). Inhibitory processes and spatial bias after right hemisphere damage. Neuropsychological Rehabilitation, 10 (5), 511–526.
Bartolomeo, P., & Chokron, S. (1999). Left unilateral neglect or right hyperattention? Neurology, 53 (9), 2023–2027.
Bartolomeo, P., & Chokron, S. (2002). Orienting of attention in left unilateral neglect. Neuroscience and Biobehavioral Reviews, 26, 217–234.
Bartolomeo, P., Perri, R., & Gainotti, G. (2004). The influence of limb crossing on left tactile extinction. Journal of Neurology, Neurosurgery, and Psychiatry, 75 (1), 49–55.
Becker, E., & Karnath, H. O. (2007). Incidence of visual extinction after left versus right hemisphere stroke. Stroke, 38 (12), 3172–3174.
Behrmann, M., Watt, S., Black, S. E., & Barton, J. J. (1997). Impaired visual search in patients with unilateral neglect: An oculographic analysis. Neuropsychologia, 35 (11), 1445–1458.
Berman, R. A., & Colby, C. (2009). Attention and active vision. Vision Research, 49 (10), 1233–1248.
Bisiach, E., Ricci, R., Lualdi, M., & Colombo, M. R. (1998). Perceptual and response bias in unilateral neglect: Two modified versions of the Milner landmark task. Brain and Cognition, 37 (3), 369–386.
Blangero, A., Delporte, L., Vindras, P., Ota, H., Revol, P., Boisson, D., Rode, G., Vighetto, A., Rossetti, Y., & Pisella, L. (2007). Optic ataxia is not only “optic”: Impaired spatial integration of proprioceptive information. NeuroImage, 36, 61–68.
Blangero, A., Gaveau, V., Luauté, J., Rode, G., Salemme, R., Boisson, D., Guinard, M., Vighetto, A., Rossetti, Y., & Pisella, L. (2008). A hand and a field effect on on-line motor control in unilateral optic ataxia. Cortex, 44 (5), 560–568.
Blangero, A., Khan, A. Z., Salemme, R., Laverdure, N., Boisson, D., Rode, G., Vighetto, A., Rossetti, Y., & Pisella, L. (2010b). Pre-saccadic perceptual facilitation can occur without covert orienting of attention. Cortex, 46 (9), 1132–1137.
Blangero, A., Ota, H., Rossetti, Y., Fujii, T., Luaute, J., Boisson, D., Ohtake, H., Tabuchi, M., Vighetto, A., Yamadori, A., Vindras, P., & Pisella, L. (2010a). Systematic retinotopic error vectors in unilateral optic ataxia. Cortex, 46 (1), 77–93.
Broadbent, D. E., & Broadbent, M. H. P. (1987). From detection to identification: Response to multiple targets in rapid serial visual presentation. Perception and Psychophysics, 42, 105–113.
Brouchon, M., Joanette, Y., & Samson, M. (1986). From movement to gesture: “Here” and “there” as determinants of visually guided pointing. In J. L. Nespoulos, A. Perron, & R. A. Lecours (Eds.), Biological foundations of gesture (pp. 95–107). Mahwah, NJ: Erlbaum.
Buxbaum, L. J., & Coslett, H. B. (1998). Spatio-motor representations in reaching: Evidence for subtypes of optic ataxia. Cognitive Neuropsychology, 15 (3), 279–312.
Carey, D. P., Coleman, R. J., & Della Sala, S. (1997). Magnetic misreaching. Cortex, 33 (4), 639–652.
Carey, D. P., Harvey, M., & Milner, A. D. (1996). Visuomotor sensitivity for shape and orientation in a patient with visual form agnosia. Neuropsychologia, 34 (5), 329–337.

Carrasco, M., Penpeci-Talgar, C., & Eckstein, M. (2000). Spatial covert attention enhances contrast sensitivity across the CSF: Support for signal enhancement. Vision Research, 40, 1203–1215.
Colby, C. L., Duhamel, J.-R., & Goldberg, M. E. (1995). Oculocentric spatial representation in parietal cortex. Cerebral Cortex, 5, 470–481.
Colby, C. L., & Goldberg, M. E. (1999). Space and attention in parietal cortex. Annual Review of Neuroscience, 22, 319–349.
Corbetta, M., Kincade, M. J., Lewis, C., Snyder, A. Z., & Sapir, A. (2005). Neural basis and recovery of spatial attention deficits in spatial neglect. Nature Neuroscience, 8 (11), 1603–1610.
Corbetta, M., Kincade, J. M., Ollinger, J. M., McAvoy, M. P., & Shulman, G. L. (2000). Voluntary orienting is dissociated from target detection in human posterior parietal cortex. Nature Neuroscience, 3 (3), 292–297.
Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3 (3), 201–215.
Danckert, J., & Rossetti, Y. (2005). Blindsight in action: What can the different sub-types of blindsight tell us about the control of visually guided actions? Neuroscience and Biobehavioral Reviews, 29 (7), 1035–1046.
De Renzi, E. (1989). Bálint-Holmes syndrome. In Classic cases in neuropsychology (pp. 123–143). Hove, UK: Psychology Press.
Deubel, H., & Schneider, W. X. (1996). Saccade target selection and object recognition: Evidence for a common attentional mechanism. Vision Research, 36, 1827–1837.
Dijkerman, H. C., McIntosh, R. D., Anema, H. A., de Haan, E. H., Kappelle, L. J., & Milner, A. D. (2006). Reaching errors in optic ataxia are linked to eye position rather than head or body position. Neuropsychologia, 44, 2766–2773.
Di Pellegrino, G., Basso, G., & Frassinetti, F. (1998). Visual extinction as a spatio-temporal disorder of selective attention. NeuroReport, 9, 835–839.
Driver, J. (1999). Egocentric and object-based visual neglect. In N. Burgess, K. J. Jeffery, & J. O. O’Keefe (Eds.), The hippocampal and parietal foundations of spatial cognition (pp. 66–89). Oxford, UK: Oxford University Press.
Driver, J., & Husain, M. (2002). The role of spatial working memory deficits in pathological search by neglect patients. In H. O. Karnath, A. D. Milner, & G. Vallar (Eds.), The cognitive and neural bases of spatial neglect (pp. 351–364). Oxford, UK: Oxford University Press.

Driver, J., & Mattingley, J. B. (1998). Parietal neglect and visual awareness. Nature Neuroscience, 1, 17–22.
Duhamel, J.-R., Goldberg, M. E., Fitzgibbon, E. J., Sirigu, A., & Grafman, J. (1992). Saccadic dysmetria in a patient with a right frontoparietal lesion. Brain, 115, 1387–1402.
Egly, R., Driver, J., & Rafal, R. D. (1994). Shifting visual attention between objects and locations: Evidence from normal and parietal lesion subjects. Journal of Experimental Psychology: General, 123 (2), 161–177.
Ellison, A., Schindler, I., Pattison, L. L., & Milner, A. D. (2004). An exploration of the role of the superior temporal gyrus in visual search and spatial perception using TMS. Brain, 127 (Pt 10), 2307–2315.
Ferber, S., & Danckert, J. (2006). Lost in space: The fate of memory representations for non-neglected stimuli. Neuropsychologia, 44 (2), 320–325.
Fierro, B., Brighina, F., Oliveri, M., Piazza, A., La Bua, V., Buffa, D., & Bisiach, E. (2000). Contralateral neglect induced by right posterior parietal rTMS in healthy subjects. NeuroReport, 11 (7), 1519–1521.
Fink, G. R., Marshall, J. C., Shah, N. J., Weiss, P. H., Halligan, P. W., Grosse-Ruyken, M., Ziemons, K., Zilles, K., & Freund, H. J. (2000). Line bisection judgments implicate right parietal cortex and cerebellum as assessed by fMRI. Neurology, 54 (6), 1324–1331.
Friedrich, F. J., Egly, R., Rafal, R. D., & Beck, D. (1998). Spatial attention deficits in humans: A comparison of superior parietal and temporo-parietal junction lesions. Neuropsychology, 12, 193–207.
Garcin, R., Rondot, P., & de Recondo, J. (1967). Ataxie optique localisée aux deux hémichamps visuels homonymes gauches. Revue Neurologique, 116 (6), 707–714.
Gaveau, V., Pélisson, D., Blangero, A., Urquizar, C., Prablanc, C., Vighetto, A., & Pisella, L. (2008). Saccadic control and eye-hand coordination in optic ataxia. Neuropsychologia, 46, 475–486.
Girard, P., Salin, P. A., & Bullier, J. (1991). Visual activity in macaque area V4 depends on area 17 input. NeuroReport, 2 (2), 81–84.
Girard, P., Salin, P. A., & Bullier, J. (1992). Response selectivity of neurons in area MT of the macaque monkey during reversible inactivation of area V1. Journal of Neurophysiology, 67 (6), 1437–1446.
Girotti, F., Casazza, M., Musicco, M., & Avanzini, G. (1983). Oculomotor disorders in cortical lesions in man: The role of unilateral neglect. Neuropsychologia, 21, 543–553.

Girotti, F., Milanese, C., et al. (1982). Oculomotor disturbances in Bálint’s syndrome: Anatomoclinical findings and electrooculographic analysis in a case. Cortex, 18 (4), 603–614.
Goldberg, M. E., & Bruce, C. J. (1990). Primate frontal eye fields. III. Maintenance of a spatially accurate saccade signal. Journal of Neurophysiology, 64, 489–508.
Goodale, M. A., Milner, A. D., Jacobson, L. S., & Carey, D. P. (1991). A neurological dissociation between perceiving objects and grasping them. Nature, 349, 154–156.
Gottlieb, J. P., Kusunoki, M., & Goldberg, M. E. (1998). The representation of visual salience in monkey parietal cortex. Nature, 391, 481–484.
Gréa, H., Pisella, L., Rossetti, Y., Desmurget, M., Tilikete, C., Prablanc, C., & Vighetto, A. (2002). A lesion of the posterior parietal cortex disrupts on-line adjustments during aiming movements. Neuropsychologia, 40, 2471–2480.
Guard, O., Perenin, M. T., Vighetto, A., Giroud, M., Tommasi, M., & Dumas, R. (1984). Syndrome pariétal bilatéral ressemblant au syndrome de Bálint. Revue Neurologique, 140 (5), 358–367.
Hecaen, H., & De Ajuriaguerra, J. (1954). Bálint’s syndrome (psychic paralysis of visual fixation) and its minor forms. Brain, 77 (3), 373–400.
Heide, W., Binkofski, F., Seitz, R. J., Posse, S., Nitschke, M. F., Freund, H. J., & Kömpf, D. (2001). Activation of frontoparietal cortices during memorized triple-step sequences of saccadic eye movements: An fMRI study. European Journal of Neuroscience, 13 (6), 1177–1189.
Heide, W., Blankenburg, M., Zimmermann, E., & Kömpf, D. (1995). Cortical control of double-step saccades: Implications for spatial orientation. Annals of Neurology, 38, 739–748.
Heide, W., & Kömpf, D. (1997). Specific parietal lobe contribution to spatial constancy across saccades. In P. Thier & H.-O. Karnath (Eds.), Parietal lobe contributions to orientation in 3D space (pp. 149–172). Heidelberg: Springer-Verlag.
Heilman, K. M. (2004). Intentional neglect. Frontiers in Bioscience, 9, 694–705.
Hilgetag, C. C., Théoret, H., & Pascual-Leone, A. (2001). Enhanced visual spatial attention ipsilateral to rTMS-induced “virtual lesions” of human parietal cortex. Nature Neuroscience, 4 (9), 953–957.
Hillis, A. E., & Caramazza, A. (1991). Deficit to stimulus-centered, letter shape representations in a case of “unilateral neglect.” Neuropsychologia, 29 (12), 1223–1240.

Hillis, A. E., Chang, S., Heidler-Gary, J., Newhart, M., Kleinman, J. T., Davis, C., Barker, P. B., Aldrich, E., & Ken, L. (2006). Neural correlates of modality-specific spatial extinction. Journal of Cognitive Neuroscience, 18 (11), 1889–1898.
Holmes, G. (1918). Disturbances of visual orientation. British Journal of Ophthalmology, 2, 449–468, 506–518.
Humphreys, G. W., Romani, C., Olson, A., Riddoch, M. J., & Duncan, J. (1994). Non-spatial extinction following lesions of the parietal lobe in humans. Nature, 372, 357–359.
Husain, M. (2001). A spatio-temporal framework for disorders of visual attention. In K. Shapiro (Ed.), The limits of attention: Temporal constraints in human information processing (pp. 229–246). Oxford, UK: Oxford University Press.
Husain, M. (2008). Hemispatial neglect. In G. Goldenberg & B. Miller (Eds.), Handbook of clinical neurology, 3rd Series, Volume 88: Neuropsychology and behavioral neurology (pp. 359–372). Amsterdam: Elsevier.
Husain, M., Mannan, S., Hodgson, T., Wojciulik, E., Driver, J., & Kennard, C. (2001). Impaired spatial working memory across saccades contributes to abnormal search in parietal neglect. Brain, 124 (Pt 5), 941–952.
Husain, M., Mattingley, J. B., Rorden, C., Kennard, C., & Driver, J. (2000). Distinguishing sensory and motor biases in parietal and frontal neglect. Brain, 123 (Pt 8), 1643–1659.
Husain, M., & Rorden, C. (2003). Non-spatially lateralized mechanisms in hemispatial neglect. Nature Reviews Neuroscience, 4 (1), 26–36.
Husain, M., Shapiro, K., Martin, J., & Kennard, C. (1997). Abnormal temporal dynamics of visual attention in spatial neglect patients. Nature, 385, 154–156.
Husain, M., & Stein, J. (1988). Rezsö Bálint and his most celebrated case. Archives of Neurology, 45, 89–93.
Ishiai, S. (2002). Perceptual and motor interaction in unilateral spatial neglect. In H. O. Karnath, A. D. Milner, & G. Vallar (Eds.), The cognitive and neural bases of spatial neglect (pp. 181–195). Oxford, UK: Oxford University Press.
Jackson, S. R., Newport, R., Mort, D., & Husain, M. (2005). Where the eye looks, the hand follows: Limb-dependent magnetic misreaching in optic ataxia. Current Biology, 15, 42–46.
Jakobson, L. S., Archibald, Y. M., Carey, D. P., & Goodale, M. A. (1991). A kinematic analysis of reaching and grasping movements in a patient recovering from optic ataxia. Neuropsychologia, 29, 803–809.
Jeannerod, M. (1986). Mechanisms of visuo-motor coordination: A study in normals and brain-damaged subjects. Neuropsychologia, 24, 41–78.

Jeannerod, M., Decety, J., et al. (1994). Impairment of grasping movements following bilateral posterior parietal lesion. Neuropsychologia, 32, 369–380.
Jeannerod, M., & Rossetti, Y. (1993). Visuomotor coordination as a dissociable function: Experimental and clinical evidence. In C. Kennard (Ed.), Visual perceptual defects. Baillière’s clinical neurology, international practise and research (pp. 439–460). London: Baillière Tindall.
Karnath, H. O., Himmelbach, M., & Küker, W. (2003). The cortical substrate of visual extinction. NeuroReport, 14 (3), 437–442.
Kennard, C., Mannan, S. K., Nachev, P., Parton, A., Mort, D. J., Rees, G., Hodgson, T. L., & Husain, M. (2005). Cognitive processes in saccade generation. Annals of the N Y Academy of Science, 1039, 176–183.
Khan, A. Z., Blangero, A., Rossetti, Y., Salemme, R., Luauté, J., Laverdure, N., Rode, G., Boisson, D., & Pisella, L. (2009). Parietal damage dissociates saccade planning from presaccadic perceptual facilitation. Cerebral Cortex, 19 (2), 383–387.
Khan, A. Z., Pisella, L., Rossetti, Y., Vighetto, A., & Crawford, J. D. (2005b). Impairment of gaze-centered updating of reach targets in bilateral parietal-occipital damaged patients. Cerebral Cortex, 15 (10), 1547–1560.
Khan, A. Z., Pisella, L., Vighetto, A., Cotton, F., Luauté, J., Boisson, D., Salemme, R., Crawford, J. D., & Rossetti, Y. (2005a). Optic ataxia errors depend on remapped, not viewed, target location. Nature Neuroscience, 8 (4), 418–420.
Kinsbourne, M. (1993). Orientational bias model of unilateral neglect: Evidence from attentional gradients within hemispace. In I. H. Robertson & J. C. Marshall (Eds.), Unilateral neglect: Clinical and experimental studies (pp. 63–86). Hove, UK: Erlbaum.
Koch, C., & Ullman, S. (1985). Shifts in selective visual attention: Towards the underlying neural circuitry. Human Neurobiology, 4, 219–227.
Konen, C. S., & Kastner, S. (2008). Two hierarchically organized neural systems for object information in human visual cortex. Nature Neuroscience, 11 (2), 224–231.
Làdavas, E. (1990). Selective spatial attention in patients with visual extinction. Brain, 113 (Pt 5), 1527–1538.
Li, C. S., Mazzoni, P., & Andersen, R. A. (1999). Effect of reversible inactivation of macaque lateral intraparietal area on visual and memory saccades. Journal of Neurophysiology, 81 (4), 1827–1838.
Luauté, J., Halligan, P., Rode, G., Jacquin-Courtois, S., & Boisson, D. (2006). Prism adaptation first among equals in alleviating left neglect: A review. Restorative Neurology and Neuroscience, 24 (4–6), 409–418.

Luo, C. R., Anderson, J. M., & Caramazza, A. (1998). Impaired stimulus-driven orienting of attention and preserved goal-directed orienting of attention in unilateral visual neglect. American Journal of Psychology, 111 (4), 487–507.
Luria, A. R. (1959). Disorders of “simultaneous” perception in a case of occipito-parietal brain injury. Brain, 82, 437–449.
Malhotra, P., Coulthard, E. J., & Husain, M. (2009). Role of right posterior parietal cortex in maintaining attention to spatial locations over time. Brain, 132 (Pt 3), 645–660.
Malhotra, P., Jäger, H. R., Parton, A., Greenwood, R., Playford, E. D., Brown, M. M., Driver, J., & Husain, M. (2005). Spatial working memory capacity in unilateral neglect. Brain, 128 (Pt 2), 424–435.
Mannan, S. K., Mort, D. J., Hodgson, T. L., Driver, J., Kennard, C., & Husain, M. (2005). Revisiting previously searched locations in visual neglect: Role of right parietal and frontal lesions in misjudging old locations as new. Journal of Cognitive Neuroscience, 17 (2), 340–354.
Mattingley, J. B., Davis, G., & Driver, J. (1997). Preattentive filling-in of visual surfaces in parietal extinction. Science, 275 (5300), 671–674.
Mattingley, J. B., Husain, M., Rorden, C., Kennard, C., & Driver, J. (1998). Motor role of human inferior parietal lobe revealed in unilateral neglect patients. Nature, 392 (6672), 179–182.
Mattingley, J. B., Pisella, L., Rossetti, Y., Rode, G., Tilikete, C., Boisson, D., & Vighetto, A. (2000). Visual extinction in retinotopic coordinates: A selective bias in dividing attention between hemifields. Neurocase, 6, 465–475.
Mays, L. E., & Sparks, D. L. (1980). Dissociation of visual and saccade-related responses in superior colliculus neurons. Journal of Neurophysiology, 43, 207–232.
McIntosh, R. D., Mulroue, A., Blangero, A., Pisella, L., & Rossetti, Y. (2011). Correlated deficits of perception and action in optic ataxia. Neuropsychologia, 49, 131–137.
Medendorp, W. P., Goltz, H. C., Vilis, T., & Crawford, J. D. (2003). Gaze-centered updating of visual space in human parietal cortex. Journal of Neuroscience, 23 (15), 6209–6214.
Merriam, E. P., Genovese, C. R., & Colby, C. L. (2003). Spatial updating in human parietal cortex. Neuron, 39, 361–373.
Merriam, E. P., Genovese, C. R., & Colby, C. L. (2007). Remapping in human visual cortex. Journal of Neurophysiology, 97 (2), 1738–1755.
Michel, F., & Hénaff, M. A. (2004). Seeing without the occipito-parietal cortex: Simultanagnosia as a shrinkage of the attentional visual field. Behavioural Neurology, 15, 3–13.

Michel, F., Jeannerod, M., & Devic, M. (1963). Un cas de désorientation visuelle dans les trois dimensions de l’espace (A propos du syndrome de Bálint et du syndrome décrit par G Holmes). Revue Neurologique, 108, 983–984.
Milner, A. D., Dijkerman, H. C., McIntosh, R. D., Rossetti, Y., & Pisella, L. (2003). Delayed reaching and grasping in patients with optic ataxia. In D. Pelisson, C. Prablanc, & Y. Rossetti (Eds.), Progress in Brain Research series: Neural control of space coding and action production (Vol. 142, pp. 225–242). Amsterdam: Elsevier.
Milner, A. D., Dijkermann, C., Pisella, L., McIntosh, R., Tilikete, C., Vighetto, A., & Rossetti, Y. (2001). Grasping the past: Delaying the action improves visuo-motor performance. Current Biology, 11 (23), 1896–1901.
Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. Oxford, UK: Oxford University Press.
Milner, A. D., & Goodale, M. A. (2008). Two visual systems re-viewed. Neuropsychologia, 46 (3), 774–785.
Milner, A. D., Paulignan, Y., Dijkerman, H. C., Michel, F., & Jeannerod, M. (1999). A paradoxical improvement of misreaching in optic ataxia: New evidence for two separate neural systems for visual localization. Proceedings of the Royal Society of London B, 266, 2225–2229.
Morris, A. P., Chambers, C. D., & Mattingley, J. B. (2007). Parietal stimulation destabilizes spatial updating across saccadic eye movements. Proceedings of the National Academy of Sciences U S A, 104 (21), 9069–9074.
Mort, D. J., Malhotra, P., Mannan, S. K., Rorden, C., Pambakian, A., Kennard, C., & Husain, M. (2003). The anatomy of visual neglect. Brain, 126 (Pt 9), 1986–1997.
Muggleton, N. G., Cowey, A., & Walsh, V. (2008). The role of the angular gyrus in visual conjunction search investigated using signal detection analysis and transcranial magnetic stimulation. Neuropsychologia, 46 (8), 2198–2202.
Müri, R. M., Iba-Zizen, M. T., et al. (1996). Location of the human posterior eye field with functional magnetic resonance imaging. Journal of Neurology, Neurosurgery, and Psychiatry, 60, 445–448.
Niebur, E., & Koch, C. (1997). Computational architectures for attention. In R. Parasuraman (Ed.), The attentive brain (pp. 163–186). Cambridge, MA: MIT Press.
Niemeier, M., & Karnath, H. O. (2000). Exploratory saccades show no direction-specific deficit in neglect. Neurology, 54 (2), 515–518.
Nowak, L., & Bullier, J. (1997). The timing of information transfer in the visual system. In J. Kaas, K. Rockland, & A. Peters (Eds.), Extrastriate cortex in primates. New York: Plenum Press.

Ota, H., Fujii, T., Suzuki, K., Fukatsu, R., & Yamadori, A. (2001). Dissociation of body-centered and stimulus-centered representations in unilateral neglect. Neurology, 57 (11), 2064–2069.
Pascual-Leone, A., Gomez-Tortosa, E., Grafman, J., Always, D., Nichelli, P., & Hallett, M. (1994). Induction of visual extinction by rapid-rate transcranial magnetic stimulation of parietal lobe. Neurology, 44 (3 Pt 1), 494–498.
Patterson, A., & Zangwill, O. L. (1944). Disorders of visual space perception associated with lesions of the right cerebral hemisphere. Brain, 67, 331–358.
Perenin, M.-T., & Vighetto, A. (1988). Optic ataxia: A specific disruption in visuomotor mechanisms. I. Different aspects of the deficit in reaching for objects. Brain, 111 (Pt 3), 643–674.
Pernier, J., Jeannerod, M., & Gerin, P. (1969). Preparation and decision in saccades: Adaptation to the trace of the stimulus. Vision Research, 9 (9), 1149–1165.
Pierrot-Deseilligny, C., & Müri, R. (1997). Posterior parietal cortex control of saccades in humans. In P. Thier & H.-O. Karnath (Eds.), Parietal lobe contributions to orientation in 3D space (pp. 135–148). Heidelberg: Springer-Verlag.
Pisella, L., Alahyane, N., Blangero, A., Thery, F., Blanc, S., Rode, G., & Pelisson, D. (2011). Right-hemispheric dominance for visual remapping in humans. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 366 (1564), 572–585.
Pisella, L., Berberovic, N., & Mattingley, J. B. (2004). Impaired working memory for location but not for colour or shape in visual neglect: A comparison of parietal and non-parietal lesions. Cortex, 40 (2), 379–390.
Pisella, L., Binkofski, F., Lasek, K., Toni, I., & Rossetti, Y. (2006a). No double-dissociation between optic ataxia and visual agnosia: Multiple sub-streams for multiple visuo-manual integrations. Neuropsychologia, 44 (13), 2734–2748.
Pisella, L., Gréa, H., Tilikete, C., Vighetto, A., Desmurget, M., Rode, G., Boisson, D., & Rossetti, Y. (2000). An automatic pilot for the hand in the human posterior parietal cortex: Toward a reinterpretation of optic ataxia. Nature Neuroscience, 3, 729–736.
Pisella, L., & Mattingley, J. B. (2004). The contribution of spatial remapping impairments to unilateral visual neglect. Neuroscience and Biobehavioral Reviews, 28 (2), 181–200.
Pisella, L., Ota, H., Vighetto, A., & Rossetti, Y. (2007). Optic ataxia and Bálint syndrome: Neurological and neurophysiological prospects. In G. Goldenberg & B. Miller (Eds.), Handbook of clinical neurology, 3rd Series, Volume 88: Neuropsychology and behavioral neurology (pp. 393–416). Amsterdam: Elsevier.

Pisella, L., Rode, G., Farne, A., Tilikete, C., & Rossetti, Y. (2006b). Prism adaptation in the rehabilitation of patients with visuo-spatial cognitive disorders. Current Opinion in Neurology, 19 (6), 534–542.
Pisella, L., Sergio, L., Blangero, A., Torchin, H., Vighetto, A., & Rossetti, Y. (2009). Optic ataxia and the function of the dorsal stream: Contribution to perception and action. Neuropsychologia, 47, 3033–3044.
Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32, 3–25.
Posner, M. I., Walker, J. A., Friedrich, F. J., & Rafal, R. D. (1984). Effects of parietal injury on covert orienting of attention. Journal of Neuroscience, 4, 1863–1874.
Prablanc, C., & Jeannerod, M. (1974). Latence et precision des saccades en fonction de l’intensité, de la durée et de la position rétinienne d’un stimulus. Revue E.E.G., 4 (3), 484–488.
Prime, S. L., Vesia, M., & Crawford, J. D. (2008). Transcranial magnetic stimulation over posterior parietal cortex disrupts transsaccadic memory of multiple objects. Journal of Neuroscience, 28 (27), 6938–6949.
Raymond, J. E., Shapiro, K. L., & Arnell, K. M. (1992). Temporary suppression of visual processing in RSVP task: An attentional blink? Journal of Experimental Psychology: Human Perception and Performance, 18, 849–860.
Rees, G. (2001). Neuroimaging of visual awareness in patients and normal subjects. Current Opinion in Neurobiology, 11 (2), 150–156.
Riddoch, M. J., Chechlacz, M., Mevorach, C., Mavritsaki, E., Allen, H., & Humphreys, G. W. (2010). The neural mechanisms of visual selection: The view from neuropsychology. Annals of the N Y Academy of Science, 1191 (1), 156–181.
Rizzo, M. (1993). “Bálint syndrome” and associated visuo-spatial disorders. Baillière’s Clinical Neurology, 2, 415–437.
Rizzo, M., & Vecera, S. P. (2002). Psychoanatomical substrates of Bálint’s syndrome. Journal of Neurology, Neurosurgery, and Psychiatry, 72 (2), 162–178.
Rizzolatti, G., Riggio, L., Dascola, I., & Umiltá, C. (1987). Reorienting attention across the horizontal and vertical meridians: Evidence in favor of a premotor theory of attention. Neuropsychologia, 25, 31–40.
Robertson, I. (1989). Anomalies in the laterality of omissions in unilateral left visual neglect: Implications for an attentional theory of neglect. Neuropsychologia, 27 (2), 157–165.

Rode, G., Luauté, J., Klos, T., Courtois-Jacquin, S., Revol, P., Pisella, L., Holmes, N. P., Boisson, D., & Rossetti, Y. (2007a). Bottom-up visuo-manual adaptation: Consequences for spatial cognition. In P. Haggard, Y. Rossetti, & M. Kawato (Eds.), Attention and performance XXI: Sensorimotor foundations of higher cognition (pp. 207–229). Oxford, UK: Oxford University Press.
Rode, G., Pisella, L., Marsal, L., Mercier, S., Rossetti, Y., & Boisson, D. (2006). Prism adaptation improves spatial dysgraphia following right brain damage. Neuropsychologia, 44 (12), 2487–2493.
Rode, G., Perenin, M. T., & Boisson, D. (1995). [Neglect of the representational space: Demonstration by mental evocation of the map of France]. Revue Neurologique, 151 (3), 161–164.
Rode, G., Revol, P., Rossetti, Y., Boisson, D., & Bartolomeo, P. (2007b). Looking while imagining: The influence of visual input on representational neglect. Neurology, 68 (6), 432–437.
Rondot, P., de Recondo, J., & Ribadeau-Dumas, J. L. (1977). Visuomotor ataxia. Brain, 100 (2), 355–376.
Rorden, C., Mattingley, J. B., Karnath, H.-O., & Driver, J. (1997). Visual extinction as prior entry: Impaired perception of temporal order with intact motion perception after parietal injury. Neuropsychologia, 35, 421–433.
Rossetti, Y. (1998). Implicit short-lived motor representations of space in brain damaged and healthy subjects. Consciousness and Cognition, 7, 520–558.
Rossetti, Y., McIntosh, R. M., Revol, P., Pisella, L., Rode, G., Danckert, J., Tilikete, C., Dijkerman, H. C. M., Boisson, D., Michel, F., Vighetto, A., & Milner, A. D. (2005). Visually guided reaching: Posterior parietal lesions cause a switch from visuomotor to cognitive control. Neuropsychologia, 43 (2), 162–177.
Rossetti, Y., & Pisella, L. (2002). Tutorial. Several “vision for action” systems: A guide to dissociating and integrating dorsal and ventral functions. In W. Prinz & B. Hommel (Eds.), Attention and performance XIX: Common mechanisms in perception and action (pp. 62–119). Oxford, UK: Oxford University Press.
Rossetti, Y., Pisella, L., & Pélisson, D. (2000). New insights on eye blindness and hand sight: Temporal constraints of visuomotor networks. Visual Cognition, 7, 785–808.
Rossetti, Y., Pisella, L., & Vighetto, A. (2003). Optic ataxia revisited: Visually guided action versus immediate visuo-motor control. Experimental Brain Research, 153 (2), 171–179.
Rossetti, Y., & Revonsuo, A. (2000). Beyond dissociations: Recomposing the mind-brain after all? In Y. Rossetti & A. Revonsuo (Eds.), Beyond dissociation: Interaction between dissociated implicit and explicit processing (pp. 1–16). Amsterdam: Benjamins.

Attentional Disorders Rossetti, Y., Rode, G., Pisella, L., Farne A., Ling L., Boisson D., & Perenin, M. T. (1998). Hemispatial neglect and prism adaptation: When adaptation to rightward optical devia tion rehabilitates the neglected left side. Nature, 395, 166–169. Russell, C., Deidda, C., Malhotra, P., Crinion, J. T., Merola, S., & Husain, M. (2010). A deficit of spatial remapping in constructional apraxia after right-hemisphere stroke. Brain, 133, 1239–1251. Schindler, I., McIntosh, R. D., Cassidy, T. P., Birchall, D., Benson, V., Ietswaart, M., & Milner, A. D. (2009). The disengage deficit in hemispatial neglect is restricted to be tween-object shifts and is abolished by prism adaptation. Experimental Brain Research, 192 (3), 499–510. (p. 350)

Schulman, G. L., Astafiev, S. V., McAvoy, M. P., d’Avossa, G., & Corbetta, M. (2007). Right TPJ deactivation during visual search: Functional significance and support for a filter hy pothesis. Cerebral Cortex, 17 (11), 2625–2633. Smith, S., & Holmes, G. (1916). A case of bilateral motor apraxia with disturbance of visu al orientation. British Medical Journal, 1, 437–441. Snyder LH, Batista AP, Andersen RA (1997). Coding of intention in the posterior parietal cortex. Nature, 386 (6621), 167–170. Striemer, C., Blangero, A., Rossetti, Y., Boisson, D., Rode, G., Vighetto, A., Pisella, L., & Danckert, J. (2007). Deficits in peripheral visual attention in patients with optic ataxia. NeuroReport, 18 (11), 1171–1175. Striemer, C., Blangero, A., Rossetti, Y., Boisson, D., Rode, G., Salemme, R., Vighetto, A., Pisella, L., & Danckert, J. (2008). Bilateral posterior parietal lesions disrupt the beneficial effects of prism adaptation on visual attention: Evidence from a patient with optic ataxia. Experimental Brain Research, 187 (2), 295–302. Striemer, C., Locklin, J., Blangero, A., Rossetti, Y., Pisella, L., & Danckert, J. (2009). Atten tion for action? Examining the link between attention and visuomotor control deficits in a patient with optic ataxia. Neuropsychologia, 47 (6), 1491–1499. Tian, J., Schlag J., & Schlag-Rey, M. (2000). Testing quasi-visual neurons in the monkey’s frontal eye field with the triple-step paradigm. Experimental Brain Research, 130, 433– 440. Tyler, H. R. (1968). Abnormalities of perception with defective eye movements (Bálint’s syndrome). Cortex, 4, 154–171. Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cam bridge, MA: MIT Press.

Page 54 of 56

Attentional Disorders Vallar, G. (2007). Spatial neglect, Balint-Homes’ and Gerstmann’s syndrome, and other spatial disorders. CNS Spectrums, 12 (7), 527–536. Vallar, G., & Perani, D. (1986). The anatomy of unilateral neglect after right-hemisphere stroke lesions: A clinical/CT-scan correlation study in man. Neuropsychologia, 24, 609– 622. van Koningsbruggen, M. G., Gabay, S., Sapir, A., Henik, A., & Rafal, R. D. (2010). Hemi spheric asymmetry in the remapping and maintenance of visual saliency maps: A TMS study. Journal of Cognitive Neurosci ence, 22 (8), 1730–1738. Vecera, S. P., & Rizzo, M. (2006). Eye gaze does not produce reflexive shifts of attention: Evidence from frontal-lobe damage. Neuropsychologia, 44, 150–159. Verfaellie, M., Rapcsak, S. Z., et al. (1990). Impaired shifting of attention in Bálint ‘s syn drome. Brain and Cognition, 12 (2), 195–204. Vighetto, A. (1980). Etude neuropsychologique et psychophysique de l’ataxie optique. Thèse Université Claude Bernard Lyon I. Vighetto, A., & Perenin, M. T. (1981). Optic ataxia: Analysis of eye and hand responses in pointing at visual targets. Revue Neurologique, 137 (5), 357–372. Vuilleumier, P., & Rafal, R. D. (2000). A systematic study of visual extinction: Betweenand within-field deficits of attention in hemispatial neglect. Brain, 123 (Pt 6), 1263–1279. Vuilleumier, P., Sergent, C., Schwartz, S., Valenza, N., Girardi, M., Husain, M., & Driver, J. (2007). Impaired perceptual memory of locations across gaze-shifts in patients with uni lateral spatial neglect. Journal of Cognitive Neuroscience, 19 (8), 1388–1406. Wade, A. R., Brewer, A. A., Rieger, J. W., & Wandell, B. A. (2002). Functional measure ments of human ventral occipital cortex: retinotopy and colour. Philosophical Transac tions of the Royal Society of London, Series B, Biological Sciences. 357 (1424), 963–973. Walker, R., & Findlay, J. M. (1997). Eye movement control in spatial- and object-based ne glect. In P. Thier & H.-O. 
Karnath (Eds.), Parietal lobe contributions to orientation in 3D space (pp. 201–218). Heidelberg: Springer-Verlag. Wardak, C., Olivier, E., & Duhamel, J. R. (2002). Saccadic target selection deficits after lateral intraparietal area inactivation in monkeys. Journal of Neuroscience, 22 (22), 9877– 9884. Weddell, R. A. (2004). Subcortical modulation of spatial attention including evidence that the Sprague effect extends to man. Brain and Cognition, 55 (3), 497–506. Wojciulik, E., Husain, M., Clarke, K., & Driver, J. (2001) Spatial working memory. Journal of Neuropsychologia, 39 (4), 390–396.

Page 55 of 56

Attentional Disorders Wolpert, T. (1924). Die simultanagnosie. Zeitschrift für gesamte Neurologie und Psychia trie, 93, 397–415. Yarbus, A. L. (1967). Eye movements and vision. New York: Plenum Press. Yeshurun, Y., & Carrasco, M. (1998). Attention improves or impairs visual performance by enhancing spatial resolution. Nature, 396, 72–75. Zihl, J. (2000). Rehabilitation of visual disorders after brain injury. In Neuropsychological rehabilitation: A modular handbook. Hove, UK: Psychology Press.

Laure Pisella, Lyon Neuroscience Research Center, Bron, France

A. Blangero, Lyon Neuroscience Research Center, Bron, France

Caroline Tilikete, Lyon Neuroscience Research Center, University Lyon, Hospices Civils de Lyon, Hôpital Neurologique.

Damien Biotti, Lyon Neuroscience Research Center.

Gilles Rode, Lyon Neuroscience Research Center, Hospices Civils de Lyon, and Hôpital Henry Gabrielle.

Alain Vighetto, Lyon Neuroscience Research Center, University Lyon, Hospices Civils de Lyon, Hôpital Neurologique, Lyon, France.

Jason B. Mattingley is Professor of Cognitive Neuroscience, The University of Queensland.

Yves Rossetti, Lyon Neuroscience Research Center, University Lyon, Mouvement et Handicap, Plateforme IFNL-HCL, Hospices Civils de Lyon.


Semantic Memory

Eiling Yee, Evangelia G. Chrysikou, and Sharon L. Thompson-Schill
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0017

Abstract and Keywords

Semantic memory refers to general knowledge about the world, including concepts, facts, and beliefs (e.g., that a lemon is normally yellow and sour or that Paris is in France). How is this kind of knowledge acquired or lost? How is it stored and retrieved? This chapter reviews evidence that conceptual knowledge about concrete objects is acquired through experience with them, thereby grounding knowledge in distributed representations across brain regions that are involved in perceiving or acting on them, and impaired by damage to these brain regions. The authors suggest that these distributed representations result in flexible concepts that can vary depending on the task and context, as well as on individual experience. Further, they discuss the role of brain regions implicated in selective attention in supporting such conceptual flexibility. Finally, the authors consider the neural bases of other aspects of conceptual knowledge, such as the ability to generalize (e.g., to map lemons and grapes onto the category of fruit), and the ability to represent knowledge that does not have a direct sensorimotor correlate (e.g., abstract concepts, such as peace).

Keywords: semantic memory, concepts, categories, representation, knowledge, sensorimotor, grounding, embodiment

Introduction

What Is Semantic Memory?

How do we know what we know about the world? For instance, how do we know that a cup must be concave, or that a lemon is normally yellow and sour? Psychologists and cognitive neuroscientists use the term semantic memory to refer to this kind of world knowledge. In his seminal article, “Episodic and Semantic Memory,” Endel Tulving borrowed the term semantic from linguists to refer to a memory system for “words and other verbal symbols, their meaning and referents, about relations among them, and about rules, formulas, and algorithms for manipulating them”1 (Tulving, 1972, p. 386).

Today, most psychologists use the term semantic memory more broadly, to refer to all kinds of general world knowledge, whether it is about words or concepts, facts or beliefs. What these types of world knowledge have in common is that they are independent of specific experiences: they consist of general information that can be retrieved without reference to the circumstances in which it was originally acquired. For example, the knowledge that lemons are shaped like mini-footballs would be considered part of semantic memory, whereas knowledge about where you were the last time you tasted a lemon would be considered part of episodic memory. This division is reflected in a prominent taxonomy of long-term memory (Squire, 1987), in which semantic and episodic memory are characterized as distinct components of the explicit (or declarative) memory system for facts (semantic knowledge) and events (episodic knowledge).

What Is the Relationship Between Semantic Memory and Episodic Memory?

Although semantic memory and episodic memory are typically considered distinct, the degree to which semantic memory is dependent on episodic memory is a matter of ongoing debate. This is because in order to possess a piece of semantic information, there must have been some episode during which that information was learned. Whether this means that all information in semantic memory begins as information in episodic memory (i.e., memory linked to a specific time and place) is an open question. According to Tulving, the answer is no: “If a person possesses some semantic memory information, he obviously must have learned it, either directly or indirectly, at an earlier time, but he need not possess any mnemonic information about the episode of such learning …” (p. 389). In other words, it may be possible for information to be incorporated into our semantic memory without our ever having been consciously aware of the instances in which we were exposed to it. Alternatively, episodic memory may be the “gateway” to semantic memory (see Squire & Zola, 1998, for review); that is, it may be the route through which semantic memory must be acquired (although eventually this information may exist independently). Most of the evidence brought to bear on this debate has come from studies of patients with selective episodic or semantic memory deficits. We turn to these patients in the following two subsections.

How Is Semantic Memory Acquired?

Children who develop amnesia in early childhood (consequent to bilateral hippocampal damage) are relevant to the question of whether the acquisition of semantic information depends on episodic memory. If semantic knowledge is acquired through episodic memory, then because these children had limited time to acquire semantic knowledge before developing amnesia, they should have limited semantic knowledge. Interestingly, despite their episodic memory impairments, amnesic children’s semantic knowledge appears relatively intact (Bindschaedler et al., 2011; Gardiner et al., 2008; Vargha-Khadem et al., 1997). Furthermore, studies on the famous amnesic patient H.M. have revealed that he acquired some semantic knowledge after the surgery that led to his amnesia, both for words that came into common use (Gabrieli et al., 1988) and for people who became famous (O’Kane et al., 2004) after his surgery. Thus, the evidence suggests that semantic knowledge can be acquired independently of the episodic memory system. However, semantic knowledge in these amnesic patients is not normal (e.g., it is acquired very slowly and laboriously). It is therefore possible that the acquisition of semantic memory normally depends on the episodic system,2 but other points of entry can be used (albeit less efficiently) when the episodic system is damaged. Alternatively, these patients may have enough remaining episodic memory to allow the acquisition of semantic knowledge (Squire & Zola, 1998).

Can Semantic Memories Be “Forgotten”?

Everyone occasionally experiences difficulty retrieving episodic memories (what did I eat for dinner last night?), but can people lose their knowledge of what things are? Imagine walking through an orchard with a friend: Your friend has no trouble navigating among the trees; then, to your surprise, as you stroll under a lemon tree, she picks up a lemon, holds it up and asks, “What is this thing?”

In an early report, Elizabeth Warrington (1975) described three patients who appeared to have lost this kind of knowledge. The syndrome has subsequently been termed semantic dementia (also known as the temporal variant of fronto-temporal dementia), a neurodegenerative disease that causes gradual and selective atrophy of the anterior temporal cortex (predominantly on the left; see Garrard & Hodges, 1999; Mesulam et al., 2003; Mummery et al., 1999). Although semantic dementia patients typically speak fluently and without grammatical errors, as the disease progresses, they exhibit severe word-finding difficulties and marked deficits in identifying objects, concepts, and people (Snowden et al., 1989) irrespective of stimulus modality (e.g., pictures or written or spoken words; Bozeat et al., 2000; Hodges et al., 1992; Patterson et al., 2006, 2007; Rogers & Patterson, 2007; Snowden et al., 1994, 2001).

Semantic dementia patients’ performance on tests of visuo-spatial reasoning and executive function is less impaired (e.g., Hodges et al., 1999; Rogers et al., 2006). Importantly, they also have relatively preserved episodic memories (e.g., Bozeat et al., 2002a, 2002b, 2004; Funnell, 1995a, 1995b, 2001; Graham et al., 1997, 1999; Snowden et al., 1994, 1996, 1999). Research on semantic dementia thus provides further evidence that the neural structures underlying episodic memory are at least partially independent of those underlying retrieval from semantic memory.

How one conceives of the relationship between semantic and episodic memory is complicated by the fact that (as we discuss in the following section) there are different kinds of semantic knowledge. It may be that for sensorimotor aspects of semantic knowledge (e.g., knowledge about the shape, size, or smell of things), “new information enters semantic memory through our perceptual systems, not through episodic memory” (Tulving, 1991, p. 20), whereas semantic knowledge of information that does not enter directly through our senses (e.g., “encyclopedic knowledge,” such as the fact that trees photosynthesize) depends more heavily on contextual information. Moreover, sensorimotor and nonsensorimotor components of semantic knowledge may be stored in different areas of the cortex. Of note, even encyclopedic knowledge is often acquired indirectly; for example, knowing that apple trees photosynthesize allows you to infer that lemon trees also photosynthesize. Semantic knowledge may support the ability to make these kinds of generalizations. In the next section, we introduce some influential hypotheses about what the different components of semantic knowledge might be.

What Are the Different Aspects of Semantic Memory?

Psychologists began to ask questions about how our knowledge about the world is organized following observations of different kinds of impairments in patients with brain injuries. More than 25 years ago, Warrington and McCarthy (1983) described a patient who had more difficulty identifying nonliving than living things. Shortly after, Warrington and Shallice (1984) described four patients exhibiting a different pattern of impairments: more difficulty identifying living than nonliving things. These and other observations of category-specific impairments led to the proposal that semantic memory might be organized in domains of knowledge such as living things (e.g., animals, vegetables, fruits) and nonliving things (e.g., tools, artifacts), which can be selectively impaired after brain injury (Warrington & McCarthy, 1994). Thus, one possible organizational framework for semantic knowledge is categorical (also referred to as domain specific; e.g., Caramazza & Shelton, 1998).

Early functional neuroimaging studies, however, suggested that semantic memory may be organized along featural (also known as modality- or attribute-specific) lines, either instead of or in addition to domain-specific lines. These studies showed neuroanatomical dissociations between visual and nonvisual object attributes, even within a category (e.g., Thompson-Schill et al., 1999). For example, Martin and colleagues (1995) reported that retrieving the color of an object was associated with activation in ventral temporal cortex bilaterally, whereas retrieving action-related information was associated with activation in middle temporal and frontal cortex.

Further observations from neuropsychological patients have suggested even finer subdivisions within semantic memory (e.g., Buxbaum & Saffran, 2002; Saffran & Schwartz, 1994). In particular, in categorical frameworks, living things can be further divided into distinct subcategories (e.g., fruits and vegetables). Similarly, in featural frameworks, nonvisual features can be subdivided into knowledge about an object’s function (e.g., a spoon is used to eat) versus knowledge about how it is manipulated (e.g., a spoon is held with the thumb, index, and middle fingers, at an angle; Buxbaum, Veramonti, & Schwartz, 2000; Kellenbach, Brett, & Patterson, 2003; Sirigu et al., 1991); likewise, visual features can be subdivided into different attributes (e.g., color, size, form, or motion; see Thompson-Schill, 2003, for review).


In the remainder of this chapter, we present a number of different theories cognitive neuroscientists have proposed for the organization of semantic knowledge, and we discuss experimental evidence on how this organization might be reflected in the brain. Although some findings would appear, at first, to be consistent with an organization of semantic memory by categories of information, we will conclude that the bulk of the evidence supports an organization by features or attributes that are distributed across multiple brain regions.

How Is Semantic Memory Organized?

How is knowledge in semantic memory organized? Is it organized the way files appear on a computer, with separate folders for different kinds of information (Applications, Documents, Music, Movies, etc.), and subfolders within those folders providing further organization? That is, is semantic knowledge organized hierarchically? Or is it organized more like the way information is actually stored in computer (e.g., RAID) memory, wherein data are stored in multiple (frequently redundant) drives or levels to increase access speed and reliability? That is, is semantic knowledge organized in a distributed fashion? In this section we briefly describe four different classes of models that have been put forth to describe the organization of semantic memory.

Traditional Cognitive Perspectives

Classical cognitive psychological theories have described the organization of knowledge in semantic memory in terms of a hierarchy (e.g., a tree is a plant and a plant is a living thing; Collins & Quillian, 1969) that is structured according to abstract relations between concepts (i.e., the propositions, rules, or procedures that determine where a concept fits in the hierarchy) and that may be inaccessible to conscious experience (e.g., Pylyshyn, 1973). Cognitive theorists have also considered whether semantic knowledge may be acquired and stored in multiple formats akin to verbal and visual codes (e.g., Paivio, 1969, 1971, 1978). Historically, these theories have not described brain mechanisms that might support conceptual knowledge, but these sorts of descriptions foreshadow the theories about the organization of semantic memory (category vs. attribute based) that characterize cognitive neuroscience today.
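
As an illustration only, the hierarchical organization described above can be sketched in a few lines of Python. The sketch assumes that each property is stored once, at the highest node to which it applies, and is inherited by the nodes below it; the node names, the properties, and the function names `ancestors` and `has_property` are hypothetical choices made for this example rather than part of any published model.

```python
# Toy hierarchical network. Node names and properties are illustrative
# assumptions, not data taken from the literature.
HIERARCHY = {
    "living thing": {"parent": None, "properties": {"grows"}},
    "plant": {"parent": "living thing", "properties": {"photosynthesizes"}},
    "tree": {"parent": "plant", "properties": {"has a trunk"}},
    "lemon tree": {"parent": "tree", "properties": {"bears lemons"}},
}


def ancestors(concept):
    """Return the chain from the concept up to the root, inclusive."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = HIERARCHY[concept]["parent"]
    return chain


def has_property(concept, prop):
    """Return how many links must be traversed to verify prop, or None."""
    for steps, node in enumerate(ancestors(concept)):
        if prop in HIERARCHY[node]["properties"]:
            return steps
    return None


if __name__ == "__main__":
    # A property stored higher in the hierarchy takes more steps to verify.
    print(has_property("lemon tree", "bears lemons"))
    print(has_property("lemon tree", "photosynthesizes"))
```

In this sketch, verifying that a lemon tree photosynthesizes requires moving up two links, whereas verifying that it bears lemons requires none, which is one way to picture what it means for a concept's position in the hierarchy to determine what is known about it.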

Domain-Specific Category-Based Models

As described above, a number of observations from patients with brain injuries suggest that different object categories (i.e., living and nonliving things) might be differentially influenced by brain damage. One way to instantiate the evident neural dissociation between living and nonliving things is to posit that there are distinct neural regions dedicated to processing different categories of objects. The “domain-specific” category-based model (Caramazza & Shelton, 1998) does just that. According to this model, evolutionary pressure led to the development of adaptations to facilitate recognition of categories that are particularly relevant for survival or reproduction, such as animals, plant life (i.e., fruits and vegetables), conspecifics, and possibly tools; and these adaptations led to objects from these different categories having distinct, non-overlapping neural representations. Such a system would have adaptive value to the extent that having dedicated neural mechanisms for recognizing these objects could make for faster and more accurate classification, and a subsequent appropriate response.

Although a fundamental principle of this model is that representations of concepts from these different categories are processed in distinct regions and thus do not overlap, it does not speak to how conceptual knowledge is represented within these categories. In fact, an elaboration of this model (Mahon & Caramazza, 2003) is partially distributed and partially sensorimotor based in that it suggests that representations may be distributed over different sensory modalities. However, within each modality, the representations of different categories remain distinct.

Sensory-Functional and Sensorimotor-Based Theories

A complication for category-based models is that despite the “category-specific” label, patients’ recognition problems do not always adhere to category boundaries: deficits can span category boundaries or affect only part of a category. This suggests a need for an account of semantic memory that does not assume a purely category-specific organization. Sensory-functional theory provides an alternative account. According to this model, conceptual knowledge is divided into anatomically distinct sensory and functional stores, and so-called category-specific deficits emerge because the representations of different kinds tend to rely on sensory and functional information to different extents (Farah & McClelland, 1991; Warrington & McCarthy, 1987). For example, representations of living things depend more on visual information than do artifacts, which depend more on functional information. Consequently, deficits that partially adhere to category boundaries can emerge even without semantic memory being categorically organized per se.

Sensory-functional theory is not without its own problems, however. There exist numerous patients whose deficits cannot be captured by a binary sensory-functional divide (see Caramazza & Shelton, 1998, for a review), which demonstrates that a two-way partitioning of semantic attributes is too coarse. A related but more fully specified proposal by Alan Allport addresses this concern by pointing out that sensory information should not be considered a unitary entity but rather should be divided into multiple attributes (e.g., color, sound, form, touch). Specifically, Allport (1985) suggests that the sensorimotor systems used to experience the world are also used to represent meaning: “The essential idea is that the same neural elements that are involved in coding the sensory attributes of a (possibly unknown) object presented to eye or hand or ear also make up the elements of the auto-associated activity-patterns that represent familiar object-concepts in ‘semantic memory’” (1985, p. 53).3 Hence, according to Allport’s model, representations are sensorimotor based, and consequently, the divisions of labor that exist in sensorimotor processing should be reflected in conceptual representations. More recently, other sensorimotor-based models have made similar claims (e.g., Barsalou, 1999; Damasio, 1989; Lakoff & Johnson, 1999; in a later section, we discuss empirical studies that address these predictions).

One question that often arises with respect to these sensorimotor-based theories is whether, in addition to sensorimotor representations and the connections between them, it is useful to posit one or more specific brain regions, often called a hub or convergence zone, where higher order similarity (that is, similarity across sensory modalities) can be computed (e.g., Damasio, 1989; Simmons & Barsalou, 2003). Such an architecture may facilitate capturing similarity among concepts, thereby promoting generalization and the formation of categories (see Patterson et al., 2007, for a review). We return to these issues in later sections, where we discuss generalization and the representation of knowledge that is abstract in that it has no single direct sensorimotor correlate (e.g., the purpose for which an object is used, such as “to tell time” for a clock).
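
The logic of sensory-functional theory, in which category-specific deficits arise without any category-based storage, can be made concrete with a deliberately simplified sketch. It is not the Farah and McClelland (1991) network; it merely assumes that each concept has some number of visual and functional features and that damage removes a fixed proportion of one store. The feature counts and the function names `surviving_fraction` and `category_means` are hypothetical.

```python
# Hypothetical feature counts: living things are weighted toward visual
# features, artifacts toward functional ones (illustrative numbers only).
CONCEPTS = {
    "tiger": {"category": "living", "visual": 8, "functional": 2},
    "lemon": {"category": "living", "visual": 7, "functional": 3},
    "fork": {"category": "artifact", "visual": 3, "functional": 7},
    "hammer": {"category": "artifact", "visual": 4, "functional": 6},
}

STORES = ("visual", "functional")


def surviving_fraction(concept, damage):
    """Fraction of a concept's features left after damage.

    damage maps a store name to the proportion of that store lost (0 to 1).
    """
    feats = CONCEPTS[concept]
    total = sum(feats[s] for s in STORES)
    left = sum(feats[s] * (1.0 - damage.get(s, 0.0)) for s in STORES)
    return left / total


def category_means(damage):
    """Average surviving fraction per category under the given damage."""
    sums, counts = {}, {}
    for name, feats in CONCEPTS.items():
        cat = feats["category"]
        sums[cat] = sums.get(cat, 0.0) + surviving_fraction(name, damage)
        counts[cat] = counts.get(cat, 0) + 1
    return {cat: sums[cat] / counts[cat] for cat in sums}


if __name__ == "__main__":
    # Damage to the visual store hurts living things more than artifacts,
    # even though nothing in the code stores knowledge by category.
    print(category_means({"visual": 0.5}))
    print(category_means({"functional": 0.5}))
```

Under these assumed counts, losing half of the visual store leaves living things with a smaller share of their features than artifacts, and losing half of the functional store reverses the pattern. The category label is used only to summarize the outcome, not to organize the stored features.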

Correlated Feature-Based Accounts

The final class of models that we discuss is commonly referred to as correlated feature-based accounts (Gonnerman et al., 1997; McRae, de Sa, & Seidenberg, 1997; Tyler & Moss, 2001). According to these models, the “features” from which concepts are built comprise not only sensorimotor-based features (such as shape, color, action, and taste) but also other (experience-based) attributes that participants produce when asked to list features of objects. For instance, for a tiger, these features might include things such as “has eyes,” “breathes,” “has legs,” and “has stripes,” whereas for a fork, they might include “made of metal,” “used for spearing,” and “has tines.”

Importantly, different classes of objects are characterized by different degrees of co-occurrence of particular types of features. For example, for a given living thing, participants tend to list features that are shared with other living things (e.g., “has eyes,” “breathes,” “has legs”), whereas for artifacts, they tend to list features that are not shared with other artifacts (e.g., “used for spearing,” “has tines”). When features tend to co-occur, they can be said to be correlated. For example, if something has legs, it is also likely to breathe and to have eyes. Because correlated feature-based models consider that living and nonliving things can be described through component features, they are at least partially compatible with both sensorimotor and domain-specific theories.4

According to one influential correlated feature-based model (Tyler & Moss, 2001), highly correlated shared features tend to support knowledge of a category as a whole, whereas distinctive features tend to support accurate identification of individual members. Further, the correlations between features enable them to support each other, making these features robust. Hence, because living things have many shared features, general category knowledge is robust for them. On the other hand, because individual living things tend to have few and uncorrelated distinctive features (e.g., “has stripes” or “has spots”), distinctive information about living things is particularly susceptible to impairment. In contrast, features that distinguish individual artifacts from others tend to be correlated (e.g., “has tines” is correlated with “used for spearing”), making this information robust. While differing in some details, Cree and McRae’s (2003) feature-based account similarly posits that objects (living and nonliving) differ with respect to number of shared versus distinctive features and that these factors vary with object category. Hence, correlated feature-based accounts hypothesize that the reason for category-specific deficits is not domain of knowledge per se, but instead is differences in the distribution of features across domains (see also Rogers & Patterson, 2007).
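
The distinction between shared and distinctive features, and the idea of feature co-occurrence, can be sketched with a small feature-listing table. The tiger and fork features below come from the examples in the text; the leopard and spoon entries, and the function names `feature_counts`, `split_features`, and `cooccurrence`, are hypothetical additions for illustration. Real feature norms are far larger and use graded production frequencies rather than simple set membership.

```python
# Toy feature lists. Tiger and fork follow the examples in the text;
# leopard and spoon are hypothetical entries added for contrast.
CONCEPTS = {
    "tiger": {"has eyes", "breathes", "has legs", "has stripes"},
    "leopard": {"has eyes", "breathes", "has legs", "has spots"},
    "fork": {"made of metal", "used for spearing", "has tines"},
    "spoon": {"made of metal", "used for scooping", "has a bowl"},
}


def feature_counts():
    """Count how many concepts list each feature."""
    counts = {}
    for feats in CONCEPTS.values():
        for feature in feats:
            counts[feature] = counts.get(feature, 0) + 1
    return counts


def split_features(concept):
    """Return (shared, distinctive) feature sets for one concept."""
    counts = feature_counts()
    shared = {f for f in CONCEPTS[concept] if counts[f] > 1}
    distinctive = CONCEPTS[concept] - shared
    return shared, distinctive


def cooccurrence(feature_a, feature_b):
    """Number of concepts that list both features."""
    return sum(
        1 for feats in CONCEPTS.values()
        if feature_a in feats and feature_b in feats
    )


if __name__ == "__main__":
    print(split_features("tiger"))
    print(split_features("fork"))
    print(cooccurrence("has legs", "breathes"))
```

In this toy table the tiger has three shared features and a single distinctive one, whereas the fork has one shared feature and two distinctive features that occur together, which mirrors the asymmetry between living things and artifacts described above.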

Summary of Models

The main division between domain-specific category-based models, on the one hand, and sensorimotor-based and correlated feature-based accounts, on the other, concerns how category knowledge is represented. For domain-specific models, object category is a primary organizing principle of semantic memory, whereas for the other accounts, category differences emerge from other organizational properties. In many ways, correlated feature-based accounts echo sensorimotor-based theories. In particular, these two classes of models are parallel in that categories emerge through co-occurrence of features, with the relevance of different features depending on the particular object, and with different parts of a representation supporting one another. The major distinguishing aspect is that sensorimotor-based theories focus on sensorimotor features, specifying that the same brain regions that encode a feature represent it. In contrast, because none of the fundamental principles of correlated feature-based accounts require that features be sensorimotor based (in fact, a concern for these models is how features should be defined), these accounts do not require that features be situated in brain regions that are tied to sensory or motor processing.

Incorporating a convergence zone type of architecture into a sensorimotor-based model may help integrate all three classes of models. Convergence zone theories posit dedicated regions for integrating across sensorimotor-based features, extracting statistical regularities across concepts, and ultimately producing a level of representation with a category-like topography in the brain (Simmons & Barsalou, 2003).

What Are the Neural Systems that Support Semantic Memory, and How Do We Retrieve Semantic Information from These Systems?

Are Different Categories Supported by Different Brain Regions?

Functional neuroimaging techniques like positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) have allowed cognitive neuroscientists to explore different hypotheses regarding the neural organization of semantic memory in undamaged brains. By means of these methodologies, researchers observe regional brain activity while participants perform cognitive tasks such as naming objects, deciding whether two stimuli belong in the same object category, or matching pictures of stimuli to their written or spoken names.

Early work attempted to examine whether specific brain regions are selectively active for knowledge of different object categories (e.g., animals or tools). These studies found that thinking about animals tends to produce increased neural activity in inferior posterior areas, including inferior temporal (Okada et al., 2000; Perani et al., 1995) and occipital regions (Grossman et al., 2002; Martin et al., 1996; Okada et al., 2000; Perani et al., 1995), whereas thinking about tools tends to activate more dorsal and frontal areas, including left dorsal (Perani et al., 1995) or inferior (Grossman et al., 2002; Okada et al., 2000) prefrontal regions, as well as left premotor (Martin et al., 1996), inferior parietal (Okada et al., 2000), and posterior middle temporal areas (Grossman et al., 2002; Martin et al., 1996; Okada et al., 2000). Further, within the inferior temporal lobe, the lateral fusiform gyrus generally shows increased neural activity in response to animals, while the medial fusiform tends to respond more to tools (see Martin, 2007, for a review).

Although these findings might seem at first glance to provide unambiguous support for a domain-specific, category-based organization of semantic memory, the data have not always been interpreted as such. Sensory-functional theories can also account for putatively category-specific activations because they posit that different regions of neural activity for animals and tools reflect a tendency for differential weighting of visual and functional features for objects within a given category, rather than an explicit category-based organization (e.g., Warrington & McCarthy, 1987). The hypothesis that a feature’s weight can vary across objects raises the possibility that even for a given object, a feature’s weight may vary depending on its relevance to a given context.
In other words, the extent to which a particular feature becomes active for a given object may be contextually dependent not only on long-term, object-related factors (i.e., is this feature relevant in general for the identification of this object?) but also on short-term, task-related factors (i.e., is this feature relevant for the current task?). The following sections describe evidence suggesting that both the format of the stimulus with which semantic memory is probed (i.e., words vs. pictures) and the demands of the task influence which aspects of a given concept’s semantic representation are activated.

Does the Format of the Stimulus Influence Semantic Memory Retrieval?

Studies of neuropsychological patients have suggested dissociations in performance between semantic knowledge tasks that use pictorial or verbal stimuli. For example, patients with optic aphasia are unable to identify objects presented visually, whereas their performance with lexical/verbal stimuli remains unimpaired (e.g., Hillis & Caramazza, 1995; Riddoch & Humphreys, 1987). On the other hand, Saffran and colleagues (2003a) described a patient whose object recognition performance was enhanced when prompted with pictures but not with words. This neuropsychological evidence suggests that pictures and words may have differential access to different components of semantic knowledge (Chainay & Humphreys, 2002; Rumiati & Humphreys, 1998; Saffran et al., 2003b). That is, damage to a component accessed by one stimulus type (e.g., words) can spare components accessed by a different stimulus type (e.g., pictures).

Consistent with the neuropsychological observations, studies of healthy participants have found that although the patterns of brain activation produced when accessing the same concept from pictures and words can overlap significantly, there are also differences (e.g., Gates & Yoon, 2005; Vandenberghe et al., 1996; see also Sevostianov et al., 2002). Bright, Moss, and Tyler (2004; see also Wright et al., 2008) performed a meta-analysis of four PET studies involving semantic categorization and lexical decision tasks with verbal and pictorial stimuli. They found evidence for a common semantic system for pictures and words in the left inferior frontal gyrus and left temporal lobe (anterior and medial fusiform, parahippocampal, and perirhinal cortices) and evidence for modality-specific activations for words in both temporal poles and for pictures in both occipito-temporal cortices. Overall, evidence from studies examining access to semantic knowledge from pictures versus words suggests that concepts are distributed patterns of brain activation that can be differentially tapped by stimuli in different formats.

Does the Type of Task Influence Semantic Memory Retrieval?

Retrieval from semantic memory can be influenced not only by the format of the stimuli used to elicit that information (as described above) but also by specifics of the task, such as the information that the participant is asked to produce and the amount of time provided to respond. For example, in an elegant PET experiment, Mummery and colleagues (1998) showed participants the names of living things or artifacts and asked them to make judgments about either a perceptual attribute (color) or a nonperceptual attribute (typical location). Different attribute judgments elicited distinct patterns of activation (increased activation in the left temporal-parietal-occipital junction for location and increased activation in the left anterior middle temporal cortex for color). Moreover, differences between attributes were larger than differences between categories (i.e., living things vs. artifacts), suggesting that the most prominent divisions in semantic memory may be associated with attributes rather than categories, a structure consistent with distributed, feature-based models of semantic memory (see also Moore & Price, 1999).

The amount of time provided to respond also appears to affect which aspects of a concept become active. In an early semantic priming study, Schreuder and colleagues (1984) observed that priming for perceptual information (e.g., between the concepts apple and ball, which are similar in shape) emerges when task demands encourage a rapid response, whereas priming for more abstract information (e.g., between apple and banana, which are from the same category) emerges only when responses are slower (see Yee et al., 2011, for converging evidence). More recently, Rogers and Patterson (2007) provided additional evidence that speed of response influences which semantic features are available: When participants were under time pressure, responses were more accurate for categorization judgments that did not require specific information, such as between categories (e.g., distinguishing birds from vehicles), and less accurate for categorization that did require access to specific information, such as within a category (e.g., distinguishing between particular kinds of birds). When participants were allowed more time to respond, the pattern reversed. Thus, the results of these studies suggest that the specifics of the task influence which aspects of a representation become measurably active.

In sum, retrieval from semantic memory can be influenced not only by the format of the stimuli used to elicit the information (e.g., words vs. pictures) but also by the timing of the task and the information that the participant is asked to provide.

Is Retrieval Influenced by Interactions Between Category and Task?

The format- and task-related effects reviewed earlier suggest that the most prominent division in semantic memory might be in terms of attribute domains and not, necessarily, category domains, thus offering support for distributed, feature-based models of semantic memory. Clearly, though, differences in format or task cannot account for the fact that differences between categories can be observed even with the same format and task. However, the presence of both format and task effects in semantic knowledge retrieval raises the possibility that interactions between stimulus modality and task type can elicit category effects that these factors do not produce independently. In this section we explore how the organization of semantic memory might accommodate stimulus, task, and category effects.

For instance, the particular combinations of sensorimotor attributes retrieved from semantic memory might be determined by an interaction between task type and sensorimotor experience (Thompson-Schill et al., 1999). For example, for living things, retrieval of both visual and nonvisual information should require activation of visual attributes because semantic memory about living things depends largely on knowledge about their visual features. To illustrate, people’s experience with zebras is largely visual; hence, retrieval of even nonvisual information about them (e.g., Do zebras live in Africa?) will engage visual attributes because one’s knowledge about zebras is built around their visual features (assuming that retrieving more weakly represented attributes depends on the activation of more strongly represented attributes; see Farah & McClelland, 1991). In contrast, for nonliving things, only retrieval of visual information should require activation of visual attributes. For instance, because people’s experience with microwave ovens is distributed across a wide range of properties (e.g., visual, auditory, tactile), retrieval of nonvisual information about them (e.g., Do microwave ovens require more electricity than refrigerators?) will not necessarily engage visual attributes.

Thompson-Schill and colleagues (1999) found evidence for just such a dissociation: The left fusiform gyrus (a region linked to visual knowledge) was activated by living things regardless of whether participants made judgments about their visual or nonvisual properties. In contrast, for nonliving things, the same visual region was active only when participants were asked to make judgments about visual properties. The complementary pattern has also been observed: A region linked to action information (the left posterior middle temporal cortex) was activated by tools for both action and nonaction tasks, but was activated by fruit only during an action task (Phillips et al., 2002). These and related findings (Hoenig et al., 2008) suggest that category-specific activations may reflect differences in which attributes are important for our knowledge of different object categories (but see Caramazza, 2000, for an alternative perspective).

Related work has demonstrated that ostensibly category-specific patterns can be eliminated by changing the task. Both patients with herpes simplex virus encephalitis and unimpaired participants exhibit apparently category-specific patterns when classifying objects at the “basic” level (i.e., at the level of dog or car) as revealed by errors or by functional activity in ventral temporal cortex, respectively. However, these differences can be made to disappear when objects are classified more specifically (e.g., Labrador or BMW, instead of dog or car; Lambon Ralph et al., 2007; Rogers et al., 2005). Why might level of classification matter? One possibility relates to correlated feature-based models (discussed earlier): Differences in the structure of the stimuli that are correlated with category may interact with the task (e.g., Humphreys et al., 1988; Price et al., 2003; Tarr & Gauthier, 2000; see also Cree & McRae, 2003). For instance, at the basic level, animals typically share more features (e.g., consider dog vs. goat) than do vehicles (e.g., car vs. boat). This greater similarity for animals may produce a kind of “crowding” that makes them particularly difficult to differentiate at the basic level (e.g., Rogers et al., 2005; Noppeney et al., 2007; Tyler & Moss, 2001; but cf. Wiggett et al., 2009, who find that interactions between category and task do not always modulate category effects).

Hence, the studies described in this section provide further evidence that apparently category-specific patterns may be due to interactions between stimuli and task.
More broadly, numerous studies have explored whether semantic memory is organized in the brain by object category, by perceptual or functional features, or by a multimodal distributed network of attributes. Thus far, the findings are compatible with correlated feature and sensorimotor-based accounts and appear to suggest a highly interactive distributed semantic system that is engaged differently depending on object category and task demands (for a review, see Thompson-Schill, 2003).
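The "crowding" idea from this section, that basic-level animals share more features with one another than basic-level vehicles do, can be made concrete with a small sketch. The feature lists below are hypothetical and chosen only for illustration; they are not taken from any of the cited studies, and Jaccard overlap is just one of several reasonable similarity measures.

```python
from itertools import combinations

# Hypothetical toy feature sets; illustrative only, not data from any cited study.
FEATURES = {
    "dog":  {"legs", "fur", "tail", "eyes", "ears", "breathes"},
    "goat": {"legs", "fur", "tail", "eyes", "ears", "breathes", "horns"},
    "car":  {"wheels", "engine", "doors", "metal"},
    "boat": {"hull", "engine", "floats", "metal"},
}

def jaccard(a, b):
    """Proportion of shared features: size of intersection over size of union."""
    return len(a & b) / len(a | b)

def mean_overlap(names):
    """Average pairwise feature overlap among the named concepts."""
    pairs = list(combinations(names, 2))
    return sum(jaccard(FEATURES[x], FEATURES[y]) for x, y in pairs) / len(pairs)

animal_overlap = mean_overlap(["dog", "goat"])
vehicle_overlap = mean_overlap(["car", "boat"])
print(f"animals: {animal_overlap:.2f}, vehicles: {vehicle_overlap:.2f}")
```

With these assumed features the animal pair overlaps more than the vehicle pair, which is the kind of structural difference that could make basic-level discrimination harder for animals regardless of any category-specific neural organization.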

Do the Same Neural Regions Underlie Perceptual and Conceptual Processing of Objects?

The preceding evidence largely supports one main tenet of sensorimotor, feature-based accounts: that semantic memory is distributed across different brain regions. However, an additional claim of sensorimotor theory is that the brain regions that are involved when perceiving and interacting with an object also encode its meaning. To address this claim, research has attempted to explore the extent to which the different sensorimotor properties of an object (e.g., its color, action, or sound) activate the same neural systems as actually perceiving these properties.

With respect to color, for example, Martin and colleagues (1995) measured changes in regional cerebral blood flow using PET when participants generated the color or the action associated with pictures of objects or their written names. Generating color words led to activation in the ventral temporal lobe in an area anterior to that implicated in color perception, whereas generating action words was associated with activation in the middle temporal gyrus just anterior to a region identified in the perception of motion. Martin and colleagues interpreted these results as indicative of a distributed semantic memory network organized according to one’s sensorimotor experience of different object attributes (see also Ishai et al., 2000; Wise et al., 1991). More recent studies have reported some direct overlap⁵ between regions involved in color perception and those involved in retrieval of color knowledge about objects (Hsu et al., 2011; Simmons et al., 2007).

With respect to action, analogous findings have been reported regarding overlap between perceptual-motor and conceptual processing. Chao and Martin (2000; see also Chao, Haxby, & Martin, 1999; Gerlach et al., 2000) showed that the left ventral premotor and left posterior parietal cortices (two areas involved in planning and performing actions) are selectively active when participants passively view or name pictures of manipulable tools. The involvement of these regions despite the absence of a task requiring the retrieval of action information (i.e., even during passive viewing) can be explained if the representations of manipulable objects include areas involved in planning and performing actions. In a recent study (Yee, Drucker, & Thompson-Schill, 2010) we obtained additional evidence supporting this hypothesis: In left premotor cortex and inferior parietal sulcus, the neural similarity of a pair of objects (as measured by fMRI-adaptation; see later) is correlated with the degree of similarity in the actions used to interact with them. For example, a piano and a typewriter, which we interact with using similar hand motions, have similar representations in action regions, just as they should if representations are sensorimotor based.
Moreover, reading action words (e.g., lick, pick, kick) produces differential activity in or near motor regions activated by actual movement of the tongue, fingers, and feet, respectively (Hauk et al., 2004). Interestingly, it appears that this motor region activation can be modulated by task: Reading an action verb related to leg movement (e.g., kick) activates motor regions in literal (kick the ball) but not figurative (kick the bucket) sentences (Raposo et al., 2009).

Although visual and motor features have been studied most often, other modalities also supply evidence for overlap between conceptual and perceptual processing. Regions involved in auditory perception and processing (posterior and superior middle temporal gyri) are active when reading the names of objects that are strongly associated with sounds (e.g., telephone; Kiefer et al., 2008; see also Goldberg et al., 2006; Kellenbach et al., 2001; Noppeney & Price, 2002). Similarly, an orbitofrontal region associated with taste and smell is activated when making decisions about objects’ flavor (Goldberg et al., 2006), and simply reading words with strongly associated smells (e.g., cinnamon) activates primary olfactory areas (Gonzalez et al., 2006).

Patients with brain damage affecting areas involved in sensorimotor processing are also relevant to the question of whether regions underlying perception and action also underlie conceptual knowledge. A sensorimotor-based account would predict that damage to an auditory, visual, or motor area (for example) should affect the ability to retrieve auditory, visual, or motor information about an object, whereas access to features corresponding to undamaged brain regions would be less affected. There is evidence that this is indeed the case. For instance, patients with damage to left auditory association cortex have problems accessing concepts for which sound is highly relevant (e.g., thunder or telephone; Bonner & Grossman, 2012; Trumpp et al., 2013). Likewise, a patient with damage to areas involved in visual processing (right inferior occipito-temporal junction) had more difficulty naming pictures of objects whose representations presumably rely on visual information (e.g., living things that are not ordinarily manipulated) than objects whose representations are presumably less reliant on visual information (e.g., living or nonliving things that are generally manipulated); the patient’s encyclopedic and auditory knowledge about both types of objects, in contrast, was relatively preserved (Wolk et al., 2005). Similarly, apraxic patients, who have difficulty performing object-related actions (and who typically have damage to the premotor or parietal areas subserving these actions), show abnormally delayed access to manipulation information about objects (Myung et al., 2010). Studies with normal participants using transcranial magnetic stimulation (TMS), which produces a temporary and reversible “lesion,” likewise suggest that motor areas are involved in processing motor-related concepts (e.g., Pobric et al., 2010; see Hauk et al., 2008, for review), as do studies requiring normal participants to perform an explicit motor task designed to interfere with activating object-appropriate motor programs (e.g., Witt et al., 2010; Yee et al., 2013). Finally, Gainotti (2000) conducted a comprehensive review of category-specific deficits, focusing on relationships between location of brain damage and patterns of impairment.
These relationships, Gainotti observed, suggest that the categorical nature of the deficits is produced by correlations between (damaged) brain regions and sensorimotor information that is central to various categories. Overall, findings from neuroimaging, neuropsychological, and TMS studies converge to suggest that semantic knowledge about objects is built around their sensorimotor attributes and that these attributes are stored in sensorimotor brain regions.

Which Neural Regions Underlie the Generalization of Semantic Knowledge?

A critical function of semantic memory is the ability to generalize (or abstract) over our experiences with a given object. Such generalization permits us to derive a representation that will allow us to recognize new exemplars of it and make predictions about aspects of these exemplars that we have not directly perceived. For example, during analogical thinking, generalization is critical to uncover relationships between a familiar situation and a new situation that may not be well understood (e.g., that an electron is to the nucleus like a planet is to the sun). Thus, analogical thinking involves not only retrieving information about the two situations but also a mapping between their surface elements based on shared abstract relationships (see Chrysikou & Thompson-Schill, 2010). Similarly, knowing that dogs and cats are both animals (i.e., mapping them from their basic to their superordinate level categories) may facilitate generalization from one to the other. A full treatment of the process of generalization would be beyond the scope of this chapter. However, we briefly touch on some of the things that cognitive neuroscience has revealed about the generalization process.

Several findings are consistent with the idea that different brain regions support different levels of representation. For instance, an anterior temporal region (the perirhinal cortex, particularly in the left) was activated when naming pictures at the basic level (e.g., dog or hammer), but not at the superordinate level (e.g., living or manmade), whereas a posterior temporal region (fusiform gyrus bilaterally) was activated for both levels (Tyler et al., 2004, but cf. Rogers et al., 2006). In addition, greater anterior temporal lobe activity has been observed during word–picture matching at a specific level (e.g., robin? kingfisher?) than at a more general level (e.g., animal? vehicle?; Rogers et al., 2006). Further, processing may differ for different levels of representation: Recordings of neural activity (via magnetoencephalography) suggest that during basic level naming, there are more recurrent interactions between left anterior and left fusiform regions than during superordinate level naming (Clark et al., 2011).

One interpretation of these findings is that there exists a hierarchically structured system along a posterior-anterior axis in the temporal cortex, with posterior regions more involved in coarse processing (such as the presemantic, perceptual processing required for superordinate category discrimination) and anterior regions more involved in the integration of information across modalities that facilitates basic-level discrimination (e.g., cat vs. dog; see Martin & Chao, 2001).
More broadly, these and related findings (e.g., Chan et al., 2011; Grabowski et al., 2001; Kable et al., 2005) are consistent with the idea that semantic knowledge is represented at different levels of abstraction in different regions (see also Hart & Kraut, 2007, for a mechanism by which different types of knowledge could be integrated). If true, this may be relevant to a puzzle that has emerged in neuroimaging tests of Allport’s (1985) sensorimotor model of semantic memory. There is a consistent trend for retrieval of a given physical attribute to be associated with activation of cortical areas 2 to 3 cm anterior to regions associated with perception of that attribute (Thompson-Schill, 2003). This pattern, which has been interpreted as coactivation of the “same areas” involved in sensorimotor processing, as Allport hypothesized, could alternatively be used as grounds to reject the Allport model. What does this anterior shift reflect?

We believe the answer may lie in ideas developed by Rogers and colleagues (2004). They have articulated a model of semantic memory that includes units that integrate information across all of the attribute domains (including verbal descriptions and object names; McClelland & Rogers, 2003). As a consequence, “abstract semantic representations emerge as a product of statistical learning mechanisms in a region of cortex suited to performing cross-modal mappings by virtue of its many interconnections with different perceptual-motor areas” (Rogers et al., 2004, p. 206). The process of abstracting away from modality-specific representations may occur gradually across a number of cortical regions (perhaps converging on the temporal pole). As a result, a gradient of abstraction may emerge in the representations throughout a given region of cortex (e.g., the ventral extrastriate visual pathway), and the anterior shift may reflect activation of a more abstract representation (Kosslyn & Thompson, 2000). In other words, the conceptual similarity space in more anterior regions may depart a bit from the similarity space in the environment, moving in the direction of abstract relations.

A gradient like this could also help solve another puzzle: If concepts are sensorimotor based, one might worry that thinking of a concept would cause one to hallucinate it or execute it (e.g., thinking of lemon would cause one to hallucinate a lemon, and thinking of kicking would produce a kick). But if concepts are represented (at least in part) at a more abstract level than that which underlies direct sensory perception and action, then the regions that underlie, for example, action execution, need not become sufficiently active to produce action. More work is needed to uncover the nature of the representations and how the similarity space may gradually change across different cortical regions.

Summary of the Neural Systems Supporting Semantic Memory

In this section we have briefly summarized a large body of data on the neural systems supporting semantic memory (see Noppeney, 2009, for a more complete review of functional neuroimaging evidence for sensorimotor-based models). We suggested that in light of the highly consistent finding that sensorimotor regions are active during concept retrieval, the data largely support sensorimotor-based models of semantic memory. However, there is a question that is frequently raised about activation in sensorimotor regions during semantic knowledge retrieval: Could it be that the activation of sensorimotor regions that has been observed in so many studies is “epiphenomenal”⁶ rather than indicating that aspects of semantic knowledge are encoded in these regions? (See Mahon & Caramazza, 2008, for discussion.) For example, perhaps activation in visual areas during semantic processing is a consequence of generating visual images, and not of semantic knowledge per se. The patient, TMS, and behavioral interference work described above helps to address this question: It is not clear how an epiphenomenal account would explain the fact that lesioning or interfering with a sensorimotor brain region affects the ability to retrieve the corresponding attribute of a concept. These data therefore suggest that semantic knowledge is at least partially encoded in sensorimotor regions.

However, the task effects described above raise another potential concern. Traditionally, in the study of semantic representations (and, in fact, in cognitive psychology more broadly) it is assumed that only effects that can be demonstrated across a variety of contexts should be considered informative with regard to the structure and organization of semantic memory. If one holds this tenet, then these task effects are problematic.
Yet, as highlighted by the work described in this section, task differences can be accommodated if one considers an important consequence of postulating that the representations of concepts are distributed (recall that all but traditional approaches allow for a distributed architecture): Distributed models allow attention to be independently focused on specific (e.g., contextually relevant) properties of a representation through partial activation of the representation (see Humphreys & Forde, 2001, for a description of one such model). This means that if a task requiring retrieval of action information, for example, produces activation in premotor and parietal regions, but a task requiring retrieval of color does not, the discrepancy may reflect differential focus of attention within an object concept rather than the absence of either attribute from the object concept. Thus, the differences between effects that emerge in different contexts lead to important questions, such as how we are able to flexibly focus attention on relevant attributes. We turn to this in the next section.

Biasing Semantic Representations

If our semantic knowledge is organized in a multimodal, highly interactive, distributed system, how is it that we are able to weight certain attributes more heavily than others depending on the circumstance, so that we can, for example, retrieve just the right combinations of features to identify or answer questions about concepts like a horse, a screwdriver, or an airplane? In other words, how does our brain choose, for a given object and given the demands of the task at hand, the appropriate pattern of activation? A number of studies have suggested that the prefrontal cortex, particularly the left ventrolateral regions, produces a modulatory signal that biases the neural response toward certain patterns of features (e.g., Frith, 2000; Mechelli et al., 2004; Miller & Cohen, 2001; Noppeney et al., 2006). For example, when, during semantic knowledge retrieval, competition among different properties is high, a region in the left inferior frontal gyrus is activated (Thompson-Schill et al., 1997; see also Kan & Thompson-Schill, 2004; Thompson-Schill et al., 1998; Thompson-Schill, D’Esposito, et al., 1999).

Several mechanisms have been proposed regarding this region’s role in selective activation of conceptual information, among them that prefrontal cortical activity during semantic tasks reflects the maintenance of different attributes in semantic memory (e.g., Gabrieli et al., 1998) or that this region performs a “controlled retrieval” of semantic information (e.g., Badre & Wagner, 2007). We and others have suggested that this region, although critical in semantic memory retrieval, performs a domain-general function as a dynamic filtering mechanism that biases neural responses toward task-relevant information while gating task-irrelevant information (Shimamura, 2000; Thompson-Schill, 2003; Thompson-Schill et al., 2005).
In other words, when a context or task requires us to focus on specific aspects of our semantic memory, the left ventrolateral prefrontal cortex biases which aspects of our distributed knowledge system will be most active.
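One simple way to picture the dynamic filtering idea is as a multiplicative gain applied to a distributed feature vector. The sketch below is a toy illustration under that assumption, not a model proposed in the work cited above; the feature names, activation values, and the `gain` and `gate` parameters are all hypothetical.

```python
# Hypothetical feature activations for one object concept (arbitrary units).
concept = {"color": 0.6, "shape": 0.7, "action": 0.8, "sound": 0.2}

def bias(representation, relevant, gain=1.5, gate=0.3):
    """Scale task-relevant features up by `gain` and task-irrelevant ones down by `gate`."""
    return {
        feature: value * (gain if feature in relevant else gate)
        for feature, value in representation.items()
    }

# The same stored concept yields different activation patterns under different tasks.
action_task = bias(concept, relevant={"action"})
color_task = bias(concept, relevant={"color"})
print(action_task)
print(color_task)
```

In this sketch the stored representation never changes; only the task-dependent weighting does, so a feature that looks inactive in one task can still be part of the concept.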

Individual Differences in Access to and in the Organization of Semantic Memory

Earlier, we discussed two types of evidence that support sensorimotor models of semantic memory: (1) sensorimotor regions are active during concept retrieval, and (2) damage to a sensorimotor region affects the ability to retrieve the corresponding attribute of an object. However, we have not yet addressed an additional claim of sensorimotor-based theories: If it is true that the sensorimotor regions that are active when an object is perceived are the same regions that represent its meaning, then an individual’s experience with that object should shape the way it is represented. In other words, the studies that we have described so far have explored the way that concepts are represented in the “average” brain, and the extent to which the findings have been consistent presumably reflects the commonalities in human experience. Yet studying the average brain does not allow us to explore whether, as predicted by sensorimotor-based theories, differences in individuals’ experiences result in differences in their representation of concepts. In this section we describe some ways in which individual differences influence the organization of semantic memory.

Are There Individual Differences in Semantic Representations?

Semantic representations appear to vary as a consequence of lifelong individual differences in sensorimotor experience: For instance, recruitment of left parietal cortex (a region involved in object-related action) during the retrieval of object shapes was modulated by the amount of lifetime tactile experience associated with the objects (Oliver et al., 2009). Similarly, right- and left-handed people, who use their hands differently to perform various actions with manipulable objects, employ homologous but contralateral brain regions to represent those objects: When participants named tools, handedness influenced the lateralization of premotor activity (Kan et al., 2006). Critically, handedness was not a predictor of premotor lateralization for objects that are not acted on manually (animals). In related work, while reading manual action verbs (e.g., write, throw), right-handed participants activated primarily left premotor cortex regions, whereas left-handed participants activated primarily right premotor cortex regions (Willems et al., 2010). No such difference was observed for nonmanual action verbs (e.g., kneel, giggle). Analogous findings have been observed for long-term experience with sports: When reading sentences describing ice hockey (but not when reading about everyday experiences), professional ice hockey players activated premotor regions more than nonplayers did (Beilock et al., 2008). Further, such differences are not limited to motor experience: When professional musicians identified pictures of musical instruments (but not control objects), they activated auditory association cortex and adjacent areas more than nonmusicians did (Hoenig et al., 2011).

Even with much less than a lifetime of experience, the neural representation of an object can reflect specific experience with it.
Oliver and colleagues (2008) asked one set of participants to learn (by demonstration) actions for a set of novel objects, perform those actions, and also view the objects, whereas a second set of participants viewed the same objects without learning actions but had the same total amount of exposure to them. In a subsequent fMRI session in which participants made judgments about visual properties of the objects, activity in parietal cortex was found to be modulated by the amount of tactile and action experience a participant had with a given object. These and related findings (Kiefer et al., 2007; Weisberg et al., 2007) demonstrate a causal link between experience with an object and its neural representation, and also show that even relatively short-term differences in sensorimotor experience can influence an object’s representation.

Intriguingly, changes in individual experience may also lead to changes in the representation of abstract concepts. Right-handers’ tendency to associate “good” with “right” and “bad” with “left” (Casasanto, 2009) can be reversed when right-hand dominance is compromised because of stroke or a temporary laboratory-induced handicap (Casasanto & Chrysikou, 2011).

What Happens When a Sensory Modality Is Missing?

As would be expected given the differences observed for handedness and relatively short-term experience, more dramatic differences in individual experience have also been shown to affect the organization of semantic knowledge. For instance, color influences implicit similarity judgments for sighted but not for blind participants (even when blind participants have good explicit color knowledge of the items tested; Connolly et al., 2007). Interestingly, this difference held only for fruits and vegetables, and not for household items, consistent with a large literature demonstrating that the importance of color for an object concept varies according to how useful it is for recognizing the object (see Tanaka & Presnell, 1999, for review).

However, differences in sensory experience do not always produce obvious differences in the organization of semantic knowledge. For instance, when making judgments about hand action, blind, like sighted, participants selectively activate left posterior middle temporal areas that in sighted people have been associated with processing visual motion (Noppeney et al., 2003). Furthermore, blind participants demonstrate category-specific (nonliving vs. animal, in this case) activation in the same visual areas as sighted participants (ventral temporal and ventral occipital regions; Mahon et al., 2009). Because sensorimotor-based theories posit that visual experience accounts for the activation in visual areas, the findings in these two studies may appear to be inconsistent with sensorimotor-based theories and instead suggest an innate specification of action representation or of living/nonliving category differences.
However, given the substantial evidence that cortical reorganization occurs if visual input is absent (for a review, see Amedi, Merabet, Bermpohl, & Pascual-Leone, 2005), another possibility is that in blind participants these “visual” regions are sensitive to nonvisual factors (e.g., shape information that is acquired tactilely) that correlate with hand action and with the living/nonliving distinction.

Summary of Individual Differences in Semantic Memory

At first glance, the individual differences that we have described in this section may seem surprising. If our concept of a lemon, for example, is determined by experience, then no two individuals’ concepts of a lemon will be exactly the same. Further, your own concept of a lemon is likely to change subtly over time, probably without conscious awareness. Yet the data described above suggest that this is, in fact, what happens. Because sensorimotor-based models assume that our representations of concepts are based on our experiences with them, these models can easily account for, and in fact predict, these differences and changes. It is a challenge for future research to explore whether there are factors that influence the extent to which we attend to different types of information, and that constrain the degree to which representations change over time.

Abstract Knowledge

Our discussion of the organization of semantic memory has thus far focused primarily on the physical properties of concrete objects. Clearly, though, a complete theory of semantic memory must also provide an account of how we represent abstract concepts (e.g., peace) as well as abstract features of concrete objects (e.g., “used to tell time” is a property of a clock). According to the “concreteness effect,” concrete words are processed more easily than abstract words (e.g., Paivio, 1991) because their representations include sensory information that abstract words lack. However, there have been reports of semantic dementia patients who have more difficulty with concrete than abstract words (Bonner et al., 2009; Breedin et al., 1994; but cf. Hoffman & Lambon Ralph, 2011, and Jefferies et al., 2009, for evidence that the opposite pattern is more common in semantic dementia), suggesting that there must be more to the difference between these word types than quantity of information. Additional evidence for a qualitative difference between the representations of concrete and abstract words comes from work by Crutch and Warrington (2005). They reported a patient, AZ, with left temporal, parietal, and posterior frontal damage, who, for concrete words, exhibits more interference from words closely related in meaning (e.g., synonyms) than from “associated” words (i.e., words that share minimal meaning but often occur in similar contexts), whereas for abstract words, she displays the opposite pattern.

Neuroimaging studies that have compared abstract and concrete words have identified an inconsistent array of regions associated with abstract concepts: the left superior temporal gyrus (Wise et al., 2000), right anterior temporal pole, or left posterior middle temporal gyrus (Grossman et al., 2002). These inconsistencies may be due to the differing demands of the tasks employed in these studies or to differences in how “abstract” is operationalized.
The operational definition of abstract may be particularly important because it varies widely across studies, ranging from words without sensorimotor associations to words that have low imageability (i.e., words that are difficult to visualize) to emotion words (e.g., love). We surmise that these differences likely have a particularly significant influence on where brain activation is observed.

Using abstract stimuli intended to have minimal sensorimotor associations, Noppeney and Price (2004) compared fMRI activation while subjects made judgments about words (comprising nouns, verbs, and adjectives) referring to visual, auditory, manual action, and abstract concepts. Relative to the other conditions, abstract words activated the left inferior frontal gyrus, middle temporal gyrus, superior temporal sulcus, and anterior temporal pole. Because these are classical “language” areas, the authors suggest that the activations are a consequence of the representations of abstract words being more reliant on contextual information provided by language. Recently, Rodriguez and colleagues (2011) observed activation in these same regions for abstract verbs. They also observed that a greater number of regions were active for abstract relative to concrete verbs, leading them to hypothesize that because abstract words appear in more diverse contexts (Hoffman et al., 2011), the networks supporting them are more broadly distributed.

Like abstract words, abstract features (e.g., “used to tell time”) have no direct sensorimotor correlates. Our ability to conceive of abstract concepts and features (i.e., knowledge that cannot be directly perceived from any individual sensory modality) demonstrates that there must be more to semantic knowledge than simple sensorimotor echoes. How might abstract concepts or features be represented in the kind of distributed architecture that we have described? Rogers and colleagues’ model of semantic memory (introduced above in the context of generalization) may be of service here as well. They argue that the interaction between content-bearing perceptual representations and verbal labels produces a similarity space that is not captured in any single attribute domain, but rather reflects abstract similarity (cf. Caramazza, Hillis, Rapp, & Romani, 1990; Chatterjee, 2010; Damasio, 1989; Plaut, 2002; Tyler, Moss, Durrant-Peatfield, & Levy, 2000).

Based on the abundant interconnections between the temporal pole and different sensorimotor areas, and on the fact that temporal pole degeneration is associated with semantic dementia (introduced earlier), Rogers and colleagues suggest that this region may support abstract knowledge and generalization. Semantic dementia, in particular, has had a large influence on ideas about the anterior temporal lobes’ role in semantic memory. In this disorder, relatively focal degeneration in the anterior temporal lobes accompanies semantic memory deficits (e.g., problems naming, recognizing, and classifying objects, regardless of category), whereas other cognitive functions are relatively spared (see Hodges & Patterson, 2007, for a review).
The concomitance of the anatomical and cognitive impairments in semantic dementia therefore lends credence to the idea that the anterior temporal lobes are important for supporting semantic memory (see Patterson et al., 2007, for a review). Additional research is needed to explore whether brain regions beyond the anterior temporal lobe serve similar “converging” functions.

Methodological Advances

The studies reviewed in this chapter employed behavioral, neuropsychological, and neuroimaging techniques to explore the organization and function of semantic memory. A number of methodologies that have recently been introduced in cognitive neuroscience hold much promise for the study of semantic memory.

First, new approaches in experimental design and data analysis for neuroimaging-based studies allow cognitive neuroscientists to address more fine-grained questions about the neural representation of concepts. For example, questions relating to representational similarity can be explored with fMRI adaptation (e.g., Grill-Spector & Malach, 2001). This technique relies on the assumption that when stimuli that are representationally similar are presented sequentially, the repeated activation of the same set of neurons will produce a reduced fMRI response. If the stimuli are representationally distinct, no such adapted response should be observed. This method can be applied to address a number of questions pertaining, for instance, to relationships between regions implicated in the processing of different object attributes (e.g., color, shape, and size; see Yee et al., 2010, for function and manipulation), or to the degree to which the same neurons are involved in perception and in conceptual representation. Similarly, multivoxel pattern analysis (e.g., Mitchell et al., 2008; Norman et al., 2006; Weber et al., 2009) and functional connectivity approaches allow for analyses that exploit the distributed nature of brain activation, rather than focusing on focal activation peaks (see Rissman & Wagner, 2012).

Second, noninvasive brain stimulation techniques, specifically TMS and transcranial direct current stimulation (tDCS), allow researchers to temporarily “lesion” a given brain region and observe the effects on behavior (e.g., Antal et al., 2001, 2008; Walsh & Pascual-Leone, 2003). In contrast to studying patients in the months and years after brain injuries that produce permanent lesions, using these “virtual lesions” allows cognitive neuroscientists to examine the role of a given brain region without the possibility that reorganization of neural function has occurred.

Third, cognitive neuroscience has benefited from advances in eye-tracking research, in which eye movements to objects are monitored as participants listen to spoken language (Cooper, 1974; Tanenhaus et al., 1995). Hearing a word (e.g., piano) produces eye movements toward pictures of semantically related objects (e.g., a trumpet; Yee & Sedivy, 2006), and the probability of looking at the related object is predicted by how far away it is in “semantic space” (calculated in terms of the degree of featural overlap; Huettig & Altmann, 2005).
This semantic eye-tracking paradigm has been used to explore specific dimensions of featural overlap (e.g., shape, color, manipulation) and is well suited to investigating semantic representations in patients with brain damage (Mirman & Graziano, 2012; Myung et al., 2010). Such behavioral paradigms inform cognitive neuroscience of the behavioral consequences of the manner in which semantic memory is organized.
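The idea of distance in “semantic space” computed from featural overlap can be made concrete with a toy sketch. The feature lists below are invented purely for illustration, and cosine overlap between binary feature sets is only one plausible measure; it is an assumption here, not necessarily the calculation used in the studies cited above.

```python
import math

# Hypothetical feature norms, invented for illustration only.
FEATURES = {
    "piano": {"makes_music", "has_keys", "is_large", "made_of_wood"},
    "trumpet": {"makes_music", "made_of_metal", "is_blown"},
    "lemon": {"is_yellow", "is_sour", "is_edible"},
}


def featural_overlap(a, b):
    """Cosine similarity between the binary feature sets of two concepts."""
    fa, fb = FEATURES[a], FEATURES[b]
    if not fa or not fb:
        return 0.0
    # For binary vectors, the dot product is the number of shared features
    # and each vector's norm is the square root of its feature count.
    return len(fa & fb) / math.sqrt(len(fa) * len(fb))


def rank_by_relatedness(target, candidates):
    """Order candidate concepts from most to least overlap with the target."""
    return sorted(candidates, key=lambda c: featural_overlap(target, c), reverse=True)
```

Under these made-up features, `rank_by_relatedness("piano", ["lemon", "trumpet"])` places the trumpet ahead of the lemon, mirroring the qualitative prediction that looks to a pictured object increase with its featural overlap with the spoken word.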

Implications and Future Directions

Is There Something Special about Action?

Much of the work in cognitive neuroscience that has been the focus of this chapter indicates that semantic representations are at least partially sensorimotor based. One sensorimotor modality in particular, action, has received a great deal of attention, perhaps because of the discovery of “mirror neurons,” cells that respond both when an action is perceived and when it is performed (Rizzolatti & Craighero, 2004). This focus on action has led to a common criticism of sensorimotor-based theory: Being impaired in performing actions does not entail being unable to conceive of objects with strongly associated actions, suggesting that action may not, in fact, be part of these conceptual representations.7

There are at least three important points to keep in mind with respect to this criticism. First, concepts are more than associated actions (and in fact many concepts, e.g., bookshelf or tree, may have only weakly associated actions, if any). As a result, sensorimotor-based representations can include many different components (e.g., visual, auditory, and olfactory as well as action oriented) that are distributed across cortex. For this reason, under a sensorimotor-based account it would be surprising if all of these components were damaged simultaneously. This means that losing part of a representation does not entail losing the entire concept, just as losing one finger from a hand does not entail loss of the entire hand. Moreover, as highlighted in our discussion of abstract features, all of the various sensorimotor components still make up only part of conceptual knowledge, because semantic knowledge is only partially sensorimotor. Second, even concepts that at first glance seem predominantly action based generally comprise more than action alone. For example, our knowledge of kicking may include not only the action but also the contexts in which kicking is likely to occur (see Taylor & Zwaan, 2009, for a discussion of the many possible components of action knowledge and the resulting implications for “fault-tolerant comprehension”). Third, recent research (reviewed earlier) suggests that depending on the demands of the task, we are able to dynamically focus our attention on different aspects of a concept.

This means that sensorimotor-based distributed models are not inconsistent with the finding that an action is not routinely activated when the concept is activated, or that patients with disorders of action can respond successfully to concepts that are action based if the task does not require access to action information. In fact, such findings fall naturally out of the architecture of these models. Such models allow for semantic memory to exhibit some degree of graceful degradation (or fault tolerance) in that representations can continue to be accessed despite the loss of some of their components.

Is Semantic Memory Really “Shared Knowledge”?

Semantic memory is often referred to as “shared knowledge,” to distinguish it from the individual experiences that make up episodic memory. Yet in this chapter we have emphasized that individual experience, task, and context all influence the extent to which different aspects of an object’s representation become active over time. Thus, when conceiving of an object, there may be no fixed representational “outcome” that is stable across different episodes of conceiving of it (or even across time within an episode), let alone across individuals. This raises a significant challenge for how to define and understand semantic memory: Because semantic memory is “shared knowledge” only to the extent that our experiences (both long and short term) are shared, understanding the representation and retrieval of semantic knowledge may depend on our ability to describe aspects of these representations that are not shared. Future work must therefore do more than discover the extent to which various attributes are routinely activated for certain concepts. It should also attempt to characterize variations in the neural bases of semantic memory, as well as the neural mechanisms by which context or task demands modulate which aspects of a concept are activated (and at what rate), allowing for continuously changing outcomes (for further discussion, see Spivey, 2007).


From Categories to Semantic Spaces

Many of the studies described in this chapter explored the organization of semantic memory by comparing the neural responses to traditionally defined categories (e.g., animals vs. tools). However, a more fruitful method of understanding conceptual representations may be to compare individual concepts to one another, and extract dimensions that describe the emergent similarity space. The newer methods of analyzing neuroimaging data discussed above (such as fMRI adaptation and multivoxel pattern analysis, or MVPA) are well suited to the task of describing these types of neural similarity spaces. Further, by making inferences from these spaces, it is possible to discover what type of information is represented in a given cortical region (e.g., Mitchell et al., 2008; Weber et al., 2009; Yee et al., 2010). Overall, our understanding of semantic memory can benefit more from studying individual items (e.g., Bedny et al., 2007) and their relations to each other, than from simply examining categories as unified wholes.
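As a rough illustration of what comparing individual concepts to one another might look like in practice, the sketch below builds a pairwise dissimilarity matrix from item-level response patterns. The “voxel” values are invented, and correlation distance (1 minus Pearson r) is assumed here as the dissimilarity measure; it is one common choice in pattern-based analyses, not a method attributed to any of the studies cited above.

```python
import math

# Hypothetical response patterns (one value per voxel), invented for illustration.
PATTERNS = {
    "hammer": [0.9, 0.8, 0.1, 0.2],
    "wrench": [0.8, 0.9, 0.2, 0.1],
    "dog": [0.1, 0.2, 0.9, 0.7],
}


def pearson(x, y):
    """Pearson correlation between two equal-length response patterns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def dissimilarity_matrix(patterns):
    """Map each ordered pair of items to its correlation distance (1 - r)."""
    names = sorted(patterns)
    return {
        (a, b): 1.0 - pearson(patterns[a], patterns[b])
        for a in names
        for b in names
    }


rdm = dissimilarity_matrix(PATTERNS)
```

With these toy patterns the two tools end up close together and far from the animal, so a category-like structure emerges from item-level similarities without being imposed in advance.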

Where Does Semantic Memory Fit in the Traditional Taxonomy of Memory?

Traditionally, semantic memory is considered to be part of the declarative (explicit) memory system (Squire, 1987). Yet the sensorimotor-based frameworks we have discussed in this chapter suggest that semantic memory is also partially composed of information contained in sensorimotor systems and can be probed through (implicit) perceptual priming. The amnesic patients we discussed in the first section of this chapter also support the idea that semantic memory is at least partially implicit, in that they are able to acquire some semantic knowledge despite severely impaired episodic memories. Hence, the current conception of semantic memory does not seem to fit cleanly into existing descriptions of either declarative (explicit) or nondeclarative (implicit) memory. Rather, our knowledge about the world and the objects in it appears to rely on both declarative and nondeclarative memory.

Summary

In this chapter we have briefly summarized a wide variety of data pertaining to the cognitive neuroscience of semantic memory. We reviewed different schemes for characterizing the organization of semantic memory and argued that the bulk of the evidence converges to support sensorimotor-based models (which extend sensory-functional theory). Because these models allow for, and in fact are predicated on, a role for degree and type of experience (which will necessarily vary by individual and by concept), they are able to accommodate a wide variety of observations. Importantly, they can also make specific, testable predictions regarding experience. Finally, it is important to emphasize that although often pitted against one another in service of testing specific hypotheses, sensorimotor and correlated feature-based models are not at odds with a categorical-like organization. In fact, both were developed to provide a framework in which a categorical organization can emerge from commonalities in the way we interact with and experience similar objects.


References

Allport, D. A. (1985). Distributed memory, modular subsystems and dysphasia. In S. K. Newman & R. Epstein (Eds.), Current perspectives in dysphasia (pp. 207–244). Edinburgh: Churchill Livingstone.
Amedi, A., Merabet, L., Bermpohl, F., & Pascual-Leone, A. (2005). The occipital cortex in the blind: Lessons about plasticity and vision. Current Directions in Psychological Science, 16, 306–311.
Antal, A., Nitsche, M. A., & Paulus, W. (2001). External modulation of visual perception in humans. NeuroReport, 12, 3553–3555.
Antal, A., & Paulus, W. (2008). Transcranial direct current stimulation of visual perception. Perception, 37, 367–374.
Badre, D., & Wagner, A. D. (2007). Left ventrolateral prefrontal cortex and the cognitive control of memory. Neuropsychologia, 45, 2883–2901.
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22, 577–660.
Bedny, M., Aguirre, G. K., & Thompson-Schill, S. L. (2007). Item analysis in functional magnetic resonance imaging. NeuroImage, 35, 1093–1102.
Beilock, S. L., Lyons, I. M., Mattarella-Micke, A., Nusbaum, H. C., & Small, S. L. (2008). Sports experience changes the neural processing of action language. Proceedings of the National Academy of Sciences, 105, 13269–13273.
Bonner, M. F., & Grossman, M. (2012). Gray matter density of auditory association cortex relates to knowledge of sound concepts in primary progressive aphasia. Journal of Neuroscience, 32(23), 7986–7991.
Bonner, M. F., Vesely, L., Price, C., Anderson, C., Richmond, L., Farag, C., et al. (2009). Reversal of the concreteness effect in semantic dementia. Cognitive Neuropsychology, 26, 568–579.
Bozeat, S., Lambon Ralph, M. A., Patterson, K., Garrard, P., & Hodges, J. R. (2000). Nonverbal semantic impairment in semantic dementia. Neuropsychologia, 38, 1207–1215.
Bozeat, S., Lambon Ralph, M. A., Patterson, K., & Hodges, J. R. (2002a). The influence of personal familiarity and context on object use in semantic dementia. Neurocase, 8, 127–134.
Bozeat, S., Lambon Ralph, M. A., Patterson, K., & Hodges, J. R. (2002b). When objects lose their meaning: What happens to their use? Cognitive, Affective, & Behavioral Neuroscience, 2, 236–251.


Bozeat, S., Patterson, K., & Hodges, J. R. (2004). Relearning object use in semantic dementia. Neuropsychological Rehabilitation, 14, 351–363.
Bindschaedler, C., Peter-Favre, C., Maeder, P., Hirsbrunner, T., & Clarke, S. (2011). Growing up with bilateral hippocampal atrophy: From childhood to teenage. Cortex, 47, 931–944.
Breedin, S. D., Saffran, E. M., & Coslett, H. B. (1994). Reversal of the concreteness effect in a patient with semantic dementia. Cognitive Neuropsychology, 11, 617–660.
Bright, P., Moss, H., & Tyler, L. K. (2004). Unitary vs multiple semantics: PET studies of word and picture processing. Brain and Language, 89, 417–432.
Buxbaum, L. J., & Saffran, E. M. (2002). Knowledge of object manipulations and object function: Dissociations in apraxic and nonapraxic subjects. Brain and Language, 82, 179–199.
Buxbaum, L. J., Veramonti, T., & Schwartz, M. F. (2000). Function and manipulation tool knowledge in apraxia: Knowing “what for” but not “how.” Neurocase, 6, 83–97.
Caramazza, A. (2000). Minding the facts: A comment on Thompson-Schill et al’s “A neural basis for category and modality specificity of semantic knowledge.” Neuropsychologia, 38, 944–949.
Caramazza, A., Hillis, A. E., Rapp, B. C., & Romani, C. (1990). The multiple semantics hypothesis: Multiple confusions? Cognitive Neuropsychology, 7, 161–189.
Caramazza, A., & Shelton, J. R. (1998). Domain-specific knowledge systems in the brain: The animate-inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.
Casasanto, D. (2009). Embodiment of abstract concepts: Good and bad in right- and left-handers. Journal of Experimental Psychology: General, 138, 351–367.
Casasanto, D., & Chrysikou, E. G. (2011). When left is “right”: Motor fluency shapes abstract concepts. Psychological Science, 22, 419–422.
Chainay, H., & Humphreys, G. W. (2002). Privileged access to action for objects relative to words. Psychonomic Bulletin & Review, 9, 348–355.
Chan, A. M., Baker, J. M., Eskandar, E., Schomer, D., Ulbert, I., Marinkovic, K., Cash, S. C., & Halgren, E. (2011). First-pass selectivity for semantic categories in human anteroventral temporal lobe. Journal of Neuroscience, 31, 18119–18129.
Chao, L. L., Haxby, J. V., & Martin, A. (1999). Attribute-based neural substrates for perceiving and knowing about objects. Nature Neuroscience, 2, 913–919.
Chao, L. L., & Martin, A. (2000). Representation of manipulable man-made objects in the dorsal stream. NeuroImage, 12, 478–484.

Chatterjee, A. (2010). Disembodying cognition. Language and Cognition, 2(1), 79–116.
Chrysikou, E. G., & Thompson-Schill, S. L. (2010). Are all analogies created equal? Prefrontal cortical functioning may predict types of analogical reasoning (commentary). Cognitive Neuroscience, 1, 141–142.
Clarke, A., Taylor, K. I., & Tyler, L. K. (2011). The evolution of meaning: Spatiotemporal dynamics of visual object recognition. Journal of Cognitive Neuroscience, 23, 1887–1899.
Collins, A. M., & Quillian, M. R. (1969). Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 8, 240–247.
Connolly, A. C., Gleitman, L. R., & Thompson-Schill, S. L. (2007). The effect of congenital blindness on the semantic representation of some everyday concepts. Proceedings of the National Academy of Sciences, 104, 8241–8246.
Cooper, R. M. (1974). The control of eye fixation by the meaning of spoken language: A new methodology for the real-time investigation of speech perception, memory, and language processing. Cognitive Psychology, 6, 84–107.
Cree, G. S., & McRae, K. (2003). Analyzing the factors underlying the structure and computation of the meaning of chipmunk, cherry, chisel, cheese and cello (and many other such concrete nouns). Journal of Experimental Psychology: General, 132, 163–201.
Crutch, S. J., & Warrington, E. K. (2005). Abstract and concrete concepts have structurally different representational frameworks. Brain, 128, 615–627.
Damasio, A. R. (1989). The brain binds entities and events by multiregional activation from convergence zones. Neural Computation, 1, 123–132.

Farah, M. J., & McClelland, J. L. (1991). A computational model of semantic memory impairment: Modality specificity and emergent category specificity. Journal of Experimental Psychology: General, 120, 339–357.
Frith, C. (2000). The role of dorsolateral prefrontal cortex in the selection of action as revealed by functional imaging. In S. Monsell & J. Driver (Eds.), Control of cognitive processes (pp. 549–565). Cambridge, MA: MIT Press.
Funnell, E. (1995a). A case of forgotten knowledge. In R. Campbell & M. A. Conway (Eds.), Broken memories (pp. 225–236). Oxford, UK: Blackwell Publishers.
Funnell, E. (1995b). Objects and properties: A study of the breakdown of semantic memory. Memory, 3, 497–518.
Funnell, E. (2001). Evidence for scripts in semantic dementia: Implications for theories of semantic memory. Cognitive Neuropsychology, 18, 323–341.


Gabrieli, J. D. E., Cohen, N. J., & Corkin, S. (1988). The impaired learning of semantic knowledge following bilateral medial temporal-lobe resection. Brain & Cognition, 7, 151–177.
Gabrieli, J. D., Poldrack, R. A., & Desmond, J. E. (1998). The role of left prefrontal cortex in language and memory. Proceedings of the National Academy of Sciences of the United States of America, 95(3), 906–913.
Gage, N., & Hickok, G. (2005). Multiregional cell assemblies, temporal binding, and the representation of conceptual knowledge in cortex: A modern theory by a “classical” neurologist, Carl Wernicke. Cortex, 41, 823–832.
Gainotti, G. (2000). What the locus of brain lesion tells us about the nature of the cognitive defect underlying category-specific disorders: A review. Cortex, 36, 539–559.
Gardiner, J. M., Brandt, K. R., Baddeley, A. D., Vargha-Khadem, F., & Mishkin, M. (2008). Charting the acquisition of semantic knowledge in a case of developmental amnesia. Neuropsychologia, 46, 2865–2868.
Garrard, P., & Hodges, J. R. (1999). Semantic dementia: Implications for the neural basis of language and meaning. Aphasiology, 13, 609–623.
Gates, L., & Yoon, M. G. (2005). Distinct and shared cortical regions of the human brain activated by pictorial depictions versus verbal descriptions: An fMRI study. NeuroImage, 24, 473–486.
Gerlach, C., Law, I., Gade, A., & Paulson, O. B. (2000). Categorization and category effects in normal object recognition: A PET study. Neuropsychologia, 38, 1693–1703.
Goldberg, R. F., Perfetti, C. A., & Schneider, W. (2006). Perceptual knowledge retrieval activates sensory brain regions. Journal of Neuroscience, 26, 4917–4921.
Gonnerman, L. M., Andersen, E. S., Devlin, J. T., Kempler, D., & Seidenberg, M. S. (1997). Double dissociation of semantic categories in Alzheimer’s disease. Brain and Language, 57, 254–279.
Gonzalez, J., Barros-Loscertales, A., Pulvermuller, F., Meseguer, V., Sanjuan, A., Belloch, V., et al. (2006). Reading cinnamon activates olfactory brain regions. NeuroImage, 32, 906–912.
Grabowski, T. J., Damasio, H., Tranel, D., Boles Ponto, L. L., Hichwa, R. D., & Damasio, A. R. (2001). A role for left temporal pole in the retrieval of words for unique entities. Human Brain Mapping, 13, 199–212.
Graham, K. S., Lambon Ralph, M. A., & Hodges, J. R. (1997). Determining the impact of autobiographical experience on “meaning”: New insights from investigating sports-related vocabulary and knowledge in two cases with semantic dementia. Cognitive Neuropsychology, 14, 801–837.
Graham, K. S., Lambon Ralph, M. A., & Hodges, J. R. (1999). A questionable semantics: The interaction between semantic knowledge and autobiographical experience in semantic dementia. Cognitive Neuropsychology, 16, 689–698.
Graham, K. S., Simons, J. S., Pratt, K. H., Patterson, K., & Hodges, J. R. (2000). Insights from semantic dementia on the relationship between episodic and semantic memory. Neuropsychologia, 38, 313–324.
Greve, A., van Rossum, M. C. W., & Donaldson, D. I. (2007). Investigating the functional interaction between semantic and episodic memory: Convergent behavioral and electrophysiological evidence for the role of familiarity. NeuroImage, 34, 801–814.
Grill-Spector, K., & Malach, R. (2001). fMR-adaptation: A tool for studying the functional properties of human cortical neurons. Acta Psychologica, 107, 293–321.
Grossman, M., Koenig, P., DeVita, C., Glosser, G., Alsop, D., Detre, J., & Gee, J. (2002). The neural basis for category-specific knowledge: An fMRI study. NeuroImage, 15, 936–948.
Hart, J., & Kraut, M. A. (2007). Neural hybrid model of semantic object memory (version 1.1). In J. Hart & M. A. Kraut (Eds.), Neural basis of semantic memory (pp. 331–359). New York: Cambridge University Press.
Hauk, O., Johnsrude, I., & Pulvermuller, F. (2004). Somatotopic representation of action words in human motor and premotor cortex. Neuron, 41, 301–307.
Hauk, O., Shtyrov, Y., & Pulvermuller, F. (2008). The time course of action and action-word comprehension in the human brain as revealed by neurophysiology. Journal of Physiology, Paris, 102, 50–58.
Hillis, A., & Caramazza, A. (1995). Cognitive and neural mechanisms underlying visual and semantic processing: Implication from “optic aphasia.” Journal of Cognitive Neuroscience, 7, 457–478.
Hodges, J. R., & Patterson, K. (2007). Semantic dementia: A unique clinicopathological syndrome. The Lancet Neurology, 6, 1004–1014.
Hodges, J. R., Patterson, K., Oxbury, S., & Funnell, E. (1992). Semantic dementia: Progressive fluent aphasia with temporal lobe atrophy. Brain, 115, 1783–1806.
Hodges, J. R., Patterson, K., Ward, R., Garrard, P., Bak, T., Perry, R., & Gregory, C. (1999). The differentiation of semantic dementia and frontal lobe dementia (temporal and frontal variants of frontotemporal dementia) from early Alzheimer’s disease: A comparative neuropsychological study. Neuropsychology, 13, 31–40.


Hoenig, K., Müller, C., Herrnberger, B., Spitzer, M., Ehret, G., & Kiefer, M. (2011). Neuroplasticity of semantic maps for musical instruments in professional musicians. NeuroImage, 56, 1714–1725.
Hoenig, K., Sim, E.-J., Bochev, V., Herrnberger, B., & Kiefer, M. (2008). Conceptual flexibility in the human brain: Dynamic recruitment of semantic maps from visual, motion and motor-related areas. Journal of Cognitive Neuroscience, 20, 1799–1814.
Hoffman, P., & Lambon Ralph, M. A. (2011). Reverse concreteness effects are not a typical feature of semantic dementia: Evidence for the hub-and-spoke model of conceptual representation. Cerebral Cortex, 21, 2103–2112.

Hoffman, P., Rogers, T. T., & Lambon Ralph, M. A. (2011). Semantic diversity accounts for the “missing” word frequency effect in stroke aphasia: Insights using a novel method to quantify contextual variability in meaning. Journal of Cognitive Neuroscience, 23, 2432–2446.
Hsu, N. S., Kraemer, D. J. M., Oliver, R. T., Schlichting, M. L., & Thompson-Schill, S. L. (2011). Color, context, and cognitive style: Variations in color knowledge retrieval as a function of task and subject variables. Journal of Cognitive Neuroscience, 23, 2554–2557.
Huettig, F., & Altmann, G. T. M. (2005). Word meaning and the control of eye fixation: Semantic competitor effects and the visual world paradigm. Cognition, 96, B23–B32.
Humphreys, G. W., & Forde, E. M. (2001). Hierarchies, similarity, and interactivity in object recognition: “Category-specific” neuropsychological deficits. Behavioral and Brain Sciences, 24, 453–476.
Humphreys, G. W., Riddoch, M. J., & Quinlan, P. T. (1988). Cascade processes in picture identification. Cognitive Neuropsychology, 5, 67–103.
Jefferies, E., Patterson, K., Jones, R. W., & Lambon Ralph, M. A. (2009). Comprehension of concrete and abstract words in semantic dementia. Neuropsychology, 23, 492–499.
Ishai, A., Ungerleider, L. G., & Haxby, J. V. (2000). Distributed neural systems for the generation of visual images. Neuron, 28, 979–990.
Kable, J. W., Kan, I. P., Wilson, A., Thompson-Schill, S. L., & Chatterjee, A. (2005). Conceptual representations of action in lateral temporal cortex. Journal of Cognitive Neuroscience, 17, 855–870.
Kan, I. P., Alexander, M. P., & Verfaellie, M. (2009). Contribution of prior semantic knowledge to new episodic learning in amnesia. Journal of Cognitive Neuroscience, 21, 938–944.
Kan, I. P., Kable, J. W., Van Scoyoc, A., Chatterjee, A., & Thompson-Schill, S. L. (2006). Fractionating the left frontal response to tools: Dissociable effects of motor experience and lexical competition. Journal of Cognitive Neuroscience, 18, 267–277.

Kan, I. P., & Thompson-Schill, S. L. (2004). Effect of name agreement on prefrontal activity during overt and covert picture naming. Cognitive, Affective, & Behavioral Neuroscience, 4, 43–57.
Kellenbach, M. L., Brett, M., & Patterson, K. (2001). Large, colorful or noisy? Attribute- and modality-specific activations during retrieval of perceptual attribute knowledge. Cognitive, Affective, & Behavioral Neuroscience, 1, 207–221.
Kellenbach, M., Brett, M., & Patterson, K. (2003). Actions speak louder than functions: The importance of manipulability and action in tool representation. Journal of Cognitive Neuroscience, 15, 30–46.
Kiefer, M., Sim, E.-J., Herrnberger, B., Grothe, J., & Hoenig, K. (2008). The sound of concepts: Four markers for a link between auditory and conceptual brain systems. Journal of Neuroscience, 28, 12224–12230.
Kiefer, M., Sim, E.-J., Liebich, S., Hauk, O., & Tanaka, J. (2007). Experience-dependent plasticity of conceptual representations in human sensory-motor areas. Journal of Cognitive Neuroscience, 19, 525–542.
Kosslyn, S. M., & Thompson, W. L. (2000). Shared mechanisms in visual imagery and visual perception: Insights from cognitive neuroscience. In M. S. Gazzaniga (Ed.), The new cognitive neurosciences (2nd ed., pp. 975–985). Cambridge, MA: MIT Press.
Lakoff, G., & Johnson, M. (1999). Philosophy in the flesh: The embodied mind and its challenge to Western thought. New York: Basic Books.
Lambon Ralph, M. A., Lowe, C., & Rogers, T. (2007). Neural basis of category-specific semantic deficits for living things: Evidence from semantic dementia, HSVE and a neural network model. Brain, 130 (Pt 4), 1127–1137.
Mahon, B. Z., Anzellotti, S., Schwarzbach, J., Zampini, M., & Caramazza, A. (2009). Category-specific organization in the human brain does not require visual experience. Neuron, 63, 397–405.
Mahon, B. Z., & Caramazza, A. (2003). Constraining questions about the organization and representation of conceptual knowledge. Cognitive Neuropsychology, 20, 433–450.
Mahon, B. Z., & Caramazza, A. (2008). A critical look at the embodied cognition hypothesis and a new proposal for grounding conceptual content. Journal of Physiology, Paris, 102, 59–70.
Martin, A. (2007). The representation of object concepts in the brain. Annual Review of Psychology, 58, 25–45.
Martin, A., & Chao, L. L. (2001). Semantic memory and the brain: Structure and processes. Current Opinion in Neurobiology, 11, 194–201.


Martin, A., Haxby, J. V., Lalonde, F. M., Wiggs, C. L., & Ungerleider, L. G. (1995). Discrete cortical regions associated with knowledge of color and knowledge of action. Science, 270, 102–105.
Martin, A., Wiggs, C. L., Ungerleider, L. G., & Haxby, J. V. (1996). Neural correlates of category-specific knowledge. Nature, 379, 649–652.
McClelland, J. L., & Rogers, T. T. (2003). The parallel distributed processing approach to semantic cognition. Nature Reviews Neuroscience, 4, 310–322.
McRae, K., de Sa, V. R., & Seidenberg, M. S. (1997). On the nature and scope of featural representations of word meaning. Journal of Experimental Psychology: General, 126, 99–130.
Mechelli, A., Price, C. J., Friston, K. J., & Ishai, A. (2004). Where bottom-up meets top-down: Neuronal interactions during perception and imagery. Cerebral Cortex, 14, 1256–1265.
Mesulam, M. M., Grossman, M., Hillis, A., Kertesz, A., & Weintraub, S. (2003). The core and halo of primary progressive aphasia and semantic dementia. Annals of Neurology, 54, S11–S14.
Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24, 167–202.
Mirman, D., & Graziano, K. M. (2012). Damage to temporoparietal cortex decreases incidental activation of thematic relations during spoken word comprehension. Neuropsychologia, 50, 1990–1997.
Mitchell, T. M., Shinkareva, S. V., Carlson, A., Chang, K.-M., Malave, V. L., Mason, R. A., & Just, M. A. (2008). Predicting human brain activity associated with the meanings of nouns. Science, 320, 1191–1195.
Moore, C. J., & Price, C. J. (1999). A functional neuroimaging study of the variables that generate category-specific object processing differences. Brain, 122, 943–962.
Mummery, C. J., Patterson, K., Hodges, J. R., & Price, C. J. (1998). Functional neuroanatomy of the semantic system: Divisible by what? Journal of Cognitive Neuroscience, 10, 766–777.

Mummery, C. J., Patterson, K., Wise, R. J. S., Vandenberghe, R., Price, C. J., & Hodges, J. R. (1999). Disrupted temporal lobe connections in semantic dementia. Brain, 122, 61–73.
Myung, J., Blumstein, S. E., Yee, E., Sedivy, J. C., Thompson-Schill, S. L., & Buxbaum, L. J. (2010). Impaired access to manipulation features in apraxia: Evidence from eyetracking and semantic judgment tasks. Brain and Language, 112, 101–112.


Noppeney, U. (2009). The sensory-motor theory of semantics: Evidence from functional imaging. Language and Cognition, 1-2, 249–276.
Noppeney, U., Friston, K., & Price, C. (2003). Effects of visual deprivation on the organization of the semantic system. Brain, 126, 1620–1627.
Noppeney, U., Patterson, K., Tyler, L. K., Moss, H., Stamatakis, E. A., Bright, P., Mummery, C., & Price, C. J. (2007). Temporal lobe lesions and semantic impairment: A comparison of herpes simplex virus encephalitis and semantic dementia. Brain, 130 (Pt 4), 1138–1147.
Noppeney, U., & Price, C. J. (2002). Retrieval of visual, auditory, and abstract semantics. NeuroImage, 15, 917–926.
Noppeney, U., & Price, C. J. (2004). Retrieval of abstract semantics. NeuroImage, 22, 164–170.
Noppeney, U., Price, C. J., Friston, K. J., & Penny, W. D. (2006). Two distinct neural mechanisms for category-selective responses. Cerebral Cortex, 1, 437–445.
Norman, K. A., Polyn, S. M., Detre, G. J., & Haxby, J. V. (2006). Beyond mind-reading: Multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences, 10, 424–430.
Okada, T., Tanaka, S., Nakai, T., Nishizawa, S., Inui, T., Sadato, N., Yonekura, Y., & Konishi, J. (2000). Naming of animals and tools: A functional magnetic resonance imaging study of categorical differences in the human brain areas commonly used for naming visually presented objects. Neuroscience Letters, 296, 33–36.
O’Kane, G., Kensinger, E. A., & Corkin, S. (2004). Evidence for semantic learning in profound amnesia: An investigation with patient H.M. Hippocampus, 14, 417–425.
Oliver, R. T., Geiger, E. J., Lewandowski, B. C., & Thompson-Schill, S. L. (2009). Remembrance of things touched: How sensorimotor experience affects the neural instantiation of object form. Neuropsychologia, 47, 239–247.
Oliver, R. T., Parsons, M. A., & Thompson-Schill, S. L. (2008). Hands on learning: Variations in sensorimotor experience alter the cortical response to newly learned objects. San Francisco: Cognitive Neuroscience Society.
Paivio, A. (1969). Mental imagery in associative learning and memory. Psychological Review, 76, 241–263.
Paivio, A. (1971). Imagery and verbal processes. New York: Holt, Rinehart, and Winston.
Paivio, A. (1978). The relationship between verbal and perceptual codes. In E. C. Carterette & M. P. Friedman (Eds.), Handbook of perception (Vol. 8, pp. 375–397). London: Academic Press.
Paivio, A. (1991). Dual coding theory: Retrospect and current status. Canadian Journal of Psychology, 45, 255–287.

Patterson, K., Lambon Ralph, M. A., Jefferies, E., Woollams, A., Jones, R., Hodges, J. R., & Rogers, T. T. (2006). “Presemantic” cognition in semantic dementia: Six deficits in search of an explanation. Journal of Cognitive Neuroscience, 18, 169–183.
Patterson, K., Nestor, P. J., & Rogers, T. T. (2007). Where do you know what you know? The representation of semantic knowledge in the human brain. Nature Reviews Neuroscience, 8, 976–987.
Perani, D., Cappa, S. F., Bettinardi, V., Bressi, S., Gorno-Tempini, M., Matarrese, & Fazio, F. (1995). Different neural systems for the recognition of animals and man-made tools. NeuroReport, 6, 1637–1641.
Phillips, J. A., Noppeney, U., Humphreys, G. W., & Price, C. J. (2002). Can segregation within the semantic system account for category specific deficits? Brain, 125, 2067–2080.
Plaut, D. C. (2002). Graded modality-specific specialization in semantics: A computational account of optic aphasia. Cognitive Neuropsychology, 19, 603–639.
Pobric, G., Jefferies, E., & Lambon Ralph, M. A. (2010). Induction of category-specific vs. general semantic impairments in normal participants using rTMS. Current Biology, 20, 964–968.
Price, C. J., Noppeney, U., Phillips, J. A., & Devlin, J. T. (2003). How is the fusiform gyrus related to category-specificity? Cognitive Neuropsychology, 20 (3-6), 561–574.
Pylyshyn, Z. W. (1973). What the mind’s eye tells the mind’s brain: A critique of mental imagery. Psychological Bulletin, 80, 1–24.
Raposo, A., Moss, H. E., Stamatakis, E. A., & Tyler, L. K. (2009). Modulation of motor and premotor cortices by actions, action words and action sentences. Neuropsychologia, 47, 388–396.
Riddoch, M. J., & Humphreys, G. W. (1987). Visual object processing in a case of optic aphasia: A case of semantic access agnosia. Cognitive Neuropsychology, 4, 131–185.
Rissman, J., & Wagner, A. D. (2012). Distributed representations in memory: Insights from functional brain imaging. Annual Review of Psychology, 63, 101–128.
Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annual Review of Neuroscience, 27, 169–192.
Rodriguez-Ferreiro, J., Gennari, S. P., Davies, R., & Cuetos, F. (2011). Neural correlates of abstract verb processing. Journal of Cognitive Neuroscience, 23, 106–118.
Rogers, T. T., Hocking, J., Mechelli, A., Patterson, K., & Price, C. (2005). Fusiform activation to animals is driven by the process, not the stimulus. Journal of Cognitive Neuroscience, 17, 434–445.


Rogers, T. T., Hocking, J., Noppeney, U., Mechelli, A., Gorno-Tempini, M., Patterson, K., & Price, C. (2006). The anterior temporal cortex and semantic memory: Reconciling findings from neuropsychology and functional imaging. Cognitive, Affective, & Behavioral Neuroscience, 6, 201–213.
Rogers, T. T., Ivanoiu, A., Patterson, K., & Hodges, J. R. (2006). Semantic memory in Alzheimer’s disease and the frontotemporal dementias: A longitudinal study of 236 patients. Neuropsychology, 20, 319–335.
Rogers, T. T., Lambon Ralph, M. A., Garrard, P., Bozeat, S., McClelland, J. L., Hodges, J. R., & Patterson, K. (2004). The structure and deterioration of semantic memory: A neuropsychological and computational investigation. Psychological Review, 111, 205–235.
Rogers, T. T., & Patterson, K. (2007). Object categorization: Reversals and explanations of the basic-level advantage. Journal of Experimental Psychology: General, 136, 451–469.
Rumiati, R. I., & Humphreys, G. W. (1998). Recognition by action: Dissociating visual and semantic routes to action in normal observers. Journal of Experimental Psychology: Human Perception and Performance, 24, 631–647.
Saffran, E. M., Coslett, H. B., & Keener, M. T. (2003b). Differences in word associations to pictures and words. Neuropsychologia, 41, 1541–1546.
Saffran, E. M., Coslett, H. B., Martin, N., & Boronat, C. (2003a). Access to knowledge from pictures but not words in a patient with progressive fluent aphasia. Language and Cognitive Processes, 18, 725–757.

Saffran, E. M., & Schwartz, M. F. (1994). Of cabbages and things: Semantic memory from a neuropsychological perspective – A tutorial review. In C. Umilta & M. Moscovitch (Eds.), Attention and performance XV (pp. 507–536). Cambridge, MA: MIT Press.
Schreuder, R., Flores D’Arcais, G. B., & Glazenborg, G. (1984). Effects of perceptual and conceptual similarity in semantic priming. Psychological Research, 45, 339–354.
Sevostianov, A., Horwitz, B., Nechaev, V., Williams, R., Fromm, S., & Braun, A. R. (2002). fMRI study comparing names versus pictures for objects. Human Brain Mapping, 16, 168–175.
Shimamura, A. P. (2000). The role of the prefrontal cortex in dynamic filtering. Psychobiology, 28, 207–218.
Simmons, K., & Barsalou, L. W. (2003). The similarity-in-topography principle: Reconciling theories of conceptual deficits. Cognitive Neuropsychology, 20, 451–486.
Simmons, W., Ramjee, V., Beauchamp, M., McRae, K., Martin, A., & Barsalou, L. (2007). A common neural substrate for perceiving and knowing about color. Neuropsychologia, 45, 2802–2810.


Sirigu, A., Duhamel, J. R., & Poncet, M. (1991). The role of sensorimotor experience in object recognition: A case of multimodal agnosia. Brain, 114, 2555–2573.
Snowden, J. S., Bathgate, D., Varma, A., Blackshaw, A., Gibbons, Z. C., & Neary, D. (2001). Distinct behavioral profiles in frontotemporal dementia and semantic dementia. Journal of Neurology, Neurosurgery, & Psychiatry, 70, 323–332.
Snowden, J. S., Goulding, P. J., & Neary, D. (1989). Semantic dementia: A form of circumscribed cerebral atrophy. Behavioural Neurology, 2, 167–182.
Snowden, J. S., Griffiths, H. L., & Neary, D. (1994). Semantic dementia: Autobiographical contribution to preservation of meaning. Cognitive Neuropsychology, 11, 265–288.
Snowden, J. S., Griffiths, H. L., & Neary, D. (1996). Semantic-episodic memory interactions in semantic dementia: Implications for retrograde memory function. Cognitive Neuropsychology, 13, 1101–1137.
Snowden, J. S., Griffiths, H. L., & Neary, D. (1999). The impact of autobiographical experience on meaning: Reply to Graham, Lambon Ralph, and Hodges. Cognitive Neuropsychology, 11, 673–687.
Spivey, M. J. (2007). The continuity of mind. New York: Oxford University Press.
Squire, L. R. (1987). Memory and brain. New York: Oxford University Press.
Squire, L. R., & Zola, S. M. (1998). Episodic memory, semantic memory, and amnesia. Hippocampus, 8, 205–211.
Tanaka, J. M., & Presnell, L. M. (1999). Color diagnosticity in object recognition. Perception & Psychophysics, 61, 1140–1153.
Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M., & Sedivy, J. C. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268 (5217), 1632–1634.
Tarr, M. J., & Gauthier, I. (2000). FFA: A flexible fusiform area for subordinate-level visual processing automatized by expertise. Nature Neuroscience, 3, 764–769.
Taylor, L. J., & Zwaan, R. A. (2009). Action in cognition: The case of language. Language and Cognition, 1, 45–58.
Thompson-Schill, S. (2003). Neuroimaging studies of semantic memory: Inferring how from where. Neuropsychologia, 41, 280–292.
Thompson-Schill, S. L., Aguirre, G. K., D’Esposito, M., & Farah, M. J. (1999). A neural basis for category and modality specificity of semantic knowledge. Neuropsychologia, 37, 671–676.


Thompson-Schill, S. L., Bedny, M., & Goldberg, R. F. (2005). The frontal lobes and the regulation of mental activity. Current Opinion in Neurobiology, 15, 219–224.
Thompson-Schill, S. L., D’Esposito, M., Aguirre, G. K., & Farah, M. J. (1997). Role of left prefrontal cortex in retrieval of semantic knowledge: A re-evaluation. Proceedings of the National Academy of Sciences, 94, 14792–14797.
Thompson-Schill, S. L., D’Esposito, M., & Kan, I. P. (1999). Effects of repetition and competition on prefrontal activity during word generation. Neuron, 23, 513–522.
Thompson-Schill, S. L., Swick, D., Farah, M. J., D’Esposito, M., Kan, I. P., & Knight, R. T. (1998). Verb generation in patients with focal frontal lesions: A neuropsychological test of neuroimaging findings. Proceedings of the National Academy of Sciences, 95, 15855–15860.
Trumpp, N., Kliese, D., Hoenig, K., Haarmaier, T., & Kiefer, M. (2013). A causal link between hearing and word meaning: Damage to auditory association cortex impairs the processing of sound-related concepts. Cortex, 49, 474–486.
Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory (pp. 381–403). New York: Academic Press.
Tulving, E. (1991). Concepts of human memory. In L. Squire, G. Lynch, N. M. Weinberger, & J. L. McGaugh (Eds.), Memory: Organization and locus of change (pp. 3–32). New York: Oxford University Press.
Tyler, L. K., & Moss, H. E. (2001). Towards a distributed account of conceptual knowledge. Trends in Cognitive Sciences, 5, 244–252.
Tyler, L. K., Moss, H. E., Durrant-Peatfield, M. R., & Levy, J. P. (2000). Conceptual structure and the structure of concepts: A distributed account of category-specific deficits. Brain & Language, 75, 195–231.
Tyler, L. K., Stamatakis, E. A., Bright, P., Acres, K., Abdalah, S., Rodd, J. M., & Moss, H. E. (2004). Processing objects at different levels of specificity. Journal of Cognitive Neuroscience, 16, 351–362.
Vandenberghe, R., Price, C., Wise, R., Josephs, O., & Frackowiak, R. S. J. (1996). Functional anatomy of a common semantic system for words and pictures. Nature, 383, 254–256.
Vargha-Khadem, F., Gadian, D. G., Watkins, K. E., Connelly, A., Van Paesschen, W., & Mishkin, M. (1997). Differential effects of early hippocampal pathology on episodic and semantic memory. Science, 277, 376–380.
Walsh, V., & Pascual-Leone, A. (2003). Transcranial magnetic stimulation: A neurochronometrics of mind. Cambridge, MA: MIT Press.


Warrington, E. K. (1975). The selective impairment of semantic memory. Quarterly Journal of Experimental Psychology, 27, 635–657.
Warrington, E. K., & McCarthy, R. A. (1983). Category specific access dysphasia. Brain, 106, 869–878.
Warrington, E. K., & McCarthy, R. A. (1987). Categories of knowledge: Further fractionation and an attempted integration. Brain, 110, 1273–1296.
Warrington, E. K., & McCarthy, R. A. (1994). Multiple meaning systems in the brain: A case for visual semantics. Neuropsychologia, 32, 1465–1473.

Warrington, E. K., & Shallice, T. (1984). Category specific semantic impairments. Brain, 107, 829–854.
Weber, M., Thompson-Schill, S. L., Osherson, D., Haxby, J., & Parsons, L. (2009). Predicting judged similarity of natural categories from their neural representations. Neuropsychologia, 47, 859–868.
Weisberg, J., Turennout, M., & Martin, A. (2007). A neural system for learning about object function. Cerebral Cortex, 17, 513–521.
Wiggett, A. J., Pritchard, I. C., & Downing, P. E. (2009). Animate and inanimate objects in human visual cortex: Evidence for task-independent category effects. Neuropsychologia, 47, 3111–3117.
Willems, R. M., Hagoort, P., & Casasanto, D. (2010). Body-specific representations of action verbs: Neural evidence from right- and left-handers. Psychological Science, 21, 67–74.
Wise, R. J., Chollet, F., Hadar, U., Friston, K., Hoffner, E., & Frackowiak, R. (1991). Distribution of cortical neural networks involved in word comprehension and word retrieval. Brain, 114 (Pt 4), 1803–1817.
Wise, R. J. S., Howard, D., Mummery, C. J., Fletcher, P., Leff, A., Buchel, C., & Scott, S. K. (2000). Noun imageability and the temporal lobes. Neuropsychologia, 38, 985–994.
Witt, J. K., Kemmerer, D., Linkenauger, S. A., & Culham, J. (2010). A functional role for motor simulation in naming tools. Psychological Science, 21, 1215–1219.
Wolk, D. A., Coslett, H. B., & Glosser, G. (2005). The role of sensory-motor information in object recognition: Evidence from category-specific visual agnosia. Brain and Language, 94, 131–146.
Wright, N. D., Mechelli, A., Noppeney, U., Veltman, D. J., Rombouts, S. A. R. B., Glensman, J., Haynes, J. D., & Price, C. J. (2008). Selective activation around the left occipito-temporal sulcus for words relative to pictures: Individual variability or false positives? Human Brain Mapping, 29, 986–1000.

Yee, E., Chrysikou, E., Hoffman, E., & Thompson-Schill, S. L. (2013). Manual experience shapes object representations. Psychological Science, 24 (6), 909–919.
Yee, E., Drucker, D. M., & Thompson-Schill, S. L. (2010). fMRI-adaptation evidence of overlapping neural representations for objects related in function or manipulation. NeuroImage, 50, 753–763.
Yee, E., Hufstetler, S., & Thompson-Schill, S. L. (2011). Function follows form: Activation of shape and function features during object identification. Journal of Experimental Psychology: General, 140, 348–363.
Yee, E., & Sedivy, J. C. (2006). Eye movements to pictures reveal transient semantic activation during spoken word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 1–14.

Notes:

(1) Linguists use the term semantic in a related, but slightly narrower way: to refer to the meanings of words or phrases.

(2) There is mounting evidence that the reverse may also be true: semantic memory has been found to support episodic memory acquisition (Kan et al., 2009) and retrieval (Graham et al., 2000; Greve et al., 2007).

(3) These ideas about the relationship between knowledge and experience echo those of much earlier thinkers. For example, in “An Essay Concerning Human Understanding,” John Locke considers the origin of “ideas,” or what we now refer to as “concepts,” such as “whiteness, hardness, sweetness, thinking, motion, elephant …”, arguing: “Whence comes [the mind] by that vast store, which the busy and boundless fancy of man has painted on it with an almost endless variety? … To this I answer, in one word, From experience.” Furthermore, in their respective works on aphasia, Wernicke (1874) and Freud (1891) both put forth similar ideas (Gage & Hickok, 2005).

(4) Recall that the domain-specific hypothesis allows for distributed representations within different categories.

(5) Moreover (returning to the task effects discussed in 4.3), it has been suggested that the presence or absence of direct overlap may reflect the existence of multiple types of color representations that vary in resolution (or abstraction), with differences in task context influencing whether information is retrieved at a fine (high-resolution) level of detail or a more abstract level. Retrieving high- (but not necessarily low-) resolution color knowledge results in overlap with color perception regions (Hsu et al., 2011).

(6) We use the word “epiphenomenal” here to remain consistent with the objections that are sometimes raised in this literature; however, we note that the literal translation of the meaning of this term (i.e., an event with no effectual consequence) may not be suited to descriptions of neural activity, which can always be described as having an effect on its efferent targets.

(7) Note that an analogous critique (and, importantly, a response analogous to the one that follows) could be made for any sensorimotor modality.

Eiling Yee

Eiling Yee is staff scientist at the Basque Center on Cognition, Brain and Language.

Evangelia G. Chrysikou

Evangelia G. Chrysikou, Department of Psychology, University of Kansas, Lawrence, KS

Sharon L. Thompson-Schill

Sharon L. Thompson-Schill, Department of Psychology, University of Pennsylvania, Philadelphia, PA


Cognitive Neuroscience of Episodic Memory

Lila Davachi and Jared Danker
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0018

Abstract and Keywords

This chapter examines the neural underpinnings of episodic memory, focusing on processes taking place during the experience itself, or encoding, and those involved in memory reactivation or retrieval at a later time point. It first looks at the neuropsychological case study of Henry Molaison to show how the medial temporal lobe (MTL) is linked to the formation of new episodic memories. It also assesses the critical role of the MTL in general and the hippocampus in particular in the encoding and retrieval of episodic memories. The chapter then describes the memory as reinstatement model, episodic memory retrieval, the use of functional neuroimaging as a tool to probe episodic memory, and the difference in memory paradigm. Furthermore, it discusses hippocampal activity during encoding and its connection to associative memory, activation of the hippocampus and cortex during episodic memory retrieval, and how hippocampal reactivation mediates cortical reactivation during memory retrieval.

Keywords: episodic memory, medial temporal lobe, Henry Molaison, memory retrieval, encoding, memory reactivation, hippocampus, memory as reinstatement model, functional neuroimaging, associative memory

Introduction

What did she say? Where did you go? What did you eat? The answers to questions like these require that you access previous experiences, or episodes of your life. Episodic memory is memory for our unique past experiences that unfolded in a particular time and place. By contrast, other forms of memory do not require that you access a particular time and place from the past, such as knowing what a giraffe is (semantic memory) or knowing how to ride a bike (procedural memory). Thus, episodic memory is, in a sense, a collection of our personal experiences, the past that still resides inside of us and makes up much of the narrative of our lives.

In fact, the distinction between episodic remembering and semantic knowing was described by Tulving (1972): “Episodic memory refers to memory for personal experiences and their temporal relations, while semantic memory is a system for receiving, retaining, and transmitting information about meaning of words, concepts, and classification of concepts” (pp. 401–402). And when William James (1890) described memory, he was thinking of episodic memory in particular: “Memory proper, or secondary memory as it might be styled, is the knowledge of a former state of mind after it has already once dropped from consciousness; or rather it is the knowledge of an event, or fact, of which meantime we have not been thinking, with the additional consciousness that we have thought or experienced it before” (p. 610).

Figure 18.1 Memory as reinstatement (MAR) model. During an experience, the representation of that experience in the brain is a distributed pattern of brain activation across cortical and subcortical regions that represent our sensory-perceptual experience, actions, and cognitive and emotional states. When an experience is encoded, it is theorized that changes in the connections within the hippocampus and between the hippocampus and the cortex serve to bind together the different aspects of an experience into a coherent episodic memory (A). Subsequently, when a partial cue activates part of the cortical representation (B), these encoding-related changes in neural connections enable activation to spread to the hippocampus, and via pattern completion, to reactivate the original pattern of activity first in the hippocampus (C) and then in the cortex (D). It is theorized that this sequence of neural events supports episodic memory retrieval.
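As a purely illustrative aside, the binding and pattern-completion steps described for Figure 18.1 can be sketched with a toy Hopfield-style autoassociative network. This is an assumption made for illustration and not the MAR model itself: it collapses hippocampus and cortex into a single layer of binary units, stands in a simple Hebbian rule for the theorized connection changes, and every name (`encode`, `complete`, `overlap`) and parameter (64 units, 3 stored episodes) is hypothetical.

```python
import random


def encode(patterns):
    """Hebbian storage: links between co-active units are strengthened."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j] / n
    return w


def complete(w, cue, sweeps=5):
    """Pattern completion: each unit repeatedly takes the sign of its input."""
    state = list(cue)
    n = len(state)
    for _ in range(sweeps):
        for i in range(n):
            total = sum(w[i][j] * state[j] for j in range(n))
            if total != 0:  # leave the unit unchanged when input is ambiguous
                state[i] = 1 if total > 0 else -1
    return state


def overlap(a, b):
    """Normalized similarity: 1.0 means identical +1/-1 patterns."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_units = 64
episodes = [[rng.choice([-1, 1]) for _ in range(n_units)] for _ in range(3)]
weights = encode(episodes)

# Partial cue: the first half of episode 0; 0 marks units with no information.
cue = episodes[0][: n_units // 2] + [0] * (n_units // 2)
recalled = complete(weights, cue)

print("cue overlap with stored episode:     ", overlap(cue, episodes[0]))
print("recalled overlap with stored episode:", overlap(recalled, episodes[0]))
```

Because the cue carries only half of the stored episode, its overlap with that episode is 0.5 by construction. If completion succeeds, the overlap after the update sweeps should be much closer to 1, which is the toy analogue of a partial cue reinstating a full pattern.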

In this chapter, we describe the current status of our understanding of the neural underpinnings of episodic memory, focusing on processes taking place during the experience itself, or encoding, and those involved in reactivating or retrieving memories at a later time point. To facilitate and organize the discussion, we present a current model of memory formation and retrieval that has arguably received the most support to date (Figure 18.1). At the very least, it is the model that motivates and shapes the design of most current investigations into episodic memory processing in the brain.



Setting the Stage: Early Insights from Patient Work

It is wise to start a discussion of memory with the best-known neuropsychological case study, that of Henry Molaison, famously known as patient H.M. At the young age of 27 years, H.M. opted for surgical removal of his medial temporal lobe (MTL) bilaterally to relieve the devastating epilepsy from which he had suffered since he was a young child. The surgery was a success in its intended goal: H.M.’s seizures were relieved. However, there was a devastating unintended consequence. H.M. was not able to form any new episodic memories after the removal. This inability to form new memories is known as anterograde amnesia. The specificity of the deficit was remarkable. His intelligence was intact, and he appeared to have normal working memory and skill learning, but he was confined to living in the moment because the present disappeared into the past without a trace.

With this one case, it became clear that the MTL is absolutely critical for the formation of new episodic memories. Although there is some debate regarding the extent of presurgical memory loss (retrograde amnesia) that he exhibited and whether H.M. was able to learn new semantic information, it is well agreed that his main deficit was the inability to form new episodic memories. Thus, this major discovery set the stage for systems neuroscientists to begin to examine the MTL and how it contributes to episodic memory formation and retrieval.

The MTL regions damaged in this now-famous surgery were the hippocampus as well as portions of the underlying cortex: entorhinal, perirhinal, and posterior parahippocampal cortices. Since this discovery, it has been demonstrated that other patients with damage to the hippocampus and MTL cortical structures also exhibit anterograde amnesia. Of current focus and highlighted in this chapter is the critical role of the MTL in general and the hippocampus in particular in the encoding and retrieval of episodic memories.

Memory as Reinstatement Model

The memory as reinstatement (MAR) model presented in Figure 18.1 represents a combination of current theory and knowledge regarding how episodic memories are formed and subsequently accessed. It consists of elements drawn from many different models, and should not be considered a new model but rather a summary of the common elements of many existing models (e.g., Alvarez & Squire, 1994; McClelland et al., 1995; Moscovitch et al., 2005; Norman & O’Reilly, 2003). We will first present the model; we will then discuss the aspects of the model for which strong evidence exists and point out aspects that are still somewhat lacking in empirical support.

During an experience (see Figure 18.1A), the representation of that experience in the brain is characterized by distributed cortical and subcortical patterns of neural activation that represent our sensory-perceptual (visual, auditory, somatosensory) experience, actions, internal thoughts, and emotions. Thus, in a sense, the brain is processing and representing the current context and state of the organism. In addition, it is thought that the distributed pattern of cortical firing filters into the MTL and converges in the hippocampus, where a microrepresentation of the current episode is created. The connections between neurons that make up the hippocampal representation for each episode are thought to strengthen through long-term potentiation (LTP). The idea is that the connections between neurons that are simultaneously active in the hippocampus are more likely to become strengthened compared with those that are not (Davachi, 2004; Hebb, 1949). Importantly, the LTP-mediated strengthening can happen over time, both immediately after the experience and during post-encoding sleep (Diekelmann & Born, 2010; Ellenbogen et al., 2007). Thus, what results from a successfully encoded experience is a hippocampal neural pattern (HNP) and a corresponding cortical neural pattern (CNP). Importantly, what differentiates the two patterns is that the HNP is thought to contain the critical connections between representations that allow the CNP to be accessed later and be attributed to a particular time and place, that is, an episodic memory. Thus, without the HNP, it is difficult to recover the precise CNP associated with a prior experience.

Episodic memory retrieval encompasses multiple stages of processing, including cue processing (see Figure 18.1B), reinstatement of the HNP (see Figure 18.1C), and finally reinstatement of the CNP (see Figure 18.1D).
Cue processing simply refers to the fact that retrieval is often cued by an external stimulus or internal thought whose representation is supported by cortical regions (visual cortex, auditory cortex, etc.). In addition, the cue is thought to serve as one key that might unlock the neural patterns associated with the prior experience or episode that contained that stimulus. Specifically, it is thought that during successful retrieval, the retrieval cue triggers what has been referred to as hippocampal pattern completion (see Figure 18.1C). Pattern completion refers to the idea that a complete pattern (a memory) can be reconstructed from part of the pattern (a partial cue). In other words, a part of or the entire HNP that was established during encoding and subsequently strengthened becomes reinstated. Finally, the reinstatement of the HNP is then thought to reinstate aspects of the CNP (see Figure 18.1D), resulting in the concurrent reactivation of disparate cortical regions that were initially active during the experience. Importantly, it is this final stage of reactivation that is thought to underlie our subjective experience of remembering and, in turn, drive mnemonic decision making.
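
The two computational ideas in the model, Hebbian strengthening of connections between co-active units and completion of a stored pattern from a partial cue, can be illustrated with a toy autoassociative (Hopfield-style) network. The following is a minimal sketch under strong simplifying assumptions, not a model of hippocampal circuitry: units are binary (+1 active, -1 inactive), a value of 0 in the cue marks a unit whose state is unknown, and the function names `hebbian_weights` and `complete_pattern` are hypothetical.

```python
from typing import List

Pattern = List[int]  # each unit is +1 (active) or -1 (inactive)


def hebbian_weights(patterns: List[Pattern]) -> List[List[float]]:
    """Build a weight matrix in which units that share the same state
    in a stored pattern become more strongly connected (Hebbian rule)."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def complete_pattern(weights: List[List[float]],
                     cue: List[int],
                     max_sweeps: int = 10) -> Pattern:
    """Fill in a partial cue (0 = unknown unit) by updating each unit
    in turn from the weighted input of all other units."""
    state = list(cue)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            drive = sum(w * s for w, s in zip(weights[i], state))
            new_value = 1 if drive >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:  # the network has settled
            break
    return state


# Two stored "episodes" (illustrative values only).
episode_a = [1, 1, 1, 1, -1, -1, -1, -1]
episode_b = [1, -1, 1, -1, 1, -1, 1, -1]
stored = hebbian_weights([episode_a, episode_b])

# A partial cue: only the first four units of episode A are known.
recovered = complete_pattern(stored, [1, 1, 1, 1, 0, 0, 0, 0])
```

In this toy example the half-specified cue settles into the full stored pattern for episode A. Capacity and interference in such networks depend on how many patterns are stored and how similar they are, which this sketch does not address.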

Development of Functional Neuroimaging as a Tool to Probe Episodic Memory

Before turning to the data generated by brain imaging studies of episodic memory, it is important to consider the behavioral approaches used to measure episodic recovery. As emphasized in Tulving’s encoding specificity principle (Tulving, 1973), in order to fully understand episodic memory, it is critical to consider the interaction between encoding and retrieval contexts (cf. Morris et al., 1977; Roediger, 2000). More specifically, although the probability that an event will be later remembered critically depends on processes taking place during the experience, or encoding, consideration of the encoding conditions alone cannot determine whether an event will later be recalled. For example, while you may not be able to recall what you ate last night for dinner, you may be able to recognize it from a list of possible alternatives. This is an example in which the ability to retrieve a memory depends on how memory is tested; likewise, different encoding conditions also critically influence whether memory will be accessible on a subsequent test. Thus, although we later discuss patterns of activation during encoding and retrieval separately, it should be understood up front that the two interact and one cannot be completely understood without the other.

Why use imaging to understand memory? As noted by Tulving (1983), equivalent memory output responses can arise from fundamentally different internal cognitive states. Thus, measures such as memory accuracy and response time, although important, are limited with respect to disentangling the underlying cognitive processes that support memory formation and retrieval. Furthermore, the current MAR model specifically posits that representations and processes characterizing an encoding event are at least partially reinstated during later successful retrieval. Examination of the veracity of this model is greatly facilitated by the ability to use the multivariate information contained in a functional image of brain activity during encoding and retrieval.
In other words, one can ask whether a specific encoding pattern of brain activity (across hundreds of voxels) is reinstated during retrieval, and whether it relates to successful recovery of episodic details. This is arguably much more efficient than, for example, asking whether subjects reactivate a particular representation during encoding. As evidence discussed further in this chapter demonstrates, although functional imaging of episodic memory has laid the groundwork for characterizing episodic encoding and retrieval in terms of distributed neural systems and multivariate patterns of activity across subcortical and cortical regions, important questions about the functional roles of cortical regions comprising these systems and their interactions still remain.
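
One common way to ask whether an encoding pattern is reinstated at retrieval is to correlate multivoxel patterns across the two phases and compare same-event with different-event pairs. The sketch below illustrates that logic under stated assumptions: each event is represented by a short list of hypothetical voxel activation values, the dictionary keys are made-up event labels, and `reinstatement_index` is an illustrative name, not an established measure from any particular study or toolbox.

```python
from math import sqrt
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length voxel patterns.
    Assumes neither pattern is constant (non-zero variance)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def reinstatement_index(encoding: Dict[str, Sequence[float]],
                        retrieval: Dict[str, Sequence[float]]) -> float:
    """Mean encoding-retrieval similarity for matching events minus
    mean similarity for non-matching events. Values above zero are
    consistent with event-specific reinstatement."""
    same, different = [], []
    for enc_event, enc_pattern in encoding.items():
        for ret_event, ret_pattern in retrieval.items():
            r = pearson(enc_pattern, ret_pattern)
            (same if enc_event == ret_event else different).append(r)
    return sum(same) / len(same) - sum(different) / len(different)


# Hypothetical voxel patterns for two events (illustrative values only).
encoding_patterns = {"event_a": [1.0, 2.0, 3.0, 4.0],
                     "event_b": [4.0, 3.0, 2.0, 1.0]}
retrieval_patterns = {"event_a": [2.0, 4.0, 6.0, 8.0],
                      "event_b": [8.0, 6.0, 4.0, 2.0]}
index = reinstatement_index(encoding_patterns, retrieval_patterns)
```

Because the measure is a difference of correlations, it is insensitive to overall signal level in a region, which is why a pattern-based measure can complement analyses of mean activation.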

Early Insights into Episodic Encoding Using Functional Imaging

How does experience get transformed into a long-lasting trace that can later be accessed? Surely, one needs to perceive and attend to aspects of an event in order for it to even have a chance of being encoded into a lasting memory. However, as we also know, we do not remember everything we see or attend to, so these factors may be necessary, but they are not sufficient. Therefore, in addition to perceptual and attentive processes, there must exist a set of further mechanisms that ensures the longevity of the stimulus or event features. These mechanisms have collectively been referred to as encoding mechanisms (Davachi, 2006), or processing that is related to the formation of an enduring memory trace. It is precisely these mechanisms that are thought to be lost in patients with amnesia, who are capable of attending to stimuli but arguably never form lasting episodic memories after the onset of their amnesia. It is noteworthy, however, that recent work has demonstrated that patients with amnesia also appear to have deficits in novel associative processing and imagination (Addis et al., 2007; Hassabis et al., 2007; but see Squire et al., 2010).

Neuroimaging has been used in a multitude of ways to understand the underlying neural mechanisms supporting episodic memory. Early approaches were directly motivated by the levels of processing framework, which demonstrated that differential processing during encoding modulates memory formation (Craik & Lockhart, 1972). Thus, the earliest neuroimaging studies examined brain activation during cognitive tasks that required semantic or associative processing (Fletcher et al., 1998; Montaldi et al., 1998; Rombouts et al., 1997; Shallice et al., 1994). This approach was targeted because it was known that semantic or elaborative/associative processing leads to better later memory, and determining what brain systems were activated during this kind of processing could help to illuminate the neural substrates of successful episodic encoding. Results from these initial task-level investigations revealed that activation in left lateral prefrontal cortex (PFC) and the MTL was enhanced during semantic processing of study items relative to more superficial processing of those items (Fletcher et al., 1998; Montaldi et al., 1998; Rombouts et al., 1997; Shallice et al., 1994). Involvement of left PFC in semantic retrieval processes has been observed consistently across many paradigms (Badre et al., 2005; Fiez, 1997; Petersen et al., 1989; Poldrack et al., 1999; Thompson-Schill et al., 1997; Wagner et al., 2001).
Other approaches used to reveal the neural substrates of episodic memory formation have compared brain activation to novel versus familiar stimuli within the context of the same task. The logic in these studies is that, on average, we are more likely to encode novel stimuli compared with familiar ones (Tulving et al., 1994). The results of this approach again demonstrated greater activation in the MTL to novel compared with familiar stimuli (Dolan & Fletcher, 1997; Gabrieli et al., 1997; Stern et al., 1996), suggesting that processes in these regions are related to the encoding of novel information into memory.

One of the early methodological breakthroughs in functional magnetic resonance imaging (fMRI) that enabled a tighter linking between brain activation and episodic encoding was the measurement of trial-by-trial estimates of blood-oxygen-level-dependent (BOLD) activation, compared with earlier PET and fMRI block designs. Accordingly, most studies of memory now employ event-related fMRI designs and, thus, measure brain activation and patterns of activity across multiple voxels on individual trials. The advantage of measuring trial-by-trial estimates is that brain activation during the experiencing of events that are later remembered can be directly contrasted with activity for events that are not remembered. This approach has been referred to as the difference in memory (DM) paradigm (Paller et al., 1987; Rugg, 1995; Sanquist et al., 1980; Wagner et al., 1999). This paradigm affords better experimental control because events yielding successful and unsuccessful memory encoding can be compared within the same individual performing the same task. Also, brain activity during encoding can be related to a variety of memory outcomes, measured by different retrieval tests. For example, one can determine whether each presented item was or was not remembered and whether some contextual detail was also recovered during remembering. The varying memory status of individual events can be used to query the brain data to determine what brain areas show patterns of activation relating to successful memory formation and the recovery of contextual details.

Initial groundbreaking studies used this powerful DM approach to reveal brain regions important for successful memory formation using fMRI data. Wagner et al. (1998) demonstrated that brain activation in posterior parahippocampal cortex and left PFC during semantic processing of words was greater for words that participants later successfully recognized with high confidence. At the same time, Brewer et al. (1998) found that activation in right PFC and bilateral parahippocampal gyrus during the viewing of scene images correlated with later subjective ratings of memory using the remember/know paradigm.

Taken together, these two groundbreaking studies revealed two important principles of episodic memory formation. First, brain activation during encoding depends, in some brain regions, on the content of the stimulus itself, with the lateralized PFC DM effects for verbal and visual-spatial stimuli seen in these two studies nicely aligning with existing work showing that left inferior frontal gyrus is important in semantic processing (Poldrack et al., 1999), whereas pictorial stimuli have been shown to engage right PFC to a greater extent (Kelley, 1998). Thus, successful encoding is related to enhanced activation in a subset of brain regions engaged during stimulus and task processing.
The second principle that was evident in these early studies but not directly examined until more recently is the notion of a specialized system or a domain-general mechanism underlying episodic memory formation. Both of these initial studies (and almost every study performed since) found that activation within the MTL correlates with successful episodic memory formation. However, how subregions within the MTL differentially contribute to episodic encoding is still not known and is debated both in the animal (Eichenbaum et al., 2012; Meunier et al., 1993; Parkinson et al., 1988; Zola-Morgan & Squire, 1986) and human literature (Davachi et al., 2003; Jackson & Schacter, 2004; Kirwan & Stark, 2004; Mayes et al., 2004; Squire et al., 2004; Stark & Squire, 2001, 2003; for reviews, see Davachi, 2006; Diana et al., 2007; Eichenbaum et al., 2007; Ranganath et al., 2010; Wixted & Squire, 2011). The next section provides further discussion focusing on the role of the hippocampus in associative encoding, presumably by laying down a strong HNP (see Figure 18.1A) that can later be accessed in the context of an appropriate retrieval cue.
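
The difference in memory (DM) contrast described in this section amounts to sorting encoding trials by their later memory outcome and comparing the trial-wise activation estimates. The following is a minimal sketch of that arithmetic for a single participant and a single region, assuming each trial has already been reduced to one activation estimate; the `Trial` layout, the function name `dm_effect`, and the numbers are hypothetical. A real analysis would also model the hemodynamic response and test the effect statistically across participants.

```python
from statistics import mean
from typing import List, Tuple

# (encoding activation estimate for one trial, was the item later remembered?)
Trial = Tuple[float, bool]


def dm_effect(trials: List[Trial]) -> float:
    """Mean encoding activation for later-remembered trials minus
    mean encoding activation for later-forgotten trials."""
    remembered = [activation for activation, later_hit in trials if later_hit]
    forgotten = [activation for activation, later_hit in trials if not later_hit]
    if not remembered or not forgotten:
        raise ValueError("need at least one remembered and one forgotten trial")
    return mean(remembered) - mean(forgotten)


# Hypothetical trial-wise estimates from one region (illustrative values only).
example_trials = [(1.0, True), (3.0, True), (0.5, False), (1.5, False)]
effect = dm_effect(example_trials)
```

A positive value indicates greater encoding activation for items that were later remembered. The same sorting step extends to finer outcomes, for example splitting remembered trials by whether a contextual detail was also recovered.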

Hippocampal Activity During Encoding Predicts Later Associative Memory

The hippocampus receives direct input from medial temporal lobe cortical regions: the entorhinal, perirhinal (PRc), and parahippocampal (PHc) cortices, each of which receives a distinct pattern of inputs from other neocortical and subcortical regions. However, the PHc projects strongly into PRc, so there is clearly an interplay between these regions as well as between these regions and the hippocampus. Most researchers agree that MTL subregions likely contribute to episodic memory in distinct ways; however, the precise nature of this division remains unclear. That said, a number of experiments have been performed in the past 10 years using broadly similar designs and analysis approaches, and a consistent picture is emerging. In particular, studies have been designed to differentiate patterns of brain activation during encoding that predict successful item memory from those that predict the recovery of associated items, context, or source. These experiments were fueled by an influential model of MTL function that posits that item and relational encoding are supported by distinct, yet complementary, learning systems implemented within the hippocampus and perirhinal cortex (Marr, 1971; McClelland et al., 1995; Norman & O’Reilly, 2003; O’Reilly & Rudy, 2000). For example, providing evidence for a distinct role in associative encoding, many studies have shown that the magnitude of encoding activation in the hippocampus is predictive of whether participants will later remember the contextual associations from each trial (Davachi et al., 2003; Dougal et al., 2007; Hannula & Ranganath, 2008; Kensinger & Schacter, 2006; Kirwan & Stark, 2004; Park & Rugg, 2011; Ranganath et al., 2004; Staresina & Davachi, 2008, 2009; Uncapher et al., 2006; Yu et al., 2012; but see Gold et al., 2006).
Furthermore, in many of these same studies and others, PRc activation during encoding was shown to be related to whether items were later recognized, regardless of whether additional contextual details were also available at the time of retrieval (Davachi et al., 2003; Dougal et al., 2007; Haskins et al., 2008; Kensinger & Schacter, 2006; Kirwan & Stark, 2004; Ranganath et al., 2004; Staresina & Davachi, 2008, 2009). Taken together, these data suggest a division of labor across MTL regions in their respective contributions to item and associative memory formation.

Interestingly, these distinctions between encoding mechanisms in PRc and hippocampus correspond with similar distinctions from single-cell recordings in animals (Brown & Aggleton, 2001; Eichenbaum et al., 2010; Komorowski et al., 2009; Sauvage et al., 2008). Additionally, there is notable evidence from human patient work that damage to the hippocampus disproportionately impairs recollection, compared with item recognition based on familiarity (Giovanello et al., 2003; Spiers et al., 2001; Vann et al., 2009; Yonelinas et al., 2002; but see Wixted & Squire, 2004). Whereas hippocampal damage is relatively common, patients with damage to PRc but not hippocampus are very rare. In one seminal report, however, a woman with an anterior temporal lobe resection that spared the hippocampus but removed the left perirhinal cortex revealed an interesting behavioral pattern in her memory performance. Specifically, she showed a higher than average propensity to recollect with little evidence of familiarity-based memory (Bowles et al., 2007). This finding is critical because it appears to be consistent with the growing body of literature suggesting that PRc mechanisms are important for knowing that an item has previously occurred even when you cannot remember in what particular episodic context.

One important advance, fueled by observations that PRc and PHc are sensitive to different stimulus classes (e.g., scenes versus objects), is the proposal that the MTL cortex may contribute to domain-specific encoding of object and scene-like, or contextual, details, whereas the hippocampus may be important in domain-general binding together of these various distinct episodic elements (Davachi, 2006). First, it has been demonstrated that PRc responds more to objects and faces than scenes and that PHc shows the opposite response pattern: greater activation to scenes than objects and faces (Liang et al., 2013; Litman et al., 2009). Second, when study items were scenes and the associated context was devised to be one of six repeating objects, it was seen that PRc encoding activation now predicted the later recovery of the associated objects (a form of context), whereas successful scene memory (a form of item memory) was supported by PHc (Awipi & Davachi, 2008). Third, it was shown that both hippocampal and PRc activation predicted whether object details were later recalled, whereas only hippocampal activation additionally predicted whether other contextual details were later recovered (Staresina & Davachi, 2008; see also Park & Rugg, 2011). Finally, in a tightly controlled study in which study items were always words but participants treated the word either as a cue to imagine an object or a cue to imagine a scene, it was shown that PRc activation predicted later source memory for the object-imagery trials and that PHc activation predicted later source memory for the scene-imagery trials (Staresina et al., 2011).
Taken together, it is clear that involvement of MTL cortex in encoding is largely dependent on the content of the episode and on what aspects of the episode are attended. By contrast, across all of the aforementioned studies and a whole host of other experiments, hippocampal activation appears to selectively predict whether associated details are later recovered, irrespective of the content of those details (Awipi & Davachi, 2008; Park et al., 2012; Prince et al., 2005; Rugg et al., 2012; Staresina & Davachi, 2008; Staresina et al., 2011). These results bring some clarity to the seemingly inconsistent findings that activation in the PHc has been shown to correlate with both later item memory (Davachi & Wagner, 2002; Eldridge et al., 2000; Kensinger et al., 2003) and associative memory (Awipi & Davachi, 2008; Cansino et al., 2002; Davachi et al., 2003; Kirwan & Stark, 2004; Ranganath et al., 2004; Staresina et al., 2011) across different paradigms. It is likely that the role of PRc and PHc in item versus associative encoding will vary depending on the nature of the stimuli being treated as the “item” and the “context” (Staresina et al., 2011).

It is important to note that although the studies cited above provide strong evidence for a selective role of the hippocampus in binding episodic representations so that they can be later retrieved, another proposal has recently emerged linking hippocampal processes with establishing “strong” memories, both strongly “recollected” and strongly “familiar” (Kirwan et al., 2008; Shrager et al., 2008; Song et al., 2011; see Hayes et al., 2011 for support for both accounts). This account is not necessarily in conflict with the aforementioned notion that PRc is important in both item encoding and item–feature binding. However, it does raise questions about the often-used dichotomy linking PRc and hippocampal function with the subjective sense of “knowing” and “remembering.” It is also important to keep in mind that the paradigms being used to distinguish item from associative, remembering from knowing, and low from high memory strength all suffer from ambiguities in the interpretation of each specific condition. For example, high-confidence recognition can be associated with the recovery of all kinds of episodic detail. Thus, if you only ask for one detail and a participant fails to recover that detail, the participant might actually be recollecting other noncriterial episodic details. Furthermore, it is unclear what underlying operations, or information processing, are being proposed to support the memory strength theory. This is in contrast to the complementary learning systems approach, which is strongly grounded in how underlying processing within PRc and hippocampus can come to support item and associative encoding.

Thus, taken together, current functional imaging results strongly suggest that greater hippocampal activation during the encoding of an event is correlated with the later recovery of the details associated with that event. However, very little has been done to identify whether a specific HNP needs to be reinstated or completed in order to allow for recovery of episodic details. Instead, there have been recent reports that the level of reactivation in a region of interest can correlate with later memory. These results have specifically been seen in work examining post-encoding rest and sleep periods (Rasch et al., 2007; Peigneux et al., 2004, 2006; Tambini et al., 2010). It is assumed that overall hippocampal BOLD activation during encoding may thus be a good proxy for laying down a strong HNP. However, this assumption needs to be tested.

Hippocampus Activates During Episodic Retrieval

According to the model presented in Figure 18.1, presentation of a partial cue during episodic retrieval reactivates the original memory trace in cortex through pattern completion processes in the hippocampus. Therefore, the hippocampus is the critical hub that connects cue processing to cortical reinstatement during episodic retrieval, and hippocampal activation should be a necessary component of episodic retrieval. Consistent with this idea, the hippocampus has been found to activate during retrieval, specifically when the retrieval appears to be episodic in nature (i.e., characterized by the recovery of associations or contextual details). A number of measures have been used to isolate episodic from nonepisodic retrieval. For example, the remember/know paradigm is a recognition memory paradigm in which participants must distinguish among studied items that are recollected with episodic details (remember), studied items that are merely familiar (know), and unstudied items (new). Studied items endorsed as “remembered” are associated with greater hippocampal activity during retrieval than studied items endorsed as “known” or “new” (Eldridge et al., 2000; Wheeler & Buckner, 2004; see also Daselaar et al., 2006; Yonelinas et al., 2005). Another way to isolate episodic retrieval is to compare the successful retrieval of an association to successful item recognition in the absence of successful associative retrieval. The hippocampus has been found to be more active during associative retrieval than during nonassociative retrieval (Dobbins et al., 2003; Kirwan & Stark, 2004; Yonelinas et al., 2001). In addition, hippocampal activation during autobiographical memory retrieval has been found to correlate with subjective reports of a number of recollective qualities, including detail, emotionality, and personal significance (Addis et al., 2004). The current evidence is consistent with the notion that the hippocampus is driven by the recovery of associations or episodic details.

The MAR model posits that episodic retrieval does not just activate hippocampus generally, but also specifically reactivates through pattern completion the same hippocampal neurons (the HNP) that were active during the encoding episode. Because of methodological obstacles, there are currently no fMRI studies demonstrating that the hippocampus reinstates encoding activity during retrieval (but see the later section, Hippocampal Reactivation Mediates Cortical Reactivation: An Open Question). However, in a landmark study using intracranial single-cell recordings in epileptic patients, Gelbard-Sagiv and colleagues (2008) presented compelling evidence that hippocampal neurons reactivate during episodic retrieval. Gelbard-Sagiv et al. recorded from neurons while participants viewed and subsequently freely recalled a series of short movie clips. A subset of hippocampal neurons responded selectively to specific movie clips during viewing. During subsequent free recall, hippocampal neurons selectively activated during viewing of a particular clip reactivated immediately preceding verbal recall of that clip. For example, one hippocampal neuron fired specifically during the viewing of a clip from “The Oprah Winfrey Show.” That same neuron fired immediately before verbal recall of the same clip. In contrast to hippocampal neurons, anterior cingulate neurons demonstrated selectivity during viewing but did not reactivate during recall. This study demonstrated that hippocampal neurons can indeed reactivate during episodic retrieval.

Cortex Reactivates During Episodic Retrieval

In the MAR model presented in Figure 18.1, presentation of a partial cue during episodic retrieval reactivates the CNP through pattern completion processes in the hippocampus. In the course of this chapter, we have already discussed and elaborated on evidence supporting the role of the hippocampus in memory encoding (see Figure 18.1A) and retrieval (see Figure 18.1C). We will now summarize and discuss evidence that partial cues reactivate cortex during retrieval (see Figure 18.1D).

In contrast to MTL damage, which results in anterograde amnesia, damage to posterior cortical regions often leads to an inability to retrieve previously learned information (retrograde amnesia; for a review of data demonstrating that impaired perception is often accompanied by impaired mental imagery, see Farah, 1988). According to Greenberg and Rubin (2003), memory deficits due to posterior cortical damage are usually limited to the cognitive processes affected by the impairment. That is, particular components of memories that are supported by the damaged cortex may be rendered lost or inaccessible. For example, individuals with damage to auditory cortex might experience memories without sound. However, in some cases, damage to posterior cortical regions can lead to more global forms of retrograde amnesia. This kind of global memory impairment would be expected if the damaged cortex represented a large or crucial component of many episodic memories. This appears to be the case in some individuals with damage to visual cortex (Rubin & Greenberg, 1998). In contrast to amnesia caused by MTL lesions, amnesia caused by damage to the perceptual systems seems to be predominantly retrograde in nature. Thus, it appears that the cortex is pivotal in representing the contents of memory that are reactivated during remembering.

Strong support for cortical reactivation also requires evidence that regions of cortex that are activated directly by some stimulus during encoding can be indirectly reactivated by an associate of that stimulus (i.e., the partial cue in Figure 18.1B) during retrieval. In the past decade, substantial evidence drawn from functional neuroimaging studies of memory has supported the idea that the presentation of partial cues can lead to the reactivation of cortex during episodic retrieval. These studies have largely converged on a single paradigm, which we describe in detail here. This paradigm relies critically on the association of neutral retrieval cues with different kinds of associative or contextual information, such that the stimuli that engage different regions of cortex during encoding are retrieved but not actually presented during retrieval. During encoding, brain activity is recorded while participants encounter and build associations between neutral stimuli (e.g., words) and multiple categories of stimuli that evoke activity in different brain regions (e.g., pictures vs. sounds). During retrieval, brain activity is recorded while participants are presented with the neutral stimuli as retrieval cues and are instructed to make decisions about their memory of the cue or its associates.
For example, participants might make a decision about whether the cue was studied or not (recognition decision), how well they remember encoding the cue (remember/know decision), or how the cue was en coded or what its associates were (cued recall/source decision). Early studies using this paradigm, relying on what we will refer to as the region of inter est (ROI) approach, demonstrated that circumscribed cortical regions that are differen tially engaged during the encoding of different kinds of associations are also differentially engaged during their retrieval. For example, in an event-related fMRI study, Wheeler, Pe tersen, and Buckner (2000) had participants associate words (e.g., dog) with either corre sponding sounds (“WOOF!”) or corresponding pictures (e.g., a picture of a dog) during encoding. During (p. 383) subsequent retrieval, participants were presented with each studied word and instructed to indicate whether it was studied with a sound or picture. Wheeler et al. (2000) found that the fusiform gyrus, a region in the visual association cor tex that is preferentially activated by pictures compared with sounds during encoding, is also more strongly activated during cued picture retrieval than cued sound retrieval. They also found that Heschl’s gyrus, a region in the auditory association cortex that is preferentially activated by sounds compared with pictures during encoding, is more strongly activated during cued sound retrieval than cued picture retrieval. That is, re gions that are activated during sound and picture encoding are reactivated during sound and picture retrieval. This finding has been replicated across studies for both pictures

(Vaidya et al., 2002; Wheeler & Buckner, 2003; Wheeler & Buckner, 2004; Wheeler et al., 2006) and sounds (Nyberg et al., 2000). In addition, the specificity of reactivation has been demonstrated in several studies that found that the retrieval of different visual categories evokes activity in stimulus-selective regions of visual cortex. For example, the ventral and dorsal visual processing streams, which process object information and location information, respectively (Ungerleider & Mishkin, 1982), have been shown to reactivate during the retrieval of object and location information (Khader et al., 2005). Similarly, the fusiform face area (FFA) and parahippocampal place area (PPA), two regions in the ventral visual stream that have been shown to respond preferentially to visually presented faces and scenes, respectively (Epstein & Kanwisher, 1998; Kanwisher et al., 1997), have correspondingly been shown to reactivate during the retrieval of faces and places (O’Craven & Kanwisher, 2000; Ranganath et al., 2004; see also Danker, Fincham, & Anderson, 2011). In a series of studies, Slotnick and colleagues demonstrated that even regions very early in the visual processing stream reactivate during retrieval of the appropriate stimulus: Color retrieval reactivates color processing region V8 (Slotnick, 2009a), retrieval of items in motion reactivates motion processing region MT+ (Slotnick & Thakral, 2011), and retrieval of items presented to the right or left visual field reactivates the contralateral area of extrastriate cortex (BA 18) in a retinotopic manner (Slotnick, 2009b). These studies demonstrate that different processing modules within the visual system are reactivated during the retrieval of specific kinds of visual information (for a more in-depth discussion, see Danker & Anderson, 2010).
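The logic of the ROI approach can be made concrete with a small simulation. The following is only a minimal sketch on made-up data, not the analysis pipeline of any study cited here: the mean responses, noise level, trial counts, and every function name are illustrative assumptions. It averages activity over all voxels in a hypothetical picture-selective region and asks whether the picture-minus-sound difference seen at encoding reappears, in weaker form, at retrieval.

```python
import random
from statistics import mean

rng = random.Random(1)
N_TRIALS, N_VOXELS, NOISE_SD = 30, 20, 0.5  # arbitrary simulation settings


def simulate_roi(mean_response):
    """Simulate a trials x voxels table of activity around an assumed mean."""
    return [[mean_response + rng.gauss(0.0, NOISE_SD) for _ in range(N_VOXELS)]
            for _ in range(N_TRIALS)]


def roi_mean(trials):
    """ROI summary: average over every voxel and every trial of one condition."""
    return mean(mean(voxels) for voxels in trials)


# Assumed (made-up) mean responses of a hypothetical picture-selective ROI.
assumed_response = {
    ("encoding", "picture"): 1.0,
    ("encoding", "sound"): 0.2,
    ("retrieval", "picture"): 0.5,  # weaker, cue-driven reactivation
    ("retrieval", "sound"): 0.2,
}
data = {key: simulate_roi(mu) for key, mu in assumed_response.items()}


def picture_minus_sound(phase):
    """Condition difference in overall ROI activity for one phase."""
    return roi_mean(data[(phase, "picture")]) - roi_mean(data[(phase, "sound")])


encoding_effect = picture_minus_sound("encoding")
retrieval_effect = picture_minus_sound("retrieval")
print(f"picture - sound at encoding:  {encoding_effect:.2f}")
print(f"picture - sound at retrieval: {retrieval_effect:.2f}")
```

Because the summary collapses across voxels, this kind of analysis is sensitive only to overall activity differences within the region, which is the property that distinguishes it from the pattern-based approach discussed next.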
In recent years, a new approach known as classification or multivoxel pattern analysis (MVPA, Haxby et al., 2001; Mitchell et al., 2004) has become popular for investigating the reactivation of encoding representations during retrieval. In contrast to the ROI approach, which is sensitive to overall activity differences between conditions within a region (i.e., a group of contiguous voxels), the classifier approach is sensitive to differences in the pattern of activity across voxels between conditions. In a typical classification study, a computer algorithm known as a classifier (e.g., a neural network) is trained to differentiate the pattern of activity across voxels between two or more conditions. The logic behind applying the classifier approach to study episodic memory is as follows: If partial cues reactivate cortex during retrieval, then the pattern of cortical activity within a particular condition during cued retrieval should resemble, at least partially, the pattern of activity within that condition during encoding. Therefore, a classifier trained to differentiate conditions on encoding trials should also be able to classify retrieval trials at above chance accuracy. Greater similarity between corresponding encoding and retrieval patterns will be reflected in greater classifier accuracy. Polyn and colleagues (2005) were the first to apply the classification method to the study of memory in this manner. Participants studied lists containing photographs of famous faces, famous locations, and common objects, and subsequently retrieved as many list items as possible in a free recall paradigm (i.e., no cues were presented). Polyn et al. trained classifiers to differentiate between encoding trials in the three conditions, and tested the classifiers using the free recall data. Consistent with their predictions, Polyn et

al. found that the reactivation of a given stimulus type’s pattern of activity correlated with verbal recall of items of that stimulus type. It is worth noting that the voxels that contributed to classification decisions overlapped with, but were not limited to, the category-selective regions that one would expect to find using the ROI approach (i.e., the FFA and PPA). Polyn et al. present their technique of applying MVPA to memory retrieval as “a powerful new tool that researchers can use to test and refine theories of how people mine the recesses of the past” (p. 1966).

As mentioned earlier, the retrieval of associations and contexts is one of the hallmarks of episodic retrieval. Insofar as episodic retrieval is characterized by cortical reactivation, we should expect greater reactivation when episodic details are recovered during retrieval. Consistent with this, the earliest studies to find reactivation of encoding regions during retrieval required associative retrieval (Nyberg et al., 2000; Wheeler et al., 2000). Along the same lines, reactivation correlates with subjective reports of the recovery of episodic details. In studies using the remember/know paradigm, items endorsed as remembered have been found to evoke more reactivation than items endorsed as known using both the ROI (Wheeler & Buckner, 2004) and classifier (Johnson et al., 2009) approaches. Similarly, Daselaar et al. (2008) found that the degree of activation in auditory and visual association cortex was positively correlated with participant ratings of reliving during autobiographical memory retrieval, suggesting that reactivation correlates with the number or quality of retrieved details. Overall, the current evidence suggests that reactivation correlates with subjective ratings of episodic retrieval.

It is often the case that a particular retrieval cue is associated with multiple episodes.
Retrieval becomes more difficult in the presence of competing associations (Anderson, 1974), and this is often reflected in increased prefrontal and anterior cingulate cortex (ACC) involvement during retrieval (e.g., Danker, Gunn, & Anderson, 2008; Thompson-Schill et al., 1997; Wagner et al., 2001). It has been theorized that this increased frontal activity represents the engagement of control processes that select among competing alternatives during retrieval (e.g., Danker, Gunn, & Anderson, 2008; Thompson-Schill et al., 1997; Wagner et al., 2001). According to Kuhl and colleagues (2010), if competition results from the simultaneous retrieval of multiple episodes, then competition should be reflected in the simultaneous reactivation of competing memories during retrieval. In their study, Kuhl et al. (2010) instructed participants to associate words (e.g., “lamp”) with images of well-known faces (e.g., Robert De Niro) or scenes (e.g., Taj Mahal). Some words were paired with one associate, and some words were paired with two associates: one face and one scene. During retrieval, participants were presented with a word as a cue and instructed to recall its most recent associate and indicate the visual category (face or scene). Kuhl et al. (2010) used a classifier approach to capture the amount of target and competitor reactivation during retrieval and found that competition decreased target classifier accuracy, presumably because of increased competitor reactivation. Furthermore, when classifier accuracy was low, indicating high competition, frontal engagement was increased. A follow-up study using three kinds of images (faces, scenes, and objects) confirmed that competition corresponded to increased competitor reactivation, and found that competitor reactivation correlated with ACC engagement during retrieval (Kuhl,

Bainbridge, & Chun, 2012). These studies demonstrate that competition during retrieval is reflected in the reactivation of competing memories.
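The classifier logic described in this section (train on encoding trials, then test on retrieval trials) can be sketched in a few lines. This is a hedged illustration on simulated data, not the method of Polyn et al. (2005) or of any other study cited here. The correlation-based nearest-centroid classifier is a deliberately simple stand-in for the classifiers used in the literature, and the voxel count, signal strengths, noise level, and function names are all assumptions chosen for the example.

```python
import random
from statistics import mean


def correlation(x, y):
    """Pearson correlation between two equal-length activity patterns."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate_trial(template, signal, noise_sd, rng):
    """One trial's voxel pattern: scaled category template plus Gaussian noise."""
    return [signal * t + rng.gauss(0.0, noise_sd) for t in template]


def train_centroids(trials):
    """Training step: average the encoding patterns of each category."""
    centroids = {}
    for label in {lab for lab, _ in trials}:
        patterns = [p for lab, p in trials if lab == label]
        centroids[label] = [mean(voxel) for voxel in zip(*patterns)]
    return centroids


def classify(pattern, centroids):
    """Pick the category whose encoding centroid correlates best with the pattern."""
    return max(centroids, key=lambda label: correlation(pattern, centroids[label]))


rng = random.Random(0)
n_voxels = 50
categories = ["face", "location", "object"]
# Assumed category-specific voxel patterns (purely synthetic).
templates = {c: [rng.gauss(0.0, 1.0) for _ in range(n_voxels)] for c in categories}

# Encoding trials carry the full pattern; retrieval trials carry a weaker copy,
# standing in for partial reactivation of the encoding pattern.
encoding = [(c, simulate_trial(templates[c], 1.0, 1.0, rng))
            for c in categories for _ in range(20)]
retrieval = [(c, simulate_trial(templates[c], 0.5, 1.0, rng))
             for c in categories for _ in range(20)]

centroids = train_centroids(encoding)  # the classifier sees encoding data only
hits = sum(classify(pattern, centroids) == label for label, pattern in retrieval)
accuracy = hits / len(retrieval)
chance = 1 / len(categories)
print(f"retrieval accuracy = {accuracy:.2f} (chance = {chance:.2f})")
```

In this sketch, above-chance accuracy on the retrieval trials arises only because the retrieval patterns partially resemble the encoding patterns, which is the inference the classifier approach relies on. Lowering the assumed retrieval signal toward zero moves accuracy toward chance.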

Hippocampal Reactivation Mediates Cortical Reactivation: An Open Question

Figure 18.1 outlines a process whereby hippocampal reactivation mediates cortical reactivation during retrieval. As discussed in this chapter, current research supports the role of the hippocampus in episodic encoding and retrieval: Hippocampal activity during encoding predicts subsequent memory, and hippocampal activity during retrieval coincides with the recovery of episodic details. Furthermore, we have shown that cortical reactivation occurs during retrieval and correlates with the recovery of episodic details. Despite the fact that many models of hippocampal–cortical interaction converge on the theory that the hippocampus mediates cortical reactivation during retrieval (Alvarez & Squire, 1994; McClelland et al., 1995; Moscovitch et al., 2005), there is currently a paucity of empirical evidence for this claim from neuroimaging studies. If hippocampal reactivation mediates cortical reactivation, then hippocampal reactivation should both precede and predict cortical reactivation. Given the limitations of the methodological techniques currently available, there are two major hurdles: (1) simultaneous measurement of hippocampal and cortical reactivation, and (2) measurement of reactivation at a temporal resolution capable of distinguishing the temporal order of hippocampal and cortical reactivation.

By capitalizing on stimulus types with distinct patterns of hippocampal and cortical activity, one should be able to simultaneously measure hippocampal and cortical reactivation during retrieval using classification methods. In fact, there has already been some success in using classifiers to identify cognitive states (Hassabis et al., 2009), and even individual memories (Chadwick et al., 2010), using patterns of activity across hippocampal voxels. However, no study has attempted to classify retrieval trials using a classifier trained on the encoding data.
This would be a true demonstration of hippocampal reactivation during retrieval.

However, measuring cortical and hippocampal reactivations at a sufficient temporal resolution to distinguish their order is a more difficult methodological barrier. Whereas event-related potential (ERP) studies of reactivation have provided estimates of how early cortical reactivation occurs (Johnson et al., 2008; Slotnick, 2009b; Yick & Wilding, 2008), it would be extremely difficult to isolate a signal from the hippocampus using electroencephalography (EEG) or even magnetoencephalography (MEG). Testing the hypothesis that hippocampal reactivation mediates cortical reactivation during episodic retrieval will be one of the major challenges for episodic memory researchers in the near future.
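The prediction that hippocampal reactivation should both precede and predict cortical reactivation can be stated as a lagged-correlation test. The sketch below is purely hypothetical: it assumes that some method could deliver simultaneous, temporally resolved reactivation measures for both regions, which, as noted above, is not currently available. The two time series are simulated, the two-sample delay is built into the simulation, and all names and parameters are illustrative assumptions.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag]; positive lag = leader first."""
    if lag >= 0:
        x, y = leader[:len(leader) - lag], follower[lag:]
    else:
        x, y = leader[-lag:], follower[:len(follower) + lag]
    return pearson(x, y)


rng = random.Random(42)
n_samples, true_lag = 300, 2  # assumed: cortex follows hippocampus by 2 samples

# Simulated reactivation measures (e.g., classifier evidence per time point).
hippocampus = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
cortex = [(hippocampus[t - true_lag] if t >= true_lag else 0.0) + rng.gauss(0.0, 0.3)
          for t in range(n_samples)]

# "Predict": the two series correlate. "Precede": the correlation peaks at a
# positive lag, i.e., hippocampal values match later cortical values.
profile = {lag: lagged_correlation(hippocampus, cortex, lag) for lag in range(-5, 6)}
best_lag = max(profile, key=profile.get)
print(f"peak correlation {profile[best_lag]:.2f} at lag {best_lag}")
```

A correlation peak at a positive lag would be consistent with hippocampal reactivation leading cortical reactivation, although a lagged correlation alone would not establish mediation.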


References

Addis, D. R., Moscovitch, M., Crawley, A. P., & McAndrews, M. P. (2004). Recollective qualities modulate hippocampal activation during autobiographical memory retrieval. Hippocampus, 14, 752–762.
Addis, D. R., Wong, A. T., & Schacter, D. L. (2007). Remembering the past and imagining the future: Common and distinct neural substrates during event construction and elaboration. Neuropsychologia, 45 (7), 1363–1377.
Alvarez, P., & Squire, L. R. (1994). Memory consolidation and the medial temporal lobe: A simple network model. Proceedings of the National Academy of Sciences, 91, 7041–7045.
Anderson, J. R. (1974). Retrieval of propositional information from long-term memory. Cognitive Psychology, 5, 451–474.
Awipi, T., & Davachi, L. (2008). Content-specific source encoding in the human medial temporal lobe. Journal of Experimental Psychology: Learning, Memory and Cognition, 34 (4), 769–779.
Badre, D., Poldrack, R. A., Paré-Blagoev, E. J., Insler, R. Z., & Wagner, A. D. (2005). Dissociable controlled retrieval and generalized selection mechanisms in ventrolateral prefrontal cortex. Neuron, 47, 907–918.
Bowles, B., Crupi, C., Mirsattari, S. M., Pigott, S. E., Parrent, A. G., Pruessner, J. C., Yonelinas, A. P., & Köhler, S. (2007). Impaired familiarity with preserved recollection after anterior temporal-lobe resection that spares the hippocampus. Proceedings of the National Academy of Sciences, 104 (41), 16382–16387.
Brewer, J. B., Zhao, Z., Desmond, J. E., Glover, G. H., & Gabrieli, J. D. E. (1998). Making memories: Brain activity that predicts how well visual experience will be remembered. Science, 281, 1185–1187.
Cansino, S., Maquet, P., Dolan, R. J., & Rugg, M. D. (2002). Brain activity underlying encoding and retrieval of source memory. Cerebral Cortex, 12, 1048–1056.
Chadwick, M. J., Hassabis, D., Weiskopf, N., & Maguire, E. A. (2010). Decoding individual episodic memory traces in the human hippocampus. Current Biology, 20, 544–547.
Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11, 671–684.
Danker, J. F., & Anderson, J. R. (2010). The ghosts of brain states past: Remembering reactivates the brain regions engaged during encoding. Psychological Bulletin, 136, 87–102.
Danker, J. F., Fincham, J. M., & Anderson, J. R. (2011). The neural correlates of competition during memory retrieval are modulated by attention to the cues. Neuropsychologia, 49, 2427–2438.

Danker, J. F., Gunn, P., & Anderson, J. R. (2008). A rational account of memory predicts left prefrontal activation during controlled retrieval. Cerebral Cortex, 18, 2674–2685.
Daselaar, S. M., Fleck, M. S., & Cabeza, R. (2006). Triple dissociation in the medial temporal lobes: Recollection, familiarity, and novelty. Journal of Neurophysiology, 96, 1902–1911.
Daselaar, S. M., Rice, H. J., Greenberg, D. L., Cabeza, R., LaBar, K. S., & Rubin, D. C. (2008). The spatiotemporal dynamics of autobiographical memory: Neural correlates of recall, emotional intensity, and reliving. Cerebral Cortex, 18, 217–229.
Davachi, L. (2004). The ensemble that plays together, stays together. Hippocampus, 14, 1–3.
Davachi, L., Mitchell, J. P., & Wagner, A. D. (2003). Multiple routes to memory: Distinct medial temporal lobe processes build item and source memories. Proceedings of the National Academy of Sciences, 100, 2157–2162.
Davachi, L. (2006). Item, context and relational episodic encoding in humans. Current Opinion in Neurobiology, 16, 693–700.
Davachi, L. (2007). Encoding: The proof is still required. In H. L. Roediger III, Y. Dudai, & S. M. Fitzpatrick (Eds.), Science of memory: Concepts (pp. 137–143). New York: Oxford University Press.
Dobbins, I. G., Rice, H. J., Wagner, A. D., & Schacter, D. L. (2003). Memory orientation and success: Separable neurocognitive components underlying episodic recognition. Neuropsychologia, 41, 318–333.
Dougal, S., Phelps, E. A., & Davachi, L. (2007). The role of the medial temporal lobe in item recognition and source recollection of emotional stimuli. Cognitive, Affective, & Behavioral Neuroscience, 7 (3), 233–242.
Diana, R. A., Yonelinas, A. P., & Ranganath, C. (2007). Imaging recollection and familiarity in the medial temporal lobe: A three-component model. Trends in Cognitive Sciences, 11, 379–386.
Diekelmann, S., & Born, J. (2010). The memory function of sleep. Nature Reviews Neuroscience, 11, 114–126.
Dolan, R. J., & Fletcher, P. C. (1997). Dissociating prefrontal and hippocampal function in episodic memory encoding. Nature, 388, 582–585.
Eichenbaum, H., Yonelinas, A. P., & Ranganath, C. (2007). The medial temporal lobes and recognition memory. Annual Review of Neuroscience, 30, 123–152.

Eichenbaum, H., Sauvage, M., Fortin, N., Komorowski, R., & Lipton, P. (2012). Towards a functional organization of episodic memory in the medial temporal lobe. Neuroscience & Biobehavioral Reviews, 36, 1597–1608.
Eichenbaum, H., Fortin, N., Sauvage, M., Robitsek, R. J., & Farovik, A. (2010). An animal model of amnesia that uses Receiver Operating Characteristics (ROC) analysis to distinguish recollection from familiarity deficits in recognition memory. Neuropsychologia, 48, 2281–2289.
Eldridge, L. L., Knowlton, B. J., Furmanski, C. S., Bookheimer, S. Y., & Engel, S. A. (2000). Remembering episodes: A selective role for the hippocampus during retrieval. Nature Neuroscience, 3, 1149–1152.
Ellenbogen, J. M., Hu, P. T., Payne, J. D., Titone, D., & Walker, M. P. (2007). Human relational memory requires time and sleep. Proceedings of the National Academy of Sciences, 104 (18), 7723–7728.
Epstein, R., & Kanwisher, N. (1998). A cortical representation of the local visual environment. Nature, 392, 598–601.
Farah, M. J. (1988). Is visual imagery really visual? Overlooked evidence from neuropsychology. Psychological Review, 95, 307–317.
Fiez, J. A. (1997). Phonology, semantics, and the role of the left inferior prefrontal cortex. Human Brain Mapping, 5, 79–83.
Fletcher, P. C., Shallice, T., & Dolan, R. J. (1998). The functional roles of prefrontal cortex in episodic memory. I. Encoding. Brain, 121, 1239–1248.
Gabrieli, J. D. E., Brewer, J. B., Desmond, J. E., & Glover, G. H. (1997). Separate neural bases of two fundamental memory processes in the human medial temporal lobe. Science, 276, 264–266.
Gelbard-Sagiv, H., Mukamel, R., Harel, M., Malach, R., & Fried, I. (2008). Internally generated reactivation of single neurons in human hippocampus during free recall. Science, 322, 96–101.
Giovanello, K. S., Verfaellie, M., & Keane, M. M. (2003). Disproportionate deficit in associative recognition relative to item recognition in global amnesia. Cognitive, Affective, & Behavioral Neuroscience, 3, 186–194.
Gold, J. J., Smith, C. N., Bayley, P. J., Shrager, Y., Brewer, J. B., Stark, C. E. L., Hopkins, R. O., & Squire, L. R. (2006). Item memory, source memory, and the medial temporal lobe: Concordant findings from fMRI and memory-impaired patients. Proceedings of the National Academy of Sciences of the United States of America, 103, 9351–9356.
Greenberg, D. L., & Rubin, D. C. (2003). The neuropsychology of autobiographical memory. Cortex, 39, 687–728.

Hannula, D. E., & Ranganath, C. (2008). Medial temporal lobe activity predicts successful relational memory binding. Journal of Neuroscience, 28 (1), 116–124.
Haskins, A. L., Yonelinas, A. P., & Ranganath, C. (2008). Perirhinal cortex supports unitization and familiarity-based recognition of novel associations. Neuron, 59 (4), 554–560.
Hassabis, D., Kumaran, D., Vann, S. D., & Maguire, E. A. (2007). Patients with hippocampal amnesia cannot imagine new experiences. Proceedings of the National Academy of Sciences, 104 (5), 1726–1731.
Hassabis, D., Chu, C., Rees, G., Weiskopf, N., Molyneux, P. D., & Maguire, E. A. (2009). Decoding neuronal ensembles in the human hippocampus. Current Biology, 19, 546–554.
Haxby, J. V., Gobbini, M. I., Furey, M. L., Ishai, A., Schouten, J. L., & Pietrini, P. (2001). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293, 2425–2430.
Hayes, S. M., Buchler, N., Stokes, J., Kragel, J., & Cabeza, R. (2011). Neural correlates of confidence during item recognition and source memory retrieval: Evidence for both dual-process and strength memory theories. Journal of Cognitive Neuroscience, 23, 3959–3971.
Hebb, D. O. (1949). The organization of behavior. New York: Wiley & Sons.
Jackson, O., & Schacter, D. L. (2004). Encoding activity in anterior medial temporal lobe supports subsequent associative recognition. NeuroImage, 21, 456–462.
James, W. (1890). The principles of psychology. New York: Holt.
Johnson, J. D., Minton, B. R., & Rugg, M. D. (2008). Context-dependence of the electrophysiological correlates of recollection. NeuroImage, 39, 406–416.
Johnson, J. D., McDuff, S. G. R., Rugg, M. D., & Norman, K. A. (2009). Recollection, familiarity, and cortical reinstatement: A multivoxel pattern analysis. Neuron, 63, 697–708.
Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302–4311.
Kelley, W. M., Miezin, F. M., McDermott, K. B., Buckner, R. L., Raichle, M. E., Cohen, N. J., Ollinger, J. M., Akbudak, E., Conturo, T. E., Snyder, A. Z., & Petersen, S. E. (1998). Hemispheric specialization in human dorsal frontal cortex and medial temporal lobe for verbal and nonverbal memory encoding. Neuron, 20, 927–936.
Kensinger, E. A., Clarke, R. J., & Corkin, S. (2003). What neural correlates underlie successful encoding and retrieval? A functional magnetic resonance imaging study using a divided attention paradigm. Journal of Neuroscience, 23, 2407–2415.

Kensinger, E. A., & Schacter, D. L. (2006). Amygdala activity is associated with the successful encoding of item, but not source, information for positive and negative stimuli. Journal of Neuroscience, 26 (9), 2564–2570.
Khader, P., Burke, M., Bien, S., Ranganath, C., & Rösler, F. (2005). Content-specific activation during associative long-term memory retrieval. NeuroImage, 27, 805–816.
Kirwan, C. B., & Stark, C. E. L. (2004). Medial temporal lobe activation during encoding and retrieval of novel face-name pairs. Hippocampus, 14, 919–930.
Kirwan, C. B., Wixted, J. T., & Squire, L. R. (2008). Activity in the medial temporal lobe predicts memory strength, whereas activity in the prefrontal cortex predicts recollection. Journal of Neuroscience, 28, 10541–10548.
Komorowski, R. W., Manns, J. R., & Eichenbaum, H. (2009). Robust conjunctive item-place coding by hippocampal neurons parallels learning what happens where. Journal of Neuroscience, 29, 9918–9929.
Kuhl, B. A., Rissman, J., Chun, M. M., & Wagner, A. D. (2010). Fidelity of neural reactivation reveals competition between memories. Proceedings of the National Academy of Sciences, 108, 5903–5908.
Kuhl, B. A., Bainbridge, W. A., & Chun, M. M. (2012). Neural reactivation reveals mechanisms for updating memory. Journal of Neuroscience, 32, 3453–3461.
Levin, D. T., Simons, D. J., Angelone, B. L., & Chabris, C. F. (2002). Memory for centrally attended changing object in an incidental real-world change detection paradigm. British Journal of Psychology, 92, 289–302.
Liang, J., Wagner, A. D., & Preston, A. R. (2013). Content representation in the human medial temporal lobe. Cerebral Cortex, 23 (1), 80–96.
Litman, L., & Davachi, L. (2008). Distributed learning enhances relational memory consolidation. Learning & Memory, 15, 711–716.
Marr, D. (1971). Simple memory: A theory for archicortex. Philosophical Transactions of the Royal Society of London, Series B, 176, 161–234.
McClelland, J. L., McNaughton, B. L., & O’Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102, 419–457.
Meunier, M., Bachevalier, J., Mishkin, M., & Murray, E. A. (1993). Effects on visual recognition of combined and separate ablations of the entorhinal and perirhinal cortex in rhesus monkeys. Journal of Neuroscience, 13, 5418–5432.

Mitchell, T., Hutchinson, R., Niculescu, S., Pereira, F., Wang, X., Just, M., & Newman, S. (2004). Learning to decode cognitive states from brain images. Machine Learning, 57, 145–175.
Montaldi, D., Mayes, A. R., Barnes, A., Pirie, H., Hadley, D. M., Patterson, J., & Wyper, D. J. (1998). Associative encoding of pictures activates the medial temporal lobes. Human Brain Mapping, 6, 85–104.
Morris, C. D., Bransford, J. D., & Franks, J. J. (1977). Levels of processing versus transfer appropriate processing. Journal of Verbal Learning and Verbal Behavior, 16, 519–533.
Moscovitch, M., Rosenbaum, R. S., Gilboa, A., Addis, D. R., Westmacott, R., Grady, C., McAndrews, M. P., Levine, B., Black, S., Winocur, G., & Nadel, L. (2005). Functional neuroanatomy of remote episodic, semantic, and spatial memory: A unified account based on multiple trace theory. Journal of Anatomy, 207, 35–66.
Norman, K. A., & Schacter, D. L. (1997). False recognition in young and older adults: Exploring the characteristics of illusory memories. Memory & Cognition, 25, 838–848.
Norman, K. A., & O’Reilly, R. C. (2003). Modeling hippocampal and neocortical contributions to recognition memory: A complementary-learning-systems approach. Psychological Review, 110, 611–646.
Nyberg, L., Habib, R., McIntosh, A. R., & Tulving, E. (2000). Reactivation of encoding-related brain activity during memory retrieval. Proceedings of the National Academy of Sciences, 97, 11120–11124.
O’Craven, K. M., & Kanwisher, N. (2000). Mental imagery of faces and places activates corresponding stimulus-specific brain regions. Journal of Cognitive Neuroscience, 12, 1013–1023.
O’Reilly, R. C., & Rudy, J. W. (2000). Computational principles of learning in the neocortex and hippocampus. Hippocampus, 10, 389–397.
Paller, K. A., Kutas, M., & Mayes, A. R. (1987). Neural correlates of encoding in an incidental learning paradigm. Electroencephalography and Clinical Neurophysiology, 67, 360–371.
Park, H., & Rugg, M. D. (2011). Neural correlates of encoding within- and across-domain inter-item associations. Journal of Cognitive Neuroscience, 23, 2533–2543.
Park, H., Shannon, V., Biggan, J., & Spann, C. (2012). Neural activity supporting the formation of associative memory versus source memory. Brain Research, 1471, 81–92.
Parkinson, J. K., Murray, E. A., & Mishkin, M. (1988). A selective mnemonic role for the hippocampus in monkeys: Memory for the location of objects. Journal of Neuroscience, 8, 4159–4167.

Peigneux, P., Laureys, S., Fuchs, S., Collette, F., Perrin, F., Reggers, J., Phillips, C., Degueldre, C., Del Fiore, G., Aerts, J., Luxen, A., & Maquet, P. (2004). Are spatial memories strengthened in the human hippocampus during slow wave sleep? Neuron, 44, 535–545.
Peigneux, P., Orban, P., Balteau, E., Degueldre, C., Luxen, A., Laureys, S., & Maquet, P. (2006). Offline persistence of memory-related cerebral activity during active wakefulness. PLoS Biology, 4 (4), e100.
Petersen, S. E., Fox, P. T., Posner, M. I., Mintun, M., & Raichle, M. E. (1989). Positron emission tomographic studies of processing of single words. Journal of Cognitive Neuroscience, 1, 153–170.
Poldrack, R. A., Wagner, A. D., Prull, M. W., Desmond, J. E., Glover, G. H., & Gabrieli, J. D. E. (1999). Functional specialization for semantic and phonological processing in left inferior frontal cortex. NeuroImage, 10, 15–35.
Polyn, S. M., Natu, V. S., Cohen, J. D., & Norman, K. A. (2005). Category-specific cortical activity precedes retrieval during memory search. Science, 310, 1963–1966.
Prince, S. E., Daselaar, S. M., & Cabeza, R. (2005). Neural correlates of relational memory: Successful encoding and retrieval of semantic and perceptual associations. Journal of Neuroscience, 25 (5), 1203–1210.
Ranganath, C., Cohen, M. X., Dam, C., & D’Esposito, M. (2004). Inferior temporal, prefrontal, and hippocampal contributions to visual working memory maintenance and associative memory retrieval. Journal of Neuroscience, 24, 3917–3925.
Ranganath, C. (2010). A unified framework for the functional organization of the medial temporal lobes and the phenomenology of episodic memory. Hippocampus, 20, 1263–1290.
Rasch, B., Büchel, C., Gais, S., & Born, J. (2007). Odor cues during slow-wave sleep prompt declarative memory consolidation. Science, 315, 1426–1429.
Roediger, H. L. (2000). Why retrieval is the key process to understanding human memory. In E. Tulving (Ed.), Memory, consciousness, and the brain: The Tallinn conference (pp. 52–75). Philadelphia: Psychology Press.
Rombouts, S. A. R. B., Machielsen, W. C. M., Witter, M. P., Barkhof, F., Lindeboom, J., & Scheltens, P. (1997). Visual associative encoding activates the medial temporal lobe: A functional magnetic resonance imaging study. Hippocampus, 7, 594–601.
Rubin, D. C., & Greenberg, D. L. (1998). Visual memory-deficit amnesia: A distinct amnesia presentation and etiology. Proceedings of the National Academy of Sciences, 95, 5413–5416.

Rugg, M. D. (1995). ERP studies of memory. In M. D. Rugg & M. G. H. Coles (Eds.), Electrophysiology of mind: Event-related brain potentials and cognition (pp. 133–170). London: Oxford University Press.
Sanquist, T. F., Rohrbaugh, J., Syndulko, K., & Lindsley, D. B. (1980). An event-related potential analysis of coding processes in human memory. Progress in Brain Research, 54, 655–660.
Sauvage, M. M., Fortin, N. J., Owens, C. B., Yonelinas, A. P., & Eichenbaum, H. (2008). Recognition memory: Opposite effects of hippocampal damage on recollection and familiarity. Nature Neuroscience, 11, 16–18.
Shallice, T., Fletcher, P., Frith, C. D., Grasby, P., Frackowiak, R. S. J., & Dolan, R. J. (1994). Brain regions associated with acquisition and retrieval of verbal episodic memory. Nature, 368, 633–635.
Shrager, Y., Kirwan, C. B., & Squire, L. R. (2008). Activity in both hippocampus and perirhinal cortex predicts the memory strength of subsequently remembered information. Neuron, 59, 547–553.
Slotnick, S. D. (2009a). Memory for color reactivates color processing region. NeuroReport, 20, 1568–1571.
Slotnick, S. D. (2009b). Rapid retinotopic reactivation during spatial memory. Brain Research, 1268, 97–111.
Slotnick, S. D., & Thakral, P. P. (2011). Memory for motion and spatial location is mediated by contralateral and ipsilateral motion processing cortex. NeuroImage, 55, 794–800.
Song, Z., Wixted, J. T., Smith, C. N., & Squire, L. R. (2011). Different nonlinear functions in hippocampus and perirhinal cortex relating functional MRI activity to memory strength. Proceedings of the National Academy of Sciences, 108, 5783–5788.
Spiers, H. J., Burgess, N., Hartley, T., Vargha-Khadem, F., & O’Keefe, J. (2001). Bilateral hippocampal pathology impairs topographical and episodic memory but not visual pattern matching. Hippocampus, 11 (6), 715–725.
Squire, L. R., Stark, C. E. L., & Clark, R. E. (2004). The medial temporal lobe. Annual Review of Neuroscience, 27, 279–306.
Squire, L. R., van der Horst, A. S., McDuff, S. G., Frascino, J. C., Hopkins, R. O., & Mauldin, K. N. (2010). Role of the hippocampus in remembering the past and imagining the future. Proceedings of the National Academy of Sciences, 107 (44), 19044–19048.
Staresina, B. P., & Davachi, L. (2008). Selective and shared contributions of the hippocampus and perirhinal cortex to episodic item and associative encoding. Journal of Cognitive Neuroscience, 20 (8), 1478–1489.

Staresina, B. P., & Davachi, L. (2009). Mind the gap: Binding experiences across space and time in the human hippocampus. Neuron, 63 (3), 267–276.
Staresina, B. P., Duncan, K. D., & Davachi, L. (2011). Perirhinal and parahippocampal cortices differentially contribute to later recollection of object- and scene-related event details. Journal of Neuroscience, 31 (24), 8739–8747.
Stark, C. E. L., & Squire, L. R. (2001). Simple and associative recognition memory in the hippocampal region. Learning & Memory, 8, 190–197.
Stark, C. E. L., & Squire, L. R. (2003). Hippocampal damage equally impairs memory for single items and memory for conjunctions. Hippocampus, 13, 281–292.
Tambini, A., Ketz, N., & Davachi, L. (2010). Enhanced brain correlations during rest are related to memory for recent experiences. Neuron, 65, 280–290.
Thompson-Schill, S. L., D’Esposito, M., Aguirre, G. K., & Farah, M. J. (1997). Role of left inferior prefrontal cortex in retrieval of semantic knowledge: A reevaluation. Proceedings of the National Academy of Sciences, 94, 14792–14797.
Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory (pp. 381–403). New York: Academic Press.
Tulving, E. (1973). Encoding specificity and retrieval processes in episodic memory. Psychological Review, 80, 352–373.
Tulving, E. (1983). Elements of episodic memory. New York: Oxford University Press.
Tulving, E., Markowitsch, H. J., Kapur, S., Habib, R., & Houle, S. (1994). Novelty encoding networks in the human brain: Positron emission tomography data. NeuroReport, 5, 2525–2528.
Uncapher, M. R., Otten, L. J., & Rugg, M. D. (2006). Episodic encoding is more than the sum of its parts: An fMRI investigation of multifeatural contextual encoding. Neuron, 52 (3), 547–556.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press.
Vaidya, C. J., Zhao, M., Desmond, J. E., & Gabrieli, J. D. E. (2002). Evidence for cortical specificity in episodic memory: Memory-induced re-activation of picture processing areas. Neuropsychologia, 40, 2136–2143.
Vann, S. D., Tsivilis, D., Denby, C. E., Quamme, J. R., Yonelinas, A. P., Aggleton, J. P., Montaldi, D., & Mayes, A. R. (2009). Impaired recollection but spared familiarity in patients with extended hippocampal system damage revealed by 3 convergent methods. Proceedings of the National Academy of Sciences, 106 (13), 5442–5447.

Wagner, A. D., Schacter, D. L., Rotte, M., Koutstaal, W., Maril, A., Dale, A. M., Rosen, B. R., & Buckner, R. L. (1998). Building memories: Remembering and forgetting verbal experiences as predicted by brain activity. Science, 281, 1188–1191.
Wagner, A. D., Koutstaal, W., & Schacter, D. L. (1999). When encoding yields remembering: Insights from event-related neuroimaging. Philosophical Transactions of the Royal Society of London, Biology, 354, 1307–1324.
Wagner, A. D., Paré-Blagoev, E. J., Clark, J., & Poldrack, R. A. (2001). Recovering meaning: Left prefrontal cortex guides controlled semantic retrieval. Neuron, 31, 329–338.
Wheeler, M. E., & Buckner, R. L. (2003). Functional dissociations among components of remembering: Control, perceived oldness, and content. Journal of Neuroscience, 23, 3869–3880.
Wheeler, M. E., & Buckner, R. L. (2004). Functional-anatomic correlates of remembering and knowing. NeuroImage, 21, 1337–1349.
Wheeler, M. E., Petersen, S. E., & Buckner, R. L. (2000). Memory’s echo: Vivid remembering reactivates sensory-specific cortex. Proceedings of the National Academy of Sciences, 97, 11125–11129.
Wheeler, M. E., Shulman, G. L., Buckner, R. L., Miezin, F. M., Velanova, K., & Petersen, S. E. (2006). Evidence for separate perceptual reactivation and search processes during remembering. Cerebral Cortex, 6, 949–959.
Wixted, J. T., & Squire, L. R. (2004). Recall and recognition are equally impaired in patients with selective hippocampal damage. Cognitive, Affective, and Behavioral Neuroscience, 4, 58–66.
Wixted, J. T., & Squire, L. R. (2011). The medial temporal lobe and the attributes of memory. Trends in Cognitive Sciences, 15, 210–217.
Yick, Y. Y., & Wilding, E. L. (2008). Material-specific correlates of memory retrieval. NeuroReport, 19, 1463–1467.
Yonelinas, A. P., Hopfinger, J. B., Buonocore, M. H., Kroll, N. E. A., & Baynes, K. (2001). Hippocampal, parahippocampal, and occipital-temporal contributions to associative and item recognition memory: An fMRI study. NeuroReport, 12, 359–363.
Yonelinas, A. P., Kroll, N. E., Quamme, J. R., Lazzara, M. M., Sauvé, M. J., Widaman, K. F., & Knight, R. T. (2002). Effects of extensive temporal lobe damage or mild hypoxia on recollection and familiarity. Nature Neuroscience, 11, 1236–1241.
Yonelinas, A. P., Otten, L. J., Shaw, K. N., & Rugg, M. D. (2005). Separating the brain regions involved in recollection and familiarity in recognition memory. Journal of Neuroscience, 25, 3002–3008.


Yu, S. S., Johnson, J. D., & Rugg, M. D. (2012). Hippocampal activity during recognition memory co-varies with the accuracy and confidence of source memory judgments. Hippocampus, 22, 1429–1437.
Zola-Morgan, S., & Squire, L. R. (1986). Memory impairment in monkeys following lesions limited to the hippocampus. Behavioral Neuroscience, 100, 155–160.

Lila Davachi, Department of Psychology, Center for Neural Science, New York University

Jared Danker, Department of Psychology, New York University


Working Memory

Bradley R. Buchsbaum and Mark D'Esposito
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0019

Abstract and Keywords

Working memory refers to the temporary retention of information that has just arrived at the senses or been retrieved from long-term memory. Although internal representations of external stimuli have a natural tendency to decay, they can be kept “in mind” through the action of maintenance or rehearsal strategies, and can be subjected to various operations that manipulate information in the service of ongoing behavior. Empirical studies of working memory using the tools of neuroscience, such as electrophysiological recordings in monkeys and functional neuroimaging in humans, have advanced our knowledge of the underlying neural mechanisms of working memory.

Keywords: short-term memory, working memory, prefrontal cortex, maintenance, neuroimaging, functional magnetic resonance imaging, phonological loop, visual-spatial sketchpad, central executive

Introduction

Humans and other animals with elaborately evolved sensory systems are prodigious consumers of information: Each successive moment in an ever-changing environment nets a vast informational catch—a rich and teeming mélange of sights, sounds, smells, and sensations. Not everything that is caught by the senses, however, is kept—and that which is kept may not be kept for long. Indeed, the portion of experience that survives the immediate moment is but a small part of the overall sensory input. With regard to memory storage, then, the brain is not a packrat, but rather is a judicious and discerning collector of the most important pieces of experience. A good collector of experience, however, is also a good speculator: The most important information to store in memory is that which is most likely to be relevant at some time in the future. Of course, a large amount of information that might be important in the next few seconds is very unlikely to be of any importance in a day, a month, or a year. It might be stated more generally that to a large degree the relevance of information is time-bounded—sense-data collected and registered in the present are far more likely to be useful in a few seconds than they are to be in a few minutes. It would seem, then, that the temporary relevance of information demands the existence of a temporary storage system: a kind of memory that is capable of holding onto the sense-data of the “just past” in an easily accessible form, while ensuring that older, irrelevant, or distracting information is discarded or actively suppressed.

The existence of this kind of short-term or “working memory” has been well established over the past century through the detailed study of human performance on tasks designed to examine the limits, properties, and underlying structure of human memory. Moreover, in recent years, much has been learned about the neurobiological basis of working memory through the study of brain-damaged patients, the effect of cortical ablations on animal behavior, electrophysiological recordings from single cells in the nonhuman primate, and regional brain activity as measured by modern functional neuroimaging tools such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), and event-related potentials (ERPs). In this chapter, we examine how the psychological concept of working memory has, through a variety of cognitive neuroscientific investigations, been validated as a biological reality.

Short-Term Memory

In the mid-1960s, evidence began to accumulate supporting the view that separate functional systems underlie memory for recent events and memory for more distant events. A particularly robust finding came from studies of free recall in which it was demonstrated that when subjects are presented with a list of words and asked to repeat as many as possible in any order, performance is best for the first few items (the primacy effect) and for the last few items (the recency effect)—a pattern of accuracy that when plotted as a function of serial position (Figure 19.1) appears U-shaped (Glanzer & Cunitz, 1966; Waugh & Norman, 1965).


Figure 19.1 Plot of recall accuracy as a function of serial position in a test of free recall. Primacy and recency effects are evident in the U-shaped pattern of the curve. Adapted with permission from Glanzer & Cunitz, 1966.

When a brief filled retention period is interposed between stimulus presentation and recall, however, performance on early items is relatively unaffected, but the recency effect disappears (Glanzer & Cunitz, 1966; Postman & Phillips, 1965). These findings suggest that in the immediate recall condition, the last few items of a list are recalled best because they remain accessible in a short-term store, whereas early items are more permanently represented (and thus unaffected by the insertion of a filled delay) in a long-term store. This idea that memory, as a functional system, contains both short- and long-term stores is exemplified by the two-store memory model of Atkinson and Shiffrin (1968). In this prototype psychological memory model, comprising a short-term store (STS) and long-term store (LTS), information enters the system through the STS, where it is encoded and enriched, before being passed on to the LTS for permanent storage. Although the idea that short-term storage is a necessary prerequisite for entry into the LTS has not held up, the two-store model of Atkinson and Shiffrin crystallized the very idea of memory as a divisible, dichotomous system and provided the conceptual framework for the interpretation of patterns of memory deficits observed in patients with brain damage.
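The logic of the two-store account can be illustrated with a toy simulation. The sketch below is not the Atkinson and Shiffrin model itself; it is a minimal illustration under assumptions introduced here: a first-in, first-out short-term buffer of fixed size, a unit of rehearsal shared equally among the items currently in the buffer, and a long-term recall probability that grows with accumulated rehearsal. The function name and all parameter values (`recall_curve`, `buffer_size`, `theta`) are hypothetical.

```python
import math

def recall_curve(n_items=15, buffer_size=4, theta=0.4, filled_delay=False):
    """Toy two-store sketch: recall probability at each serial position.

    Assumptions (illustrative only): FIFO buffer, rehearsal shared equally
    among buffered items, n_items >= buffer_size.
    """
    strength = [0.0] * n_items
    for step in range(n_items):
        # Items held in the short-term buffer while item `step` is presented.
        in_buffer = range(max(0, step - buffer_size + 1), step + 1)
        share = 1.0 / len(in_buffer)
        for item in in_buffer:
            strength[item] += share  # rehearsal strengthens the long-term trace
    # Long-term recall probability increases with accumulated rehearsal.
    p_long_term = [1.0 - math.exp(-theta * s) for s in strength]
    if filled_delay:
        # A filled delay empties the buffer; only the long-term store remains.
        return p_long_term
    # Immediate recall: items still in the buffer are read out directly.
    still_buffered = range(n_items - buffer_size, n_items)
    return [1.0 if i in still_buffered else p for i, p in enumerate(p_long_term)]

immediate = recall_curve()
delayed = recall_curve(filled_delay=True)
```

Under these assumptions, early items receive more rehearsal because they initially share the buffer with fewer competitors, which produces primacy in both conditions, whereas the advantage for the final items appears only in the immediate condition, where they can still be read out of the buffer.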

Neurological Evidence for Short-Term and Long-Term Memory Stores

Perhaps the most compelling evidence for the existence of two memory stores comes from case studies of persons with focal brain lesions. In the early 1950s a surgical procedure for the treatment of intractable epilepsy that involved bilateral removal of the medial temporal lobe in patient H.M. resulted in a catastrophic impairment in his ability to form new long-term memories, though, remarkably, his short-term memory was left intact (Scoville & Milner, 1957). Thus, H.M., although perfectly capable of repeating back a string of digits—the classic test of short-term memory—was unable to permanently store new facts and events. In the following decade, when Warrington and Shallice (Shallice & Warrington, 1970; Warrington & Shallice, 1969) reported a number of case studies of patients with temporal-parietal lesions who had dramatically impaired short-term memory for numbers and words coupled with a preserved ability to learn supra-span (e.g., greater than ten items) word lists with repeated study, the case for a separation between short- and long-term memory was strengthened. It is important to emphasize that the short-term memory deficits exhibited by such patients were, in the purest cases (Shallice & Warrington, 1977), not accompanied by any obvious deficits in ordinary language comprehension and speech production. Thus, for instance, patient J.B. was able to carry on conversations normally and to speak fluently without abnormal pauses, errors, or other symptoms of aphasia; in short, the “language faculty,” considered to encompass the processes necessary for the online comprehension and production of meaningful speech, need not be disturbed even in the presence of a nearly complete eradication of verbal short-term memory (Shallice & Butterworth, 1977). This established an important dissociation between the short-term memory syndrome and the aphasic syndromes—a class of neurological disorders that specifically affect language ability—and argued, again, for a dedicated system in the brain for the temporary storage of information.

In summary, the discovery of “short-term memory patients,” as they were to be called, in the neuropsychological investigations of Warrington, Shallice, and others established a double dissociation between short- and long-term memory systems, both in brain localization (long-term memory—medial temporal lobe; verbal short-term memory—temporal-parietal cortex) and in patterns of performance. In addition, the short-term memory disorder could be clearly distinguished, at the behavioral level at least, from the major disorders of language such as Broca’s and Wernicke’s aphasia.

Development of the Concept of Working Memory

Short-term memory had, until the landmark work of Baddeley and colleagues (Baddeley, 1986; Baddeley & Hitch, 1974), typically been viewed as a more or less passive and amorphous medium for the brief storage of information derived from the senses. Questions tended to focus on the principles governing the mnemonic “life cycle” of an item in memory—for example, why and at what rate are items forgotten? What is the role of passive decay? What is the role of interference, both proactive and retroactive, in forgetting? What is the route from short-term memory to long-term memory, and what are the factors that influence this process? These questions, though of fundamental importance to understanding how memory works, tended to emphasize the general mechanisms—the procedures and principles of memory—rather than the underlying functional architecture of the system. What was missing from this line of research was the recognition that the contents of short-term memory are not physical elements governed by certain lawful and inexorable processes of decay and interference, but rather dynamic representations of a fluid cognition, capable of being maintained, transformed, and manipulated by active, executive processes of higher control. Thus, for instance, two of the most important variables in studies of short-term memory, before the emergence of the working memory model, were time (e.g., between stimulus presentation and recall) and serial order (e.g., of a list of items), both of which are defined by the inherent structure of the environmental input. In more recent years, at least as great an emphasis has been placed on variables that reflect an ability or attribute of the subject, for instance, his or her rate of articulation (Hulme, Newton, Cowan, Stuart, & Brown, 1999), memory capacity (Cowan, 2001), degree of inhibitory control (Hasher, Zacks, & Rahhal, 1999), or ability to “refresh” information in memory (Raye, Johnson, Mitchell, Reeder, & Greene, 2002). Interest in these “internal variables” is a recognition of the fact that what is “in memory” at a moment in time is defined to various degrees by the structure of the input (e.g., time, serial order, information content), the biophysical properties of the storage medium (e.g., rate of decay, interference susceptibility), and the active processes of control that continually monitor and operate on the contents of memory. It is this last ingredient that puts the “work” into working memory; it makes explicit the active and transformative character of mental processes and acknowledges that the content of memory need not mirror the structure and arrangement of environmental input, but rather may reflect the intentions, plans, and goals of the conscious organism.

With that introduction in mind, let us now give a brief overview of the working memory model of Baddeley and colleagues (Baddeley, 1986, 2000; Baddeley & Hitch, 1974). Whereas contemporary models of short-term memory tended to emphasize storage buffers as the receptacles for information arriving from the senses, Baddeley and Hitch (1974) focused on rehearsal processes, that is, strategic mechanisms for the maintenance of items in memory. Thus, for example, when one is trying to keep a telephone or license plate number “in mind,” a common strategy is to repeatedly rehearse, either subvocally or out loud, the contents of the numerical or alphanumerical sequence. Research had shown that in tests of serial recall, when subjects are prevented from engaging in covert rehearsal during a delay period that is inserted between stimulus presentation and recall, overall performance is dramatically impaired (Baddeley, Thomson, & Buchanan, 1975). In the case of verbal material, then, it was clear that the ability to keep words in memory depended in large part on articulatory processes. This insight was central to the development of the verbal component of working memory, the phonological loop (see below), and led to a broader conceptualization of short-term memory that seeks to explain not only how and why information enters and exits awareness but also how resources are marshaled in a strategic effort to capture and maintain the objects of memory in the focus of attention.


The central tenets of the working memory model are as follows:

1. It is a limited-capacity system; at any moment in time, there is only a finite amount of information directly available for processing in memory.
2. The specialized subsystems devoted to the representation of information of a particular type, for instance, verbal or visual-spatial, are structurally independent of one another; the integrity of information represented in one domain is protected from the interfering effects of information that may be arriving in another domain.
3. Storage of information in memory is distinct from the processes that underlie stimulus perception; rather, there is a two-stage process whereby sensory information is first analyzed by perceptual modules and then transferred into specialized storage buffers that have no other role but to temporarily “hold” preprocessed units of information. Moreover, the pieces of information that reside in such specialized buffers are subject to passive, time-based decay as well as inter-item interference (e.g., similar-sounding words like “man, mad, map, cap, mad” can lead to interference within a specialized phonological storage structure); finally, such storage buffers have no built-in or internal mechanism for maintaining or otherwise refreshing their contents—rather, this must occur from without, through the process of rehearsal, which might be a motor or top-down control mechanism that can sequentially access and refresh the contents that remain active within the store.

The initial working memory model proposed by Baddeley and Hitch (1974), and later refined somewhat (Baddeley, 1986, 2000; Salame & Baddeley, 1982), argued for the existence of three functional components of working memory. The central executive was envisioned as a control system of limited attentional capacity responsible for coordinating and controlling two subsidiary slave systems: a phonological loop and a visual-spatial sketchpad. The phonological loop was responsible for the storage and maintenance of information in a verbal form, and the visual-spatial sketchpad was dedicated to the storage and maintenance of visual-spatial information. In the last decade, a fourth component, the episodic buffer, has been added to the model in order to capture a number of phenomena that could not be readily explained within the original framework.

The Central Executive

As has already been mentioned, working memory is viewed as a limited-capacity system. There are a number of reasons for this capacity limitation, but an important one relates to what one might call the allocation of attention. Although many people are perfectly capable of walking and chewing gum at the same time, it is far more difficult to perform two simultaneous cognitive activities that are both attentionally demanding. Thus, quite apart from the structural limitations inherent to memory storage systems (e.g., the natural inclination of memory traces to fade with time and interference), there also appear to be certain fundamental constraints on “how much” attention can be allocated to the set of active tasks at any one time (Kahneman, 1973). The central executive component of working memory sits, as it were, at the helm of the cognitive apparatus and is responsible for the dispensation of attentional resources to the subsidiary components (e.g., the phonological loop) in working memory (Baddeley, 1986). Because total attentional capacity is finite, there must be a mechanism that intervenes to determine how the pool of attention is to be divided among the many possible actions, with their different levels of priority and reward contingencies that are afforded by the environment. Thus, in dual-task paradigms, the central executive plays a crucial role in the scheduling and shifting of resources between tasks, and it can be used to explain the decline in performance that may be observed even when the two tasks in question involve different memory subsystems (Baddeley, 1992). Finally, it has often been pointed out that the central executive concept is too vague to act as anything other than a kind of placeholder for what is undoubtedly a much more complex system than is implied by the positing of a unitary and homunculus-like central cognitive operator (for a model of executive cognition, see Shallice, 1982). Provided, however, that the concept is not taken too literally, it can serve as a convenient way to refer to the complex and variegated set of processes that constitute the brain’s executive system.

The Phonological Loop

Built into the architecture of the working memory model is a separation between domain-specific mechanisms of memory maintenance and domain-general mechanisms of executive control. Thus, the verbal component of working memory, or the phonological loop, is viewed as a “slave” system that can be mobilized by the central executive when verbal material has to be retained in memory over some uncertain delay. Within the phonological loop, it is the interplay of two components—the phonological store and the articulatory rehearsal process—that enables representations of verbal material to be kept in an active state. The phonological store is a passive buffer in which speech-based information can be stored for brief (approximately 2 seconds) periods. The articulatory control process serves to refresh and revivify the contents of the store, thus allowing the system to maintain short sequences of verbal items in memory for an extended interval. This division of labor between two interlocking components, one an active process and the other a passive store, is crucial to the model’s explanatory power. For instance, when the articulatory control process is interfered with through the method of articulatory suppression (e.g., by requiring subjects to say “hiya” over and over again), items in the store rapidly decay, and recall performance suffers greatly. The store, then, lacks a mechanism of reactivating its own contents but possesses memory capacity, whereas conversely, the articulatory rehearsal process lacks an intrinsic memory capacity of its own but can exert its effect indirectly by refreshing the contents of the store.
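The division of labor between a passive store and an active rehearsal process can be made concrete with a small simulation. The following is a minimal sketch under assumptions introduced here, not a model taken from the literature: each trace is lost once it has gone unrefreshed for longer than a fixed lifetime (set to the approximately 2 seconds mentioned above), rehearsal refreshes one item at a time in list order, and articulatory suppression is modeled simply as switching rehearsal off. The function name and the parameter values are hypothetical.

```python
from collections import deque

def items_surviving_delay(n_items, articulation_time, trace_lifetime=2.0,
                          delay=10.0, rehearsal=True):
    """Count traces still alive after `delay` seconds in a toy phonological loop."""
    last_refresh = [0.0] * n_items   # time at which each trace was last refreshed
    queue = deque(range(n_items))    # order in which items are rehearsed
    t = 0.0
    while rehearsal and queue and t + articulation_time <= delay:
        item = queue.popleft()
        if t - last_refresh[item] > trace_lifetime:
            continue                 # trace decayed before rehearsal reached it
        t += articulation_time       # covertly articulating the item takes time
        last_refresh[item] = t       # the store itself is passive: refresh comes from outside
        queue.append(item)
    # A trace is available at recall only if it was refreshed recently enough.
    return sum(1 for item in queue if delay - last_refresh[item] <= trace_lifetime)

quick_words = items_surviving_delay(10, articulation_time=0.3)
slow_words = items_surviving_delay(10, articulation_time=0.6)
suppressed = items_surviving_delay(10, articulation_time=0.3, rehearsal=False)
```

With these illustrative parameters, the number of items that can be kept alive is bounded by how many can be articulated within one trace lifetime, so fewer slowly articulated items survive than quickly articulated ones, and none survive the delay when rehearsal is blocked. The complete loss under suppression is a simplification of the sketch rather than a claim about human performance.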

The Visual-Spatial Sketchpad

The other slave system in the working memory model is the visual-spatial sketchpad, which is critical for the online retention of object and spatial information. Again, as is suggested by the term “sketchpad,” the maintenance of visual-spatial imagery in an active state requires top-down, or strategic, processing. As with the phonological loop, where articulatory suppression interferes with the maintenance of verbal information, a concurrent processing demand in the visual-spatial domain, such as tracking a spot of light moving on a screen, making random eye movements, or presenting subjects with irrelevant visual information during learning, also impairs memory performance. Although the symmetry between sensory and motor representations of visual-spatial information is less obvious than it is in the case of speech, it has been demonstrated that covert eye movement is important for the maintenance of spatial information (Postle, D’Esposito, & Corkin, 2005). Baddeley (1986) initially proposed that in the context of spatial memory, covert eye movements can act as a way of revisiting locations in memory, and thus operate very much like the articulatory rehearsal process known to be important for the maintenance of verbal information. Moreover, requiring subjects to perform a spatial interference task that disrupts or otherwise occupies this rehearsal component significantly impairs performance on tests of spatial working memory, but may have little effect on nonspatial visual memory tasks (Cocchini, Logie, Della Sala, MacPherson, & Baddeley, 2002; Della Sala, Gray, Baddeley, Allamano, & Wilson, 1999). In contrast, retention of visual shape or color information is interfered with by visual-perceptual input, but not by a concurrent demand in the spatial domain (Klauer & Zhao, 2004). Thus, the principles that underlie the operation of the phonological loop are qualitatively similar to those that underlie the operation of the visual-spatial sketchpad; in both cases, maintenance processes consist of covert motor performance that serves to reactivate the memory traces residing in sensory stores. This mechanism might be most simply described as “remembering by doing,” a strategy that is most effective when a motor code, which can be infinitely regenerated and that is under the subject’s voluntary control, can be substituted for a fragile and less easily maintained perceptual memory code.

The Episodic Buffer

Working memory was originally conceived as a temporary workspace in which a small amount of information could be kept in mind through the action of rehearsal mechanisms whose purpose was to counteract memory decay. But when these rehearsal mechanisms are interrupted by way of experimental interventions such as articulatory suppression (Baddeley, Lewis, & Vallar, 1984), the effect is not catastrophic. In a typical study, suppression would cause digit span to drop from seven to five items. Moreover, whereas the storage systems in working memory were initially conceived of as temporary and with a fixed capacity, many studies have shown that working memory capacity is increased if the experimental stimuli are familiar (Ericsson & Kintsch, 1995), meaningful (Martin & He, 2004), or structured, such as in the case of sentences (R. A. McCarthy & Warrington, 1987). This implies that prior experience influences how easily information can be encoded and maintained in working memory. The episodic buffer is hypothesized to provide the means by which integrated information, such as semantics, syntactic structures, learned patterns, and multimodal long-term representations, is temporarily retrieved and bound into the mental workspace of working memory (Baddeley, 2000). The episodic buffer serves as an interface between a range of systems, using a common multidimensional code. The buffer is assumed to be limited in capacity because of the computational demand of providing simultaneous access to information provided across multiple cognitive domains (Baddeley, Allen, & Vargha-Khadem, 2010; Prabhakaran, Narayanan, Zhao, & Gabrieli, 2000).

Summary

The working memory model of Baddeley and colleagues describes a system for the maintenance and manipulation of information that is stored in domain-specific memory buffers. Separate cognitive components are dedicated to the functions of storage, rehearsal, and executive control. Informational encapsulation and domain segregation dictate that auditory-verbal and visual information be kept in separate storage subsystems—the phonological loop and the visual-spatial sketchpad, respectively. These storage subsystems themselves comprise specialized components for the passive storage of memory traces, which are subject to time and interference-based decay, and for the reactivation of these memory traces by way of simulation, or rehearsal. Thus, storage components represent memory traces, but have no internal means of refreshing them, whereas rehearsal processes (e.g., articulatory, saccadic) have no mnemonic capacity of their own, but can reactivate the decaying traces held in temporary stores. In the succeeding sections we examine how neuroscience has built on the cognitive foundation of the working memory model of Baddeley and colleagues to refine our understanding of how information is maintained and manipulated in the brain. We will see that in some cases neuroscientific evidence has bolstered and reinforced aspects of the working memory model, whereas in other cases neuroscience has compelled a departure from certain core principles of the Baddeleyan concept.

Emergence of Working Memory as a Neuroscientific Concept

Perhaps the first insights into the neurobiological underpinnings of a memory whose purpose is to bridge cross-temporal contingencies (Fuster, 1997) come from the work of Jacobsen, who studied nonhuman primate behavior after ablation of the prefrontal cortices. In comparing normal chimpanzees to those that had suffered extensive injury to the prefrontal cortex (PFC), Jacobsen (1936) noted:

The normal chimpanzee has considerable facility in using sticks or other objects to manipulate its environment, e.g., to reach a piece of food beyond its unaided reach. It can solve such problems when it must utilize several sticks, some of which may not be immediately available in the visual field. After ablation of the pre-frontal areas, the chimpanzee continues to use sticks as tools but it may have difficulty solving the problem if the necessary sticks and the food are not simultaneously present in the visual field. It exhibits also a characteristic “memory” defect. Given an opportunity to observe a piece of food being concealed under one of two similar cups, it fails to recall after a few seconds under which cup the lure has been hidden.… (p. 317)


In his pioneering experimental work, Jacobsen (1936) discovered that damage to the PFC of the monkey produces selective deficits in a task requiring a delayed response to the presentation of a sensory stimulus. The delayed response tasks were initially devised by Hunter (1913) as a way of differentiating between animals on the basis of their ability to use information not currently available in the sensory environment to guide an imminent response. In the classic version of this test, a monkey is shown the location of a food morsel that is then hidden from view and placed in one of two wells. After a delay period of a few seconds, the monkey chooses one of the two locations and is rewarded if his choice corresponds to the location of the food. Variations on this test include the delayed alternation task, the delayed match-to-sample task, and the delayed nonmatch-to-sample task. The family of delayed-response tasks measures a complex cognitive ability that requires at least three clearly identifiable subprocesses: to recognize and properly encode the to-be-remembered item, to hold an internal representation of the item “online” across an interval of time, and finally, to initiate the appropriate motor command when a response is prompted. Jacobsen showed that lesions to the PFC impair only the second of the above three functions, suggesting a fundamental role for the region in immediate or short-term memory. Thus, monkeys with lesions to PFC perform in the normal range on a variety of tests requiring sensorimotor behavior, such as visual pattern discrimination and motor learning and control—that is, tasks without a short-term mnemonic component. Although the impairments in the performance of delayed-response tasks in Jacobsen’s studies were caused by large prefrontal lesions that often extended into the frontal pole and orbital surface, later studies showed that lesions confined to the region of the principal sulcus produced deficits equally as severe (Blum, 1952; Butters, Pandya, Stein, & Rosen, 1972).

Fuster and Alexander (1971) reported the first direct physiological measures of PFC involvement in short-term memory. With microelectrodes placed in the PFC, they measured the firing patterns of neurons during a spatial delayed-response task and showed that many cells had increased firing, relative to an intertrial baseline period, during both cue presentation and the later retention period. Importantly, some cells fired exclusively during the delay period and therefore could be considered pure “memory cells.” The results were interpreted as providing evidence for PFC involvement in the focusing of attention “on information that is being or that has been placed in temporary memory storage for prospective utilization” (p. 654). Many subsequent electrophysiological studies have demonstrated memory-related activity in the PFC of the monkey during delayed-response tasks of various kinds (e.g., Joseph & Barone, 1987; Niki, 1974; Niki & Watanabe, 1976; Quintana, Yajeya, & Fuster, 1988), although it was Patricia Goldman-Rakic who first drew a parallel (but see Passingham, 1985) and then firmly linked the phenomenon of persistent activity in PFC to the cognitive psychological concept of “working memory.” In a review of the existing literature on the role of the PFC in short-term memory, Goldman-Rakic (1987), citing lesion and electrophysiological studies in the monkey, human neuropsychology, and the cytoarchitectonics and cortical-cortical connections of the PFC, argued that the dorsolateral PFC (the principal sulcus of the monkey) plays an essential role in holding visual-spatial information in memory before the initiation of a response and in the absence of guiding sensory stimulation. In this and later work (especially that of Wilson, 1993), Goldman-Rakic developed a model of PFC in which visual-spatial and (visual) object working memory were topographically segregated, with the former localized to the principal sulcus and the latter localized to a more ventral region along the inferior convexity of the lateral PFC (Figure 19.2).

This domain-specific view of the prefrontal organization, which was supported by observed dissociations in the responsivity of neurons in dorsal and ventral areas of the PFC during delayed response tasks, could be viewed as an anterior expansion of the dorsal (“where”) and ventral (“what”) streams that had been discovered in the visual system in posterior neocortex (Ungerleider & Mishkin, 1982). In addition, the parallel and modular nature of the proposed functional and neuroanatomical architecture of PFC was in keeping with the tenet of domain independence in the working memory model of Baddeley and colleagues.

The connection between persistent activation in the PFC of the monkey and a model of memory developed in the field of cognitive psychology might seem tenuous, especially in light of the fact that the working memory model was originally formulated on the basis of evidence derived from behavioral studies using linguistic material—an informational medium clearly unavailable to monkeys. For Goldman-Rakic, though, the use of the term “working memory” in the context of nonhuman primate electrophysiology was intended not as an offhand or otherwise desultory nod to psychology (Goldman-Rakic, 1990), but rather as a reasoned and deliberate effort to unify both our understanding of and manner of referencing a common neurobiological mechanism underlying an aspect of higher cognition that is well developed in primate species. Certainly, in retrospect, the decision to label the phenomenon of persistent activity in PFC with the term “working memory” has had an immeasurable impact on memory research, and indeed may be thought of as one of the two or three most important events contributing to the emergence of an integrated and unified approach to the study of neurobiology and psychology.

Functional Neuroimaging Studies of Working Memory

Figure 19.2 A, Diagram of the frontal lobe and location of principal sulcus and inferior convexity. B, Responses of inferior convexity neuron with visual object-specific activity in the delayed response task. Upper panels show delay-period activity for object memory trials; lower panels show lack of response on spatial memory trials. C, Responses of dorsolateral prefrontal neuron with spatial memory selectivity. Upper panels show lack of responsivity on object memory trials; lower panels show delay-period activity on spatial memory trials. D, Schematic diagram illustrating the dorsal and ventral streams in the visual system and their connections with prefrontal cortex (PFC). AS, arcuate sulcus; PS, principal sulcus. The posterior parietal (PP) cortex is concerned with spatial perception, and the inferior temporal (IT) cortex with object recognition. These regions are connected with the dorsolateral (DL) and inferior convexity (IC) prefrontal cortices where, according to the Goldman-Rakic model, memory for spatial location and object identity are encoded in working memory. Adapted with permission from Wilson et al., 1993.

At about the same time as Fuster and Alexander (1971) recorded neural activity in the monkey PFC during a working memory task, Ingvar and colleagues examined variation in regional cerebral blood flow (rCBF) during tasks requiring complex mental activity. Indeed, Risberg and Ingvar (1973), in the first functional neuroimaging study of short-term memory, showed that during a backward digit span task, the largest increases in rCBF, compared with a resting baseline, were observed in pre-rolandic and anterior frontal cortex. It was not, however, until the emergence of PET and the development of the oxygen-15 tracer that the mapping of brain activity during the exercise of higher mental functions would become genuinely amenable to the evaluation of complex hypotheses about the neural basis of cognition. In the middle and late 1980s, technological advances in the PET technique, with its relatively high spatial resolution (approximately 1 cm³), were accompanied by a critical conceptual innovation known as cognitive subtraction, which provided the inferential machinery needed to link regional variation in brain activity to experimental manipulations at the task or psychological level (Posner, Petersen, Fox, & Raichle, 1988). Thus, for any set of hypothesized mental processes (a, b, c), if a task can be devised in which one condition recruits all of the processing components (Task 1: a, b, c) and another condition recruits only a subset of the components (Task 2: a, b), subtraction of the observed regional activity during Task 2 from that observed during Task 1 should reveal the excess neural activity due to the performance of Task 1, which can then be attributed to the cognitive component c. The working memory model of Baddeley, with its discrete cognitive components (e.g., central executive, phonological loop, and visual-spatial scratchpad) was an ideal model with which to test the power of cognitive subtraction using modern neuroimaging tools. Indeed, in the span of only 2 years, the landmark studies of Paulesu et al. (1993), Jonides et al. (1993), and D’Esposito et al. (1995) had mapped all of the cognitive components of the working memory model onto specific regions of the cerebral cortex. The challenge in successive years was to go beyond this sort of “psychoneural transcription” (which is necessarily a unidirectional mapping between the cognitive box and the cerebral convolution) and begin to develop models that generate hypotheses that refer directly to the brain regions and mechanisms that underlie working memory. In the following sections, we review how cognitive neuroscience studies of short-term memory and executive control used the working memory model to gain an initial neural foothold, upon which later studies were buttressed, and which would lead to insights and advances in our understanding of working memory as it is implemented in the brain.
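
The subtraction logic described above can be illustrated with a minimal sketch. The function name, the region names, and the activity values are hypothetical and chosen only for illustration; they are not data from any of the cited studies, and real analyses operate on statistical images rather than on a handful of regional means.

```python
def cognitive_subtraction(task1, task2):
    """Subtract Task 2 activity from Task 1 activity, region by region.

    Both arguments map a region name to a mean activity value.
    Only regions measured in both conditions are compared.
    """
    shared = sorted(task1.keys() & task2.keys())
    return {region: task1[region] - task2[region] for region in shared}


# Hypothetical regional activity (arbitrary units), for illustration only.
# Task 1 is assumed to recruit components a, b, c; Task 2 only a, b.
task1_abc = {"occipital": 4.0, "parietal": 3.5, "prefrontal": 3.0}
task2_ab = {"occipital": 4.0, "parietal": 2.5, "prefrontal": 1.5}

excess = cognitive_subtraction(task1_abc, task2_ab)
for region, value in excess.items():
    print(f"{region}: {value:+.1f}")
```

In this toy case the occipital difference is zero, so that region would be read as equally engaged by both conditions, while the parietal and prefrontal excess would be attributed to component c. That reading holds only if adding c leaves the processing of a and b unchanged, which is an assumption of the method rather than something the subtraction itself can verify.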

Visual-Spatial Working Memory

The first study of visual-spatial working memory in PET was carried out by Jonides and colleagues in 1993 using the logic of cognitive subtraction to isolate mnemonic processes associated with the maintenance of visual-spatial information, in a task very similar to the tasks used by Goldman-Rakic and her colleagues with monkeys (Funahashi, Bruce, & Goldman-Rakic, 1989; Goldman-Rakic, 1987). During “memory” scans, subjects were shown an array of three dots appearing for 200 ms on the circumference of a 14-mm imaginary circle and instructed to maintain the items in memory during a 3-second retention interval. This was followed by a probe for location-memory consisting of a circular outline that either did or did not (with equal probability) enclose one of the previously memorized dots, and to which subjects responded with a yes/no decision. In “perception” scans, the three dots and the probe outline were presented simultaneously, so that subjects did not have to retain the location of the items in memory during a delay, but instead simply had to decide whether the outline encircled one of the three displayed dots (Figure 19.3).

Figure 19.3 A, Schematic presentation of spatial memory task from Jonides et al. (1993) and Smith et al. (1995). B, Top panel shows example “memory” tri al; bottom panel shows example “perception” trial.

Subtraction of the “perception” scans from the “memory” scans revealed a right-lateralized network of cortical regions that would become a hallmark of neuroimaging studies of visual-spatial working memory: the posterior parietal lobe, dorsal premotor cortex, occipital cortex (Brodmann area 19), and PFC. In their interpretation of the findings, the authors suggested that the occipital activity reflected a role in the creation, but not necessarily the maintenance, of an internal visual image of the dot pattern, and that activity in the PFC might reflect one of two things: (1) the literal storage of a representation of the image in memory during the delay, or (2) the representation of a pointer or link to other brain circuitry, perhaps in the occipital or parietal lobe, that is actually responsible for maintaining the memory engram. These two explanations for the observation of prefrontal activity during working memory tasks, which in later years would often be pitted against each other, nicely framed the emerging debate on the division of labor among the cortical regions involved in the maintenance of information in working memory.

A major aim of many of the early neuroimaging studies of visual-spatial working memory was to duplicate the canonical finding of Goldman-Rakic and colleagues of a dorsal-ventral dissociation in monkey PFC for spatial and object working memory. Studies by Petrides et al. (1993) and McCarthy et al. (1994) demonstrated with PET and fMRI, respectively, that mid-dorsolateral PFC (Brodmann areas 9 and 46) shows increased activity during spatial working memory when compared with a control condition. An attempt to show a neuroanatomical double dissociation between spatial and object working memory was undertaken by Smith et al. (1995) in a PET study that used carefully controlled nonverbalizable object stimuli that were presented in both object and spatial task contexts.
This study found distinct brain circuits for the storage of spatial and object information, with spatial working memory relying primarily on right-hemisphere regions in the prefrontal (BA 46) and parietal (BA 40) cortices, and object working memory involving only a left inferotemporal area. These results, however, only partially replicated the monkey study of Wilson et al. (1993), who had found distinct dorsal and ventral regions in lateral PFC for spatial and object working memory. A similar pattern was found by McCarthy et al. (1994), in which regional differences between object and spatial working memory were most pronounced across hemispheres rather than between dorsal and ventral divisions of the PFC. In a contemporaneous review and meta-analysis of all human neuroimaging studies of working memory, D’Esposito et al. (1998) showed that there was virtually no evidence for a neuroanatomical dissociation between spatial and object working memory in the PFC, a finding that was supported by a later and more exhaustive quantitative meta-analysis (Wager & Smith, 2003). Indeed, establishing a correspondence between the functional neuroanatomy of visual-spatial working memory in the monkey and human brains has proved remarkably difficult, leading to a protracted debate among and between monkey neurophysiologists and human neuroimaging researchers about the proper way to conceptualize the functional topography of working memory in the PFC (Goldman-Rakic, 2000; E. K. Miller, 2000). Increasingly, efforts were made to adapt human neuroimaging studies to resemble as closely as possible the kinds of tasks used in animal electrophysiology, such as the delayed match-to-sample procedure. The emergence of event-related fMRI, with spatial and temporal resolution superior to that of oxygen-15 PET, was critical to this new effort at cross-disciplinary synthesis and reconciliation, and led to a number of fundamental insights on the brain basis of working memory, to the discussion of which we now turn.

Early PET studies of working memory relied exclusively on the logic of cognitive subtraction to isolate hypothesized components of a complex cognitive task.
Thus, even for working memory tasks that consisted of a number of temporal phases within a given trial (e.g., stimulus presentation → memory maintenance → recognition decision), the low temporal resolution of PET prohibited separate statistical assessment of activity within a single task phase. Event-related fMRI, on the other hand, with its temporal resolution on the order of 2 to 4 seconds, could be used to examine functional activity in different portions of a multiphase trial, provided that each of the sequential task components was separated by approximately 4 seconds (Zarahn, Aguirre, & D’Esposito, 1997). This methodology permits the isolation of maintenance-related activity during the delay period of a match-to-sample procedure without relying on a complex cognitive subtraction.

Using event-related fMRI, Courtney et al. (1998) demonstrated a neuroanatomical dissociation between delay-period activity during working memory maintenance for either the identity (object memory) or location (spatial memory) of a set of three face stimuli. Greater activity during the delay period on face identity trials was observed in the left inferior frontal gyrus, whereas greater activity during the delay period of the location task was observed in dorsal frontal cortex, a finding consistent with the spatial/object domain segregation thesis of Goldman-Rakic (1987). Unlike previous studies that had implicated human BA 46 (the presumed homologue to the monkey principal sulcus) in spatial working memory, Courtney et al. (1998) observed enhanced delay-period activity for the location task, bilaterally, in the superior frontal sulcus (BA 8), a region just anterior to the frontal eye field (FEF). A control task requiring sensory-guided eye movements was used to functionally delineate the FEF and thus distinguish it from regions with a specifically mnemonic function. They concluded that the localization of spatial working memory in the superior frontal sulcus (posterior and superior to BA 46) indicated an evolutionary displacement in the functional anatomy of the PFC, possibly owing to the emergence of new cognitive abilities such as abstract reasoning, complex problem solving, and planning for the future. In short, then, this study was the first functional neuroimaging study to fully replicate the object versus spatial working memory dissociation shown by Goldman-Rakic and colleagues, insofar as one accepts their proposal that the human homologue to the monkey principal sulcus is located, not in the middle frontal gyrus or BA 46, but rather in the superior frontal sulcus.

While several subsequent studies of spatial working memory offered further support (Leung, Seelig, & Gore, 2004; Munk et al., 2002; Sala, Rama, & Courtney, 2003; Walter et al., 2003) for a specifically mnemonic role of the superior frontal sulcus in tasks of spatial working memory, other studies failed to replicate the finding (Postle, Berger, & D’Esposito, 1999; Postle, Berger, Taich, & D’Esposito, 2000; Srimal & Curtis, 2008). For instance, although Postle et al. (2000) observed delay-period activity in this region during a spatial working memory task, they also found it to be equally active during the generation of two-dimensional saccades, a task that required visual-spatial attention and motor control but placed no demands on memory storage. Using a paradigm that varied the length of the delay period in a memory-guided saccade task similar to that used by Funahashi and colleagues in the monkey, Srimal and Curtis (2008) failed to show any maintenance-related activity in the superior frontal sulcus, casting doubt on whether this area is the human homologue of the monkey principal sulcus as originally suggested by Courtney et al. (1998).
Srimal and Curtis (2008) note, however, that all of the studies of spatial working memory that have shown delay-period activity in human superior frontal sulcus have used tasks that required maintenance of multiple spatial locations. This suggests that activity in this region reflects the operation of higher order processes that are needed when one is required to maintain the spatial relation or configuration of multiple objects in space. Moreover, Curtis et al. (2004) have shown using the memory-guided saccade task that the FEF and intraparietal sulcus (IPS) are the two areas that not only show robust delay-period activity but also have activity that predicts the accuracy of the generated saccade (Figure 19.4). This indicates that the very areas that are known to be important for the planning and preparation of eye movements and, more generally, for spatial selective attention (Corbetta, Kincade, & Shulman, 2002) are also critical for working memory maintenance (Logie, 1995). In addition, recent evidence has shown that internal shifts to locations in working memory not only activate the frontoparietal oculomotor system but also can activate early visual cortex (including V1) in a retinotopic manner (Munneke, Belopolsky, & Theeuwes).

Visual Object Working Memory

Figure 19.4 Event-related study of spatial working memory by Curtis et al. (2004). A, Schematic depiction of the oculomotor delayed response tasks in which subjects used the cue’s location to make a memory-guided saccade. Both the matching-to-sample (top) and nonmatching-to-sample (bottom) tasks began with the brief presentation of a small cue. During matching trials, the subject made a memory-guided saccade (depicted by the thin black line) after the disappearance of the fixation cue marking the end of the delay. Feedback was provided by the re-presentation of the cue. At this point, the subject corrected any errors by shifting gaze to the cue. The difference between the end-point fixation after the memory-guided saccade and the fixation to acquire the feedback cue was used as an index of memory accuracy. During nonmatching trials, the subject made a saccade to the square that did not match the location of the sample cue. B, Average (±S.E. bars) blood-oxygen-level-dependent (BOLD) time series data for matching-to-sample (black) and nonmatching-to-sample (gray) oculomotor delayed-response tasks. The solid gray bar represents the delay interval. The gray gradient in the background depicts the probability that the BOLD signal is emanating from the delay period, where darker indicates more probable. The frontal eye field (FEF) shows greater delay-period activity during the matching task, where an oculomotor strategy is efficient. The right intraparietal sulcus (IPS) shows greater delay-period activity during the nonmatching task, when subjects are biased from using such a strategy. C, Scatter plot showing the correlation between memory-guided saccade (MGS) accuracy and the magnitude of the delay-period parameter estimates in the right FEF. More accurate MGS was associated with greater delay-period activity.

Many studies have investigated the maintenance of visual objects, mostly faces, houses, scenes, and abstract shapes that are not easily verbalizable (e.g., Belger et al., 1998; Courtney, Ungerleider, Keil, & Haxby, 1996, 1997; Druzgal & D’Esposito, 2001, 2003; Linden et al., 2003; G. McCarthy et al., 1994; Mecklinger, Gruenewald, Besson, Magnie, & Von Cramon, 2002; Postle & D’Esposito, 1999; Postle, Druzgal, & D’Esposito, 2003; Rama, Sala, Gillen, Pekar, & Courtney, 2001; Sala et al., 2003; Smith et al., 1995). A consistent finding has been that posterior cortical areas within the inferior temporal lobe that may preferentially respond to the presentation of certain categories of complex visual objects also tend to activate during object working memory tasks. For example, the fusiform gyrus, found along the ventral surface of the temporal lobe, shows greater activation when a subject is shown pictures of faces than when shown other types of complex visual stimuli like pictures of houses or scenes or household objects (Kanwisher, McDermott, & Chun, 1997). Indeed, given its selective response properties, the fusiform gyrus has been termed the “fusiform face area,” or FFA. There are four important findings that indicate that posterior extrastriate cortical regions like the FFA play an important role in the mnemonic storage of object features. First, the FFA shows persistent delay-period activity (Druzgal & D’Esposito, 2001, 2003; Postle et al., 2003; Rama et al., 2001) during working memory tasks. Second, the activity in the FFA is selective for faces; it is greater during delays in which subjects are maintaining faces compared with other objects (Sala et al., 2003). Third, as the number of faces that are being maintained increases, the magnitude of the delay-period activity increases in the FFA (Druzgal & D’Esposito, 2001, 2003; Jha & McCarthy, 2000).
Such load effects strongly suggest a role in short-term storage because, as the number of items that must be represented increases, so should the storage demands. Fourth, using a delayed paired-associates task, Ranganath et al. (2004) showed that the FFA responds during an unfilled delay interval following the presentation of a house that the subject has learned is associated with a certain face. Therefore, the delay-period FFA activity likely reflects the reactivated image of the associated face that was retrieved from long-term memory despite the fact that no face was actually presented before the delay. Together, these studies suggest that posterior regions of visual association cortex, like the FFA, participate in the internal storage of specific classes of visual object features. Most likely, the mechanisms used to create internal representations of objects that are no longer in our environment are similar to the mechanisms used to represent objects that exist in our external environment.

What happens to memory representations for visual objects when subjects momentarily divert their attention elsewhere (i.e., to a competing task or a different item in memory)? Lewis-Peacock et al. (2012) have exploited the enhanced sensitivity of multivoxel analyses of fMRI data to answer this question. In their paradigm, a subject is presented with two items displayed on the screen, one word and one picture, followed by a cue indicating which of the two stimuli to retain in memory. After an 8-second delay, a second cue appears instructing the subject, for another 8 seconds, either to retain the same item in memory or else to retrieve and maintain the item that the subject had previously been instructed to ignore. The authors found that when an item is actively maintained in the focus of attention, multivoxel patterns of brain activation can be used to “decode” whether the item in memory is a picture or a word. However, if the item is not actively maintained in working memory, there is not sufficient information in the multivoxel patterns to identify it. This is surprising in light of the behavioral data showing that subjects are easily able to retrieve the ignored item when they are cued to do so (on “switch trials”). Thus, a memory may be resident within “short-term memory” while not being actively maintained, and yet there is no discernible neurophysiological footprint in the blood-oxygen-level-dependent (BOLD) signal that the item is displaying heightened activation. Two important points may be taken from this work: first, that analyses of patterns of activity offer increased power and specificity when investigating the brain’s representations of individual memories; and, second, that increased BOLD activity during working memory tasks reflects processes associated with active maintenance and may not be able to detect memory representations that are accessible but nevertheless outside the direct focus of attention.

Working Memory Capacity

Although many studies of visual working memory, such as those reviewed above, have primarily focused on the extent of overlap between stimulus selectivity during visual perception and working memory maintenance, a recent line of research has explored the physiological basis of capacity limits in visual short-term memory. This work has exploited an extremely simple and elegant short-term memory paradigm that tests a subject’s ability to detect a change between two sets of objects (usually simple, colored discs or squares; Figure 19.5) separated in time by a short delay (Luck & Vogel, 1997). Using ERPs, Vogel et al. (2004) employed a variation of the change detection task to explore the electrophysiological indices of capacity limits in visual short-term memory. In each trial, subjects were presented with two arrays of colored squares, shown for 100 ms on either side of a central fixation cross and preceded by a cue indicating which side of the screen to attend. This was followed by a 900-ms delay and a test array, in which subjects had to decide whether any items in the attended visual field had changed color (items in the unattended field were always re-presented unchanged). The number of colored squares in each hemifield varied from one to ten items. By computing the difference waveform between ipsilateral activity and contralateral activity over posterior parietal and occipital electrodes, an estimate of the magnitude of delay-period activity could be calculated for each memory set size. These difference waves revealed that the magnitude of activity increased linearly from one to three items and then leveled off thereafter. The physiological asymptote in the magnitude of neural activity was nearly identical to the behavioral estimate of the average capacity, as measured using a formula referred to as Cowan’s K (Cowan, 2001; Pashler, 1988), which was 2.8. Moreover, individual estimates of the increase in activity between arrays of two and four items were very highly correlated (r = 0.78) with individual estimates of Cowan’s K, a finding that strongly suggests that the electrophysiological signal reflects the number of items held in visual short-term memory. Todd and Marois (2004) used the same paradigm with fMRI and showed that activation in the posterior IPS increased from one to four items and was flat from four to eight items, a relationship that precisely mirrored the estimate of Cowan’s K in the group of subjects. Individual differences in the fMRI signal in the posterior IPS were also shown to correlate with individual estimates of K, thus replicating the result of Vogel et al. (2004). Xu et al. (2006) showed that the complexity of stored objects (rather than colored squares, multifeature objects were used as memory items) modulated the magnitude of activity in the lateral occipital cortex and the superior IPS, but did not affect activity in the posterior IPS. Thus, it appears that the posterior IPS represents a fixed number of objects, or slots, that can be used to store items or index locations in visual space. Using simultaneous auditory and visual presentation, Cowan et al. (2011) showed that the IPS had load-dependent activation that cut across the representational domain, arguing for a modality-independent storage system that the authors linked to Cowan’s concept of a focus of attention (Cowan, 1988). What exactly is being represented in the IPS is still a matter of debate; one need not assume, however, that what is being “stored” is a literal representation of an object. It may instead reflect the deployment of a limited pool of attentional resources that scales with the number of items that are maintained in memory.

Figure 19.5 Two studies of the capacity of visual short-term memory. A, The visual change detection paradigm using an example trial for the left hemifield. B, Averaged event-related potential difference waves at lateral and posterior parietal electrode sites plotted for memory loads of 1, 2, 3, and 4. C, Region of the intraparietal sulcus (IPS), bilaterally, with level of activation that tracks visual short-term memory load as defined by Cowan’s K function. D, Behavioral performance and IPS response functions. Behavioral performance corresponds to the estimated number (K) of encoded colored discs at each set size. IM, iconic memory control task; VSTM, visual short-term memory task. Top, Reprinted by permission from Macmillan Publishers Ltd: Nature, Edward K. Vogel and Maro G. Machizawa, “Neural activity predicts individual differences in visual working memory capacity,” 428, 748–751, copyright 2004.
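
As a rough illustration of the quantities discussed in this section, the sketch below computes Cowan's K in its commonly cited form, K = N × (H − FA), where N is the set size, H the hit rate, and FA the false-alarm rate, together with a mean contralateral-minus-ipsilateral amplitude difference. The function names, the sign convention of the difference, and all numbers are assumptions made for illustration; none of them is taken from the cited studies.

```python
def cowans_k(set_size, hit_rate, false_alarm_rate):
    """Estimate capacity as K = N * (H - FA)."""
    return set_size * (hit_rate - false_alarm_rate)


def mean_difference(contralateral, ipsilateral):
    """Mean contralateral-minus-ipsilateral amplitude over paired samples."""
    if len(contralateral) != len(ipsilateral) or not contralateral:
        raise ValueError("need two equal-length, non-empty sequences")
    diffs = [c - i for c, i in zip(contralateral, ipsilateral)]
    return sum(diffs) / len(diffs)


# Hypothetical behavioral data: (hit rate, false-alarm rate) per set size.
behavior = {1: (0.98, 0.02), 2: (0.95, 0.05), 4: (0.80, 0.15), 8: (0.60, 0.25)}
for n, (h, fa) in behavior.items():
    print(f"set size {n}: K = {cowans_k(n, h, fa):.2f}")

# Hypothetical delay-period amplitudes (microvolts) at one electrode pair.
contra = [-2.0, -2.5, -3.0]
ipsi = [-1.0, -1.0, -1.5]
print(f"mean difference: {mean_difference(contra, ipsi):.2f} microvolts")
```

With these made-up rates, K grows with set size at first and then changes little between four and eight items, which is the qualitative pattern the text describes for both behavior and delay-period activity.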

Can Working Memory Capacity Be Expanded Through Training?

In the past several years there has been a great deal of interest in the question as to whether working memory capacity can be expanded through practice and training (Jaeggi, Buschkuehl, Jonides, & Perrig, 2008; Klingberg, 2010; Morrison & Chein, 2011). Studies of individual differences in working memory abilities have shown that the construct is highly correlated (r ∼ 0.5) with fluid intelligence (Engle, Tuholski, Laughlin, & Conway, 1999; Oberauer, Schulze, Wilhelm, & Suss, 2005), and it therefore stands to reason that if working memory capacity could be expanded through training, so could one enhance a person’s general problem-solving abilities and overall intelligence. A central concern of such training research is in demonstrating that practice with a working memory task confers a benefit, not just in performance on the same or similar types of tasks, but also in cognitive performance on activities that share little surface similarity with the training task (Shipstead, Redick, & Engle, 2012), a phenomenon called far transfer. Although there is some evidence supporting the hypothesis that training on a working memory task can lead to improvements on other cognitive tasks (Chein & Morrison, 2010; Jaeggi, Buschkuehl, Jonides, & Shah, 2011; Klingberg et al., 2005), the largest study of computerized cognitive training, involving more than 10,000 participants, failed to demonstrate far transfer (Owen et al., 2010). Nevertheless, research on this topic is still in its infancy, and it seems plausible that small improvements in working memory capacity could be achieved through training.

Verbal Working Memory

Research on the neural basis of verbal working memory has, for a number of reasons, taken a different course from corresponding work in the visual domain. First, whereas in visual working memory many of the most influential ideas and concepts have derived from work in the monkey, verbal working memory is a uniquely human phenomenon and has therefore benefited from animal research only indirectly, or by analogy with the visual system. Even research on the primary modality relevant to verbal working memory, that of audition, is surprisingly scarce in the monkey literature, owing to the difficulty in training nonhuman primates to perform delayed-response tasks with auditory stimuli, which can take upward of 15,000 learning trials (see Fritz, Mishkin, & Saunders, 2005). On the other hand, an entirely different state of affairs prevails in the field of human cognitive psychology, where verbal short-term and working memory has over the past 40 years been studied extensively, almost to the exclusion of other modalities, resulting in thousands of published articles, a host of highly reliable and replicated behavioral phenomena, and dozens of sophisticated computational models. Finally, the study of aphasic patients has provided a wealth of information about the neural circuitry underlying language, and systematic neurological and neuropsychological inquiries into the impairments that accompany damage to the language system have yielded detailed neuroanatomical models. The aphasia literature notwithstanding, the study of the neural basis of verbal working memory has depended, to a much greater extent than has been the case in the visual-spatial domain, on pure cognitive models of memory, in particular the phonological loop of Baddeley and colleagues. Not surprisingly, as it turns out, there are notable similarities between working memory for visual material and working memory for linguistic material, despite the absence of an exactly analogous capacity in nonhuman primates.

Early neurological investigations of patients with language disturbances, or aphasia, revealed that lesions to specific parts of the cerebral cortex could cause extremely selective deficits in language abilities. Thus, lesions to the inferior frontal gyrus are associated with Broca’s aphasia, a disorder that causes severe impairments in speech production. Broca’s aphasia is not, however, a disorder of peripheral motor coordination, such as the ability to move and control the tongue and mouth, but rather is a disorder of the ability to plan, program, and access the motor codes required for the production of speech (Goodglass, 1993). The functions of speech perception and comprehension in Broca’s aphasia are generally preserved, however. Lesions to the posterior superior temporal gyrus (STG) and surrounding cortex, on the other hand, are associated with Wernicke’s aphasia, a complex syndrome that is characterized by fluent but error-filled production and poor comprehension and perception of speech.
A third, less studied syndrome called conduction aphasia, typically caused by lesions in the posterior sylvian region (generally less extensive and relatively superior to lesions causing Wernicke’s aphasia), is associated with preserved speech perception and comprehension, occasional errors in otherwise fluent spontaneous speech (e.g., phoneme substitutions), and severe difficulties with verbatim repetition of words and sentences (H. Damasio & Damasio, 1980). From the standpoint of verbal short-term memory, there are a number of important points to be drawn from these three classic aphasic syndromes. First, the neural structures that underlie the perception and production of speech are partly dissociable. Thus, it appears that the brain retains at least two codes for the representation of speech: a sensory, or acoustic code, and an articulatory, or motor code; the former is necessary for the perception of speech, and the latter is required for the production of speech. It is tempting to postulate that posterior temporal lesions primarily affect receptive language functions, whereas anterior lesions affect productive language functions, but this is not quite true: both Wernicke’s aphasia and conduction aphasia are caused by posterior lesions, yet only the former is associated with a receptive language disturbance (Hickok & Poeppel, 2000). Second, all of the above-mentioned disorders affect basic aspects of language processing, such as the comprehension, production, and perception of speech. Even conduction aphasia, for which a deficit in repetition of speech is often emphasized, is characterized by speech errors that occur in the course of natural language production. Finally, the classic Wernicke-Lichtheim-Geschwind (Geschwind, 1965) model of language explains each of these three syndromes as disruptions to components of a neuroanatomical network of areas, in the inferior frontal and superior temporal cortices, that subserve language function.

In the 1960s a handful of patients were described who did not fit neatly into the classic aphasiological rubric. Both Luria (1967) and Warrington and Shallice (1969) described patients with damage to the temporal-parietal cortex who were severely impaired at repeating sequences of words or digits spoken aloud by the experimenter. Luria referred to the deficit as an acoustic-mnestic aphasia, whereas Warrington and Shallice (1969), who were perhaps more attuned to extant information processing models in cognitive psychology, referred to the deficit as a “selective impairment of auditory-verbal short-term memory.” In both of these cases, however, the memory impairment was accompanied by a deficit in ordinary speech production (i.e., word-finding difficulties, errors of speech, and reading difficulty), which was, in fact, consistent with the more common diagnosis of conduction aphasia, and therefore complicated the argument in favor of a “pure” memory impairment. Several years later, however, a patient (JB) (Shallice & Butterworth, 1977), also with a temporal-parietal lesion, was described who had a severely reduced auditory-verbal immediate memory span (one or two items) and yet was otherwise unimpaired in ordinary language use, including speech production and even long-term learning of supraspan lists of words. Several other such patients have since been described (for a review, see Shallice & Vallar, 1990), thus strengthening the case for the existence of an auditory-verbal storage component located in temporal-parietal cortex.

The puzzle, of course, with respect to the classic neurological model of language discussed above, is how a lesion in the middle of the perisylvian speech center could produce a deficit in auditory-verbal immediate memory without any collateral deficit in basic language functioning.
One possibility is that the precise location of the brain injury is determinative, so that a particularly focal and well-placed lesion in temporal-parietal cortex might spare cortex critical for speech perception and speech production, while damaging a region dedicated to the storage of auditory-verbal information. However, the number of patients who have been described with a selective impairment to auditory-verbal short-term memory is small, and the lesion locations that have been reported are comparable to those that might, in another patient, have led to conduction or Wernicke’s aphasia (A. R. Damasio, 1992; Dronkers, Wilkins, Van Valin, Redfern, & Jaeger, 2004; Goodglass, 1993). This would seem, then, to be a question particularly well suited to high-resolution functional neuroimaging.

The first study that attempted to localize the components of the phonological loop in the brain was that of Paulesu and colleagues (1993). In one task, English letters were visually presented on a monitor and subjects were asked to remember them. In a second task, letters were presented and rhyming judgments were made about them (press a button if the letter rhymes with “B”). In a baseline condition, Korean letters were visually presented and subjects were asked to remember them using a visual code. According to the authors’ logic, the first task would require the contribution of all the components of the phonological loop (subvocal rehearsal, phonological storage, and executive processes), and the second (rhyming) task would only require subvocal rehearsal and executive

processes. This reasoning was based on previous research (Vallar & Baddeley, 1984) showing that when letters are presented visually, rhyming decisions engage the subvocal rehearsal system, but not the phonological store. Thus, a subtraction of the rhyming condition from the letter-rehearsal condition should isolate the neural locus of the phonological store. First, results were presented for the two tasks requiring phonological processing compared with the baseline task (viewing Korean letters) that did not. Several areas were shown to be significantly more active in the phonological tasks, including (in all cases, bilaterally): Broca’s area (BA 44/45), the supplementary motor area (SMA), the insula, the cerebellum, Brodmann area 22/42, and Brodmann area 40. Subtracting the rhyming condition from the phonological short-term memory condition left a single brain area, Brodmann area 40, which was taken to be the neural correlate of the phonological store.

Not surprisingly, the articulatory rehearsal process recruited a distributed neural circuit that included the inferior frontal gyrus. Activation of multiple brain regions during articulatory rehearsal is not surprising, given the complexity of the process and the variety of lesion sites associated with a speech production deficit. On the other hand, the localization of the phonological store in a single brain region, BA 40 (or the supramarginal gyrus), comports with the idea of a solitary “receptacle,” where phonological information is temporarily stored. A number of follow-up PET studies, using various tasks and design logic, generally replicated the basic finding of the Paulesu study, namely a frontal-insular-cerebellar network associated with rehearsal processes, and a parietal locus for the phonological store (Awh et al., 1996; Jonides et al., 1998; Salmon et al., 1996; Schumacher et al., 1996; Smith & Jonides, 1999).
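The subtraction logic of the Paulesu design can be made explicit with a toy sketch. The component sets below simply restate the authors' assumptions as summarized above; the task names, the `visual_code` label for the Korean-letter baseline, and the `subtract` helper are hypothetical labels chosen for illustration and are not taken from the original study.

```python
# Toy model of cognitive subtraction (illustrative only).
# Each task is described by the set of processes it is ASSUMED to engage;
# the labels are hypothetical and follow the logic described in the text.
TASK_COMPONENTS = {
    "letter_memory": {"subvocal_rehearsal", "phonological_store", "executive"},
    "rhyme_judgment": {"subvocal_rehearsal", "executive"},
    "korean_baseline": {"visual_code"},  # assumption: no phonological loop component
}


def subtract(task: str, control: str) -> set:
    """Return the processes engaged by `task` but not by `control`."""
    return TASK_COMPONENTS[task] - TASK_COMPONENTS[control]


store_only = subtract("letter_memory", "rhyme_judgment")
rhyme_vs_baseline = subtract("rhyme_judgment", "korean_baseline")

print(sorted(store_only))
print(sorted(rhyme_vs_baseline))
```

The inference is only as good as the assumed component sets: if rhyming judgments also engaged the store, the difference would be empty and the subtraction would isolate nothing.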
In a review of these pre-millennium PET studies of verbal working memory, Becker et al. (1999) questioned whether the localization of the phonological store in BA 40 of the parietal cortex could be reconciled with the logical architecture of the phonological loop. For instance, one key aspect of the phonological loop model is that auditory information (whether it be speech, tones, music, or white noise), but not visual information, has obligatory access to the phonological store. This asymmetry is posited to account for dissociations in memory performance that depend on the modality in which information is presented. For instance, the presentation of distracting auditory information while subjects attempt to retain a list of verbal items in memory impairs performance on tests of recall. In contrast, the presentation of distracting visual information during verbal memory retention has no impact on verbal recall. This phenomenon, known as the irrelevant sound effect, is explained by assuming that auditory information, whether relevant or irrelevant, always enters the phonological store, but that visual-verbal information only enters the store when it is explicitly subvocalized. Becker and colleagues, however, noted that if indeed auditory information has obligatory access to the phonological store, its “neural correlate” should be active even during passive auditory perception. Functional neuroimaging studies of passive auditory listening (i.e., with no memory component), however, do not show activity in the parietal lobe, but rather show activation that is largely confined to the superior temporal lobe (e.g., Binder et al., 2000). In addition, efforts to show verbal mnemonic specificity of maintenance-related activity in the parietal lobe have not been successful, showing instead that working memories for words, visual objects, and spatial locations all activate the area (Badre, Poldrack, Pare-Blagoev, Insler, & Wagner, 2005; Nystrom et al., 2000; Zurowski et al., 2002). Thus, it would appear that if there were a true neural correlate to the phonological store, it must reside within the confines of the auditory cortical zone of the superior temporal cortex.

As was the case in the visual-spatial domain, the emergence of event-related fMRI, with its ability to isolate delay-period activity during working memory, was an inferential boon to the study of verbal working memory. Postle (1999) showed with visual-verbal presentation of letter stimuli that delay-period activity in single subjects was often localized in the posterior-superior temporal cortex rather than the parietal lobe. Buchsbaum (2001) also used an event-related fMRI paradigm in which, on each trial, subjects were presented with acoustic speech information that they then rehearsed subvocally for 27 seconds, followed by a rest period. Analysis focused on identifying regions that were responsive both during the perceptual phase and during the rehearsal phase of the trial. Activation occurred in two regions in the posterior superior temporal cortex, one in the posterior superior temporal sulcus (pSTS) bilaterally and one along the dorsal surface of the left posterior planum temporale, that is, in the Sylvian fissure at the parietal-temporal boundary (area SPT). Notably, although the parietal lobe did show delay-period activity, it was unresponsive during auditory stimulus presentation. In a follow-up study, Hickok et al. (2003) showed that the same superior temporal regions (pSTS and SPT) were active both during the perception and during delay-period maintenance of short (5-second) musical melodies, suggesting that these posterior temporal storage sites are not restricted to speech-based, or phonological, information (Figure 19.6).
Several subsequent studies have confirmed the role of SPT in inner speech and verbal working memory (Hashimoto, Lee, Preus, McCarley, & Wible, 2010; Hickok, Okada, & Serences, 2009; Koelsch et al., 2009). Acheson et al. (2011) used fMRI to identify posterior temporal regions activated during verbal working memory maintenance, and then applied repetitive transcranial magnetic stimulation (TMS) to these sites while subjects performed a rapid-paced reading task that involved language production but no memory load. TMS applied to the posterior temporal area significantly interfered with paced reading, arguing for a common neural substrate for language production and verbal working memory.
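The conjunction logic used in the event-related studies described above (looking for regions that respond both during auditory presentation and during the rehearsal delay) can be sketched as follows. This is a minimal illustration, not the analysis pipeline of any cited study: the region names, the t-values, the threshold, and the `classify` helper are all hypothetical.

```python
# Toy classification of regions by response profile (illustrative only).
# The t-values and the threshold are made-up numbers, not data from any study.
THRESHOLD = 3.0  # hypothetical significance cutoff


def classify(t_perception: float, t_rehearsal: float, threshold: float = THRESHOLD) -> str:
    """Label a region by whether it responds during listening, rehearsal, or both."""
    hears = t_perception > threshold
    rehearses = t_rehearsal > threshold
    if hears and rehearses:
        return "auditory + rehearsal"
    if hears:
        return "auditory-only"
    if rehearses:
        return "delay-only"
    return "unresponsive"


# Hypothetical regions: (t during stimulus presentation, t during rehearsal delay)
regions = {
    "SPT": (5.1, 4.8),
    "parietal": (0.4, 4.2),
    "early_auditory": (6.0, 0.7),
}

labels = {name: classify(*stats) for name, stats in regions.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Under this scheme a region counts as a candidate storage site only if it falls in the "auditory + rehearsal" class, which is why a region with delay-period activity alone, as reported for the parietal lobe, does not qualify.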


Figure 19.6 Main results from Hickok et al.’s (2003) study of verbal and musical working memory maintenance. A, Averaged time course of activation over the course of a trial in the Sylvian fissure at the parietal-temporal boundary (area SPT) for speech and music conditions. Timeline at bottom shows structure of each trial; black bars indicate auditory stimulus presentation. Red traces indicate activation during rehearsal trials; black traces indicate activity during listen-only trials in which subjects did not rehearse stimuli at all. B, Activation maps in the left hemisphere (sagittal slices) showing three response patterns for both music rehearsal (left) and speech rehearsal trials (right): auditory-only responses shown in green; delay-period responses shown in blue; and auditory + rehearsal responses shown in red. Arrows indicate the location of area SPT. pSTS, posterior superior temporal sulcus.

Stevens et al. (2004) and Rama et al. (2004) have shown that memory for voice identity, independent of phonological content (i.e., matching speaker identity as opposed to word identity), selectively activates the mid-STS and the anterior STG of the superior temporal region, but not the more posterior and dorsally situated SPT region. Buchsbaum et al. (2005) have further shown that the mid-STS is more active when subjects recall verbal information that is acoustically presented than when the information is visually presented, whereas area SPT shows equally strong delay-period activity for both auditory and visual forms of input. This finding is supported by regional analyses of structural MRI in large groups of patients with brain lesions that have shown that damage to the STG is most predictive of auditory short-term memory impairment (Koenigs et al., 2011; Leff et al., 2009). Thus, it appears that different regions in the auditory association cortex of the superior temporal cortex are attuned to different qualities or features of a verbal stimulus, such as voice information, input modality, phonological content, and lexical status (e.g., Martin & Freedman, 2001), and all of these codes may play a role in the short-term maintenance of verbal information.


Figure 19.7 A comparison of conduction aphasia, phonological working memory in functional magnetic resonance imaging (fMRI), and their overlap. Left panel shows the regional distribution of lesion overlap in patients with conduction aphasia (maximum is 12/14, or 85% overlap). Middle panel shows the percentage of subjects with maintenance-related activity in a phonological working memory task. Right panel shows the area of maximal overlap between the lesion and fMRI surfaces (lesion > 85% overlap and significant fMRI activity for conjunction of encoding and rehearsal).

Additional support for a feature-based topography of auditory association cortex comes from neuroanatomical tract-tracing studies in the monkey that have revealed separate temporal-prefrontal pathways arising along the anterior-posterior axis of the superior temporal region (Romanski, 2004; Romanski et al., 1999). The posterior part of the STG projects to dorsolateral PFC (BA 46, 8), whereas neurons in the anterior STG are more strongly connected to the ventral PFC, including BA 12 and 47. Several authors have suggested, by analogy with the visual system, a dichotomy between a ventral-going auditory-object stream and a dorsal-going auditory-spatial processing stream (Rauschecker & Tian, 2000; Tian, Reser, Durham, Kustov, & Rauschecker, 2001). Thus, studies have shown that the neurons in the rostral STG have more selective responses to classes of complex sounds, such as vocalizations, whereas more caudally located regions have more spatial selectivity (Chevillet, Riesenhuber, & Rauschecker, 2011; Rauschecker & Scott, 2009; Rauschecker & Tian, 2000; Tian et al., 2001). Hickok and Poeppel (2000, 2004) have proposed that human speech processing also proceeds along diverging auditory dorsal and ventral streams, although they emphasize the distinction between perception for action, or auditory-motor integration, in the dorsal stream and perception for comprehension in the ventral stream. Buchsbaum et al. (2005) have shown with fMRI time-series data that, consistent with the monkey connectivity patterns, the most posterior and dorsal part of the superior temporal cortex, area SPT, has the strongest functional connectivity with dorsolateral and posterior (premotor) parts of the PFC, whereas the midportion of the STS is most tightly coupled with BA 12 and 47 of the ventrolateral PFC.
Moreover, gross distinctions between anterior (BA 47) and posterior (BA 44/6) parts of the PFC have been associated with conceptual-semantic and phonological-articulatory aspects of verbal processing, respectively (Poldrack et al., 1999; Wagner, Pare-Blagoev, Clark, & Poldrack, 2001).

Earlier we posed the question as to how a lesion in posterior sylvian cortex, an area of known importance for online language processing, could occasionally produce an impairment restricted to phonological short-term memory. One solution to this puzzle is that

subjects with selective verbal short-term memory deficits from temporal-parietal lesions retain their perceptual and comprehension abilities owing to the sparing of the ventral stream pathways in the lateral temporal cortex, whereas the preservation of speech production is due to an unusual capacity in these subjects for right-hemisphere control of speech (Buchsbaum & D’Esposito, 2008b; Hickok & Poeppel, 2004). The short-term memory deficit arises, then, from a selective deficit in auditory-motor integration (the ability to translate between acoustic and articulatory speech codes), a function that is especially taxed during tests of repetition and short-term memory (Buchsbaum & D’Esposito, 2008a). Conduction aphasia, the aphasic syndrome most often associated with a deficit in auditory repetition and verbal short-term memory in the absence of any difficulty with speech perception, may reflect a disorder of auditory-motor integration. Indeed, it has recently been shown that the lesion site most often implicated in conduction aphasia circumscribes area SPT in the posterior-most portion of the superior temporal lobe, a link between a disorder of verbal repetition and a region in the brain often implicated in tasks of verbal working memory (Buchsbaum et al., 2011; Figure 19.7). Thus, impairment in the ability to temporarily store verbal information, as occurs in conduction aphasia, may result from damage to a system, area SPT, that is critical for the interfacing of auditory and motor representations of sound.

Cognitive Control of Working Memory

In the foregoing sections we have examined how different types of information (spatial, visual, verbal) are represented in the brain. A key conclusion has been that the regions of the cerebral cortex that are specialized for the perception of certain classes of stimuli are also important for the maintenance of such information in working memory. This principle of course only applies to regions of the cerebral cortex, located primarily in the posterior half of the neocortex, that are specialized for sensory processing in the first place. For instance, the PFC, which is not easily categorized in terms of type of sensory processing, shows little evidence of content-specific selectivity in working memory. Rather, the PFC appears to play a more general role in maintaining, monitoring, and controlling the current contents of working memory, irrespective of the type of information involved. Indeed, the role of Baddeley’s somewhat elusive “central executive” is most likely fulfilled by the coordinated action of the PFC. We learn little, however, by merely substituting the homunculus of the working memory model, the central executive, with a large area of cortex in the front of the brain (Miyake et al., 2000). In recent years, however, progress has been made in the study of the role of the PFC in “cognitive control” by investigating the constituent subprocesses that together compose the brain’s executive system.

Three of the most important processes that serve to regulate the contents of working memory are selection, reactivation (or updating), and suppression. Each of these operations functions to regulate what is currently “in” working memory, allowing for the contents of working memory to be determined strategically, according to ongoing goals and actions of the organism.
Moreover, these cognitive control processes allow for the best utilization of a system that is subject to severe capacity limitations; thus, selection is a

mechanism that regulates what enters working memory, suppression serves to prevent unwanted information from entering (or remaining in) memory, and reactivation offers a mechanism for retaining information in working memory. All three of these operations fall under the general category of what we might call “top-down” signals or commands that the PFC deploys to effectively regulate memory. In the following sections we briefly review research on the neural basis of these working memory control functions.

It is easy to see why selection is an important process for regulating the contents of working memory. For instance, if one tries to mentally calculate the product of the numbers 8 and 16, one might attack the problem in stages. First multiply 8 and 6 (yielding 48), then multiply 8 and 10 (yielding 80), and then add together the intermediate values (48 + 80 = 128). For this strategy to work, one has to be able to select the currently relevant numbers in working memory at each stage of the calculation. Thus, at any given time, it is likely that many pieces of information are competing for access to working memory. One way to study this type of selection is to devise working memory tasks that vary the degree of proactive interference, so that on some trials subjects must select the correct items from among a set of active, but task-irrelevant competitors (see Jonides & Nee, 2006; Vogel, McCollough, & Machizawa, 2005; Yi, Woodman, Widders, Marois, & Chun, 2004). Functional neuroimaging studies have shown that the left inferior frontal gyrus is modulated by the number of irrelevant competing alternatives that confront a subject (Badre & Wagner, 2007; Thompson-Schill, D’Esposito, Aguirre, & Farah, 1997).
Moreover, behavioral measures of the interfering effects of irrelevant information are correlated with the level of activation in the left inferior frontal gyrus (D’Esposito, Postle, Jonides, & Smith, 1999; Smith, Jonides, Marshuetz, & Koeppe, 1998; Thompson-Schill et al., 2002). From a mechanistic standpoint, selection may involve the maintenance of goal information to bias activation in favor of relevant over irrelevant information competing for access to working memory (E. K. Miller & Cohen, 2001; Rowe, Toni, Josephs, Frackowiak, & Passingham, 2000).

Reactivation refers to the updating or refreshing of information that is currently active in memory. For instance, in verbal working memory, rehearsal is used as a strategy to keep sequences of phonological material in mind. But why is this strategy effective? According to the phonological loop model, articulatory rehearsal serves to “refresh” the decaying traces in the phonological store. We know from functional neuroimaging studies (reviewed above) that during covert verbal rehearsal, brain areas known to be important for speech production (e.g., Broca’s area, premotor cortex) coactivate with areas that are known to be important for speech perception. One interpretation of this phenomenon, consistent with the ideas of the phonological loop, is that regions in the PFC that govern the selection of speech motor programs can also send signals to posterior cortex that can “reactivate” a set of corresponding sensory traces. In this case, reactivation is achieved presumably by way of a tight sensorimotor link between a prefrontally situated motor system (e.g., the left inferior frontal gyrus, LIFG, for speech) and an auditory system located in posterior neocortex.


There must yet be other means of achieving reactivation, however, when there is not a direct equivalence between some top-down (e.g., motor) and bottom-up (e.g., sensory) code, as is true in the case of speech, in which every sound that can be produced can also be perceived (Chun & Johnson, 2011). Thus, for instance, there is no obvious “motor equivalent” of an abstract shape, color, or picture of an (expressionless) face, and yet such images can nevertheless be maintained to some degree in working memory. One possibility is that the maintenance of sensory stimuli that do not have a motor analogue is achieved by means of a more general mechanism of reactivation that need not rely on direct motor to sensory signals, but rather operates in a similar manner to selective attention. Thus, just as attention can be focused on this or that aspect of the current perceptual environment, it can also be directed to some subset of the current contents of working memory. Indeed, if, as we have been suggesting, the neural substrates of memory are shared with those of perception, then it seems likely that the same mechanisms of selective attention can be applied to both the contents of perception and the contents of memory. A potential mechanism for selective modulation of memory representations is by way of a top-down bias signal that can target populations of cells in sensory cortex and modulate their level of activity.

Top-down suppression has the opposite effect of selection or reactivation: it serves to decrease, rather than increase, the salience of a stimulus representation in working memory. Although it is sometimes argued that suppression in attention and memory is merely the flip side of selection, and therefore does not constitute a separate processing mechanism in its own right, evidence from cognitive neuroscience suggests that this is not the case.
For instance, Gazzaley (2005) used a working memory delay task to directly study the neural mechanisms underlying top-down activation and suppression by investigating the processes involved when participants were required to select relevant and suppress irrelevant information. During each trial, participants observed sequences of two faces and two natural scenes presented in a randomized order. The tasks differed in the instructions informing the participants how to process the stimuli: (1) remember faces and ignore scenes, (2) remember scenes and ignore faces, or (3) passively view faces and scenes without attempting to remember them. In the two memory tasks, the encoding of the task-relevant stimuli required selective attention and thus permitted the dissociation of physiological measures of enhancement and suppression relative to the passive baseline. fMRI data revealed top-down modulation of both activity magnitude and processing speed that occurred above or below the perceptual baseline depending on task instruction. Thus, during the encoding period of the delay task, FFA activity was enhanced when faces had to be remembered compared with a condition in which they were passively viewed. Likewise, FFA activity was suppressed when faces had to be ignored (with scenes now being retained instead across the delay interval) compared with a condition in which they were passively viewed. Thus, there appear to be at least two types of top-down signal: one that serves to enhance task-relevant information and another that serves to suppress task-irrelevant information. It is well documented that the nervous system uses interleaved inhibitory and excitatory mechanisms throughout the neuroaxis (e.g., spinal reflexes, cerebellar outputs, and basal ganglia movement control networks). Thus, it may not

be surprising that enhancement and suppression mechanisms may exist to control cognition (Knight, Staines, Swick, & Chao, 1999; Shimamura, 2000). By generating contrast through enhancements and suppressions of activity magnitude and processing speed, top-down signals bias the likelihood of successful representation of relevant information in a competitive system.
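The enhancement and suppression measures in the face/scene design described above are both defined relative to the passive-view baseline. A minimal sketch of that arithmetic follows; the activity values are invented for illustration, and the `modulation_indices` helper and its sign convention are assumptions rather than the published analysis.

```python
# Toy computation of enhancement and suppression indices (illustrative only).
def modulation_indices(remember: float, passive: float, ignore: float) -> tuple:
    """Return (enhancement, suppression) relative to the passive-view baseline."""
    enhancement = remember - passive  # > 0: activity raised above baseline
    suppression = passive - ignore    # > 0: activity pushed below baseline
    return enhancement, suppression


# Hypothetical activity magnitudes in a face-selective region (arbitrary units).
ffa = {"remember_faces": 1.5, "passive_view": 1.0, "ignore_faces": 0.75}

enhancement, suppression = modulation_indices(
    ffa["remember_faces"], ffa["passive_view"], ffa["ignore_faces"]
)
print(f"enhancement = {enhancement}, suppression = {suppression}")
```

Keeping the two indices separate is what allows the design to ask whether suppression is more than the absence of enhancement: a region could show a positive enhancement index with a suppression index of zero, or the reverse.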

Top-Down Control Signals and the Prefrontal Cortex

Although it has been proposed that the PFC provides the major source of the types of top-down signals that we have described, this hypothesis largely originates from suggestive findings rather than direct empirical evidence. However, a few studies lend direct causal support to this hypothesis. For example, Fuster (1985) investigated the effect of cooling inactivation of specific parts of the PFC on spiking activity in inferior temporal cortex (ITC) neurons during a delayed match-to-sample (DMS) color task. During the delay interval in this task, when persistent stimulus-specific activity in ITC neurons is observed, inactivation caused attenuated spiking profiles and a loss of stimulus specificity of ITC neurons. These two alterations of ITC signaling strongly implicate the PFC as a source of top-down signals necessary for maintaining robust sensory representations in the absence of bottom-up sensory activity. Tomita (1999) was able to isolate top-down signals during the retrieval of paired associates in a visual memory task. Spiking activity was recorded from stimulus-specific ITC neurons as cue stimuli were presented to the ipsilateral hemifield. This experiment’s unique feature was the ability to separate bottom-up sensory signals from a top-down mnemonic reactivation, using a posterior split-brain procedure that limited hemispheric cross-talk to the anterior corpus callosum connecting each PFC. When a probe stimulus was presented ipsilaterally to the recording site, thus restricting bottom-up visual input to the contralateral hemisphere, stimulus-specific neurons became activated at the recording site approximately 170 ms later.
Because these neurons received no bottom-up visual signals of the probe stimulus, with the only route between the two hemispheres being through the PFC, this experiment showed that PFC neurons were sufficient to trigger the reactivation of object-selective representations in ITC regions in a top-down manner.

The combined lesion/electrophysiological approach in humans has rarely been implemented. Chao and Knight (1998), however, studied patients with lateral PFC lesions during DMS tasks. It was found that when distracting stimuli were presented during the delay period, the amplitude of the recorded ERP from posterior electrodes was markedly increased in patients compared with controls. These results were interpreted to show disinhibition of sensory processing and support a role of the PFC in suppressing the representation of stimuli that are irrelevant for current behavior.

Based on the data we have reviewed thus far, we might propose that any population of neurons within primary or unimodal association cortex can exhibit persistent neuronal activity, which serves to actively maintain the representations coded by those neuronal populations (Curtis & D’Esposito, 2003). Areas of multimodal cortex, such as PFC and parietal cortex, which are in a position to integrate representations through connectivity to unimodal association cortex, are also critically involved in the active maintenance of task-relevant information (Burgess, Gilbert, & Dumontheil, 2007; Stuss & Alexander, 2007).

Miller and Cohen (2001) have proposed that in addition to the recent sensory information, integrated representations of task contingencies and even abstract rules (e.g., if this object, then this later response) are also maintained in the PFC. This is similar to what Fuster (1997) has long emphasized, namely that the PFC is critically responsible for temporal integration and the mediation of events that are separated in time but contingent on one another. In this way, the PFC may exert “control” in that the information it represents can bias posterior unimodal association cortex in order to keep neural representations of behaviorally relevant sensory information activated when they are no longer present in the external environment (B. T. Miller & D’Esposito, 2005; Postle, 2006b; Ranganath et al., 2004). To take a real-world example, when a person is looking at a crowd of people, the visual scene presented to the retina may include a myriad of angles, shapes, people, and objects. If that person is a police officer looking for an armed robber escaping through the crowd, however, some mechanism of suppressing irrelevant visual information while enhancing task-relevant information is necessary for an efficient and effective search. Thus, neural activity throughout the brain that is generated by input from the outside world may be differentially enhanced or suppressed, presumably from top-down signals emanating from integrative brain regions such as PFC, based on the context of the situation. As Miller and Cohen (2001) state, putative top-down signals originating in PFC may permit “the active maintenance of patterns of activity that represent goals and the means to achieve them. They provide bias signals throughout much of the rest of the brain, affecting visual processes and other sensory modalities, as well as systems responsible for response execution, memory retrieval, emotional evaluation, etc.
The aggregate effect of these bias signals is to guide the flow of neural activity along pathways that establish the proper mappings between inputs, internal states and outputs needed to perform a given task.” Computational models of this type of system have created a PFC module (e.g., O’Reilly & Norman, 2002) that consists of “rule” units whose activation leads to the production of a response other than the one most strongly associated with a given input. Thus, “this module is not responsible for carrying out input–output mappings needed for performance. Rather, this module influences the activity of other units whose responsibility is making the needed mappings” (e.g., Cohen, Dunbar, & McClelland, 1990). Accordingly, there is no need to propose the existence of a homunculus (e.g., central executive) in the brain that can perform a wide range of cognitive operations that are necessary for the task at hand (Hazy, Frank, & O’Reilly, 2006).

Clearly, there are other areas of multimodal cortex, such as posterior parietal cortex, as well as the hippocampus, that can also be the source of top-down signals. For example, it is thought that the hippocampus is specialized for rapid learning of arbitrary information that can be recalled in the service of controlled processing (McClelland, McNaughton, & O’Reilly, 1995). Several recent studies have offered evidence that the hippocampus, long thought to contribute little to “online” cognitive activities, plays a role in the maintenance of information in working memory for novel or high-information stimuli (Olsen et al., 2009; Olson, Page, Moore, Chatterjee, & Verfaellie, 2006; Ranganath & D’Esposito, 2001; Rose, Olsen, Craik, & Rosenbaum, 2012).

Moreover, input from brainstem neuromodulatory systems probably plays a critical role in modulating goal-directed behavior (Robbins, 2007). For example, the dopaminergic system probably plays a critical role in cognitive control processes (for a review, see Cools & Robbins, 2004). Specifically, it is proposed that phasic bursts of dopaminergic neurons may be critical for updating currently activated task-relevant representations, whereas tonic dopaminergic activity serves to stabilize such representations (e.g., Cohen, Braver, & Brown, 2002; Durstewitz, Seamans, & Sejnowski, 2000).
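The proposed division of labor between phasic and tonic dopamine can be made concrete with a toy simulation. The Python sketch below is illustrative only and is not taken from any of the models cited above: the class name `GatedMemory`, the parameters `tonic` and `burst_threshold`, and the simple leaky update rule are all assumptions introduced for this example. A phasic signal at or above threshold overwrites the stored value (updating); otherwise the stored value is held with a strength set by the tonic parameter (stabilization).

```python
from dataclasses import dataclass


@dataclass
class GatedMemory:
    """Toy working-memory store with a dopamine-like gate (hypothetical model)."""

    content: float = 0.0          # currently maintained representation
    tonic: float = 0.9            # assumed stabilization strength, between 0 and 1
    burst_threshold: float = 0.5  # assumed phasic level that opens the gate

    def step(self, sensory_input: float, phasic: float) -> float:
        """Advance one time step and return the maintained value."""
        if phasic >= self.burst_threshold:
            # Phasic burst: the gate opens and the input replaces the stored content.
            self.content = sensory_input
        else:
            # No burst: tonic activity protects the stored content,
            # so the input leaks in only in proportion to (1 - tonic).
            self.content = (
                self.tonic * self.content + (1.0 - self.tonic) * sensory_input
            )
        return self.content
```

In this sketch, raising `tonic` toward 1.0 makes the stored value more resistant to distractors that arrive without a phasic burst, whereas lowering it lets such distractors erode the stored value.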

Summary and Conclusions

Elucidation of the cognitive and neural architectures underlying working memory has been an important focus of neuroscience research for much of the past two decades. The emergence of the concept of working memory, with its emphasis on the utilization of the information stored in memory in the service of behavioral goals, has enlarged our understanding and broadened the scope of neuroscience research on short-term memory. The data reviewed here, drawn from numerous studies, demonstrate that a network of brain regions, including the PFC, is critical for the active maintenance of internal representations. Moreover, it appears that the PFC has functional subdivisions that are organized according to the domain (e.g., verbal, spatial, object) of the topographical inputs arriving from posterior cortices. In addition, however, a level of representational abstractness is achieved through the integration of information converging in the PFC. Finally, working memory function is not localized to a single brain region, but rather is an emergent property of the functional interactions between the PFC and other posterior neocortical regions. Numerous questions remain about the neural basis of this complex cognitive system, but studies such as those reviewed in this chapter should continue to provide converging evidence that may answer many of them.

References

Acheson, D. J., Hamidi, M., Binder, J. R., & Postle, B. R. (2011). A common neural substrate for language production and verbal working memory. Journal of Cognitive Neuroscience, 23 (6), 1358–1367.
Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 2, pp. 89–195). New York: Academic Press.
Awh, E., Jonides, J., Smith, E. E., Schumacher, E. H., Koeppe, R. A., & Katz, S. (1996). Dissociation of storage and rehearsal in working memory: PET evidence. Psychological Science, 7, 25–31.
Baddeley, A. D. (1986). Working memory. Oxford (Oxfordshire), New York: Clarendon Press; Oxford University Press.
Baddeley, A. (1992). Working memory. Science, 255 (5044), 556–559.
Baddeley, A. (2000). The episodic buffer: A new component of working memory? Trends in Cognitive Sciences, 4 (11), 417–423.
Baddeley, A., Allen, R., & Vargha-Khadem, F. (2010). Is the hippocampus necessary for visual and verbal binding in working memory? Neuropsychologia, 48 (4), 1089–1095.
Baddeley, A. D., & Hitch, G. J. (1974). Working memory. In G. Bower (Ed.), The psychology of learning and motivation (Vol. 7, pp. 47–90). New York: Academic Press.
Baddeley, A., Lewis, V., & Vallar, G. (1984). Exploring the articulatory loop. Quarterly Journal of Experimental Psychology, 36, 233–252.
Baddeley, A. D., Thomson, N., & Buchanan, M. (1975). Word length and the structure of short-term memory. Journal of Verbal Learning and Verbal Behavior, 14, 575–589.
Badre, D., Poldrack, R. A., Pare-Blagoev, E. J., Insler, R. Z., & Wagner, A. D. (2005). Dissociable controlled retrieval and generalized selection mechanisms in ventrolateral prefrontal cortex. Neuron, 47 (6), 907–918.
Badre, D., & Wagner, A. D. (2007). Left ventrolateral prefrontal cortex and the cognitive control of memory. Neuropsychologia, 45 (13), 2883–2901.
Becker, J. T., MacAndrew, D. K., & Fiez, J. A. (1999). A comment on the functional localization of the phonological storage subsystem of working memory. Brain and Cognition, 41 (1), 27–38.
Belger, A., Puce, A., Krystal, J. H., Gore, J. C., Goldman-Rakic, P., & McCarthy, G. (1998). Dissociation of mnemonic and perceptual processes during spatial and nonspatial working memory using fMRI. Human Brain Mapping, 6 (1), 14–32.
Binder, J. R., Frost, J. A., Hammeke, T. A., Bellgowan, P. S., Springer, J. A., Kaufman, J. N., et al. (2000). Human temporal lobe activation by speech and nonspeech sounds. Cerebral Cortex, 10 (5), 512–528.
Blum, R. A. (1952). Effects of subtotal lesions of frontal granular cortex on delayed reaction in monkeys. AMA Archives of Neurology and Psychiatry, 67 (3), 375–386.
Buchsbaum, B. R., Baldo, J., Okada, K., Berman, K. F., Dronkers, N., D’Esposito, M., et al. (2011). Conduction aphasia, sensory-motor integration, and phonological short-term memory—An aggregate analysis of lesion and fMRI data. Brain and Language.
Buchsbaum, B. R., & D’Esposito, M. (2008b). The search for the phonological store: from loop to convolution. Journal of Cognitive Neuroscience, 20 (5), 762–778.
Buchsbaum, B. R., Hickok, G., & Humphries, C. (2001). Role of the left superior temporal gyrus in phonological processing for speech perception and production. Cognitive Science, 25, 663–678.

Buchsbaum, B. R., Olsen, R. K., Koch, P., & Berman, K. F. (2005). Human dorsal and ventral auditory streams subserve rehearsal-based and echoic processes during verbal working memory. Neuron, 48 (4), 687–697.
Burgess, P. W., Gilbert, S. J., & Dumontheil, I. (2007). Function and localization within rostral prefrontal cortex (area 10). Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 362 (1481), 887–899.
Butters, N., Pandya, D., Stein, D., & Rosen, J. (1972). A search for the spatial engram within the frontal lobes of monkeys. Acta Neurobiologiae Experimentalis (Wars), 32 (2), 305–329.
Chao, L. L., & Knight, R. T. (1998). Contribution of human prefrontal cortex to delay performance. Journal of Cognitive Neuroscience, 10 (2), 167–177.
Chein, J. M., & Morrison, A. B. (2010). Expanding the mind’s workspace: training and transfer effects with a complex working memory span task. Psychonomic Bulletin and Review, 17 (2), 193–199.
Chevillet, M., Riesenhuber, M., & Rauschecker, J. P. (2011). Functional correlates of the anterolateral processing hierarchy in human auditory cortex. Journal of Neuroscience, 31 (25), 9345–9352.
Chun, M. M., & Johnson, M. K. (2011). Memory: Enduring traces of perceptual and reflective attention. Neuron, 72 (4), 520–535.
Cocchini, G., Logie, R. H., Della Sala, S., MacPherson, S. E., & Baddeley, A. D. (2002). Concurrent performance of two memory tasks: Evidence for domain-specific working memory systems. Memory and Cognition, 30 (7), 1086–1095.
Cohen, J. D., Braver, T. S., & Brown, J. W. (2002). Computational perspectives on dopamine function in prefrontal cortex. Current Opinion in Neurobiology, 12 (2), 223–229.
Cohen, J. D., Dunbar, K., & McClelland, J. L. (1990). On the control of automatic processes: A parallel distributed processing account of the Stroop effect. Psychological Review, 97 (3), 332–361.
Cools, R., & Robbins, T. W. (2004). Chemistry of the adaptive mind. Philosophical Transactions, Series A, Mathematical, Physical, and Engineering Sciences, 362 (1825), 2871–2888.
Corbetta, M., Kincade, J. M., & Shulman, G. L. (2002). Neural systems for visual orienting and their relationships to spatial working memory. Journal of Cognitive Neuroscience, 14 (3), 508–523.
Courtney, S. M., Petit, L., Maisog, J. M., Ungerleider, L. G., & Haxby, J. V. (1998). An area specialized for spatial working memory in human frontal cortex. Science, 279 (5355), 1347–1351.

Courtney, S. M., Ungerleider, L. G., Keil, K., & Haxby, J. V. (1996). Object and spatial visual working memory activate separate neural systems in human cortex. Cerebral Cortex, 6 (1), 39–49.
Courtney, S. M., Ungerleider, L. G., Keil, K., & Haxby, J. V. (1997). Transient and sustained activity in a distributed neural system for human working memory. Nature, 386 (6625), 608–611.
Cowan, N. (1988). Evolving conceptions of memory storage, selective attention, and their mutual constraints within the human information-processing system. Psychological Bulletin, 104 (2), 163–191.
Cowan, N. (2001). The magical number 4 in short-term memory: a reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24 (1), 87–114; discussion 114–185.
Cowan, N., Li, D., Moffitt, A., Becker, T. M., Martin, E. A., Saults, J. S., et al. (2011). A neural region of abstract working memory. Journal of Cognitive Neuroscience, 23 (10), 2852–2863.
Curtis, C. E., & D’Esposito, M. (2003). Persistent activity in the prefrontal cortex during working memory. Trends in Cognitive Sciences, 7 (9), 415–423.
Curtis, C. E., Rao, V. Y., & D’Esposito, M. (2004). Maintenance of spatial and motor codes during oculomotor delayed response tasks. Journal of Neuroscience, 24 (16), 3944–3952.
D’Esposito, M., Aguirre, G. K., Zarahn, E., Ballard, D., Shin, R. K., & Lease, J. (1998). Functional MRI studies of spatial and nonspatial working memory. Brain Research. Cognitive Brain Research, 7 (1), 1–13.
D’Esposito, M., Detre, J. A., Alsop, D. C., Shin, R. K., Atlas, S., & Grossman, M. (1995). The neural basis of the central executive system of working memory. Nature, 378, 279–281.
D’Esposito, M., Postle, B. R., Jonides, J., & Smith, E. E. (1999). The neural substrate and temporal dynamics of interference effects in working memory as revealed by event-related functional MRI. Proceedings of the National Academy of Sciences U S A, 96 (13), 7514–7519.
Damasio, A. R. (1992). Aphasia. New England Journal of Medicine, 326 (8), 531–539.
Damasio, H., & Damasio, A. R. (1980). The anatomical basis of conduction aphasia. Brain, 103 (2), 337–350.
Della Sala, S., Gray, C., Baddeley, A., Allamano, N., & Wilson, L. (1999). Pattern span: A tool for unwelding visuo-spatial memory. Neuropsychologia, 37 (10), 1189–1199.

Dronkers, N. F., Wilkins, D. P., Van Valin, R. D., Jr., Redfern, B. B., & Jaeger, J. J. (2004). Lesion analysis of the brain areas involved in language comprehension. Cognition, 92 (1-2), 145–177.
Druzgal, T. J., & D’Esposito, M. (2001). Activity in fusiform face area modulated as a function of working memory load. Brain Research. Cognitive Brain Research, 10 (3), 355–364.
Druzgal, T. J., & D’Esposito, M. (2003). Dissecting contributions of prefrontal cortex and fusiform face area to face working memory. Journal of Cognitive Neuroscience, 15 (6), 771–784.
Durstewitz, D., Seamans, J. K., & Sejnowski, T. J. (2000). Neurocomputational models of working memory. Nature Neuroscience, 3 (Suppl), 1184–1191.
Engle, R. W., Tuholski, S. W., Laughlin, J. E., & Conway, A. R. (1999). Working memory, short-term memory, and general fluid intelligence: A latent-variable approach. Journal of Experimental Psychology: General, 128 (3), 309–331.
Ericsson, K. A., & Kintsch, W. (1995). Long-term working memory. Psychological Review, 102 (2), 211–245.
Fritz, J., Mishkin, M., & Saunders, R. C. (2005). In search of an auditory engram. Proceedings of the National Academy of Sciences U S A, 102 (26), 9359–9364.
Funahashi, S., Bruce, C. J., & Goldman-Rakic, P. S. (1989). Mnemonic coding of visual space in the monkey’s dorsolateral prefrontal cortex. Journal of Neurophysiology, 61 (2), 331–349.
Fuster, J. M. (1997). Network memory. Trends in Neurosciences, 20 (10), 451–459.
Fuster, J. M. (2000). Prefrontal neurons in networks of executive memory. Brain Research Bulletin, 52 (5), 331–336.
Fuster, J. M., & Alexander, G. E. (1971). Neuron activity related to short-term memory. Science, 173 (997), 652–654.
Fuster, J. M., Bauer, R. H., & Jervey, J. P. (1985). Functional interactions between inferotemporal and prefrontal cortex in a cognitive task. Brain Research, 330 (2), 299–307.
Gazzaley, A., Cooney, J. W., McEvoy, K., Knight, R. T., & D’Esposito, M. (2005). Top-down enhancement and suppression of the magnitude and speed of neural activity. Journal of Cognitive Neuroscience, 17 (3), 507–517.
Geschwind, N. (1965). Disconnexion syndromes in animals and man. I. Brain, 88 (2), 237–294.
Glanzer, M., & Cunitz, A.-R. (1966). Two storage mechanisms in free recall. Journal of Verbal Learning and Verbal Behavior, 5, 351–360.

Goldman-Rakic, P. S. (1987). Circuitry of primate prefrontal cortex and regulation of behavior by representational memory. In F. Plum (Ed.), Handbook of physiology—the nervous system (Vol. 5, pp. 373–417). Bethesda, MD: American Physiological Society.
Goldman-Rakic, P. S. (1990). Cellular and circuit basis of working memory in prefrontal cortex of nonhuman primates. Progress in Brain Research, 85, 325–335; discussion, 335–326.
Goldman-Rakic, P. (2000). Localization of function all over again. NeuroImage, 11 (5 Pt 1), 451–457.
Goodglass, H. (1993). Understanding aphasia. San Diego, CA: Academic Press.
Hasher, L., Zacks, R. T., & Rahhal, T. A. (1999). Timing, instructions, and inhibitory control: some missing factors in the age and memory debate. Gerontology, 45 (6), 355–357.
Hashimoto, R., Lee, K., Preus, A., McCarley, R. W., & Wible, C. G. (2011). An fMRI study of functional abnormalities in the verbal working memory system and the relationship to clinical symptoms in chronic schizophrenia. Cerebral Cortex, 20 (1), 46–60.
Hazy, T. E., Frank, M. J., & O’Reilly, R. C. (2006). Banishing the homunculus: making working memory work. Neuroscience, 139 (1), 105–118.
Hickok, G. (2012). Computational neuroanatomy of speech production. Nature Reviews Neuroscience, 13 (2), 135–145.
Hickok, G., Buchsbaum, B., Humphries, C., & Muftuler, T. (2003). Auditory-motor interaction revealed by fMRI: speech, music, and working memory in area Spt. Journal of Cognitive Neuroscience, 15 (5), 673–682.
Hickok, G., Okada, K., & Serences, J. T. (2009). Area Spt in the human planum temporale supports sensory-motor integration for speech processing. Journal of Neurophysiology, 101 (5), 2725–2732.
Hickok, G., & Poeppel, I. D. (2000). Towards a functional neuroanatomy of speech perception. Journal of Cognitive Neuroscience, 45–45.
Hickok, G., & Poeppel, D. (2004). Dorsal and ventral streams: A framework for understanding aspects of the functional anatomy of language. Cognition, 92 (1-2), 67–99.
Hulme, C., Newton, P., Cowan, N., Stuart, G., & Brown, G. (1999). Think before you speak: Pauses, memory search, and trace redintegration processes in verbal memory span. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25 (2), 447–463.
Hunter, W. S. (1913). The delayed reaction in animals and children. Behavioral Monographs, 2, 1–86.
Ingvar, D. H. (1977). Functional responses of the human brain studied by regional cerebral blood flow techniques. Acta Clinica Belgica, 32 (2), 68–83.

Ingvar, D. H., & Risberg, J. (1965). Influence of mental activity upon regional cerebral blood flow in man: A preliminary study. Acta Neurologica Scandinavica Supplementum, 14, 183–186.
Jacobsen, C. F. (1936). Studies of cerebral function in primates. Comparative Psychological Monographs, 13, 1–68.
Jaeggi, S. M., Buschkuehl, M., Jonides, J., & Perrig, W. J. (2008). Improving fluid intelligence with training on working memory. Proceedings of the National Academy of Sciences U S A, 105 (19), 6829–6833.
Jaeggi, S. M., Buschkuehl, M., Jonides, J., & Shah, P. (2011). Short- and long-term benefits of cognitive training. Proceedings of the National Academy of Sciences U S A, 108 (25), 10081–10086.
Jha, A. P., & McCarthy, G. (2000). The influence of memory load upon delay-interval activity in a working-memory task: an event-related functional MRI study. Journal of Cognitive Neuroscience, 12 (Suppl 2), 90–105.
Jonides, J., & Nee, D. E. (2006). Brain mechanisms of proactive interference in working memory. Neuroscience, 139 (1), 181–193.
Jonides, J., Schumacher, E. H., Smith, E. E., Koeppe, R. A., Awh, E., Reuter-Lorenz, P. A., et al. (1998). The role of parietal cortex in verbal working memory. Journal of Neuroscience, 18 (13), 5026–5034.
Jonides, J., Smith, E. E., Koeppe, R. A., Awh, E., Minoshima, S., & Mintun, M. A. (1993). Spatial working memory in humans as revealed by PET. Nature, 363 (6430), 623–625.
Joseph, J. P., & Barone, P. (1987). Prefrontal unit activity during a delayed oculomotor task in the monkey. Experimental Brain Research, 67 (3), 460–468.
Kahneman, D. (1973). Attention and effort. New York: Prentice-Hall.
Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17 (11), 4302–4311.
Klauer, K. C., & Zhao, Z. (2004). Double dissociations in visual and spatial short-term memory. Journal of Experimental Psychology: General, 133 (3), 355–381.
Klingberg, T. (2010). Training and plasticity of working memory. Trends in Cognitive Sciences, 14 (7), 317–324.
Klingberg, T., Fernell, E., Olesen, P. J., Johnson, M., Gustafsson, P., Dahlstrom, K., et al. (2005). Computerized training of working memory in children with ADHD: A randomized, controlled trial. Journal of the American Academy of Child and Adolescent Psychiatry, 44 (2), 177–186.

Knight, R. T., Staines, W. R., Swick, D., & Chao, L. L. (1999). Prefrontal cortex regulates inhibition and excitation in distributed neural networks. Acta Psychologica (Amst), 101 (2-3), 159–178.
Koelsch, S., Schulze, K., Sammler, D., Fritz, T., Muller, K., & Gruber, O. (2009). Functional architecture of verbal and tonal working memory: an FMRI study. Human Brain Mapping, 30 (3), 859–873.
Koenigs, M., Acheson, D. J., Barbey, A. K., Solomon, J., Postle, B. R., & Grafman, J. (2011). Areas of left perisylvian cortex mediate auditory-verbal short-term memory. Neuropsychologia, 49 (13), 3612–3619.
Leff, A. P., Schofield, T. M., Crinion, J. T., Seghier, M. L., Grogan, A., Green, D. W., et al. (2009). The left superior temporal gyrus is a shared substrate for auditory short-term memory and speech comprehension: Evidence from 210 patients with stroke. Brain, 132 (Pt 12), 3401–3410.
Leung, H. C., Seelig, D., & Gore, J. C. (2004). The effect of memory load on cortical activity in the spatial working memory circuit. Cognitive, Affective, and Behavioral Neuroscience, 4 (4), 553–563.
Lewis-Peacock, J. A., Drysdale, A. T., Oberauer, K., & Postle, B. R. (2012). Neural evidence for a distinction between short-term memory and the focus of attention. Journal of Cognitive Neuroscience, 24 (1), 61–79.
Linden, D. E., Bittner, R. A., Muckli, L., Waltz, J. A., Kriegeskorte, N., Goebel, R., et al. (2003). Cortical capacity constraints for visual working memory: Dissociation of fMRI load effects in a fronto-parietal network. NeuroImage, 20 (3), 1518–1530.
Logie, R. H. (1995). Visuo-spatial working memory. Hove, UK: Erlbaum.
Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and conjunctions. Nature, 390 (6657), 279–281.
Luria, A. R., Sokolov, E. N., & Klinkows, M. (1967). Towards a neurodynamic analysis of memory disturbances with lesions of left temporal lobe. Neuropsychologia, 5 (1), 1.
Martin, R. C., & Freedman, M. L. (2001). Short-term retention of lexical-semantic representations: Implications for speech production. Memory, 9 (4), 261–280.
Martin, R. C., & He, T. (2004). Semantic short-term memory and its role in sentence processing: a replication. Brain and Language, 89 (1), 76–82.
McCarthy, G., Blamire, A. M., Puce, A., Nobre, A. C., Bloch, G., Hyder, F., et al. (1994). Functional magnetic resonance imaging of human prefrontal cortex activation during a spatial working memory task. Proceedings of the National Academy of Sciences U S A, 91 (18), 8690–8694.

McCarthy, G., Puce, A., Constable, R. T., Krystal, J. H., Gore, J. C., & Goldman-Rakic, P. (1996). Activation of human prefrontal cortex during spatial and nonspatial working memory tasks measured by functional MRI. Cerebral Cortex, 6 (4), 600–611.
McCarthy, R. A., & Warrington, E. K. (1987). The double dissociation of short-term memory for lists and sentences. Evidence from aphasia. Brain, 110 (Pt 6), 1545–1563.
McClelland, J. L., McNaughton, B. L., & O’Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102 (3), 419–457.
Mecklinger, A., Gruenewald, C., Besson, M., Magnie, M. N., & Von Cramon, D. Y. (2002). Separable neuronal circuitries for manipulable and non-manipulable objects in working memory. Cerebral Cortex, 12 (11), 1115–1123.
Miller, B. T., & D’Esposito, M. (2005). Searching for “the top” in top-down control. Neuron, 48 (4), 535–538.
Miller, E. K. (2000). The prefrontal cortex: No simple matter. NeuroImage, 11 (5 Pt 1), 447–450.
Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24, 167–202.
Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The unity and diversity of executive functions and their contributions to complex “frontal lobe” tasks: A latent variable analysis. Cognitive Psychology, 41 (1), 49–100.
Morrison, A. B., & Chein, J. M. (2011). Does working memory training work? The promise and challenges of enhancing cognition by training working memory. Psychonomic Bulletin and Review, 18 (1), 46–60.
Munk, M. H., Linden, D. E., Muckli, L., Lanfermann, H., Zanella, F. E., Singer, W., et al. (2002). Distributed cortical systems in visual short-term memory revealed by event-related functional magnetic resonance imaging. Cerebral Cortex, 12 (8), 866–876.
Munneke, J., Belopolsky, A. V., & Theeuwes, J. (2012). Shifting attention within memory representations involves early visual areas. PLoS One, 7 (4), e35528.
Niki, H. (1974). Differential activity of prefrontal units during right and left delayed response trials. Brain Research, 70 (2), 346–349.
Niki, H., & Watanabe, M. (1976). Prefrontal unit activity and delayed response: Relation to cue location versus direction of response. Brain Research, 105 (1), 79–88.
Nystrom, L. E., Braver, T. S., Sabb, F. W., Delgado, M. R., Noll, D. C., & Cohen, J. D. (2000). Working memory for letters, shapes, and locations: fMRI evidence against stimulus-based regional organization in human prefrontal cortex. NeuroImage, 11 (5 Pt 1), 424–446.
O’Reilly, R. C., & Norman, K. A. (2002). Hippocampal and neocortical contributions to memory: advances in the complementary learning systems framework. Trends in Cognitive Sciences, 6 (12), 505–510.
Oberauer, K., Schulze, R., Wilhelm, O., & Suss, H. M. (2005). Working memory and intelligence—their correlation and their relation: Comment on Ackerman, Beier, and Boyle (2005). Psychological Bulletin, 131 (1), 61–65; author reply, 72–65.
Olsen, R. K., Nichols, E. A., Chen, J., Hunt, J. F., Glover, G. H., Gabrieli, J. D., et al. (2009). Performance-related sustained and anticipatory activity in human medial temporal lobe during delayed match-to-sample. Journal of Neuroscience, 29 (38), 11880–11890.
Olson, I. R., Page, K., Moore, K. S., Chatterjee, A., & Verfaellie, M. (2006). Working memory for conjunctions relies on the medial temporal lobe. Journal of Neuroscience, 26 (17), 4596–4601.
Owen, A. M., Hampshire, A., Grahn, J. A., Stenton, R., Dajani, S., Burns, A. S., Howard, R. J., & Ballard, C. G. (2010). Putting brain training to the test. Nature, 465, 775–778.
Pashler, H. (1988). Familiarity and visual change detection. Perception and Psychophysics, 44 (4), 369–378.
Passingham, R. E. (1985). Memory of monkeys (Macaca mulatta) with lesions in prefrontal cortex. Behavioral Neuroscience, 99 (1), 3–21.
Paulesu, E., Frith, C. D., & Frackowiak, R. S. (1993). The neural correlates of the verbal component of working memory. Nature, 362 (6418), 342–345.
Petrides, M., Alivisatos, B., Evans, A. C., & Meyer, E. (1993). Dissociation of human mid-dorsolateral from posterior dorsolateral frontal cortex in memory processing. Proceedings of the National Academy of Sciences U S A, 90 (3), 873–877.

Poldrack, R. A., Wagner, A. D., Prull, M. W., Desmond, J. E., Glover, G. H., & Gabrieli, J. D. (1999). Functional specialization for semantic and phonological processing in the left inferior prefrontal cortex. NeuroImage, 10 (1), 15–35.
Posner, M. I., Petersen, S. E., Fox, P. T., & Raichle, M. E. (1988). Localization of cognitive operations in the human brain. Science, 240 (4859), 1627–1631.
Postle, B. R. (2006b). Working memory as an emergent property of the mind and brain. Neuroscience, 139 (1), 23–38.
Postle, B. R., Berger, J. S., & D’Esposito, M. (1999). Functional neuroanatomical double dissociation of mnemonic and executive control processes contributing to working memory performance. Proceedings of the National Academy of Sciences U S A, 96 (22), 12959–12964.
Postle, B. R., Berger, J. S., Taich, A. M., & D’Esposito, M. (2000). Activity in human frontal cortex associated with spatial working memory and saccadic behavior. Journal of Cognitive Neuroscience, 12 (Suppl 2), 2–14.
Postle, B. R., & D’Esposito, M. (1999). “What”-Then-“Where” in visual working memory: An event-related fMRI study. Journal of Cognitive Neuroscience, 11 (6), 585–597.
Postle, B. R., D’Esposito, M., & Corkin, S. (2005). Effects of verbal and nonverbal interference on spatial and object visual working memory. Memory and Cognition, 33 (2), 203–212.
Postle, B. R., Druzgal, T. J., & D’Esposito, M. (2003). Seeking the neural substrates of visual working memory storage. Cortex, 39 (4-5), 927–946.
Postman, L., & Phillips, L.-W. (1965). Short-term temporal changes in free recall. Quarterly Journal of Experimental Psychology, 17, 132–138.
Prabhakaran, V., Narayanan, K., Zhao, Z., & Gabrieli, J. D. (2000). Integration of diverse information in working memory within the frontal lobe. Nature Neuroscience, 3 (1), 85–90.
Quintana, J., Yajeya, J., & Fuster, J. M. (1988). Prefrontal representation of stimulus attributes during delay tasks. I. Unit activity in cross-temporal integration of sensory and sensory-motor information. Brain Research, 474 (2), 211–221.
Rama, P., Poremba, A., Sala, J. B., Yee, L., Malloy, M., Mishkin, M., et al. (2004). Dissociable functional cortical topographies for working memory maintenance of voice identity and location. Cerebral Cortex, 14 (7), 768–780.
Rama, P., Sala, J. B., Gillen, J. S., Pekar, J. J., & Courtney, S. M. (2001). Dissociation of the neural systems for working memory maintenance of verbal and nonspatial visual information. Cognitive, Affective, and Behavioral Neuroscience, 1 (2), 161–171.
Ranganath, C., Cohen, M. X., Dam, C., & D’Esposito, M. (2004). Inferior temporal, prefrontal, and hippocampal contributions to visual working memory maintenance and associative memory retrieval. Journal of Neuroscience, 24 (16), 3917–3925.
Ranganath, C., & D’Esposito, M. (2001). Medial temporal lobe activity associated with active maintenance of novel information. Neuron, 31 (5), 865–873.
Rauschecker, J. P., & Scott, S. K. (2009). Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing. Nature Neuroscience, 12 (6), 718–724.

Rauschecker, J. P., & Tian, B. (2000). Mechanisms and streams for processing of “what” and “where” in auditory cortex. Proceedings of the National Academy of Sciences U S A, 97 (22), 11800–11806.
Raye, C. L., Johnson, M. K., Mitchell, K. J., Reeder, J. A., & Greene, E. J. (2002). Neuroimaging a single thought: Dorsolateral PFC activity associated with refreshing just-activated information. NeuroImage, 15 (2), 447–453.
Risberg, J., & Ingvar, D. H. (1973). Patterns of activation in the grey matter of the dominant hemisphere during memorizing and reasoning: A study of regional cerebral blood flow changes during psychological testing in a group of neurologically normal patients. Brain, 96 (4), 737–756.
Robbins, T. W. (2007). Shifting and stopping: Fronto-striatal substrates, neurochemical modulation and clinical implications. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 362 (1481), 917–932.
Romanski, L. M. (2004). Domain specificity in the primate prefrontal cortex. Cognitive, Affective, and Behavioral Neuroscience, 4 (4), 421–429.
Romanski, L. M., Tian, B., Fritz, J., Mishkin, M., Goldman-Rakic, P. S., & Rauschecker, J. P. (1999). Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex. Nature Neuroscience, 2 (12), 1131–1136.
Rose, N. S., Olsen, R. K., Craik, F. I., & Rosenbaum, R. S. (2012). Working memory and amnesia: the role of stimulus novelty. Neuropsychologia, 50 (1), 11–18.
Rowe, J. B., Toni, I., Josephs, O., Frackowiak, R. S., & Passingham, R. E. (2000). The prefrontal cortex: response selection or maintenance within working memory? Science, 288 (5471), 1656–1660.
Sala, J. B., Rama, P., & Courtney, S. M. (2003). Functional topography of a distributed neural system for spatial and nonspatial information maintenance in working memory. Neuropsychologia, 41 (3), 341–356.
Salame, P., & Baddeley, A. D. (1982). Disruption of short-term memory by unattended speech: Implications for the structure of working memory. Journal of Verbal Learning and Verbal Behavior, 21, 150–164.
Salmon, E., Van der Linden, M., Collette, F., Delfiore, G., Maquet, P., Degueldre, C., et al. (1996). Regional brain activity during working memory tasks. Brain, 119 (Pt 5), 1617–1625.
Schumacher, E. H., Lauber, E., Awh, E., Jonides, J., Smith, E. E., & Koeppe, R. A. (1996). PET evidence for an amodal verbal working memory system. NeuroImage, 3 (2), 79–88.
Scoville, W. B., & Milner, B. (1957). Loss of recent memory after bilateral hippocampal lesions. Journal of Neurology, Neurosurgery, and Psychiatry, 20 (1), 11–21.

Working Memory Shallice, T. (1982). Specific impairments of planning. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 298 (1089), 199–209. Shallice, T., & Butterworth, B. (1977). Short-term memory impairment and spontaneous speech. Neuropsychologia, 15 (6), 729–735. Shallice, T., & Vallar, G. (1990). The impairment of auditory-verbal short-term storage. In G. Vallar & T. Shallice (Eds.), Neuropsychological impairments of short-term memory (pp. 11–53). Cambridge, UK: Cambridge University Press. Shallice, T., & Warrington, E. K. (1970). Independent functioning of verbal memory stores: A neuropsychological study. Quarterly Journal of Experimental Psychology, 22 (2), 261–273. Shallice, T., & Warrington, E. K. (1977). Auditory-verbal short-term-memory impairment and conduction aphasia. Brain and Language, 4 (4), 479–491. Shimamura, A. P. (2000). Toward a cognitive neuroscience of metacognition. Con sciousness and Cognition, 9 (2 Pt 1), 313–323; discussion 324–316. (p. 415)

Shipstead, Z., Redick, T. S., & Engle, R. W. (2012). Is working memory training effective? Psychological Bulletin, 138 (4). Smith, E. E., & Jonides, J. (1999). Storage and executive processes in the frontal lobes. Science, 283 (5408), 1657–1661. Smith, E. E., Jonides, J., Koeppe, R. A., Awh, E., Schumacher, E. H., & Minoshima, S. (1995). Spatial versus object working-memory—PET investigations. Journal of Cognitive Neuroscience, 7 (3), 337–356. Smith, E. E., Jonides, J., Marshuetz, C., & Koeppe, R. A. (1998). Components of verbal working memory: evidence from neuroimaging. Proceedings of the National Academy of Sciences U S A, 95 (3), 876–882. Srimal, R., & Curtis, C. E. (2008). Persistent neural activity during the maintenance of spatial position in working memory. NeuroImage, 39 (1), 455–468. Stevens, A. A. (2004). Dissociating the cortical basis of memory for voices, words and tones. Brain Research. Cognitive Brain Research, 18 (2), 162–171. Stuss, D. T., & Alexander, M. P. (2007). Is there a dysexecutive syndrome? Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 362 (1481), 901–915. Thompson-Schill, S. L., D’Esposito, M., Aguirre, G. K., & Farah, M. J. (1997). Role of left inferior prefrontal cortex in retrieval of semantic knowledge: A reevaluation. Proceedings of the National Academy of Sciences U S A, 94 (26), 14792–14797.

Page 46 of 48

Working Memory Thompson-Schill, S. L., Jonides, J., Marshuetz, C., Smith, E. E., D’Esposito, M., Kan, I. P., et al. (2002). Effects of frontal lobe damage on interference effects in working memory. Cognitive, Affective, and Behavioral Neuroscience, 2 (2), 109–120. Tian, B., Reser, D., Durham, A., Kustov, A., & Rauschecker, J. P. (2001). Functional special ization in rhesus monkey auditory cortex. Science, 292 (5515), 290–293. Todd, J. J., & Marois, R. (2004). Capacity limit of visual short-term memory in human pos terior parietal cortex. Nature, 428 (6984), 751–754. Tomita, H., Ohbayashi, M., Nakahara, K., Hasegawa, I., & Miyashita, Y. (1999). Top-down signal from prefrontal cortex in executive control of memory retrieval. Nature, 401 (6754), 699–703. Ungerleider, L., & Mishkin, M. (1982). Two cortical visual systems. In D. Ingle, M. A. Goodale & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, M A: MIT Press. Vallar, G., & Baddeley, A. (1984). Fractionation of working memory: Neuropsychological Evidence for a phonological short-term store. Journal of Verbal Learning and Verbal Be havior, 23, 151–161. Vogel, E. K., & Machizawa, M. G. (2004). Neural activity predicts individual differences in visual working memory capacity. Nature, 428 (6984), 748–751. Vogel, E. K., McCollough, A. W., & Machizawa, M. G. (2005). Neural measures reveal indi vidual differences in controlling access to working memory. Nature, 438 (7067), 500–503. Wager, T. D., & Smith, E. E. (2003). Neuroimaging studies of working memory: A metaanalysis. Cognitive, Affective, and Behavioral Neuroscience, 3 (4), 255–274. Wagner, A. D., Pare-Blagoev, E. J., Clark, J., & Poldrack, R. A. (2001). Recovering meaning: Left prefrontal cortex guides controlled semantic retrieval. Neuron, 31 (2), 329–338. Walter, H., Wunderlich, A. P., Blankenhorn, M., Schafer, S., Tomczak, R., Spitzer, M., et al. (2003). 
No hypofrontality, but absence of prefrontal lateralization comparing verbal and spatial working memory in schizophrenia. Schizophrenic Research, 61 (2-3), 175–184. Warrington, E., & Shallice, T. (1969). Selective impairment of auditory verbal short-term memory. Brain, 92, 885–886. Waugh, N. C., & Norman, D. A. (1965). Primary memory. Psychological Review, 72, 89– 104. Wilson, F. A., Scalaidhe, S. P., & Goldman-Rakic, P. S. (1993). Dissociation of object and spatial processing domains in primate prefrontal cortex. Science, 260 (5116), 1955–1958. Xu, Y., & Chun, M. M. (2006). Dissociable neural mechanisms supporting visual shortterm memory for objects. Nature, 440 (7080), 91–95. Page 47 of 48

Working Memory Yi, D. J., Woodman, G. F., Widders, D., Marois, R., & Chun, M. M. (2004). Neural fate of ig nored stimuli: Dissociable effects of perceptual and working memory load. Nature Neuro science, 7 (9), 992–996. Zarahn, E., Aguirre, G., & D’Esposito, M. (1997). A trial-based experimental design for fMRI. NeuroImage, 6 (2), 122–138. Zurowski, B., Gostomzyk, J., Gron, G., Weller, R., Schirrmeister, H., Neumeier, B., et al. (2002). Dissociating a common working memory network from different neural substrates of phonological and spatial stimulus processing. NeuroImage, 15 (1), 45–57.

Bradley R. Buchsbaum

Bradley R. Buchsbaum, Rotman Research Institute, Baycrest Centre, Toronto, Ontario, Canada

Mark D’Esposito

Mark D’Esposito is Professor of Neuroscience and Psychology, and Director of the Henry H. Wheeler, Jr. Brain Imaging Center at the Helen Wills Neuroscience Institute at the University of California, Berkeley. He is also Director of the Neurorehabilitation Unit at the Northern California VA Health Care System and Adjunct Professor of Neurology at UCSF.


Motor Skill Learning

Rachael Seidler, Bryan L. Benson, Nathaniel B. Boyden, and Youngbin Kwak
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0020

Abstract and Keywords

Early neuroscience experiments of skill learning focused predominantly on motor cortical plasticity. More recently, experiments have shown that cognitive processes such as working memory and error detection are engaged in the service of motor skill learning, particularly early in the learning process. This engagement early in learning maps onto prefrontal and striatal brain regions. As learning progresses, skill performance becomes automated and is no longer associated with prefrontal cortical recruitment. “Choking under pressure” may involve a return to early learning cognitive control mechanisms, resulting in disruption of highly learned skills. Conversely, analogy learning may speed skill acquisition by allowing learners to bypass the more cognitively demanding early stages. Questions for the future include the relative involvement of and interaction between implicit and explicit memory systems for different types of skill learning, and the impact of experiential and genetic individual differences on learning success.

Keywords: skill learning, working memory, error, implicit, explicit

Introduction

Do you aspire to take up the piano? To improve your tennis or golf game? Or, perhaps you would like to maintain your motor abilities in the face of injury, growth, advancing age, or disease? Skill learning underlies our capacity for such adaptive motor behaviors. Research over the past 20 years has provided great insight into the dynamic neural processes underlying human motor skill acquisition, focusing primarily on brain networks that are engaged during early versus late stages of learning. What has been challenging for the field is to tightly link these shifting neural processes with what is known about measurable behavioral changes and strategic processes that occur during learning. Because individuals learn at different rates and often adopt different strategies, it is difficult to characterize the dynamics of evolving motor behaviors. Moreover, skill learning is often implicit, so verbal reports about the learning process are imprecise and unreliable. Here, we review our current understanding of skill learning from a cognitive neuroscience perspective, with a particular emphasis on linking the cognitive (i.e., which strategies and behavioral processes are relied on for skill learning) with the neuroscience (i.e., which neural networks underlie these processes, and where motor memories are stored in the brain).

Researchers in different disciplines have employed varying approaches to these two topics with relatively little crosstalk. For example, those working from the behavioral approach have been focused on questions such as whether implicit or explicit memory systems are engaged during early and late learning; those working from the computational neuroscience approach have modeled fast and slow processes of learning; and those working from the cognitive neuroscience approach have identified large-scale shifts in brain networks that are engaged during early versus late learning. Here, we endeavor to bring together these predominantly autonomous schools of thought.

Given the constraints of this chapter, we focus on learning from overt practice and do not address other aspects of the field such as sleep-dependent consolidation of learning, transfer of learning, mental practice, and learning by observing others. These topics have been recently reviewed elsewhere (Garrison, Winstein, & Aziz-Zadeh, 2010; Krakauer & Shadmehr, 2006; Robertson, Pascual-Leone, & Miall, 2004; Seidler, 2010).

Researchers studying skill acquisition have classified learning into at least two broad categories: sensorimotor adaptation and sequence learning (Doyon & Benali, 2005; Willingham, 1998). In sensorimotor adaptation paradigms, participants modify movements to adjust to changes in either sensory input or motor output characteristics. A real-world example is learning to drive a new car: The magnitude of vehicle movement in response to the amount of wheel turn and accelerator depression varies across vehicles.
Thus, the driver must learn the new mapping between his or her actions and the resulting vehicle movements. Another example of sensorimotor adaptation is learning how the size and speed of hand movements of a mouse correspond to cursor movements on a computer display screen. The study of motor performance under transformed spatial mappings spans over 100 years. Helmholtz (1867) originally used prisms to invert the visual world, whereas more recent investigations make use of computer displays to alter visual feedback of movement (Cunningham, 1989; Ghilardi, Gordon, & Ghez, 1995; Krakauer, Pine, Ghilardi, & Ghez, 2000; Seidler, Noll, & Chintalapati, 2006). These studies have demonstrated that sensorimotor adaptation occurs when movements are actively made in the new environment.

Motor sequence learning refers to the progressive association between isolated elements of movement, eventually allowing for a multi-element sequence to be performed quickly. A real-world example is a gymnast learning a new tumbling routine. The gymnast already knows how to perform the individual skills in isolation, but must practice to combine them with seamless transitions.
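To make the sensorimotor adaptation paradigms described above more concrete, the following is a minimal sketch of how a computer display might transform hand movement into cursor movement. The function names and the `gain` and `rotation_deg` parameters are illustrative assumptions, not values or procedures taken from any of the studies cited here.

```python
import math


def hand_to_cursor(hand_xy, gain=1.0, rotation_deg=0.0):
    """Map a hand displacement (x, y) to a cursor displacement.

    `gain` scales movement size and `rotation_deg` rotates its direction.
    Both are hypothetical illustration parameters.
    """
    x, y = hand_xy
    theta = math.radians(rotation_deg)
    cursor_x = gain * (x * math.cos(theta) - y * math.sin(theta))
    cursor_y = gain * (x * math.sin(theta) + y * math.cos(theta))
    return cursor_x, cursor_y


def angular_error_deg(target_xy, cursor_xy):
    """Signed angle in degrees from the target direction to the cursor direction."""
    target_angle = math.atan2(target_xy[1], target_xy[0])
    cursor_angle = math.atan2(cursor_xy[1], cursor_xy[0])
    diff = math.degrees(cursor_angle - target_angle)
    # Wrap into the range [-180, 180).
    return (diff + 180.0) % 360.0 - 180.0


# A naive reach straight at the target under an assumed 30-degree rotation
# misses by roughly the size of the rotation.
target = (1.0, 0.0)
naive_error = angular_error_deg(target, hand_to_cursor(target, rotation_deg=30.0))

# A fully adapted reach aims the hand 30 degrees the other way,
# which brings the cursor back onto the target.
aim = (math.cos(math.radians(-30.0)), math.sin(math.radians(-30.0)))
adapted_error = angular_error_deg(target, hand_to_cursor(aim, rotation_deg=30.0))
```

In this toy setting, adaptation amounts to learning the inverse of the imposed mapping, which is one way to read the statement that the learner must acquire the new relationship between actions and their visual consequences.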

Behavioral Models of Skill Learning

Acquiring a skill requires practice and repetition. The father of modern psychology, William James, understood this when he wrote: “We must make automatic and habitual, as early as possible, as many useful actions as we can” (James, 1890). Even well before James, the ancient Greeks knew that repetitive action brought about physiological and behavioral changes in an organism. In his treatise, On the Motion of Animals (1908/330 BC), Aristotle noted that “the body must straightaway be moved and changed with the changes that nature makes dependent upon one another.”

It is now more than 2,000 years since Aristotle first philosophized on the changes associated with the movement of organisms. How much, or how little, have recent advances in cognitive neuroscience changed the way we think about skill learning? What theories and ideas do we still hold dear and which have we discarded? This section of the chapter will highlight a few influential models of the processes underlying skill learning that have been offered in the past 50 years or so.

Fitts’ and Posner’s Stage Model

In 1967, Paul Fitts and Michael Posner proposed an influential model for understanding motor skill acquisition (Fitts & Posner, 1967). In their model, the learning of a movement progresses through three interrelated phases: the cognitive phase, the associative phase, and the autonomous phase. One utility of this model is that it applies well to both motor and cognitive skill acquisition.

Figure 20.1 The shift from early to late learning processes that occurs as a function of skill acquisition. Choking under pressure has been linked to a regression back to the cognitively demanding early stages of learning, whereas analogy learning is thought to accelerate performance improvements through bypassing this stage.

In the cognitive stage, learners must use their attentional resources to break down a desired skill into discrete components. This involves creating a mental picture of the skill, which helps to facilitate an understanding of how these parts come together to form correct execution of the desired movement. Performance at this stage might be defined as conscious incompetence; individuals are cognizant of the various components of the task, yet cannot perform them efficiently and effectively.

The next phase of the model, the associative phase, requires repeated practice and the use of feedback to link the component parts into a smooth action. The ability to distinguish important from unimportant stimuli is central to this stage of the model. For example, if a baseball player is learning to detect and hit a curveball, attention should be paid to the most telling stimuli, such as the angle of the wrist on the pitcher’s throwing hand, not the height of the pitcher’s leg-kick.

The last stage of the model, the autonomous stage, involves development of the learned skill so that it becomes habitual and automatic. Individuals at this stage rely on processes that require little or no conscious attention. Performance may be defined as unconscious competence, and is reliant on experience and stored knowledge easily accessible for the execution of the motor skill (Figure 20.1). As will be seen throughout this chapter, this model has had an enduring impact on the field of skill learning.

Closed-Loop and Open-Loop Control

Another influential theory from the information processing perspective is Adams’ (1971) closed-loop theory of skill acquisition. In this theory, two independent memory representations, a memory trace and a perceptual trace, serve to facilitate the learning of self-paced pointing movements. Selection and initiation of movement are the responsibility of the memory trace, whose strength is seen as a function of the stimulus–response contiguity. With practice the strength of the memory trace is increased (Newell, 1991). The perceptual trace relies on feedback, both internal and external, to create an image of the correctness of the desired skill. Motor learning then takes place through the process of detecting errors and inconsistencies between the perceptual trace and memory trace (Adams, 1971). Repetition, feedback, and refinement serve to produce a set of associated sensory representations and movement units that can be called on, depending on the requisite skill. Interestingly, both internal and external feedback loops are incorporated into more modern skill acquisition theories as well.

Although Adams’ theory led to numerous empirical examinations of two-state memory representations of motor learning, there are theoretical problems associated with the closed-loop theory. One such problem concerns novel movements. If movement units are stored with a memory trace and a perceptual trace in a one-to-one matching, then how does a learner generate new movements? Schmidt’s (1975) schema theory sought to deal with this problem by allowing for recognition mechanisms of movement to be generalized and mapped in a one-to-many fashion. Schema theory proposes that learning motor skills results in the construction of “generalized motor programs” dependent on the relationship between variables rather than the absolute instantiations of the variables themselves (Newell, 1991). Thus, representations of individual actions themselves are not stored. Rather, abstract relationships or rules for motor programs are stored and called on through associative stimuli and habit strengths and can be executed without delay (Schmidt, 1975).
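The contrast between storing individual movements and storing a rule can be illustrated with a toy example. The sketch below assumes, purely for illustration, that a schema can be approximated by a straight-line relationship between one movement parameter (say, force) and one outcome (say, distance); the function names and the linear form are hypothetical and are not part of Schmidt's formulation.

```python
def fit_schema(params, outcomes):
    """Fit a least-squares line: outcome = slope * param + intercept.

    The fitted (slope, intercept) pair stands in for an abstract rule;
    the individual practice attempts are not retained afterwards.
    """
    n = len(params)
    mean_p = sum(params) / n
    mean_o = sum(outcomes) / n
    sxx = sum((p - mean_p) ** 2 for p in params)
    sxy = sum((p - mean_p) * (o - mean_o) for p, o in zip(params, outcomes))
    slope = sxy / sxx
    intercept = mean_o - slope * mean_p
    return slope, intercept


def select_param(schema, desired_outcome):
    """Use the stored rule to choose a parameter for a new desired outcome."""
    slope, intercept = schema
    return (desired_outcome - intercept) / slope


# Hypothetical practice history: parameter values and the outcomes they produced.
schema = fit_schema([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])

# The rule yields a parameter for an outcome that was never practiced.
novel_param = select_param(schema, 10.0)
```

Because only the relationship is kept, the same rule can generate a movement for an outcome outside the practiced set, which is the one-to-many property that a one-to-one store of traces lacks.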


Bayesian Models of Skill Learning

Many theories of skill learning describe processes of optimization. The oldest of these theories is optimization of movement cost; that is, as learning progresses, performance of the skill becomes not only more automatic, but also less effortful. For example, in the case of runners, refinement and optimization of running form allow athletes to run at the same speed with less effort, and consequently to run faster at maximal effort (Conley & Krahenbuhl, 1980; Jones, 1998). In the case of Olympic weightlifters, learning complex techniques allows them to lift heavier weights with less effort (Enoka, 1988).

This idea fits well with the theory of evolution: strategies that minimize energy consumption should be favored. As a consequence, early studies of motor control in the framework of optimization focused on the metabolic cost of movements, particularly gait (Atzler & Herbst, 1927; Ralston, 1958). These studies demonstrated good accordance between the metabolically optimal walking speed and preferred walking speed (Holt, Hamill, & Andres, 1991).

However, movements are not always optimized for lowest metabolic cost, even after extensive practice (Nelson, 1983). One possible reason for this is that there are simply too many degrees of freedom for the motor system to sample all possibilities. Furthermore, our environment is highly variable, our information about it is too limited, and our movements and their associated feedback are too noisy.

In an attempt to explain how individuals adapt despite such variability, researchers have suggested that the motor system relies on a Bayesian model to maximize the likelihood of desired outcomes (Geisler, 1989; Gepshtein, Seydell, & Trommershauser, 2007; Körding & Wolpert, 2004; Maloney & Mamassian, 2009; Seydell, McCann, Trommershauser, & Knill, 2008; Trommershäuser, Maloney, & Landy, 2008).
Such approaches predict that movement accuracy is stressed early in learning, which quickly eliminates many inefficient movement patterns. A model of Bayesian transfer, in which parameters from one task are used to more rapidly learn a second, novel task, predicts that learning complex movements is achieved most efficiently by first learning component, simple movements (Maloney & Mamassian, 2009).

This may seem like common sense: beginning piano students don’t start right away playing concertos, but rather start with single notes, scales, and chords, and progress from there. In the beginning, basic commands, known as motor primitives, are combined to produce a simple movement such as a key press or a pen stroke (Nishimoto & Tani, 2009; Paine & Tani, 2004; Polyakov, Drori, Ben-Shaul, Abeles, & Flash, 2009; Thoroughman & Shadmehr, 2000). By learning simple movements first, the brain is able to simultaneously form a model of the environment and its uncertainty, which transfers to other tasks (McNitt-Gray, Requejo, & Flashner, 2006; Seydell et al., 2008).

The advent and widespread use of neuroimaging techniques have opened the door for integration of these and other psychological theories with evolving views of brain function. Questions that have dominated the field of neuromotor control, and which continue to take center stage today, include: What are the neural bases of the stages of learning? How does their identification inform our knowledge regarding the underlying processes of skill learning? Where in the brain are motor memories formed and stored? How generalizable are these representations?
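As a rough illustration of the Bayesian account described earlier in this section, the sketch below combines prior knowledge about a task with a single noisy sensory observation. It assumes that both the prior and the sensory noise are Gaussian, which is one common simplification and not necessarily the model used in any particular study cited above; the function name and all numerical values are hypothetical.

```python
def gaussian_posterior(prior_mean, prior_var, obs, obs_var):
    """Combine a Gaussian prior with one Gaussian-noise observation.

    Returns the posterior mean and variance. The weight given to the
    observation grows as the prior becomes less certain relative to
    the sensory noise.
    """
    weight_on_obs = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + weight_on_obs * (obs - prior_mean)
    post_var = prior_var * obs_var / (prior_var + obs_var)
    return post_mean, post_var


# Equally reliable prior and observation: the estimate lands halfway between them.
balanced = gaussian_posterior(prior_mean=1.0, prior_var=1.0, obs=2.0, obs_var=1.0)

# Much noisier feedback: the estimate stays close to the prior.
noisy = gaussian_posterior(prior_mean=1.0, prior_var=1.0, obs=2.0, obs_var=9.0)
```

The posterior variance is always smaller than either input variance, which captures the idea that combining stored knowledge with noisy feedback yields a more reliable estimate than either source alone.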

Cognitive Neuroscience of Skill Learning

The Beginning

Early neuroimaging studies often investigated motor tasks because they allow for simple recording and tracking of overt behavioral responses (Grafton, Mazziotta, Woods, & Phelps, 1992; Kim et al., 1993; Pascual-Leone, Brasil-Neto, Valls-Sole, Cohen, & Hallett, 1992; Pascual-Leone, Valls-Sole, et al., 1992). Additionally, the discovery that motor representations were capable of exhibiting experience-dependent change even in the adult brain was greeted with much interest and excitement. These earlier experiments identified prominent roles for the motor cortex, cerebellum, and striatum in skill learning, complementing the results of earlier experiments conducted with neurological patients (Pascual-Leone et al., 1993; Weiner, Hallett, & Funkenstein, 1983). Many of the early neuroimaging experiments of skill learning focused specifically on the motor cortex, identifying an initial expansion of functional movement representations in primary motor cortex, followed by retraction of these representations as movement automatization progressed (Doyon, Owen, Petrides, Sziklas, & Evans, 1996; Grafton, Mazziotta, Presty, et al., 1992; Grafton, Woods, Mazziotta, & Phelps, 1991; Jueptner, Frith, Brooks, Frackowiak, & Passingham, 1997; Jueptner, Stephan, et al., 1997; Karni et al., 1995, 1998; Pascual-Leone, Grafman, & Hallett, 1994; Pascual-Leone & Torres, 1993).

It seems logical to begin studying motor learning processes by focusing on the motor cortex. However, these and other subsequent studies also reported extensive activation of “nonmotor” brain networks during skill learning, including engagement of dorsolateral prefrontal cortex, parietal cortex, anterior cingulate cortex, and associative regions of the striatum (cf. Doyon & Benali, 2005).
This work is extensively reviewed in the current chapter, with a particular emphasis on the processes and functions of both nonmotor and motor networks that are engaged across the time course of learning.

Recent neuroimaging studies have tracked the time course of involvement of these additional brain structures. Typically, frontal-parietal systems are engaged early in learning with a shift to activation in more “basic” motor cortical and subcortical structures later in learning. This is taken as evidence that early learning is a more cognitively controlled process, whereas later learning is more automatic. This view is supported by studies showing dual-task interference during early learning (Eversheim & Bock, 2001; Taylor & Thoroughman, 2007, 2008) and those reporting correlations between cognitive capacity and the rate of early learning (Anguera, Reuter-Lorenz, Willingham, & Seidler, 2010; Bo, Borza, & Seidler, 2009; Bo & Seidler, 2009). In the subsequent sections, we outline evidence and provide reasoned speculation regarding precisely which cognitive mechanisms are engaged during early skill learning.

Early Learning Processes

In this section, we describe what is known about the neurocognitive processes of skill learning, emphasizing the early and late stages of learning. Let’s consider an example of a novice soccer player learning to make a shot on goal. Initially, she may be rehearsing her coach’s instructions in working memory (see later section, Role of Working Memory in Skill Learning), reminding herself to keep her toe down and her ankle locked, and to hit the ball with her instep. When her first shot goes wide, she will engage error detection and correction mechanisms (see next section, Error Detection and Correction) to adjust her aim for the next attempt, relying again on working memory to adjust motor commands based on recent performance history. It seems intuitive, both from this example and our own experiences, that these early learning processes are cognitively demanding.

As our budding soccer player progresses in skill, she can become less inwardly focused on the mechanics of her own shot and start learning about more tactical aspects of the sport. At this point, her motor performance has become fluid and automatized (see later section, Late Learning Processes). When learning a new sport or skill, this transition from “early” to “late” processing occurs not just once, but rather at multiple levels of competency as the learner progresses through more and more complex aspects of her or his domain of expertise (Schack & Mechsner, 2006; Wolpert & Flanagan, 2010). These stages of learning are likely not dissociated by discrete transitions, but rather overlap in time.

Error Detection and Correction

Learning from errors is one of the basic principles of motor skill acquisition. Current ideas about error-based learning stem from forward model control theories (Diedrichsen, Shadmehr, & Ivry, 2009; Kawato, 1999; Shadmehr, Smith, & Krakauer, 2010; Wolpert & Miall, 1996). When movement errors are detected by sensory systems, the information is used to update the motor commands for subsequent actions. However, relying solely on sensory feedback does not allow for efficient motor adjustments because of the time delay between the initial motor command and the arrival of sensory feedback. Movement induces continuous changes to state variables such as limb position and velocity. To allow for accurate movement adjustments, the motor system relies on a forward model that makes predictions of the sensory outcomes (i.e., changes in position and velocity) associated with a given motor command (Bastian, 2006; Flanagan, Vetter, Johansson, & Wolpert, 2003). Differences between the predicted and actual sensory outcome serve as the feedback error signal that updates forthcoming motor commands.

When learning a new motor skill such as moving a cursor on a computer screen, prediction error becomes critical: New skills do not have enough of a motor history for an accurate forward model, resulting in large prediction errors. In this case, the process of learning involves updating motor commands through multiple exposures to motor errors and gradually reducing them by refining the forward model (Donchin, Francis, & Shadmehr, 2003; Shadmehr et al., 2010).

The mechanisms of error-based learning are often studied using visual-motor adaptation and force field adaptation tasks. Visual-motor adaptation involves distortion of the visual consequences of movement, whereas force field adaptation affects the proprioceptive consequences of motor commands by altering the dynamics of movement (Lalazar & Vaadia, 2008; Shadmehr et al., 2010). Error processing under these two paradigms shows extensive neural overlap in the cerebellum, suggesting a common mechanism for error processing and learning (Diedrichsen, Hashambhoy, Rane, & Shadmehr, 2005). Error processing that contributes to learning is distinct from online movement corrections that happen within a trial (Diedrichsen, Hashambhoy, et al., 2005; Gomi, 2008). Rather, learning is reflected in corrections made from one trial to the next, reflecting learning or updating of motor representations.

Evidence suggests that the cerebellum provides the core mechanism underlying error-based learning (Criscimagna-Hemminger, Bastian, & Shadmehr, 2010; Diedrichsen, Verstynen, Lehman, & Ivry, 2005; Ito, 2002; Miall, Christensen, Cain, & Stanley, 2007; Miall, Weir, Wolpert, & Stein, 1993; Ramnani, 2006; Tseng, Diedrichsen, Krakauer, Shadmehr, & Bastian, 2007; Wolpert & Miall, 1996). Studies with cerebellar patients demonstrate that, although these patients are able to make online motor adjustments, their performance across trials does not improve (Bastian, 2006; Maschke, Gomez, Ebner, & Konczak, 2004; Morton & Bastian, 2006; Smith & Shadmehr, 2005; Tseng et al., 2007). Neuroimaging studies also provide evidence for cerebellar contributions to error-based motor skill learning (Diedrichsen, Verstynen, et al., 2005; Imamizu, Kuroda, Miyauchi, Yoshioka, & Kawato, 2003; Imamizu, Kuroda, Yoshioka, & Kawato, 2004; Imamizu et al., 2000).
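The trial-to-trial reduction of prediction error described above can be sketched with a single-state update rule. This is a deliberately simplified stand-in for the forward-model accounts cited in this section, not a reproduction of any specific published model; the `retention` and `learning_rate` parameters and their values are assumptions chosen only for illustration.

```python
def simulate_adaptation(perturbation, n_trials, retention=1.0, learning_rate=0.2):
    """Simulate across-trial error reduction under a constant perturbation.

    On each trial the learner's internal estimate of the perturbation is
    compared with the actual perturbation; a fraction of the resulting
    error (learning_rate) updates the estimate used on the next trial.
    Returns the list of errors experienced, one per trial.
    """
    estimate = 0.0
    errors = []
    for _ in range(n_trials):
        error = perturbation - estimate  # prediction error on this trial
        errors.append(error)
        # Update applied between trials, not within the movement itself.
        estimate = retention * estimate + learning_rate * error
    return errors


# Hypothetical 30-unit perturbation (e.g., degrees of visual rotation).
errors = simulate_adaptation(perturbation=30.0, n_trials=20)

# With imperfect retention the error levels off above zero instead of vanishing.
leaky_errors = simulate_adaptation(perturbation=30.0, n_trials=200, retention=0.9)
```

Note that the update happens only between trials, in keeping with the distinction drawn above between within-trial online correction and across-trial learning.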
It should be noted that correcting errors within a trial does not seem to be a prerequisite for learning, which is reflected as correcting errors from one trial to the next, or across-trial corrections. This is evidenced by experiments showing that learning occurs even when participants have insufficient time for making corrections. Thus it seems that just experiencing or detecting an error is sufficient to stimulate motor learning.

Neuroimaging studies provide evidence that brain regions other than the cerebellum may also play a role in error-dependent learning, such as the parietal cortex, striatum, and anterior cingulate cortex (Chapman et al., 2010; Clower et al., 1996; Danckert, Ferber, & Goodale, 2008; den Ouden, Daunizeau, Roiser, Friston, & Stephan, 2010). Studies also show that cerebellar patients can learn from errors when a perturbation is introduced gradually, resulting in small errors (Criscimagna-Hemminger et al., 2010). In combination these studies suggest that not all error-based learning relies on cerebellar networks.

Recent studies demonstrate a contribution of the anterior cingulate cortex (ACC) error processing system to motor learning (Anguera, Reuter-Lorenz, Willingham, & Seidler, 2009; Anguera, Seidler, & Gehring, 2009; Danckert et al., 2008; Ferdinand, Mecklinger, & Kray, 2008; Krigolson & Holroyd, 2006, 2007a, 2007b; Krigolson, Holroyd, Van Gyn, & Heath, 2008). This prefrontal performance monitoring system has been studied extensively by recording the error-related negativity (ERN), an event-related potential (ERP) component that is locked to an erroneous response (Falkenstein, Hohnsbein, & Hoormann, 1995; Gehring, Coles, Meyer, & Donchin, 1995; Gehring, Goss, Coles, Meyer, & Donchin, 1993). The ERN has attracted a great deal of interest, both within the ERP research community and in cognitive neuroscience more generally. Much of this interest arose because of evidence that the ERN is generated in the ACC, which is known to serve cognitive control functions that enable the brain to adapt behavior to changing task demands and environmental circumstances (Botvinick, Braver, Barch, Carter, & Cohen, 2001; Ridderinkhof, Ullsperger, Crone, & Nieuwenhuis, 2004). The presupplementary motor area (pre-SMA), a region anatomically in the vicinity of the ACC, has also been proposed to play a role in error processing (Hikosaka & Isoda, 2010; Isoda & Hikosaka, 2007). As opposed to the ACC, however, it is thought that the pre-SMA corrects for movement errors in a proactive manner (Isoda & Hikosaka, 2007).

In a series of studies by Krigolson and colleagues, an ERN from the medial frontal region was found to be associated with motor tracking errors (Krigolson & Holroyd, 2006, 2007a, 2007b; Krigolson et al., 2008). Interestingly, the onset of the ERN occurred before the tracking error, indicating that the medial frontal system began to detect the error even before the error was fully committed (Krigolson & Holroyd, 2006). The authors suggested that this might entail the medial frontal system predicting tracking errors by adopting a predictive mode of control (Desmurget, Vindras, Grea, Viviani, & Grafton, 2000). They also found that target errors which allow online movement correction within a trial did not elicit the medial frontal ERN, but rather resulted in a negative deflection in the posterior parietal region (Krigolson & Holroyd, 2007a; Krigolson et al., 2008).
These results indicate that the contribution of the medial frontal ERN to motor error processing is a distinct process potentially involving prediction error calibration (Krigolson & Hol royd, 2007a; Krigolson et al., 2008). We recently tested whether the ERN was sensitive to the magnitude of error experienced during visual-motor adaptation and found a larger ERN magnitude on trials in which larg er errors were made (Anguera, Seidler, et al., 2009). ERN magnitude also decreased from the early to the late stages of learning. These results are in agreement with current theo ries of ERN and skill acquisition. For example, as the error detection theory proposes (Falkenstein, Hohnsbein, Hoormann, & Blanke, 1991; Gehring et al., 1993), a greater ERN associated with larger errors indicates that the brain was monitoring the disparity between the predicted and actual movement outcomes (Anguera, Seidler, et al., 2009). There is also evidence supporting that the error-based learning represented in the ACC contributes to motor sequence learning (Berns, Cohen, & Mintun, 1997). Several studies have shown that the N200 ERP component, which is known to be sensitive to a mismatch between the expected and actual sensory stimuli, is enhanced for a stimulus that violates a learned motor sequence (Eimer, Goschke, Schlaghecken, & Sturmer, 1996). When ERN magnitudes were compared between explicit (p. 422) and implicit learners, a larger ERN was found for the explicit learners, demonstrating greater involvement of the error moni toring system when actively searching for the regularity of a sequence (Russeler, Kuh Page 9 of 38

Motor Skill Learning licke, & Munte, 2003). A more recent study demonstrated a parametric increase in the magnitude of the ERN during sequence learning as the awareness of the sequential na ture and the expectancy to the forthcoming sequential element increased (Ferdinand et al., 2008). The series of studies described above supports a role for the prefrontal ACC system in motor error processing. The actual mechanism of how the ERN contributes to perfor mance improvements across trials during motor learning is not well understood, however. Additionally, it remains unclear whether this system works independently of or in collabo ration with the cerebellar-based error processing system.

Role of Working Memory in Skill Learning

Working memory refers to the structures and processes used for temporarily storing and manipulating information (Baddeley, 1986; Miyake & Shah, 1999). Dissociated processing for spatial and verbal information was initially proposed by Baddeley and Hitch (1974). Current views suggest that working memory may not be as process pure as once thought (Cowan, 1995, 2005; Jonides et al., 2008), but the idea of separate modules for processing different types of information still holds (Goldman-Rakic, 1987; Shah & Miyake, 1996; Smith, Jonides, & Koeppe, 1996; Volle et al., 2008). Given that the early stages of motor learning can be disrupted by the performance of secondary tasks (Eversheim & Bock, 2001; Remy, 2010; Taylor & Thoroughman, 2007, 2008), and the fact that similar prefrontal cortical regions are engaged early in motor learning as those relied on to perform spatial working memory tasks (Jonides et al., 2008; Reuter-Lorenz et al., 2000), it is plausible that learners engage spatial working memory processes to learn new motor skills.

Because variation exists in the number of items that individuals can hold and operate on in working memory (cf. Vogel & Machizawa, 2004), it lends itself well to individual differences research approaches. To investigate whether working memory contributes to visual-motor adaptation, in a recent study our group administered a battery of neuropsychological assessments to participants and then had them perform a manual visual-motor adaptation task and a spatial working memory task during magnetic resonance imaging (MRI). The results showed that performance on the card rotation task (Ekstrome, 1976), a measure of spatial working memory, correlated with the rate of early, but not late, learning on the visual-motor adaptation task across individuals (Anguera, Reuter-Lorenz, Willingham, & Seidler, 2010). There were no correlations between verbal working memory measures and either early or late learning. Moreover, the neural correlates of early adaptation overlapped with those that participants engaged when performing a spatial working memory task, notably in the right dorsolateral prefrontal cortex and in the bilateral inferior parietal lobules. There was no neural overlap between late adaptation and spatial working memory. These data demonstrate that early, but not late, learning engages spatial working memory processes (Anguera, Reuter-Lorenz, Willingham, & Seidler, 2010).


Despite recent assertions that visual-motor adaptation is largely implicit (Mazzoni & Krakauer, 2006), these findings (Anguera, Reuter-Lorenz, Willingham, & Seidler, 2010) are consistent with the hypothesis that spatial working memory processes are involved in the early stages of acquiring new visual-motor mappings. As described in detail below, whether a task is learned implicitly (subconsciously) or explicitly appears to be a separate issue from the relative cognitive demands of a task. For example, adaptation to a small and gradual visual-motor perturbation, which typically occurs outside of the learner’s awareness, is still subject to interference by performance of a secondary task (Galea, Sami, Albert, & Miall, 2010). This effect is reminiscent of Nissen and Bullemer’s (1987) finding that implicitly acquiring a sequence of actions is attentionally demanding and can be disrupted by secondary task performance. Thus we propose that spatial working memory can be engaged for learning new motor skills even when learning is implicit.

Models of motor sequence learning also propose that working memory plays an integral role (cf. Ashe, Lungu, Basford, & Lu, 2006; Verwey, 1996, 2001). Studies that have used repetitive transcranial magnetic stimulation (rTMS) to disrupt the dorsolateral prefrontal cortex, a structure involved in working memory (Jonides et al., 1993), have shown impaired motor sequence learning (Pascual-Leone, Wassermann, Grafman, & Hallett, 1996; Robertson, Tormos, Maeda, & Pascual-Leone, 2001). To evaluate whether working memory plays a role in motor sequence learning, we determined whether individual differences in visual-spatial working memory capacity affect the temporal organization of acquired motor sequences.
We had participants perform an explicit motor sequence learning task (i.e., they were explicitly informed about the sequence and instructed to learn it) and a visual-spatial working memory task (Luck & Vogel, 1997). We found that working memory capacity correlated with the motor sequence chunking pattern that individuals developed; that is, individuals with a larger working memory capacity chunked more items together when learning the motor sequence (Bo & Seidler, 2009). Moreover, these individuals exhibited faster rates of learning. These results demonstrate that individual differences in working memory capacity predict the temporal structure of acquired motor sequences.

How might spatial working memory be used during the motor learning process? We propose that it is used for differing purposes depending on whether the learner is acquiring a new sequence of actions or adapting to a novel sensorimotor environment. In the case of sensorimotor adaptation, we suggest that error information from the preceding trial (see above section) is maintained in spatial working memory and relied on when the learner manipulates the sensorimotor map to generate a motor command that is appropriate for the new environment. When adaptation is in response to a rotation of the visual display, this process likely involves the mental rotation component of spatial working memory (Jordan, 2001; Logie, 2005). This interpretation agrees with Abeele and Bock’s proposal that adaptation progresses in a gradual fashion across the learning period from small angles of transformation through intermediate values until the prescribed angle of rotation is reached (Abeele, 2001). Thus, the engagement of these spatial working memory resources late in adaptation is markedly diminished, compared with early adaptation, when the new mapping has been formed and is in use. This notion is supported by electrophysiological data demonstrating an interaction between motor areas and a frontal-parietal network during motor adaptation (Wise, 1998).

Dual-tasking studies of motor adaptation support the timeline of this proposal as well. Taylor and Thoroughman (2007, 2008) have shown that sensorimotor adaptation is most affected when attention is distracted by a secondary task imposed late in the trial, when error information becomes available. These authors suggest that cognitive resources are engaged between trials so that error information can be integrated to update visual-motor maps for the subsequent trial, because a secondary task performed early in the trial did not produce interference. Thus, it seems that spatial working memory is engaged in the service of correcting motor errors across trials to improve performance over time. This is anatomically plausible regardless of whether the cerebellar motor error system or the ACC error detection and correction network (or both; see previous section) is relied on for motor learning, because both structures have connections to the lateral frontal-parietal system that supports working memory and both structures have been reported to be activated in the early stages of sensorimotor adaptation (Anguera, Reuter-Lorenz, Willingham, & Seidler, 2010; Imamizu et al., 2000).

Many motor sequence learning paradigms result in learning with few or no errors because movements are cued element by element, as in the popular serial reaction time task. Trial and error sequence learning is an exception; in this case spatial working memory likely plays a key role in maintaining and inhibiting past erroneous responses (Hikosaka et al., 1999). In the case of cued sequence learning, we propose that working memory is relied on for chunking together of movement elements based on repetition, and their transfer into long-term memory.
Both implicit and explicit motor sequence learning paradigms consistently result in activation of the frontal-parietal working memory system (Ashe, Lungu, Basford, & Lu, 2006), consistent with this notion. Interestingly, dorsal premotor cortex, which serves as a node in both working memory and motor execution networks, is the site where sensory information from working memory is thought to be converted into motor commands for sequence execution (Ohbayashi, 2003).

Verwey (1996, 2001) also hypothesized a close relationship between working memory capacity and motor sequence learning. He proposed that participants rely on a cognitive processor during sequence learning, which depends on “motor working memory” to allow a certain number of sequence elements (i.e., a chunk) to be programmed in advance of execution. At the same time, a motor processor is running in parallel to execute the actions so that the entire sequence can be performed efficiently. Interestingly, Ericsson and colleagues (Ericsson, 1980) reported a case in which a participant with initially average memory abilities increased his memory span from 7 to 79 digits with practice. This individual learned to group chunks of digits together to form “supergroups,” which allowed him to dramatically increase his digit span. Presumably the process is similar for those acquiring long motor sequences as well, such as a musician memorizing an entire piece of music or a dancer learning a sequence of moves for a performance.
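
As an illustration of the account of sensorimotor adaptation proposed earlier in this section, the following Python sketch simulates a learner who carries the angular error from one trial into the next and uses it to update an internal estimate of a rotation of the visual display. It is a minimal toy model, not the analysis used in any of the studies cited: the 30-degree rotation, the target location, the correction gain of 0.2, and all function names are assumptions chosen for illustration.

```python
import math


def rotate(point, angle_deg):
    """Rotate a 2-D point counterclockwise about the origin by angle_deg."""
    a = math.radians(angle_deg)
    x, y = point
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))


def angular_error(cursor, target):
    """Signed angle in degrees from the target direction to the cursor direction."""
    diff = math.atan2(cursor[1], cursor[0]) - math.atan2(target[1], target[0])
    # Wrap the difference into the interval [-180, 180).
    return (math.degrees(diff) + 180.0) % 360.0 - 180.0


def simulate_adaptation(rotation_deg=30.0, n_trials=40, gain=0.2):
    """Toy across-trial correction; all parameter values are hypothetical."""
    target = (0.0, 10.0)   # assumed target location, straight ahead
    estimate = 0.0         # learner's current estimate of the rotation
    errors = []
    for _ in range(n_trials):
        # Aim so as to cancel the rotation as currently estimated.
        hand = rotate(target, -estimate)
        # Visual feedback of the hand is rotated by the imposed angle.
        cursor = rotate(hand, rotation_deg)
        err = angular_error(cursor, target)
        errors.append(err)
        # The error from this trial is held over and used to update the map.
        estimate += gain * err
    return errors, estimate
```

Under these assumptions the error shrinks by a fixed fraction on each trial, so the estimate passes through intermediate angles before approaching the imposed rotation.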


Late Learning Processes

As individuals become more proficient in executing a task, striatal and cerebellar regions become more active, whereas cortical regions become less active (Doyon & Benali, 2005; Grafton, Mazziotta, Woods, et al., 1992). But the question of how and where motor memories are ultimately stored is not trivial to answer. Shifts in activation do not mean that these “late learning” structures are involved in memory; they may instead merely mediate performance of a well-learned set of movements. Also, the role of a specific area may differ between types of motor tasks.

At the cellular level, the acquisition of motor memories as a modulation of synaptic connections between neurons was first proposed by Ramon y Cajal (1894). Memories result from the facilitation and selective elimination of neuronal pairings due to experience. Hebb stated the associative nature of memories in more explicit terms: “two cells or systems that are repeatedly active at the same time will tend to become associated, so that activity in one facilitates activity in the other” (1949, p. 70). This Hebbian plasticity model has been well supported in model organisms (see Abel & Lattal, 2001; Martin, Grimwood, & Morris, 2000), and it appears that similar or identical mechanisms are at play for human motor memories (Donchin, Sawaki, Madupu, Cohen, & Shadmehr, 2002).

At the level of systems, motor memories appear to follow a “cascade” pattern (Krakauer & Shadmehr, 2006): motor learning initially excites and rapidly induces associative plasticity in one area, for example, primary motor cortex or cerebellar cortex. For a period of hours, this area will remain excited, and the learning will be sensitive to disruption and interference (Brashers-Krug, Shadmehr, & Bizzi, 1996; Stefan et al., 2006). With the passage of time, the activation decreases, and the learning becomes more stable and resistant to interference (Shadmehr & Brashers-Krug, 1997).
Note that this has been debated (Caithness et al., 2004), but reducing the effects of anterograde interference, through either intermittent practice (Overduin, Richardson, Lane, Bizzi, & Press, 2006) or washout blocks (Krakauer, Ghez, & Ghilardi, 2005), demonstrates consolidation.

Memories for motor acts are stored hierarchically. This hierarchy appears to hold true both for abstract representations of movements in declarative memory (Schack & Mechsner, 2006) and for the motor commands themselves (Grafton & Hamilton, 2007; Krigolson & Holroyd, 2006; Paine & Tani, 2004; Thoroughman & Shadmehr, 2000; Yamamoto & Fujinami, 2008). This storage method may reflect a fundamental limitation of working memory or attention on the number of elements that can be operated on at once. In other words, hierarchical representations are a form of chunking. Analogy learning, discussed later in this chapter, may exploit this hierarchical framework by providing a ready-made set of assembled motor primitives, bringing a learner more quickly to a higher hierarchical level.

The search for the site of motor memory storage has mostly focused on areas known to be active in late learning of well-practiced tasks. Classically, it was believed that primary motor cortex merely housed commands for simple trajectories, and was controlled by premotor areas (Fulton, 1935). However, more recent electrophysiological work in primates has shown that motor cortex stores representations of target locations (Carpenter, Georgopoulos, & Pellizzer, 1999), motor sequences (Lu, 2005; Matsuzaka, Picard, & Strick, 2007), adaptations to a force field (Li, Padoa-Schioppa, & Bizzi, 2001), and a visual-motor transformation (Paz, Boraud, Natan, Bergman, & Vaadia, 2003).

Whether and how other brain areas contribute to the storage of motor memories is less clear. The cerebellum, commonly associated with motor tasks and motor learning, and active in the performance of well-learned movements, has been identified as a likely candidate. However, its role is still debated. With regard to sequence learning, although it is active in performance, the cerebellum is not necessary for learning or memory (Seidler et al., 2002) unless movements are cued in a symbolic fashion (Bo, Peltier, Noll, & Seidler, 2011; Spencer & Ivry, 2008). The striatum has been extensively linked to acquisition and storage of sequential representations (Debas et al., 2010; Jankowski, 2009; Seitz & Roland, 1992), although a recent experiment provides a provocative challenge to this viewpoint (Desmurget, 2010). In contrast, substantial evidence supports a role for the cerebellum in housing internal models acquired during sensorimotor adaptation (Graydon, Friston, Thomas, Brooks, & Menon, 2005; Imamizu et al., 2000, 2003; Seidler & Noll, 2008; Werner, 2010).
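
Hebb's associative principle, quoted earlier in this section, is often written as a change in connection strength proportional to the product of the activity of the two cells. The Python sketch below is a minimal illustration of that rule under assumed values; the learning rate and the function names are hypothetical, and the unbounded growth of the weight is a known limitation of this simplest form rather than a claim about biological synapses.

```python
def hebbian_update(weight, pre, post, rate=0.1):
    """Strengthen a connection in proportion to coincident activity.

    `pre` and `post` are the activity levels of the two cells; `rate`
    is an arbitrary learning rate chosen for illustration.
    """
    return weight + rate * pre * post


def train(pairs, rate=0.1):
    """Apply the update over a series of (pre, post) activity pairs."""
    weight = 0.0
    for pre, post in pairs:
        weight = hebbian_update(weight, pre, post, rate)
    return weight
```

In this sketch, cells that are repeatedly active together acquire a stronger connection, whereas cells that are only ever active at different times acquire none.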

Fast and Slow Learning

Computational neuroscience approaches have contributed much to our understanding of time-varying processes underlying motor learning, although these results await strong integration into cognitive neuroscience views. Computational modeling work supports a role for error processing in skill learning by showing that performance change across trials is related to the magnitude of errors that have been recently experienced (Scheidt, Dingwell, & Mussa-Ivaldi, 2001; Thoroughman & Shadmehr, 2000). More recently it has been demonstrated that a single state model (i.e., rate of learning is dependent on error magnitude) does not account for all features of motor learning, such as the savings that occur when relearning a previously experienced task, and the fact that unlearning occurs more quickly than learning. Work by Shadmehr and colleagues provides evidence that two time-varying processes with differential responsiveness to errors and varying retention profiles can account for these features (Huang & Shadmehr, 2009; Joiner & Smith, 2008; Shadmehr et al., 2010; Smith, Ghazizadeh, & Shadmehr, 2006). These authors propose that a fast learning system responds strongly to errors but does not retain information well, whereas in contrast, a slow learning system responds weakly to errors but exhibits better retention.

Studies investigating ways to maximize the effects of the slow learning process provide some interesting implications for enhancing retention of acquired skills. For example, Joiner and Smith (2008) demonstrated that retention 24 hours after learning does not depend on the level of performance that participants had attained at the end of learning, which often is viewed as an indicator of the amount of learning that has taken place. Rather, retention depends on the level that the slow learning process has reached at the end of learning. Further, it seems that the fast and slow processes are not fixed but rather can be exploited to maximize retention (Huang & Shadmehr, 2009). When an adaptive stimulus is introduced gradually, errors are small, and retention is better.
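
The two-process account described above can be sketched as a pair of linear update rules, each with its own retention factor and error sensitivity. The Python code below is a minimal sketch under stated assumptions, not a reproduction of the cited models: the parameter values are illustrative only, the perturbation is a constant of arbitrary size 1.0, and the retention interval is crudely modeled as a run of trials on which no error is experienced.

```python
def simulate_two_state(n_learn=300, n_retain=50, perturbation=1.0,
                       a_fast=0.59, b_fast=0.21,
                       a_slow=0.992, b_slow=0.02):
    """Return a list of (fast, slow) states, one entry per trial.

    a_* are retention factors and b_* are error sensitivities. The fast
    process learns strongly but forgets quickly; the slow process learns
    weakly but retains well. All values are assumptions for illustration.
    """
    fast = 0.0
    slow = 0.0
    trace = []
    for trial in range(n_learn + n_retain):
        if trial < n_learn:
            # Net adaptation is the sum of the two states.
            error = perturbation - (fast + slow)
        else:
            # Assumed retention interval: no error, so both states only decay.
            error = 0.0
        fast = a_fast * fast + b_fast * error
        slow = a_slow * slow + b_slow * error
        trace.append((fast, slow))
    return trace
```

With these illustrative parameters the fast process dominates the first few trials, the slow process carries most of the adaptation by the end of practice, and only the slow process survives the simulated retention interval, which is in keeping with the suggestion that retention depends on the level reached by the slow process.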

Explicit and Implicit Memory Systems in Skill Learning

Understanding the cognitive and neural mechanisms of motor skill learning requires taking into account the relative roles of the explicit and implicit learning and memory systems. The explicit learning and memory system refers to the neurocognitive process that accompanies conscious awareness of learning and remembering, whereas the implicit system refers to the same process without concomitant awareness (Cohen & Squire, 1980; Reber, 1967; Squire, 1992; Voss & Paller, 2008). Laboratory evaluations of explicit memory have a specific reference to information learned earlier, such as recall of a previously learned list of words. Implicit memory is demonstrated by changes in performance that are due to prior experience or practice and that may not be consciously remembered (Voss & Paller, 2008).

The distinction between the two memory systems and the existence of parallel neural mechanisms underlying each was initially evidenced through the study of neurological patients. Patients with lesions to the medial temporal lobe and in particular the hippocampus (as in the case of the well-studied patient H.M.) show amnesia, which is the inability to learn, store, and recollect new information consciously (Cohen & Squire, 1980; Scoville & Milner, 1957). Motor learning and memory systems are spared in amnesic patients, as shown by intact mirror drawing and rotary pursuit task learning (Brooks & Baddeley, 1976; Cavaco, Anderson, Allen, Castro-Caldas, & Damasio, 2004; Corkin, 1968). These same implicit tasks are impaired in patients with diseases affecting the basal ganglia (Gabrieli, Stebbins, Singh, Willingham, & Goetz, 1997; Heindel, Salmon, Shults, Walicke, & Butters, 1989).
Neuroimaging studies also demonstrate medial temporal lobe and hippocampal activation during explicit learning (Luo & Niki, 2005; Montaldi et al., 1998; Staresina & Davachi, 2009) and striatal involvement during implicit learning (Lieberman, Chang, Chiao, Bookheimer, & Knowlton, 2004; Poldrack, Prabhakaran, Seger, & Gabrieli, 1999; Seger & Cincotta, 2006). The dorsolateral prefrontal cortex is also involved during explicit learning (Barone & Joseph, 1989a, 1989b; Sakai et al., 1998; Toni, Krams, Turner, & Passingham, 1998), whereas the cerebellum and the cortical motor areas are involved during implicit learning (Ashe et al., 2006; Matsumura et al., 2004).

More recent evidence suggests that the explicit and implicit systems may affect and interact with one another (Willingham, 2001). For example, performance on an implicit memory test can be influenced by explicit memory, and vice versa (Keane, Orlando, & Verfaellie, 2006; Kleider & Goldinger, 2004; Tunney & Fernie, 2007; Voss, Baym, & Paller, 2008). Moreover, the hippocampus and striatum, traditionally associated with explicit and implicit knowledge, respectively, have been shown to interact during probabilistic classification learning (Sadeh, Shohamy, Levy, Reggev, & Maril, 2011; Shohamy & Wagner, 2008). Studies also suggest that the two systems can compete with one another during learning (Eichenbaum, Fagan, Mathews, & Cohen, 1988; Packard, Hirsh, & White, 1989; Poldrack et al., 2001).

The interaction between the explicit and implicit learning and memory systems is also evident during motor skill learning. The fact that amnesic patients can still learn visual-motor tasks such as mirror drawing led many to think that explicit processes are not involved in motor skill learning. As a consequence, motor skill learning has been predominantly viewed as part of the implicit learning system by researchers in the memory field (Gabrieli, 1998; Henke, 2010; Squire, 1992). As explicated in this chapter, however, there are both explicit and implicit forms of motor skill learning, and there is evidence that these processes interact during the learning of new motor skills. Moreover, the heavy involvement of cognitive processes such as error detection and working memory during the early stages of skill learning supports the notion that motor skills can benefit from explicit instruction, at least early in the learning process. The following sections further discuss the role of implicit and explicit processes in motor sequence learning and sensorimotor adaptation.

Explicit and Implicit Processes in Sequence Learning

Implicit and explicit learning mechanisms have been studied extensively in motor sequence learning. There is still debate, however, as to whether the two systems interact during learning. Behavioral experiments suggest that the two systems act in parallel without interference (Curran & Keele, 1993; Song, Howard, & Howard, 2007). Distinct neural mechanisms have also been reported for the two forms of sequence learning (Destrebecqz et al., 2005; Honda et al., 1998; Karabanov et al., 2010).

On the other hand, there is also some evidence supporting the interaction of the two systems during sequence learning. For example, the medial temporal lobe and striatum, known to serve explicit and implicit processes, respectively, have both been shown to be engaged during both implicit and explicit sequence learning (Schendan, Searl, Melrose, & Stern, 2003; Schneider et al., 2010; Wilkinson, Khan, & Jahanshahi, 2009). Several studies also show that explicit instructions interfere with implicit learning, supporting interactions between the two systems (Boyd & Winstein, 2004; Green & Flowers, 1991; Reber, 1976). However, if explicit knowledge has been acquired through practice, it does not interfere with implicit learning (Vidoni & Boyd, 2007). The interference effect between the two systems during sequence learning has also been supported by neuroimaging data. Destrebecqz and colleagues showed the expected activation in the striatum during implicit learning; during explicit learning, the ACC and medial prefrontal cortex were also active, and this activation correlated negatively with striatal activation (Destrebecqz et al., 2005). The authors interpreted this as suppression of the implicit learning system (i.e., the striatum) during explicit learning. Another study showed that the intention to learn a sequence was associated with sustained right prefrontal cortex activation and an attenuation of learning-related changes in the medial temporal lobe and thalamus, resulting in the failure of implicit learning (Fletcher et al., 2005).


Some studies also emphasize that the implicit and explicit memory systems share information. Destrebecqz and Cleeremans (2001) showed that participants were able to generate an implicitly learned sequence when prompted, implying the existence of explicitly acquired sequence knowledge. An overlap of neural activation patterns between the two learning systems, including the striatum, has also been shown in neuroimaging studies of simultaneous explicit and implicit sequence learning (Aizenstein et al., 2004; Willingham, Salidis, & Gabrieli, 2002).

A recent study demonstrates that the two memory systems are combined during the learning process (Ghilardi, Moisello, Silvestri, Ghez, & Krakauer, 2009). In this study, the authors report that the implicit and explicit systems consolidate differently and are differentially sensitive to interference, with explicit learning being more sensitive to anterograde and implicit more sensitive to retrograde interference. They proposed that the explicit acquisition of sequential order knowledge and the implicit acquisition of accuracy can be combined for successful learning.

Explicit and Implicit Processes in Sensorimotor Adaptation

Whether and how implicit and explicit processes play a role in sensorimotor adaptation is less well understood. It seems clear that cognitive processes contribute to the early stages of adaptive learning, as outlined above. However, cognitively demanding tasks do not necessitate reliance on explicit processes (cf. Galea et al., 2010; Nissen & Bullemer, 1987). It has been shown that participants who gain explicit awareness by the end of adaptation exhibit learning advantages (Hwang, Smith, & Shadmehr, 2006; Werner & Bock, 2007). However, because participants were polled after the experimental procedure, it is not clear whether explicit processes aided learning per se, or arose as a result of the transformation having become well learned. Malone and Bastian (Malone, 2010) recently demonstrated that participants instructed how to consciously correct errors during adaptation learned faster than those given no instructions, and participants performing a secondary distractor task learned even more slowly.

Mazzoni and Krakauer (2006) also provided participants with an explicit strategy to counter an imposed rotation. Their surprising result was that implicit adaptation to the rotation proceeded despite the strategy, and despite making performance on the task worse. Using the same paradigm, Taylor and colleagues (2010) repeated this experiment, this time comparing healthy adults and patients with cerebellar ataxia. The patients with cerebellar damage performed better than the controls; they were able to implement an explicit strategy with little interference from implicit processes, whereas the control group demonstrated progressively poorer performance as a result of recalibration.
Although these findings are remarkable, the data do not necessarily support independence of implicit and explicit systems or a complete takeover by implicit processes, as proposed by Mazzoni and Krakauer (2006): The errors made by normal controls as a result of their adaptation only reached about one-third of the magnitude of the total rotation. These findings provide further support for the role of the cerebellum in implicit adaptation, as separate from explicit processes.

That the two systems may be independent is further supported by work from Sulzenbruck and Heuer (2009). In their task, drawing circles under changing gain conditions, explicit and implicit processes were placed both in cooperation and in opposition; the explicit strategy either complemented or opposed the gain change. Under these circumstances, and when participants never fully adapted to the gain change, the two systems operated in isolation: they did not interfere with one another, and their effects were purely summative.

When the explicit and implicit systems are not placed in direct conflict, the evidence seems to be clearer. Explicit and implicit involvements are complementary and non-interacting. In a study by Mazzoni and Wexler (2009), an explicit rule could be used to select a target, whereas implicit processes adapted to a visual-motor rotation without interference. Interestingly, this study indicates that the two processes have a potential to interact. Patients in a presymptomatic stage of Huntington’s disease were unable to maintain pure separation, and their adaptation performance suffered when explicit task control was required.

In all, a clear conclusion regarding the role of explicit control in adaptation remains elusive, as does an understanding of the brain structures and networks involved. Future work with careful attention to methodological detail is required to address these issues.
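
The drift reported by Mazzoni and Krakauer (2006) can be illustrated with a toy simulation in which an explicit aiming strategy cancels the rotation on the first trial while an implicit process keeps adapting to the discrepancy between the aimed location and the cursor. This is a sketch under assumed dynamics, not the authors' model: the 45-degree rotation, the retention factor, the learning rate, and the function name are hypothetical, and setting the learning rate to zero is only a crude stand-in for impaired implicit adaptation.

```python
def simulate_strategy(n_trials=80, rotation=45.0,
                      retention=0.98, learn_rate=0.02):
    """Return the signed target error (degrees) on each trial.

    Angles are measured relative to the target, which sits at 0 degrees.
    All parameter values are assumptions chosen for illustration.
    """
    aim = -rotation      # explicit strategy: aim opposite to the rotation
    implicit = 0.0       # implicit compensation accumulated so far
    target_errors = []
    for _ in range(n_trials):
        hand = aim - implicit
        cursor = hand + rotation          # feedback is rotated
        target_errors.append(cursor)      # error relative to the target
        # The implicit process is driven by where the cursor landed
        # relative to where the movement was aimed, not by target error.
        prediction_error = cursor - aim
        implicit = retention * implicit + learn_rate * prediction_error
    return target_errors
```

In this sketch the strategy is perfectly accurate on the first trial, yet the target error grows over trials because the implicit process continues to compensate for a discrepancy that the strategy itself creates. With the learning rate set to zero, the strategy remains accurate throughout.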

Practical Implications for Skill Learning

The cognitive neuroscience of skill acquisition has many practical implications, especially as it applies to sporting activities and rehabilitation after injury. As mentioned previously, when individuals learn a new skill, they progress from a more cognitive early phase of learning relying on prefrontal and parietal brain networks to a highly automatic late phase of learning that relies on more subcortical structures. Moving participants quickly through these stages and avoiding regression to lower levels of performance is a topic of interest not only to scientific researchers but also to coaches, physical therapists, sports educators, and people from many other related disciplines. This section highlights the practical implications of skill learning in sports, and focuses on the phenomenon of “choking under pressure,” as well as on ways to circumvent and avoid such lapses in performance.

Numerous studies have shown that choking under pressure occurs when a highly learned and automatic skill is brought to explicit awareness, leaving the normally well-learned and automatic motor skill prone to breakdown under conditions of stress (Baumeister, 1984; Beilock & Carr, 2001; Beilock, Carr, MacMahon, & Starkes, 2002). Although little neuroscience research has been dedicated to this phenomenon, presumably learners show decreased performance in a well-learned skill when the focus of attention engages brain networks more actively involved in initial learning, such as the prefrontal cortex and ACC networks (see Figure 20.1, arrow on right).

Researchers studying skilled performance in high-stakes conditions often analyze such breakdowns in performance by reporting on the differences in the focus of attention between experts and less skilled individuals. For example, in a 2007 study by Castaneda and Gray, novice and expert baseball players performed differently depending on the focus of attention prescribed by their explicit instructions before the task. Specifically, it was found that the optimal focus of attention for highly skilled batters is one that is external and allows attention to the perceptual aspects of the action. This external focus of attention is thought to allow smooth processing of the proceduralized knowledge of swinging a baseball bat (Castaneda & Gray, 2007). On the other hand, less skilled batters benefit from a focus that attends to the step-by-step execution of the swing and are hampered by focusing on the effects of their action (Castaneda & Gray, 2007). Similarly, in a study of expert and novice soccer players, Beilock et al. (2002) found that when novices were required to focus on the execution of the skill, their performance, as measured by time to dribble a soccer ball through an obstacle course, improved compared with a dual-task condition. In experts, though, this result was reversed. Expert players performed better when there was a concomitant task that took their attention off executing this well-learned skill. Interestingly, however, when right-footed expert players were instructed to perform the task with their nondominant left foot, their performance was enhanced in the skill-focus condition. Overall, this research implies that when it comes to highly learned motor skills, performance is improved with an external focus. It is only when performing a less-well-learned skill that performance benefits from a focus on the component movements that compose the complex skill.

How does the way that individuals learn affect the acquisition and robustness of a new motor skill?
For example, if skill acquisition requires going from a highly cognitive explicit representation to a procedural implicit representation, are there ways to affect the learning process so that the motor program is initially stored implicitly? This would have the added benefit of making the skill more resistant to choking under pressure because there would not be a strong explicit motor program of the skill to bring to conscious awareness. One such approach is analogy learning, in which one simple heuristic is provided to the learner (Koedijker, Oudejans, & Beek, 2008). In following this single rule, learners are believed to eschew hypothesis testing during skill learning. Thus, little explicit knowledge about the task is accumulated, and demands on error detection and working memory processes are kept to a minimum (Koedijker, Oudejans, & Beek, 2008). From a neuroscience perspective, this would be akin to “bypassing” the highly cognitive prefrontal brain networks and encoding the skill directly into more robust subcortical brain regions such as the striatum and cerebellum (see Figure 20.1, arrow on left). For example, in a study conducted by Liao and Masters (2001), subjects were instructed to hit a forehand shot in table tennis by moving the hand along an imaginary hypotenuse of a right triangle. Analogy learners acquired few explicit rules about the task and showed performance similar to that of participants who had learned the skill explicitly. When a concurrent secondary task was added, the performance of the explicit learners was significantly reduced. However, no significant performance impairments were seen in the analogy (implicit learning) group under dual-task conditions.
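Robustness to a secondary task of the kind described above is commonly summarized as a proportional dual-task cost. The helper below is a minimal sketch of that calculation. The function name is hypothetical, and the scores in the usage lines are invented for illustration; they are not data from Liao and Masters (2001).

```python
def dual_task_cost(single_task_score, dual_task_score):
    """Proportional performance drop under dual-task conditions.

    Assumes higher scores mean better performance. A value near 0 indicates
    little interference from the secondary task; larger positive values
    indicate greater interference.
    """
    if single_task_score == 0:
        raise ValueError("single_task_score must be nonzero")
    return (single_task_score - dual_task_score) / single_task_score


# Invented example scores (e.g., successful hits out of 30 attempts).
explicit_cost = dual_task_cost(single_task_score=20, dual_task_score=15)
analogy_cost = dual_task_cost(single_task_score=20, dual_task_score=20)
```

With these invented scores, the explicit learner shows a cost of 0.25 and the analogy learner a cost of 0.0, mirroring the qualitative pattern described in the text rather than any reported effect size.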

Conclusions and Future Directions

Recent studies on the cognitive neuroscience of skill learning have begun to incorporate new techniques and approaches. For example, it has been shown using diffusion tensor imaging (DTI) that individual differences in cerebellar white matter microstructure correlate with skill learning ability (Della-Maggiore et al., 2009). Moreover, a single session of motor learning, but not simple motor performance, has been shown to modulate frontal-parietal network connectivity. The field has advanced in terms of both elucidating basic mechanisms of skill learning and making strides in translational approaches. Our nascent understanding of the physiological mechanisms underlying skill learning has been exploited for the design of brain stimulation protocols that inhibit abnormal brain activity in stroke patients (Fregni et al., 2006) or accelerate skill learning by applying transcranial direct current stimulation to the motor cortex (Reis et al., 2009).

When new techniques are applied to the study of skill learning mechanisms, investigators typically focus first on primary motor cortex. Although this structure is both accessible and strongly associated with skill learning, a complete understanding of the neural mechanisms underlying learning will come only from investigation of interactions between and integration of cognitive and motor networks. Moreover, investigation of ecologically valid skills in more naturalistic settings with the use of mobile brain imaging techniques (Gwin, Gramann, Makeig, & Ferris, 2010, 2011) will bring further validity and translational capacity to the study of skill learning.

Learning a new motor skill places demands not only on the motor system but also on cognitive processes. For example, working memory appears to be relied on for manipulating motor commands for subsequent actions based on recently experienced errors. Moreover, it may be used for chunking movement elements into hierarchical sequence representations. Interestingly, individual differences in working memory capacity are predictive of the ability to learn new motor skills. It should be noted that engagement of working memory, error processing, and attentional processes and networks does not imply that skill learning is explicit. As skill learning progresses, cognitive demands are reduced, as reflected by decreasing interference from secondary tasks and reduced activation of frontal-parietal brain networks.

Questions for the Future

• How overlapping are the following continua?
  • Cognitive and procedural control of skill
  • Implicit and explicit memory systems
  • Fast and slow skill learning processes
• Do the implicit and explicit memory systems interact, either positively (reflected as transfer) or negatively (reflected as interference), for both motor sequence learning and sensorimotor adaptation?
• What role do genetic, experiential, and other individual differences variables play in skill learning?
• What role do state variables, such as motivation, arousal, and fatigue, play in skill learning?
• How can a precise understanding of the neurocognitive mechanisms of skill learning be exploited to enhance rehabilitation, learning, and performance?

Author Note

This work was supported by the National Institutes of Health (R01 AG 24106 S1 and T32AG00114).

References

Abeele, S., & Bock, O. (2001). Mechanisms for sensorimotor adaptation to rotated visual input. Experimental Brain Research, 139, 248–253.
Abel, T., & Lattal, K. M. (2001). Molecular mechanisms of memory acquisition, consolidation and retrieval. Current Opinion in Neurobiology, 11 (2), 180–187.
Adams, J. A. (1971). A closed-loop theory of motor learning. Journal of Motor Behavior, 3 (2), 111–149.
Aizenstein, H. J., Stenger, V. A., Cochran, J., Clark, K., Johnson, M., Nebes, R. D., et al. (2004). Regional brain activation during concurrent implicit and explicit sequence learning. Cerebral Cortex, 14 (2), 199–208.
Anguera, J. A., Reuter-Lorenz, P. A., Willingham, D. T., & Seidler, R. D. (2009). Contributions of spatial working memory to visuomotor learning. Journal of Cognitive Neuroscience, 22 (9), 1917–1930.
Anguera, J. A., Reuter-Lorenz, P. A., Willingham, D. T., & Seidler, R. D. (2010). Contributions of spatial working memory to visuomotor learning. Journal of Cognitive Neuroscience, 22 (9), 1917–1930.
Anguera, J. A., Seidler, R. D., & Gehring, W. J. (2009). Changes in performance monitoring during sensorimotor adaptation. Journal of Neurophysiology, 102 (3), 1868–1879.
Aristotle; Ross, W. D., & Smith, J. A. (1908). The works of Aristotle. Oxford, UK: Clarendon Press.

Ashe, J., Lungu, O. V., Basford, A. T., & Lu, X. (2006). Cortical control of motor sequences. Current Opinion in Neurobiology, 16, 213–221.
Atzler, E., & Herbst, R. (1927). Arbeitsphysiologische Studien. Pflügers Archiv European Journal of Physiology, 215 (1), 291–328.
Baddeley, A. D. (1986). Working memory. Oxford, UK: Oxford University Press.
Baddeley, A. D., & Hitch, G. J. (1974). Working memory. In G. A. Bower (Ed.), Recent advances in learning and motivation (Vol. 8, pp. 47–89). New York: Academic Press.
Barone, P., & Joseph, J. P. (1989a). Prefrontal cortex and spatial sequencing in macaque monkey. Experimental Brain Research, 78 (3), 447–464.
Barone, P., & Joseph, J. P. (1989b). Role of the dorsolateral prefrontal cortex in organizing visually guided behavior. Brain Behavior and Evolution, 33 (2-3), 132–135.
Bastian, A. J. (2006). Learning to predict the future: the cerebellum adapts feedforward movement control. Current Opinion in Neurobiology, 16 (6), 645–649.
Baumeister, R. F. (1984). Choking under pressure: Self-consciousness and paradoxical effects of incentives on skillful performance. Journal of Personality and Social Psychology, 46 (3), 610–620.
Beilock, S. L., & Carr, T. H. (2001). On the fragility of skilled performance: what governs choking under pressure? Journal of Experimental Psychology: General, 130 (4), 701–725.
Beilock, S. L., Carr, T. H., MacMahon, C., & Starkes, J. L. (2002). When paying attention becomes counterproductive: Impact of divided versus skill-focused attention on novice and experienced performance of sensorimotor skills. Journal of Experimental Psychology: Applied, 8 (1), 6–16.
Berns, G. S., Cohen, J. D., & Mintun, M. A. (1997). Brain regions responsive to novelty in the absence of awareness. Science, 276 (5316), 1272–1275.
Bo, J., Borza, V., & Seidler, R. D. (2009). Age-related declines in visuospatial working memory correlate with deficits in explicit motor sequence learning. Journal of Neurophysiology, 102, 2744–2754.
Bo, J., Peltier, S., Noll, D., & Seidler, R. D. (2011). Symbolic representations in motor sequence learning. NeuroImage, 54 (1), 417–426.
Bo, J., & Seidler, R. D. (2009). Visuospatial working memory capacity predicts the organization of acquired explicit motor sequences. Journal of Neurophysiology, 101 (6), 3116–3125.
Botvinick, M. M., Braver, T. S., Barch, D. M., Carter, C. S., & Cohen, J. D. (2001). Conflict monitoring and cognitive control. Psychological Review, 108 (3), 624–652.

Boyd, L. A., & Winstein, C. J. (2004). Providing explicit information disrupts implicit motor learning after basal ganglia stroke. Learning and Memory, 11 (4), 388–396.
Brashers-Krug, T., Shadmehr, R., & Bizzi, E. (1996). Consolidation in human motor memory. Nature, 382 (6588), 252–255.

Brooks, D. N., & Baddeley, A. D. (1976). What can amnesic patients learn? Neuropsychologia, 14 (1), 111–122.
Caithness, G., Osu, R., Bays, P., Chase, H., Klassen, J., Kawato, M., et al. (2004). Failure to consolidate the consolidation theory of learning for sensorimotor adaptation tasks. Journal of Neuroscience, 24 (40), 8662–8671.
Cajal, S. R. (1894). The Croonian Lecture: La fine structure des centres nerveux. Proceedings of the Royal Society of London, 55, 444–468.
Carpenter, A. F., Georgopoulos, A. P., & Pellizzer, G. (1999). Motor cortical encoding of serial order in a context-recall task. Science, 283 (5408), 1752–1757.
Castaneda, B., & Gray, R. (2007). Effects of focus of attention on baseball batting performance in players of different skill levels. Journal of Sport & Exercise Psychology, 29, 60–77.
Cavaco, S., Anderson, S. W., Allen, J. S., Castro-Caldas, A., & Damasio, H. (2004). The scope of preserved procedural memory in amnesia. Brain, 127 (Pt 8), 1853–1867.
Chapman, H. L., Eramudugolla, R., Gavrilescu, M., Strudwick, M. W., Loftus, A., Cunnington, R., et al. (2010). Neural mechanisms underlying spatial realignment during adaptation to optical wedge prisms. Neuropsychologia, 48 (9), 2595–2601.
Clower, D. M., Hoffman, J. M., Votaw, J. R., Faber, T. L., Woods, R. P., & Alexander, G. E. (1996). Role of posterior parietal cortex in the recalibration of visually guided reaching. Nature, 383 (6601), 618–621.
Cohen, N. J., & Squire, L. R. (1980). Preserved learning and retention of pattern-analyzing skill in amnesia: Dissociation of knowing how and knowing that. Science, 210 (4466), 207–210.
Conley, D. L., & Krahenbuhl, G. S. (1980). Running economy and distance running performance of highly trained athletes. Medicine and Science in Sports and Exercise, 12 (5), 357–360.
Corkin, S. (1968). Acquisition of motor skill after bilateral medial temporal-lobe excision. Neuropsychologia, 6 (3), 255–265.
Cowan, N. (1995). Attention and memory: An integrated framework. Oxford Psychology Series (Vol. 26). New York: Oxford University Press.
Cowan, N. (2005). Working memory capacity. New York: Psychology Press.

Criscimagna-Hemminger, S. E., Bastian, A. J., & Shadmehr, R. (2010). Size of error affects cerebellar contributions to motor learning. Journal of Neurophysiology, 103 (4), 2275–2284.
Cunningham, H. A. (1989). Aiming error under transformed spatial mappings suggests a structure for visual-motor maps. Journal of Experimental Psychology: Human Perception and Performance, 15 (3), 493–506.
Curran, T., & Keele, S. W. (1993). Attentional and nonattentional forms of sequence learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19 (1), 189–202.
Danckert, J., Ferber, S., & Goodale, M. A. (2008). Direct effects of prismatic lenses on visuomotor control: An event-related functional MRI study. European Journal of Neuroscience, 28 (8), 1696–1704.
Debas, K., Carrier, J., Orban, P., Barakat, M., Lungu, O., Vandewalle, G., Hadj Tahar, A., Bellec, P., Karni, A., Ungerleider, L. G., Benali, H., & Doyon, J. (2010). Brain plasticity related to the consolidation of motor sequence learning and adaptation. Proceedings of the National Academy of Sciences U S A, 107 (41), 17839–17844.
Della-Maggiore, V., Scholz, J., Johansen-Berg, H., & Paus, T. (2009). The rate of visuomotor adaptation correlates with cerebellar white-matter microstructure. Human Brain Mapping, 30 (12), 4048–4053.
den Ouden, H. E., Daunizeau, J., Roiser, J., Friston, K. J., & Stephan, K. E. (2010). Striatal prediction error modulates cortical coupling. Journal of Neuroscience, 30 (9), 3210–3219.
Desmurget, M., & Turner, R. S. (2010). Motor sequences and the basal ganglia: Kinematics, not habits. Journal of Neuroscience, 30 (22), 7685–7690.
Desmurget, M., Vindras, P., Grea, H., Viviani, P., & Grafton, S. T. (2000). Proprioception does not quickly drift during visual occlusion. Experimental Brain Research, 134 (3), 363–377.
Destrebecqz, A., & Cleeremans, A. (2001). Can sequence learning be implicit? New evidence with the process dissociation procedure. Psychonomic Bulletin and Review, 8 (2), 343–350.
Destrebecqz, A., Peigneux, P., Laureys, S., Degueldre, C., Del Fiore, G., Aerts, J., et al. (2005). The neural correlates of implicit and explicit sequence learning: Interacting networks revealed by the process dissociation procedure. Learning and Memory, 12 (5), 480–490.
Diedrichsen, J., Hashambhoy, Y., Rane, T., & Shadmehr, R. (2005). Neural correlates of reach errors. Journal of Neuroscience, 25 (43), 9919–9931.

Diedrichsen, J., Shadmehr, R., & Ivry, R. B. (2009). The coordination of movement: Optimal feedback control and beyond. Trends in Cognitive Sciences, 14 (1), 31–39.
Diedrichsen, J., Verstynen, T., Lehman, S. L., & Ivry, R. B. (2005). Cerebellar involvement in anticipating the consequences of self-produced actions during bimanual movements. Journal of Neurophysiology, 93 (2), 801–812.
Donchin, O., Francis, J. T., & Shadmehr, R. (2003). Quantifying generalization from trial-by-trial behavior of adaptive systems that learn with basis functions: Theory and experiments in human motor control. Journal of Neuroscience, 23 (27), 9032–9045.
Donchin, O., Sawaki, L., Madupu, G., Cohen, L. G., & Shadmehr, R. (2002). Mechanisms influencing acquisition and recall of motor memories. Journal of Neurophysiology, 88 (4), 2114–2123.
Doyon, J., & Benali, H. (2005). Reorganization and plasticity in the adult brain during learning of motor skills. Current Opinion in Neurobiology, 15 (2), 161–167.
Doyon, J., Owen, A. M., Petrides, M., Sziklas, V., & Evans, A. C. (1996). Functional anatomy of visuomotor skill learning in human subjects examined with positron emission tomography. European Journal of Neuroscience, 8 (4), 637–648.
Eichenbaum, H., Fagan, A., Mathews, P., & Cohen, N. J. (1988). Hippocampal system dysfunction and odor discrimination learning in rats: Impairment or facilitation depending on representational demands. Behavioral Neuroscience, 102 (3), 331–339.
Eimer, M., Goschke, T., Schlaghecken, F., & Sturmer, B. (1996). Explicit and implicit learning of event sequences: evidence from event-related brain potentials. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22 (4), 970–987.
Ekstrome, R. B., French, J. W., Harman, H. H., et al. (1976). Manual for kit of factor referenced cognitive tests. Princeton, NJ: Educational Testing Service.
Enoka, R. M. (1988). Load- and skill-related changes in segmental contributions to a weightlifting movement. Medicine and Science in Sports and Exercise, 20 (2), 178–187.

Ericsson, K. A., Chase, W. G., & Faloon, S. (1980). Acquisition of a memory skill. Science, 208, 1181–1182.
Eversheim, U., & Bock, O. (2001). Evidence for processing stages in skill acquisition: A dual-task study. Learning and Memory, 8 (4), 183–189.
Falkenstein, M., Hohnsbein, J., & Hoormann, J. (1995). Event-related potential correlates of errors in reaction tasks. Electroencephalographic and Clinical Neurophysiology Supplement, 44, 287–296.

Falkenstein, M., Hohnsbein, J., Hoormann, J., & Blanke, L. (1991). Effects of crossmodal divided attention on late ERP components. II. Error processing in choice reaction tasks. Electroencephalography and Clinical Neurophysiology, 78 (6), 447–455.
Ferdinand, N. K., Mecklinger, A., & Kray, J. (2008). Error and deviance processing in implicit and explicit sequence learning. Journal of Cognitive Neuroscience, 20 (4), 629–642.
Fitts, P. M., & Posner, M. I. (1967). Human performance. Belmont, CA: Brooks/Cole.
Flanagan, J. R., Vetter, P., Johansson, R. S., & Wolpert, D. M. (2003). Prediction precedes control in motor learning. Current Biology, 13 (2), 146–150.
Fletcher, P. C., Zafiris, O., Frith, C. D., Honey, R. A., Corlett, P. R., Zilles, K., et al. (2005). On the benefits of not trying: Brain activity and connectivity reflecting the interactions of explicit and implicit sequence learning. Cerebral Cortex, 15 (7), 1002–1015.
Fregni, F., Boggio, P. S., Valle, A. C., Rocha, R. R., Duarte, J., Ferreira, M. J., Wagner, T., Fecteau, S., Rigonatti, S. P., Riberto, M., Freedman, S. D., & Pascual-Leone, A. (2006). A sham-controlled trial of a 5-day course of repetitive transcranial magnetic stimulation of the unaffected hemisphere in stroke patients. Stroke, 37, 2115–2122.
Fulton, J. F. (1935). A note on the definition of the “motor” and “premotor” areas. Brain, 58 (2), 311.
Gabrieli, J. D. (1998). Cognitive neuroscience of human memory. Annual Review of Psychology, 49, 87–115.
Gabrieli, J. D., Stebbins, G. T., Singh, J., Willingham, D. B., & Goetz, C. G. (1997). Intact mirror-tracing and impaired rotary-pursuit skill learning in patients with Huntington’s disease: Evidence for dissociable memory systems in skill learning. Neuropsychology, 11 (2), 272–281.
Galea, J. M., Sami, S. A., Albert, N. B., & Miall, R. C. (2010). Secondary tasks impair adaptation to step- and gradual-visual displacements. Experimental Brain Research, 202 (2), 473–484.
Garrison, K. A., Winstein, C. J., & Aziz-Zadeh, L. (2010). The mirror neuron system: A neural substrate for methods in stroke rehabilitation. Neurorehabilitation and Neural Repair, 24 (5), 404–412.
Gehring, W. J., Coles, M. G., Meyer, D. E., & Donchin, E. (1995). A brain potential manifestation of error-related processing. Electroencephalography and Clinical Neurophysiology Suppl, 44, 261–272.
Gehring, W. J., Goss, B., Coles, M. G. H., Meyer, D. E., & Donchin, E. (1993). A neural system for error detection and compensation. Psychological Science, 4 (6), 385–390.

Geisler, W. S. (1989). Sequential ideal-observer analysis of visual discriminations. Psychological Review, 96 (2), 267–314.
Gepshtein, S., Seydell, A., & Trommershauser, J. (2007). Optimality of human movement under natural variations of visual-motor uncertainty. Journal of Vision, 7 (5), 13 11–18.
Ghilardi, M. F., Gordon, J., & Ghez, C. (1995). Learning a visuomotor transformation in a local area of work space produces directional biases in other areas. Journal of Neurophysiology, 73 (6), 2535–2539.
Ghilardi, M. F., Moisello, C., Silvestri, G., Ghez, C., & Krakauer, J. W. (2009). Learning of a sequential motor skill comprises explicit and implicit components that consolidate differently. Journal of Neurophysiology, 101 (5), 2218–2229.
Goldman-Rakic, P. (1987). Circuitry of primate prefrontal cortex and regulation of behavior by representational memory. In F. Plum (Ed.), Handbook of physiology (pp. 373–417). Washington, DC: American Psychological Society.
Gomi, H. (2008). Implicit online corrections of reaching movements. Current Opinion in Neurobiology, 18 (6), 558–564.
Grafton, S. T., & Hamilton, A. F. (2007). Evidence for a distributed hierarchy of action representation in the brain. Human Movement Science, 26 (4), 590–616.
Grafton, S. T., Mazziotta, J. C., Presty, S., Friston, K. J., Frackowiak, R. S., & Phelps, M. E. (1992). Functional anatomy of human procedural learning determined with regional cerebral blood flow and PET. Journal of Neuroscience, 12 (7), 2542–2548.
Grafton, S. T., Mazziotta, J. C., Woods, R. P., & Phelps, M. E. (1992). Human functional anatomy of visually guided finger movements. Brain, 115 (Pt 2), 565–587.
Grafton, S. T., Woods, R. P., Mazziotta, J. C., & Phelps, M. E. (1991). Somatotopic mapping of the primary motor cortex in humans: Activation studies with cerebral blood flow and positron emission tomography. Journal of Neurophysiology, 66 (3), 735–743.
Graydon, F., Friston, K., Thomas, C., Brooks, V., & Menon, R. (2005). Learning-related fMRI activation associated with a rotational visuo-motor transformation. Brain Research: Cognitive Brain Research, 22 (3), 373–383.
Green, T. D., & Flowers, J. H. (1991). Implicit versus explicit learning processes in a probabilistic, continuous fine-motor catching task. Journal of Motor Behavior, 23 (4), 293–300.
Gwin, J. T., Gramann, K., Makeig, S., & Ferris, D. P. (2010). Removal of movement artifact from high-density EEG recorded during walking and running. Journal of Neurophysiology, 103 (6), 3526–3534.
Gwin, J. T., Gramann, K., Makeig, S., & Ferris, D. P. (2011). Electrocortical activity is coupled to gait cycle phase during treadmill walking. NeuroImage, 54 (2), 1289–1296.

Hebb, D. O. (1949). The organization of behavior. New York: Wiley & Sons.
Heindel, W. C., Salmon, D. P., Shults, C. W., Walicke, P. A., & Butters, N. (1989). Neuropsychological evidence for multiple implicit memory systems: A comparison of Alzheimer’s, Huntington’s, and Parkinson’s disease patients. Journal of Neuroscience, 9 (2), 582–587.
Helmholtz, H. V. (1867). Handbuch der physiologischen Optik. Leipzig: Voss.
Henke, K. (2010). A model for memory systems based on processing modes rather than consciousness. Nature Reviews Neuroscience, 11 (7), 523–532.
Hikosaka, O., & Isoda, M. (2010). Switching from automatic to controlled behavior: cortico-basal ganglia mechanisms. Trends in Cognitive Sciences, 14 (4), 154–161.
Hikosaka, O., Nakahara, H., Rand, M. K., Sakai, K., Lu, X., Nakamura, K., et al. (1999). Parallel neural networks for learning and sequential procedures. Trends in Neurosciences, 22, 464–471.

Holt, K. G., Hamill, J., & Andres, R. O. (1991). Predicting the minimal energy costs of human walking. Medicine and Science in Sports and Exercise, 23 (4), 491.
Honda, M., Deiber, M. P., Ibanez, V., Pascual-Leone, A., Zhuang, P., & Hallett, M. (1998). Dynamic cortical involvement in implicit and explicit motor sequence learning: A PET study. Brain, 121 (Pt 11), 2159–2173.
Huang, V. S., & Shadmehr, R. (2009). Persistence of motor memories reflects statistics of the learning event. Journal of Neurophysiology, 102 (2), 931–940.
Hwang, E. J., Smith, M. A., & Shadmehr, R. (2006). Dissociable effects of the implicit and explicit memory systems on learning control of reaching. Experimental Brain Research, 173 (3), 425–437.
Imamizu, H., Kuroda, T., Miyauchi, S., Yoshioka, T., & Kawato, M. (2003). Modular organization of internal models of tools in the human cerebellum. Proceedings of the National Academy of Sciences U S A, 100 (9), 5461–5466.
Imamizu, H., Kuroda, T., Yoshioka, T., & Kawato, M. (2004). Functional magnetic resonance imaging examination of two modular architectures for switching multiple internal models. Journal of Neuroscience, 24 (5), 1173–1181.
Imamizu, H., Miyauchi, S., Tamada, T., Sasaki, Y., Takino, R., Putz, B., et al. (2000). Human cerebellar activity reflecting an acquired internal model of a new tool. Nature, 403 (6766), 192–195.
Isoda, M., & Hikosaka, O. (2007). Switching from automatic to controlled action by monkey medial frontal cortex. Nature Neuroscience, 10 (2), 240–248.

Ito, M. (2002). Historical review of the significance of the cerebellum and the role of Purkinje cells in motor learning. Annals of the New York Academy of Science, 978, 273–288.
James, W. (1890). The principles of psychology. New York: H. Holt.
Jankowski, J., Scheef, L., Hüppe, C., & Boecker, H. (2009). Distinct striatal regions for planning and executing novel and automated movement sequences. NeuroImage, 44 (4), 1369–1379.
Joiner, W. M., & Smith, M. A. (2008). Long-term retention explained by a model of short-term learning in the adaptive control of reaching. Journal of Neurophysiology, 100 (5), 2948–2955.
Jones, A. M. (1998). A five year physiological case study of an Olympic runner. British Journal of Sports Medicine, 32 (1), 39–43.
Jonides, J., Lewis, R. L., Nee, D. E., Lustig, C. A., Berman, M. G., & Moore, K. S. (2008). The mind and brain of short-term memory. Annual Review of Psychology, 59, 193–224.
Jonides, J., Smith, E. E., Koeppe, R. A., Awh, E., Minoshima, S., & Mintun, M. A. (1993). Spatial working memory in humans as revealed by PET. Nature, 363, 623–625.
Jordan, K., Heinze, H. J., Lutz, K., Kanowski, M., & Jancke, L. (2001). Cortical activations during the mental rotation of different visual objects. NeuroImage, 13, 143–152.
Jueptner, M., Frith, C. D., Brooks, D. J., Frackowiak, R. S., & Passingham, R. E. (1997). Anatomy of motor learning. II. Subcortical structures and learning by trial and error. Journal of Neurophysiology, 77 (3), 1325–1337.
Jueptner, M., Stephan, K. M., Frith, C. D., Brooks, D. J., Frackowiak, R. S., & Passingham, R. E. (1997). Anatomy of motor learning. I. Frontal cortex and attention to action. Journal of Neurophysiology, 77 (3), 1313–1324.
Karabanov, A., Cervenka, S., de Manzano, O., Forssberg, H., Farde, L., & Ullen, F. (2010). Dopamine D2 receptor density in the limbic striatum is related to implicit but not explicit movement sequence learning. Proceedings of the National Academy of Sciences U S A, 107 (16), 7574–7579.
Karni, A., Meyer, G., Jezzard, P., Adams, M. M., Turner, R., & Ungerleider, L. G. (1995). Functional MRI evidence for adult motor cortex plasticity during motor skill learning. Nature, 377 (6545), 155–158.
Karni, A., Meyer, G., Rey-Hipolito, C., Jezzard, P., Adams, M. M., Turner, R., et al. (1998). The acquisition of skilled motor performance: Fast and slow experience-driven changes in primary motor cortex. Proceedings of the National Academy of Sciences U S A, 95 (3), 861–868.

Kawato, M. (1999). Internal models for motor control and trajectory planning. Current Opinion in Neurobiology, 9 (6), 718–727.
Keane, M. M., Orlando, F., & Verfaellie, M. (2006). Increasing the salience of fluency cues reduces the recognition memory impairment in amnesia. Neuropsychologia, 44 (5), 834–839.
Kim, S. G., Ashe, J., Georgopoulos, A. P., Merkle, H., Ellermann, J. M., Menon, R. S., et al. (1993). Functional imaging of human motor cortex at high magnetic field. Journal of Neurophysiology, 69 (1), 297–302.
Kleider, H. M., & Goldinger, S. D. (2004). Illusions of face memory: Clarity breeds familiarity. Journal of Memory and Language, 50 (2), 196–211.
Koedijker, J. M., Oudejans, R. R. D., & Beek, P. J. (2008). Table tennis performance, following explicit and analogy learning over 10,000 repetitions. International Journal of Sport Psychology, 39, 237–256.
Körding, K. P., & Wolpert, D. M. (2004). Bayesian integration in sensorimotor learning. Nature, 427 (6971), 244–247.
Krakauer, J. W., Ghez, C., & Ghilardi, M. F. (2005). Adaptation to visuomotor transformations: Consolidation, interference, and forgetting. Journal of Neuroscience, 25 (2), 473–478.
Krakauer, J. W., Pine, Z. M., Ghilardi, M. F., & Ghez, C. (2000). Learning of visuomotor transformations for vectorial planning of reaching trajectories. Journal of Neuroscience, 20 (23), 8916–8924.
Krakauer, J. W., & Shadmehr, R. (2006). Consolidation of motor memory. Trends in Neurosciences, 29 (1), 58–64.
Krigolson, O. E., & Holroyd, C. B. (2006). Evidence for hierarchical error processing in the human brain. Neuroscience, 137 (1), 13–17.
Krigolson, O. E., & Holroyd, C. B. (2007a). Hierarchical error processing: different errors, different systems. Brain Research, 1155, 70–80.
Krigolson, O. E., & Holroyd, C. B. (2007b). Predictive information and error processing: The role of medial-frontal cortex during motor control. Psychophysiology, 44 (4), 586–595.
Krigolson, O. E., Holroyd, C. B., Van Gyn, G., & Heath, M. (2008). Electroencephalographic correlates of target and outcome errors. Experimental Brain Research, 190 (4), 401–411.
Lalazar, H., & Vaadia, E. (2008). Neural basis of sensorimotor learning: Modifying internal models. Current Opinion in Neurobiology, 18 (6), 573–581.

Li, C. S., Padoa-Schioppa, C., & Bizzi, E. (2001). Neuronal correlates of motor performance and motor learning in the primary motor cortex of monkeys adapting to an external force field. Neuron, 30 (2), 593–607.
Liao, C. M., & Masters, R. S. W. (2001). Analogy learning: a means to implicit motor learning. Journal of Sports Sciences, 19, 307–319.

Lieberman, M. D., Chang, G. Y., Chiao, J., Bookheimer, S. Y., & Knowlton, B. J. (2004). An event-related fMRI study of artificial grammar learning in a balanced chunk strength design. Journal of Cognitive Neuroscience, 16 (3), 427–438.
Logie, R. H., Della Sala, S., Beschin, N., & Denis, M. (2005). Dissociating mental transformations and visuo-spatial storage in working memory: Evidence from representational neglect. Memory, 13, 430–434.
Lu, X., & Ashe, J. (2005). Anticipatory activity in primary motor cortex codes memorized movement sequences. Neuron, 45 (6), 967–973.
Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and conjunctions. Nature, 390, 279–281.
Luo, J., & Niki, K. (2005). Does hippocampus associate discontiguous events? Evidence from event-related fMRI. Hippocampus, 15 (2), 141–148.
Malone, L. A., & Bastian, A. J. (2010). Thinking about walking: Effects of conscious correction versus distraction on locomotor adaptation. Journal of Neurophysiology, 103 (4), 1954–1962.
Maloney, L. T., & Mamassian, P. (2009). Bayesian decision theory as a model of human visual perception: Testing Bayesian transfer. Visual Neuroscience, 26 (1), 147–155.
Martin, S. J., Grimwood, P. D., & Morris, R. G. (2000). Synaptic plasticity and memory: An evaluation of the hypothesis. Annual Review of Neuroscience, 23, 649–711.
Maschke, M., Gomez, C. M., Ebner, T. J., & Konczak, J. (2004). Hereditary cerebellar ataxia progressively impairs force adaptation during goal-directed arm movements. Journal of Neurophysiology, 91 (1), 230–238.
Matsumura, M., Sadato, N., Kochiyama, T., Nakamura, S., Naito, E., Matsunami, K., et al. (2004). Role of the cerebellum in implicit motor skill learning: a PET study. Brain Research Bulletin, 63 (6), 471–483.
Matsuzaka, Y., Picard, N., & Strick, P. L. (2007). Skill representation in the primary motor cortex after long-term practice. Journal of Neurophysiology, 97 (2), 1819–1832.
Mazzoni, P., & Krakauer, J. W. (2006). An implicit plan overrides an explicit strategy during visuomotor adaptation. Journal of Neuroscience, 26 (14), 3642–3645.


Mazzoni, P., & Wexler, N. S. (2009). Parallel explicit and implicit control of reaching. PLoS One, 4 (10), e7557.
McNitt-Gray, J. L., Requejo, P. S., & Flashner, H. (2006). Multijoint control strategies transfer between tasks. Biological Cybernetics, 94 (6), 501–510.
Miall, R. C., Christensen, L. O., Cain, O., & Stanley, J. (2007). Disruption of state estimation in the human lateral cerebellum. PLoS Biology, 5 (11), e316.
Miall, R. C., Weir, D. J., Wolpert, D. M., & Stein, J. F. (1993). Is the cerebellum a Smith predictor? Journal of Motor Behavior, 25 (3), 203–216.
Miyake, A., & Shah, P. (1999). Models of working memory: Mechanisms of active maintenance and executive control. New York: Cambridge University Press.
Montaldi, D., Mayes, A. R., Barnes, A., Pirie, H., Hadley, D. M., Patterson, J., et al. (1998). Associative encoding of pictures activates the medial temporal lobes. Human Brain Mapping, 6 (2), 85–104.
Morton, S. M., & Bastian, A. J. (2006). Cerebellar contributions to locomotor adaptations during splitbelt treadmill walking. Journal of Neuroscience, 26 (36), 9107–9116.
Nelson, W. L. (1983). Physical principles for economies of skilled movements. Biological Cybernetics, 46 (2), 135–147.
Newell, K. M. (1991). Motor skill acquisition. Annual Review of Psychology, 42, 213–237.
Nishimoto, R., & Tani, J. (2009). Development of hierarchical structures for actions and motor imagery: a constructivist view from synthetic neuro-robotics study. Psychological Research, 73 (4), 545–558.
Nissen, M. J., & Bullemer, P. (1987). Attentional requirements of learning: Evidence from performance measures. Cognitive Psychology, 19 (1), 1–32.
Ohbayashi, M., Ohki, K., & Miyashita, Y. (2003). Conversion of working memory to motor sequence in the monkey premotor cortex. Science, 301, 233–236.
Overduin, S. A., Richardson, A. G., Lane, C. E., Bizzi, E., & Press, D. Z. (2006). Intermittent practice facilitates stable motor memories. Journal of Neuroscience, 26 (46), 11888–11892.
Packard, M. G., Hirsh, R., & White, N. M. (1989). Differential effects of fornix and caudate nucleus lesions on two radial maze tasks: evidence for multiple memory systems. Journal of Neuroscience, 9 (5), 1465–1472.
Paine, R. W., & Tani, J. (2004). Motor primitive and sequence self-organization in a hierarchical recurrent neural network. Neural Networks, 17 (8-9), 1291–1309.


Pascual-Leone, A., Brasil-Neto, J. P., Valls-Sole, J., Cohen, L. G., & Hallett, M. (1992). Simple reaction time to focal transcranial magnetic stimulation: Comparison with reaction time to acoustic, visual and somatosensory stimuli. Brain, 115 (Pt 1), 109–122.
Pascual-Leone, A., Grafman, J., Clark, K., Stewart, M., Massaquoi, S., Lou, J. S., et al. (1993). Procedural learning in Parkinson’s disease and cerebellar degeneration. Annals of Neurology, 34 (4), 594–602.
Pascual-Leone, A., Grafman, J., & Hallett, M. (1994). Modulation of cortical motor output maps during development of implicit and explicit knowledge. Science, 263 (5151), 1287–1289.
Pascual-Leone, A., & Torres, F. (1993). Plasticity of the sensorimotor cortex representation of the reading finger in Braille readers. Brain, 116 (Pt 1), 39–52.
Pascual-Leone, A., Valls-Sole, J., Wassermann, E. M., Brasil-Neto, J., Cohen, L. G., & Hallett, M. (1992). Effects of focal transcranial magnetic stimulation on simple reaction time to acoustic, visual and somatosensory stimuli. Brain, 115 (Pt 4), 1045–1059.
Pascual-Leone, A., Wassermann, E. M., Grafman, J., & Hallett, M. (1996). The role of the dorsolateral prefrontal cortex in implicit procedural learning. Experimental Brain Research, 107, 479–485.
Paz, R., Boraud, T., Natan, C., Bergman, H., & Vaadia, E. (2003). Preparatory activity in motor cortex reflects learning of local visuomotor skills. Nature Neuroscience, 6 (8), 882–890.
Poldrack, R. A., Clark, J., Pare-Blagoev, E. J., Shohamy, D., Creso Moyano, J., Myers, C., et al. (2001). Interactive memory systems in the human brain. Nature, 414 (6863), 546–550.
Poldrack, R. A., Prabhakaran, V., Seger, C. A., & Gabrieli, J. D. (1999). Striatal activation during acquisition of a cognitive skill. Neuropsychology, 13 (4), 564–574.
Polyakov, F., Drori, R., Ben-Shaul, Y., Abeles, M., & Flash, T. (2009). A compact representation of drawing movements with sequences of parabolic primitives. PLoS Computational Biology, 5 (7), e1000427.
Ralston, H. (1958). Energy-speed relation and optimal speed during level walking. European Journal of Applied Physiology and Occupational Physiology, 17 (4), 277–283.
Ramnani, N. (2006). The primate cortico-cerebellar system: anatomy and function. Nature Reviews Neuroscience, 7 (7), 511–522.
Reber, A. S. (1967). Implicit learning of artificial grammars. Journal of Verbal Learning and Verbal Behavior, 6 (6), 855–863.
Reber, A. S. (1976). Implicit learning of synthetic languages: The role of instructional set. Journal of Experimental Psychology: Human Learning and Memory, 2 (1), 88–94.

Reis, J., Schambra, H., Cohen, L. G., Buch, E. R., Fritsch, B., Zarahn, E., Celnik, P. A., & Krakauer, J. W. (2009). Noninvasive cortical stimulation enhances motor skill acquisition over multiple days through an effect on consolidation. Proceedings of the National Academy of Sciences U S A, 106 (5), 1590–1595.
Remy, F., Wenderoth, N., Lipkens, K., & Swinnen, S. P. (2010). Dual-task interference during initial learning of a new motor task results from competition for the same brain areas. Neuropsychologia, 48 (9), 2517–2527.
Reuter-Lorenz, P. A., Jonides, J., Smith, E. E., Hartley, A., Miller, A., Marshuetz, C., et al. (2000). Age differences in the frontal lateralization of verbal and spatial working memory revealed by PET. Journal of Cognitive Neuroscience, 12, 174–187.
Ridderinkhof, K. R., Ullsperger, M., Crone, E. A., & Nieuwenhuis, S. (2004). The role of the medial frontal cortex in cognitive control. Science, 306 (5695), 443–447.
Robertson, E. M., Pascual-Leone, A., & Miall, R. C. (2004). Current concepts in procedural consolidation. Nature Reviews Neuroscience, 5 (7), 576–582.
Robertson, E. M., Tormos, J. M., Maeda, F., & Pascual-Leone, A. (2001). The role of the dorsolateral prefrontal cortex during sequence learning is specific for spatial information. Cerebral Cortex, 11, 628–635.
Russeler, J., Kuhlicke, D., & Munte, T. F. (2003). Human error monitoring during implicit and explicit learning of a sensorimotor sequence. Neuroscience Research, 47 (2), 233–240.
Sadeh, T., Shohamy, D., Levy, D. R., Reggev, N., & Maril, A. (2011). Cooperation between the hippocampus and the striatum during episodic encoding. Journal of Cognitive Neuroscience, 23, 1597–1608.
Sakai, K., Hikosaka, O., Miyauchi, S., Takino, R., Sasaki, Y., & Putz, B. (1998). Transition of brain activation from frontal to parietal areas in visuomotor sequence learning. Journal of Neuroscience, 18 (5), 1827–1840.
Schack, T., & Mechsner, F. (2006). Representation of motor skills in human long-term memory. Neuroscience Letters, 391 (3), 77–81.
Scheidt, R. A., Dingwell, J. B., & Mussa-Ivaldi, F. A. (2001). Learning to move amid uncertainty. Journal of Neurophysiology, 86 (2), 971–985.
Schendan, H. E., Searl, M. M., Melrose, R. J., & Stern, C. E. (2003). An FMRI study of the role of the medial temporal lobe in implicit and explicit sequence learning. Neuron, 37 (6), 1013–1025.
Schmidt, R. A. (1975). A schema theory of discrete motor learning. Psychological Review, 82 (4), 225–260.


Schneider, S. A., Wilkinson, L., Bhatia, K. P., Henley, S. M., Rothwell, J. C., Tabrizi, S. J., et al. (2010). Abnormal explicit but normal implicit sequence learning in premanifest and early Huntington’s disease. Movement Disorders, 25 (10), 1343–1349.
Scoville, W. B., & Milner, B. (1957). Loss of recent memory after bilateral hippocampal lesions. Journal of Neurology, Neurosurgery, and Psychiatry, 20 (1), 11–21.
Seger, C. A., & Cincotta, C. M. (2006). Dynamics of frontal, striatal, and hippocampal systems during rule learning. Cerebral Cortex, 16 (11), 1546–1555.
Seidler, R. D. (2010). Neural correlates of motor learning, transfer of learning, and learning to learn. Exercise and Sport Sciences Review, 38 (1), 3–9.
Seidler, R. D., & Noll, D. C. (2008). Neuroanatomical correlates of motor acquisition and motor transfer. Journal of Neurophysiology, 99, 1836–1845.
Seidler, R. D., Noll, D. C., & Chintalapati, P. (2006). Bilateral basal ganglia activation associated with sensorimotor adaptation. Experimental Brain Research, 175 (3), 544–555.
Seidler, R. D., Purushotham, A., Kim, S. G., Ugurbil, K., Willingham, D., & Ashe, J. (2002). Cerebellum activation associated with performance change but not motor learning. Science, 296 (5575), 2043–2046.
Seitz, R. J., & Roland, P. E. (1992). Learning of sequential finger movements in man: A combined kinematic and positron emission tomography (PET) study. European Journal of Neuroscience, 4, 154–165.
Seydell, A., McCann, B. C., Trommershauser, J., & Knill, D. C. (2008). Learning stochastic reward distributions in a speeded pointing task. Journal of Neuroscience, 28 (17), 4356–4367.
Shadmehr, R., & Brashers-Krug, T. (1997). Functional stages in the formation of human long-term motor memory. Journal of Neuroscience, 17 (1), 409–419.
Shadmehr, R., Smith, M. A., & Krakauer, J. W. (2010). Error correction, sensory prediction, and adaptation in motor control. Annual Review of Neuroscience, 33, 89–108.
Shah, P., & Miyake, A. (1996). The separability of working memory resources for spatial thinking and language processing: An individual differences approach. Journal of Experimental Psychology: General, 125 (1), 4–27.
Shohamy, D., & Wagner, A. D. (2008). Integrating memories in the human brain: hippocampal-midbrain encoding of overlapping events. Neuron, 60 (2), 378–389.
Smith, E. E., Jonides, J., & Koeppe, R. A. (1996). Dissociating verbal and spatial working memory: PET investigations. Journal of Cognitive Neuroscience, 7, 337–356.
Smith, M. A., Ghazizadeh, A., & Shadmehr, R. (2006). Interacting adaptive processes with different timescales underlie short-term motor learning. PLoS Biology, 4 (6), e179.

Smith, M. A., & Shadmehr, R. (2005). Intact ability to learn internal models of arm dynamics in Huntington’s disease but not cerebellar degeneration. Journal of Neurophysiology, 93 (5), 2809–2821.
Song, S., Howard, J. H., Jr., & Howard, D. V. (2007). Implicit probabilistic sequence learning is independent of explicit awareness. Learning and Memory, 14 (3), 167–176.
Spencer, R. M., & Ivry, R. B. (2009). Sequence learning is preserved in individuals with cerebellar degeneration when the movements are directly cued. Journal of Cognitive Neuroscience, 21, 1302–1310.
Squire, L. R. (1992). Declarative and nondeclarative memory: multiple brain systems supporting learning and memory. Journal of Cognitive Neuroscience, 4 (3), 232–243.
Staresina, B. P., & Davachi, L. (2009). Mind the gap: Binding experiences across space and time in the human hippocampus. Neuron, 63 (2), 267–276.
Stefan, K., Wycislo, M., Gentner, R., Schramm, A., Naumann, M., Reiners, K., et al. (2006). Temporary occlusion of associative motor cortical plasticity by prior dynamic motor training. Cerebral Cortex, 16 (3), 376–385.
Sulzenbruck, S., & Heuer, H. (2009). Functional independence of explicit and implicit motor adjustments. Consciousness and Cognition, 18 (1), 145–159.
Taylor, J., Klemfuss, N., & Ivry, R. (2010). An explicit strategy prevails when the cerebellum fails to compute movement errors. Cerebellum, 9, 580–586.
Taylor, J. A., & Thoroughman, K. A. (2007). Divided attention impairs human motor adaptation but not feedback control. Journal of Neurophysiology, 98 (1), 317–326.
Taylor, J. A., & Thoroughman, K. A. (2008). Motor adaptation scaled by the difficulty of a secondary cognitive task. PLoS One, 3 (6), e2485.
Thoroughman, K. A., & Shadmehr, R. (2000). Learning of action through adaptive combination of motor primitives. Nature, 407 (6805), 742–747.
Toni, I., Krams, M., Turner, R., & Passingham, R. E. (1998). The time course of changes during motor sequence learning: A whole-brain fMRI study. NeuroImage, 8 (1), 50–61.
Trommershäuser, J., Maloney, L. T., & Landy, M. S. (2008). Decision making, movement planning and statistical decision theory. Trends in Cognitive Sciences, 12 (8), 291–297.
Tseng, Y. W., Diedrichsen, J., Krakauer, J. W., Shadmehr, R., & Bastian, A. J. (2007). Sensory prediction errors drive cerebellum-dependent adaptation of reaching. Journal of Neurophysiology, 98 (1), 54–62.
Tunney, R. J., & Fernie, G. (2007). Repetition priming affects guessing not familiarity. Behavioral and Brain Functions, 3, 40.

Verwey, W. B. (1996). Buffer loading and chunking in sequential keypressing. Journal of Experimental Psychology: Human Perception and Performance, 22, 544–562.
Verwey, W. B. (2001). Concatenating familiar movement sequences: The versatile cognitive processor. Acta Psychologica, 106, 69–95.
Vidoni, E. D., & Boyd, L. A. (2007). Achieving enlightenment: What do we know about the implicit learning system and its interaction with explicit knowledge? Journal of Neurologic Physical Therapy, 31 (3), 145–154.
Vogel, E. K., & Machizawa, M. G. (2004). Neural activity predicts individual differences in visual working memory capacity. Nature, 428, 748–751.
Volle, E., Kinkingnéhun, S., Pochon, J. B., Mondon, K., Thiebaut de Schotten, M., Seassau, M., et al. (2008). The functional architecture of the left posterior and lateral prefrontal cortex in humans. Cerebral Cortex, 18 (10), 2460–2469.
Voss, J. L., Baym, C. L., & Paller, K. A. (2008). Accurate forced-choice recognition without awareness of memory retrieval. Learning and Memory, 15 (6), 454–459.
Voss, J. L., & Paller, K. A. (2008). Brain substrates of implicit and explicit memory: The importance of concurrently acquired neural signals of both memory types. Neuropsychologia, 46 (13), 3021–3029.
Weiner, M. J., Hallett, M., & Funkenstein, H. H. (1983). Adaptation to lateral displacement of vision in patients with lesions of the central nervous system. Neurology, 33 (6), 766–772.
Werner, S., & Bock, O. (2007). Effects of variable practice and declarative knowledge on sensorimotor adaptation to rotated visual feedback. Experimental Brain Research, 178 (4), 554–559.
Werner, S., Bock, O., Gizewski, E. R., Schoch, B., & Timmann, D. (2010). Visuomotor adaptive improvement and aftereffects are impaired differentially following cerebellar lesions in SCA and PICA territory. Experimental Brain Research, 201 (3), 429–439.
Wilkinson, L., Khan, Z., & Jahanshahi, M. (2009). The role of the basal ganglia and its cortical connections in sequence learning: evidence from implicit and explicit sequence learning in Parkinson’s disease. Neuropsychologia, 47 (12), 2564–2573.
Willingham, D. B. (1998). A neuropsychological theory of motor skill learning. Psychological Review, 105 (3), 558–584.
Willingham, D. B. (2001). Becoming aware of motor skill. Trends in Cognitive Sciences, 5 (5), 181–182.


Willingham, D. B., Salidis, J., & Gabrieli, J. D. (2002). Direct comparison of neural systems mediating conscious and unconscious skill learning. Journal of Neurophysiology, 88 (3), 1451–1460.
Wise, S. P., Moody, S. L., Blomstrom, K. J., & Mitz, A. R. (1998). Changes in motor cortical activity during visuomotor adaptation. Experimental Brain Research, 121, 285–299.
Wolpert, D. M., & Flanagan, J. R. (2010). Motor learning. Current Biology, 20 (11), R467–R472.
Wolpert, D. M., & Miall, R. C. (1996). Forward models for physiological motor control. Neural Networks, 9 (8), 1265–1279.
Yamamoto, T., & Fujinami, T. (2008). Hierarchical organization of the coordinative structure of the skill of clay kneading. Human Movement Science, 27 (5), 812–822.

Rachael D. Seidler, Dept. of Psychology, School of Kinesiology, Neuroscience Program, University of Michigan, Ann Arbor, MI

Bryan L. Benson, Department of Psychology, School of Kinesiology, University of Michigan, Ann Arbor, MI

Nathaniel B. Boyden, Department of Psychology, University of Michigan, Ann Arbor, MI

Youngbin Kwak, Neuroscience Program, University of Michigan, Ann Arbor, MI


Memory Consolidation

John Wixted and Denise J. Cai
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0021

Abstract and Keywords

Memory consolidation is a multifaceted concept. At a minimum, it refers to both cellular consolidation and systems consolidation. Cellular consolidation takes place in the hours after learning, stabilizing the memory trace, a process that may involve structural changes in hippocampal neurons. Systems consolidation refers to a more protracted process by which memories become independent of the hippocampus as they are established in cortical neurons, a process that may involve neural replay. Both forms of consolidation may preferentially unfold whenever the hippocampus is not encoding new information, although some theories hold that consolidation occurs exclusively during sleep. In recent years, the notion of reconsolidation has been added to the mix. According to this idea, previously consolidated memories, when later retrieved, undergo consolidation all over again. With new findings coming to light seemingly every day, the concept of consolidation will likely evolve in interesting and unpredictable ways in the years to come.

Keywords: cellular consolidation, systems consolidation, reconsolidation, sleep and consolidation

The idea that memories require time to consolidate has a long history, but the understanding of what consolidation means has evolved over time. In 1900, the German experimental psychologists Georg Müller and Alfons Pilzecker published a monograph in which a new theory of memory and forgetting was proposed, one that included, for the first time, a role for consolidation. Their basic method involved asking subjects to study a list of paired-associate nonsense syllables and then testing their memory using cued recall after a delay of several minutes. Typically, some of the list items were forgotten, and to investigate why that occurred, Müller and Pilzecker (1900) presented subjects with a second, interfering list of items to study before memory for the target list was tested. They found that this interpolated list reduced memory for the target list compared with a control group that was not exposed to any intervening activity. Critically, the position of the interfering list within the retention interval mattered such that interference occurring soon after learning had a more disruptive effect than interference occurring later in the retention interval. This led them to propose that memories require time to consolidate and that retroactive interference is a force that compromises the integrity of recently formed (and not-yet-consolidated) memories. In this chapter, we review the major theories of consolidation, beginning with the still-relevant account proposed by Müller and Pilzecker (1900), and we consider a variety of recent developments in what has become a rapidly evolving field.

The Early View: Consolidation and Resistance to Interference

According to Müller and Pilzecker’s (1900) view, consolidation consists of “trace hardening” (cf. Wickelgren, 1974) in the sense that some physiological process perseverates and eventually renders the memory trace less vulnerable to interference caused by new learning. The kind of interference that a consolidated trace theoretically resists differs from the kind of interference that most experimental psychologists have in mind when they study forgetting. In the field of experimental psychology, new learning has long been thought to generate interference by creating competing associations linked to a retrieval cue, not by affecting the integrity of a fragile memory trace (e.g., Keppel, 1968; Underwood, 1957; Watkins & Watkins, 1975). Traditionally, this kind of interference has been investigated using an A-B, A-C paired-associates paradigm in which the same cue words (the A items) are paired with different to-be-remembered target words across two lists (the B and C items, respectively). In a standard retroactive interference paradigm, for example, the memory test consists of presenting the A items and asking participants to recall the B items. Having learned the A-C associations after learning the A-B associations, the ability of participants to recall the B items is typically impaired, and this impairment is usually assumed to reflect retrieval competition from the C items. The powerful effect of this kind of “cue overload” interference on retention has been well established by decades of psychological research, but it is almost certainly not the only kind of interference that causes forgetting.

The kind of interference envisioned by Müller and Pilzecker (1900) does not involve overloading a retrieval cue but instead involves directly compromising the integrity of a partially consolidated memory trace.
In what even today seems like a radical notion to many experimental psychologists, Müller and Pilzecker (1900) assumed that the interference was nonspecific in the sense that the interfering material did not have to be similar to the originally memorized material for interference to occur. Instead, mental exertion of any kind was thought to be the interfering force (Lechner et al., 1999). “Mental exertion” is a fairly vague concept, and Wixted (2004a) suggested that the kind of intervening mental exertion that Müller and Pilzecker (1900) probably had in mind consists specifically of new learning. The basic idea is that new learning, per se, serves as an interfering force that degrades recently formed and still fragile memory traces.

Loosely speaking, it can be said that Müller and Pilzecker (1900) believed that the memory trace becomes strengthened by the process of consolidation. However, there is more than one way that a trace can become stronger, so it is important to keep in mind which meaning of a “stronger memory trace” applies in any discussion of consolidation. One way that a trace might become stronger is that it comes to more accurately reflect past experience than it did when it was first formed, much like a snapshot taken from a Polaroid camera comes into sharper focus over time. A trace that consolidated in this manner would yield an ever-clearer memory of the encoding event in response to the same retrieval cue. Another way that a trace might become stronger is that it becomes ever more likely to spring to mind in response to a retrieval cue (even more likely than it was when the memory was first formed). A memory trace that consolidated in either of these two ways would support a higher level of performance than it did at the end of training, as if additional learning occurred despite the absence of additional training.

Still another way that a trace can become stronger is that it becomes hardened against the destructive forces of interference. A trace that hardens over time (i.e., a trace that consolidates in that sense) may simultaneously become degraded over time due to the interfering force of new learning or to some other force of decay. As an analogy, a clay replica of the Statue of Liberty will be at its finest when it has just been completed and the clay is still wet, but it will also be at its most vulnerable. With the passage of time, however, the statue dries and becomes more resistant to damage even though it may now be a less accurate replica than it once was (because of the damage that occurred before the clay dried). Müller and Pilzecker’s (1900) original view of consolidation, which was later elaborated by Wickelgren (1974), was analogous to this.
That is, the consolidation process was not thought to render the trace more representative of past experience or to render it more likely to come to mind than it was at the time of formation; instead, consolidation was assumed to render the trace (or its association with a retrieval cue) more resistant to interference even while the integrity of the trace was gradually being compromised by interference.

These considerations suggest a relationship between Müller and Pilzecker’s (1900) view of consolidation and the time course of forgetting. More specifically, the fact that a memory trace hardens in such a way as to become increasingly resistant to interference even as the trace fades may help to explain the general shape of the forgetting function (Wixted, 2004b). Since the seminal work of Ebbinghaus (1885), a consistent body of evidence has indicated that the proportional rate of forgetting is rapid at first and then slows to a point at which almost no further forgetting occurs. This general property is captured by the power law of forgetting (Anderson & Schooler, 1991; Wixted & Carpenter, 2007; Wixted & Ebbesen, 1991), and it is enshrined in Jost’s law of forgetting, which states that if two memory traces have equal strength but different ages, the older trace will decay at a slower rate than the younger one from that moment on (Jost, 1897). One possibility is that the continuous reduction in the rate of forgetting as a trace ages is a reflection of the increased resistance to interference as the trace undergoes a slow process of consolidation (Wickelgren, 1974; Wixted, 2004b).
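The link between a power-law forgetting function and Jost’s law can be made concrete with a small numerical sketch. The Python code below assumes the simple form R(t) = a · t^(-b), which is only one common way of writing a power function; this specific form, the function names, and every numerical value (a decay exponent of 0.5, trace ages of 1 and 16 time units, a further delay of 8 units) are hypothetical choices made for illustration and are not taken from the studies cited above.

```python
def power_law_strength(initial, age, decay=0.5):
    """Strength of a trace of the given age, assuming R(t) = initial * t**(-decay)."""
    return initial * age ** (-decay)


def proportional_rate(age, decay=0.5):
    """Proportional forgetting rate -R'(t)/R(t); for a power function this is decay / t."""
    return decay / age


def josts_law_demo(young_age=1.0, old_age=16.0, delay=8.0, decay=0.5):
    """Compare two traces that are equally strong now but differ in age.

    All ages, the delay, and the decay exponent are illustrative assumptions.
    """
    young_initial = 1.0
    # Pick the older trace's starting strength so that both traces are equal right now.
    old_initial = young_initial * (old_age / young_age) ** decay
    now = (
        power_law_strength(young_initial, young_age, decay),
        power_law_strength(old_initial, old_age, decay),
    )
    # Let the same additional delay pass for both traces.
    later = (
        power_law_strength(young_initial, young_age + delay, decay),
        power_law_strength(old_initial, old_age + delay, decay),
    )
    return now, later


if __name__ == "__main__":
    now, later = josts_law_demo()
    print("strength now   (young, old):", now)
    print("strength later (young, old):", later)
    print("proportional rate at ages 1 and 16:",
          proportional_rate(1.0), proportional_rate(16.0))
```

Under these assumptions the two traces start out equally strong, yet the older one retains more after the delay, because its proportional rate of forgetting (decay / age) is already lower. An exponential function, whose proportional rate is constant, would keep the two traces equal and so would not reproduce the pattern described by Jost’s law.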


Modern Views of Consolidation

The view of consolidation advanced by the pioneering experimental psychologists Müller and Pilzecker was not embraced by the field of experimental psychology in the latter half of the twentieth century. Ironically, during that same period of time, the notion that memories consolidate became the “standard story” in the field of neuroscience. The impetus for this way of thinking among neuroscientists can be traced in large part to the realization that the structures of the medial temporal lobe (MTL) play a critical role in the formation of new memories. The importance of these structures became clear when patient H.M. received a bilateral medial temporal lobe resection in an effort to control his epileptic seizures (Scoville & Milner, 1957). Although successful in that regard, H.M. was also unexpectedly left with a profound case of anterograde amnesia (i.e., the inability to form new memories from that point on) despite retaining normal perceptual and intellectual functioning, including normal working memory capacity. Another outcome, one that is relevant to the issue of consolidation, was that H.M. also exhibited temporally graded retrograde amnesia (Scoville & Milner, 1957; Squire, 2009). That is, memories that were formed before surgery were also impaired, and the degree of impairment was greater for memories that had been formed just before surgery than for memories that had been formed well before. Although memories of up to 3 years before his surgery were seemingly impaired, H.M.’s older memories appeared to be largely intact (Scoville & Milner, 1957). This result suggests that the brain systems involved in the maintenance of memory change over time.

Systems Consolidation

The temporal gradient of retrograde amnesia that is associated with injury and disease was noted long ago by Ribot (1881/1882), but he had no way of knowing what brain structures were centrally involved in this phenomenon. The experience of H.M. made it clear that the relevant structures reside in the MTL, and the phenomenon of temporally graded retrograde amnesia suggests an extended but time-limited role for the MTL in the encoding and retrieval of new memories. That is, the MTL is needed to encode new memories, and it is needed for a time after they are encoded, but it is not needed indefinitely. The decreasing dependence of memories on the MTL is known as systems consolidation (Frankland & Bontempi, 2005; McGaugh, 2000). As a result of this process, which may last as long as several years in humans, memories are eventually reorganized and established in the neocortex in such a way that they become independent of the MTL (Squire et al., 2001). Note that this is a different view of consolidation than the resistance-to-interference view proposed by Müller and Pilzecker (1900).

The temporal gradient of retrograde amnesia exhibited by patient H.M. (documented during the early years after surgery) prompted more controlled investigations using both humans and nonhumans. These studies have shown that the temporal gradient is real and that it is evident even when bilateral lesions are limited to the hippocampus (a central structure of the MTL). For example, in a particularly well-controlled study, Anagnostaras et al. (1999) investigated the effect of hippocampal lesions in rats using a context fear-conditioning paradigm. In this task, a tone conditional stimulus (CS) is paired with a shock unconditional stimulus (US) several times in a novel context. Such training results in a fear of both the tone and the training context (measured behaviorally as the proportion of time spent freezing), and memory for the context-shock association in particular is known to depend on the hippocampus. Anagnostaras et al. (1999) trained a group of rats in two different contexts, and this training was later followed by surgical lesions of the hippocampus. Each rat received training in Context A 50 days before surgery and training in Context B 1 day before surgery. Thus, at the time lesions were induced, memory for learning in Context A was relatively old, whereas memory for learning in Context B was still new. A later test of retention showed that remote (i.e., 50-day-old) memory for contextual fear was similar to that of controls, whereas recent (i.e., 1-day-old) memory for contextual fear was greatly impaired. Thus, in rats, hippocampus-dependent memories appear to become independent of the hippocampus in a matter of weeks.

Controlled studies in humans sometimes suggest that systems consolidation plays out over a much longer period of time, a finding that is consistent with the time window of retrograde amnesia observed for H.M. However, the time course is rather variable, and the basis for the variability is not known. Both semantic and episodic memory have been assessed in studies investigating the temporal gradient of retrograde amnesia. Semantic memory refers to memory for factual knowledge (e.g., what is the capital of Texas?), whereas episodic memory refers to memory for specific events (e.g., memory for a recently presented list of words or memory for an autobiographical event, such as a trip to the Bahamas).

Temporal Gradient of Semantic Memory

Semantic knowledge is generally acquired gradually across multiple episodes of learning and is forgotten slowly, so it seems reasonable to suppose that the systems consolidation of such knowledge would be extended in time. In one recent study of this issue, Manns et al. (2003) measured factual knowledge in six amnesic patients with damage limited to the hippocampal region. Participants were asked questions about news events that had occurred from 1950 to early 2002 (e.g., Which tire manufacturer recalled thousands of tires? [Firestone] What software company was accused of running a monopoly? [Microsoft]). The data for a particular patient (and for several controls matched to that patient) were analyzed according to the year in which the patient became amnesic. As might be expected, memory for factual knowledge was reduced for the period of time following the onset of memory impairment. Thus, for example, if a patient became amnesic in 1985, then memory for news events that occurred after 1985 was impaired (i.e., anterograde amnesia was observed). In addition, and more to the point, factual knowledge for news events that occurred during the several years immediately before the onset of memory impairment (e.g., 1980 to 1985) was also impaired, particularly when memory was measured by free recall rather than by recognition. However, memory for events that occurred 11 to 30 years before the onset of memory impairment (e.g., 1955 to 1974) was intact. These older memories, it seems, had become fully consolidated in the neocortex and were no longer dependent on the structures of the MTL.

The findings from lesion studies have sometimes been corroborated by neuroimaging studies performed on unimpaired subjects, though the relevant literature is somewhat mixed in this regard. Although some studies found more activity in the MTL during the recollection of recent semantic memories compared with remote semantic memories (Douville et al., 2005; Haist et al., 2001; Smith & Squire, 2009), other studies found no difference (e.g., Bernard et al., 2004; Maguire et al., 2001; Maguire & Frith, 2003). For example, using functional magnetic resonance imaging (fMRI), Bernard et al. (2004) identified brain regions associated with recognizing famous faces from two different periods: people who became famous in the 1960s to 1970s and people who became famous in the 1990s. They found that the hippocampus was similarly active during the recognition of faces from both periods (i.e., no temporal gradient was observed). It is not clear why studies vary in this regard, but one possibility is that the detection of a temporal gradient is more likely when multiple time points are assessed, especially during the several years immediately preceding memory impairment, than when only two time points are assessed (as in Bernard et al., 2004).

In an imaging study that was patterned after the lesion study reported by Manns et al. (2003), Smith and Squire (2009) measured brain activity while subjects recalled news events from multiple time points over the past 30 years. In agreement with the lesion study, they found that regions in the MTL exhibited a decrease in brain activity as a function of the age of the memory over a 12-year period (whereas activity was constant for memories from 13 to 30 years ago). In addition, they found that regions in the frontal lobe, temporal lobe, and parietal lobe exhibited an increase in activity as a function of the age of the trace. Thus, it seems that the (systems) consolidation of semantic memories is a slow process that may require years to complete.

Temporal Gradient of Autobiographical (Episodic) Memory

Although lesion studies and neuroimaging studies point to a temporal gradient for semantic memory lasting years, there is some debate about whether episodic memory, and autobiographical memory in particular, exhibits any temporal gradient at all. For example, some studies of H.M. conducted not long before he died in 2008 showed that his memory for remote personal experiences, unlike his memory for remote factual knowledge, was not preserved (Steinvorth, Levine, & Corkin, 2005). In addition, a number of case studies of memory-impaired patients have reported impairments of childhood memories (Cipolotti et al., 2001; Eslinger, 1998; Hirano & Noguchi, 1998; Kitchener et al., 1998; Maguire et al., 2006; Rosenbaum et al., 2004). These findings appear to suggest that the MTL plays a role in recalling personal episodes even if they happened long ago, and the apparent implication is that autobiographical memories do not undergo systems consolidation.

However, by the time H.M.’s remote autobiographical memory impairment was documented, he was an elderly patient, and his brain was exhibiting signs of cortical thinning, abnormal white matter, and subcortical infarcts (Squire, 2009). Thus, these late-life brain abnormalities could account for the loss of remote memories. In addition, in most of the case studies that have documented remote autobiographical memory impairment, damage was not restricted to the MTL. To compare the remote memory effects of limited MTL damage and damage that also involved areas of the neocortex, Bayley et al. (2005) measured the ability of eight amnesic patients to recollect detailed autobiographical memories from their early life. Five of the patients had damage limited to the MTL, whereas three had damage to the neocortex in addition to MTL damage. They found that the remote autobiographical memories of the five MTL patients were quantitatively and qualitatively similar to the recollections of the control group, whereas the autobiographical memories of the three patients with additional neocortical damage were severely impaired. This result suggests that semantic memory and episodic memory both eventually become independent of the MTL through a process of systems consolidation, but the temporal gradient of retrograde amnesia associated with that process can be obscured if damage extends to the neocortex. MacKinnon and Squire (1989) also found that the temporal gradient of autobiographical memories for five MTL patients was similar in duration to the multiyear gradient associated with semantic memory.

Temporal Gradients Involving a Shorter Time Scale

Recent neuroimaging studies have documented a temporal gradient of activity for memory of simple laboratory stimuli on a time scale that is vastly shorter than the multiyear process of consolidation suggested by lesion studies of semantic memory and autobiographical memory. For example, using fMRI, Yamashita et al. (2009) measured activity in the hippocampus and temporal neocortex associated with memory for two sets of paired-associate figures that subjects had previously memorized. One set was studied 8 weeks before the memory test (old memories), and the other was studied immediately before the memory test (new memories). Overall accuracy at the time of test was equated for the two conditions by providing extra study time for the items that were studied 8 weeks before. Thus, any differences in activity associated with old and new memories could not be attributed to differences in memory strength. The results showed that a region in right hippocampus was associated with greater activity during retrieval of new memories than old memories, whereas in left temporal neocortex, the opposite activation pattern (i.e., old > new) was observed. These results are consistent with a decreasing role of the hippocampus and increasing role of the neocortex as memories age over a period as short as 50 days (cf. Takashima et al., 2006), a time scale of consolidation that is similar to that observed in experimental animals (e.g., Anagnostaras et al., 1999).

An even shorter time scale for systems consolidation was evident in a recent fMRI study reported by Takashima et al. (2009). Subjects in that study memorized two sets of face–location stimuli, one studied 24 hours before the memory test (old memories) and the other studied 15 minutes before the memory test (new memories). To control for differences in memory strength, they compared activity for high-confidence hits associated with the old and new memories and found that hippocampal activity decreased and neocortical activity increased over the course of 24 hours. In addition, the connectivity between the hippocampus and the neocortical regions decreased, whereas cortico-cortical connectivity increased (all over the course of only 24 hours). Results like these suggest that the process of systems consolidation can occur very quickly.

What determines whether the temporal gradient is short or long? The answer is not known, but Frankland and Bontempi (2006) suggested that the critical variable may be the richness of the memorized material. To-be-remembered stimuli presented in a laboratory are largely unrelated to a subject’s personal history and thus might be integrated with prior knowledge represented in the cortex in a rather sparse (yet rapid) manner. Autobiographical memories, by contrast, are generally related to a large preexisting knowledge base. The integration of such memories into an intricate knowledge base may require more extended dialogue between the hippocampus and neocortex (McClelland, McNaughton, & O’Reilly, 1995). Alternatively, Tse et al. (2007) suggested that when memories can be incorporated into an associative “schema” of preexisting knowledge (i.e., when newly learned information is compatible with previously learned information), the process of systems consolidation is completed very rapidly (within 48 hours for rats). However, it is not clear whether this idea can account for the rapid systems consolidation that is apparent for memories of arbitrary laboratory-based stimuli in humans (e.g., Takashima et al., 2009). The as-yet-unresolved question of why memories vary in how quickly they undergo systems consolidation seems likely to remain a focus of research in this area for some time to come.

Decreasing Hippocampal Activity Versus Increasing Neocortical Activity

The findings discussed above support the view that memories undergo a process of systems consolidation in which the structures of the MTL play a decreasing role with the passage of time. An interesting feature of several of the neuroimaging studies discussed above is that not only does MTL activity often decrease with time, but also neocortical activity increases over time (Smith & Squire, 2009; Takashima et al., 2009; Yamashita et al., 2009). What might that increased activity signify?

The memory of a past experience that is elicited by a retrieval cue presumably consists of the reactivation of distributed neocortical areas that were active at the time the trace was initially encoded (Damasio, 1989; Hoffman & McNaughton, 2002; Johnson, McDuff, Rugg, & Norman, 2009; McClelland, McNaughton, & O’Reilly, 1995; Squire & Alvarez, 1995). The primary sensory areas of the brain (e.g., the brain areas activated by the sights, sounds, and smells associated with a visit to the county fair) converge on association areas of the brain, which, in turn, are heavily interconnected with the MTL. Conceivably, memories are stored in widely distributed neocortical areas from the outset, but the hippocampus and other structures of the MTL are required to bind them together until cortico-cortical associations develop, eventually rendering the memory trace independent of the MTL (Wixted & Squire, 2011). In studies using fMRI, the increasing role of the direct cortico-cortical connections may be reflected in increased neocortical activity as time passes since the memory trace was formed (Takashima et al., 2009).

In agreement with this possibility, a series of trace eyeblink conditioning studies conducted by Takehara-Nishiuchi and colleagues has shown that an area of the medial prefrontal cortex (mPFC) in rats becomes increasingly necessary for the retrieval of memories as they become decreasingly dependent on the hippocampus over the course of several weeks. This was shown both by lesion studies and by direct neural recordings of task-related activity in the mPFC (Takehara, Kawahara, & Kirino, 2003; Takehara-Nishiuchi & McNaughton, 2008). In one particularly relevant experiment, Takehara-Nishiuchi and McNaughton (2008) showed that task-related activity of mPFC neurons in rats increased over the course of several weeks even in the absence of further training. In a conceptually related study, Frankland et al. (2004) trained mice in a fear-conditioning procedure and tested memory either 1 day (recent) or 36 days (remote) after training. They found that activity in multiple association cortical regions (measured by the expression of activity-regulated genes) was greater for remote than for recent memories. In addition, they found that the increased cortical activity for remote memories was not evident in mice with a gene mutation that selectively impairs remote memory. Results like these would seem to provide direct evidence of the kind of cortical reorganization that has long been thought to underlie systems consolidation.

Cellular Consolidation

Systems consolidation is not what Müller and Pilzecker (1900) had in mind when they first introduced the concept of consolidation. Their view was that a memory trace becomes increasingly resistant to interference caused by new learning as the trace consolidates, not that the trace becomes reorganized in the neocortex (and, therefore, less dependent on the MTL) over time.

Whereas a role for systems consolidation came into sharper focus in the years following the recognition of H.M.’s memory impairment, evidence for a second kind of consolidation began to emerge in the early 1970s. This kind of consolidation, called cellular consolidation, occurs at the level of neurons (not brain systems) and takes place over the hours (and, perhaps, days) after a memory is formed in the hippocampus (McGaugh, 2000). Cellular consolidation seems more directly relevant to the trace-hardening physiological processes that Müller and Pilzecker (1900) had in mind, and it had its origins in the discovery of a phenomenon known as long-term potentiation (LTP; Bliss & Lomo, 1973).

LTP is a relatively long-lasting enhancement of synaptic efficacy that is induced by a brief burst of high-frequency electrical stimulation (a tetanus) delivered to presynaptic neurons (Bliss & Collingridge, 1993). Before the tetanus, a single (weak) test pulse of electrical stimulation applied to the presynaptic neuron elicits a certain baseline response in the postsynaptic neuron, but after the tetanus, that same test pulse elicits a greater response. The enhanced reactivity typically lasts hours or days (and sometimes weeks), so it presumably does not represent the way in which memories are permanently coded. Still, LTP is readily induced in hippocampal neurons, and it is, by far, the leading approach to modeling the neural basis of initial memory formation (Bliss, Collingridge, & Morris, 2003; Martin, Grimwood, & Morris, 2000). In this model, tetanic stimulation is analogous to the effect of a behavioral experience, and the enhanced efficacy of the synapse is analogous to the memory of that experience.


Although LTP looks like neural memory for an experience (albeit an artificial experience consisting of a train of electrical impulses), what reason is there to believe that a similar process plays a role in real memories? The induction of LTP in hippocampal neurons involves the opening of calcium channels in postsynaptic N-methyl-D-aspartate (NMDA) receptors (Bliss & Collingridge, 1993). When those receptors are blocked by an NMDA antagonist, high-frequency stimulation fails to induce LTP. Perhaps not coincidentally, NMDA antagonists have often been shown to impair the learning of hippocampus-dependent tasks in animals (e.g., Morris et al., 1986; Morris, 1989), as if an LTP-like process in the hippocampus plays an important role in the formation of new episodic memories.

One study suggests that the encoding of actual memories (not just an artificial train of electrical pulses) also gives rise to LTP in the hippocampus. Whitlock et al. (2006) trained rats on an inhibitory avoidance task (a task known to be dependent on the hippocampus), and they were able to find neurons in the hippocampus that exhibited sustained LTP after training (not after an artificial tetanus). In addition, tetanic stimulation applied to these neurons after training now had a lesser effect (as if those neurons were already close to ceiling levels of LTP) than tetanic stimulation applied to the neurons of animals who had not received training. These findings suggest that LTP may be more than just a model for memory formation; it may, in fact, be part of the mechanism that underlies the initial encoding of memory.

What does LTP have to do with the story of consolidation? The induction of LTP unleashes a molecular cascade in postsynaptic neurons that continues for hours and results in structural changes to those neurons.
The postsynaptic changes are protein-synthesis dependent and involve morphological changes in dendritic spines (Yuste & Bonhoeffer, 2001) and the insertion of additional AMPA receptors into dendritic membranes (Lu et al., 2001). These changes are generally thought to stabilize LTP because LTP degrades rapidly if they do not occur (or are prevented from occurring by the use of a protein synthesis inhibitor).

LTP exhibits all of the characteristics of consolidation envisioned by Müller and Pilzecker (1900). In their own work, Müller and Pilzecker (1900) used an original learning phase (L1), followed by an interfering learning phase (L2), followed by a memory test for the original list (T1). Holding the retention interval between L1 and T1 constant, they essentially showed that L1-L2-----T1 yields greater interference than L1---L2---T1 (where the dashes represent units of time). In experimental animals, memories formed in the hippocampus and LTP induced in the hippocampus both exhibit a similar temporal gradient with respect to retroactive interference (Izquierdo et al., 1999; Xu et al., 1998). Whether L1 and L2 both involve hippocampus-dependent learning tasks (e.g., L1 = one-trial inhibitory avoidance learning, L2 = exploration of a novel environment), as reported by Izquierdo et al. (1999), or one involves the induction of LTP (L1) while the other involves exposure to a learning task (L2), as reported by Xu et al. (1998), the same pattern emerges. Specifically, L2 interferes with L1 when the time between them is relatively short (e.g., 1 hour), but not when the time between them is relatively long (e.g., 6 or more hours). Moreover, if an NMDA antagonist is infused into the hippocampus before L2 (thereby blocking the induction of interfering LTP that might be associated with the learning of a potentially interfering task), no interference effect is observed even when the L1-L2 temporal interval is short.

The point is that hippocampus-dependent memories and hippocampal LTP both appear to be vulnerable to interference early on and then become more resistant to interference with the passage of time. Moreover, the interfering force is the formation of new memories (or, analogously, the induction of LTP). Newly induced LTP, like a newly encoded memory, begins life in a fragile state. Over time, as the process of cellular consolidation unfolds, recently formed LTP and recently encoded memories become more stable, which is to say that they become more resistant to interference caused by the induction of new LTP or by the encoding of new memories.

The use of an NMDA antagonist in rats is not the only way to induce a temporary period of anterograde amnesia (thereby protecting recently induced LTP or recently formed memories). In sufficient quantities, alcohol and benzodiazepines have been shown to do the same in humans. Moreover, like NMDA antagonists, these drugs not only induce anterograde amnesia but also inhibit the induction of LTP in the hippocampus (Del Cerro et al., 1992; Evans & Viola-McCabe, 1996; Givens & McMahon, 1995; Roberto et al., 2002; Sinclair & Lo, 1986). Interestingly, they also result in a phenomenon known as retrograde facilitation. That is, numerous studies have reported that even though alcohol induces amnesia for information studied under the influence of the drug, it actually results in improved memory for material studied just before consumption (e.g., Bruce & Pihl, 1997; Lamberty, Beckwith, & Petros, 1990; Mann, Cho-Young, & Vogel-Sprott, 1984; Parker et al., 1980, 1981).
Similar findings have been frequently reported for benzodiazepines such as diazepam and triazolam (Coenen & Van Luijtelaar, 1997; Fillmore et al., 2001; Ghoneim, Hinrichs, & Mewaldt, 1984; Hinrichs, Ghoneim, & Mewaldt, 1984; Weingartner et al., 1995). Predrug memories, it seems, are protected from interference that would have been created during the postdrug amnesic state.

It is important to emphasize that postlearning amnesia-inducing agents (such as NMDA antagonists used in rats or alcohol and benzodiazepines used in humans) do not enhance predrug memories in an absolute sense. That is, in response to these drugs, the memories do not more accurately represent past experience and are not more likely to be retrieved than they were at the end of learning. Instead, memories formed before drug intake are forgotten to a lesser degree than memories formed before placebo. By limiting the formation of new memories, alcohol and benzodiazepines (like NMDA antagonists) may protect memories that were formed just before drug intake. While protected from the trace-degrading force of new memory formation, these memories may be allowed to consolidate (via cellular consolidation) in a way that hardens them against the interference they will later encounter when new memories are once again formed. If so, then less forgetting should be observed than would otherwise be the case.

All of these findings are easily understood in terms of cellular consolidation (not systems consolidation), but a recent explosion of research on the role of sleep and consolidation has begun to suggest that the distinction between cellular consolidation and systems consolidation may not be as sharp as previously thought.

Sleep and Consolidation

In recent years, the idea that sleep plays a special role in the consolidation of both declarative and nondeclarative memory has received a great deal of attention. Declarative memory consists of the conscious remembrance of either factual information (i.e., semantic memory) or past experience (i.e., episodic memory), and it is the kind of memory that we have discussed thus far in connection with systems consolidation and cellular consolidation. Nondeclarative memory, on the other hand, refers to the acquisition and retention of nonconscious skills and abilities, with the prototypical example being the ability to ride a bike. With practice, one’s riding ability improves, but the memory of how to balance on two wheels is not realized by consciously remembering anything about the past (as in the case of declarative memory). Instead, that memory is realized by climbing on the bike and discovering that you can ride it without falling off. Whereas declarative memory depends on the structures of the MTL, nondeclarative memories do not (Squire, 1992; Squire & Zola, 1996). As a result, amnesic patients with MTL damage have an impairment of declarative memory (both anterograde amnesia and temporally graded retrograde amnesia), but they are generally unimpaired at learning and retaining procedural skills (Squire, 1992). An amnesic could, for example, learn to ride a bike as easily as you could, but, unlike you, the amnesic would have no conscious declarative memory of the practice sessions. Recent research suggests that sleep plays a role in the consolidation of both declarative and nondeclarative memories.

Because sleep is not an undifferentiated state, one focus of this line of research has been to identify the specific stage of sleep that is important for consolidation. Sleep is divided into five stages that occur in a regular sequence within 90-minute cycles throughout the night. Stages 1 through 4 refer to ever-deeper levels of sleep, with stages 3 and 4 often being referred to as slow-wave sleep. Rapid eye movement (REM) sleep is a lighter stage of sleep associated with vivid dreams. Although every stage of sleep occurs during each 90-minute sleep cycle, the early sleep cycles of the night are dominated by slow-wave sleep, and the later sleep cycles are dominated by REM sleep. For declarative memory, the beneficial effects of sleep have been almost exclusively associated with slow-wave sleep, and this is true of its possible role in systems consolidation and cellular consolidation. For nondeclarative memories, the beneficial effects of sleep are more often associated with REM sleep.

Sleep-Related Consolidation of Declarative Memory

Much evidence dating back at least to Jenkins and Dallenbach (1924) has shown that less forgetting occurs if one sleeps during the retention interval than if one remains awake. A reduction in interference is generally thought to play some role in this sleep-related benefit, but there appears to be much more to the story than that. In particular, consolidation is an important part of the story, and the different stages of sleep play very different roles.

Ekstrand and colleagues (Ekstrand, 1972; Yaroush, Sullivan, & Ekstrand, 1971) were the first to address the question of whether the different stages of sleep differentially benefit what we now call declarative memory. These researchers took advantage of the fact that most REM sleep occurs in the second half of the night, whereas most non-REM sleep occurs in the first half. Some subjects in this experiment learned a list, went to sleep immediately, and were awakened 4 hours later for a test of recall. These subjects experienced mostly slow-wave sleep during the 4-hour retention interval. Others slept for 4 hours, were awakened to learn a list, slept for another 4 hours, and then took a recall test. These subjects experienced mostly REM sleep during the 4-hour retention interval. The control (i.e., awake) subjects learned a list during the day and were tested for recall 4 hours later. The subjects all learned the initial list to a similar degree, but the results showed that 4 hours of mostly non-REM sleep resulted in less forgetting relative to the other two conditions, which did not differ from each other (i.e., REM sleep did not facilitate memory). Barrett and Ekstrand (1972) reported similar results in a study that controlled for time-of-day and circadian rhythm confounds, and the effect was later replicated in studies by Plihal and Born (1997, 1999). Slow-wave sleep may play a role in both cellular consolidation and systems consolidation.

Slow-Wave Sleep and Cellular Consolidation

Why is slow-wave sleep more protective of recently formed memories than REM sleep? One possibility is that slow-wave sleep is more conducive to cellular consolidation than REM sleep. In experiments performed on sleeping rats, Jones Leonard et al. (1987) showed that LTP can be induced in the hippocampus during REM sleep but not during slow-wave sleep. Whereas slow-wave sleep inhibits the induction of LTP, it does not disrupt the maintenance of previously induced LTP (Bramham & Srebro, 1989). In that sense, slow-wave sleep is like the NMDA antagonists discussed earlier (i.e., both block the induction of new LTP but not the maintenance of previously induced LTP). By contrast, with regard to synaptic plasticity in the hippocampus, REM sleep is similar to the awake state (i.e., LTP can be induced during REM).

Even during a night of sleep, interference may occur, especially during REM sleep, when considerable mental activity (mainly vivid dreaming) takes place and memories can be encoded in the hippocampus. But memories are probably never formed during slow-wave sleep. This is true despite the fact that a considerable degree of mental activity (consisting of static visual images, thinking, reflecting, etc.) occurs during slow-wave sleep. Indeed, perhaps half as much mental activity occurs during non-REM sleep as during REM sleep (Nielsen, 2000). However, mental activity and the formation of memories are not one and the same. The mental activity that occurs during slow-wave sleep is not remembered, perhaps because it occurs during a time when hippocampal plasticity is minimized. Because no new memories are formed in the hippocampus during this time, cellular consolidation can presumably proceed in the absence of interference. During REM sleep, however, electroencephalogram (EEG) recordings suggest that the hippocampus is in an awake-like state, and LTP can be induced (and memories can be formed), so interference is more likely to occur.

If slow-wave sleep protects recently formed memories from interference while allowing cellular consolidation to move forward, then a temporal gradient of interference should be observed. That is, sleep soon after learning should confer more protection than sleep that is delayed. This can be tested by holding the retention interval between learning (L1) and test (T1) constant (e.g., at 24 hours), with the location of sleep (S) within that retention interval varied. Using the notation introduced earlier, the prediction would be that L1-S-----T1 will confer greater protection than L1---S---T1. If a temporal gradient is observed (i.e., if memory performance at T1 is greater in the first condition than the second), it would suggest that sleep does more than simply subtract out a period of retroactive interference that would otherwise occur. Instead, it would suggest that sleep (presumably slow-wave sleep) also allows the process of cellular consolidation to proceed in the absence of interference.

Once again, Ekstrand (1972) performed the pioneering experiment on this issue. In that experiment, memory was tested for paired-associate words following a 24-hour retention interval in which subjects slept either during the 8 hours that followed list presentation or during the 8 hours that preceded the recall test. In the immediate sleep condition (in which L1 occurred at night, just before sleep), he found that 81 percent of the items were recalled 24 hours later; in the delayed sleep condition (in which L1 occurred in the morning), only 66 percent were recalled.
In other words, a clear temporal gradient associated with the subtraction of retroactive interference was observed, one that is the mirror image of the temporal gradient associated with the addition of retroactive interference reported by Müller and Pilzecker (1900). More recent sleep studies have reinforced the idea that the temporal gradient of retrograde facilitation is a real phenomenon, and they have addressed various confounds that could have accounted for the results that Ekstrand (1972) obtained (Gais, Lucas, & Born, 2006; Talamini et al., 2008). The temporal gradient associated with sleep, like the LTP and animal learning research described earlier, is consistent with the notion that when memory formation is temporarily halted, recently formed and still-fragile memories are protected from interference. As a result, they are given a chance to become hardened against the forces of retroactive interference that they will later encounter (perhaps through a process of cellular consolidation).
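
The logic of the temporal-gradient prediction can be made concrete with a toy calculation. The sketch below is not a model taken from any of the studies cited here. It assumes, purely for illustration, that a trace's fragility declines exponentially after learning (a stand-in for cellular consolidation, with a hypothetical time constant `tau`), that interference accrues only during waking hours in proportion to current fragility, and that retention falls off exponentially with accumulated interference (hypothetical scaling constant `k`). All function and parameter names are invented for this example.

```python
import math

# Hypothetical toy model, not taken from any cited study.
# Assumption 1: fragility(t) = exp(-t / tau) for t hours after learning,
#               standing in for the hardening produced by cellular consolidation.
# Assumption 2: interference accrues only while awake, in proportion to fragility.
# Assumption 3: retention at test is exp(-k * accumulated interference).


def accumulated_interference(awake_intervals, tau):
    """Integrate exp(-t / tau) over each awake interval (hours since learning)."""
    return sum(
        tau * (math.exp(-start / tau) - math.exp(-end / tau))
        for start, end in awake_intervals
    )


def retention(awake_intervals, tau, k=0.1):
    """Proportion retained at test under the toy assumptions above."""
    return math.exp(-k * accumulated_interference(awake_intervals, tau))


# 24-hour retention interval containing 8 hours of sleep,
# with sleep treated as free of interference.
IMMEDIATE_SLEEP = [(8.0, 24.0)]  # sleep first: awake from hour 8 to hour 24
DELAYED_SLEEP = [(0.0, 16.0)]    # sleep last: awake from hour 0 to hour 16

if __name__ == "__main__":
    # tau = 6 h: the trace hardens over hours; tau = 1e6 h: effectively no hardening.
    for tau in (6.0, 1e6):
        immediate = retention(IMMEDIATE_SLEEP, tau)
        delayed = retention(DELAYED_SLEEP, tau)
        print(f"tau={tau:g} h  immediate={immediate:.3f}  delayed={delayed:.3f}")
```

Under these assumptions, immediate sleep yields better retention than delayed sleep only when the trace hardens over time. When `tau` is made very large, so that fragility is effectively constant, the two schedules produce nearly identical retention because both contain the same 16 waking hours. That is the sense in which a temporal gradient implies more than the simple subtraction of a period of interference.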

Slow-Wave Sleep and Systems Consolidation

Recent sleep studies have also shed light on the mechanism that may account for systems consolidation, which presumably involves some relatively long-lasting form of communication between the hippocampus and the neocortex (Marr, 1971). The mechanism of communication is not known, but a leading candidate is neural replay, and most of the work on this topic comes from sleep studies. The phenomenon of neural replay was initially observed in hippocampal cells of sleeping rats after they had run along a familiar track, and its discovery was tied to the earlier discovery of place cells in the hippocampus.


Long ago, it was discovered that the firing of particular hippocampal cells in awake rats is coupled to specific points in the rat’s environment (O’Keefe & Dostrovsky, 1971). These cells are known as “place cells” because they fire only when the rat traverses a particular place in the environment. Usually, hippocampal place cells fire in relation to the rat’s position on a running track. That is, as the rat traverses point A along the track, place cell 1 will reliably fire. As it traverses point B, place cell 2 will fire (and so on). An intriguing finding that may be relevant to the mechanism that underlies systems consolidation is that cells that fire in sequence in the hippocampus during a behavioral task tend to become sequentially coactive again during sleep (Wilson & McNaughton, 1994). This is the phenomenon of neural replay.

Neural replay has most often been observed in rats during slow-wave sleep. It has also occasionally been observed during REM sleep, but, in that case, it occurs at a rate that is similar to the neuron firing that occurred during learning (Louie & Wilson, 2001) and thus may simply reflect dreaming. The neural replay that occurs during slow-wave sleep occurs at a rate five to ten times faster than it did during the waking state (e.g., Ji & Wilson, 2007) and may therefore reflect a biological consolidation process separate from mental activity like dreaming. It is as if the hippocampus is replaying the earlier behavioral experience, perhaps as a way to reorganize the representation of that experience in the neocortex.

The fact that replay of sequential place cell activity in the hippocampus occurs during slow-wave sleep does not, by itself, suggest anything about communication between the hippocampus and the neocortex (the kind of communication that is presumably required for systems consolidation to take place).
However, Ji and Wilson (2007) reported that hippocampal replay during slow-wave sleep in rats was coordinated with firing patterns in the visual cortex, which is consistent with the idea that this process underlies the reorganization of memories in the neocortex. In addition, Lansink et al. (2009) performed multineuron recordings from the hippocampus and ventral striatum during waking and sleeping states. While the rats were awake, the hippocampal cells fired when the rat traversed a particular point in the environment (i.e., they were place cells), whereas the striatal cells generally fired in response to rewards. During slow-wave sleep (but not during REM sleep), they found that the hippocampal and striatal cells reactivated together. The coordinated firing was particularly evident for pairs in which the hippocampal place cell fired before the striatal reward-related neuron. Thus, the hippocampus leads reactivation in a projection area, and this mechanism may underlie the systems consolidation of place–reward associations.

One concern about studies of neural replay is that the animals are generally overtrained, so little or no learning actually occurs. Thus, it is not clear whether learning-related neural replay takes place. However, Peyrache et al. (2009) recorded neurons in prefrontal cortex during the course of learning. Rats were trained on a Y-maze task in which they learned to select the rewarded arm using one rule (e.g., choose the left arm) that changed to a different rule as soon as a criterion level of performance was achieved (e.g., choose the right arm). They identified sets of neuronal assemblies with reliable coactivations in prefrontal cortex, and some of these coactivations became stronger when the rat started the first run of correct trials associated with the acquisition of the new rule. Following these sessions, replay during slow-wave sleep mainly involved the learning-related coactivations. Thus, learning-related replay (the mechanism that may underlie systems consolidation) can be identified and appears to get underway very soon after learning.

Other evidence suggests that something akin to neural replay occurs in humans as well. An intriguing study by Rasch et al. (2007) showed that cuing recently formed odor-associated memories by odor re-exposure during slow-wave sleep, but not during REM sleep, prompted hippocampal activation (as measured by fMRI) and resulted in less forgetting after sleep compared with a control group. This result is consistent with the notion that systems consolidation results from the reactivation of newly encoded hippocampal representations during slow-wave sleep. In a conceptually related study, Peigneux et al. (2004) measured regional cerebral blood flow and showed that hippocampal areas that were activated during route learning in a virtual town (a hippocampus-dependent, spatial learning task) were activated again during subsequent slow-wave sleep. Moreover, the degree of activation during slow-wave sleep correlated with performance on the task the next day.

In both these studies, the hippocampal reactivation (perhaps reflective of hippocampo-neocortical dialogue) occurred within hours of the learning episode, a time course of consolidation ordinarily associated with cellular consolidation. The timing observed in these studies is not unlike that observed in a neuroimaging study discussed earlier in which hippocampal activity decreased, and neocortical activity increased, over a period as short as 24 hours (Takashima et al., 2009).
Moreover, the timing fits with studies in rats showing that learning-related neural replay is evident in the first slow-wave sleep episode that follows learning (Peyrache et al., 2009).

In a sleep-deprivation study that also points to an almost immediate role for systems-level consolidation processes, Sterpenich et al. (2009), using human subjects, investigated memory for emotional and neutral pictures 6 months after encoding. Half the subjects were deprived of sleep on the first postencoding night, and half were allowed to sleep (and then all subjects slept normally each night thereafter). Six months later, subjects completed a recognition test in the scanner in which each test item was given a judgment of “remember” (previously seen and subjectively recollected), “know” (previously seen but not subjectively recollected), or “new” (not previously seen). A contrast between activity associated with remembered items and known items yielded a smaller difference in the sleep-deprived subjects across a variety of brain areas (ventral mPFC, precuneus, amygdala, and occipital cortex), even though the items had been memorized 6 months earlier, and these results were interpreted to mean that sleep during the first postencoding night influences the long-term systems-level consolidation of emotional memory.

The unmistakable implication from all of these studies is that the process thought to underlie systems consolidation, namely neural replay (or neural reactivation), begins to unfold in a measurable way along a time course ordinarily associated with cellular consolidation. That is, in the hours after a trace is formed, hippocampal LTP stabilizes, and neural replay in the hippocampus gets underway. These findings would seem to raise the possibility that the molecular cascade that underlies cellular consolidation also plays a role in initiating neural replay (Mednick, Cai, Shuman, Anagnostaras, & Wixted, 2011). If interference occurs while the trace is still fragile, then LTP will not stabilize, and presumably, neural replay will not be initiated. In that case, the memory will be lost. But if hippocampal LTP is allowed to stabilize (e.g., if potentially interfering memories are blocked, or if a period of slow-wave sleep ensues after learning), then (1) the LTP will stabilize and become more resistant to interference and (2) neural replay in the hippocampus will commence and the memory will start to become reorganized in the neocortex. Thus, on this view, cellular consolidation is an early component of systems consolidation.

With these considerations in mind, it is interesting to consider why Rasch et al. (2007) and Peigneux et al. (2004) both observed performance benefits associated with reactivation during slow-wave sleep. Results like these suggest that reactivation not only serves to reorganize the memory trace in the neocortex but also strengthens the memory trace in some way. But in what way is the trace strengthened? Did the reactivation process that occurred during slow-wave sleep act as a kind of rehearsal, strengthening the memory in much the same way that ordinary conscious rehearsal strengthens a memory (increasing the probability that the memory will later be retrieved)? Or did the reactivation instead serve to render the memory trace less dependent on the hippocampus and, in so doing, protect the trace from interference caused by the encoding of new memories in the hippocampus (e.g., Litman & Davachi, 2008)?
Either way, less forgetting would be (and was) observed following a period of reactivation compared with a control condition.

The available evidence showing that increased reactivation during slow-wave sleep results in decreased forgetting after a night of sleep does not shed any direct light on why reactivation causes the information to be better retained. Evidence uniquely favoring a rehearsal-like strengthening mechanism (as opposed to protection from interference) would come from a study showing that reactivation during sleep can be associated with an actual enhancement of performance beyond the level that was observed at the end of training. Very few declarative memory studies exhibit that pattern, but one such study was reported by Cai, Shuman, Gorman, Sage, and Anagnostaras (2009). Using a Pavlovian fear-conditioning task, they found that hippocampus-dependent contextual memory in mice was enhanced (in an absolute sense) following a period of sleep whether the sleep phase occurred immediately after training or 12 hours later. More specifically, following sleep, the percentage of time spent freezing (the main dependent measure of memory) increased beyond that observed at the end of training. This is a rare pattern in studies of declarative memory, but it is the kind of finding that raises the possibility that sleep-related consolidation can sometimes increase the probability that a memory will be retrieved (i.e., it can strengthen memories in that sense).
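
The replay findings reviewed in this section rest on two measurable properties: the cells reactivate in the same order in which their place fields were traversed, and during slow-wave sleep the sequence is compressed in time. The Python sketch below illustrates how those two properties could be quantified for a single candidate replay event. It is only an illustration under stated assumptions: the cell names, place-field positions, and spike times are invented, and the pairwise concordance score is a simple Kendall-style measure chosen for brevity, not the analysis used in any of the studies cited above.

```python
from itertools import combinations


def sequence_concordance(field_position, spike_time):
    """Pairwise agreement between place-field order and firing order.

    Returns +1.0 when every pair of cells fires in the same order as their
    place fields lie along the track, and -1.0 when the order is fully reversed.
    """
    concordant = discordant = 0
    for a, b in combinations(sorted(field_position), 2):
        product = (field_position[a] - field_position[b]) * (spike_time[a] - spike_time[b])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    informative = concordant + discordant
    return (concordant - discordant) / informative if informative else 0.0


def compression_factor(awake_time, sleep_time):
    """Ratio of the awake sequence duration to the sleep sequence duration.

    Assumes the sleep sequence spans a nonzero amount of time.
    """
    awake_span = max(awake_time.values()) - min(awake_time.values())
    sleep_span = max(sleep_time.values()) - min(sleep_time.values())
    return awake_span / sleep_span


# Hypothetical data: three place cells along a track (positions in meters),
# their firing times during a run, and during one candidate sleep event (seconds).
field_position = {"cell_1": 0.2, "cell_2": 0.8, "cell_3": 1.5}
awake_time = {"cell_1": 0.0, "cell_2": 1.0, "cell_3": 2.0}
sleep_time = {"cell_1": 10.0, "cell_2": 10.125, "cell_3": 10.25}

print(sequence_concordance(field_position, sleep_time))  # 1.0 (same order as on the track)
print(compression_factor(awake_time, sleep_time))        # 8.0 (within the five- to tenfold range)
```

In this toy example the sleep event preserves the awake order and runs eight times faster, which is the pattern described above for slow-wave sleep; a compression factor near 1 would instead resemble the REM pattern reported by Louie and Wilson (2001).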

Role of Brain Rhythms in the Encoding and Consolidation States of the Hippocampus

Most of the work on hippocampal replay of past experience has looked for the phenomenon during sleep, as if it might be a sleep-specific phenomenon. However, the key condition for consolidation to occur may not be sleep, per se. Instead, the key condition may arise whenever the hippocampus is not in an encoding state, with slow-wave sleep being an example of such a condition. Indeed, Karlsson and Frank (2009) found frequent awake replay of sequences of hippocampal place cells in the rat. The rats were exposed to two environments (i.e., two different running tracks) each day, and each environment was associated with a different sequence of place cell activity. The interesting finding was that during pauses in awake activity in environment 2, replay of sequential place cell activity associated with environment 1 was observed (replay of the local environment was also observed). The finding that the hippocampus exhibits replay of the remote environment while the rat is awake suggests that the hippocampus may take advantage of any down time (including, but not limited to, sleep) to consolidate memory. That is to say, the processes that underlie systems consolidation may unfold whenever the hippocampus is not encoding new memories (e.g., Buzsáki, 1989).

In a two-stage model advanced by Buzsáki (1989), the hippocampus is assumed to alternate between what might be referred to as an “encoding state” and a “consolidating state.” In the encoding state, the hippocampus receives (and encodes) information from the sensory and association areas of the neocortex. In the consolidating state, the hippocampus sends encoded information back to the neocortex. Hasselmo (1999) argued that changes in the level of acetylcholine (ACh) mediate the directional flow of information to and from the hippocampus.
High levels of ACh, which occur during both active-awake and REM sleep, are associated with the encoding state, whereas low levels of ACh, which occur during both quiet-awake (i.e., when the animal is passive) and slow-wave sleep, are associated with the consolidating state. Thus, according to this view, the consolidating state is not specific to sleep, but it does occur during sleep. Critically, the encoding and consolidating states are also associated with characteristic rhythmic activity, and a basic assumption of this account is that communication between the hippocampus and neocortex is mediated by coordinated oscillatory rhythms across different structures of the brain (Sirota, Csicsvari, Buhl, & Buzsáki, 2003).

In the encoding state, the cortex is characterized by beta oscillations (i.e., 12 to 20 Hz), whereas the hippocampus is characterized by theta oscillations (i.e., 4 to 8 Hz). Hippocampal theta oscillations are thought to synchronize neural firing along an input pathway into the hippocampus. For example, in the presence of theta (but not in its absence), the hippocampus receives rhythmic input from neurons in the input layers of the adjacent entorhinal cortex (Chrobak & Buzsáki, 1996). In addition, Siapas, Lubenov, and Wilson (2005) showed that neural activity in the prefrontal cortex of rats was “phase-locked” to theta oscillations in the hippocampus in freely behaving (i.e., active-awake) rats. Findings like these are consistent with the idea that theta rhythms coordinate the flow of information into the hippocampus, and still other findings suggest that theta rhythms may facilitate the encoding of information flowing into the hippocampus. During the high-ACh encoding state, which is a time when hippocampal synaptic plasticity is high (Rasmusson, 2000), electrical stimuli delivered at intervals equal to theta frequency are more likely to induce LTP than stimulation delivered at other frequencies (Larson & Lynch, 1986). Thus, theta appears to play a role both in organizing the flow of information into the hippocampus and in facilitating the encoding of that information.

Lower levels of ACh prevail during quiet-awake and slow-wave sleep, and this is thought to shift the hippocampus into the consolidating state (see Rasch, Born, & Gais, 2006). In this state, activity along input pathways (ordinarily facilitated by theta rhythms) is suppressed, and hippocampal plasticity is low (i.e., hippocampal LTP is not readily induced). As such, and as indicated earlier, recently induced LTP is protected from interference and is given a chance to stabilize as the process of cellular consolidation unfolds. In addition, under these conditions, the cortex is characterized by low-frequency spindle oscillations (i.e., 7 to 14 Hz) and delta oscillations (i.e., 4 Hz or less), whereas the hippocampus is associated with a more broad-spectrum pattern punctuated by brief, high-frequency sharp waves (i.e., 30 Hz or more) and very-high-frequency “ripples” (about 200 Hz). These sharp-wave oscillations occur within the hippocampal-entorhinal output network, and synchronized neural discharges tend to occur along this pathway during sharp-wave/ripple events (Buzsáki, 1986; Chrobak & Buzsáki, 1996). Thus, once again, rhythmic activity seems to coordinate communication between adjacent brain structures, and such communication has been found to occur between more distant brain structures as well.
For example, ripples observed during hippocampal sharp waves have been correlated with the occurrence of spindles in prefrontal cortex (Siapas & Wilson, 1998). Moreover, the neural replay discussed earlier preferentially takes place during the high-frequency bursts of spindle waves (Wilson & McNaughton, 1994). All of this suggests that rhythmically based feedback activity from the hippocampus may serve to “train” the neocortex and thus facilitate the process of systems consolidation. When it occurs in the hours after learning, this kind of systems-level communication presumably involves hippocampal neurons that have encoded information and that are successfully undergoing the process of cellular consolidation. If so, then, again, cellular consolidation could be regarded as an early component of the systems consolidation process.
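
The two-state account summarized in this section can be condensed into a small lookup structure. The Python sketch below does only that: it records, for each behavioral state named above, the associated hippocampal state, acetylcholine level, assumed direction of information flow, and the frequency bands quoted in the text. The class, field, and function names are hypothetical and introduced purely for illustration, and the band limits are the approximate values given above rather than definitive physiological boundaries.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (low_hz, high_hz); None marks an open-ended limit.
Band = Tuple[Optional[float], Optional[float]]


@dataclass(frozen=True)
class HippocampalMode:
    name: str
    ach_level: str
    information_flow: str
    cortical_bands: Dict[str, Band]
    hippocampal_bands: Dict[str, Band]


ENCODING = HippocampalMode(
    name="encoding",
    ach_level="high",
    information_flow="neocortex -> hippocampus",
    cortical_bands={"beta": (12.0, 20.0)},
    hippocampal_bands={"theta": (4.0, 8.0)},
)

CONSOLIDATING = HippocampalMode(
    name="consolidating",
    ach_level="low",
    information_flow="hippocampus -> neocortex",
    cortical_bands={"spindle": (7.0, 14.0), "delta": (None, 4.0)},
    hippocampal_bands={"sharp wave": (30.0, None)},
)

# Ripples are described only as "about 200 Hz", so they are kept as a
# nominal value rather than a band with explicit limits.
RIPPLE_NOMINAL_HZ = 200.0

MODE_BY_BEHAVIORAL_STATE = {
    "active-awake": ENCODING,
    "REM sleep": ENCODING,
    "quiet-awake": CONSOLIDATING,
    "slow-wave sleep": CONSOLIDATING,
}


def mode_for(behavioral_state: str) -> HippocampalMode:
    """Look up the hippocampal state associated with a behavioral state."""
    return MODE_BY_BEHAVIORAL_STATE[behavioral_state]


def in_band(frequency_hz: float, band: Band) -> bool:
    """Check whether a frequency falls inside a (possibly open-ended) band."""
    low, high = band
    return (low is None or frequency_hz >= low) and (high is None or frequency_hz <= high)


print(mode_for("quiet-awake").name)                        # consolidating
print(in_band(6.0, ENCODING.hippocampal_bands["theta"]))   # True
```

The table makes one point of the account explicit: the consolidating state is keyed to low acetylcholine rather than to sleep, so quiet wakefulness and slow-wave sleep map to the same entry.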

Sleep-Related Consolidation of Nondeclarative Memory

A novel line of research concerned with the role of sleep in consolidation was initiated by a study suggesting that sleep also plays a role in the consolidation of nondeclarative memories. Karni et al. (1994) presented subjects with computer-generated stimulus displays that sometimes contained a small target consisting of three adjacent diagonal bars (arranged either vertically or horizontally) embedded within a background of many horizontal bars. The displays were presented very briefly (10 ms) and then occluded by a visual mask, and the subject’s job on a given trial was to indicate whether the target items were arranged vertically or horizontally in the just-presented display. Performance on this task improves with practice in that subjects can correctly identify the target with shorter and shorter delays between the stimulus and the mask.

The detection of element orientation differences in these visual displays is a preattentive process that occurs rapidly and automatically (i.e., no deliberate search is required). In addition, the learning that takes place with practice presumably reflects plasticity in the early processing areas of the visual cortex, which would account for why the learning is extremely specific to the trained stimuli (e.g., if the targets always appear in one quadrant of the screen during training, no transfer of learning is apparent when the targets are presented in a different quadrant). Thus, the visual segregation task is not a hippocampus-dependent task involving conscious memory (i.e., it is not a declarative memory task); instead, it is a nondeclarative memory task.

A remarkable finding reported by Karni et al. (1994; Karni & Sagi, 1993) was that, following a night of normal sleep, performance improved on this task to levels that were higher than the level that had been achieved at the end of training, as if further learning took place offline during sleep. This is unlike what is typically observed on declarative memory tasks, which only rarely show an actual performance enhancement. Various control conditions showed that the enhanced learning effect was not simply due to a reduction in general fatigue. Instead, some kind of performance-enhancing consolidation apparently occurred while the subjects slept.

Karni et al. (1994) found that depriving subjects of slow-wave sleep after learning did not prevent the improvement of postsleep performance from occurring, but depriving them of REM sleep did.
Thus, REM sleep seems critical for the sleep-related enhancement of procedural learning to occur, and similar results have been reported in a number of other studies (Atienza et al., 2004; Gais et al., 2000; Mednick et al., 2002, 2003; Stickgold, James, & Hobson, 2000; Walker et al., 2005). These findings have been taken to mean that nondeclarative memories require a period of consolidation and that REM sleep in particular is critical for such consolidation to occur. Although most work has pointed to REM, some work has suggested a role for slow-wave sleep as well. For example, using the same texture-discrimination task, Stickgold et al. (2000) found that the sleep-dependent gains were correlated with the amount of slow-wave sleep early in the night and with the amount of REM sleep late in the night (cf. Gais et al., 2000).

In the case of nondeclarative memories, the evidence for consolidation does not consist of decreasing dependence on one brain system (as in systems consolidation) or of increasing resistance to interference (as in cellular consolidation). Instead, the evidence consists of an enhancement of learning beyond the level that was achieved at the end of training. At the time Karni et al. (1994) published their findings, this was an altogether new phenomenon, and it was followed by similar demonstrations of sleep-related enhancement using other procedural memory tasks, such as the sequential finger-tapping task (Walker et al., 2002, 2003a, 2003b). In this task, subjects learn a sequence of finger presses, and performance improves with training (i.e., the sequence is completed with increasing speed) and improves still further following a night of sleep, with the degree of improvement often correlating with time spent in stage 2 sleep. Fischer, Hallschmid, Elsner, and Born (2002) reported similar results, except that performance gains correlated with amount of REM sleep.

However, one aspect of this motor-sequence-learning phenomenon, namely the fact that performance improves beyond what was observed at the end of training, has been called into question. Rickard et al. (2008) recently presented evidence suggesting that the apparent absolute enhancement of performance on this task following sleep may have resulted from a combination of averaging artifacts, time-of-day confounds (cf. Keisler, Ashe, & Willingham, 2007; Song et al., 2007), and the buildup of fatigue (creating the impression of less presleep learning than actually occurred). This result does not necessarily question the special role of sleep in the consolidation of motor-sequence learning, but it does call into question the absolute increase in performance that has been observed following a period of sleep.

Somewhat more puzzling for the idea that REM plays a special role in the consolidation of nondeclarative memory is that Rasch, Pommer, Diekelmann, and Born (2008) found that the use of antidepressant drugs, which virtually eliminate REM sleep, did not eliminate the apparent sleep-related enhancement of performance on two nondeclarative memory tasks (mirror tracing and motor sequence learning). This result would appear to suggest that REM sleep, per se, is not critical for the consolidation of learning on either task. Instead, conditions that happen to prevail during REM sleep (rather than REM sleep per se) may be critical. Consistent with this possibility, Rasch, Gais, and Born (2009) showed that cholinergic receptor blockade during REM significantly impaired motor skill consolidation.
This finding suggests that the consolidation of motor skill depends on the high cholinergic activity that typically occurs during REM (and that presumably occurs even when REM is eliminated by antidepressant drugs).

What consolidation mechanism is responsible for sleep-related enhancement of performance on perceptual learning tasks? Hippocampal replay discussed earlier seems like an unlikely candidate because this is not a hippocampus-dependent task. However, some form of neural reactivation in the cortex may be involved, as suggested by one study using positron emission tomography (PET). Specifically, Maquet et al. (2000) showed that patterns of widely distributed brain activity evident during the learning of an implicit serial reaction time task were again evident during REM sleep. Such offline rehearsal may reflect a neural replay mechanism that underlies the consolidation of procedural learning, but the evidence on this point is currently quite limited.

Role of Sleep in Creative Problem-Solving

In addition to facilitating the consolidation of rote perceptual and (perhaps) motor learning, REM might also be an optimal state for reorganizing semantic knowledge (via spreading activation) in neocortical networks. This could occur because hippocampal input to the neocortex is suppressed during REM, thus allowing for cortical-cortical communication without interference from the hippocampus. Consolidation of this kind could facilitate insight and creative problem solving (Wagner et al., 2004). In this regard, a recent study by Cai, Mednick, Harrison, Kanady, and Mednick (2009) found that REM sleep, compared with quiet rest and non-REM sleep, enhanced the integration of previously primed items with new unrelated items to create new and useful associations. They used the Remote Associations Test, in which subjects are asked to find a fourth word that could serve as an associative link between three presented words (such as COOKIES, SIXTEEN, HEART). The answer to this item is SWEET (cookies are sweet, sweet sixteen, sweet heart). It is generally thought that insight is required to hit upon solutions to problems such as these because the correct answer is usually not the strongest associate of any of the individual items. After priming the answers earlier in the day using an unrelated analogies task, subjects took an afternoon nap. Cai et al. (2009) found that quiet rest and non-REM sleep did not facilitate performance on this task, but REM sleep did. Importantly, the ability to successfully create new associations was not attributable to conscious memory for the previously primed items because there were no differences in recall or recognition for the primed items between the quiet rest, non-REM sleep, and REM sleep groups. This finding reinforces the notion that REM sleep is important for nondeclarative memory, possibly by providing a brain state in which the association areas of the neocortex can reorganize without being disrupted by input from the MTL.

Reconsolidation

In recent years, a great deal of attention has focused on the possibility that it is not just new memories that are labile for a period of time; instead, any recently reactivated memory, even one that consolidated long ago, may become labile again as a result of having been reactivated. That is, according to this idea, an old declarative memory retrieved to conscious awareness again becomes vulnerable to disruption and modification and must undergo the process of cellular consolidation (and, perhaps, systems consolidation) all over again.

The idea that recently retrieved memories once again become labile was proposed long ago (Misanin, Miller, & Lewis, 1968), but the recent resurrection of interest in the subject was sparked by Nader, Schafe, and LeDoux (2000). Rats in this study were exposed to a fear-conditioning procedure in which a tone was paired with shock in one chamber (context A). The next day, the rats were placed in another chamber (context B) and presented with the tone to reactivate memory of the tone–shock pairing. For half the rats, a protein synthesis inhibitor (anisomycin) was then infused into the amygdala (a structure that is adjacent to the hippocampus and that is involved in the consolidation of emotional memory). If the tone-induced reactivation of the fear memory required the memory to again undergo the process of consolidation in order to become stabilized, then anisomycin should prevent that from happening, and the memory should be lost. This, in fact, was what Nader et al. (2000) reported. Whereas control rats exhibited considerable freezing when the tone was presented again 1 day later (indicating long-term memory for the original tone–shock pairing), the anisomycin-treated rats did not (as if they had forgotten the tone–shock pairing).

In the absence of a protein synthesis inhibitor, a reactivated memory should consolidate over the course of the next several hours. In accordance with this prediction, Nader et al. (2000) also reported that when the administration of anisomycin was delayed for 6 hours after the memory was reactivated (thereby giving the memory a chance to reconsolidate before protein synthesis was inhibited), little effect on long-term learning was observed. More specifically, in a test 24 hours after reactivation, the treated rats and the control rats exhibited a comparable level of freezing in response to the tone (indicating memory for the tone–shock pairing). All these results parallel the effects of anisomycin on tone–shock memory when it is infused after a conditioning trial (Schafe & LeDoux, 2000). What was remarkable about the Nader et al. (2000) results was that similar consolidation effects were also observed well after conditioning and in response to the reactivation of memory caused by the presentation of the tone. Similar findings have now been reported for other tasks and other species (see Nader & Hardt, 2009, for a review).

The notion that a consolidated memory becomes fragile again merely because it is reactivated might seem implausible because personal experience does not suggest that we place our memories at risk by retrieving them. In fact, the well-known testing effect (the finding that successful retrieval enhances memory more than additional study) seems to suggest that the opposite may be true (e.g., Roediger & Karpicke, 2006). However, a fragile trace is also a malleable trace, and it has been suggested that the updating of memory, not its erasure, may be a benefit of what otherwise seems like a problematic state of affairs. As noted by Dudai (2004), the susceptibility to corruption of a retrieved memory “might be the price paid for modifiability” (p. 75). If the reactivated trace is susceptible only to agents such as anisomycin, which is not a drug that is encountered on a regular basis, then the price for modifiability might be low indeed.
On the other hand, if the trace is vulnerable to corruption by new learning, as a newly learned memory trace appears to be, then the price could be considerably higher. In an intriguing new study, Monfils, Cowansage, Klann, and LeDoux (2009) showed that contextual fear memories in rats can be more readily eliminated by extinction trials if the fear memory is first reactivated by a reminder trial. For the first time, this raises the possibility that reactivated memories are vulnerable to disruption and modification by new learning (not just by protein synthesis inhibitors).

Much remains unknown about reconsolidation, and there is some debate as to whether the disruption of a recently retrieved trace is a permanent or a transient phenomenon. For example, Stafford and Lattal (2009) recently compared the effects of anisomycin administered shortly after fear conditioning (which would disrupt the consolidation of a new memory) or shortly after a reminder trial (which would disrupt the consolidation of a newly retrieved memory). With both groups equated on important variables such as prior learning experience, they found that the anisomycin-induced deficit on a test of long-term memory was larger and more persistent in the consolidation group compared with the reconsolidation group. Still, this study adds to a large and growing literature showing that reactivated memories are vulnerable in a way that was not fully appreciated until Nader et al. (2000) drove the point home with their compelling study.

Conclusion

The idea that memories require time to consolidate was proposed more than a century ago, but empirical inquiry into the mechanisms of consolidation is now more intense than ever. With that inquiry has come the realization that the issue is complex, so much so that, used in isolation, the word “consolidation” no longer has a clear meaning. One can speak of consolidation in terms of memory becoming less dependent on the hippocampus (systems consolidation) or in terms of a trace becoming stabilized (cellular consolidation). Alternatively, one can speak of consolidation in terms of enhanced performance (over and above the level of performance achieved at the end of training), in terms of increased resistance to interference (i.e., less forgetting), or in terms of a presumed mechanism, such as neural replay or neural reactivation. A clear implication is that any use of the word consolidation should be accompanied by a statement of what it means. Similarly, any suggestion that consolidation “strengthens” the memory trace should be accompanied by a clear statement of the way (or ways) in which the trace is thought to be stronger than it was before. A more precise use of the terminology commonly used in this domain of investigation will help to make sense of the rapidly burgeoning literature on the always fascinating topic of memory consolidation.


John Wixted

John Wixted is Distinguished Professor of Psychology at the University of California San Diego.

Denise J. Cai

Denise J. Cai, University of California, San Diego


Age-Related Decline in Working Memory and Episodic Memory

Sander Daselaar and Roberto Cabeza
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0022

Abstract and Keywords

Memory is one of the cognitive functions that deteriorate most with age. The types of memory most affected by aging are working memory, the short-term memory maintenance and simultaneous manipulation of information, and episodic memory, our memory for personally experienced past events. Functional neuroimaging studies indicate important roles in age-related memory decline for the prefrontal cortex (PFC) and medial temporal lobe (MTL) regions, which have been linked to two major cognitive aging theories, the resource and binding deficit hypotheses, respectively. Interestingly, functional neuroimaging findings also indicate that aging is not exclusively associated with decline. Some older adults seem to deal with PFC and MTL decline by shifting to alternative brain resources that can compensate for their memory deficits. In the future, these findings may help to distinguish normal aging from early Alzheimer’s dementia and may aid the development of memory remediation therapies.

Keywords: functional neuroimaging, aging, working memory, episodic memory, medial temporal lobe, prefrontal cortex


Introduction

Figure 22.1 Longitudinal changes in volumes of prefrontal cortex (A), hippocampus (B), and rhinal cortex (C) as a function of baseline age. From Raz et al., 2005. Reprinted with permission from Oxford University Press.

As we age, our brain declines both in terms of anatomy and physiology. This brain decline is accompanied by cognitive decline, which is most notable in the memory domain. Understanding the neural basis of age-related memory decline is important for two main reasons. First, in view of the growing number of older adults in today’s society, cognitive aging is increasingly becoming a problem in general health care, and effective therapies can only be developed on the basis of knowledge obtained through basic research. Second, there is a subgroup of elderly people whose memory impairments are more severe, preventing normal functioning in society. For these persons, such memory impairments can be the earliest sign of pathological age-related conditions, such as Alzheimer’s dementia (AD). Particularly in the early stages of this disease, the differentiation from normal age-related memory impairments is very difficult to make. Thus, it is important to map out which memory deficits can be regarded as a correlate of normal aging and which deficits are associated with age-related pathology.

Working memory (WM) and episodic memory (EM) are the two types of memory most affected by the aging process. WM refers to the short-term memory maintenance and simultaneous manipulation of information. Clinical and functional neuroimaging evidence indicates that WM is particularly dependent on the functions of the prefrontal cortex (PFC) and the parietal cortex (Wager & Smith, 2003). EM refers to the encoding and retrieval of personally experienced events (Gabrieli, 1998; Tulving, 1983). Clinical studies have shown that EM is primarily dependent on the integrity of the medial temporal lobe (MTL) memory system (Milner, 1972; Squire, Schmolck, & Stark, 2001). However, functional neuroimaging studies have also underlined the contributions of PFC to EM processes (Simons & Spiers, 2003).
In this chapter, we examine the roles of PFC and MTL changes in age-related decline in WM and EM by reviewing findings from functional neuroimaging studies of healthy aging. With respect to EM, most studies have scanned either the encoding or the retrieval phase of EM, and hence, we will consider these two phases in separate sections. We also discuss how these WM and EM findings relate to two major cognitive theories of aging, the resource deficit hypothesis (Craik, 1986), and the binding deficit hypothesis (Johnson, Hashtroudi, & Lindsay, 1993; Naveh-Benjamin, 2000).

Regarding the role of PFC and MTL, one influential neurocognitive view is that the age-related decline in EM and WM results from a selective PFC decline, whereas MTL degradation is an indicator of pathological age-related conditions (Buckner, 2004; Hedden & Gabrieli, 2004; West, 1996). This is based on anatomical studies showing that the PFC is the brain region that shows the greatest brain atrophy with age (Raz, 2005; Raz et al., 1997). However, there is now substantial evidence that the MTL also shows considerable anatomical and functional decline in healthy older adults (OAs), and thus, it is no longer possible to ascribe age-related memory deficits exclusively to PFC decline. It should be noted, though, that not all MTL regions show decline with age. As shown in Figure 22.1, a recent longitudinal study found that in healthy OAs, the hippocampus showed atrophy similar to the PFC, whereas the rhinal cortex did not (Raz et al., 2005). The differential effects of aging on hippocampus and rhinal cortex are very interesting because the rhinal cortex is one of the regions first affected by AD (Braak, Braak, & Bohl, 1993). As discussed later, together with recent functional magnetic resonance imaging (fMRI) evidence of dissociations between hippocampal and rhinal functions in aging (Daselaar, Fleck, Dobbins, Madden, & Cabeza, 2006), these findings have implications for the early diagnosis of AD.
This chapter focuses on two major cognitive factors thought to underlie age-related memory decline and strongly linked to PFC and MTL function, namely deficits in executive function and deficits in binding processes. The term executive function describes a set of cognitive abilities that control and regulate other abilities and behaviors. With respect to EM, executive functions are necessary to keep information available online so that it can be encoded in or retrieved from EM. The PFC is generally thought to be the key brain region underlying executive functions (Miller & Cohen, 2001). According to the resource deficit hypothesis (Craik, 1986), age-related cognitive impairments, including WM and EM deficits, are the result of a general reduction in attentional resources. As a result, OAs have greater difficulties with cognitive tasks that provide less environmental support, and hence require greater self-initiated processing. Executive and PFC functions are thought to be major factors explaining resource deficits in OAs (Craik, 1977, 1986; Glisky, 2007; West, 1996). Yet, we acknowledge that there are other factors, not discussed in this chapter, that can explain these deficits, including speed of processing (Salthouse, 1996) and inhibition deficits (Hasher, Zacks, & May, 1999).

Binding refers to our capacity to bind into one coherent representation the individual elements that together make up an episode in memory, such as sensory inputs, thoughts, and emotions. Binding is assumed to be more critical for recollection than for familiarity. Recollection refers to remembering an item together with contextual details, whereas familiarity refers to knowing that an item occurred in the past even though its contextual details cannot be retrieved. According to relational memory theory, different subregions of MTL are differentially involved in memory for relations among items and memory for individual items (Eichenbaum, Yonelinas, & Ranganath, 2007). In particular, the hippocampal formation is more involved in relational memory (binding, recollection), whereas the surrounding parahippocampal gyrus and rhinal cortex are more involved in item memory (familiarity). According to the binding deficit hypothesis (Johnson et al., 1993; Naveh-Benjamin, 2000), OAs are impaired in forming and remembering associations between individual items and between items and their context. As a result, age-related WM and EM impairments are more pronounced on tasks that require binding between study items (i.e., relational memory and recollection-based tasks). It is important to note that these tasks also depend on executive functions mediated by PFC, and hence it is not possible to simply associate age-related deficits in these tasks with MTL function.

Given the role of PFC in executive functions and MTL in binding operations, we will take an in-depth look at functional neuroimaging studies of WM and EM that revealed age-related functional changes in PFC or MTL regions. We will first discuss studies showing PFC differences during WM and EM encoding and retrieval, and then MTL differences. We will conclude with a discussion of different interpretations of age-related memory decline derived from these findings, and how they relate to deficits in executive function and binding capacity.

Neuroimaging Studies of Working Memory and Episodic Memory

Prefrontal Cortex

In the case of the PFC, both age-related reductions and increases in activity have been found. Age-related reductions in PFC activity are often accompanied by increased activity in the contralateral hemisphere, leading to a more bilateral activation pattern in OAs than in younger adults (YAs). This pattern has been conceptualized in a model called hemispheric asymmetry reduction in older adults (HAROLD), which states that, under similar conditions, PFC activity tends to be less lateralized in OAs than in YAs (Cabeza, 2002). In addition to the HAROLD pattern, over-recruitment of PFC regions in OAs is often found in the same hemisphere. When appropriate, we will distinguish between neuroimaging studies presenting task conditions in distinct blocks (blocked design) and studies that characterize stimuli on a trial-by-trial basis based on performance (event-related design).

Working Memory

Imaging studies of aging and WM function have shown altered patterns of activation in OAs compared with YAs, particularly in PFC regions. Studies that compared simple maintenance tasks have generally found increased PFC activity in OAs, which is correlated with better performance (Rypma & D’Esposito, 2000). Additionally, PFC activity in OAs not only is greater overall in these studies but also is often more bilateral, exhibiting the aforementioned HAROLD pattern (Cabeza et al., 2004; Park et al., 2003; Reuter-Lorenz et al., 2000).

Figure 22.2 Young participants show left-lateralized prefrontal cortex (PFC) activity during verbal working memory, and right PFC activity during spatial working memory, whereas older adults show bilateral PFC activity in both tasks: the HAROLD pattern. From Reuter-Lorenz et al., 2000. Courtesy of P. Reuter-Lorenz.
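The reduced lateralization shown in Figure 22.2 can be made concrete with a simple laterality index. The sketch below is illustrative only: the index (L − R) / (L + R) is a common convention and is assumed here, not taken from the studies discussed, and the activation values are hypothetical numbers rather than data.

```python
def laterality_index(left: float, right: float) -> float:
    """Return (L - R) / (L + R) for nonnegative activation values.

    +1 means fully left-lateralized, -1 fully right-lateralized,
    and 0 perfectly bilateral. This formula is an assumed convention.
    """
    total = left + right
    if total == 0:
        raise ValueError("left + right activity must be nonzero")
    return (left - right) / total


# Hypothetical PFC activation (arbitrary units) during verbal WM.
young_verbal = laterality_index(left=4.0, right=1.0)  # strongly left-lateralized
older_verbal = laterality_index(left=3.0, right=3.0)  # bilateral

print(f"YA index: {young_verbal:.2f}, OA index: {older_verbal:.2f}")
```

Under this convention, a HAROLD-like pattern corresponds to an index closer to zero in OAs than in YAs, regardless of which hemisphere dominates in YAs.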

For example, Reuter-Lorenz et al. (2000) used a maintenance task in which participants maintained four letters in WM and then compared them to a probe letter. As shown in Figure 22.2, YAs showed left lateralized activity, whereas OAs showed bilateral activity. They interpreted this HAROLD pattern as compensatory. Consistent with this interpretation, the OAs who showed the bilateral activation pattern were faster in the verbal WM task than those who did not. In addition to the verbal WM condition, they also included a spatial WM task. In this (p. 459) task, YAs activated right PFC, and OAs additionally recruited left PFC. Thus, even though age-related increases were in opposite hemispheres, both verbal and spatial WM conditions yielded the HAROLD pattern (see Figure 22.2). This finding supports the generalizability of the HAROLD model to different kinds of stimuli.

In contrast to studies using simple maintenance tasks, recent WM studies that manipulated WM load found both age-related decreases and increases in PFC activity. Cappell et al. (Cappell, Gmeindl, & Reuter-Lorenz, 2010) used a WM maintenance task with three different memory loads: low, medium, and high. OAs showed increased activation in right dorsolateral PFC regions during the lower load conditions. However, during the highest load condition, OAs showed a reduction in left dorsolateral PFC activation. Another WM study by Schneider-Garces et al. (2009) reported similar findings. They varied WM load between two and six letters, and measured a “throughput” variable reflecting the amount of information processed at a given load. Whereas YAs showed increasing throughput levels with higher WM loads, the levels of OAs showed an asymptote-like function with lower levels at the highest load. Matching the behavioral results, they found that overall PFC activity showed an increasing function in YAs, but an asymptotic function in OAs, leading to an age-related over-recruitment in PFC activity during the lower loads and an under-recruitment during the highest load.

The results by Cappell et al. and Schneider-Garces et al. are in line with the compensation-related utilization of neural circuits hypothesis (CRUNCH) (Reuter-Lorenz & Cappell, 2008). CRUNCH was proposed to account for patterns of overactivation and underactivation in OAs. According to CRUNCH, declining neural efficiency leads OAs to engage more neural circuits than YAs to meet task demands. Therefore, OAs show more activity at lower levels of task demands. However, as demands increase, YAs show greater activity to meet increasing task loads. OAs, on the other hand, have already reached their ceiling and will show reduced performance and underactivation.

In summary, WM studies often found that OAs show reduced activity in the PFC regions engaged by YAs but greater activity in other PFC regions, such as contralateral PFC regions (i.e., the PFC hemisphere less engaged by YAs). In some cases (Reuter-Lorenz et al., 2000), contralateral recruitment led to a more bilateral pattern of PFC activity in OAs (i.e., HAROLD). Moreover, the WM studies by Cappell et al. and Schneider-Garces et al. illustrate the importance of distinguishing between difficulty levels when considering age differences in brain activity. In general, age-related increases in PFC activity were attributed to compensatory mechanisms.
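The load-dependent crossover predicted by CRUNCH can be illustrated with a toy model. The sketch below is a minimal illustration under stated assumptions: the linear function for YAs, the saturating exponential for OAs, and all parameter values (`ceiling`, `rate`) are hypothetical choices made for this example, not functions or estimates reported by Cappell et al. or Schneider-Garces et al.

```python
import math


def young_activity(load: float) -> float:
    # Hypothetical: PFC activity in YAs keeps rising with WM load.
    return 1.0 * load


def older_activity(load: float, ceiling: float = 5.0, rate: float = 0.5) -> float:
    # Hypothetical: PFC activity in OAs rises quickly, then levels off
    # as it approaches a resource ceiling (asymptotic function).
    return ceiling * (1.0 - math.exp(-rate * load))


def age_difference(load: float) -> float:
    """Positive values: over-recruitment in OAs; negative: under-recruitment."""
    return older_activity(load) - young_activity(load)


for load in (2, 4, 6):  # WM load in letters
    print(f"load {load}: OA - YA = {age_difference(load):+.2f}")
```

With these assumed parameters the age difference is positive at loads 2 and 4 and negative at load 6, which mirrors the qualitative pattern of over-recruitment at lower loads and under-recruitment at the highest load.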

Episodic Memory Encoding

There are two general categories of EM encoding studies: blocked design studies and event-related studies using the subsequent memory paradigm. In blocked design studies, the EM encoding condition is typically compared with a baseline condition, such as reading. Some blocked studies used intentional learning instructions, asking participants to memorize items for a subsequent memory test, whereas others used incidental learning instructions, asking them to make a judgment (e.g., semantic or size) on each item without mentioning or emphasizing the subsequent test. In event-related studies using the subsequent memory paradigm, activity associated (p. 460) with successful encoding operations is identified by comparing study-phase activity for items remembered versus forgotten in a subsequent memory test (for a review, see Paller & Wagner, 2002).

The difference between blocked and event-related encoding studies is that the first method measures overall task activity including regions involved in simply processing the task instructions and stimuli, whereas the second method measures memory-specific activity because overall task characteristics are subtracted out. Although OAs may show a difference in task-related processes that are not immediately relevant for memory encoding, such as interpreting task instructions and switching between task conditions, they may recruit processes associated with successful memory encoding to a similar or even greater extent.

The most common result in blocked design EM encoding studies using incidental and intentional learning instructions is a reduction in left PFC activity. This reduction in left PFC activity was often coupled with an increase in right PFC activity, yielding the HAROLD pattern. In line with the resource deficit hypothesis, PFC reductions are more pronounced for intentional conditions, which require more self-initiated processing, than for incidental encoding conditions. For example, Logan et al. (2002) reported that during self-initiated, intentional encoding instructions, OAs compared with YAs showed less activity in left PFC but greater activity in right PFC, resulting in a more bilateral activity pattern (HAROLD). Results were similar for intentional encoding of both verbal and nonverbal material. Interestingly, further exploratory analyses revealed that this pattern was present in a group of “old-old” (mean age = 80), but not in a group of “young-old” (mean age = 67), suggesting that contralateral recruitment is associated with more pronounced age-related cognitive decline. At the same time, the decrease in left PFC was not present in the old-old group during incidental encoding instructions, suggesting that, in line with the resource deficit hypothesis, frontal reductions can be remediated by providing environmental support during encoding (Figure 22.3).

Figure 22.3 Young and young-old adults show left-lateralized prefrontal cortex (PFC) activity during intentional episodic memory encoding of words, whereas old-old adults show bilateral PFC activity (HAROLD). Reprinted from Neuron, 33(5), Jessica M. Logan, Amy L. Sanders, Abraham Z. Snyder, John C. Morris, and Randy L. Buckner, “Under-Recruitment and Nonselective Recruitment: Dissociable Neural Mechanisms Associated with Aging,” 827–840, Copyright (2002), with permission from Elsevier.

An incidental EM encoding study by Rosen et al. (2002) distinguishing between high- and low-performing OAs linked the HAROLD pattern specifically to high-performing OAs. This study distinguished between OAs with high and low memory scores based on a neuropsychological test battery. The authors reported equivalent left PFC activity but greater right PFC activity in the old-high memory group relative to YAs. In contrast, the old-low memory group showed reduced activity in both left and right PFC. As a result, the old-high group showed a more bilateral pattern of PFC activity than YAs (HAROLD). These findings support a compensatory interpretation of HAROLD.


In contrast to blocked design EM encoding studies, event-related fMRI studies using subsequent memory paradigms have often found age-related equivalent or increased activity in left PFC (Dennis, Daselaar, & Cabeza, 2006; Duverne, Motamedinia, & Rugg, 2009; Gutchess et al., 2005; Morcom, Good, Frackowiak, & Rugg, 2003). For instance, Morcom et al. (2003) used event-related fMRI to study subsequent memory for semantically encoded words. Recognition memory for these words was tested after a short and a longer delay. Performance of OAs at the short delay was equal to that of YAs at the long delay. Under these conditions, activity in left (p. 461) inferior PFC was greater for subsequently recognized than forgotten words and was equivalent in both age groups. However, OAs showed greater right PFC activity than YAs, again resulting in a more bilateral pattern of frontal activity (HAROLD).
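The logic of the subsequent memory contrast used in these event-related studies can be sketched in a few lines. This is a simplified illustration, assuming one activity estimate per study trial and a binary remembered/forgotten label; the function name and the trial values are hypothetical, and real analyses model the hemodynamic response rather than averaging raw values.

```python
from statistics import mean


def subsequent_memory_effect(trials):
    """Return mean study-phase activity for later-remembered items
    minus mean activity for later-forgotten items.

    trials: list of (activity, remembered) pairs, one per study trial.
    """
    remembered = [activity for activity, was_remembered in trials if was_remembered]
    forgotten = [activity for activity, was_remembered in trials if not was_remembered]
    if not remembered or not forgotten:
        raise ValueError("need at least one remembered and one forgotten trial")
    return mean(remembered) - mean(forgotten)


# Hypothetical single-region activity estimates (arbitrary units).
trials = [(1.2, True), (0.8, False), (1.4, True), (0.6, False)]
effect = subsequent_memory_effect(trials)
print(f"subsequent memory effect: {effect:.2f}")
```

Because activity common to all study trials cancels in the subtraction, the contrast isolates memory-related activity from overall task activity, which is the distinction drawn between event-related and blocked designs.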

Figure 22.4 Age differences associated with sustained and transient subsequent memory effects. Older adults show greater transient activation in left prefrontal cortex (PFC), but younger adults showed greater sustained activation in right PFC. The bar graph represents difference scores of functional activation (beta weights) between successful and unsuccessful encoding conditions for sustained and transient subsequent memory activity in both younger and older adults. Reprinted from Neurobiology of Aging, Vol. 28, Nancy A. Dennis, Sander Daselaar, and Roberto Cabeza, “Effects of aging on transient and sustained successful memory encoding activity,” 1749–1758, Copyright (2007), with permission from Elsevier.

As noted, one explanation for the discrepancy in left PFC activity between blocked and event-related studies may be a difference between overall task activity (reduced) and memory-related activity (preserved or enhanced). Although OAs may show a difference in task-related processes that are not relevant for memory encoding, they may recruit processes associated with successful memory encoding to a similar or even greater extent. Another explanation may be a difference in sustained (blocked) and transient (event-related) subsequent memory effects. A recent study by Dennis et al. (2006) used hybrid blocked and event-related analyses to distinguish between transient and sustained subsequent memory effects during deep incidental encoding of words. Subsequent memory was defined as parametric increases in encoding activity as a function of a combined subsequent memory and confidence scale. This parametric response was measured in each trial (transient activity) and in blocks of eight trials (sustained activity). Similar to the results of Gutchess et al., subsequent memory analyses of transient activity showed age-related increases in the left PFC. At the same time, subsequent memory analyses of sustained activity showed age-related reductions in the right PFC (Figure 22.4). The decline in sustained subsequent memory activity in the PFC may involve age-related deficits in sustained attention that affect encoding processes. The results underline the importance of investigating aging effects on both transient and sustained neural activity.

To summarize encoding studies, the most consistent finding in incidental and intentional encoding studies is an age-related reduction in left PFC activity. This finding is more frequent for intentional than for incidental encoding studies, suggesting that, in line with the resource deficit hypothesis, the environmental support provided by a deep semantic encoding task may attenuate the age-related decrease in left PFC activity. This effect was found within subjects in the study by Logan et al. (2002). The difference between intentional and incidental encoding conditions suggests an important strategic component in age-related memory decline. The reduction in left PFC activity was often coupled with an increase in right PFC activity, leading to a bilateral pattern of PFC activity in OAs (HAROLD). Importantly, the study by Rosen et al. (2002) found the HAROLD pattern only in high-performing OAs. This result provides support for a compensatory account of HAROLD. In contrast to blocked EM encoding studies, studies that used subsequent memory paradigms often found increases in left PFC activity. This discrepancy may relate to differences between overall task activity (reduced) and memory-related activity (preserved or enhanced) and between the different attentional components measured during blocked and event-related paradigms.

Episodic Memory Retrieval

In line with the resource deficit hypothesis, age-related deficits in episodic retrieval tend to be more pronounced for recall and context memory tasks than for recognition tasks (Spencer & Raz, 1995). (p. 462) However, considerable differences in activity have also been observed during simple recognition tasks. Similar to EM encoding studies, whereas studies using blocked designs have often found decreases in PFC activity, more recent studies using event-related fMRI designs have found equivalent or increased activity.

As an example of a blocked EM retrieval study, Cabeza et al. (1997) used both a word-pair recognition and cued-recall task. During word recognition, OAs showed reduced activity in the right PFC. During recall, OAs also showed weaker activations in the right PFC than YAs, but at the same time, showed greater activity than YAs in the left PFC. The net result was that PFC activity during recall was right lateralized in YAs but bilateral in OAs. The authors noted this change in hemispheric asymmetry and interpreted it as compensatory. This was the first study identifying the HAROLD pattern and the first one suggesting the compensatory interpretation of this finding. These changes were more pronounced during recall than during recognition, consistent with behavioral evidence that recall is more sensitive to aging.

In another study by Cabeza and colleagues (2002), YAs, high-performing OAs (old-high), and low-performing OAs (old-low) studied words presented auditorily or visually. During scanning, they were presented with words visually and made either old/new decisions (item memory) or heard/seen decisions (context memory). Consistent with their previous results, YAs showed right PFC activity for context trials, whereas OAs showed bilateral PFC activity (HAROLD). Importantly, however, this pattern was only seen for the old-high adults, supporting a compensation account of the HAROLD pattern (Figure 22.5).

Figure 22.5 Prefrontal cortex (PFC) activity during episodic memory retrieval is right lateralized in young and old-low participants, but bilateral in old-high subjects (HAROLD). Reprinted from NeuroImage, Vol. 3, Roberto Cabeza, Nicole D. Anderson, Jill K. Locantore, and Anthony R. McIntosh, “Aging Gracefully: Compensatory Brain Activity in High-Performing Older Adults,” 1394–1402, Copyright (2002), with permission from Elsevier.

As an example of an event-related EM retrieval study, Morcom et al. (Morcom, Li, & Rugg, 2007) studied age differences during successful retrieval using a source memory task involving pictures with two encoding sources: animacy classifications or size judgments. They also distinguished between an easy and a difficult condition. During the easy condition, OAs encoded the items three times and YAs two times to equate memory performance. Under these conditions of equal performance, OAs showed more activity during source retrieval (animacy vs. size) in both the left and right PFC regions, as well as in several other cortical areas. Interestingly, in line with CRUNCH (Reuter-Lorenz & Cappell, 2008) and the WM studies by Cappell et al. (2010) and Schneider-Garces et al. (2009), OAs did show overall reductions in activity during the difficult source retrieval condition (single encoding presentation). The authors concluded that the over-recruitment of regions during the easy condition reflects an age-related decline in the efficiency with which neural populations support cognitive function.

Summarizing the studies on PFC and EM retrieval, in blocked design studies, PFC differences between YAs and OAs have been found more frequently in studies using tasks with little environmental support, including recall and context memory tasks, than during simple item recognition. This was exemplified in the study by Cabeza et al. (1997), which included both recall and recognition tasks. These findings suggest a three-way interaction between age, executive demand, and frontal laterality. Distinguishing between old-high and old-low adults, the study by Cabeza et al. (2002) provided direct evidence for the compensation account of HAROLD. Similar to EM encoding studies that used the subsequent memory paradigm, event-related EM retrieval studies that focused on the neural correlates of successful EM retrieval have found equivalent or increased PFC activity in OAs. Interestingly, in line with the WM studies by Cappell et al. (2010) and Schneider-Garces et al. (2009), (p. 463) the event-related study by Morcom et al. (2007) illustrates the importance of distinguishing between differences in task difficulty when considering age differences in brain activity.

Medial Temporal Lobes

Frontal activations showed both reductions and increases with aging, as well as shifts in lateralization of activation. Activation within the MTL, on the other hand, generally shows age-related decreases compared with MTL activation seen in YAs. However, EM retrieval studies indicate a shift in the foci of activation from the hippocampus proper to more parahippocampal regions in aging.

Working Memory

Although the MTL has been strongly linked to EM, and the PFC to WM, MTL processes are also thought to play a role in WM tasks, particularly when these involve the binding between different elements (Ranganath & Blumenfeld, 2005). Regarding aging, only three WM studies found reductions in hippocampal activity during WM tasks, which all used nonverbal tasks.

The first study was conducted by Grady et al. (1998). They employed a face WM task with varying intervals of item maintenance. Results showed that OAs have difficulty maintaining hippocampal activation across longer delays. As the delay extended from 1 to 6 seconds, left hippocampal activity increased in YAs but decreased in OAs, which implies that OAs have difficulties initiating memory strategies mediated by MTL or sustaining MTL activity beyond very short retention intervals.

Figure 22.6 The left hippocampus showed an age × condition interaction. In young adults, hippocampal activity was greater in combination trials (object + location) than in the object-only and location-only conditions. In older adults, activation was lower in the combination trials than in the object-only condition.

The second study was conducted by Mitchell et al. (2000). They investigated a WM paradigm with an important binding component. In each trial, participants were presented an object in a particular screen location and had to hold in WM the object, its location, or both (combination trials). Combination trials can be assumed to involve not only WM maintenance but also the binding of different information into an integrated memory trace (associative memory EM encoding). OAs showed a deficit in accuracy in the combination condition but not in the object and location conditions. Two regions were differentially involved in the combination condition in YAs but not in OAs: a left anterior hippocampal region and an anteromedial PFC region (right Brodmann area 10) (Figure 22.6). According to the authors, a disruption of a hippocampal–PFC circuit may underlie binding deficits in OAs.

Finally, in a study by Park et al. (2003), OAs showed an age-related reduction in hippocampal activity. The left hippocampus was more activated in the viewing than in the maintenance condition in YAs but not in OAs. As in the study by Mitchell et al. (2000), the age-related reduction in hippocampal activity was attributed to deficits in binding operations.

In sum, three nonverbal WM studies using spatial/pictorial stimuli (Grady et al., 1998; Mitchell et al., 2000; Park et al., 2003) found age-related decreases in hippocampus activity. Interestingly, no verbal WM study found such decreases (Cappell et al., 2010; Reuter-Lorenz et al., 2000). It is possible that nonverbal tasks are more dependent on hippocampal-mediated relational memory processing, and hence more sensitive to age-related deficits in MTL regions. Thus, contrary to PFC findings, WM studies have generally found age-related reductions in MTL activity. (p. 464)

Episodic Memory Encoding

During EM encoding, frontal activations showed both reductions and increases with aging. On the other hand, similar to WM, activation within the MTL during EM encoding generally shows age-related decreases compared with MTL activation seen in YAs. We will briefly discuss the general findings of blocked-design studies and then more recent event-related fMRI studies using the subsequent memory paradigm.

Although not as frequently as reductions in left PFC activity, blocked design studies using both intentional and incidental EM encoding paradigms have found age-related decreases in MTL activity. As an example of intentional EM encoding, in their study examining face encoding, Grady et al. (1995) found that, compared with YAs, OAs showed less activity not only in the left PFC but also in the MTL. Furthermore, they found a highly significant correlation between hippocampus and left PFC activity in YAs, but not in OAs. Based on these results, they concluded that encoding in OAs is accompanied by reduced neural activity and diminished connectivity between PFC and MTL areas. As an example of incidental EM encoding, Daselaar et al. (2003a) investigated levels of processing in aging using a deep (living/nonliving) versus shallow (uppercase/lowercase) encoding task. Despite seeing common activation of regions involved in a semantic network across both age groups, activation differences were seen when comparing levels of processing. OAs revealed significantly less activation in the left anterior hippocampus during deep relative to shallow classification. The researchers concluded that in addition to PFC changes, under-recruitment of MTL regions contributes, at least in part, to age-related impairments in EM encoding.

Similar to blocked design studies, event-related EM encoding studies have generally found age-related reductions in MTL activity during tasks using single words or pictures (Dennis et al., 2006; Gutchess et al., 2005). A study by Dennis et al. (2008) also included a source memory paradigm and provided clear support for the binding deficit hypothesis. YAs and OAs were studied with fMRI while encoding faces, scenes, and face–scene pairs (source memory). In line with binding deficit theory, the investigators found age-related reductions in subsequent memory activity in the hippocampus, which were more pronounced for face–scene pairs than for item memory (faces and scenes).

The aforementioned study by Daselaar et al. (2003b) that distinguished between high- and low-performing OAs linked MTL reductions directly to individual differences in memory performance. These researchers found an age-related reduction in activity in the anterior MTL when comparing subsequently remembered items to a motor baseline, which was specific to low-performing OAs. Based on these findings, they concluded that MTL dysfunction during encoding is an important factor in age-related memory decline.

To summarize, blocked and event-related EM encoding studies have generally found age-related MTL reductions (Daselaar et al., 2003a, 2003b; Dennis et al., 2006; Gutchess et al., 2005). In line with the binding deficit hypothesis, the study by Dennis et al. links age-related MTL reductions mainly to source memory. However, other studies also found reductions during individual items. These findings suggest that age-related binding deficits play a role not only in complex associative memory tasks but also in simpler item memory tasks. One explanation for these results is that, in general, item memory tasks also have an associative component. In fact, the deep processing tasks used in these studies are specifically designed to invoke semantic associations in relation to the study items. As discussed in the next section, recollection of these associations can be used as confirmatory evidence during EM retrieval: remembering specific associations with a study item confirms that one has seen the study items. The study by Daselaar et al. (2003b) directly linked reduced MTL activity during single-word encoding to impaired performance on a subsequent recognition test.

Episodic Memory Retrieval

As noted, relational memory theory asserts that the hippocampal formation is more involved in binding or relational memory operations, whereas the surrounding parahippocampal gyrus is more involved in individual item memory. Recent EM retrieval studies have indicated a shift from hippocampal to parahippocampal regions with age that may reflect a reduced employment of relational memory operations during EM retrieval (Cabeza et al., 2004; Daselaar, Fleck, Dobbins, Madden, & Cabeza, 2006; Giovanello, Kensinger, Wong, & Schacter, 2010). This idea is supported by a large number of behavioral studies indicating that OAs show an increased reliance on familiarity-based retrieval as opposed to recollection-based retrieval (e.g., Bastin & Van der Linden, 2003; Davidson & Glisky, 2002; Java, 1996; Mantyla, 1993; Parkin & Walter, 1992). As mentioned before, recollection (p. 465) refers to remembering an item together with contextual details, which is more dependent on binding, whereas familiarity refers to knowing that an item occurred in the past even though its contextual details cannot be retrieved.


The first support for a shift from recollection to familiarity EM retrieval processes came from a study by Cabeza et al. (2004). They investigated the effects of aging on several cognitive tasks, including a verbal recognition task. Within the MTLs, they found a dissociation between a hippocampal region, which showed weaker activity in OAs than in YAs, and a parahippocampal region, which showed the converse pattern. Given evidence that hippocampal and parahippocampal regions are more involved in recollection and familiarity, respectively (Eichenbaum et al., 2007), this finding is consistent with the notion that OAs are more impaired in recollection than in familiarity (e.g., Jennings & Jacoby, 1993; Parkin & Walter, 1992). Indeed, the age-related increase in parahippocampal cortex activity suggests that OAs may be compensating for recollection deficits by relying more on familiarity. Supporting this idea, OAs had a larger number of “know” responses (familiarity-based EM retrieval, that is, “knowing” that something is old) than YAs, and these responses were positively correlated with the parahippocampal activation.

In line with the findings by Cabeza and colleagues, a recent study by Giovanello et al. (2010) also found a hippocampal–parahippocampal shift with age during retrieval. They used a false memory paradigm in which conjunction words during study were recombined during testing. For instance, “blackmail” and “jailbird” were presented during the study, and “blackbird” was presented during test. False memory conjunction errors (responding “blackbird” is old) tended to occur more frequently in OAs than in YAs. Giovanello and colleagues found that OAs showed more parahippocampal activity during recombined conjunction words at retrieval (false memory), but less hippocampal activity during identical conjunction words (veridical memory). Given that false memories are associated with gist-based processes that rely on familiarity (Balota et al., 1999; Dennis, Kim, & Cabeza, 2008), the age-related increase in parahippocampal cortex fits well with the results by Cabeza et al. (2004).

Another study by Cabeza and colleagues provided a similar pattern of results (Daselaar, Fleck, et al., 2006). YAs and OAs made old/new judgments about previously studied words followed by a confidence judgment from low to high. On the basis of previous research (Daselaar, Fleck, & Cabeza, 2006; Yonelinas, 2001), recollection was measured as an exponential change in brain activity as a function of confidence, and familiarity was measured as a linear change. The results revealed a clear double dissociation within MTL: whereas recollection-related activity in the hippocampus was reduced by aging, familiarity-related activity in rhinal cortex was increased by aging (Figure 22.7A). These results suggested that OAs compensate for deficits in recollection processes mediated by the hippocampus by relying more on familiarity processes mediated by rhinal cortex. Supporting this interpretation, within-participants regression analyses based on single-trial activity showed that recognition accuracy was determined by only hippocampal activity in YAs but by both hippocampal and rhinal activity in OAs. Also consistent with the notion of compensation, functional connectivity analyses showed that correlations between the hippocampus and posterior regions associated with recollection were greater in YAs, whereas correlations between the rhinal cortex and bilateral PFC regions were greater in OAs (Figure 22.7B). The latter effect suggests a top-down modulation of PFC on rhinal activity in OAs. The finding of preserved rhinal function in healthy OAs has important clinical implications because this region is impaired early in AD (Killiany et al., 2000; Pennanen et al., 2004).

In sum, retrieval studies have found both increases and decreases in MTL activity. The findings by Cabeza and colleagues suggest that at least some of these increases reflect a shift from recollection-based (hippocampus) to familiarity-based (parahippocampal/rhinal cortex) retrieval. Furthermore, their functional connectivity findings suggest that the greater reliance on familiarity processes in OAs may be mediated by a top-down frontal modulation.
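The distinction between an exponential (recollection-like) and a linear (familiarity-like) change in activity across confidence levels can be illustrated with a small model comparison. The sketch below is an assumption-laden simplification, not the analysis used by Daselaar and colleagues: it fits each profile by ordinary least squares against either the confidence level or its exponential and compares residual sums of squares. The function names and the activity values are hypothetical.

```python
import math


def fit_rss(x, y):
    """Fit y = a + b*x by ordinary least squares; return the residual sum of squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def best_profile(activity):
    """Label an activity-by-confidence profile as 'linear' or 'exponential'."""
    levels = list(range(1, len(activity) + 1))  # confidence from low to high
    linear_rss = fit_rss(levels, activity)
    exponential_rss = fit_rss([math.exp(level) for level in levels], activity)
    return "linear" if linear_rss <= exponential_rss else "exponential"


# Hypothetical mean activity at six confidence levels (arbitrary units).
hippocampus_like = [0.1, 0.1, 0.2, 0.3, 0.9, 2.5]  # sharp rise at high confidence
rhinal_like = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]       # steady change per level

print(best_profile(hippocampus_like), best_profile(rhinal_like))
```

Comparing residuals is only one way to separate the two profiles; the point of the sketch is that a region tracking recollection should be better described by the sharply accelerating regressor, and a region tracking familiarity by the graded one.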

Discussion

In summary, our review of functional neuroimaging studies of cognitive aging has identified considerable age-related changes in activity during WM, EM encoding, and EM retrieval tasks not only in the PFC but also in the MTL. These findings suggest that functional changes in both the PFC and MTL play a role in age-related memory deficits. Focusing first on PFC findings, the studies indicated both age-related reductions and increases in PFC activity. During WM tasks, OAs show reduced activity in the PFC regions engaged by YAs, but greater activity in other regions, such as contralateral PFC regions. The latter changes often resulted in the more bilateral pattern of PFC activity in OAs than YAs known as HAROLD (Cabeza et al., 2004; Park et al., 2003; Reuter-Lorenz et al., 2000, 2001). (p. 466) In general, age-related PFC increases and HAROLD findings have been attributed to functional compensation in the aging brain.

During EM encoding tasks, the most consistent finding has been a reduction in left PFC activity. This finding is more frequent for intentional than for incidental EM encoding tasks. The age-related reduction in left PFC activity was often coupled with an age-related increase in right PFC activity (i.e., HAROLD). EM retrieval was also associated with HAROLD, and this pattern was found more often in studies using more challenging recall and context memory tasks than during simple item recognition tasks. Finally, (p. 467) EM retrieval studies suggest a shift from hippocampal (recollection) to parahippocampal (familiarity) retrieval processes.

Figure 22.7 The effects of aging yielded a double dissociation between two medial temporal lobe (MTL) subregions: Whereas recollection-related activity (exponential increase) in the hippocampus was attenuated by aging, familiarity-related activity (linear decrease) in the rhinal cortex was enhanced by aging. The hippocampal exponential rate parameter (λ) provides a measure of the sharpness of the exponential increase of the perceived oldness function in the hippocampus. The rhinal slope parameter provides a measure of the steepness of the perceived oldness function in the rhinal cortex. From Daselaar, Fleck, Dobbins, Madden, & Cabeza, 2005. Reprinted with permission from Oxford University Press.

Linking Cognitive Theories to Age-Related Changes in the Prefrontal Cortex and Medial Temporal Lobes

In the first part of this chapter, we discussed two important cognitive hypotheses that have been put forward to account for age-related deficits in WM and EM, the resource deficit hypothesis and the binding deficit hypothesis. Below, we connect these behavioral and neurobiological findings by linking the resource and binding deficit hypotheses to PFC and MTL function in OAs. Finally, we discuss the relevance of these findings in terms of the clinical distinction between healthy and pathological deficits in EM.


Resource Deficit Hypothesis and Prefrontal Cortex Function

The resource deficit hypothesis postulates that aging reduces attentional resources, and as a result, OAs have greater difficulties with cognitive tasks, including EM tasks, that require greater self-initiated processing. This hypothesis predicts that age-related differences should be smaller when the task provides a supportive environment that reduces attentional demands. Among other findings, the resource deficit hypothesis is supported by evidence that when attentional resources are reduced in YAs, they tend to show EM deficits that resemble those of OAs (Anderson, Craik, & Naveh-Benjamin, 1998; Jennings & Jacoby, 1993).

Regarding neural correlates, Craik (1983) proposed that OAs’ deficits in processing are related to a reduction in the efficiency of PFC functioning. This idea fits with the fact that this region shows the most prominent gray matter atrophy. Moreover, functional neuroimaging studies have found age-related changes in PFC activity that are generally in line with the resource deficit hypothesis.

Given the critical role of PFC in managing attentional resources, the resource deficit hypothesis predicts that age-related changes in PFC activity will be larger for tasks involving greater self-initiated processing or less environmental support. The results of functional neuroimaging studies are generally consistent with this prediction. During EM encoding, age-related decreases in left PFC activation were found frequently during intentional EM encoding conditions (which provide less environmental support) but rarely during incidental EM encoding conditions (which provide greater environmental support). Similarly, during EM retrieval, age-related differences in PFC activity were usually larger for recall and context memory tasks (which require greater cognitive resources) than for recognition memory tasks (which require fewer cognitive resources). Thus, in general, age effects on PFC activity tend to increase as a function of the demands placed on cognitive resources.

However, not all age-related changes in PFC activity suggested decline; on the contrary, many studies found age-related increases in PFC that suggested compensatory mechanisms in the aging brain. In particular, several studies of EM encoding and EM retrieval found activations in contralateral PFC regions in OAs that were not seen in YAs. Importantly, experimental comparisons between high- and low-performing OAs (Cabeza et al., 2002; Rosen et al., 2002) demonstrated the beneficial contribution of contralateral PFC recruitment to memory performance in OAs. Moreover, a recent study using transcranial magnetic stimulation (TMS) found that in YAs, EM retrieval performance was impaired by TMS of the right PFC but not of the left PFC, whereas in OAs, it was impaired by either right or left PFC stimulation (Rossi et al., 2004). This result indicates that the left PFC was less critical for YAs and was used more by OAs, consistent with the compensation hypothesis.

It is important to note that resource deficit and compensatory interpretations are not incompatible. In fact, it is reasonable to assume that the recruitment of additional brain regions (e.g., in the contralateral PFC hemisphere) reflects an attempt to compensate for reduced cognitive resources. One way in which OAs could counteract deficits in the particular pool of cognitive resources required by a cognitive task is to tap into other pools of cognitive resources. If one task is particularly dependent on cognitive processes mediated by one hemisphere, the other hemisphere represents an alternative pool of cognitive resources. Thus, in the case of PFC-mediated cognitive resources, if OAs have deficits in PFC activity in one hemisphere, they may compensate for these deficits by recruiting contralateral PFC regions. Moreover, age-related decreases suggestive of resource deficits and age-related increases suggestive of compensation have often been found in the same conditions. For example, intentional EM encoding studies have shown age-related decreases in left PFC activity coupled with age-related increases in right PFC activity, leading to a dramatic reduction in hemispheric asymmetry in OAs (i.e., HAROLD). (p. 468)
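One common way to put a number on the hemispheric asymmetry discussed above is a laterality index, (L − R) / (L + R). The sketch below is only an illustration and is not taken from the studies reviewed here: the function names are hypothetical, activation values are assumed to be non-negative, and the group values are invented.

```python
def laterality_index(left, right):
    """Return (L - R) / (L + R): +1 is fully left-lateralized, -1 fully right, 0 bilateral.

    Assumes non-negative activation estimates in arbitrary units.
    """
    total = left + right
    if total == 0:
        raise ValueError("laterality index is undefined when both values are zero")
    return (left - right) / total

def asymmetry_reduction(li_young, li_old):
    """Positive values mean activity is less lateralized in the older group."""
    return abs(li_young) - abs(li_old)

# Invented group-mean PFC activations, for illustration only.
li_ya = laterality_index(left=8.0, right=2.0)   # strongly left-lateralized
li_oa = laterality_index(left=5.0, right=4.0)   # closer to bilateral
reduction = asymmetry_reduction(li_ya, li_oa)
```

With these made-up values the older group has the smaller absolute index, which is the direction of effect that HAROLD describes. The index says nothing about whether the extra contralateral activity is compensatory; that question depends on its relation to performance.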

Binding Deficit Hypothesis and Medial Temporal Lobe Function

The binding deficit hypothesis postulates that age-related memory deficits are primarily the result of difficulties in encoding and retrieving novel associations between items. This hypothesis predicts that OAs are particularly impaired in EM tasks that involve relations between individual items or between items and their context. Given that relational memory has been strongly associated with the hippocampus (Eichenbaum, Otto, & Cohen, 1994), this hypothesis also predicts that OAs will show decreased hippocampal activity during memory tasks, particularly when they involve relational information.

Functional neuroimaging studies have identified considerable age-related changes not only in the PFC but also in MTL regions. As noted, the MTL also shows substantial atrophy in aging. Yet, the rate of decline differs for different subregions: Whereas the hippocampus shows a marked decline, the rhinal cortex is relatively preserved in healthy aging (see Figure 22.1). This finding is in line with the idea that age-related memory deficits are particularly pronounced during relational memory tasks, which depend on the hippocampus.

In line with anatomical findings, functional neuroimaging studies have found substantial age-related changes in MTL activity during WM, EM encoding, and EM retrieval. Several studies have found age-related decreases in both hippocampal and parahippocampal regions. During WM and EM encoding, these reductions are seen in tasks that emphasize the binding between different elements (Dennis, Hayes, et al., 2008; Mitchell et al., 2000). However, declines in hippocampal activation are also seen during tasks using individual stimuli in healthy OAs (e.g., Daselaar et al., 2003a, 2003b; Park et al., 2003). Finally, during EM retrieval some studies found decreases in hippocampal activity, but also greater activity in parahippocampal regions, which may be compensatory (Cabeza et al., 2004; Daselaar, Fleck, & Cabeza, 2006; Giovanello et al., 2010).

In general, age-related changes in MTL activity are consistent with the binding deficit hypothesis. Age-related reductions in hippocampal activity were found during the maintenance of EM encoding of complex scenes, which involved associations among picture elements (Gutchess et al., 2005), and during deep EM encoding of words, which involved identification of semantic associations (Daselaar et al., 2003a, 2003b; Dennis et al., 2006). Finally, one study specifically linked age-related reductions in hippocampal activity to recollection, which involves recovery of item–context associations (Daselaar, Fleck, et al., 2006). Yet, it should be noted that age-related changes in MTL activity were often accompanied by concomitant changes in PFC activity. Hence, in these cases, it is unclear whether such changes signal MTL dysfunction or are the result of a decline in executive processes mediated by PFC regions. However, studies using incidental EM encoding tasks with minimal self-initiated processing requirements have also identified age-related differences in MTL activity without significant changes in PFC activity (Daselaar et al., 2003a, 2003b).

As in the case of PFC, not all age-related changes in MTL activity suggest decline; several findings suggest compensation. OAs have been found to show reduced activity in the hippocampus but increased activity in other brain regions such as the parahippocampal gyrus (Cabeza et al., 2004) and the rhinal cortex (Daselaar, Fleck, et al., 2006). These results were interpreted as a recruitment of familiarity processes mediated by parahippocampal regions to compensate for the decline of recollection processes that are dependent on the hippocampus proper. These results fit well with the relational memory view (Cohen & Eichenbaum, 1993; Eichenbaum et al., 1994), which states that the hippocampus is involved in binding an item with its context (recollection), whereas the surrounding parahippocampal cortex mediates item-specific memory processes (familiarity).

Healthy Versus Pathological Aging

As mentioned at the beginning of this chapter, one of the biggest challenges in cognitive aging research is to isolate the effects of healthy aging from those of pathological aging. Structural neuroimaging literature suggests that healthy aging is accompanied by greater declines in frontal regions compared with MTL (Raz et al., 2005). In contrast, pathological aging is characterized by greater decline in MTL than in frontal regions (Braak, Braak, & Bohl, 1993; Kemper, 1994). In fact, functional neuroimaging evidence suggests that prefrontal activity tends to be maintained or even increased in early AD (Grady, 2005). Thus, these findings suggest that memory decline in healthy aging is more dependent on frontal than MTL deficits, whereas the opposite pattern is more characteristic of pathological aging (Buckner, 2004; West, 1996).

In view of these findings, clinical studies aimed at an early diagnosis of (p. 469) age-related pathology have mainly targeted changes in MTL (Nestor, Scheltens, & Hodges, 2004). Yet, the studies reviewed in this chapter clearly indicate that healthy OAs are also prone to MTL decline. Hence, rather than focusing on MTL deficits alone, diagnosis of age-related pathology may be improved by employing some type of composite score reflecting the ratio between MTL and frontal decline.

In terms of MTL dysfunction in healthy and pathological aging, it is also critical to assess the specific type or loci of MTL dysfunction. Critically, a decline in hippocampal function can be seen in both healthy aging and AD. Thus, even though hippocampal volume decline is an excellent marker of concurrent AD (Scheltens, Fox, Barkhof, & De Carli, 2002), it is not a reliable measure for distinguishing normal aging from early stages of the disease (Raz et al., 2005). In contrast, changes in the rhinal cortex are not apparent in healthy aging (see Figure 22.1), but they are present in early AD patients with only mild impairments (Dickerson et al., 2004). In a discriminant analysis, Pennanen and colleagues (2004) showed that, although hippocampal volume is indeed the best marker to discriminate AD patients from normal controls, measuring the volume of the entorhinal cortex is much more useful for distinguishing between incipient AD (mild cognitive impairment) and healthy aging. The fMRI study by Daselaar, Cabeza, and colleagues provides indications that the implementation of the recollection/familiarity distinction during EM retrieval in combination with fMRI may be promising in that respect (Daselaar, Fleck, et al., 2006). Finally, it should be noted that, despite the rigorous screening procedures typical of functional neuroimaging studies of healthy aging, it remains possible that early symptoms of age-related pathology went undetected in some of the studies reviewed in this chapter.
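The composite score suggested above, a ratio between MTL and frontal decline, could take many forms, and the chapter does not specify one. The following is a minimal sketch under stated assumptions: decline is expressed as a fractional loss relative to a reference value, the function names are hypothetical, the numbers are invented, and no diagnostic threshold is implied.

```python
def fractional_decline(reference, current):
    """Fraction of the reference value that has been lost (0.1 means a 10% loss).

    `reference` could be an earlier measurement or a normative value; which one
    is appropriate is left open here.
    """
    if reference <= 0:
        raise ValueError("reference must be positive")
    return (reference - current) / reference

def mtl_frontal_ratio(mtl_decline, frontal_decline, floor=1e-9):
    """Ratio of MTL decline to frontal decline.

    Values above 1 indicate relatively greater MTL decline; values below 1
    indicate relatively greater frontal decline. `floor` avoids division by zero.
    """
    return mtl_decline / max(frontal_decline, floor)

# Invented volumes in arbitrary units, for illustration only.
mtl = fractional_decline(reference=100.0, current=90.0)       # 10% loss
frontal = fractional_decline(reference=200.0, current=190.0)  # 5% loss
ratio = mtl_frontal_ratio(mtl, frontal)
```

In this made-up case the ratio is about 2, meaning MTL decline is proportionally larger than frontal decline. Whether such a ratio separates healthy from pathological aging in practice is an empirical question that the sketch does not address.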

Summary

In this chapter, we reviewed functional neuroimaging evidence highlighting the role of the PFC and MTL regions in age-related decline in WM and EM function. The chapter focused on two major factors thought to underlie age-related memory decline and strongly linked to PFC and MTL function, namely deficits in executive function and deficits in binding processes. We discussed functional neuroimaging studies that generally showed age-related decreases in PFC and MTL activity during WM, EM encoding, and EM retrieval. Yet, some of these studies also found preserved or increased levels of PFC or MTL activity in OAs, which may be compensatory. Regarding the PFC, several WM and EM studies have found an age-related increase in contralateral PFC activity, leading to an overall reduction in frontal asymmetry in OAs (HAROLD). As discussed, studies that divided OAs into high and low performers provided strong support for the idea that HAROLD reflects a successful compensatory mechanism. Regarding the MTL, several WM and EM studies reported age-related decreases in MTL activity. Yet, studies of EM retrieval have also found age-related increases in MTL activity. Recent findings suggest that at least some of these increases reflect a compensatory shift from hippocampal-based recollection processes to parahippocampal-based familiarity processes. In sum, in view of the substantial changes in PFC and MTL that take place when we grow older, a reduction in WM and EM function seems inevitable. Yet, our review also suggests that some OAs can counteract this reduction by using alternative brain resources within the PFC and MTL that allow them to compensate for the general deficits in executive and binding operations underlying WM and EM decline with age.


Future Directions

In this chapter we reviewed functional neuroimaging studies of WM and EM that identified considerable age-related changes in PFC and MTL activity. These findings suggest that functional changes in both PFC and MTL play a role in age-related memory deficits. However, as outlined below, there are several open questions that need to be addressed in future functional neuroimaging studies of memory and aging.

What is the role of PFC–MTL functional connectivity in age-related memory decline?

In this chapter, we discussed the role of PFC and MTL in age-related memory decline in WM and EM separately, and only mentioned a few cases in which age-related differences in PFC activation were correlated with MTL activations across participants. However, these studies did not assess age differences in functional connectivity, which need to be measured within participants on a trial-by-trial basis. Moreover, the role of white matter integrity in age-related differences in PFC–MTL connectivity has been ignored in studies of aging and memory. Future aging and memory research should elucidate the relation between age-related memory decline and PFC–MTL coupling by combining functional connectivity measures with structural connectivity in the form of white matter integrity. (p. 470)

What is the role of task demands in age differences in memory activations?

According to CRUNCH (Reuter-Lorenz & Cappell, 2008), OAs show more activity at lower levels of task demands, but as demands increase, YAs show greater activity to meet increasing task loads. Yet, OAs have already reached their ceiling and will show reduced performance and less activity. We discussed evidence from two recent WM studies supporting the CRUNCH model. We also discussed similar findings regarding the HAROLD model, indicating, for instance, greater age-related asymmetry reductions during more difficult recall tasks than during simple recognition tasks. Future fMRI studies of aging and memory should further clarify the relation between task difficulty and age-related activation differences by incorporating multiple levels of task difficulty in their design and including task demand functions in their analyses.

Finally, two open questions are: To what extent is the age-related shift from recollection-based (hippocampus) to familiarity-based (rhinal cortex) retrieval indeed specific to healthy OAs? and Can the familiarity versus recollection distinction, combined with fMRI, be used in practice for the clinical diagnosis of pathological age-related conditions?
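The CRUNCH prediction described in this section, in which the sign of the age difference in activation depends on task load, can be made concrete with a toy demand function. This is a deliberately simplified sketch rather than the model as formally specified by its authors: the linear-rise-to-ceiling shape, the gain and ceiling values, and all names are assumptions.

```python
def activation(load, gain, ceiling):
    """Toy demand function: activity rises linearly with load until it hits a ceiling."""
    return min(gain * load, ceiling)

# Hypothetical parameters: older adults recruit more per unit of load
# but reach a lower resource ceiling.
YOUNG = {"gain": 1.0, "ceiling": 10.0}
OLD = {"gain": 1.5, "ceiling": 6.0}

def age_difference(load):
    """Older-minus-younger activation at a given load (positive = over-recruitment in OAs)."""
    return activation(load, **OLD) - activation(load, **YOUNG)

low_load_diff = age_difference(2)    # 3.0 - 2.0: older adults show more activity
high_load_diff = age_difference(8)   # 6.0 - 8.0: older adults show less activity
```

Sampling only one load level would reveal either an age-related increase or an age-related decrease, depending on where that level falls. This is one reason to include several levels of difficulty in a design, as recommended above.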

Author Note

This work was supported by grants from the National Institute on Aging to RC (AG19731; AG23770; AG34580).


References

Anderson, N. D., Craik, F. I. M., & Naveh-Benjamin, M. (1998). The attentional demands of encoding and retrieval in younger and older adults: I. Evidence from divided attention costs. Psychology and Aging, 13, 405–423.
Balota, D. A., Cortese, M. J., Duchek, J. M., Adams, D., Roediger, H. L., McDermott, K. B., et al. (1999). Veridical and false memories in healthy older adults and in dementia of the Alzheimer’s type. Cognitive Neuropsychology, 16 (3-5), 361–384.
Bastin, C., & Van der Linden, M. (2003). The contribution of recollection and familiarity to recognition memory: A study of the effects of test format and aging. Neuropsychology, 17 (1), 14–24.
Braak, H., Braak, E., & Bohl, J. (1993). Staging of Alzheimer-related cortical destruction. European Neurology, 33 (6), 403–408.
Buckner, R. L. (2004). Memory and executive function in aging and AD: Multiple factors that cause decline and reserve factors that compensate. Neuron, 44 (1), 195–208.
Cabeza, R. (2002). Hemispheric asymmetry reduction in older adults: The HAROLD model. Psychology and Aging, 17 (1), 85–100.
Cabeza, R., Anderson, N. D., Locantore, J. K., & McIntosh, A. R. (2002). Aging gracefully: Compensatory brain activity in high-performing older adults. NeuroImage, 17 (3), 1394–1402.
Cabeza, R., Daselaar, S. M., Dolcos, F., Prince, S. E., Budde, M., & Nyberg, L. (2004). Task-independent and task-specific age effects on brain activity during working memory, visual attention and episodic retrieval. Cerebral Cortex, 14 (4), 364–375.
Cabeza, R., Grady, C. L., Nyberg, L., McIntosh, A. R., Tulving, E., Kapur, S., et al. (1997). Age-related differences in neural activity during memory encoding and retrieval: A positron emission tomography study. Journal of Neuroscience, 17, 391–400.
Cappell, K. A., Gmeindl, L., & Reuter-Lorenz, P. A. (2010). Age differences in prefrontal recruitment during verbal working memory maintenance depend on memory load. Cortex, 46 (4), 462–473.
Cohen, N. J., & Eichenbaum, H. (1993). Memory, amnesia, and the hippocampal system. Cambridge, MA: MIT Press.
Craik, F. I. M. (1977). Age differences in human memory. In J. E. Birren & K. W. Schaie (Eds.), Handbook of the psychology of aging (pp. 384–420). New York: Van Nostrand Reinhold.
Craik, F. I. M. (1983). On the transfer of information from temporary to permanent memory. Philosophical Transactions of the Royal Society, London, Series B, 302, 341–359.
Craik, F. I. M. (1986). A functional account of age differences in memory. In F. Klix & H. Hagendorf (Eds.), Human memory and cognitive capabilities, mechanisms, and performances (pp. 409–422). Amsterdam: Elsevier.
Daselaar, S. M., Fleck, M. S., & Cabeza, R. (2006). Triple dissociation in the medial temporal lobes: Recollection, familiarity, and novelty. Journal of Neurophysiology, 96 (4), 1902–1911.
Daselaar, S. M., Fleck, M. S., Dobbins, I. G., Madden, D. J., & Cabeza, R. (2006). Effects of healthy aging on hippocampal and rhinal memory functions: An event-related fMRI study. Cerebral Cortex, 16 (12), 1771–1782.
Daselaar, S. M., Veltman, D. J., Rombouts, S. A., Raaijmakers, J. G., & Jonker, C. (2003a). Deep processing activates the medial temporal lobe in young but not in old adults. Neurobiology of Aging, 24 (7), 1005–1011.
Daselaar, S. M., Veltman, D. J., Rombouts, S. A., Raaijmakers, J. G., & Jonker, C. (2003b). Neuroanatomical correlates of episodic encoding and retrieval in young and elderly subjects. Brain, 126 (Pt 1), 43–56.
Davidson, P. S., & Glisky, E. L. (2002). Neuropsychological correlates of recollection and familiarity in normal aging. Cognitive, Affective and Behavioral Neuroscience, 2 (2), 174–186.
Dennis, N. A., Daselaar, S., & Cabeza, R. (2006). Effects of aging on transient and sustained successful memory encoding activity. Neurobiology of Aging, 28 (11), 1749–1758.
Dennis, N. A., Hayes, S. M., Prince, S. E., Madden, D. J., Huettel, S. A., & Cabeza, R. (2008). Effects of aging on the neural correlates of successful item and source memory encoding. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34 (4), 791–808.
Dennis, N. A., Kim, H., & Cabeza, R. (2008). Age-related differences in brain activity during true and false memory retrieval. Journal of Cognitive Neuroscience, 20 (8), 1390–1402.
Dickerson, B. C., Salat, D. H., Bates, J. F., Atiya, M., Killiany, R. J., Greve, D. N., et al. (2004). Medial temporal lobe function and structure in mild cognitive impairment. Annals of Neurology, 56 (1), 27–35.
(p. 471) Duverne, S., Motamedinia, S., & Rugg, M. D. (2009). The relationship between aging, performance, and the neural correlates of successful memory encoding. Cerebral Cortex, 19 (3), 733–744.
Eichenbaum, H., Otto, T., & Cohen, N. J. (1994). Two functional components of the hippocampal memory system. Behavioral and Brain Sciences, 17 (3), 449–472.
Eichenbaum, H., Yonelinas, A. P., & Ranganath, C. (2007). The medial temporal lobe and recognition memory. Annual Review of Neuroscience, 30, 123–152.
Gabrieli, J. D. (1998). Cognitive neuroscience of human memory. Annual Review of Psychology, 49, 87–115.
Giovanello, K. S., Kensinger, E. A., Wong, A. T., & Schacter, D. L. (2010). Age-related neural changes during memory conjunction errors. Journal of Cognitive Neuroscience, 22 (7), 1348–1361.
Glisky, E. L. (2007). Changes in cognitive function in human aging. In D. R. Riddle (Ed.), Brain aging: Models, methods, and mechanisms (pp. 1–15). Boca Raton, FL: CRC Press.
Grady, C. L. (2005). Functional connectivity during memory tasks in healthy aging and dementia. In R. Cabeza, L. Nyberg, & D. Park (Eds.), Cognitive neuroscience of aging (pp. 286–308). New York: Oxford University Press.
Grady, C. L., McIntosh, A. R., Bookstein, F., Horwitz, B., Rapoport, S. I., & Haxby, J. V. (1998). Age-related changes in regional cerebral blood flow during working memory for faces. NeuroImage, 8 (4), 409–425.
Grady, C. L., McIntosh, A. R., Horwitz, B., Maisog, J. M., Ungerleider, L. G., Mentis, M. J., et al. (1995). Age-related reductions in human recognition memory due to impaired encoding. Science, 269 (5221), 218–221.
Gutchess, A. H., Welsh, R. C., Hedden, T., Bangert, A., Minear, M., Liu, L. L., et al. (2005). Aging and the neural correlates of successful picture encoding: Frontal activations compensate for decreased medial-temporal activity. Journal of Cognitive Neuroscience, 17 (1), 84–96.
Hasher, L., Zacks, R. T., & May, C. P. (1999). Inhibitory control, circadian arousal, and age. In D. Gopher & A. Koriat (Eds.), Attention and performance XVII, cognitive regulation of performance: Interaction of theory and application (pp. 653–675). Cambridge, MA: MIT Press.
Hedden, T., & Gabrieli, J. D. (2004). Insights into the ageing mind: A view from cognitive neuroscience. Nature Reviews Neuroscience, 5 (2), 87–96.
Java, R. I. (1996). Effects of age on state of awareness following implicit and explicit word-association tasks. Psychology and Aging, 11 (1), 108–111.
Jennings, J. M., & Jacoby, L. L. (1993). Automatic versus intentional uses of memory: Aging, attention, and control. Psychology and Aging, 8 (2), 283–293.
Johnson, M. K., Hashtroudi, S., & Lindsay, D. S. (1993). Source monitoring. Psychological Bulletin, 114 (1), 3–28.
Kemper, T. (1994). Neuroanatomical and neuropathological changes during aging and in dementia. In M. L. Albert & E. J. E. Knoepfel (Eds.), Clinical neurology of aging (2nd ed., pp. 3–67). New York: Oxford University Press.
Killiany, R. J., Gomez-Isla, T., Moss, M., Kikinis, R., Sandor, T., Jolesz, F., et al. (2000). Use of structural magnetic resonance imaging to predict who will get Alzheimer’s disease. Annals of Neurology, 47 (4), 430–439.
Logan, J. M., Sanders, A. L., Snyder, A. Z., Morris, J. C., & Buckner, R. L. (2002). Under-recruitment and nonselective recruitment: Dissociable neural mechanisms associated with aging. Neuron, 33 (5), 827–840.
Mantyla, T. (1993). Knowing but not remembering: Adult age differences in recollective experience. Memory and Cognition, 21 (3), 379–388.
Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24, 167–202.
Milner, B. (1972). Disorders of learning and memory after temporal lobe lesions in man. Clinical Neurosurgery, 19, 421–446.
Mitchell, K. J., Johnson, M. K., Raye, C. L., & D’Esposito, M. (2000). fMRI evidence of age-related hippocampal dysfunction in feature binding in working memory. Cognitive Brain Research, 10 (1-2), 197–206.
Morcom, A. M., Good, C. D., Frackowiak, R. S., & Rugg, M. D. (2003). Age effects on the neural correlates of successful memory encoding. Brain, 126, 213–229.
Morcom, A. M., Li, J., & Rugg, M. D. (2007). Age effects on the neural correlates of episodic retrieval: Increased cortical recruitment with matched performance. Cerebral Cortex, 17, 2491–2506.
Naveh-Benjamin, M. (2000). Adult age differences in memory performance: Tests of an associative deficit hypothesis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26 (5), 1170–1187.
Nestor, P. J., Scheltens, P., & Hodges, J. R. (2004). Advances in the early detection of Alzheimer’s disease. Nature Medicine, 10 (Suppl), S34–S41.
Paller, K. A., & Wagner, A. D. (2002). Observing the transformation of experience into memory. Trends in Cognitive Sciences, 6 (2), 93–102.
Park, D. C., Welsh, R. C., Marshuetz, C., Gutchess, A. H., Mikels, J., Polk, T. A., et al. (2003). Working memory for complex scenes: Age differences in frontal and hippocampal activations. Journal of Cognitive Neuroscience, 15 (8), 1122–1134.
Parkin, A. J., & Walter, B. M. (1992). Recollective experience, normal aging, and frontal dysfunction. Psychology and Aging, 7, 290–298.
Pennanen, C., Kivipelto, M., Tuomainen, S., Hartikainen, P., Hanninen, T., Laakso, M. P., et al. (2004). Hippocampus and entorhinal cortex in mild cognitive impairment and early AD. Neurobiology of Aging, 25 (3), 303–310.
Ranganath, C., & Blumenfeld, R. S. (2005). Doubts about double dissociations between short- and long-term memory. Trends in Cognitive Sciences, 9 (8), 374–380.
Raz, N. (2005). The aging brain observed in vivo. In R. Cabeza, L. Nyberg & D. C. Park (Eds.), Cognitive neuroscience of aging (pp. 19–57). New York: Oxford University Press.
Raz, N., Gunning, F. M., Head, D., Dupuis, J. H., McQuain, J., Briggs, S. D., et al. (1997). Selective aging of the human cerebral cortex observed in vivo: Differential vulnerability of the prefrontal gray matter. Cerebral Cortex, 7 (3), 268–282.
Raz, N., Lindenberger, U., Rodrigue, K. M., Kennedy, K. M., Head, D., Williamson, A., et al. (2005). Regional brain changes in aging healthy adults: General trends, individual differences and modifiers. Cerebral Cortex, 15 (11), 1676–1689.
Reuter-Lorenz, P. A., & Cappell, K. A. (2008). Neurocognitive aging and the compensation hypothesis. Current Directions in Psychological Science, 17 (3), 177–182.
(p. 472) Reuter-Lorenz, P., Jonides, J., Smith, E. S., Hartley, A., Miller, A., Marshuetz, C., et al. (2000). Age differences in the frontal lateralization of verbal and spatial working memory revealed by PET. Journal of Cognitive Neuroscience, 12, 174–187.
Reuter-Lorenz, P. A., Marshuetz, C., Jonides, J., Smith, E. E., Hartley, A., & Koeppe, R. (2001). Neurocognitive ageing of storage and executive processes. European Journal of Cognitive Psychology, 13 (1-2), 257–278.
Rosen, A. C., Prull, M. W., O’Hara, R., Race, E. A., Desmond, J. E., Glover, G. H., et al. (2002). Variable effects of aging on frontal lobe contributions to memory. NeuroReport, 13 (18), 2425–2428.
Rossi, S., Miniussi, C., Pasqualetti, P., Babiloni, C., Rossini, P. M., & Cappa, S. F. (2004). Age-related functional changes of prefrontal cortex in long-term memory: A repetitive transcranial magnetic stimulation study. Journal of Neuroscience, 24 (36), 7939–7944.
Rypma, B., & D’Esposito, M. (2000). Isolating the neural mechanisms of age-related changes in human working memory. Nature Neuroscience, 3 (5), 509–515.
Salthouse, T. A. (1996). The processing-speed theory of adult age differences in cognition. Psychological Review, 103 (3), 403–428.
Scheltens, P., Fox, N., Barkhof, F., & De Carli, C. (2002). Structural magnetic resonance imaging in the practical assessment of dementia: Beyond exclusion. Lancet Neurology, 1 (1), 13–21.
Schneider-Garces, N. J., Gordon, B. A., Brumback-Peltz, C. R., Shin, E., Lee, Y., Sutton, B. P., et al. (2009). Span, CRUNCH, and beyond: Working memory capacity and the aging brain. Journal of Cognitive Neuroscience, 22 (4), 655–669.
Simons, J. S., & Spiers, H. J. (2003). Prefrontal and medial temporal lobe interactions in long-term memory. Nature Reviews Neuroscience, 4 (8), 637–648.
Spencer, W. D., & Raz, N. (1995). Differential effects of aging on memory for content and context: A meta-analysis. Psychology and Aging, 10 (4), 527–539.
Squire, L. R., Schmolck, H., & Stark, S. M. (2001). Impaired auditory recognition memory in amnesic patients with medial temporal lobe lesions. Learning and Memory, 8 (5), 252–256.
Tulving, E. (1983). Elements of episodic memory. Oxford, UK: Clarendon Press.
Wager, T. D., & Smith, E. E. (2003). Neuroimaging studies of working memory: A meta-analysis. Cognitive, Affective and Behavioral Neuroscience, 3 (4), 255–274.
West, R. L. (1996). An application of prefrontal cortex function theory to cognitive aging. Psychological Bulletin, 120 (2), 272–292.
Yonelinas, A. P. (2001). Components of episodic memory: The contribution of recollection and familiarity. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 356 (1413), 1363–1374.

Sander Daselaar

Sander Daselaar, Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, Netherlands, Center for Cognitive Neuroscience, Duke University, Durham, NC

Roberto Cabeza

Roberto Cabeza is professor at the Department of Psychology and Neuroscience and a core member at the Center for Cognitive Neuroscience, Duke University.


Memory Disorders

Barbara Wilson and Jessica Fish
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0023

Abstract and Keywords

Following some introductory comments, this chapter describes the ways in which memory systems can be affected after an insult to the brain. The main focus is on people with nonprogressive brain injury, particularly traumatic brain injury (TBI), stroke, encephalitis, and hypoxia. References are provided for those readers interested in memory disorders following progressive conditions such as dementia. The chapter then considers recovery of memory function for people with nonprogressive memory deficits. This is followed by a section addressing the assessment of memory abilities, suggesting that both standardized and functional assessment procedures are required to identify an individual’s cognitive strengths and weaknesses and to plan for rehabilitation. The final part of the chapter addresses rehabilitation of memory functioning, including compensatory strategies, ways to improve new learning, the value of memory groups, and ways to reduce the emotional consequences of memory impairment.

Keywords: memory disorders, brain injury, memory function, memory systems

Introduction and Overview

The neuroanatomical structures involved in memory functioning include the hippocampi and surrounding areas, the limbic structures, and the frontal-temporal cortex. Given the number of structures and networks involved, it is not surprising that memory problems are so frequently reported after many kinds of brain damage. Memory impairment is one of the most common consequences of an insult to the brain (Stilwell et al., 1999), affecting almost all people with dementia (approximately 10 percent of people over the age of 65 years), some 36 percent of survivors of TBI, and about 70 percent of survivors of encephalitis, as well as those with hypoxic brain damage following cardiac or pulmonary arrest, attempted suicide, or near drowning; Parkinson’s disease; multiple sclerosis; AIDS; Korsakoff’s syndrome; epilepsy; cerebral tumors; and so forth. The problem is enormous, and unfortunately, many people do not get the care and help they and their families need.

What do we mean by memory? Memory is not one skill or function or system but is a “complex combination of memory subsystems” (Baddeley, 1992, p. 5). We can classify or understand these systems in a number of ways: We can consider the stages involved in memory, the length of time for which information is stored, the type of information stored, the modality information is stored in, whether explicit or implicit recall is required, whether recall or recognition is required, whether we are trying to remember things that have already occurred or remember to do things in the future, and whether memories date from before or after an insult to the brain. These distinctions are elaborated on in the next section of this chapter.

Although survivors of brain injury can have problems with each one of these subcategories of memory, the most typical scenario is for a memory-impaired person to have (1) normal or near normal immediate memory; (2) problems remembering after a delay or distraction; (3) a memory gap for the time just before the accident, illness, or insult to the brain; and (4) difficulty learning most new information. Those with no other cognitive deficits (apart from memory) are said to have an amnesic syndrome, whereas those with more widespread cognitive deficits (the majority of memory-impaired people) are said to have memory impairment. This distinction is not always adhered to, however, and it is not uncommon to hear the term amnesia used for all memory-impaired people.

Although people with progressive conditions will not recover or improve memory functioning, much can be done to help them survive with a better quality of life (Camp et al., 2000; Clare, 2008; Clare & Woods, 2004); they can learn new information (Clare et al., 1999, 2000) and be helped to compensate for their difficulties (Berry, 2007).
Recovery for people with nonprogressive brain injury is addressed below.

Assessment of memory functioning and other cognitive areas is an important part of any treatment program for people with memory difficulties. We need to identify the person’s cognitive strengths and weaknesses, and we need to identify the everyday problems causing the most distress for the patient and family. Having identified these situations, we then need to help the memory-impaired person to reduce his or her real-life problems. Assessment and treatment are the themes of the final parts of this chapter.

Understanding Memory Disorders

Stages of Remembering

There are three stages involved in remembering: encoding, the stage of taking in information; storage, the stage of retaining information; and retrieval, the stage of accessing information when it is required. All can be differentially affected by an insult to the brain even though, in real life, these stages interact with one another. For example, people with encoding problems show attention difficulties. Although in some circumstances remembering is unintentional and we can recall things that happen when we are not paying attention to them, typically we need to pay attention when we are learning something new or when it is important to remember. People with the classic amnesic syndrome do not have encoding difficulties, while those with executive deficits, say, following a TBI or in the context of dementia, may well have them.

Once information is registered in memory, it has to be stored there until needed. Most people forget new information rather rapidly over the first few days, and then the rate of forgetting slows down. This is also true for people with memory problems, bearing in mind of course that in their case, relatively little information may be stored in the first place. However, once the information is encoded adequately and enters the long-term store, testing, rehearsal, or practice can help keep it there.

Retrieving information when we need it is the third stage in the memory process. We all experience occasions when we know we know something such as a particular word or the name of a person or a film, yet we cannot retrieve it at the right moment. This is known as the “tip of the tongue phenomenon.” If someone provides us with a word, we can usually determine immediately whether or not it is correct. Retrieval problems are even more likely for people with memory problems. If we can provide a “hook” in the form of a cue or prompt, we may be able to help them access the correct memory. Wilson (2009) discusses ways to improve encoding, storage, and retrieval.

Anterograde and Retrograde Amnesia

A distinction frequently applied to memory disorders is that between anterograde and retrograde amnesia. Retrograde amnesia refers to the period of impaired recall of events that took place before the insult to the brain. Anterograde amnesia refers to the memory difficulties that follow an insult to the brain. Most memory-impaired people have both retrograde and anterograde memory deficits, with a variable period of retrograde amnesia. It may be as short as a few minutes following TBI and may extend back for decades following encephalitis (Wilson et al., 2008). A few reports exist of people with an isolated retrograde amnesia and no anterograde deficits (Kapur, 1993, 1999; Kopelman, 2000), although some of these at least are likely to be of psychogenic origin (Kopelman, 2004). Conversely, there are a few people with severe anterograde amnesia with no loss of memories before the insult (Wilson, 1999). In clinical practice, anterograde amnesia is more handicapping for memory-impaired patients and is, therefore, the main focus of rehabilitation, although a few people need help with recall of their earlier autobiographical knowledge.

Memory Systems

Short-Term and Working Memory

Atkinson and Shiffrin’s (1971) modal model classified human memory on the broad basis of the length of time for which information is stored. On this basis, sensory memory stores information for less than one-fourth of a second (250 ms),¹ the short-term memory store holds information for a few seconds, and the long-term memory store holds information for anything from minutes to years. The term short-term memory is so misused by the general public that it may be better to avoid it and instead to use the terms immediate memory, primary memory, and working memory. Immediate memory and primary memory refer to the memory span as measured by the number of digits that can be repeated in the correct order after one presentation (seven plus or minus two for the vast majority of people; Miller, 1956) or by the recency effect in free recall (the ability to recall the last few items in a list of words, letters, or digits). Baddeley (2004) prefers the term primary memory when referring to the simple unitary memory system and working memory when referring to the interacting range of temporary memory systems people typically use in real-life situations.

Baddeley and Hitch’s (1974) working memory model, which describes how information is stored and manipulated over short periods of time, is one of the most influential theories in the history of psychology. Working memory is composed of three components: the central executive and two subsidiary systems, the phonological loop and the visual-spatial sketchpad. The phonological loop is a store that holds information for a few seconds, although this can be extended through the use of subvocal speech; the loop can also convert visual material that is capable of being named into a phonological code (Baddeley, 2004). The visual-spatial sketchpad is the second subsidiary system, which allows for the temporary storage and manipulation of visual and spatial information. In clinical practice, one can see people with selective damage to each of these systems. Those with a central executive disorder are said to have the dysexecutive syndrome (Baddeley, 1986; Baddeley & Wilson, 1988a).
Their main difficulties are with planning, organization, divided attention, perseveration, and dealing with new or novel situations (Evans, 2005). Those with phonological loop disorders have difficulties with speech processing and with new language acquisition. The few patients we see with immediate memory disorders, who can only repeat back one or two digits, words, or letters accurately, have a phonological loop disorder. Vallar and Papagno (2002) discuss the phonological loop in detail, whereas Baddeley and Wilson (1988b; Wilson & Baddeley, 1993) describe a patient with a phonological loop deficit and his subsequent recovery. People with visual-spatial sketchpad disorders may have deficits in just one of these functions; in other words, visual and spatial difficulties are separable (Della Sala & Logie, 2002). Although one such patient was described by Wilson, Baddeley, and Young (1999), such patients are rare in clinical practice and are not typical of memory-impaired people or of those referred for rehabilitation.

In 2000, Baddeley added a fourth component to working memory, namely, the episodic buffer, a multimodal temporary interface between the two subsidiary systems and long-term memory. When assessing an amnesic patient’s immediate recall of a prose passage, it is not uncommon to find that the score is in the normal range even though, given the capacity of the short-term store, the recall should be no more than seven words plus or minus two (Miller, 1956). The episodic buffer provides an explanation for the enhanced score because it allows amnesic patients with good intellectual functioning and no executive impairments to show apparently normal immediate memory for prose passages that would far exceed the capacity of either of the subsidiary systems. Delayed memory for people with amnesia is, of course, still severely impaired.

Declarative Long-Term Memory

The term declarative memory refers to forms of memory that involve explicit or conscious recollection. Within this, Tulving (1972) introduced the distinction between semantic memory and episodic memory. Memory for general knowledge, such as facts, the meanings of words, the visual appearance of objects, and the color of things, is known as semantic memory. In contrast, episodic memory relates to memory for personal experiences and events, such as the memory of one’s last vacation or paying the telephone bill.

Episodic Memory

Episodic memory is autobiographical in nature. Tulving (2002) states that it is “about happenings in particular places at particular times, or about ‘what,’ ‘where,’ and ‘when’” and that it “makes possible mental time travel,” in the sense that it allows one to relive previous experiences (p. 3). Broadly speaking, episodic memory is supported by the medial temporal lobes, although there is continuing debate regarding the specific roles of the hippocampus and the parahippocampal, entorhinal, and perirhinal cortices in supporting different aspects of episodic memory. As would be expected, impairment of episodic memory is common following damage to the temporal lobes, and it is the primary feature of Alzheimer’s disease. Tulving and colleagues have also reported a case of relatively specific and near-total episodic memory loss. The patient K.C. suffered a closed head injury that resulted in widespread and extensive brain damage, including damage to the medial temporal lobes. Subsequently, K.C. was unable to remember any events from his life before the injury and was unable to retain “new” events for more than a few minutes. In contrast, he retained semantic memory for personal details such as addresses and school names, although he could not create new personal semantic memories (see Tulving, 2002, for a summary).

In understanding disorders of episodic memory, modality and material specificity are important considerations to bear in mind. In everyday life, we may be required to remember information and events from several modalities, including things we have seen, heard, smelled, touched, and tasted. In memory research, however, the main area of concern is memory for auditory or visual information. In addition to the modality that material is presented in, we should also consider the material itself, that is, whether it is verbal or nonverbal.
Not only can we remember things that we can label or read (verbal material), we can also remember things that we cannot easily verbalize such as a person’s face (visual material). Because different parts of the brain are responsible for visual and verbal processing, these can be independently affected, with some people having a primary difficulty with nonverbal memory and others with verbal memory, as demonstrated by Milner many years ago (1965, 1968, 1971) when she found that removal of the left temporal lobe resulted in verbal memory deficits and removal of the right temporal lobe led to more difficulties with remembering nonverbal material, such as faces, patterns, and mazes. If memory for one kind of material is less affected, it might be possible to capitalize on the less damaged system in order to compensate for the damaged one. People with amnesic syndrome have problems with both systems. This does not mean that they cannot benefit from mnemonics and other memory strategies, as we will see later in this chapter.

Recall and recognition are two other episodic memory processes that may be differentially affected in people with neurological disorders. Differences between recall and recognition may be attributable to their differential reliance on recollection (having distinct and specific memory of the relevant episode) and familiarity (a feeling of knowing in relation to the episode), and there has been considerable debate about whether familiarity and recollection are separable within the medial temporal lobes. Diana et al. (2007) reviewed functional imaging literature on this topic and concluded that recollection is associated with patterns of increased activation within the hippocampus and posterior parahippocampal gyrus, whereas familiarity is associated with activations in anterior parahippocampal gyrus (perirhinal cortex).

For most memory-impaired people, recall is believed to be harder than recognition. Kopelman et al. (2007), however, believe this may be due to the way the two processes are measured. They studied patients with hippocampal, medial temporal lobe, more widespread temporal lobe, or frontal pathology. Initially, it looked as if all patients found recall harder than recognition, but when the scores were converted to Z scores, the differences were eliminated. In clinical practice, it is important to measure both visual and verbal recall and recognition, as discussed in the assessment section below.

It is also well established that patients with frontal lobe damage, because of poor strategy application (Kopelman & Stanhope, 1998; Shallice & Burgess, 1991), benefit from cues and thus are less impaired on recognition tasks than on recall tasks because they do not have to apply a retrieval strategy.
This probably depends on which regions of the frontal lobes are affected. Stuss and Alexander (2005) found that different areas of the frontal lobes were concerned with different memory processes and that some regions were involved with nonstrategic memory encoding. A later study by McDonald et al. (2006) found that only patients with right frontal (and not those with left frontal) lesions had recognition memory deficits. Fletcher and Henson (2001) reviewed functional imaging studies of frontal lobe contributions to memory and concluded that ventrolateral frontal regions are associated with updating or maintenance in memory; dorsolateral areas with selection, manipulation, and monitoring; and anterior areas with the selection of processes or subgoals.
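As a brief illustration of the Z-score conversion mentioned in the discussion of Kopelman et al. (2007), the standard definition of a standardized score is given below. The symbols are illustrative assumptions rather than notation from that study: x is a patient’s raw score on a recall or recognition test, and μ and σ are the mean and standard deviation of a reference group (for example, healthy controls) on the same test.

\[
z = \frac{x - \mu}{\sigma}
\]

Because recall and recognition tests usually differ in difficulty and score range, placing both on this common scale is one way to compare performance on the two kinds of test directly.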

Semantic Memory

We all have a very large store of information about what things mean, look like, sound like, smell like, and feel like, and we do not need to remember when this information was acquired or who was present at the time. Learning typically takes place over many occasions in a variety of circumstances. Most memory-impaired people have a normal semantic memory at least for information acquired before the onset of the memory impairment. People with a pure amnesic syndrome have little difficulty recalling from their semantic memory store information laid down well before the onset of the amnesia, but they may have great difficulty laying down new semantic information. This is because initially one has to depend on episodic memory in order for information to enter the semantic store (Cermak & O’Connor, 1983).

The group most likely to show problems with semantic memory includes those with progressive semantic dementia (Snowden, 2002; Snowden, Neary, Mann, et al., 1992; Warrington, 1975). This condition is characterized by a “selective degradation of core semantic knowledge, affecting all types of concept, irrespective of the modality of testing” (Lambon Ralph & Patterson, 2008, p. 61). Episodic memory in these people may be less affected. Mummery et al. (2000) reported that although several areas of frontal and temporal cortex are affected in semantic dementia, the area most consistently and significantly affected is the left anterior temporal lobe. Furthermore, the degree of atrophy in this region is associated with the degree of semantic impairment, a relationship that does not hold for the other regions affected. Semantic memory difficulties may also be seen in some survivors of nonprogressive brain injury, particularly after encephalitis or anoxia, although people with TBI can also exhibit these problems (Wilson, 1997).

Relationship Between Semantic Memory and Episodic Memory

Tulving’s initial conceptualization saw semantic and episodic memory as wholly independent systems. Subsequent revisions to this conceptualization (e.g., Tulving, 1995) have, however, incorporated the idea that episodic memory is dependent on semantic memory and, as such, that the system is hierarchical. This idea has intuitive appeal because it is difficult to imagine how one could remember an event (e.g., eating an apple for breakfast) in the absence of knowledge about the constituent factors (e.g., knowing what an apple is). It has, however, been demonstrated that episodic memory can indeed be preserved in the absence of semantic memory. For example, Graham et al. (2000) found that people with impaired semantic memory retained the ability to recognize previously seen objects regarding which they had no semantic knowledge, as long as they were perceptually identical at the study and test phases. This suggests that episodic learning can be supported by a perceptual process in the absence of conceptual ones, and argues against a strictly hierarchical model. Baddeley (1997, 2004) also disagrees with the view that semantic and episodic memory are independent, albeit from a different perspective. He says that in most situations, there is a blend of the two: If one recalls what happened last Christmas (an episodic task), then this will be influenced by the semantic knowledge of what one typically does at Christmas. Most people with memory problems have episodic memory deficits, which are their major handicap in everyday life.

Prospective Memory

Yet another way we can understand memory problems is to distinguish between retrospective memory and prospective memory. The former is memory for past events, incidents, word lists, and other experiences as discussed above. This can be contrasted with prospective memory, which is remembering to do things rather than remembering things that have already happened. One might have to remember to do something at a particular time (e.g., to watch the news on television at 9:00 p.m.), within a given interval of time (e.g., to take your medication in the next 10 minutes), or when a certain event happens (e.g., when you next see your sister, to give her a message from an old school friend).

Groot, Wilson, Evans, and Watson (2002) showed that prospective memory can fail because of memory failures or because of executive difficulties. Indeed, there is support for the view that measures of executive functioning are better at accounting for prospective memory failures than are measures of memory (Burgess et al., 2000). In a review of the literature, Fish, Manly, and Wilson (2010) suggested a hierarchical model whereby episodic memory problems are likely to lead to prospective memory problems due to forgetting task-related information (e.g., what to do, when to do it, that there is a task to do). However, when retrospective memory functioning is adequate, other more executive problems can lead to prospective memory failure, for example, those resulting from inadequate monitoring of the self and environment for retrieval cues, failure to initiate the intended activity, or difficulty in applying effective strategies and managing multiple task demands. Considering the multi-componential nature of prospective memory tasks, it is not surprising that problems with such tasks are one of the most frequently reported complaints when people are asked what everyday memory problems they face (Baddeley, 2004). In clinical practice, treatment of prospective memory disorders is one of the major components of memory rehabilitation.

Nondeclarative Long-Term Memory

As mentioned previously, the term nondeclarative memory refers to memory that does not involve explicit or conscious remembering. Squire and Zola-Morgan (1988) identified several heterogeneous subtypes of nondeclarative memory, including procedural memory, priming, classical conditioning, and adaptation effects. These types of memory are generally reported to be intact in even densely amnesic patients, such as the famous case of H.M. (for a review, see Corkin, 2002). The nondeclarative abilities most relevant in terms of memory disorders are procedural memory and priming.

Procedural Memory

Procedural memory refers to the abilities involved in learning skills or routines. These abilities are generally tested in the laboratory by asking people to repeatedly perform an unfamiliar perceptual or motor task, such as reading mirror-reversed words or tracing the shape of a star with only the reflection of a mirror for visual feedback. Common real-life examples are learning how to use a computer or to ride a bicycle. The primary characteristic of this kind of learning is that it does not depend on conscious recollection; instead, the learning can be demonstrated without the need to be aware of where and how the original learning took place. For this reason, most people with memory problems show normal or relatively normal procedural learning (Brooks & Baddeley, 1976; Cohen et al., 1985). Some patients are known to show impaired procedural learning, particularly those with Huntington’s disease and those with Parkinson’s disease (Osman et al., 2008; Vakil & Herishanu-Naaman, 1998). People with Alzheimer’s disease may show a deficit (Mitchell & Schmitt, 2006) or may not (Hirono et al., 1997).


Priming

The term priming refers to the processes whereby performance on a given task is improved or biased through prior exposure or experience. Again, no conscious memory of the previous episode is necessary. Thus, if an amnesic patient is shown and reads a list of words before being shown the word stems (in the form of the first two or three letters of the word), he or she is likely to respond with the previously studied words even though there is no conscious or explicit memory of seeing them before (Warrington & Weiskrantz, 1968). Further, a double dissociation has been reported between priming and recognition memory. Keane et al. (1995) reported that a patient with bilateral occipital lesions showed impaired priming but intact recognition memory, whereas patient H.M., who had bilateral medial temporal lesions, showed intact priming but impaired recognition memory. Henson (2009), integrating such evidence from neuropsychological studies, along with experimental psychological and neuroimaging studies of priming, considers that priming is a form of memory dissociable from declarative memory, which reflects plasticity in the form of reduced neural activity in areas of the brain relevant to the task in question. For example, in word-reading tasks, such reductions are seen in the left inferior frontal, left inferior temporal, and occipital regions. These reductions are thought to reflect more efficient processing of the stimuli.

Recovery of Memory Functioning

Recovery means different things to different people. Some focus on survival rates, others are more concerned with recovery of cognitive functions, and others only consider biological recovery such as repair of brain structures. Some interpret recovery as the exact reinstatement of behaviors disrupted by the brain injury (LeVere, 1980), but this state is rarely achieved by memory-impaired people. Jennett and Bond (1975) define “good recovery” on the Glasgow Outcome Scale as “resumption of normal life even though there may be minor neurological and psychological deficits” (p. 483). This is sometimes achievable for those with organic memory problems (Wilson, 1999). Kolb (1995) says that recovery typically involves partial recovery of lost functioning together with considerable substitution of function, that is, compensating for the lost function through other means. Because this includes both natural recovery and compensatory approaches, it is, perhaps, more satisfactory for those of us working in rehabilitation.

TBI is the most common cause of brain damage and memory impairment in people younger than 25 years. Some, and often considerable, recovery may be seen in people incurring such injury. This is likely to be fairly rapid in the early weeks and months after injury, followed by a slower recovery that can continue for many years. Those with other kinds of nonprogressive injury such as stroke (cerebrovascular accident), encephalitis, and hypoxic brain damage may show a similar pattern of recovery, although this typically lasts for months rather than years. For many people with severe brain damage, recovery will be minimal, and compensatory approaches may provide the best chance of reducing everyday problems, increasing independence, and enhancing quality of life.


Mechanisms of recovery include resolution from edema or swelling of the brain, diaschisis (whereby lesions cause damage to other areas of the brain through shock), plasticity or changes to the structure of the nervous system, and regeneration or regrowth of neural tissue. Changes seen in the first few minutes (e.g., after a mild head injury) probably reflect the resolution of temporary damage that has not caused structural damage. Changes seen within several days are more likely to be due to resolution of temporary structural abnormalities such as edema, vascular disruption, or the depression of enzyme metabolic activity. Recovery after several months or years is less well understood. There are several ways in which this might be achieved, including diaschisis, plasticity, or regeneration. Age at insult, diagnosis, the number of insults sustained by an individual, and the premorbid status of the individual’s brain are other factors that may influence recovery from brain injury. Kolb (2003) provides an overview of plasticity and recovery from brain injury.

So much for general recovery, but what about recovery of memory functions themselves? Although some recovery of lost memory functioning occurs in the early weeks and months following an insult to the brain, many people remain with lifelong memory problems. Published studies show contradictory evidence, with some individuals showing no improvement and others showing considerable improvement. It is clear that we can improve on natural recovery through rehabilitation (Wilson, 2009). Because restoration of episodic memory is unlikely in most cases after the acute period, compensatory approaches are the most likely to lead to changes in everyday memory functioning. Before beginning rehabilitation, however, a thorough assessment is required, and this is addressed in the next section.

Assessment of Memory Functioning

Assessment is the systematic collection, organization, and interpretation of information about a person and his or her situation. It is also concerned with the prediction of behavior in new situations (Sundberg & Tyler, 1962). Of course, the means by which we collect, organize, and interpret this information depends on the purpose of the assessment. We carry out assessments in order to answer questions, and the questions determine the assessment procedure. A research question such as whether there are clear distinctions between immediate and delayed memory will be answered one way, whereas clinical questions such as, “What are the most frequent everyday memory problems faced by this patient?” will require a different approach.

Some questions can be answered through the use of standardized tests, others need functional or behavioral assessments, and others may require specially designed procedures. Standardized tests can help us answer questions about the nature of the memory deficit, for example, “Does this person have an episodic memory disorder?” or “Is the memory problem global or restricted to certain kinds of material (e.g., is memory for visual material better than for verbal material)?” They can also help us answer questions about indirect effects on memory functioning such as, “To what extent are the memory problems due to executive, language, perceptual, or attention difficulties?” or “Is this person depressed?” If the assessment question is, “What are this person’s memory strengths and weaknesses?” we should assess immediate and delayed memory; verbal and visuo-spatial memory; recall and recognition; semantic and episodic memory; explicit and implicit memory; anterograde and retrograde amnesia; and new learning and orientation. Published standardized tests exist for many of these functions, apart from implicit memory and retrograde amnesia, for which one might need to design specific procedures (see Wilson, 2004, for a fuller discussion of the assessment of memory). Of course, memory assessments should not be carried out alone: it will also be necessary to assess general intellectual functioning; predict premorbid ability; assess language, reading, perception, attention, and executive functioning to get a clear picture of the person’s cognitive strengths and weaknesses; and assess anxiety, depression, and perhaps other areas of emotion such as post-traumatic stress disorder. These assessments are needed for rehabilitation because the emotional consequences of memory impairment should be treated alongside the memory and other cognitive consequences of any insult to the brain (Williams & Evans, 2003).

When we want to answer more treatment-related questions like, “How are the memory difficulties manifested in everyday life?” or “What coping strategies are used?” we need a more functional or behavioral approach through observations, self-report measures (from relatives or caregivers as well as from the memory-impaired person), or interviews because these are better suited to addressing real-life, practical problems. The standardized and functional assessment procedures provide complementary information: The former allow us to build up a cognitive map of a person’s strengths and weaknesses, whereas the latter enable us to target areas for treatment.

Rehabilitation of Memory

Rehabilitation should focus on improving aspects of everyday life and should address personally meaningful themes, activities, settings, and interactions (Ylvisaker & Feeney, 2000). Many survivors of brain injury will face problems in everyday life. These could be difficulties with motor functioning, impaired sensation, reduced cognitive skills, emotional troubles, conduct or behavioral problems, and impoverished social relationships. Some people will have all of these. In addition to emotional and psychosocial problems, neuropsychologists in rehabilitation are likely to treat cognitive difficulties, including memory deficits (Wilson et al., 2009). The main purposes of rehabilitation, including memory rehabilitation, are to enable people with disabilities to achieve their optimal level of well-being, to reduce the impact of their problems on everyday life, and to help them return to their own most appropriate environments. Its purpose is not to teach individuals to score better on tests or to learn lists of words or to be faster at detecting stimuli.

Apart from some emerging evidence that it might be possible to achieve some degree of restoration of working memory in children with attention deficit hyperactivity disorder (Klingberg, 2006; Klingberg et al., 2005) and possibly in stroke patients (Westerberg et al., 2007), no evidence exists for recovery of episodic memory in survivors of brain injury. Thus restoration of memory functioning is at this time an unrealistic goal. There is evidence, however, that we can help people to compensate for their difficulties, find ways to help them learn more efficiently, and for those with very severe and widespread cognitive problems, organize the environment so that they can function without the need for memory (Wilson, 2009). These are more realistic goals for memory rehabilitation.

Helping People to Compensate for Their Memory Difficulties Through the Use of External Memory Aids

A wide variety of external memory aids exist to help people remember what has to be done in everyday life. These range from simple Post-It notes to pill boxes, alarms, and talking watches to very sophisticated aids such as electronic organizers and global positioning system devices. External aids may alert someone to the fact that something needs to be done at a particular time and place, such as taking medication or collecting children from school, or they may act as systems to store information unrelated to a particular time or place, such as an address and telephone book. Because the use of such aids involves memory, the people who need them most usually have the greatest difficulty in learning to use them. Nevertheless, some people use compensatory aids well (Evans et al., 2003; Kime et al., 1996; Wilson, 1991). Whether aids are used successfully depends on insight; motivation; certain cognitive, emotional, and motivational characteristics; previous use of memory aids; the demands made on memory; support from family, school, or work; and the availability of appropriate aids (Scherer, 2005). Several studies have looked at the efficacy of external aids for memory-impaired people (summarized in Wilson, 2009).

One series of studies carried out in Cambridge, England involves the use of a paging system, “NeuroPage,” to determine whether people can carry out everyday tasks more efficiently with or without a pager. Following a pilot study (Wilson et al., 1997), a randomized controlled trial (cross-over design) was carried out. People were randomly allocated to pager first (Group A) or waiting list first (Group B). All participants chose their own target behaviors that they needed to remember each day. Taking medication, feeding pets, and turning on the hot water system were the kinds of target behavior selected. Most people selected between four and seven messages each day.
For 2 weeks (the baseline period), participants recorded the frequency with which these targets were achieved each day; an independent observer (usually a close relative) also checked to ensure the information was accurate. In the baseline period, there were no significant differences between the two groups. Group A participants were then provided with pagers for 7 weeks, while Group B participants remained on the waiting list. There was a very significant improvement in the targets achieved by Group A. Group B (on the waiting list) did not change. Then Group A returned the pagers, which were given to Group B. Now Group B showed a statistically significant improvement over baseline and waiting list periods. Group A participants dropped back a little but were still better than baseline, showing that they had learned many of their target behaviors during the 7 weeks with the pager (Wilson et al., 2001).

This study comprised several diagnostic groups, which have been reported separately. People with TBI performed like the main group (Wilson et al., 2005), which is not surprising because they were the largest subgroup. Four people with encephalitis all improved with the pager (Emslie et al., 2007). People with stroke performed like the main group in the baseline and treatment phases but dropped back to baseline levels when the pagers were returned (Fish et al., 2008), possibly because they were older and had more executive deficits as a result of ruptured aneurysms on the anterior communicating artery. There were twelve children in the study with ages ranging from 8 to 18 years (Wilson et al., 2009), all of whom benefited. Approximately 80 percent of the 143 people in the study reduced their everyday memory and planning problems. As a result of the study, the local health authority set up a clinical service for people throughout the United Kingdom, so this was an example of research influencing clinical practice.

New Learning for Memory-Impaired People

One of the most handicapping aspects of severe memory impairment is the great difficulty in learning new information. Although many think that repetition is the answer, rote rehearsal or simply repeating material is not a particularly good learning strategy for people with memory deficits. We can hear or read something many times over and still not remember it, and the information may simply “go in one ear and out the other.” There are ways to help memory-impaired people learn more efficiently, and this is one of the main focuses of memory rehabilitation. The method of vanishing cues, spaced retrieval, mnemonics, and errorless learning are, perhaps, the main ways to enhance new learning.

In the method of vanishing cues, prompts are provided and then gradually faded out. For example, someone learning a new name might be expected first to copy the whole name, then the last letter would be deleted; the name would be copied again and the last letter inserted by the memory-impaired person, then the last two letters would be deleted and the process repeated until all letters were completed by the memory-impaired person. Glisky et al. (1986) were the first to report this method with memory-impaired people. Several studies have since been published with both nonprogressive patients and those with dementia (see Wilson, 2009, for a full discussion).

Spaced retrieval, also known as expanding rehearsal (Landauer & Bjork, 1978), is also widely used in memory rehabilitation. This method involves the presentation of material to be remembered, followed by immediate testing, then a very gradual lengthening of the retention interval. Spaced retrieval may work because it is a form of distributed practice, that is, distributing the learning trials over a period of time rather than massing them together in one block. Distributed practice is known to be more effective than massed practice (Baddeley, 1999).
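As a rough illustration of the two procedures just described, the following Python sketch generates the sequence of prompts for the method of vanishing cues and a schedule of retention intervals for spaced retrieval. The function names, the underscore placeholder, the 15-second starting interval, and the doubling factor are assumptions made for this example; the text specifies only that letters are removed one at a time from the end of the name, and that an immediate test is followed by a very gradual lengthening of the retention interval.

```python
def vanishing_cue_prompts(name: str) -> list[str]:
    """Prompts for the method of vanishing cues.

    The first prompt shows the whole name; each later prompt hides one more
    letter from the end, until the learner must supply every letter.
    """
    return [name[: len(name) - hidden] + "_" * hidden
            for hidden in range(len(name) + 1)]


def spaced_retrieval_schedule(first_interval: float, factor: float,
                              tests: int) -> list[float]:
    """Retention intervals (in seconds) for spaced retrieval.

    Testing starts immediately (interval 0) and each later interval is
    `factor` times the previous one. The multiplicative rule is an
    assumption for illustration, not a clinical recommendation.
    """
    if tests < 1:
        raise ValueError("at least one test is required")
    if first_interval <= 0 or factor <= 1:
        raise ValueError("intervals must be positive and must lengthen")
    return [0.0] + [first_interval * factor ** i for i in range(tests - 1)]


if __name__ == "__main__":
    # Hypothetical name to be learned.
    for prompt in vanishing_cue_prompts("ANNA"):
        print(prompt)
    print(spaced_retrieval_schedule(first_interval=15, factor=2, tests=5))
```

The sketch ignores what should happen after a failed recall attempt, which any real protocol would need to specify.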
Camp and colleagues in the United States have used spaced retrieval extensively with dementia patients (Camp et al., 1996; 2000; McKitrick & Camp, 1993), but it has also been used to help people with TBI, stroke, encephalitis, and dementia. Sohlberg (2005) discusses using this method in people with learning difficulties.

Those of us without severe memory difficulties can benefit from trial-and-error learning. We are able to remember our mistakes and thus can avoid making the same mistake in future attempts. Because memory-impaired people have difficulty with this, any erroneous response may be strengthened or reinforced. This is the rationale behind errorless learning, a teaching technique whereby the likelihood of mistakes during learning is minimized as far as possible. Another way of understanding errorless learning is through the principle of Hebbian plasticity and learning (Hebb, 1949). At a synaptic level, Hebbian plasticity refers to increases in synaptic strength between neurons that fire together (“neurons that fire together wire together”). Hebbian learning refers to the detection of temporally related inputs. If an input elicits a pattern of neural activity, then, according to the Hebbian learning rule, the tendency to activate the same pattern on subsequent occasions is strengthened. This means that the likelihood of making the same response in the future, whether correct or incorrect, is strengthened (McClelland et al., 1999). Like implicit memory, Hebbian learning has no mechanism for filtering out errors.

Errors can be avoided through the provision of spoken or written instructions or guiding someone through a particular task or modeling the steps of a procedure. There is now considerable evidence that errorless learning is superior to trial-and-error learning for people with severe memory deficits. In a meta-analysis of errorless learning, Kessels and De Haan (2003) found a large and statistically significant effect size of this kind of learning for those with severe memory deficits. The combination of errorless learning and spaced retrieval would appear to be a powerful learning strategy for people with progressive conditions in addition to those with nonprogressive conditions (Wilson, 2009).

There has been some debate in the literature as to whether errorless learning depends on explicit or implicit memory. Baddeley and Wilson (1994) argued that implicit memory was responsible for the efficacy of errorless learning: Amnesic patients had to rely on implicit memory, a system that is poor at eliminating errors (this is not to say that errorless learning is a measure of implicit memory). Nevertheless, there are alternative explanations. For example, the errorless learning advantage could be due to residual explicit memory processes or to a combination of both implicit and explicit systems. Hunkin et al.
(1998) argued that it is due entirely to the effects of error prevention on the residual explicit memory capacities, and not to implicit memory at all. Specifically, they used errorless and errorful learning protocols similar to Baddeley and Wilson (1994) to teach word lists to people with moderate to severe and relatively specific memory impairment. They investigated the errorless learning advantage in a fragment completion task intended to tap implicit memory of the learned words and in a cued-recall task intended to tap explicit memory. The fragment completion task presented participants with two letters from learned words from noninitial positions (e.g., _ _ T _ S _ ), with the task being to complete the word. The cued-recall condition presented the first two letters of learned words (e.g., A R _ _ _ _). In both of these examples the learned word is “ARTIST.” Hunkin et al. found that the errorless learning advantage was only evident in the cued-recall task, with performance in fragment completion being equivalent between errorless and errorful learning conditions. The authors interpreted this result as meaning that the errorless learning advantage relies on residual explicit memory. Tailby and Haslam (2003) also believe that the benefits of errorless learning are due to residual explicit concurrent memory processes, although they do not rule out implicit memory processes altogether. They say the issue is a complex one and that different individuals may rely on different processes. Support for this view can also be found in a paper by Kessels et al. (2005).

Page et al. (2006) claim, however, that preserved implicit memory in the absence of explicit memory is sufficient to produce the errorless learning advantage. They challenge the conclusions of Hunkin et al. (1998) because the design of their implicit task was such that it was unlikely to be sensitive to implicit memory for prior errors. As is clear from the example above, the stimuli in the fragment completion task used by Hunkin et al. did not prime errors made during learning in the same way that the cued-recall stimuli did. To continue with the previous example, if the errors “ARCHES” and “AROUND” had been made during learning, neither would fit with the fragment “_ _ T _ S _.” If the errorless learning advantage results from avoiding implicit memory of erroneous responses, then an advantage would not be expected within this fragment completion task. Furthermore, there was an element of errorful learning in both the errorless and errorful explicit memory conditions. They also challenge the Tailby and Haslam (2003) paper because it conflates two separate questions. First, is the advantage of errorless learning due to the contribution of implicit memory? And second, is learning under errorless conditions due to implicit memory? Perhaps some people do use both implicit and explicit systems when learning material, but this does not negate the argument that the advantage (i.e., that seen at retrieval) is due to implicit memory, particularly implicit memory for prior errors following errorful learning. Some people with no or very little explicit recall can learn under certain conditions such as errorless learning. For example, the Baddeley and Wilson (1994) study included sixteen very densely amnesic participants with extremely little explicit memory capacity, yet nevertheless, every single one of them showed an errorless over errorful advantage. Page et al. (2006) also found that people with moderate memory impairment, and hence some retention of explicit memory, showed no greater advantage from errorless learning than people with very severe memory impairment who had very little explicit memory, which again supports the hypothesis that implicit memory is sufficient to produce that advantage.
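The point about the two cue types can be checked mechanically. The short Python sketch below tests which words are compatible with the fragment completion cue and with the cued-recall cue from the ARTIST example. The `fits` helper is hypothetical and written only for this illustration; it assumes cues are written as space-separated slots, with an underscore for each missing letter.

```python
def fits(cue: str, word: str) -> bool:
    """Return True if `word` is a possible completion of `cue`.

    `cue` is a string of space-separated slots, e.g. "_ _ T _ S _";
    an underscore matches any letter, a letter must match exactly.
    """
    slots = cue.split()
    letters = word.upper()
    return len(slots) == len(letters) and all(
        slot == "_" or slot == letter for slot, letter in zip(slots, letters)
    )


FRAGMENT_CUE = "_ _ T _ S _"   # fragment completion (noninitial letters)
RECALL_CUE = "A R _ _ _ _"     # cued recall (first two letters)

if __name__ == "__main__":
    for word in ("ARTIST", "ARCHES", "AROUND"):
        print(word, fits(FRAGMENT_CUE, word), fits(RECALL_CUE, word))
```

Under these assumptions the target and both prior errors are compatible with the cued-recall stem, whereas only the target fits the fragment, which is the asymmetry that Page et al. (2006) point to.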
Mnemonics are systems that enable us to remember things more easily and usually refer to internal strategies, such as reciting a rhyme to remember how many days there are in a month, or remembering the order of information such as “My very elderly mother just sat upon a new pin” to remember the order of the planets around the sun (where “my” stands for Mercury, “very” for Venus, “elderly” for Earth, etc.). Although verbal and visual mnemonic systems have been used successfully with memory-impaired people (Wilson, 2009), not everyone can use them. Instead of expecting memory-impaired people to use mnemonics spontaneously, therapists may need to employ them to help their patients achieve faster learning for particular pieces of information, such as names of a few people or a new address.

Modifying the Environment for Those with Severe and Widespread Cognitive Deficits

People with very severe memory difficulties and widespread cognitive problems may be unable to learn compensatory strategies and have major difficulties learning new episodic information. They may be able to learn things implicitly. C.W., for example, the musician who has one of the most severe cases of amnesia on record (Wearing, 2005; Wilson et al., 2008), is unable to lay down any new episodic memories but has learned some things implicitly. Thus, if he is asked “Where is the kitchen?” he says he does not know, but if asked if he would like to go and make himself some coffee, he will go to the kitchen without error. He has implicit but no explicit memory of how to find the kitchen. For such people, our only hope of improving quality of life and giving them as much independence as possible is probably to organize or structure the environment so that they can function without the need for memory.

People with severe memory problems may not be handicapped in environments where there are no demands made on memory. Thus, if doors, closets, drawers, and storage jars are clearly labeled, if rooms are cleared of dangerous equipment, if someone appears to remind or accompany the memory-impaired person when it is time to go to the hairdresser or to have a meal, the person may cope reasonably well. Kapur et al. (2004) give other examples. Items can be left by the front door for people who forget to take belongings with them when they leave the house; a message can be left on the mirror in the hallway; and a simple flow chart can be used to help people search in likely places when they cannot find a lost belonging (Moffat, 1989).

Modifications can also be made to verbal environments to avoid irritating behavior such as the repetition of a question, story, or joke. It might be possible to identify a “trigger” or an antecedent that elicits this behavior. Thus, by eliminating the “trigger,” one can avoid the repetitious behavior. For example, in response to the question “How are you feeling today?” one young brain-injured man would say “Just getting over my hangover.” If staff simply said “Good morning,” however, he replied “Good morning,” so the repetitious comments about his supposed hangover were avoided.

Hospitals and nursing homes, in addition to other public spaces such as shopping centers, may use color coding, signs, and other warning systems to reduce the chances of getting lost. “Smart houses” are already in existence to help “disable the disabling environment” described by Wilson and Evans (2000).
Some of the equipment used in Smart houses can be employed to help the severely memory-impaired patient survive more easily. For example, “photo phones” are telephones with large buttons (available from services for visually impaired people); each button can be programmed to dial a particular number and a photo of the person who owns that number can be pasted on to the large button. Thus, if a memory-impaired person wants to dial her daughter or her district nurse, she simply presses the right photograph, and the number is automatically dialled.

Emotional Consequences of Memory Impairment

In addition to their memory problems, many memory-impaired people have other cognitive deficits such as impaired attention, word-finding problems, and difficulties with planning, judgment, and reasoning, and they may also suffer emotional disorders such as anxiety, depression, mood swings, and anger problems. When neuropsychological rehabilitation programs address the cognitive, emotional, and psychosocial consequences of brain injury, patients experience less emotional distress, increased self-esteem, and greater productivity (Prigatano, 1994; Prigatano et al., 1999).

Treatment for emotional difficulties includes psychological support for individuals and for groups (Wilson et al., 2009). Individual psychological support is mostly derived from cognitive behavior therapy, which is now very much part of neuropsychological rehabilitation programs, particularly in the United Kingdom (Gracey et al., 2009). Tyerman and King (2004) provide suggestions on how to adapt psychotherapy and cognitive behavior therapy for those with memory problems. Notes, audiotapes and videotapes of sessions, frequent repetitions, mini-reviews, telephone reminders to complete homework tasks, and use of family members as co-therapists can all help to circumvent the difficulties posed by impaired retention of the therapeutic procedures.

Group therapy can also be of great assistance in helping to reduce anxiety and other emotional difficulties. Memory-impaired people often benefit from interaction with others having similar problems. Those who fear they are losing their sanity may have their fears allayed through the observation of others with similar problems. Groups can reduce anxiety and distress; they can instill hope and show patients that they are not alone; and it may be easier to accept advice from peers than from therapists or easier to use strategies that peers are using rather than strategies recommended by professional staff (Evans, 2009; Malley et al., 2009; Wilson, 2009).

Conclusions

1. Memory disorders can be classified in a number of ways, including the amount of time for which information is stored, the type of information stored, the type of material to be remembered, the modality being employed, the stages involved in the memory process, explicit and implicit memory, recall and recognition, retrospective and prospective memory, and anterograde and retrograde amnesia.

2. Some recovery of memory functioning can be expected after an insult to the brain, especially in the early days, weeks, and months after nonprogressive damage. Age at insult, diagnosis, the number of insults sustained by the individual, and the premorbid status of the individual’s brain are just a few of the factors influencing recovery. Some people will remain with lifelong memory impairment. There is no doubt that we can improve on natural recovery through rehabilitation. Given the fact that restoration of episodic, explicit memory is unlikely in most cases after the acute period, compensatory approaches are the most likely to lead to change in everyday memory functioning.

3. Before planning treatment for someone with memory difficulties, a detailed assessment should take place. This should include a formal neuropsychological assessment of all cognitive abilities, including memory, to build up a picture of a person’s cognitive strengths and weaknesses. In addition, assessment of emotional and psychosocial functioning should be carried out. Standardized tests should be complemented with observations, interviews, and self-report measures.

4. Once the assessment has been carried out, one can design a rehabilitation program. One of the major ways of helping people with memory problems cope in everyday life is to enable them to compensate through the use of external aids; we can also help them learn more efficiently, and for those who are very severely impaired, we may need to structure or organize the environment to help them function without a memory. We can also provide support and psychotherapy to address the emotional consequences of memory impairment.

5. Rehabilitation can help people to compensate for, bypass, or reduce their everyday problems and thus survive more efficiently in their own most appropriate environments. Rehabilitation makes clinical and economic sense and should be widely available to all those who need it.

References

Atkinson, R. C., & Shiffrin, R. M. (1971). The control of short-term memory. Scientific American, 225, 82–90.

Baddeley, A. D. (1986). Working memory. Gloucester, UK: Clarendon Press.

Baddeley, A. D. (1992). Memory theory and memory therapy. In B. A. Wilson & N. Moffat (Eds.), Clinical management of memory problems (2nd ed., pp. 1–31). London: Chapman & Hall.

Baddeley, A. D. (1997). Human memory: Theory and practice (revised edition). Hove: Psychology Press.

Baddeley, A. D. (1999). Essentials of human memory. Hove: Psychology Press.

Baddeley, A. D. (2000). The episodic buffer: A new component of working memory? Trends in Cognitive Sciences, 4 (11), 417–423.

Baddeley, A. D. (2004). The psychology of memory. In A. D. Baddeley, M. D. Kopelman, & B. A. Wilson (Eds.), The essential handbook of memory disorders for clinicians (pp. 1–14). Chichester, UK: John Wiley & Sons.

Baddeley, A. D., & Hitch, G. J. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (pp. 47–89). New York: Academic Press.

Baddeley, A. D., & Wilson, B. A. (1988a). Comprehension and working memory: A single case neuropsychological study. Journal of Memory and Language, 27 (5), 479–498.

Baddeley, A. D., & Wilson, B. A. (1988b). Frontal amnesia and the dysexecutive syndrome. Brain and Cognition, 7 (2), 212–230.

Baddeley, A. D., & Wilson, B. A. (1994). When implicit learning fails: Amnesia and the problem of error elimination. Neuropsychologia, 32 (1), 53–68.

Berry, E. (2007). Using SenseCam, a wearable camera, to alleviate autobiographical memory loss. Talk given at the British Psychological Society Annual Conference, York, UK.

Brooks, D. N., & Baddeley, A. D. (1976). What can amnesic patients learn? Neuropsychologia, 14 (1), 111–122.

Burgess, P. W., Veitch, E., de Lacy Costello, A., & Shallice, T. (2000). The cognitive and neuroanatomical correlates of multitasking. Neuropsychologia, 38 (6), 848–863.

Camp, C. J., Bird, M., & Cherry, K. (2000). Retrieval strategies as a rehabilitation aid for cognitive loss in pathological aging. In R. D. Hill, L. Bäckman, & A. Stigsdotter-Neely (Eds.), Cognitive rehabilitation in old age (pp. 224–248). New York: Oxford University Press.

Camp, C. J., Foss, J. W., Stevens, A. B., & O’Hanlon, A. M. (1996). Improving prospective memory performance in persons with Alzheimer’s disease. In M. A. Brandimonte, G. O. Einstein, & M. A. McDaniel (Eds.), Prospective memory: Theory and application (pp. 351–367). Mahwah, NJ: Erlbaum.

Cermak, L. S., & O’Connor, M. (1983). The anterograde and retrograde retrieval ability of a patient with amnesia due to encephalitis. Neuropsychologia, 21 (3), 213–234.

Clare, L. (2008). Neuropsychological rehabilitation and people with dementia. Hove, UK: Psychology Press.

Clare, L., Wilson, B. A., Breen, K., & Hodges, J. R. (1999). Errorless learning of face-name associations in early Alzheimer’s disease. Neurocase, 5, 37–46.

Clare, L., Wilson, B. A., Carter, G., Breen, K., Gosses, A., & Hodges, J. R. (2000). Intervening with everyday memory problems in dementia of Alzheimer type: An errorless learning approach. Journal of Clinical and Experimental Neuropsychology, 22 (1), 132–146.

Clare, L., & Woods, R. T. (2004). Cognitive training and cognitive rehabilitation for people with early-stage Alzheimer’s disease: A review. Neuropsychological Rehabilitation, 14, 385–401.

Cohen, N. J., Eichenbaum, H., Deacedo, B. S., & Corkin, S. (1985). Different memory systems underlying acquisition of procedural and declarative knowledge. Annals of the New York Academy of Sciences, 444, 54–71.

Corkin, S. (2002). What’s new with the amnesic patient H.M.? Nature Reviews Neuroscience, 3, 153–160.

Della Sala, S., & Logie, R. H. (2002). Neuropsychological impairments of visual and spatial working memory. In A. D. Baddeley, M. D. Kopelman, & B. A. Wilson (Eds.), The handbook of memory disorders (pp. 271–292). Chichester, UK: John Wiley & Sons.

Diana, R. A., Yonelinas, A. P., & Ranganath, C. (2007). Imaging recollection and familiarity in the medial temporal lobe: A three-component model. Trends in Cognitive Sciences, 11, 379–386.

Emslie, H., Wilson, B. A., Quirk, K., Evans, J., & Watson, P. (2007). Using a paging system in the rehabilitation of encephalitic patients. Neuropsychological Rehabilitation, 17, 567–581.

Evans, J. J. (2005). Can executive impairments be effectively treated? In P. W. Halligan & D. Wade (Eds.), The effectiveness of rehabilitation for cognitive deficits (pp. 247–256). Oxford, UK: Oxford University Press.

Evans, J. J. (2009). The cognitive group part two: Memory. In B. A. Wilson, F. Gracey, J. J. Evans, & A. Bateman (Eds.), Neuropsychological rehabilitation: Theory, therapy and outcomes (pp. 98–111). Cambridge, UK: Cambridge University Press.

Evans, J. J., Wilson, B. A., Needham, P., & Brentnall, S. (2003). Who makes good use of memory aids: Results of a survey of 100 people with acquired brain injury. Journal of the International Neuropsychological Society, 9 (6), 925–935.

Fish, J., Manly, T., Emslie, H., Evans, J. J., & Wilson, B. A. (2008). Compensatory strategies for acquired disorders of memory and planning: Differential effects of a paging system for patients with brain injury of traumatic versus cerebrovascular aetiology. Journal of Neurology, Neurosurgery and Psychiatry, 79, 930–935.

Fish, J., Wilson, B. A., & Manly, T. (2010). The assessment and rehabilitation of prospective memory problems in people with neurological disorders: A review. Neuropsychological Rehabilitation, 20, 161–179.

Fletcher, P. C., & Henson, R. N. A. (2001). Frontal lobes and human memory: Insights from functional neuroimaging. Brain, 124, 849–881.

Glisky, E. L., Schacter, D. L., & Tulving, E. (1986). Computer learning by memory-impaired patients: Acquisition and retention of complex knowledge. Neuropsychologia, 24 (3), 313–328.

Gracey, F., Yeates, G., Palmer, S., & Psaila, K. (2009). The psychological support group. In B. A. Wilson, F. Gracey, J. J. Evans, & A. Bateman (Eds.), Neuropsychological rehabilitation: Theory, therapy and outcomes (pp. 123–137). Cambridge, UK: Cambridge University Press.

Graham, K. S., Simons, J. S., Pratt, K. H., Patterson, K., & Hodges, J. R. (2000). Insights from semantic dementia on the relationship between episodic and semantic memory. Neuropsychologia, 38, 313–324.

Groot, Y. C. T., Wilson, B. A., Evans, J. J., & Watson, P. (2002). Prospective memory functioning in people with and without brain injury. Journal of the International Neuropsychological Society, 8 (5), 645–654.

Hebb, D. O. (1949). The organization of behavior: A neuropsychological theory. Chichester, UK: Wiley.

Henson, R. N. (2009). Priming. In L. Squire, T. Albright, F. Bloom, F. Gage, & N. Spitzer (Eds.), New encyclopedia of neuroscience (pp. 1055–1063). Online: Elsevier.

Hirono, N., Mori, E., Ikejiri, Y., Imamura, T., Shimomura, T., Ikeda, M., et al. (1997). Procedural memory in patients with mild Alzheimer’s disease. Dementia and Geriatric Cognitive Disorders, 8 (4), 210–216.

Hunkin, N. M., Squires, E. J., Parkin, A. J., & Tidy, J. A. (1998). Are the benefits of errorless learning dependent on implicit memory? Neuropsychologia, 36 (1), 25–36.

Jennett, B., & Bond, M. (1975). Assessment of outcome after severe brain damage. Lancet, 1 (7905), 480–484.

Kapur, N. (1993). Focal retrograde amnesia in neurological disease: A critical review. Cortex, 29 (2), 217–234.

Kapur, N. (1999). Syndromes of retrograde amnesia: A conceptual and empirical synthesis. Psychological Bulletin, 125, 800–825.

Kapur, N., Glisky, E. L., & Wilson, B. A. (2004). Technological memory aids for people with memory deficits. Neuropsychological Rehabilitation, 14 (1/2), 41–60.

Keane, M. M., Gabrieli, J. D., Mapstone, H. C., Johnson, K. A., & Corkin, S. (1995). Double dissociation of memory capacities after bilateral occipital-lobe or medial temporal-lobe lesions. Brain, 119, 1129–1148.

Kessels, R. P. C., Boekhorst, S. T., & Postma, A. (2005). The contribution of implicit and explicit memory to the effects of errorless learning: A comparison between young and older adults. Journal of the International Neuropsychological Society, 11 (2), 144–151.

Kessels, R. P. C., & de Haan, E. H. F. (2003). Implicit learning in memory rehabilitation: A meta-analysis on errorless learning and vanishing cues methods. Journal of Clinical and Experimental Neuropsychology, 25 (6), 805–814.

Kime, S. K., Lamb, D. G., & Wilson, B. A. (1996). Use of a comprehensive program of external cuing to enhance procedural memory in a patient with dense amnesia. Brain Injury, 10, 17–25.

Klingberg, T. (2006). Development of a superior frontal-intraparietal network for visuo-spatial working memory. Neuropsychologia, 44 (11), 2171–2177.

Klingberg, T., Fernell, E., Olesen, P., Johnson, M., Gustafsson, P., Dahlström, K., Gillberg, C. G., Forssberg, H., & Westerberg, H. (2005). Computerized training of working memory in children with ADHD—a randomized, controlled trial. Journal of the American Academy of Child and Adolescent Psychiatry, 44 (2), 177–186. Kolb, B. (1995). Brain plasticity and behaviour. Hillsdale, NJ: Erlbaum. Kolb, B. (2003). Overview of cortical plasticity and recovery from brain injury. Physical Medicine and Rehabilitation Clinics of North America, 14 (1), S7–S25.

Page 21 of 26

Memory Disorders Kopelman, M. D. (2000). Focal retrograde amnesia and the attribution of causality: An ex ceptionally critical view. Cognitive Neuropsychology, 17 (7), 585–621. Kopelman, M. D. (2004). Psychogenic amnesia. In A. D. Baddeley, M. D. Kopelman, & B. A. Wilson (Eds.), The essential handbook of memory disorders for clinicians (pp. 69–90). Chichester, UK: Wiley. Kopelman, M. D., Bright, P., Buckman, J., Fradera, A., Yoshimasu, H., Jacobson, C., & Colchester, A. C. F. (2007). Recall and recognition memory in amnesia: Patients with hip pocampal, medial temporal, temporal lobe or frontal pathology. Neuropsychologia, 45, 1232–1246. Kopelman, M. D., & Stanhope, N. (1998). Recall and recognition memory in patients with focal frontal, temporal lobe and diencephalic lesions. Neuropsychologia, 37, 939–958. Lambon Ralph, M. A., & Patterson, K. (2008). Generalization and differentiation in seman tic memory: Insights from semantic dementia. Annals of the New York Academy of Sciences, 1124, 61–76. Landauer, T. K., & Bjork, R. A. (1978). Optimum rehearsal patterns and name learning. In M. M. Gruneberg, P. Morris, & R. N. Sykes (Eds.), Practical aspects of memory (pp. 625– 632). London: Academic Press. LeVere, T. E. (1980). Recovery of function after brain damage: A theory of the behavioral deficit. Physiological Psychology, 8, 297–308. Malley, D., Bateman, A., & Gracey, F. (2009). Practically based project groups. In B. A. Wilson, F. Gracey, J. J. Evans, & A. Bateman (Eds.), Neuropsychological rehabilitation: Theory, therapy and outcomes (pp. 164–180). Cambridge, UK: Cambridge University Press. McClelland, J. L., Thomas, A. G., McCandliss, B. D., & Fiez, J. A. (1999). Understanding failures of learning: Hebbian learning, competition for representational space, and some preliminary experimental data. Progress in Brain Research, 121, 75–80. McDonald, C., Bauer, R., Filoteo, J., Grande, L., Roper, S., & Gilmore, R. (2006). 
Episodic memory in patients with focal frontal lobe lesions. Cortex, 42, 1080–1092. McKitrick, L. A., & Camp, C. J. (1993). Relearning the names of things: The spaced-re trieval intervention implemented by a caregiver. Clinical Gerontologist, 14 (2), 60–62. Miller, G. A. (1956). The magical number seven plus or minus two: Some limits on our ca pacity for processing information. Psychological Review, 63 (2), 81–97. Milner, B. (1965). Visually-guided maze learning in man: Effects of bilateral hippocampal, bilateral frontal, and unilateral cerebral lesions. Neuropsychologia, 3 (3), 17–338.

Page 22 of 26

Memory Disorders Milner, B. (1968). Visual recognition and recall after right temporal lobe excision in man. Neuropsychologia, 6, 191–209. Milner, B. (1971). Interhemispheric differences in the localisation of psychological processes in man. British Medical Bulletin, 27 (3), 272–277. Mitchell, D. B., & Schmitt, F. A. (2006). Short- and long-term implicit memory in aging and Alzheimer’s disease. Neuropsychology, Development, and Cognition. Section B, Ag ing, Neuropsychology and Cognition, 13 (3-4), 611–635. Moffat, N. (1989). Home-based cognitive rehabilitation with the elderly. In L. W. Poon, D. C. Rubin, & B. A. Wilson (Eds.), Everyday cognition in adulthood and later life (pp. 659– 680). Cambridge, UK: Cambridge University Press. Mummery, C. J., Patterson, K., Price, C. J., Ashburner, J., Frackowiak, R. S. J., & Hodges, J. R. (2000). A voxelbased morphometry study of semantic dementia: relationship between temporal lobe atrophy and semantic memory. Annals of Neurology, 47, 36–45. Osman, M., Wilkinson, L., Beigi, M., Castaneda, C. S., & Jahanshahi, M. (2008). Patients with Parkinson’s disease learn to control complex systems via procedural as well as nonprocedural learning. Neuropsychologia, 46 (9), 2355–2363. Page, M., Wilson, B. A., Shiel, A., Carter, G., & Norris, D. (2006). What is the locus of the errorless-learning advantage? Neuropsychologia, 44 (1), 90–100. Prigatano, G. P. (1999). Principles of neuropsychological rehabilitation. New York: Oxford University Press. Prigatano, G. P., Klonoff, P. S., O’Brien, K. P., Altman, I. M., Amin, K., Chiapello, D., et al. (1994). Productivity after neuropsychologically oriented milieu rehabilitation. Journal of Head Trauma Rehabilitation, 9 (1), 91. Scherer, M. (2005). Assessing the benefits of using assistive technologies and other sup ports for thinking, remembering and learning. Disability and Rehabilitation, 27 (13), 731– 739. Shallice, T., & Burgess, P. W. (1991). 
Higher-order cognitive impairments and frontal lobe lesions in man. In H. S. Levin, H. M. Eisenberg, & A. L. Benton (Eds.), Frontal lobe func tion and dysfunction (pp. 125–138). New York: Oxford University Press. Snowden, J. S. (2002). Disorders of semantic memory. In A. D. Baddeley, M. D. Kopelman, & B. A. Wilson (Eds.), The handbook of memory disorders. (pp. 293–314). Chichester, UK: John Wiley & Sons. Snowden, J. S., Neary, D., Mann, D. M., Goulding, P. J., & Testa, H. J. (1992). Progressive language disorder due to lobar atrophy. Annals of Neurology, 31 (2), 174–183.

Page 23 of 26

Memory Disorders Sohlberg, M. M. (2005). External aids for management of memory impairment. In W. High, A. Sander, K. M. Struchen, & K. A. Hart (Eds.), Rehabilitation for traumatic brain in jury (pp. 47–70). New York: Oxford University Press. Squire, L. R., & Zola-Morgan, S. (1988). Memory: brain systems and behavior. Trends in neurosciences, 11 (4), 170–175. Stilwell, P., Stilwell, J., Hawley, C., & Davies, C. (1999). The national traumatic brain in jury study: Assessing outcomes across settings. Neuropsychological Rehabilitation, 9 (3), 277–293. Stuss D. T., & Alexander, M. P. (2005). Does damage to the frontal lobes produce impair ment in memory? Current Directions in Psychological Science, 14, 84–88. Sundberg, N. D., & Tyler, L. E. (1962). Clinical psychology. New York: Appleton-CenturyCrofts. Tailby, R., & Haslam, C. (2003). An investigation of errorless learning in memory-impaired patients: improving the technique and clarifying theory. Neuropsychologia, 41 (9), 1230– 1240. Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory (pp. 381–402). New York: Academic Press. (p. 487)

Tulving, E. (1995). Organization of memory: Quo vadis? In M. S. Gazzaniga (Ed.), The cog nitive neurosciences (pp. 839–847). Cambridge, MA: MIT Press. Tulving, E. (2002). Episodic memory: From mind to brain. Annual Review of Psychology, 53, 1–25. Tyerman, A., & King, N. (2004). Interventions for psychological problems after brain in jury. In L. H. Goldstein & J. E. McNeil (Eds.), Clinical neuropsychology: A practical guide to assessment and management for clinicians (pp. 385–404). Chichester, UK: John Wiley & Sons. Vakil, E., & Herishanu-Naaman, S. (1998). Declarative and procedural learning in Parkinson’s disease patients having tremor or bradykinesia as the predominant symptom. Cortex; a Journal Devoted to the Study of the Nervous System and Behavior, 34 (4), 611– 620. Vallar, G., & Papagno, C. (2002). Neuropsychological impairments of verbal short-term memory. In Handbook of memory disorders (2nd ed., pp. 249–270). Chichester, UK: Wiley. Warrington, E. K. (1975). The selective impairment of semantic memory. Quarterly Jour nal of Experimental Psychology, 27 (4), 635–657. Warrington, E. K., & Weiskrantz, L. (1968). A study of learning and retention in amnesic patients. Neuropsychologia, 6 (3), 283–291. Page 24 of 26

Memory Disorders Wearing, D. (2005). Forever today: A memoir of love and amnesia. London: Doubleday. Westerberg, H., Jacobaeus, H., Hirvikoski, T., Clevberger, P., Ostensson, M. L., Bartfai, A., & Klingberg, T. (2007). Computerized working memory training after stroke—a pilot study. Brain Injury, 21 (1), 21–29. Williams, W. H., & Evans, J. J. (2003). Biopsychosocial approaches in neurorehabilitation: Assessment and management of neuropsychiatric, mood and behaviour disorders. Special Issue of Neuropsychological Rehabilitation, 13, 1–336. Wilson, B. A. (1991). Long-term prognosis of patients with severe memory disorders. Neu ropsychological Rehabilitation, 1 (2), 117–134. Wilson, B. A. (1997). Semantic memory impairments following non progressive brain in jury a study of four cases. Brain Injury, 11 (4), 259–270. Wilson, B. A. (1999). Case studies in neuropsychological rehabilitation (p. 384). New York: Oxford University Press. Wilson, B. A. (2004). Assessment of memory disorders. In A. D. Baddeley, M. D. Kopel man, & B. A. Wilson (Eds.), The essential handbook of memory disorders for clinicians (pp. 159–178). Chichester, UK: John Wiley & Sons. Wilson, B. A. (2009). Memory rehabilitation: Integrating theory and practice. New York: Guilford Press. Wilson, B. A., & Baddeley, A. D. (1993). Spontaneous recovery of impaired memory span: Does comprehension recover? Cortex, 29 (1), 153–159. Wilson, B. A., Baddeley, A. D., & Young, A. W. (1999). LE, a person who lost her “mind’s eye.” Neurocase, 5 (2), 119–127. Wilson, B. A., Emslie, H. C., Quirk, K., & Evans, J. J. (2001). Reducing everyday memory and planning problems by means of a paging system: A randomised control crossover study. Journal of Neurology, Neurosurgery and Psychiatry, 70 (4), 477–482. Wilson, B. A., Emslie, H., Quirk, K., Evans, J., & Watson, P. (2005). A randomised control trial to evaluate a paging system for people with traumatic brain injury. Brain Injury, 19, 891–894. Wilson, B. A., & Evans, J. J. 
(2000). Practical management of memory problems. In G. E. Berrios & J. R. Hodges (Eds.), Memory disorders in psychiatric practice (pp. 291–310). Cambridge, UK: Cambridge University Press. Wilson, B. A., Evans, J. J., Emslie, H., & Malinek, V. (1997). Evaluation of NeuroPage: A new memory aid. Journal of Neurology, Neurosurgery and Psychiatry, 63, 113–115. Wilson, B. A., Evans, J. J., Gracey, F., & Bateman, A. (2009). Neuropsychological rehabilita tion: Theory, therapy and outcomes. Cambridge, UK: Cambridge University Press. Page 25 of 26

Memory Disorders Wilson, B. A., Fish, J., Emslie, H. C., Evans, J. J., Quirk, K., & Watson, P. (2009). The Neu roPage system for children and adolescents with neurological deficits. Developmental Neurorehabilitation, 12, 421–426. Wilson, B. A., Kopelman, M. D., & Kapur, N. (2008). Prominent and persistent loss of selfawareness in amnesia: Delusion, impaired consciousness or coping strategy? Neuropsy chological Rehabilitation, 18, 527–540. Ylvisaker, M., & Feeney, T. (2000). Reconstruction of identity after brain injury. Brain Im pairment, 1 (1), 12–28.

Notes:

(1) The two sensory memory systems most studied are visual or iconic memory and auditory or echoic memory. People with disorders in the sensory memory systems would usually be considered to have a visual or an auditory perceptual disorder rather than a memory impairment and thus are beyond the scope of this chapter.

Barbara Wilson

Barbara A. Wilson, MRC Cognition and Brain Sciences Unit, Cambridge, UK

Jessica Fish

Jessica Fish, MRC Cognition and Brain Sciences Unit, Cambridge, UK


Cognitive Neuroscience of Written Language: Neural Substrates of Reading and Writing

Kyrana Tsapkini and Argye Hillis
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0024

Abstract and Keywords

Spelling and reading are evolutionarily relatively new functions, and therefore it is plausible that they are accomplished by engaging neural networks initially devoted to other functions, such as object recognition in the case of reading. However, there are unique aspects of these complex tasks, such as the fact that certain words (e.g., “regular” words) never previously encountered can often be read accurately. Furthermore, spelling in many ways seems to be simply the reverse of reading, but the relationship is not quite so simple, at least in English. For example, the spoken word “lead” can be spelled led or lead, but the printed word lead can be pronounced like “led” or “lead” (rhyming with “feed”). Therefore, there may be some unique areas of the brain devoted to certain components of reading or spelling. This chapter reviews the cognitive processes underlying these tasks as well as areas of the brain that are thought to be necessary for these component processes (e.g., on the basis of individuals who are impaired in each component because of lesions in a particular area) and areas of the brain engaged in each component on the basis of functional imaging studies showing neural activation associated with a particular type of processing.

Keywords: reading, writing, dyslexia, dysgraphia, neuroimaging

Introduction

In this chapter, we tackle the issue of correspondence between cognitive and neural substrates underlying the processes of reading and spelling. The main questions we explore are how the brain reads and spells and how the cognitive architecture corresponds to the neural one. We focus our discussion on the role of the following areas of the left hemisphere that have been found to be involved in the processes of reading and spelling in lesion and functional neuroimaging studies: the inferior frontal gyrus (Brodmann area [BA] 44/45), fusiform gyrus (inferior and mesial part of BA 37), angular gyrus (BA 39), supramarginal gyrus (BA 40), and superior temporal gyrus (BA 22). We also discuss the role of some other areas that seem to be involved only in reading or spelling. The particular issues we address are (1) the distinction between areas that are engaged in, versus necessary for, reading and spelling, as shown by functional neuroimaging and lesion studies; (2) the unimodality versus multimodality of each area; and (3) whether areas are dedicated to one or more cognitive processes. The question of common regions for reading and spelling is not the focus of this chapter, but we summarize recent papers that deal with this particular issue (Philipose et al., 2007; Rapcsak, 2007; Rapp & Lipka, 2011). First, we present what is known about the cognitive architecture of reading (recognition/comprehension) and spelling (production) as derived from classic case studies in the tradition of cognitive neuropsychology as well as relevant computational (nonrepresentational) models, briefly exposing the rationale for postulating each cognitive module or process. Subsequently, we connect what we know about neural processes in reading and writing with the cognitive processes described earlier.

Cognitive Architecture of Reading and Spelling

Investigators have tried to integrate the processes of reading and spelling of the literate mind in a single cognitive model, in both cognitive neuropsychological accounts (see Coltheart et al., 1980; Ellis & Young, 1988) and connectionist accounts (see Seidenberg & McClelland, 1989; Plaut & Shallice, 1993). In these models, each cognitive component proposed is derived from evidence from neurologically impaired subjects. Although components in these models are depicted as independent, there is ample evidence that these modules interact and that the processing is parallel rather than strictly serial. Figure 24.1 represents a cognitive model of reading and spelling according to a representational account. We chose an integrative model for reading and spelling because we will not explore the issue of shared versus independent components between reading and spelling that has been addressed elsewhere (see Tainturier & Rapp, 2001, for a discussion).


Figure 24.1 A representational model of reading and spelling.

Although both representational and connectionist models represent dynamic systems, especially in their more recent versions, the main difference between them is that in representational accounts, reading and spelling of familiar versus unfamiliar words are accomplished through two distinct routes or mechanisms, whereas in connectionist architecture, they are accomplished through the same process. The two routes in the representational account are a lexical route, where stored (learned) representations of the words (phonological, semantic, and orthographic) may be accessed and available for reading or spelling; and a sublexical route, which converts graphemes (abstract letter identities) to phonemes (speech sounds) in reading and phonemes to graphemes in writing. The lexical route is proposed to be used for reading and spelling irregular and low-frequency words. For example, to read yacht, one would access the orthographic representation (learned spelling) of yacht, the semantic representation (learned meaning) of yacht, and the phonological representation (learned pronunciation) of yacht. The sublexical (orthography-to-phonology conversion) route is proposed to be used for decoding nonwords or first-encountered words, such as unfamiliar proper names. Both routes may interact and operate in parallel; that is, there is likely to be functional interaction between processes and components (Hillis & Caramazza, 1991; Rapp et al., 2002). Hence, the sublexical mechanisms (e.g., grapheme-to-phoneme conversion) might contribute (along with semantic information) to accessing stored lexical representations for output.

In a connectionist architecture, on the other hand, both words and nonwords (or unfamiliar words) are read and spelled through the same mechanisms (Seidenberg & McClelland, 1989; Plaut & Shallice, 1993). An important feature of these models is that in order to read or write any word, the system does not need to rely on any previous representation of the word or on any memory trace, but instead relies on the particular algorithm of interaction between phonological units, semantic units, and orthographic units, and on the frequency with which they are activated together.

The issue of which architecture better explains the behavioral findings from both normal participants and neurologically impaired patients has generated debate over the past decades but has also helped to further our understanding of the cognitive architecture of reading and writing. We do not address the issue here. For heuristic reasons, we adopt a representational model to consider the neural substrates of reading and spelling, but we assume, consistent with computational models, that the component processes are interactive and operate in parallel. We also assume, as in computational models, that the representations are distributed and overlapping. That is, the phonological representation of “here” overlaps considerably with the phonological representations of “fear” and “heap,” and completely with the phonological representation of “hear.” Before discussing evidence for the neural substrates of reading and spelling, we sketch out the basic cognitive elements that have been proposed.
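The notion of distributed, overlapping phonological representations can be made concrete with a toy calculation. The sketch below is illustrative only: the three-slot phoneme codes are simplified transcriptions assumed for this example rather than taken from any model cited here, and `overlap` is a hypothetical helper that counts matching positions.

```python
# Simplified, assumed transcriptions as (onset, vowel, coda) slots.
# These codes are illustrative and not drawn from any published model.
PHONOLOGY = {
    "here": ("h", "ee", "r"),
    "hear": ("h", "ee", "r"),
    "fear": ("f", "ee", "r"),
    "heap": ("h", "ee", "p"),
}


def overlap(word_a: str, word_b: str) -> float:
    """Fraction of phoneme slots shared by two words (hypothetical measure)."""
    a, b = PHONOLOGY[word_a], PHONOLOGY[word_b]
    shared = sum(1 for x, y in zip(a, b) if x == y)
    return shared / len(a)


for other in ("hear", "fear", "heap"):
    print(f"here vs {other}: {overlap('here', other):.2f}")
```

Under these assumed codes, “here” and “hear” share every slot, whereas “here” shares two of three slots with each of “fear” and “heap,” which mirrors the complete versus considerable overlap described above.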

Reading Mechanisms

The reading process (see Figure 24.1) begins with access to graphemes from letter shapes, with specific font, case, and so on. Once individual graphemes are accessed, they comprise a case-, font-, and orientation-independent graphemic description, which is held by a temporary working memory component (sometimes called the graphemic buffer). From here, representational models split processing into the lexical and sublexical routes. The sublexical route, used to read nonwords or first-encountered letter strings before they acquire a memory trace (an orthographic representation), transforms graphemes to phonemes. The lexical mechanism, on the other hand, consists of three components: the orthographic lexicon, which is a long-term memory store of orthographic representations of words that have been encountered before; the semantic system, which contains the semantic representations (meanings) of these words; and the phonological lexicon, which is a long-term memory store of the phonological representations of words that have been encountered before. During reading, both routes end at the peripheral component that converts phonemes to motor plans for articulation, so that the word or nonword can be read aloud.

As mentioned earlier, the independence of these components was inferred from lesion studies, in which patients showed patterns of performance that could be explained by assuming selective damage to a single component. Patients with selective deficits at the level of the orthographic lexicon cannot access lexical orthographic representations; that is, they cannot read or understand familiar words, but they understand the same words when they hear them (Patterson et al., 1985). Patients with selective deficits at the semantic level, on the other hand, cannot understand either spoken or written familiar words (Howard et al., 1984). Patients with selective deficits at the level of the phonological lexicon can understand written words but make word selection errors when trying to pronounce them (Bub & Kertesz, 1982; Hillis & Caramazza, 1990; Rapp et al., 1997). Patients with deficits in the sublexical grapheme-to-phoneme conversion mechanisms are unable to read pseudowords or unfamiliar words but can read familiar words correctly (Beauvois & Derouesné, 1979; Funnell, 1996; Marshall & Newcombe, 1973).
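The two reading routes, and the way a selective deficit can be inferred from a pattern of performance, can be sketched in a few lines. This is a deliberately minimal illustration under stated assumptions: the lexicon entries, the one-letter conversion rules, the informal pronunciation strings, and the `lesion` parameter are all hypothetical, and the routes are tried one after the other only for simplicity, whereas the account above assumes that they interact and operate in parallel.

```python
from typing import Optional

# Toy lexical route: stored pronunciations for familiar words
# (hypothetical entries written in informal spelling, not IPA).
LEXICON = {"yacht": "yot", "cat": "kat"}

# Toy sublexical route: one-letter grapheme-to-phoneme rules (hypothetical).
GPC_RULES = {"a": "a", "b": "b", "c": "k", "f": "f", "h": "h",
             "i": "i", "l": "l", "t": "t", "y": "y"}


def read_aloud(letters: str, lesion: Optional[str] = None) -> Optional[str]:
    """Return a toy pronunciation, or None if no intact route can produce one.

    lesion may be None, "lexical", or "sublexical" (assumed labels).
    """
    # Lexical route: look up a stored representation of a familiar word.
    if lesion != "lexical" and letters in LEXICON:
        return LEXICON[letters]
    # Sublexical route: assemble a pronunciation letter by letter.
    if lesion != "sublexical" and all(ch in GPC_RULES for ch in letters):
        return "".join(GPC_RULES[ch] for ch in letters)
    return None


print(read_aloud("yacht"))                       # familiar irregular word
print(read_aloud("blif"))                        # nonword via conversion rules
print(read_aloud("blif", lesion="sublexical"))   # nonword with damaged conversion
print(read_aloud("yacht", lesion="lexical"))     # irregular word read by rule
```

With the sublexical route disabled, the familiar word is still read but the nonword is not, which is the pattern described above for deficits in grapheme-to-phoneme conversion. The regularized output for the lexical lesion is a consequence of this toy's fallback design and should not be read as a claim about any patient group.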


Spelling Mechanisms

There are two hypotheses regarding the cognitive computations of spelling: In the first, they are described as the reverse of reading, and in the second, reading and spelling share only the semantic system. According to the first account (also called the shared-components account), the orthographic and phonological lexicons in spelling are the same ones used in reading, whereas the second account (also called the independent-components account) posits that the phonological and orthographic lexicons used in spelling are independent from those used in reading. Whether reading and spelling share the same components has been controversial. The existence of patients with both associations and dissociations between their reading and spelling does not prima facie favor one or the other account. The issue of whether reading and writing share the same computational components is discussed in great detail by Hillis and Rapp (2004) for each mechanism: the semantic system, the orthographic lexicon, the sublexical mechanism, the graphemic buffer, and the representation of letter shapes.

In either case, spelling to dictation starts off with a peripheral phonemic analysis system that distinguishes the sounds of one’s own language from other languages or environmental sounds, by matching the sounds to learned phonemes. Then, there are again perhaps two routes: the lexical route and the sublexical route. In the lexical route, the sequence of phonemes accesses the phonological lexicon, the repository of stored representations of spoken words, so that the heard word is recognized. The phonological representation then accesses the semantic system that evokes the meaning of the word. Finally, the stored orthographic representation of the word is accessed from the orthographic lexicon. In the sublexical route, the phoneme-to-grapheme conversion mechanism allows for the heard word or nonword to be transformed to a sequence of corresponding graphemes. Both routes, then, end up at the graphemic buffer, a temporary storage mechanism for sequences of graphemes, which “holds” the position and identity of letters while each letter is written or spelled aloud. Finally, graphemes (abstract letter identities) are converted to letter shapes, with specific case, font, and so on, by evoking motor schemes for producing the letter.

Neural Correlates of Reading and Writing

In this section we discuss evidence from both lesion-deficit correlation studies in patients with impairments in reading or spelling (including studies using voxel-based morphometry and diffusion tensor imaging [DTI]) as well as data from functional neuroimaging (positron emission tomography [PET], functional magnetic resonance imaging [fMRI], and magnetoencephalography [MEG]). Despite the appeal of the enterprise, there remains much to be learned about the neural correlates of cognitive processes. We attempt in this chapter to review evidence that certain areas are involved in reading and writing. We examine data from both lesion-deficit and functional neuroimaging studies in order to specify the particular role of these areas in particular cognitive processes. We start with the role of the inferior occipital-temporal cortex and fusiform gyrus in particular (BA 37), then continue with the inferior frontal gyrus (IFG) and in particular BA 44/45, as well as other perisylvian areas, such as the superior temporal gyrus (BA 22); then we discuss the role of more controversial areas such as the angular gyrus (BA 39) and the supramarginal gyrus/sulcus (BA 40). We end this discussion with the role of areas more peripheral to the process, such as premotor BA 6 (including so-called Exner’s area). The general locations of these areas are shown in Figure 24.2.

Figure 24.2 General locations of the Brodmann areas involved in reading and spelling. Red circles represent areas important for reading and spelling; blue circle represents area important for spelling only. Note that there is substantial variation in the precise location of cytoarchitectural fields (as well as in the shapes and sizes of brains) across individuals.

Because we will discuss brain areas involved in reading and spelling, it is important to clarify the unique contribution of each type of study, hence the distinction between areas “necessary for” and those “engaged in” a given cognitive process. Lesion and functional neuroimaging studies answer different types of questions regarding the relationship between brain areas and cognitive processes. The main question asked in lesion studies is whether a brain area is necessary for a certain type of processing. If an area is necessary for a particular aspect of reading or spelling, then a patient with damage to that area will not be able to perform the particular type of processing. However, areas that appear to be activated in functional neuroimaging studies in normal subjects may well be involved in, but are not always necessary for, the particular type of processing. In the following sections we discuss some brain areas found to be involved in reading and writing in a variety of studies, including lesion-deficit correlation studies, functional neuroimaging studies with normal control subjects, and functional neuroimaging studies with patients.
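The inferential logic behind the “necessary for” versus “engaged in” distinction can be written out as a small decision rule. The function below is a hypothetical sketch for illustration only: its name, its two Boolean inputs, and its output labels are assumptions of this example, and real inferences depend on lesion size, timing, and task design rather than on two yes/no observations.

```python
def classify_area(activated_in_imaging: bool, lesion_causes_deficit: bool) -> str:
    """Toy rule combining two kinds of evidence about one brain area.

    lesion_causes_deficit: damage to the area impairs the process
        (the kind of evidence that lesion studies provide).
    activated_in_imaging: the area is active while unimpaired subjects
        perform the process (functional neuroimaging evidence).
    """
    if lesion_causes_deficit:
        return "necessary"
    if activated_in_imaging:
        return "engaged, not shown to be necessary"
    return "no evidence of involvement"


print(classify_area(activated_in_imaging=True, lesion_causes_deficit=True))
print(classify_area(activated_in_imaging=True, lesion_causes_deficit=False))
print(classify_area(activated_in_imaging=False, lesion_causes_deficit=False))
```

The asymmetry is the point of the sketch: activation alone never yields the label “necessary,” which is why the two types of study are treated as answering different questions.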

Fusiform Gyrus (BA 37)

The fusiform gyrus (part of BA 37) has been associated with written language and has dominated both the neuropsychological and the neuroimaging literature more than any other area. In this section we summarize and update the available evidence from both the lesion-deficit and the functional neuroimaging literature. We then discuss possible interpretations of the role of the fusiform in reading and spelling.

Several lesion-deficit studies have shown that dysfunction or lesion at the left fusiform gyrus or at the left inferior-temporal gyrus (lateral BA 37) causes disrupted access to orthographic representations in oral reading or spelling in both acute and chronic stroke patients (Gaillard et al., 2006; Hillis et al., 2001, 2004; Patterson & Kay, 1982; Philipose et al., 2007; Rapcsak & Beeson, 2004; Rapcsak et al., 1990; Tsapkini & Rapp, 2010). In particular, Rapcsak and Beeson (2004) examined eight patients with reading and writing impairments after chronic stroke that had caused a lesion in BA 37 and BA 20 (mid- and anterior fusiform gyrus). The importance of this region in oral reading of words and pseudowords as well as in oral and written naming was also confirmed in acute lesion studies (Hillis et al., 2004; Philipose et al., 2007). Philipose and colleagues (2007) reported deficits in oral reading and spelling of words and pseudowords in sixty-nine cases of acute stroke. Their analyses showed that dysfunction of BA 37 (and BA 40, as we discuss later), as demonstrated by hypoperfusion, was strongly correlated with impairments of oral reading and spelling of both words and pseudowords. In general, lesion studies, whether they refer to reading only or spelling only (Hillis et al., 2002; Patterson & Kay, 1982; Rapcsak et al., 1990), or both (Hillis et al., 2005; Philipose et al., 2007; Rapcsak et al., 2004; Tsapkini & Rapp, 2009), confirmed that dysfunction in left BA 37 results in impairment in written language output (oral reading and spelling), indicating that function in this area is necessary for some aspect of these tasks.
However, in most lesion-deficit studies the lesions are large, and in most (but not all) cases, there are other areas that are dysfunctional that might contribute to the deficit. Two recent studies (Gaillard et al., 2006; Tsapkini & Rapp, 2009) address this issue by describing patients with focal lesions that do not involve extended posterior visual areas known to be responsible for general visual processing, and they offer additional evidence of the importance of the fusiform gyrus in reading and spelling. Gaillard and colleagues (2006) reported on a patient who exhibited impairment in reading but not spelling after a focal lesion to the posterior portion of the left fusiform gyrus. The deficit in this patient was interpreted as resulting from a disconnection between the early reading areas of the posterior fusiform and abstract orthographic representations whose access might require the mid-fusiform gyrus. Similar “disconnection” accounts of pure alexia (also called letter-by-letter reading or alexia without agraphia) after left occipital-temporal or occipital-splenial lesions, usually due to posterior cerebral artery stroke, have been given since the 1880s (Binder et al., 1992; Cohen et al., 2003; Déjerine, 1891; Epelbaum et al., 2008; Gaillard et al., 2006; Leff et al., 2006).
In contrast to cases of pure alexia, many studies have indicated that damage in the mid-fusiform results in impairments in both oral reading and spelling (Hillis et al., 2005; Philipose et al., 2007; Rapcsak et al., 2004; Tsapkini & Rapp, 2010). For example, in a case report of a patient with a focal lesion in this area after resection of the more anterior mid-fusiform gyrus, Tsapkini and Rapp (2009) showed that orthographic representations as well as access to word meanings from print were disrupted for both reading and spelling.

Furthermore, except for the impairment in orthographic processing of words (but not of nonwords), there was no impairment in other types of visual processing such as the processing of faces and objects. It is striking that the localization of reading and spelling is the same in all studies in which both reading and spelling are affected (Hillis et al., 2005; Philipose et al., 2007; Rapcsak et al., 2004; Tsapkini & Rapp, 2009). The above findings lend support to the claim that the mid-fusiform gyrus is a necessary area for orthographic processing (but see Price & Devlin, 2003, for a different view, discussed below).
Functional neuroimaging studies have also found that reading words and nonwords, or simply viewing letters relative to other visual stimuli, activates the left (greater than right) mid-fusiform gyri. The area of activation is remarkably consistent across studies and across printed stimuli, irrespective of location, font, or size (Cohen et al., 2000, 2002; Dehaene et al., 2001, 2002; Gros et al., 2001; Polk et al., 2002; Polk & Farah, 2002; Price et al., 1996; Puce et al., 1996; Uchida et al., 1999; see also Cohen & Dehaene, 2004, for review). Some investigators have taken these data as evidence that the left mid-fusiform gyrus has a specialized role in computing location- and font-independent visual word forms (Cohen et al., 2003; McCandliss et al., 2003). This area has been labeled the visual word form area (VWFA; Cohen et al., 2000, 2002, 2004a, 2004b). However, based on activation of the same or a nearby area during a variety of nonreading lexical tasks, and on the fact that lesions in this area are associated with a variety of lexical deficits, other investigators have objected to the label.
These opponents of the VWFA label have proposed that the left mid-fusiform gyrus has a specialized role in modality-independent lexical processing, rather than in reading alone (Büchel, Price, & Friston, 1998; Price et al., 2003; Price & Devlin, 2003, 2004; see below for discussion).
Recently there have been further specifications of the VWFA proposal. Recent findings suggest that there is a hierarchical organization in the fusiform gyrus and that word recognition proceeds in a posterior-to-anterior direction in the occipital-temporal cortex, which tunes progressively to more specific elements, from elementary visual features to letters, bigrams, and finally whole words (Cohen et al., 2008; Dehaene et al., 2005; Vinckier et al., 2007). This proposal is accompanied by a proposal about the experiential attenuation of the occipital-temporal cortex. Because reading and writing are very recent skills in evolutionary terms, it would be hard to argue for an area “set aside” for orthographic processing only. Therefore, the proposal is that reading expertise develops through experience and fine-tunes neural tissue that was supposed to achieve detailed visual processing (Cohen et al., 2008; McCandliss et al., 2003). Supporting evidence for this proposal comes from the developmental literature as well as from Aghababian and Nazir (2000), who reported increases in reading accuracy and speed with age and maturation. The experiential attenuation of the occipital-temporal cortex may also be the reason we see different subdivisions of the fusiform involved in word versus pseudoword reading, or even in the frequency effects observed (Binder et al., 2006; Bruno et al., 2008; Dehaene et al., 2005; Glezer et al., 2009; Kronbichler et al., 2004, 2007; Mechelli et al., 2003).
This familiarity effect in the fusiform has also been found for writing (Booth et al., 2003; Norton et al., 2007), confirming a role for the fusiform in both reading and spelling. Recent functional neuroimaging findings of reading and spelling in the same individuals confirmed that the mid-fusiform is more sensitive to words than to letter strings and to low-frequency than to high-frequency words, showing the sensitivity of this area to lexical orthography for both reading and spelling (Rapp & Lipka, 2011).
A different interpretation of these results has been offered. Price and Devlin (2003) have rigorously questioned the notion that this area is dedicated to orthographic processing only, arguing that it is also activated in hearing or speaking (Price et al., 1996, 2005, 2006), which undermines any claims about orthographic specificity. Other studies have shown that the VWFA is activated when subjects process nonorthographic visual patterns, and investigators have claimed that this area is not dedicated to orthographic processing per se but rather to complex visual processing (Ben-Shachar et al., 2007; Joseph et al., 2006; Starrfelt & Gerlach, 2007; Wright et al., 2008). Others have found that this area may selectively process orthographic forms, but not exclusively (Baker et al., 2007; Pernet et al., 2005; Polk et al., 2002).
To evaluate the proposed roles of the mid-fusiform gyrus in orthographic processing versus more general lexical processing, Hillis et al. (2005) studied eighty patients with acute, left-hemisphere ischemic stroke on two reading tasks that require access to a visual word form (or orthographic lexicon) but do not require lexical output: written lexical decision and written word–picture verification. Patients were also administered other lexical tasks, including oral and written naming of pictures, oral naming of objects from tactile exploration, oral reading, spelling to dictation, and spoken word–picture verification. Patients underwent magnetic resonance imaging, including diffusion- and perfusion-weighted imaging, the same day.
It was argued that if the left mid-fusiform gyrus were critical to accessing visual word forms, then damage or dysfunction of this area would, at least at the onset of stroke (before reorganization and recovery), reliably cause impaired performance on written lexical decision and written word–picture verification. However, there was no significant association between damage or dysfunction of this region and impaired written lexical decision or written word–picture verification, indicating that the left mid-fusiform gyrus is not reliably necessary for accessing visual word forms (Hillis et al., 2005). Of the fifty-three patients who showed infarct or hypoperfusion (severe enough to cause dysfunction) involving the left mid-fusiform gyrus, twenty-two subjects had intact written word comprehension, and fifteen had intact written lexical decision, indicating that both tasks can be accomplished without function of this region. However, there was a strong association between damage or dysfunction of this area and impairment in oral reading (χ² = 10.8, df = 1, p = 0.001), spoken picture naming (χ² = 18.9, df = 1, p < 0.0001), spoken object naming to tactile exploration (χ² = 8.2, df = 1, p < 0.004), and written picture naming (χ² = 13.5, df = 1, p < 0.0002). These results indicate that structural damage or tissue dysfunction of the left mid-fusiform gyrus was associated with impaired lexical processing, irrespective of the input or output modality. In this study, the infarct or hypoperfusion that included the left mid-fusiform gyrus usually extended to adjacent areas of the left fusiform gyrus (BA 37). Therefore, it is possible that dysfunction of areas near the left mid-fusiform gyrus, such as the lateral inferior-temporal multimodality area (LIMA; Cohen, Jobert, Le Bihan, & Dehaene, 2004), was critical to modality-independent lexical processing. Further support for a critical role of at least a portion of the left fusiform gyrus in modality-independent lexical output is provided by patients who show impaired naming (with visual or tactile input) and oral reading when this region is hypoperfused, and improved naming and oral reading when this region is reperfused (Hillis, Kane, et al., 2002; Hillis et al., 2005). Previous studies have also reported oral naming and oral reading deficits associated with lesions to this region (e.g., Foundas et al., 1998; Hillis, Tuffiash, et al., 2002; Raymer et al., 1997; Sakurai, Sakai, Sakuta, & Iwata, 1994).
One possible way to accommodate these various sets of results is the following. Part of the left fusiform gyrus (perhaps lateral to the VWFA identified by Cohen and colleagues) may be critical for modality-independent lexical processing, rather than a reading-specific process (Price & Devlin, 2003, 2004); and part of the left or right mid-fusiform gyrus is essential for computing a location-, font-, and orientation-independent graphemic description (one meaning of a visual word form). Evidence in support of this latter component of the account comes from patients with pure alexia (who have a disconnection between visual input and the left mid-fusiform gyrus) (Binder & Mohr, 1992; Chialant & Caramazza, 1997; Marsh & Hillis, 2005; Miozzo & Caramazza, 1997; Saffran & Coslett, 1998). In one case of acute pure alexia, written word recognition was impaired when the left mid-fusiform gyrus was infarcted and the splenium of the corpus callosum was hypoperfused, but recovered when the splenium was reperfused (despite persistent damage to the left mid-fusiform gyrus).
These results could be explained by proposing that the patient was able to rely on the right mid-fusiform gyrus to compute case-, font-, and orientation-independent graphemic descriptions (when the left was damaged); but these graphemic descriptions could not be used for reading until reperfusion of the splenium allowed information from the right mid-fusiform gyrus to access language cortex for additional components of the reading process (Marsh & Hillis, 2005). This proposal of a critical role of either the left or right mid-fusiform gyrus in computing graphemic descriptions is consistent with findings of (1) reliable activation of this area in response to written words, pseudowords, and letters in functional imaging; (2) activation of bilateral mid-fusiform gyrus in written lexical decision (Fiebach et al., 2002); and (3) electrophysiological evidence of mid-fusiform activity early in the reading process (Salmelin et al., 1996; Tarkiainen et al., 1999).
In general, there are fewer functional imaging studies of spelling than reading (for reviews and meta-analyses, see Jobard et al., 2003; Mechelli et al., 2003; Turkeltaub et al., 2002). However, four of the six spelling studies show that the left fusiform gyrus is an area involved in spelling (Beeson et al., 2003; Norton et al., 2007; Rapp & Hsieh, 2002; Rapp & Lipka, 2011). Spelling might require both computation of a case-, font-, and orientation-independent graphemic description (proposed to require right or left mid-fusiform gyrus) and modality-independent lexical output (proposed to require the lateral part of left mid-fusiform gyrus). Further evidence of a role for part of the left mid-fusiform gyrus in modality-independent lexical output is that reading Braille in blind subjects (Büchel et al., 1998) and sign language in deaf subjects (Price et al., 2005) activates this area.

Figure 24.3 shows the remarkable similarity in the area identified as a key neural substrate of reading, spelling, or naming across various lesion and functional imaging studies.
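As a rough illustration of the chi-square associations quoted earlier in this section from Hillis et al. (2005), the following Python sketch converts a chi-square statistic with one degree of freedom into its upper-tail p-value and computes an uncorrected Pearson chi-square for a 2 × 2 table of region dysfunction by task impairment. The function names and the cell counts of the example table are hypothetical and chosen only for illustration; the text supplies only the reported statistics and the totals of eighty patients, fifty-three of them with mid-fusiform dysfunction. The original analysis may well have used a different procedure (for example, a continuity correction or an exact test), and because the published statistics are rounded, recomputed p-values can differ slightly from the printed ones.

```python
import math


def chi2_p_value_df1(chi2: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    # With df = 1 the statistic is the square of a standard normal deviate,
    # so P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / margins


# Statistics as reported in the text (df = 1 in every case).
reported = {
    "oral reading": 10.8,
    "spoken picture naming": 18.9,
    "tactile object naming": 8.2,
    "written picture naming": 13.5,
}
for task, chi2 in reported.items():
    print(f"{task}: chi2 = {chi2}, p = {chi2_p_value_df1(chi2):.5f}")

# HYPOTHETICAL cell counts (not from the study), arranged so that the row
# totals match the text: 53 patients with mid-fusiform dysfunction, 27 without.
#                        impaired  intact
# dysfunction present        40      13
# dysfunction absent         10      17
example = chi2_2x2(40, 13, 10, 17)
print(f"hypothetical table: chi2 = {example:.2f}, p = {chi2_p_value_df1(example):.5f}")
```

A table whose impaired and intact cases are spread evenly across the two rows would give a statistic near zero, which is the pattern the text describes for written lexical decision and written word–picture verification.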

Inferior Frontal Gyrus and Broca’s Area (BA 44/45)
The left posterior IFG is probably the brain area that has most often been found to be involved in language processing, and in particular in language production, since the 1800s. Damage in the IFG has also been associated with impairments in fluency (Goodglass et al., 1969), picture naming (Hecaen & Consoli, 1973; Hillis et al., 2004; Miceli & Caramazza, 1988), lexical impairment (Goodglass, 1969; Perani & Cappa, 2006), semantic retrieval (Goodglass et al., 1969), and syntactic processing (Grodzinsky, 2000).
There have also been many reports of patients with impairments in reading and spelling after lesions in Broca’s area. IFG lesions are often associated with deficits in reading or writing nonwords using grapheme-to-phoneme conversion rules, sometimes with accurate performance on words. However, in most of these cases, lesions in BA 44/45 were accompanied by damage to other perisylvian areas, such as the insula, precentral gyrus (BA 4/6), and superior temporal gyrus or Wernicke’s area (BA 22) (Coltheart, 1980; Fiez et al., 2006; Henry et al., 2005; Rapcsak et al., 2002, 2009; Roeltgen et al., 1983). Thus, in the chronic lesion studies no area could be identified as critical for nonword reading or spelling by itself, in the absence of damage to the surrounding areas (Rapcsak, 2007).

Figure 24.3 A, Cluster of voxels where hypoperfusion/infarct was most strongly associated with impaired spelling of words in acute stroke. B, Cluster of voxels with greatest activation associated with reading of words in a functional magnetic resonance imaging (fMRI) study. C, Cluster of voxels with greatest activation associated with spelling of words in an fMRI study. D, Magnetic resonance perfusion-weighted image showing area of hypoperfusion in patient with impaired oral reading, oral naming, and spelling in acute stroke. In each panel, the arrow points to the important voxels within left Brodmann area 37 in the mid-fusiform gyrus. A, adapted with permission from Philipose et al., 2007. Copyright © 2007, John Wiley and Sons; B, reprinted from Trends in Cognitive Sciences, Vol. 7, Issue 7, Bruce D. McCandliss, Laurent Cohen, and Stanislas Dehaene, “The visual word form area: expertise for reading in the fusiform gyrus,” 293–299, Copyright 2003, with permission from Elsevier; C, adapted with permission from Beeson et al., 2003. Copyright © 2003 Routledge; D, adapted with permission from Hillis et al., 2004. Copyright © 2004 Routledge.

Studies have shown that a lesion or tissue dysfunction of this area alone may result in deficits in spelling (see Hillis et al., 2002, for review). Many patients with relatively localized lesions involving Broca’s area have pure agraphia, with impaired written naming (particularly of verbs compared with nouns), but spared oral naming of verbs and nouns, accompanied by impaired sublexical mechanisms for converting phonemes to graphemes (Hillis, Chang, & Breese, 2004; see also Hillis, Rapp, & Caramazza, 1999). Additional evidence that the IFG is crucial for written naming of verbs comes from the use of reperfusion in acute stroke (Hillis et al., 2002). For example, Hillis and colleagues found that impairment in accessing orthographic representations (particularly verbs) with relatively spared oral naming and word meaning was associated with dysfunction (hypoperfusion) of Broca’s area. Furthermore, reperfusion of Broca’s area resulted in recovery of written naming of verbs. Thus, Broca’s area seems to be critical at least for writing verbs (in the normal state, before reorganization).

The reason we do not always see spelling or reading deficits caused by a lesion to the IFG in chronic stroke patients may be that the brain reorganizes so that the contribution of this area in spelling can be taken over by other areas over time (Price & Friston, 2002).
Functional neuroimaging studies have been employed to assess whether the IFG is engaged in reading and spelling. Many functional neuroimaging studies of reading and spelling have found IFG involvement, even when no overt speech production is required (Bolger et al., 2005; Fiez & Petersen, 1998; Jobard et al., 2003; Mechelli et al., 2003; Paulesu et al., 2000; Price, 2000; Rapp & Lipka, 2011 [review]; Turkeltaub et al., 2002). However, not all studies have shown IFG activation during spelling. One functional neuroimaging study reported frontal activation in written naming versus oral naming, but the activation did not clearly involve Broca’s area (Katanoda et al., 2001). Another study did report the activation of Broca’s area in written naming, but this activation did not survive the comparison to oral naming (Beeson et al., 2003). Nevertheless, both Beeson and colleagues (2003) and Rapp and Hsieh (2002) found spelling-specific activation in the left IFG. Therefore, it is hard to identify the exact role of the left IFG in the process of reading or spelling; subparts of this area may be necessary or engaged in distinct cognitive processes underlying reading or spelling, such as lexical retrieval or selection (Price, 2000; Thompson-Schill et al., 1999), phonological processing (Pugh et al., 1996), or retrieval or selection from the orthographic lexicon specifically (Hillis et al., 2002).
The determination of the role of Broca’s area in reading and writing becomes even more complicated if one considers the many other language and nonlanguage functions that have been attributed to this area and that may be required for reading and writing, such as working memory (Smith & Jonides, 1999), memory (Paulesu et al., 1993), syntactic processing (Thompson et al., 2007), cognitive control and task updating (Brass & von Cramon, 2002, 2004; Derrfuss et al., 2005; Thompson-Schill et al., 1999), and semantic encoding (Mummery et al., 1996; Paulesu et al., 1997; Poldrack et al., 1999).

Other Areas Implicated in Reading and Writing
Supramarginal Gyrus (BA 40)
The supramarginal gyrus has been found to be compromised (often along with other perisylvian areas such as Broca’s area, precentral gyrus, and Wernicke’s area) mostly in patients with impaired reading or spelling of nonwords or unfamiliar words (Alexander et al., 1992; Coltheart et al., 1980; Fiez et al., 2006; Henry et al., 2007; Lambon Ralph & Graham, 2000; Rapcsak & Beeson, 2002; Rapcsak et al., 2009; Roeltgen et al., 1984; Shallice, 1981). However, acute lesions of the supramarginal gyrus do seem to interfere with spelling familiar words as well (Hillis et al., 2002). Probably the strongest evidence that the supramarginal gyrus is important in reading and spelling words as well as nonwords comes from a perfusion study in acute stroke patients (Philipose et al., 2007). In this study, the authors found that the supramarginal gyrus was one of the two areas (the other was the fusiform gyrus) in which hypoperfusion (independently of hypoperfusion in other regions) predicted impairments in reading and spelling words and nonwords. These regions might be tightly coupled acutely so that dysfunction in one brain region or the other is sufficient to impair the neural network necessary for reading and spelling both words and nonwords. The reason chronic lesions in the supramarginal gyrus do not often result in deficits in reading or spelling may be that other areas successfully assume its function in the network.

Figure 24.4 Magnetic resonance diffusion-weighted image (left) and perfusion image (right) of a patient with acute infarct and hypoperfusion in left angular gyrus, who was selectively impaired in reading and spelling of words and nonwords.

The supramarginal gyrus (BA 40) has also been found to be involved in both reading and spelling in the functional neuroimaging literature for both words and nonwords (Booth et al., 2002, 2003; Corina et al., 2003; Law et al., 1991; Mummery et al., 1998; Price, 1998). In these studies, the supramarginal gyrus was implicated in sublexical conversion either from graphemes to phonemes or from phonemes to graphemes. However, in these studies clusters at the supramarginal gyrus during sublexical conversion were always accompanied by clusters at the frontal operculum in the IFG, as Jobard and colleagues (2003) note in their extensive meta-analysis. Some researchers have claimed that this set of brain regions sustains phonological working memory and serves as a temporary phonological store (Becker et al., 1999; Fiez et al., 1996, 1997; Paulesu et al., 1993). Thus, it is plausible that this region is activated in reading and writing because the sequential computations involved in sublexical conversion must be held in working memory as the word is written or spoken.

Angular Gyrus (BA 39)
Lesion studies provide evidence for a critical role of the angular gyrus in reading (Benson, 1979; Black & Behrmann, 1994; Déjerine, 1892) and spelling (Hillis et al., 2001; Rapcsak & Beeson, 2002; Roeltgen & Heilman, 1984). The angular gyrus appears to be critical not only for both sublexical conversion mechanisms, that is, orthography to phonology in reading (Hillis et al., 2001) and phonology to orthography in spelling (Hillis et al., 2002), but also for access to lexical orthographic representations (Hillis et al., 2002). Figure 24.4 shows the MRIs of a patient with acute infarct and hypoperfusion in the left angular gyrus, who was selectively impaired in reading and spelling of words and nonwords. However, some patients with lesions in this area show deficits for only reading or spelling. This finding should not be too surprising because the angular gyrus is a fairly large area, and, given that even small brain areas may be specialized, it is conceivable that it comprises smaller areas with different functions. Furthermore, in chronic stroke patients, there may be other areas that take over one function but not the other, depending on factors of individual neuronal connectivity or practice, or for some other unknown reason.
Functional neuroimaging studies of normal adults have been inconsistent in finding evidence for angular gyrus involvement; that is, some of them did find angular gyrus activation (Rapp & Hsieh, 2002), but others did not (Beeson & Rapcsak, 2002). However, several developmental studies have found that angular gyrus activation is associated with reading of pseudowords more than words and proposed that this area may be more important during development (Pugh et al., 2001). Therefore, the evidence is not yet conclusive about the exact contribution of the angular gyrus in reading or spelling, but this area seems to have an important role in these tasks.

Exner’s Area (BA 6)
Hillis et al. (2002) found that hypoperfusion (causing dysfunction) of Exner’s area was highly associated with impaired allographic conversion (converting from an abstract grapheme to a particular letter shape or letter-specific motor plan), but not with impaired access to orthographic representations. This finding agrees with Exner’s original proposal of a cortical center for writing movements. The contribution of Exner’s center to the control of hand movements has also been found in lesion studies (Anderson et al., 1990), in which patients were impaired in written but not oral spelling. It is also consistent with fMRI studies in which the comparison between written and oral spelling showed activation in Exner’s area and with cortical stimulation studies in which temporary lesions in Exner’s area caused difficulty with motor aspects of writing (see Roux et al., 2009, for a review). In functional neuroimaging studies of spelling, BA 6 was also found to be involved (Beeson et al., 2003; Rapp & Hsieh, 2002). It seems that this area is more involved in writing than reading and probably has a role in the neural machinery needed for motoric aspects of written output.

Superior Temporal Gyrus (BA 22)
The left superior temporal gyrus, in conjunction with the other perisylvian areas (BA 44/45, insula, BA 40), is involved in phonological processing (see Rapcsak, 2009, for a review). In lesion studies, especially in the case of left middle cerebral artery stroke, it has been shown to be involved in comprehension of spoken language. Dysfunction of this area has been shown to disrupt understanding of oral and written words as well as oral and written naming in acute stroke patients (Hillis et al., 2001). Furthermore, in a recent study, Philipose and colleagues (2007) found that hypoperfusion in this area was associated with impaired word reading. The above evidence was taken as an indication that this area is critical for linking words to their meaning, and it was concluded that this area may be important for both reading and spelling words to the extent that they require access to meaning. However, this area does not seem to play a critical role in either reading or spelling of nonwords (Philipose et al., 2007).

However, functional imaging studies show activation of anterior and middle areas of the superior temporal gyrus linked to phonological processing or to grapheme-to-phoneme conversion in reading (Jobard et al., 2003; Price et al., 1996; Wise et al., 1991). Likewise, an MEG study of adult reading indicated that the middle part of the superior temporal gyrus was involved in pseudoword but not word reading (Simos et al., 2002).
The discrepancy between lesion and functional imaging studies observed may be due to the fact that the superior temporal gyrus is a large area of cortex, and its posterior part may well be related to accessing meaning from phonological or orthographic codes as Wernicke has claimed, whereas its middle and anterior parts may be more dedicated to phonological processing per se. Alternatively, the middle and anterior parts of the superior temporal gyrus may be consistently engaged in all aspects of phonological processing (including sublexical grapheme-to-phoneme conversion and hearing), but not necessary for these processes (perhaps because the right superior temporal gyrus is capable of at least some aspects of phonological processing), so that unilateral lesions do not cause deficits in these aspects.

Conclusion
The evidence reviewed in this chapter, although controversial and inconclusive on some points, illustrates the complexity of the cognitive and neural mechanisms underlying reading and spelling and their relationships to one another. We started this review by taking the paradigms of reading and spelling as instances to discuss the correspondence between cognitive and neural architecture, only to discover the complexity, although not the ineffability, of this task. One of the most intriguing observations in reviewing the bulk of the current evidence is that some brain areas seem to be crucial for more than one language function or component, and some are more specialized. One function may be subserved by more than one brain area, and a single brain area may subserve more than one function. Thus, reading and spelling seem to be complex processes consisting of distinct cognitive subprocesses that seem to rely on overlapping networks of separate brain regions. Furthermore, the connectivity of this cortical circuitry may be altered and reorganized over time, owing to practice or therapy (or to new infarcts, atrophy, or resection). Further insights into the plasticity of the human cortex and its dynamic nature are needed to shed greater light on the cognitive and neural processes underlying reading and writing.

References
Aghababian, V., & Nazir, T. A. (2000). Developing normal reading skills: Aspects of the visual process underlying word recognition. Journal of Experimental Child Psychology, 76, 123–150.
Alexander, M. P., Friedman, R. B., Loverso, F., & Fischer, R. S. (1992). Lesion localization in phonological agraphia. Brain and Language, 43, 83–95.
Anderson, S. W., Damasio, A. R., & Damasio, H. (1990). Troubled letters but not numbers: Domain specific cognitive impairments following focal damage in frontal-cortex. Brain, 113, 749–766.
Baker, C. I., Liu, J., Wald, L. L., et al. (2007). Visual word processing and the experiential origins of functional selectivity in human extrastriate cortex. Proceedings of the National Academy of Sciences U S A, 104, 9087–9092.
Beauvois, M. F., & Dérouesné, J. (1979). Phonological alexia: Three dissociations. Journal of Neurology, Neurosurgery and Psychiatry, 42, 1115–1124.
Becker, J. T., Danean, K., MacAndrew, D. K., & Fiez, J. A. (1999). A comment on the functional localization of the phonological storage subsystem of working memory. Brain and Cognition, 41, 27–38.
Beeson, P. M., & Rapcsak, S. Z. (2002). Clinical diagnosis and treatment of spelling disorders. In A. E. Hillis (Ed.), Handbook on adult language disorders: Integrating cognitive neuropsychology, neurology, and rehabilitation (pp. 101–120). Philadelphia: Psychology Press.
Beeson, P. M., & Rapcsak, S. Z. (2003). The neural substrates of sublexical spelling (INS Abstract). Journal of the International Neuropsychological Society, 9, 304.
Beeson, P. M., Rapcsak, S. Z., Plante, E., et al. (2003). The neural substrates of writing: A functional magnetic resonance imaging study. Aphasiology, 17, 647–665.
Behrmann, M., & Bub, D. (1992). Surface dyslexia and dysgraphia: Dual routes, single lexicon. Cognitive Neuropsychology, 9, 209–251.
Behrmann, M., Nelson, J., & Sekuler, E. B. (1998). Visual complexity in letter-by-letter reading: “Pure” alexia is not pure. Neuropsychologia, 36, 1115–1132.
Behrmann, M., Plaut, D. C., & Nelson, J. (1998). A literature review and new data supporting an interactive account of letter-by-letter reading. Cognitive Neuropsychology, 15, 7–51.
Ben-Shachar, M., Dougherty, R. F., Deutsch, G. K., & Wandell, B. A. (2007). Differential sensitivity to words and shapes in ventral occipito-temporal cortex. Cerebral Cortex, 17, 1604–1611.
Benson, D. F. (1979). Aphasia, alexia and agraphia. New York: Churchill Livingstone.
Binder, J. R., McKiernan, K. A., Parsons, M. E., et al. (2003). Neural correlates of lexical access during visual word recognition. Journal of Cognitive Neuroscience, 15, 372–393.
Binder, J. R., Medler, D. A., Westbury, C. F., et al. (2006). Tuning of the human left fusiform gyrus to sublexical orthographic structure. NeuroImage, 33, 739–748.

Page 17 of 31

Cognitive Neuroscience of Written Language: Neural Substrates of Reading and Writing Binder, J. R., & Mohr, J. P. (1992). The topography of callosal reading pathways. Brain, 115, 1807–1826. Binder, J., & Price, C. J. (2001). Functional neuroimaging of language. In R. Cabeza & A. Kingstone (Eds.), Handbook of functional neuroimaging of cognition (pp. 187–251). Cam bridge, MA: MIT Press. Black, S., & Behrmann, M. (1994). Localization in alexia. In A. Kertesz (Ed.), Localization and neuroimaging in neuropsychology. San Diego: Academic Press. Bolger, D. J., Perfetti, C. A., & Schneider, W. (2005). A cross-cultural effect on the brain re visited. Human Brain Mapping, 25, 92–104. Booth, J. R., Burman, D. D., Meyer, J. R, et al. (2003). Relation between brain activation and lexical performance. Human Brain Mapping, 19, 155–169. Booth, J. R., Burman, D. D., Meyer, J. R., Gitelman, D. R., Parrish, T. B., & Mesulam, M. M. (2002). Modality independence of word comprehension. Human Brain Mapping, 16, 251– 261. Borowsky, R., & Besner, D. (1993). Visual word recognition: A multistage activation mod el. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 813–840. Bowers, J. S., Bub, D., & Arguin, M. A. (1996). Characterization of the word superiority ef fect in a case of letter-by-letter surface alexia. Cognitive Neuropsychology, 13, 415–441. Boxer, A. L., Rankin, K. P., Miller, B. L., et al. (2003). Cinguloparietal atrophy distinguish es Alzheimer’s disease from semantic dementia. Archives of Neurology, 60, 949–956. Brass, M., & von Cramon, D. Y. (2002). The role of the frontal cortex in task preparation. Cerebral Cortex, 12, 908–914. Brass, M., & von Cramon, D. Y. (2004). Decomposing components of task preparation with functional magnetic resonance imaging. Journal of Cognitive Neuroscience, 16, 609–620. Bruno, J. L., Zumberge, A., & Manis, F. R., et al. (2008). Sensitivity to orthographic famil iarity in the occipito-temporal region. 
NeuroImage, 39, 1988–2001. Bub, D., & Kertesz, A. (1982). Deep agraphia. Brain and Language, 17, 146–165. Büchel, C., Price, C., Frackowiak, R. S., & Friston, K. (1998). Different activation patterns in the visual cortex of late and congenitally blind subjects. Brain: A Journal of Neurology, 121, 409–419. Burton, M. W., LoCasto, P. C., Krebs-Noble, D., & Gullapalli, R. P. (2005). A system atic investigation of the functional neuroanatomy of auditory and visual phonological pro (p. 502)

cessing. NeuroImage, 26, 647–661.

Page 18 of 31

Cognitive Neuroscience of Written Language: Neural Substrates of Reading and Writing Chan, D., Fox, N. C., Scahill, R. I., et al. (2001). Patterns of temporal lobe atrophy in se mantic dementia and Alzheimer’s disease. Annals of Neurology, 49, 433–442. Chialant, D., & Caramazza, A. (1997). Identity and similarity factors in repetition blind ness: Implications for lexical processing. Cognition, 63, 79–119. Cloutman, L., Gingis, L., Newhart, M., Davis, C., Heidler-Gary, J., Crinion, J., & Hillis, A. E. (2009). A neural network critical for spelling. Annals of Neurology, 66 (2), 249–253. Cohen, L., & Dehaene, S. (2004). Specialization within the ventral stream: The case for the visual word form area. NeuroImage, 22, 466–476. Cohen, L., Dehaene, S., Naccache, L., et al. (2000). The visual word form area: Spatial and temporal characterization of an initial stage of reading in normal subjects and poste rior split-brain patients. Brain, 123, 291–307. Cohen, L., Dehaene, S., Vinckier, F., et al. (2008). Reading normal and degraded words: Contribution of the dorsal and ventral visual pathways. NeuroImage, 40, 353–366. Cohen, L., Henry, C., Dehaene, S., et al. (2004). The pathophysiology of letter-by-letter reading. Neuropsychologia, 42, 1768–1780. Cohen, L., Jobert, A., Bihan, D. L., & Dehaene, S. (2004). Distinct unimodal and multi modal regions for word processing in left temporal cortex. NeuroImage, 23, 1256–1270. Cohen, L., Lehéricy, S., Chochon, F., et al. (2002). Language-specific tuning of visual cor tex? Functional properties of the Visual Word Form Area. Brain, 125, 1054–1069. Cohen, L., Martinaud, O., Lemer, C., et al. (2003). Visual word recognition in the left and right hemispheres: anatomical and functional correlates of peripheral alexias. Cerebral Cortex, 13, 1313–1333. Coltheart, M., Patterson, K., & Marshall, J. C. (1980). Deep dyslexia. London: Routledge & Kegan Paul. Coltheart, M., Rastle, K., Perry, C., et al. (2001). 
DRC: A dual route cascaded model of vi sual word recognition and reading aloud. Psychological Review, 108, 204–256. Corina, D. P., San Jose-Robertson, L., Guillemin, A., High, J., & Braun, A. R. (2003). Lan guage lateralization in a bimanual language. Journal of Cognitive Neuroscience, 15, 718– 730. Crisp, J., & Lambon Ralph, M. A. (2006). Unlocking the nature of the phonological-deep dyslexia continuum: the keys to reading aloud are in phonology and semantics. Journal of Cognitive Neuroscience, 18, 348–362. Damasio, A. R., & Damasio, H. (1983). The anatomic basis of pure alexia. Neurology, 33, 1573–1583. Page 19 of 31

Cognitive Neuroscience of Written Language: Neural Substrates of Reading and Writing Dehaene, A., Cohen, L., Sigman, M., & Vinckier, F. (2005). The neural code for written words: A proposal. Trends in Cognitive Sciences, 9, 335–341. Dehaene, S., Le Clec, H. G., Poline, J. B., Le Bihan, D., & Cohen, L. (2002). The visual word form area: A prelexical representation of visual words in the fusiform gyrus. Neu roReport, 13, 321–325. Dehaene, S., Naccache, L., Cohen, L., Bihan, D. L., Mangin, J. F., Poline, J. B., et al. (2001). Cerebral mechanisms of word masking and unconscious repetition priming. Nature Neu roscience, 4, 752–758. Déjerine, J. (1891). Sur un cas de cécité verbale avec agraphie, suivi d’autopsie. Mém Soc Biol, 3, 197–201. Déjerine, J. (1892). Contribution a l’étude anatomo-pathologique et clinique des differ entes variétés de cécité verbale. Mém Soc Biol, 4, 61–90. DeLeon, J., Gottesman, R. F., Kleinman, J. T., et al. (2007). Neural regions essential for dis tinct cognitive processes underlying picture naming. Brain, 130, 1408–1422. De Renzi, E., Zambolin, A., & Crisi, G. (1987). The pattern of neuropsychological impair ment associated with left posterior cerebral artery infarcts. Brain, 110, 1099–1116. Derrfuss, J., Brass, M., Neumann, J., & von Cramon, D. Y. (2005). Involvement of the infe rior frontal junction in cognitive control: meta-analyses of switching and Stroop studies. Human Brain Mapping, 25, 22–34. Ellis, A. W., & Young, A. W. (1988). Human cognitive neuropsychology. Hove, UK: Erl baum. Epelbaum, S., Pinel, P., Gaillard, R., et al. (2008). Pure alexia as a disconnection syn drome: new diffusion imaging evidence for an old concept. Cortex, 44, 962–974. Exner, S. (1881). Lokalisation des Funcktion der Grosshirnrinde des Menschen. Wein: Braunmuller. Farah, M. J., Stowe, R. M., & Levinson, K. L. (1996). Phonological dyslexia: Loss of a read ing-specific component of the cognitive architecture? Cognitive Neuropsychology, 13, 849–868. Farah, M. 
J., & Wallace, M. A. (1991). Pure alexia as a visual impairment: A reconsidera tion. Cognitive Neuropsychology, 8, 313–334. Fiez, J. A. (1997). Phonology, semantics, and the role of the left inferior prefrontal cortex. Human Brain Mapping, 5, 79–83. Fiez, J. A., & Petersen, S. E. (1998). Neuroimaging studies of word reading. Proceedings of the National Academy of Science U S A, 95, 914–921. Page 20 of 31

Cognitive Neuroscience of Written Language: Neural Substrates of Reading and Writing Fiez, J. A., Raichle, M. E., Balota, D. A., Tallal, P., & Petersen, S. E. (1996). PET activation of posterior temporal regions during auditory word presentation and verb generation. Cerebral Cortex, 6, 1–10. Fiez, J. A., Tranel, D., Seager-Frerichs, D., & Damasio, H. (2006). Specific reading and phonological processing deficits are associated with damage to the left frontal opercu lum. Cortex, 42, 624–643. Foundas, A., Daniels, S. K., & Vasterling, J. J. (1998). Anomia: Case studies with lesion lo calization. Neurocase, 4, 35–43. Friedman, R. B. (1996). Recovery from deep alexia to phonological alexia: Points on a con tinuum. Brain and Language, 52, 114–128. Friedman, R. B., & Hadley, J. A. (1992). Letter-by-letter surface alexia. Cognitive Neu ropsychology, 9, 185–208. Funnell, E. (1996). Response bias in oral reading: An account of the co-occurrence of sur face dyslexia and semantic dementia. Quarterly Journal of Experimental Psychology, 49A, 417–446. Gaillard, R., Naccache, L., Pinel, P., et al. (2006). Direct intracranial, fMRI, and lesion evi dence for the causal role of left inferotemporal cortex in reading. Neuron, 50, 191–204. Galton, C. J., Patterson, K., Graham, K., et al. (2001). Differing patterns of temporal atro phy in Alzheimer’s disease and semantic dementia. Neurology, 57, 216–225. Gelb, I. J. (1963). A study of writing. Chicago: University of Chicago Press. Glezer, L. S., Jiang, X., & Riesenhuber, M. (2009). Evidence for highly selective neuronal tuning to whole words in the “visual word form area.” Neuron, 62 (2), 199–204. Glosser, G., & Friedman, R. B. (1990). The continuum of deep/phonological alexia. Cortex, 26, 343–359. (p. 503)

Gold, B. T., & Kertesz, A. (2000). Right-hemisphere semantic processing of visual words in an aphasic patient: An fMRI study. Brain and Language, 73, 456–465. Goodglass, H., Hyde, M. R., & Blumstein, S. (1969). Frequency, picturability and availabil ity of nouns in aphasia. Cortex, 5, 104–119. Gorno-Tempini, M. L., Dronkers, N. F., Rankin, K. P., et al. (2004). Cognition and anatomy in three variants of primary progressive aphasia. Annals of Neurology, 55, 335–346. Graham, K. S., Hodges, J. R., & Patterson, K. (1994). The relationship between compre hension and oral reading in progressive fluent aphasia. Neuropsychologia, 32, 299–316. Graham, N. L., Patterson, K., & Hodges, J. R. (2000). The impact of semantic memory im pairment on spelling: evidence from semantic dementia. Neuropsychologia, 38, 143–163. Page 21 of 31

Cognitive Neuroscience of Written Language: Neural Substrates of Reading and Writing Grodzinsky, Y. (2000). The neurology of syntax: Language use without Broca’s area. Be havioral and Brain Sciences, 23, 1–21. Hecaen, H., & Consoli, S. (1973). Analysis of language disorders in lesions of Broca’s area. Neuropsychologia, 11, 377–388. Henry, C., Gaillard, R., & Volle, E., et al. (2005). Brain activations during letter-by-letter reading: A follow-up study. Neuropsychologia, 43, 1983–1989. Henry, M. L., Beeson, P. M., Stark, A. J., & Rapcsak, S. Z. (2007). The role of left perisyl vian cortical regions in spelling. Brain and Language, 100, 44–52. Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8, 393–402. Hillis, A. E., Chang, S., & Breese, E. (2004). The crucial role of posterior frontal regions in modality specific components of the spelling process. Neurocase, 10, 157–187. Hillis, A. E., Kane, A., Barker, P., et al. (2001). Neural substrates of the cognitive process es underlying reading: Evidence from magnetic resonance perfusion imaging in hypera cute stroke. Aphasiology, 15, 919–931. Hillis, A. E., Kane, A., Tuffiash, E., et al. (2002). Neural substrates of the cognitive processes underlying spelling: Evidence from MR diffusion and perfusion imaging. Apha siology 16, 425–438. Hillis, A. E., Kane, A., Tuffiash, E., Beauchamp, N., Barker, P. B., Jacobs, M. A., & Wityk, R. (2002). Neural substrates of the cognitive processes underlying spelling: Evidence from MR diffusion and perfusion imaging. Aphasiology, 16, 425–438. Hillis, A. E., Newhart, M., Heidler, J., et al. (2005). The roles of the “visual word form area” in reading. NeuroImage, 24, 548–559. Hillis, A., & Rapp, B. (2004). Cognitive and neural substrates of written language compre hension and production. In M. Gazzaniga (Ed.), The new cognitive neurosciences (3rd ed., pp. 755–788). Cambridge, MA: MIT Press. Hillis, A. E., Rapp, B. 
C., & Caramazza, A. (1999). When a rose is a rose in speaking but a tulip in writing. Cortex, 35, 337–356. Hillis, A.E., Tuffiash, E., & Caramazza, A. (2002). Modality specific deterioration in oral naming of verbs. Journal of Cognitive Neuroscience, 14, 1099–1108. Hodges, J. R., & Patterson, K. (2007). Semantic dementia: A unique clinicopathological syndrome. Lancet Neurology, 6, 1004–1014. Howard, D., Patterson, K., Franklin, S., Morton, J., & Orchard-Lisle, V. (1984). Variability and consistency in picture naming by aphasic patients. Advances in Neurology, 42, 263– 276. Page 22 of 31

Cognitive Neuroscience of Written Language: Neural Substrates of Reading and Writing Ino, T., Tokumoto, K., Usami, K., et al. (2008). Longitudinal fMRI study of reading in a pa tient with letter-by-letter reading. Cortex, 44, 773–781. Jefferies, E., Sage, K., & Lambon Ralph, M. A. (2007). Do deep dyslexia, dysphasia and dysgraphia share a common phonological impairment? Neuropsychologia, 45, 1553–1570. Jobard, G., Crivello, F., & Tzourio-Mazoyer, N. (2003). Evaluation of the dual route theory of reading: a metanalysis of 35 neuroimaging studies. NeuroImage, 20, 693–712. Johnson, M. H. (2001). Functional brain development in humans. Nature Reviews Neuro science, 2, 475–483. Joseph, J. E., Cerullo, M. A., Farley, A. B., et al. (2006). fMRI correlates of cortical special ization and generalization for letter processing. NeuroImage, 32, 806–820. Katzir, T., Misra, M., & Poldrack, R. A. (2005). Imaging phonology without print: Assess ing the neural correlates of phonemic awareness using fMRI. NeuroImage, 27, 106–115. Klein, D., Milner, B., Zatorre, R. J., Zhao, V., & Nikelski, J. (1999). Cerebral organization in bilinguals: A PET study of Chinese-English verb generation. NeuroReport, 10, 2841–2846. Kronbichler, M., Bergmann, J., Hutzler, F., et al. (2007). Taxi vs. taksi: On orthographic word recognition in the left ventral occpitotemporal cortex. Journal of Cognitive Neuro science, 19, 1584–1594. Kronbichler, M., Hutzler, F., Wimmer, H., et al. (2004). The visual word form area and the frequency with which words are encountered: Evidence from a parametric fMRI study. NeuroImage, 21, 946–953. Lambert, J., Giffard, B., Nore, F. et al. (2007). Central and peripheral agraphia in Alzheimer’s disease: From the case of Auguste D. to a cognitive neuropsychology ap proach. Cortex, 43, 935–951. Lambon Ralph, M. A., & Graham, N. L. (2000). Previous cases: Acquired phonological and deep dyslexia. Neurocase, 6, 141–178. Larsen, J., Baynes, K., & Swick, D. (2004). 
Right hemisphere reading mechanisms in a global alexic patient. Neuropsychologia, 42, 1459–1476. Law, I., Kannao, I., Fujita, H., Miura, S., Lassen, N., & Uemura, K. (1991). Left supramar ginal/angular gyri activation during reading of syllabograms in the Japanese language. Journal of Neurolinguistics, 6, 243–251. Leff, A. P., Crewes, H., Plant, G. T., et al. (2001). The functional anatomy of single-word reading in patients with hemianopic and pure alexia. Brain, 124, 510–521. Leff, A. P., Spitsyna, G., Plant, G. T., & Wise, R. J. S. (2006). Structural anatomy of pure and hemianopic alexia. Journal of Neurology, Neurosurgery, and Psychiatry, 77, 1004– 1007. Page 23 of 31

Cognitive Neuroscience of Written Language: Neural Substrates of Reading and Writing Lichtheim, L. (1885). On aphasia. Brain, VII, 433–484. Lubrano, V., Roux, F. E., & Démonet, J. F. (2004). Writingspecific sites in frontal areas: A cortical stimulation study. Neurosurgery, 101 (5), 787–798. Luders, H., Lesser, R. P., Hahn, J., et al. (1991). Basal temporal language area. Brain, 114, 743–754. Mainy, N., Jung, J., Baciu, M., et al. (2008). Cortical dynamics of word recognition. Human Brain Mapping, 29, 1215–1230. Mani, J., Diehl, B., Piao, Z., et al. (2008). Evidence for a basal temporal visual language center: Cortical stimulation producing pure alexia. Neurology, 71, 1621–1627. Marinkovic, K., Dhond, R. P., Dale, A. M., et al. (2003). Spatiotemporal dynamics of modality-specific and supramodal word processing. Neuron, 38, 487–497. (p. 504)

Marsh, E. B., & Hillis, A. E. (2005). Cognitive and neural mechanisms underlying reading and naming: Evidence from letter-by-letter reading and optic aphasia. Neurocase, 11, 325–337. Marshall, J. C., & Newcombe, F. (1973). Patterns of paralexia: a psycholinguistic ap proach. Journal of Psycholinguistic Research, 2, 175–199. McCandliss, B. D., Cohen, L., & Dehaene, S. (2003). The visual word form area: Expertise for reading in the fusiform gyrus. Trends in Cognitive Sciences, 7, 293–299. McKay, A., Castles, M., & Davis, C. (2007). The impact of progressive semantic loss on reading aloud. Cognitive Neuropsychology, 24, 162–186. Mechelli, A., Gorno-Tempini, M. L., & Price, C. J. (2003). Neuroimaging studies of word and pseudoword reading: Consistencies, inconsistencies, and limitations. Journal of Cog nitive Neuroscience, 15, 260–271. Miceli, G., & Caramazza, A. (1988). Dissociation of inflectional and derivational morpholo gy. Brain and Language, 35, 24–65. Miozzo, M., & Caramazza, A. (1997). Retrieval of lexical-syntactic features in tip-of-thetongue states. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 1410–1423. Molko, N., Cohen, L., Mangin, J. F., et al. (2002). Visualizing the neural bases of a discon nection syndrome with diffusion tensor imaging. Journal of Cognitive Neuroscience, 14, 629–636. Moro, A., Tettamanti, M., Perani, D., Donati, C., Cappa, S. F., & Fazio, F. (2001). Syntax and the brain: Disentangling grammar by selective anomalies. NeuroImage, 13, 110–118.

Page 24 of 31

Cognitive Neuroscience of Written Language: Neural Substrates of Reading and Writing Mummery, C. J., Patterson, K., Hodges, J. R., & Price, C. J. (1998). Functional neuroanato my of the semantic system: Divisible by what? Journal of Cognitive Neurosci ence, 10, 766–777. Mummery, C. J., Patterson, K., Price, C. J., et al. (2000). A voxel-based morphometry study of semantic dementia: Relationship between temporal lobe atrophy and semantic memo ry. Annals of Neurology, 47, 36–45. Murtha, S., Chertkow, H., Beauregard, M., & Evans, A. (1999). The neural substrate of picture naming. Journal of Cognitive Neurosci ence, 11, 399–423. Nakamura, K., Honda, M., Hirano, S., et al. (2002). Modulation of the visual word re trieval system in writing: A functional MRI study on the Japanese orthographies. Journal of Cognitive Neuroscience, 14, 104–115. Nakamura, K., Honda, M., Okada, T., et al. (2000). Participation of the left posterior inferi or temporal cortex in writing and mental recall of kanji orthography: A functional MRI study. Brain, 123, 954–967. Nobre, A. C., Allison, T., & McCarthy, G. (1995). Word recognition in the human inferior temporal lobe. Nature, 372, 260–273. Norton, E. S., Kovelman, I., & Petitto, L.-A. (2007). Are there separate neural systems for spelling? New insights into the role of rules and memory in spelling from functional mag netic resonance imaging. Mind, Brain, and Education, 1, 48–59. Ogden, J. A. (1996). Phonological dyslexia and phonological dysgraphia following left and right hemispherectomy. Neuropsychologia, 34, 905–918. Omura, K., Tsukamoto, T., Kotani, Y., et al. (2004). Neural correlates of phonemegrapheme conversion. NeuroReport, 15, 949–953. Patterson, K. E., & Kay, J. (1982). Letter-by-letter reading: Psychological descriptions of a neurological syndrome. Quarterly Journal of Experimental Psychology, 34A, 411–441. Patterson, K. E., Marshall, J. C., & Coltheart, M. (1985). 
Surface dyslexia: Neuropsycho logical and cognitive studies of phonological reading. London: Erlbaum. Paulesu, E., Frith, C. D., & Frackowiak, R. S. (1993). The neural correlates of the verbal component of working memory. Nature, 362 (6418), 342–345. Paulesu, E., Goldacre, B., Scifo, P., Cappa, S. F., Gilardi, M. C., Castiglioni, I., Perani, D., & Fazio, F. (1997). Functional heterogeneity of left inferior frontal cortex as revealed by fM RI. NeuroReport, 8, 2011–2017. Paulesu, E., McCrory, E., Fazio, F., Menoncello, L., Brunswick, N., Cappa, S. F., Cotelli, M., Cossu, G., Corte, F., Lorusso, M., Pesenti, S., Gallagher, A., Perani, D., Price, C., Frith, C. D., & Frith, U. (2000). A cultural effect on brain function. Nature Neuroscience, 3, 91–96. Page 25 of 31

Cognitive Neuroscience of Written Language: Neural Substrates of Reading and Writing Paulesu, E., Perani, D., Blasi, V., Silani, G., Borghese, N. A., De Giovanni, U., Sensolo, S., & Fazio, F. (2003). A functional-anatomical model for lipreading. Journal of Neurophysiology, 90 (3), 2005–2013. Perani, D., & Cappa, S. (2006). Broca’s area and lexical-semantic processing. In Y. Grodzinsky & K Amunts (Eds.), Broca’s region. New York: Oxford University Press, 2006. Perani, D., Cappa, S. F., Schnur, T., Tettamanti, M., Collina, S., Rosa, M. M., & Fazio, F. (1999). The neural correlates of verb and noun processing: A PET study. Brain, 122, 2337– 2344. Perfetti, C. A. (2003). The universal grammar of reading. Scientific Studies of Reading, 7, 3–24. Pernet, C., Celsis, P., & Demonet, J.-F. (2005). Selective response to letter categorization within the left fusiform gyrus. NeuroImage, 28, 738–744. Petersen, S. E., Fox, P. T., Posner, M. I., Mintun, M., & Raichle, M. E. (1988). Positron emission tomographic studies of the cortical anatomy of single-word processing. Nature, 331 (6157), 585–589. Philipose, L. E., Gottesman, R. F., Newhart, M., et al. (2007). Neural regions essential for reading and spelling of words and pseudowords. Annals of Neurology, 62, 481–492. Plaut, D. C., McClelland, J. L., Seidenberg, M. S., & Patterson, K. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review, 103, 56–115. Plaut, D. C., & Shallice, T. (1993). Deep dyslexia: A case study of connectionist neuropsy chology. Cognitive Neuropsychology, 10, 377–500. Polk, T. A., & Farah, M. J. (2002). Functional MRI evidence for an abstract, not perceptu al, word-form area. Journal of Experimental Psychology: General, 131 (1), 65–72. Polk, T. A., Stallcup, M., Aguirre, G. K., et al. (2002). Neural specialization for letter recognition. Journal of Cognitive Neuroscience, 14, 145–159. Price, C. J. (2000). 
The anatomy of language: Contributions from functional neuroimaging [review]. Journal of Anatomy, 197 (Pt 3), 335–359. Price, C. J., & Devlin, J. T. (2003). The myth of the visual word form area. NeuroImage, 19, 473–481. Price, C. J., & Devlin, J. T. (2004). The pro and cons of labelling a left occipitotem poral region: “The visual word form area” NeuroImage, 22, 477–479. (p. 505)

Price, C. J., Devlin, J. T., Moore, C. J., et al. (2005). Meta-analyses of object naming: Effect of baseline. Human Brain Mapping, 25, 70–82. Page 26 of 31

Cognitive Neuroscience of Written Language: Neural Substrates of Reading and Writing Price, C. J., & Friston, K. (2002). Degeneracy and cognitive anatomy. Trends in Cognitive Sciences, 6, 416–421. Price, C. J., & Friston, K. J. (2005). Functional ontologies for cognition: The systematic de finition of structure and function. Cognitive Neuropsychology, 22, 262–275. Price, C. J., Gorno-Tempini, M. L., Graham, K. S., et al. (2003). Normal and pathological reading: Converging data from lesion and imaging studies. NeuroImage, 20, S30–S41. Price, C. J., Howard, D., Patterson, K., et al. (1998). A functional neuroimaging description of two deep dyslexic patients. Journal of Cognitive Neuroscience, 10, 303–315. Price, C. J., & Mechelli, A. (2005). Reading and reading disturbance. Current Opinion in Neurobiology, 15, 231–238. Price, C. J., Moore, C. J., & Frackowiak, R. S. (1996). The effect of varying stimulus rate and duration on brain activity during reading. NeuroImage, 3 (1), 40–52. Pugh, K. R., Mencl, W. E., Jenner, A. R., Katz, L., Frost, S. J., Lee, J. R., Shaywitz, S. E., & Shaywitz, B. A. (2001). Neurobiological studies of reading and reading disability. Journal of Communication Disorders, 34 (6), 479–492. Pugh, K. R., Shaywitz, B. A., Shaywitz, S. E., Constable, R. T., Skudlarski, P., Fulbright, R. K., et al. (1996). Cerebral organization of component processes in reading. Brain, 119, 1221–1238. Rabinovici, G. D., Seeley, E. J., Gorno-Tempini, M. L., et al. (2008). Distinct MRI atrophy patterns in autopsy-proven Alzheimer’s disease and frontotemporal lobar degeneration. American Journal of Alzheimer’s Disease and Other Dementias, 22, 474–488. Rapcsak, S. Z., & Beeson, P. M. (2002). Neuroanatomical correlates of spelling and writ ing. In A. E. Hillis (Ed.), Handbook of adult language disorders: Integrating cognitive neu ropsychology, neurology, and rehabilitation (pp. 71–99). Philadelphia: Psychology Press. Rapcsak, S. Z., & Beeson, P. M. (2004). 
The role of left posterior inferior temporal cortex in spelling. Neurology, 62, 2221–2229. Rapcsak, S. Z., Beeson, P. M., Henry, M. L., Leyden, A., Kim, E., Rising, K., Andersen, S., & Cho, H. (2009). Phonological dyslexia and dysgraphia: Cognitive mechanisms and neur al substrates. Cortex, 45 (5), 575–591. Rapcsak, S. Z., Beeson, P. M., & Rubens, A. B. (1991). Writing with the right hemisphere. Brain and Language, 41, 510–530. Rapcsak, S. Z., Henry, M. L., Teague, S. L., et al. (2007). Do dual-route models accurately predict reading and spelling performance in individuals with acquired alexia and agraphia? Neuropsychologia, 45, 2519–2524.

Page 27 of 31

Cognitive Neuroscience of Written Language: Neural Substrates of Reading and Writing Rapcsak, S. Z., Rubens, A. B., & Laguna, J. F. (1990). From letters to words: Procedures for word recognition in letter-by-letter reading. Brain and Language, 38, 504–514. Rapp, B. Uncovering the cognitive architecture of spelling. In A. Hillis (Ed.), Handbook on adult language disorders: Integrating cognitive neuropsychology, neurology and rehabili tation. Philadelphia: Psychology Press. Rapp, B., & Caramazza, A. (1998). A case of selective difficulty in writing verbs. Neuro case, 4, 127–140. Rapp, B., Epstein, C., & Tainturier, M. J. (2002). The integration of information across lexi cal and sublexical processes in spelling. Cognitive Neuropsychology, 19, 1–29. Rapp, B., & Hsieh, L. (2002). Functional magnetic resonance imaging of the cognitive components of the spelling process. Cognitive Neuroscience Society Meeting, San Fran cisco. Rapp, B., & Lipka, K. (2011). The literate brain: The relationship between spelling and reading. Journal of Cognitive Neuroscience, 23 (5), 1180–1197. Raymer, A., Foundas, A. L., Maher, L. M., et al. (1997). Cognitive neuropsychological analysis and neuroanatomical correlates in a case of acute anomia. Brain and Language, 58, 137–156. Roeltgen, D. P., & Heilman, K. M. (1984). Lexical agraphia: Further support for the twosystem hypothesis of linguistic agraphia. Brain, 107, 811–827. Roeltgen, D. P., Sevush, S., & Heilman, K. M. (1983). Phonological agraphia: Writing by the lexical-semantic route. Neurology, 33, 755–765. Rosen, H. J., Gorno-Tempini, M. L., Goldman, W. P., et al. (2002). Patterns of brain atrophy in frontotemporal dementia and semantic dementia. Neurology, 58, 198–208. Roux, F. E., Dufor, O., Giussani, C., Wamain, Y., Draper, L., Longcamp, M., & Démonet, J. F. (2009). The graphemic/motor frontal area Exner’s area revisited. Annals of Neurology, 66 (4), 537–545. Roux, F. 
E., Dufor, O., Giussani, C., Wamain, Y., Draper, L., Longcamp, M., & Démonet, J. F. (2009). The graphemic/motor frontal area Exner’s area revisited. Annals of Neurology, 66, 537–545. Saffran, E. M., & Coslett, H. B. (1998). Implicit vs. letter-by-letter reading in pure alexia: A tale of two systems. Cognitive Neuropsychology, 15, 141–165. Sakurai, Y., Sakai, K., Sakuta, M., & Iwata, M. (1994). Naming difficulties in alexia with agraphia for kanji after a left posterior inferior temporal lesion. Journal of Neurology, Neurosurgery, and Psychiatry, 57, 609–613.

Page 28 of 31

Cognitive Neuroscience of Written Language: Neural Substrates of Reading and Writing Salmelin, R. (2007). Clinical neurophysiology of language: The MEG approach. Clinical Neurophysiology, 118, 237–254. Salmelin, R., Service, E., Kiesilä, P., Uutela, K., & Salonen, O. (1996). Impaired visual word processing in dyslexia revealed with magnetoencephalography. Annals of Neurology, 40 (2), 157–162. Schlaggar, B. L., & McCandliss, B. D. (2007). Development of neural systems for reading. Annual Review of Neuroscience, 30, 475–503. Seidenberg, M. S., & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96 (4), 523–568. Shallice, T. (1981). Phonological agraphia and the lexical route in writing. Brain, 104, 413–429. Simos, P. G., Breier, J. I., Fletcher, J. M., Foorman, B. R., Castillo, E. M., & Papanicolaou, A. C. (2002). Brain mechanisms for reading words and pseudowords: An integrated ap proach. Cerebral Cortex, 12 (3), 297–305. Smith, E. E., & Jonides, J. (1999). Storage and executive processes in the frontal lobes [re view]. Science, 283 (5408), 1657–1661. Starrfelt, R., & Gerlach, C. (2007). The visual what for area: Words and pictures in the left fusiform gyrus. NeuroImage, 35, 334–342. Strain, E., Patterson, K., Graham, N., & Hodges, J. R. (1998). Word reading in Alzheimer’s disease: Cross-sectional and longitudinal analyses of response time and accu racy data. Neuropsychologia, 36, 155–171. (p. 506)

Tainturier, M.-J., & Rapp, B. (2001). The spelling process. In B. Rapp (Ed.), The handbook of cognitive neuropsychology: What deficits reveal about the human mind (pp. 263–289). Philadelphia: Psychology Press. Tarkiainen, A., Helenius, P., Hansen, P. C., Cornelissen, P. L., & Salmelin, R. (1999). Dy namics of letter string perception in the human occipitotemporal cortex. Brain, 122 (Pt 11): 2119–2132. Tettamanti, M., Buccino, G., Saccuman, M. C., Gallese, V., Danna, M., Scifo, P., Fazio, F., Rizzolatti, G., Cappa, S. F., & Perani, D. (2005). Listening to action-related sentences acti vates fronto-parietal motor circuits. Journal of Cognitive Neurosci ence, 17 (2), 273–281. Thompson-Schill, S. L., D’Esposito, M., & Kan, I. P. (1999). Effects of repetition and com petition on activity in left prefrontal cortex during word generation. Neuron, 23 (3), 513– 522.


Kyrana Tsapkini

Kyrana Tsapkini, Departments of Neurology, and Physical Medicine and Rehabilitation, Johns Hopkins University, Baltimore, MD

Argye Hillis

Argye Hillis, Johns Hopkins University School of Medicine, Departments of Neurology and Physical Medicine and Rehabilitation, and Department of Cognitive Science, Johns Hopkins University


Neural Systems Underlying Speech Perception

Sheila Blumstein and Emily B. Myers
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0025

Abstract and Keywords

This chapter reviews the neural systems underlying speech perception processes with a focus on the acoustic properties contributing to the phonetic dimensions of speech and the mapping of speech input to phonetic categories and word representations. Neuroimaging findings and results from brain lesions resulting in aphasia suggest that these processes recruit a neural system that includes temporal (Heschl’s gyrus, superior temporal gyrus, superior temporal sulcus, and middle temporal gyrus), parietal (supramarginal and angular gyri), and frontal lobe structures (inferior frontal gyrus). These findings also support a functional architecture in which there are distinct stages organized such that information at one stage of processing influences and modulates information at other stages of processing. In particular, spectral-temporal analysis of speech recruits temporal areas; lexical processing, i.e., the mapping of sound structure to phonetic category and lexical representations, recruits posterior superior temporal gyrus and parietal areas; and selection processes recruit frontal areas. The authors consider how evidence from cognitive neuroscience can inform a number of theoretical issues that have dominated speech perception research, including invariance, the influence of higher level processes (both lexical and sentence processing) on lower level phonetic categorization processes, the influence of articulatory and motor processes on speech perception, the effects of phonetic and phonological competition on lexical access, and the plasticity of phonetic category learning in the adult listener.

Keywords: speech perception, lexical processing, lexical competition, invariance, neuroimaging, aphasia, acoustic properties, phonetic categories, plasticity

Introduction

Language processing appears to be easy and seamless. However, evidence across a wide range of disciplines and approaches to the study of language, including theoretical linguistics, behavioral studies of language, and the cognitive neuroscience of language, suggests that the language system can be broken down into separable components: speech, words, syntactic structure, and meaning. It is the goal of this chapter to focus on one of these components, namely speech, and to examine the neural systems recruited in its processing.

For most of us, speech is the medium we use for language communication. Thus, in both listening and speaking, it serves as the interface with the language system. In language understanding, the auditory input must be mapped onto phonetic categories, and these must be mapped onto word representations that are ultimately mapped onto meaning or conceptual structures. Similarly, in spoken word production, concepts must be mapped onto word representations that are ultimately mapped onto articulatory plans and commands for producing the sounds of speech. Although study of the cognitive neuroscience of speech processing includes the processes that map sounds to words and words to speech, we will focus in this chapter on the perceptual processing stream, namely, the processes that map auditory input to phonetic category and word representations.

Figure 25.1 Functional architecture (right panel) of the auditory speech processing system from auditory input to lexical access and the neural systems underlying this processing (left panel).

To begin this review it is important to provide a theoretical framework that elucidates our current understanding of the functional architecture of speech processing. Such a framework serves as a guide as we examine the neural systems underlying speech processing. At the same time, examination of the neural systems recruited during speech processing may in turn shape and constrain models of speech processing (Poldrack, 2006). For example, although most models agree that in both speech perception and speech production there is a series of successive stages in which information is transformed along the processing stream, there is disagreement about the extent to which the processing stages influence each other. In one view, the processing stages are discrete and informationally encapsulated (Fodor, 1983; Levelt, 1992). In this view, the input to a stage or module is processed and transformed, but its output is sent to the next stage without leaving any “residue” of information from the earlier processing stage. In contrast, the interactive or cascading view holds that although there are multiple stages of processing, information flow at one level of processing influences and hence modulates other stages of processing (Dell, 1986; Gaskell & Marslen-Wilson, 1997; McClelland & Elman, 1986). For example, it has been shown in spoken word production that the sound structure properties of the lexicon cascade and influence phonological planning and ultimately articulatory implementation stages of processing (Baese-Berk & Goldrick, 2009; Goldrick & Blumstein, 2006; Peramunage, Blumstein, Myers, Goldrick, & Baese-Berk, 2011).

The theoretical framework we adopt views speech processing as an integrated system in which there are multiple stages of processing organized in a network-like architecture such that information at one stage of processing influences that at other stages of processing (Gaskell & Marslen-Wilson, 1999; Marslen-Wilson, 1987; Marslen-Wilson & Welsh, 1978; McClelland & Elman, 1986). In speech perception, as shown in Figure 25.1, the auditory input is mapped onto acoustic-phonetic (spectral-temporal) properties of speech, which are then mapped onto more abstract phonological properties such as features, segments, and syllables, and these phonological properties are in turn mapped onto word, i.e., lexical (sound structure), representations.

At each stage of processing, representations are coded in terms of patterns of activation and inhibition. The extent of activation of a representational unit is not all or none, but rather is graded and depends on the goodness of the input and the extent to which there are other representational units that share properties with it and hence compete for access. Indeed, the extent of activation and the subsequent selection of a particular representational unit occur at each stage of processing. Thus, acoustic-phonetic representations that are similar compete with each other (e.g., [da] and [ta] share spectral properties but are distinguished by temporal properties of voicing), phonological representations compete with each other (e.g., /da/ and /ba/ share features of manner of articulation and voicing but are distinguished by place of articulation), and word representations compete with each other (e.g., hammock and hammer share the same onset properties but are distinguished by the sound structure of the final syllable).

In the remainder of this chapter we discuss in detail how this functional architecture maps onto the neural system. In broad strokes, speech perception and the mapping of sounds to words recruit a neural system that includes temporal lobe structures (Heschl’s gyrus, superior temporal gyrus [STG], superior temporal sulcus [STS], and middle temporal gyrus [MTG]), parietal lobe structures (supramarginal gyrus [SMG] and angular gyrus [AG]), and frontal lobe structures (inferior frontal gyrus [IFG]) (see Figure 25.1). Early stages of processing appear to be bilateral, whereas later stages appear to recruit primarily the left hemisphere. We examine how the neural system responds to different acoustic properties contributing to the phonetic dimensions of speech. We investigate the neural substrates reflecting the adult listener’s ability to respond to variability in the speech stream and to perceive a stable phonetic representation. We consider the potential relationship between speech perception and speech production processes by examining the extent to which speech perception processes recruit and are influenced by the speech production and motor system. We review the various stages implicated in the mapping from sounds to words as a means of examining whether the same neural system is recruited for resolving competition across linguistic domains or whether different areas are recruited as a function of a particular linguistic domain. Finally, we examine the neural systems underlying the plasticity of the adult listener in learning new phonetic categories.
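
As a toy illustration of the graded, competitive activation assumed in this framework, the following Python sketch normalizes goodness-of-fit scores across candidate units, so that a unit's activation falls when a similar competitor is present. It is only a sketch under stated assumptions: the function name, the match scores, and the simple ratio normalization (a Luce-style choice rule) are illustrative choices, not the model used by the authors or by any of the cited work.

```python
def graded_activation(match_scores):
    """Return each unit's share of the total match score.

    Activation is graded (not all or none) and shrinks as competitors
    that also match the input are added. This ratio rule is an assumed,
    deliberately simple stand-in for activation/inhibition dynamics.
    """
    total = sum(match_scores.values())
    return {unit: score / total for unit, score in match_scores.items()}


# Hypothetical goodness-of-fit scores for an input resembling "hammock".
no_competitor = graded_activation({"hammock": 0.9, "table": 0.1})
with_competitor = graded_activation(
    {"hammock": 0.9, "hammer": 0.6, "table": 0.1}
)

for label, result in [("no onset competitor", no_competitor),
                      ("onset competitor present", with_competitor)]:
    print(label, {unit: round(act, 3) for unit, act in result.items()})
```

In this sketch the activation of "hammock" drops once "hammer", which shares its onset, enters the candidate set, which mirrors the competition between word representations described above.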

There is a rich literature investigating speech processing in cognitive neuroscience using a range of methods. Much of the early work that laid the foundations for structure–function relations in speech processing comes from lesion-based research examining the nature of speech perception and production deficits in aphasia. The strength of this method is that it provides a rich tapestry of behaviors in which one can see impairment or sparing of particular language functions in the same patient. Of importance, it is also possible to see different patterns of impairment across different types of aphasia, allowing for insights into the potential functional role of particular neural areas in accomplishing a particular language task. Nonetheless, because the brain lesions of patients tend to be large, it is difficult to make strong inferences about structure–function relations. Evidence from the neuroimaging literature can speak to these issues, allowing for a closer examination of the neural areas activated during speech processing and the potential modulation of activation along the processing stream. For this reason, this chapter looks at converging evidence from multiple cognitive neuroscience methods in examining the neural systems underlying speech perception.

Perception of the Acoustic Properties of Speech

The listener faces a daunting task in perceiving speech. In order to map the complex acoustic signal onto a speech sound in the listener’s linguistic repertoire, the acoustic details of the speech signal that are relevant for phonetic identity must be extracted. These acoustic properties include changes in the fine-grained spectral details of speech over short and long time scales. At a short time scale, fine-grained timing distinctions on the order of 10 ms between the onset of the burst and the onset of voicing serve as a cue to voicing in stop consonants in many languages, and differentiate sounds such as /d/ versus /t/ (Lisker & Abramson, 1964). Spectral sweeps about 40 ms in length mark the transitions from consonant to vowel in stop consonants, and signal place of articulation, distinguishing between syllables such as /da/ and /ga/. At a longer time scale, on the order of 150 to 300 ms, relatively steady-state spectral patterns distinguish between vowels and between fricative consonants such as /f/ and /s/. At a yet longer time scale, changes in pitch over the course of an entire word are important for establishing lexical tone, which determines lexical identity in languages such as Chinese and Thai. Moreover, information at short and long temporal durations may be useful for processing indexical properties of speech such as speaker identity, speaker emotional state, speech rate, and prosodic boundaries (Wildgruber, Pihan, Ackermann, Erb, & Grodd, 2002).

Taken together, evidence suggests that the perception of speech is supported by lower level, bottom-up acoustic processes that detect and analyze the acoustic details of speech. There is fairly general consensus that processing the acoustic details of speech occurs in the temporal lobes, specifically in Heschl’s gyrus, the STG, and extending into the STS (Binder & Price, 2001; Hickok & Poeppel, 2004). Phoneme identification and discrimination tasks are disrupted when the left posterior temporal gyrus is directly stimulated (Boatman & Miglioretti, 2005), and bilateral damage to the STG can result in deficits in processing the details of speech (Auerbach, Allard, Naeser, Alexander, & Albert, 1982; Coslett, Brashear, & Heilman, 1984; cf. Poeppel, 2001, for review). However, interesting differences emerge in the neural systems recruited in the processing of the acoustic parameters of speech. These differences appear to reflect whether the acoustic attribute is spectral or temporal, whether the acoustic information occurs over a short or long time window, and whether the cue plays a functional role in the language. We now turn to these early analysis stages, in which the acoustic-phonetic properties are mapped onto phonetic representations; they are the primary focus of the next section.

Perception of the Temporal Properties of Speech: Voicing in Stop Consonants

Voice onset time (VOT) is defined as the time between the release of an oral closure and the onset of voicing in syllable-initial stop consonants. This articulatory/acoustic parameter serves as a cue to the perception of voicing contrasts in syllable-initial stop consonants, distinguishing between voiced sounds such as /b/, /d/, and /g/, and voiceless sounds such as /p/, /t/, and /k/ (Lisker & Abramson, 1964). VOT is notable in part because small differences (10 to 20 ms) in the timing between the burst and onset of voicing cue perceptual distinctions between voiced and voiceless stop consonants (Liberman, Delattre, & Cooper, 1958; Liberman, Harris, Hoffman, & Griffith, 1957). Nonetheless, depending on the VOT values, differences of similar size are perceived by listeners as belonging to the same phonetic category. Thus, stimuli on a VOT continuum are easily discriminated when they fall into two distinct speech categories (e.g., “d” versus “t”), but are difficult to distinguish if they are members of the same phonetic category. This is true even when the acoustic distance between exemplars is equivalent.

Evidence from intracerebral evoked potentials and magnetic and electrophysiological recordings suggests that at least some aspects of this type of discontinuous perception may arise as a result of properties of the auditory system. Steinschneider and colleagues (1999) recorded intracerebral auditory evoked potentials from three patients undergoing cortical mapping before surgery for intractable epilepsy. In these patients, evoked responses from Heschl’s gyrus and the posterior portion of the STG in the right hemisphere showed different patterns of response to tokens with VOTs less than 20 ms than to those with VOTs greater than 20 ms. For VOTs less than 20 ms, only one component was observed, regardless of the timing of the burst with respect to voicing (cf. Sharma & Dorman, 1999). For speech tokens with VOTs greater than 20 ms, two components were observed, one that was time-locked to the onset of the burst, which signals the release of the stop closure, and another that was time-locked to the onset of voicing. Given that 20-ms VOT is near the boundary between voiced and voiceless phonetic categories in some languages, this categorical response to voiced tokens was taken as evidence that physiological properties of auditory cortex support the categorical perceptual response. A similar pattern has been observed in nonhuman primates and more phylogenetically distant species such as guinea pigs (McGee, Kraus, King, Nicol, & Carrell, 1996; Steinschneider, Schroeder, Arezzo, & Vaughan, 1995), suggesting that the sensitivity of the auditory system to this property forms the basis for the phonetic property of voicing in stop consonants.

Consistent with these findings are results from Liégeois-Chauvel and colleagues (1999). Using intracerebral evoked potentials, they observed components time-locked to the acoustic properties of voiced and voiceless tokens for both speech and nonspeech; these components were lateralized to the left hemisphere in Heschl’s gyrus, the planum temporale (PT), and the posterior part of the STG (Brodmann area [BA] 22). Results from a magnetoencephalography (MEG) study (Papanicolaou et al., 2003) suggest that responses to nonspeech “tone-onset-time” and speech VOT stimuli do not differ in laterality early in processing (before 130 ms). However, a strong leftward laterality is observed for speech tokens, but not for nonspeech tokens, later in processing (>130 ms). Similarly, in a MEG study examining within-phonetic-category variation, Frye et al. (2007) showed greater sensitivity to within-category variation in left-hemisphere evoked responses than in the right hemisphere. Taken together, these findings suggest that early stages of processing recruit auditory areas bilaterally, but that at later stages, the processing is lateralized to the left hemisphere, presumably when the auditory information is encoded onto higher level acoustic-phonetic properties of speech that ultimately have linguistic relevance.

Indeed, left posterior temporal activation is modulated by the nature of the phonetic category information conveyed by a particular VOT value. In particular, results of neuroimaging studies have shown graded activation to stimuli along a VOT continuum in left posterior temporal areas, with increasing activation as the VOT of the stimuli approached the phonetic boundary (Blumstein, Myers, & Rissman, 2005; Hutchison, Blumstein, & Myers, 2008; Myers, 2007; Myers, Blumstein, Walsh, & Eliassen, 2009). Exemplar stimuli (i.e., those stimuli perceived as a clear-cut voiced or voiceless stop consonant) showed the least activation. Stimuli that were poorer exemplars of their category, because those stimuli either were close to the category boundary (Blumstein et al., 2005) or had extreme VOT values at the periphery of the continuum (Myers, 2007), showed increased neural activation, suggesting that the posterior portion of the STG is sensitive to the acoustic-phonetic structure of the phonetic category.
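
The behavioral pattern described at the start of this section, in which equal VOT steps are easy to discriminate across a category boundary but hard to discriminate within a category, can be sketched with a logistic identification function. The sketch below is hedged throughout: the 20-ms boundary follows the value mentioned in the text for some languages, whereas the slope, the function names, and the use of the difference in labeling probability as a stand-in for discriminability are assumptions made for illustration, not a fitted model of any cited data set.

```python
import math

BOUNDARY_MS = 20.0  # assumed voiced/voiceless boundary (near 20 ms in some languages)
SLOPE_MS = 3.0      # assumed steepness of the identification function


def p_voiceless(vot_ms):
    """Probability of labeling a token as voiceless (e.g., 't') given its VOT."""
    return 1.0 / (1.0 + math.exp(-(vot_ms - BOUNDARY_MS) / SLOPE_MS))


def predicted_discriminability(vot_a, vot_b):
    """Label-based proxy: how differently the two tokens are identified."""
    return abs(p_voiceless(vot_a) - p_voiceless(vot_b))


# Three pairs separated by the same 10-ms acoustic distance.
within_voiced = predicted_discriminability(0, 10)
across_boundary = predicted_discriminability(15, 25)
within_voiceless = predicted_discriminability(40, 50)

print(f"within voiced:    {within_voiced:.3f}")
print(f"across boundary:  {across_boundary:.3f}")
print(f"within voiceless: {within_voiced if False else within_voiceless:.3f}")
```

Under these assumptions the pair straddling the boundary yields a far larger value than either within-category pair, even though all three pairs differ by the same 10 ms.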

Perception of the Spectral Properties of Speech: Place of Articulation in Stop Consonants

Stop consonants such as /b/, /d/, and /g/ differ with respect to their place of articulation, that is, the location in the oral cavity where the closure is made to produce these sounds. In stop consonants, differences in place of articulation are signaled by the shape of formants as they transition from the burst to the vowel. Unlike VOT, which is a temporal parameter, place of articulation is cued by rapid spectral changes over a relatively short time window of some 20 to 40 ms (Stevens & Blumstein, 1978). Neuroimaging evidence largely supports the notion that place of articulation, like VOT, recruits temporal lobe structures and is processed primarily in the left hemisphere.


Joanisse and colleagues (2007) examined cortical responses to changes in phonetic identity from /ga/ to /da/ as participants listened passively to phonetic tokens while they watched a silent movie. In this study, sensitivity to between-category shifts was observed in the left STS, extending into the MTG. Of interest, similar areas were recruited for sine-wave analogues of place of articulation (Desai, Liebenthal, Waldron, & Binder, 2008). Sine-wave speech is speech that has been filtered so that the energy maxima are replaced by simple sine waves. Initially, sine-wave speech sounds like unintelligible noise; however, it has been shown that with experience, listeners become adept at perceiving these sounds as speech (Remez, Rubin, Pisoni, & Carrell, 1981). Of importance, although sine-wave speech is stripped of the precise acoustic cues that distinguish the sounds of speech, it preserves relative shifts in spectral energy that signal place of articulation in stop consonants. In a functional magnetic resonance imaging (fMRI) study by Desai et al. (2008), participants heard sounds along a continuum from /ba/ to /da/, as well as sine-wave versions of these sounds and sine-wave stimuli that did not correspond to any speech sound. They were scanned while listening to speech and sine-wave speech sounds before and after they were made aware of the phonetic nature of the sine-wave speech sounds. Results showed that activation patterns were driven not by the spectral characteristics of the stimuli, but rather by whether sounds were perceived as speech or not. Specifically, left posterior temporal activation was observed for real speech sounds before training and for sine-wave speech sounds once participants had begun to perceive them as speech. That sine-wave speech recruits the same areas as speech, but only when it is perceived by the listener as speech, suggests that the linguistic function of an input is critical for recruiting the left-hemisphere speech processing stream. As we will see below (see the section, Perception of Lexical Tone), these findings are supported by findings on the perception of tone when it has linguistic relevance to a speaker.

Perception of Vowels

Vowel perception involves attention to spectral information at a longer time scale than that of either VOT or place of articulation, typically on the order of 150 to 300 ms. Whereas stop consonants are marked by spectral transitions over short time windows, spectral information in vowels is relatively steady at the vowel midpoint, although there are spectral changes as a function of the consonant environment in which a vowel occurs (Strange, Jenkins, & Johnson, 1983). Moreover, vowels are more forgiving of differences in timing; for instance, vowels as short as 25 ms can still be perceived as members of their category, suggesting that the dominant cue to vowel identity is the spectral cue, rather than durational cues (Strange, 1989). Indeed, vowel quality can be perceived based on the spectral transitions of a preceding consonant (Blumstein & Stevens, 1980).

There are few behavioral studies examining the perception of vowels pursuant to brain injury. Because participants with right-hemisphere damage typically do not present with aphasia, it is usually assumed that they do not have speech perception impairments. The sparse evidence available supports this claim (cf. Tallal & Newcombe, 1978). Nonetheless, dichotic listening experiments with normal individuals have typically shown that, unlike consonants, which display a strong right ear (left hemisphere) advantage, vowels show either no ear preference or a slight left ear (right hemisphere) advantage (Shankweiler & Studdert-Kennedy, 1967; Studdert-Kennedy & Shankweiler, 1970). As described below, neuroimaging experiments also suggest neural differences in the processing of consonants and vowels.

Not surprisingly, fMRI studies of vowel perception show recruitment of the temporal lobe. However, results suggest that vowel processing differs from consonant processing both in terms of the areas within the temporal lobe that are recruited and in terms of the laterality of processing.

With respect to the recruitment of temporal lobe areas, it appears that vowel processing recruits the anterior temporal lobe to a greater degree than consonant processing does. In particular, activation patterns observed by Britton et al. (2009), Leff et al. (2009), and Obleser et al. (2006) have shown that areas anterior to Heschl’s gyrus are recruited in the perception of vowels. For instance, in a study by Obleser and colleagues (2006), activation was monitored using fMRI as participants listened to sequences of vowels that varied in either the first or second formants, acoustic parameters that spectrally distinguish vowels. In general, more activation was observed for vowels than for nonspeech sounds in the anterior STG bilaterally. Moreover, anterior portions of this region responded more to back vowels ([u] and [o]), whereas more posterior portions responded more to front vowels ([i] and [e]), suggesting that the anterior portions of the superior temporal cortex respond to the structure of vowel space.

These findings suggest that there may be a subdivision of temporal cortex along the anterior-posterior axis as a function of the phonetic cues giving rise to the sounds of language. Phonetic cues to stop consonants seem to recruit temporal areas posterior to Heschl’s gyrus as well as the MTG (Joanisse, Zevin, & McCandliss, 2007; Liebenthal et al., 2010; Myers et al., 2009), whereas phonetic cues to vowels seem to recruit more anterior regions (Britton et al., 2009; Leff et al., 2009; Obleser et al., 2006).

With respect to laterality, the preponderance of neuroimaging evidence suggests a bilateral or somewhat right-lateralized organization for vowel processing (Kasai et al., 2001; Obleser et al., 2006; Obleser, Elbert, Lahiri, & Eulitz, 2003). For instance, a study by Formisano and colleagues (2008) used support vector machines to learn the patterns of fMRI activation associated with three different vowels ([i], [a], [u]) spoken by three speakers. Activation in both the right and left STG and extending into the STS distinguished between vowels in both the trained speakers’ voices and a novel speaker’s voice. Consistent with these findings, Guenther and colleagues (2004) report activation in the right STG for 500-ms isolated vowel stimuli, which reflected a vowel’s prototypicality as a member of its phonetic category.

Nonetheless, it appears that the laterality of vowels is influenced by their duration and the context in which they occur. In particular, it appears that vowels presented in more naturally occurring contexts, where either their durations are shorter or they occur in consonant contexts, are more likely to recruit left-hemisphere mechanisms. For example, Britton et al. (2009) reported a study of vowel processing in which vowels were presented at varying durations, from 75 to 300 ms. Vowels at longer durations showed greater activation in both left and right anterior STG, with a general rightward asymmetry for vowels. However, there was a decreasing right-hemisphere preference as the duration of the vowels got shorter. A study by Leff et al. (2009) examined the hemodynamic response to an unexpected vowel stimulus embedded in a /bVt/ context. Increased activation for the vowel deviant was observed in anterior portions of the left STG, whereas no such response to deviant stimuli was observed when the standard and deviant stimuli were nonspeech stimuli that were perceptually matched to the vowels.
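
The logic of decoding vowel identity from activation patterns and then testing generalization to a novel speaker, as in the pattern-classification study described above, can be sketched in a few lines of Python. This is a minimal sketch under loud assumptions: the study used support vector machines on fMRI data, whereas the code below swaps in a much simpler nearest-centroid classifier to stay dependency-free, and every number in it (the four-"voxel" prototypes, the speaker shifts, the noise level) is synthetic and hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical 4-"voxel" response prototypes for three vowels (made-up numbers).
PROTOTYPES = {
    "i": [1.0, 0.0, 0.0, 0.0],
    "a": [0.0, 1.0, 0.0, 0.0],
    "u": [0.0, 0.0, 1.0, 0.0],
}
# Hypothetical speaker-specific shifts added to every voxel.
SPEAKER_SHIFT = {"s1": 0.1, "s2": -0.1, "s3": 0.2}


def simulate(vowel, speaker, noise_sd=0.05):
    """Synthetic activation pattern: vowel prototype + speaker shift + noise."""
    shift = SPEAKER_SHIFT[speaker]
    return [v + shift + rng.gauss(0.0, noise_sd) for v in PROTOTYPES[vowel]]


def make_trials(speakers, n_per_cell=10):
    """List of (vowel label, pattern) pairs for the given speakers."""
    return [(vowel, simulate(vowel, spk))
            for spk in speakers
            for vowel in PROTOTYPES
            for _ in range(n_per_cell)]


def centroid(patterns):
    """Voxel-wise mean of a list of patterns."""
    return [sum(col) / len(patterns) for col in zip(*patterns)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Train on two speakers, test on a speaker never seen in training.
train = make_trials(["s1", "s2"])
test = make_trials(["s3"])
centroids = {v: centroid([p for label, p in train if label == v])
             for v in PROTOTYPES}


def classify(pattern):
    """Assign the vowel whose training centroid is closest to the pattern."""
    return min(centroids, key=lambda v: sq_dist(pattern, centroids[v]))


accuracy = sum(classify(p) == label for label, p in test) / len(test)
print(f"held-out speaker accuracy: {accuracy:.2f} (chance = {1 / 3:.2f})")
```

Above-chance accuracy on the held-out speaker is the signature of a representation that carries vowel identity independently of the voice; with real imaging data the patterns would be far noisier and the classifier, cross-validation scheme, and statistics would need much more care.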

Perception of Lexical Tone

Until now we have examined acoustic features that are limited to phonetic segments. However, suprasegmental information in the acoustic stream also has linguistic relevance. One such suprasegmental property of language is lexical tone, in which changes in the pitch contour over a word signal differences in meaning. For example, tonal languages such as Chinese and Thai employ distinct pitch patterns (e.g., rising and falling) to distinguish between lexical items with the same segmental structure. Of interest, lexical tone is an acoustic property with a relatively long duration: on the order of 300 to 500 ms. Nonlinguistic processing of pitch information, for instance in the perception of music, has long been considered a right-hemisphere process (Zatorre, Belin, & Penhune, 2002). Nonetheless, for native speakers of tone languages, lexical tone appears to be processed in the left hemisphere. In contrast, native speakers of English do not show left-hemisphere processing for Chinese or Thai words with different tones because lexical tones do not play a linguistic role in English (Gandour et al., 2000).

Evidence suggests that even at the level of the brainstem, native speakers of tone languages respond differently to tone contours even when this information is not embedded in linguistic structure (Krishnan, Swaminathan, & Gandour, 2009), suggesting that experience in learning a tonal language fundamentally alters one’s auditory perception of pitch changes in general. At the level of the cortex, however, responses to a given type of tone pattern may be more language specific. For instance, Xu et al. (2006) reported a study of tone processing that used Chinese syllables with appropriate overlaid Chinese tones, and Chinese syllables with overlaid Thai tones. This latter group of stimuli was not perceived by Chinese listeners as words in their language, and neither stimulus type was perceived as words by Thai listeners. When native speakers of Thai and Chinese performed a discrimination task on these stimuli, an interaction was observed in the left PT, such that activation was greatest for tone contours that corresponded to each group’s native language (i.e., Chinese tones for Chinese listeners, Thai tones for Thai listeners). Of particular interest, this language-specific preference was observable irrespective of whether stimuli had lexical content or not. Although processing lower level details of tone may rely on a bilaterally distributed system, it seems clear that even nonlinguistic uses of tone become increasingly left lateralized when they correspond to the tonal patterns used in one’s native language.


Invariance for Phonetic Categories of Speech Exemplars of speech categories vary in the way they are realized acoustically (Peterson & Barney, 1952). This variability can come from numerous sources, including different vocal tract properties and sizes associated with distinct talkers, imprecise motor control in speech articulation, and changes in speaking rate. Listeners are surprisingly sensitive to these variations. For example, evidence suggests that not all members of a phonetic cate gory are perceived as equal (Kuhl, 1991; Pisoni & Tash, 1974). This sensitivity to withincategory variation is reflected in neural sensitivity in the posterior STG, where there are graded neural responses, with increased activation as a member of a phonetic category approaches the acoustic-phonetic boundary (Frye et al., 2007, 2008; Guenther, Nieto-Cas tanon, Ghosh, & Tourville, 2004; Myers, 2007; Myers et al., 2009). However, for the listen er, the primary purpose of the speech signal is to denote meaning. Indeed, listeners per ceive a stable phonetic percept despite this variability. As such, a crucial question is how this acoustic variability is mapped onto a stable representation of a phonetic category. It is this question that has been termed the “invariance problem.” The search for the neural correlates of invariance presents methodological challenges to the cognitive neuroscientist. The goal is to identify patterns of response that show sensi tivity to variation between phonetic categories yet are insensitive to variation within the category. One way researchers have approached this issue is to use adaptation or oddball designs, in which a standard speech stimulus is repeated a number of times. 
This string of standard stimuli is interrupted by a stimulus that differs from the standard along some dimension, either acoustically (e.g., a different exemplar of the same phonetic category, or the same phonetic content spoken by a different speaker) or phonetically (e.g., a different phonetic category). Increases in activation for the “different” stimulus are presumed to reflect either a release from adaptation (Grill-Spector, Henson, & Martin, 2006; Grill-Spector & Malach, 2001) or an active change detection response (Zevin, Yang, Skipper, & McCandliss, 2010), but in either case serve to index a region’s sensitivity to the dimension of change.

A study by Joanisse and colleagues (2007) used such a method to investigate neural activation patterns to within- and between-category variation along a place of articulation continuum from [da] to [ga]. Sensitivity to between-category variation was observed along the left STS and into the MTG. One cluster in the left SMG showed an invariant response, that is, sensitivity to between-category changes in the face of insensitivity to within-category changes. In this study, no areas showed sensitivity to within-category variation, which leaves open the question of whether the paradigm was sensitive enough to detect small acoustic changes.

A study by Myers and colleagues (2009) investigated the same question with a similar design using a VOT continuum from [da] to [ta]. In contrast to Joanisse et al., sensitivity to both within- and between-category variation was evident in the left posterior STG, indicating that the neural system was sensitive to within-category contrasts. The only region to show an “invariant” response (i.e., sensitivity to 25-ms differences that signaled phonetic between-category differences and insensitivity to 25-ms differences that signaled within-category differences) was in the posterior portion of the left IFG extending into the precentral gyrus, a region that was not imaged in the study by Joanisse et al. (2007).

The fact that invariant responses emerged in frontal areas rather than in temporal areas suggests that the perception of category invariance may arise, at least in part, from computations in frontal areas on graded phonetic input processed in the temporal lobes (Chang et al., 2010). A cognitive, rather than perceptual, role in computing phonetic category membership seems plausible in light of behavioral evidence showing that the precise location of the phonetic category boundary is not fixed, but rather varies with a host of factors including, but not limited to, lexical context, speaker identity, and speech rate (Ganong, 1980; Kraljic & Samuel, 2005; Miller & Volaitis, 1989). A mechanism that relied only on the specific properties of the acoustic signal would be unable to account for the influence of these factors on speech perception.

Listeners perceive the same phonetic category spoken by both a male and a female speaker even though the absolute acoustic properties are not the same. At the same time, the listener also recognizes that two different speakers produced the same utterance. Thus, the perceptual system must be able to respond in an invariant way at some level of processing to phonetic identity while at the same time being able to respond differently to speaker identity. This relationship between phonetic category invariance and acoustic variability from different speakers has been the topic of much research in the behavioral literature.
This type of invariance is especially relevant given the controversy about the degree to which indexical features in the speech signal such as talker information are retained or discarded as we map speech to meaning. Behavioral studies (Goldinger, 1998; Palmeri, Goldinger, & Pisoni, 1993) suggest that speaker information is encoded along with the lexical form of a word, and that indexical information affects later access to that form. This observation has led to episodic theories of speech perception, in which nonphonetic information such as talker information is preserved in the course of speech processing. Episodic theories stand in contrast to abstractionist theories of speech perception, which propose that the speech signal is stripped of indexical features before the mapping of sound structure to lexical form (Stevens, 1960).

Studies investigating neural activation patterns associated with changes in either speaker or phonetic information have revealed dissociations between processing of these two types of information. Sensitivity to changing speakers in the face of constant linguistic information has been observed in the right anterior STS (Belin & Zatorre, 2003) and in the bilateral MTG and SMG (Wong, Nusbaum, & Small, 2004). In contrast, neural coding that distinguished among vowel types, but not individual speakers, was reported by Formisano and colleagues (2008) across a distributed set of regions in left and right STG and STS.

Nonetheless, none of these studies has specifically shown invariance to speaker change, that is, an area that treats phonetically equivalent utterances spoken by different speakers as the same. Using an adaptation paradigm, Salvata and colleagues (2012) showed speaker-invariant responses in the anterior portion of the STG bilaterally, suggesting that at some level of analysis, the neural system may abstract away from the acoustic variability across speakers in order to map to a common phonetic form. The fact that speaker-invariant regions were observed in the temporal lobes suggests that this abstraction may occur relatively early in the phonetic processing stream, before sound is mapped to lexical form, challenging a strict episodic view of lexical processing.

Of interest, invariant responses may in some cases be shifted by attention to different aspects of the signal. Bonte et al. (2009) used electroencephalography (EEG) to investigate the temporal alignment of evoked responses to vowels spoken by a set of three different speakers. Participants engaged in a one-back task on either phonetic identity or speaker identity. When engaged in the phonetic task, significant temporal alignment was observed for vowels regardless of the speaker, and when engaged in the speaker task, signals were aligned for different vowels spoken by the same speaker. Similarly, von Kriegstein et al. (2003) showed that attention to voices versus linguistic content resulted in shifts in activation, with attention to voices activating the right middle STS and attention to linguistic content activating the homologous area on the left.
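The logic of the adaptation designs discussed in this section can be made concrete with a small sketch. It is an illustration only, not the analysis used in any of the studies cited: the function names, the response values, and the fixed threshold are all hypothetical, and a real study would replace the threshold with a statistical test across participants. The sketch labels a region "invariant" when its response recovers for a between-category change but not for a within-category change of the same acoustic size, and "graded" when it recovers for both.

```python
from statistics import mean


def change_sensitivity(standard, deviant):
    """Mean response to the changed stimulus minus mean response to the repeated standard."""
    return mean(deviant) - mean(standard)


def classify_region(standard, within, between, threshold=0.1):
    """Label a region's response pattern in an adaptation design.

    standard, within, between: lists of (hypothetical) response amplitudes for the
    repeated standard, a within-category change, and a between-category change.
    threshold: arbitrary cutoff standing in for a proper statistical test.
    """
    sensitive_within = change_sensitivity(standard, within) > threshold
    sensitive_between = change_sensitivity(standard, between) > threshold
    if sensitive_between and not sensitive_within:
        return "invariant"    # responds to category change only
    if sensitive_between and sensitive_within:
        return "graded"       # responds to any acoustic change
    if sensitive_within:
        return "within-only"
    return "insensitive"


# Made-up amplitudes for two hypothetical regions.
frontal_like = classify_region([1.0, 1.1, 0.9], [1.0, 1.05, 0.95], [1.5, 1.6, 1.4])
temporal_like = classify_region([1.0, 1.1, 0.9], [1.3, 1.4, 1.2], [1.5, 1.6, 1.4])
print(frontal_like, temporal_like)
```

With these made-up values the first pattern resembles the invariant response reported for the left IFG, and the second resembles the graded sensitivity reported for the left posterior STG.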

Articulatory and Motor Influences on Speech Perception

Until now, we have focused on the influence of information in the acoustic stream on the mapping to a phonetic category. However, acoustic information is only part of the array of multimodal information associated with phonetic categories. Importantly, speech sound categories are linked to the articulatory codes that are necessary to produce these same sounds. This has led to the question of the precise role that articulatory and motor codes play in the perception of speech (Galantucci, Fowler, & Turvey, 2006; Liberman & Mattingly, 1985).

Evidence from neuroimaging suggests that motor regions involved during speech production do play some role in speech perception processes. A study by Wilson and colleagues showed coactivation of a region on the border of BA 4 and BA 6 for both the perception and production of speech (Wilson, Saygin, Sereno, & Iacoboni, 2004). A similar precentral region, together with a region in the left superior temporal lobe, was also shown to be more active for the perception of non-native speech sounds than for native speech sounds (Wilson & Iacoboni, 2006), suggesting that motor regions may be recruited, especially when the acoustic input is difficult to map to an existing speech sound in the native language inventory. What is less clear is the degree to which motor regions involved in articulation are necessary for the perception of speech.

A transcranial magnetic stimulation (TMS) study by Mottonen and Watkins (2009) investigated this issue. TMS was applied to the lip region of motor cortex, after which participants performed phonetic categorization and discrimination tasks on items along four acoustic-phonetic continua. In two of them, [ba]-[da] and [pa]-[ta], the phonetic category at one end of the continuum involved articulation at the lips (i.e., [ba] and [pa]), and the category at the other end did not (i.e., [da] and [ta]). In the two others, [ka]-[ga] and [da]-[ga], none of the phonetic categories involved articulation at the lips. Results showed that TMS changed the slope of the categorization function and the discrimination function for stimuli near the phonetic boundary for the continua that involved the lips. No changes were found for the other two continua. Of importance, the changes in phonetic identification and in discrimination occurred only for those stimuli near the phonetic boundary and did not affect the end-point, good-exemplar stimuli. Thus, although TMS affected the perception of speech, it only affected those sounds that were poorer exemplars of the phonetic category. It is possible then that, similar to the findings of Wilson and Iacoboni (2006), who showed that motor regions appear to be recruited when the acoustic input is difficult to map to an existing speech sound in the native language inventory, motor regions are recruited in speech perception when the acoustic input is poor.

The question remains as to whether these regions form the core of speech perception abilities, or instead are recruited as a supporting resource in cases in which the acoustic signal is compromised in some way. Some evidence comes from the neuropsychological literature. In particular, patients with damage to the speech motor system typically show good auditory language comprehension (see Hickok, 2009, for a review).
Nonetheless, although this evidence is suggestive, it is not definitive because auditory comprehension is supported by multiple syntactic, semantic, and contextual cues that may disguise an underlying speech perception deficit. To answer this question, it is necessary to examine directly the extent to which damage to motor areas produces a frank speech perception deficit. Few studies have investigated this question. However, those that have are consistent with the view that motor area involvement is not central to speech perception. In particular, aphasic patients with anterior brain damage who showed deficits in the production of VOT in stop consonants nonetheless performed normally in the perception of a VOT continuum (Blumstein, Cooper, Zurif, & Caramazza, 1977).

Mapping of Sounds to Words

As we have seen, the perception of the sound properties of speech engages a processing stream in the temporal lobes in which the auditory information is hierarchically organized and transformed into successively more abstract representations. Although early analysis stages recruit the temporal lobes, other neural areas, including the IFG and inferior parietal areas such as the left SMG and AG (Burton, 2001, 2009), are recruited in speech perception processes. Evidence from the aphasia literature (Blumstein, Cooper, Zurif, & Caramazza, 1977) is consistent with these findings because patients with anterior lesions as well as those with lesions extending into parietal areas present with deficits in speech perception (Caplan, Gow, & Makris, 1995). It is generally assumed that the functional role of these areas is to map acoustic-phonetic properties of speech onto phonetic category representations and to ultimately make phonetic decisions.

But speech is more than simply identifying the sound shape of language. It serves as the vehicle for accessing the words of a language, that is, the lexicon. As we discuss later, accessing the sound shape of words activates a processing stream that includes the posterior STG, the SMG and AG, and the IFG. Of importance, the ease or difficulty with which individuals access the lexicon in auditory word recognition is influenced by a number of factors related to the sound structure properties of words in a language. In this way, the structure of the phonological lexicon influences activation patterns along this processing stream. We turn now to a discussion of these factors because they provide a window into not only the functional architecture of the language processing system but also the neural systems engaged in selecting a word from the lexicon. In particular, they show that lexical processes are influenced by the extent to which a target word shares phonological structure with other words in the lexicon, and hence the extent to which a target word competes for access and selection.

Lexical Competition

Whether understanding or speaking, we need to select the appropriate words from our mental lexicon, a lexicon that contains thousands of words, many of which share sound-shape properties. It is generally assumed that in auditory word recognition, the auditory-phonetic-phonological input activates not only the target word but also a set of words that share sound-shape properties with it (Gaskell & Marslen-Wilson, 1997; Marslen-Wilson, 1987). The relationship between the sound structure of a word and the sound structure of other words in the lexicon affects lexical access, presumably because both processing and neural resources are influenced by the extent to which a target word has to be selected from a set of competing words that share this phonological structure.

Indeed, it has been shown that the extent to which a word shares phonological properties with other words in the lexicon affects the ease with which that word is accessed. In particular, it is possible to count the number of words in the lexicon that share all but one phoneme with a particular word and thereby quantify the density of the neighborhood in which that word resides. A word that has many words that share phonological properties with it is said to come from a high-density neighborhood, and a word that has few words that share phonological properties with it is said to come from a low-density neighborhood (Luce & Pisoni, 1998).

Behavioral research has shown that reaction-time latencies are slower in a lexical decision task for words from high-density neighborhoods compared with low-density neighborhoods (Luce & Pisoni, 1998). These findings are consistent with the view that it takes greater computational resources to select a word when there are many potential competitors compared with when there are few potential competitors.
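The neighborhood density count described above can be sketched in a few lines of code. The sketch assumes one common operationalization, in which two words are neighbors if one can be turned into the other by substituting, adding, or deleting a single phoneme. The function names, the toy lexicon, and its rough phonemic transcriptions are hypothetical and serve only as an illustration.

```python
def is_neighbor(a, b):
    """True if phoneme sequences a and b differ by exactly one
    substitution, addition, or deletion (assumed neighbor definition)."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    # Lengths differ by one: deleting one phoneme from the longer
    # sequence must yield the shorter one (addition or deletion).
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def neighborhood_density(word, lexicon):
    """Number of words in the lexicon that are neighbors of `word`."""
    target = lexicon[word]
    return sum(is_neighbor(target, phones)
               for other, phones in lexicon.items() if other != word)


# Toy lexicon: orthographic word -> tuple of phoneme symbols (rough transcriptions).
toy_lexicon = {
    "cat": ("k", "ae", "t"),
    "bat": ("b", "ae", "t"),
    "cap": ("k", "ae", "p"),
    "cut": ("k", "ah", "t"),
    "at": ("ae", "t"),
    "cast": ("k", "ae", "s", "t"),
    "dog": ("d", "ao", "g"),
}

print(neighborhood_density("cat", toy_lexicon))  # bat, cap, cut, at, cast -> 5
print(neighborhood_density("dog", toy_lexicon))  # no neighbors in this toy lexicon -> 0
```

In this toy lexicon "cat" comes from a relatively dense neighborhood and "dog" from a sparse one. Densities computed over a realistic lexicon would of course depend on the transcription system and the word list used.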
fMRI results show that there is greater activation in the posterior STG and the SMG in a lexical decision task for words from high-density compared with low-density neighborhoods (Okada & Hickok, 2006; Prabhakaran, Blumstein, Myers, Hutchison, & Britton, 2006). These findings suggest that these areas are recruited in accessing the lexical representations of words and that there are greater processing demands for accessing words as a function of the set of competing word targets. They are also consistent with studies indicating that posterior STG and SMG are recruited in phonological and lexical processing (Binder & Price, 2001; Indefrey & Levelt, 2004; Paulesu, Frith, & Frackowiak, 1993).

Of importance, a series of fMRI studies have shown that the information flow under conditions of lexical competition cascades throughout the lexical processing stream, activating the posterior STG, SMG, AG, and IFG. Using the visual world paradigm coupled with fMRI, Righi et al. (2010) showed increased activation in a temporal-parietal-frontal network for words that shared onsets, that is, initial sound segments or initial syllables. In these studies, subjects were required to look at the picture of an auditorily presented word from an array of four pictures including the named picture (hammock), a picture of a word that shared the onset of the target word (hammer), and two other pictures that had neither a phonological nor semantic relationship with the target word (monkey, chocolate). Participants’ eye movements were tracked during fMRI scanning. Behavioral results of participants in the scanner replicated earlier findings showing more looks to the onset competitor than to the unrelated stimuli (Allopenna, Magnuson, & Tanenhaus, 1998).
These findings show that the presence of phonological competition not only modulates activation in the posterior STG and the SMG, areas implicated in phonological processing and lexical access, but also has a modulatory effect on activation in frontal areas and in particular the IFG, an area implicated in selecting among competing semantic alternatives (Thompson-Schill, D’Esposito, Aguirre, & Farah, 1997; Thompson-Schill, D’Esposito, & Kan, 1999).

That the IFG is also activated in selecting among competing phonological alternatives raises one of two possibilities, as yet unresolved in the literature, about the functional role of the IFG in resolving competition. One possibility is that selection processes are domain general and cut across different levels of the grammar (cf. Duncan, 2001; Duncan & Owen, 2000; Miller & Cohen, 2001; Smith & Jonides, 1999). Another possibility is that there is a functional subdivision of the IFG depending on the source of the competition. In particular, BA 44 is recruited in selection based on phonological properties (Buckner, Raichle, & Petersen, 1995; Burton, Small, & Blumstein, 2000; Fiez, 1997; Poldrack et al., 1999), and BA 45 is recruited in selection based on semantic properties (Snyder, Feigenson, & Thompson-Schill, 2007; Thompson-Schill et al., 1997, 1999). Whatever the ultimate conclusion, the critical issue is that phonological properties of the lexicon have a modulatory effect along the speech-lexical processing stream consistent with the view that information flow at one level of processing (phonological lexical access) cascades and influences other stages of processing downstream from it (lexical selection).

Converging evidence from lesion-based studies supports these general conclusions. In particular, lexical processing deficits emerge in patients with IFG lesions and those with temporo-parietal lesions involving the posterior portions of the STG, the SMG, and the AG.
A series of studies using both lexical decision and eye-tracking paradigms in auditory word recognition has shown deficits in aphasic patients in accessing words in the lexicon, especially under conditions of phonological and lexical competition (cf. Blumstein, 2009, for review). Of importance, different patterns of deficits emerge for patients with frontal lesions and for those with temporo-parietal lesions, suggesting that these areas play different functional roles in lexical access and lexical selection processes (Blumstein & Milberg, 2000; Utman, Blumstein, & Sullivan, 2001; Yee, Blumstein, & Sedivy, 2008).

Nature of Information Flow

To this point, it appears as though information flow in speech processing and in accessing the sound properties of words is unidirectional, going from lower level processes to increasingly more abstract representations, and recruiting a processing stream from temporal to parietal to frontal areas. However, we also know that listeners have a lexical bias when processing the phonetic categories of speech, suggesting that higher level, lexical information may influence lower level, speech processing. In a seminal paper, Ganong (1980) showed that when presented with two continua varying along the same acoustic-phonetic attribute (e.g., VOT of the initial stop consonant), listeners show a lexical bias. Thus, they perceive more [b]’s in a beef–peef continuum and more [p]’s in a beace–peace continuum. Of importance, the same stimulus at or near the phonetic boundary is perceived differently as a function of the lexical status of the stimulus.

These findings raise the question of whether information flow in mapping sounds to words is solely bottom-up or whether top-down information affects lower level perceptual processes. There has been much debate about this question in the behavioral literature (cf. Burton et al., 1989; Connine & Clifton, 1987; McQueen, 1991; Pitt & Samuel, 1993) and, more recently, in the fMRI literature (Davis et al., 2011; Guediche et al., 2013), with some claiming that these results demonstrate that top-down lexical information shapes perceptual processes (Pitt & Samuel, 1993) and others claiming that they reflect postperceptual decision-related processes (Fox, 1984).

Evidence from functional neuroimaging provides a potential resolution to this debate. In particular, findings using fMRI (Myers & Blumstein, 2008) and MEG coupled with EEG (Gow, Segawa, Ahlfors, & Lin, 2008) are consistent with a functional architecture in which information flow is not just bottom-up but is also top-down.
In both studies, participants listened to stimuli taken from continua ranging from a word to a nonword (e.g., gift to kift) or from a nonword to a word (e.g., giss to kiss). Participants were asked to categorize the initial phoneme of each stimulus, and in both studies there was evidence of a shift in the perception of tokens from the continuum such that more categorizations were made that were consistent with the word endpoint (e.g., more “g” responses in the “gift–kift” continuum). Activation patterns in the STG reflected this shift in perception, showing modulation as a function of the lexically biased shift in the locus of the phonetic boundary. Given that modulation of activation was seen early in the neural processing stream, these data were taken as evidence that lexically biased shifts in perception affect early processing of speech stimuli in the STG. Indeed, the MEG/EEG source estimates suggest that the information flow from the SMG, an area implicated in lexical processing, influenced activation in the left posterior STG (Gow et al., 2008). Taken together, these results are consistent with a functional architecture of language in which information flow is bidirectional. Of interest, the STG appears to integrate phonetic information with multiple sources of linguistic information, including lexical and sentence level (meaning) information (Chandrasekaran, Chan, & Wong, 2011; Obleser, Wise, Dresner, & Scott, 2007).
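The lexically biased shift in the phonetic boundary described in this section can be illustrated with a simple identification function. The sketch below is a toy model under stated assumptions, not a description of how any of the cited studies analyzed their data: it assumes that the probability of a "g" response falls off logistically with VOT, and the boundary locations, the slope, the VOT steps, and the function names are all hypothetical. It shows how the same ambiguous token is categorized differently when the boundary sits at a longer VOT (as in a gift–kift continuum) than when it sits at a shorter VOT (as in a giss–kiss continuum).

```python
import math


def p_voiced(vot_ms, boundary_ms, slope=0.3):
    """Assumed logistic identification function: probability of a
    voiced ("g") response for a token with the given VOT."""
    return 1.0 / (1.0 + math.exp(slope * (vot_ms - boundary_ms)))


def crossover(vots, props):
    """Linearly interpolate the VOT at which the proportion of voiced
    responses crosses 0.5; returns None if it never does."""
    points = list(zip(vots, props))
    for (v0, p0), (v1, p1) in zip(points, points[1:]):
        if p0 >= 0.5 > p1:
            return v0 + (p0 - 0.5) * (v1 - v0) / (p0 - p1)
    return None


vots = [10, 20, 30, 40, 50]  # hypothetical VOT steps in ms

# Hypothetical boundaries: pulled toward longer VOT when the voiced endpoint
# is a word (gift-kift), toward shorter VOT when the voiceless endpoint is (giss-kiss).
gift_kift = [p_voiced(v, boundary_ms=35) for v in vots]
giss_kiss = [p_voiced(v, boundary_ms=25) for v in vots]

gift_boundary = crossover(vots, gift_kift)
giss_boundary = crossover(vots, giss_kiss)
print(gift_boundary, giss_boundary)

# The same 30-ms token is mostly heard as "g" in one continuum and as "k" in the other.
print(p_voiced(30, 35) > 0.5, p_voiced(30, 25) > 0.5)
```

The difference between the two estimated boundaries is one simple way to quantify the size of the lexical effect; by construction it is 10 ms in this toy example.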

Neural Plasticity: Phonetic Category Learning in Adults

The neural architecture involved in processing mature, native-language phonetic category distinctions has been well studied. What has been less well addressed is how this structure arises. Second language learning offers a window into the processes that give rise to language organization in the brain. In particular, one can explore the neural systems that support the learning of a new phonetic contrast and examine whether the neural structures that support speech perception are plastic into adulthood. Although typically developing children learn to perceive the sounds of their native languages, and indeed can gain native-like proficiency in learning a second language, adult second-language learners meet with much more variable success in learning new speech sound contrasts (Bradlow, Pisoni, Akahane-Yamada, & Tohkura, 1997).

Explanations for the locus of difficulties in the perception of second-language contrasts differ, yet it is clear that native-like discrimination of many non-native speech contrasts is unobtainable for some listeners, even with considerable experience or training (Pallier, Bosch, & Sebastian-Galles, 1997). Taken at face value, this finding argues against significant neural plasticity of the structures involved in phonetic learning in adulthood. However, using an individual differences approach, a number of investigations have looked for brain areas that correlate with better success in learning non-native contrasts.

Evidence from fMRI and event-related potential (ERP) studies of non-native sound processing suggests that better proficiency in perceiving non-native speech contrasts is accompanied by neural activation patterns that increasingly resemble the patterns shown for native speech sounds.
Specifically, increased leftward lateralization is found in the temporal lobe for those listeners who show better perceptual performance for trained non-native speech sounds (Naatanen et al., 1997; Zhang et al., 2009; cf. Naatanen, Paavilainen, Rinne, & Alho, 2007, for review). Moreover, increased sensitivity to non-native contrasts also correlates with less activation in the left IFG as measured by fMRI (Golestani & Zatorre, 2004; Myers & Blumstein, 2011). At the same time, increased proficiency in perceiving non-native sounds is accompanied by an increase in the size of the mismatch negativity, or MMN (Ylinen et al., 2010; Zhang et al., 2009), which is thought to index preattentive sensitivity to acoustic contrasts and to originate from sources in the bilateral temporal lobes and inferior frontal gyri.

These findings are not necessarily incompatible with one another. The neural sensitivity indexed by the MMN response may reflect more accurate or more efficient encoding of learned phonetic contrasts, particularly in the temporal lobes. In turn, decreases in activation in the left IFG for processing non-native speech sounds may be related to greater efficiency and hence fewer neural resources needed to access and make decisions about category representations when speech sound perception is well encoded in the temporal lobes.

In fact, there is some evidence suggesting that frontal areas play a more active role in determining the degree of success that learners attain on non-native contrasts. In an ERP study of bilingualism by Diaz et al. (2008), early Spanish-Catalan bilinguals were grouped by their mastery of a vowel contrast in their second language, Catalan. MMN responses to a non-native and a native vowel contrast as well as to nonspeech stimuli were assessed in a group of “good perceivers” and a group of “poor perceivers.” Although good and poor perceivers did not differ in their MMN response to either the native and non-native speech contrasts or the nonspeech stimuli at electrodes over the temporal poles, significant differences between groups emerged for both types of speech contrasts over frontal electrode sites, with good perceivers showing a larger MMN to both non-native and native speech contrasts than poor perceivers.

Of interest, some recent research suggests that individual differences in activation for non-native speech sounds may arise at least in part because of differences in brain morphology.
In Heschl’s gyrus, greater volume in the left but not right hemisphere correlates with individual success in learning a non-native tone contrast (Wong et al., 2008) and a non-native place of articulation contrast (Golestani, Molko, Dehaene, LeBihan, & Pallier, 2007). Given that Heschl’s gyrus is involved in auditory processing, these anatomical asymmetries give rise to the hypothesis that better learning of non-native contrasts comes about because some individuals are better able to perceive the fine-grained acoustic details of speech.

The question remains, however, of whether individual differences in brain morphology are the cause or consequence of differences in proficiency with non-native contrasts. That is, do preexisting, potentially innate differences in brain morphology support better learning, or do differences in cortical volume arise because of neural plasticity resulting from experience with the sounds of speech in adulthood? This question was examined in a recent study by Golestani and colleagues (2011) in which differences in brain morphology were measured in a group of trained phoneticians and a group of untrained controls. Greater white matter volume in Heschl’s gyrus was evident in the group of phoneticians compared to the normal controls. However, differences in these early auditory processing areas did not correlate with amount of phonetic training experience among the phoneticians. In contrast, regions in a subportion of the left IFG, the left pars opercularis, showed a significant correlation with years of phonetic training. Taken together, these results suggest that differences in the size and morphology of Heschl’s gyrus may confer an innate advantage in perceiving the fine-grained details of the speech stream. The increased volume in frontal areas appears instead to reflect experience-dependent plasticity. Thus, phoneticians may become phoneticians because they have a “natural” propensity for perceiving speech. But their performance is enhanced by experience and the extent to which they become “experts.”

Summary and Future Directions

In this chapter, we have reviewed the nature of speech perception processing and the neural systems underlying such processing. As we have seen, the processing of the sounds of language recruits a neural processing stream involving temporal, parietal, and frontal structures. Together they support a functional architecture in which information flow is progressively transformed in stages from the auditory input to spectral-temporal properties of speech, phonetic category representations, and ultimately lexical representations. Of importance, patterns of neural activation are modulated throughout the processing stream, suggesting that the system is highly interactive.

Although there has been much progress, many questions remain unanswered. In particular, the direction of information flow has largely been inferred from our knowledge of the effects of lesions on speech perception processes. Studies that integrate imaging methods with fine-grained temporal resolution (ERP, MEG) with the spatial resolution afforded by fMRI will be critical for our understanding of the extent to which feedforward or feedback mechanisms underlie the modulation of activation found, for example, on the influence of lexical information on the perception of the acoustic-phonetic properties of speech, the influence of motor and articulatory processes on speech perception, and the influence of sound properties of speech on lexical access. Of particular interest is whether the IFG is involved solely in decision-related executive processes or whether it, in turn, modulates activation of neural areas downstream from it. That is, does IFG activation influence the activation of temporal lobe areas in the processes of phonetic categorization, and does it influence the activation of parietal lobe areas in the processes of lexical access?

At each level of processing, questions remain.
Evidence we have reviewed suggests that the acoustic-phonetic properties of speech are extracted in different areas within the tem poral lobe, and at least at early stages of processing, they differentially recruit the right hemisphere. However, listeners perceive a unitary percept of individual sound segments and syllables. How does the neural system integrate or “bind” this information? As listen ers we are attuned to fine acoustic differences, whether it is to within-category variation or speaker variation, and yet we ignore these differences as we perceive a stable phonet ic category or lexical representation. How does our neural system solve this invariance problem across the different sources of variability encountered by the listener? Results to date suggest that the resolution of this variability may recruit different neural areas, de pending on the type of variability. Page 19 of 30

And finally, we know that adult listeners show some degree of plasticity in processing language. They can learn new languages and their attendant phonetic and phonological structures, and they show the ability to dynamically adapt to variability in the speech and language input when accessing the sound structure and lexical representations of their native language (see Kraljic & Samuel, 2007, for a review). What are the neural systems underlying this plasticity? Is the same neural system recruited that underlies adult processing of the sound structure of language, or are other areas recruited in support of such learning?

These are only a few of the questions remaining to be answered, but together they set an agenda for future research on the neural systems underlying speech perception processes.

Author Note

This research was supported in part by NIH NIDCD Grant R01 DC006220, R01 DC00314, and NIH NIDCD Grant R03 DC009495 to Brown University, and NIH NIDCD P30 DC010751 to the University of Connecticut. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute on Deafness and Other Communication Disorders or the National Institutes of Health.

References

Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998). Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models. Journal of Memory and Language, 38 (4), 419–439.
Auerbach, S. H., Allard, T., Naeser, M., Alexander, M. P., & Albert, M. L. (1982). Pure word deafness: Analysis of a case with bilateral lesions and a defect at the prephonemic level. Brain: A Journal of Neurology, 105 (Pt 2), 271–300.
Baese-Berk, M., & Goldrick, M. (2009). Mechanisms of interaction in speech production. Language and Cognitive Processes, 24 (4), 527–554.
Belin, P., & Zatorre, R. J. (2003). Adaptation to speaker’s voice in right anterior temporal lobe, 14 (16), 2105–2109.
Binder, J. R., & Price, C. (2001). Functional neuroimaging of language. In R. Cabeza & A. Kingstone (Eds.), Handbook of functional neuroimaging of cognition (pp. 187–251). Cambridge, MA: MIT Press.
Blumstein, S. E. (2009). Auditory word recognition: Evidence from aphasia and functional neuroimaging. Language and Linguistics Compass, 3, 824–838.
Blumstein, S. E., Cooper, W. E., Zurif, E. B., & Caramazza, A. (1977). The perception and production of voice-onset time in aphasia. Neuropsychologia, 15, 371–383.

Blumstein, S. E., & Milberg, W. (2000). Language deficits in Broca’s and Wernicke’s aphasia: A singular impairment. In Y. Grodzinsky, L. Shapiro, & D. Swinney (Eds.), Language and the brain: Representation and processing (pp. 167–183). New York: Academic Press.
Blumstein, S. E., Myers, E. B., & Rissman, J. (2005). The perception of voice onset time: An fMRI investigation of phonetic category structure. Journal of Cognitive Neuroscience, 17 (9), 1353–1366.
Blumstein, S. E., & Stevens, K. N. (1980). Perceptual invariance and onset spectra for stop consonants in different vowel environments. Journal of the Acoustical Society of America, 67 (2), 648–662.
Boatman, D. F., & Miglioretti, D. L. (2005). Cortical sites critical for speech discrimination in normal and impaired listeners. Journal of Neuroscience, 25 (23), 5475–5480.
Bonte, M., Valente, G., & Formisano, E. (2009). Dynamic and task-dependent encoding of speech and voice by phase reorganization of cortical oscillations. Journal of Neuroscience, 29 (6), 1699.
Bradlow, A. R., Pisoni, D. B., Akahane-Yamada, R., & Tohkura, Y. (1997). Training Japanese listeners to identify English /r/ and /l/: IV. Some effects of perceptual learning on speech production. Journal of the Acoustical Society of America, 101 (4), 2299–2310.
Britton, B., Blumstein, S. E., Myers, E. B., & Grindrod, C. (2009). The role of spectral and durational properties on hemispheric asymmetries in vowel perception. Neuropsychologia, 47 (4), 1096–1106.
Buckner, R. L., Raichle, M. E., & Petersen, S. E. (1995). Dissociation of human prefrontal cortical areas across different speech production tasks and gender groups. Journal of Neurophysiology, 74 (5), 2163–2173.
Burton, M. W. (2001). The role of inferior frontal cortex in phonological processing. Cognitive Science, 25, 695–709.
Burton, M. W. (2009). Understanding the role of the prefrontal cortex in phonological processing. Clinical Linguistics and Phonetics, 23 (3), 180–195.
Burton, M. W., Baum, S. R., & Blumstein, S. E. (1989). Lexical effects on the phonetic categorization of speech: The role of acoustic structure. Journal of Experimental Psychology: Human Perception and Performance, 15, 567–575.
Burton, M. W., Small, S. L., & Blumstein, S. E. (2000). The role of segmentation in phonological processing: An fMRI investigation. Journal of Cognitive Neuroscience, 12 (4), 679–690.
Caplan, D., Gow, D., & Makris, N. (1995). Analysis of lesions by MRI in stroke patients with acoustic-phonetic processing deficits. Neurology, 45, 293–298.


Chandrasekaran, B., Chan, A. H. D., & Wong, P. C. M. (2011). Neural processing of what and who information in speech. Journal of Cognitive Neuroscience, 2 (10), 2690–2700.
Chang, E. F., Rieger, J. W., Johnson, K., Berger, M. S., Barbaro, N. M., & Knight, R. T. (2010). Categorical speech representation in human superior temporal gyrus. Nature Neuroscience, 13 (11), 1428–1432.
Connine, C. M., & Clifton, C. (1987). Interactive use of lexical information in speech perception. Journal of Experimental Psychology: Human Perception and Performance, 13, 291–299.
Coslett, H. B., Brashear, H. R., & Heilman, K. M. (1984). Pure word deafness after bilateral primary auditory cortex infarcts. Neurology, 34 (3), 347–352.
Davis, M. H., Ford, M. A., Kherif, F., & Johnsrude, I. S. (2011). Does semantic context benefit speech understanding through “top–down” processes? Evidence from time-resolved sparse fMRI. Journal of Cognitive Neuroscience, 23 (12), 3914–3932.
Dell, G. S. (1986). A spreading activation theory of retrieval in sentence production. Psychological Review, 93, 283–321.
Desai, R., Liebenthal, E., Waldron, E., & Binder, J. R. (2008). Left posterior temporal regions are sensitive to auditory categorization. Journal of Cognitive Neuroscience, 20 (7), 1174–1188.
Diaz, B., Baus, C., Escera, C., Costa, A., & Sebastian-Galles, N. (2008). Brain potentials to native phoneme discrimination reveal the origin of individual differences in learning the sounds of a second language. Proceedings of the National Academy of Science U S A, 105 (42), 16083–16088.
Duncan, J. (2001). An adaptive model of neural function in prefrontal cortex. Nature Reviews Neuroscience, 2, 820–829.
Duncan, J., & Owen, A. M. (2000). Common regions of the human frontal lobe recruited by diverse cognitive demands. Trends in Neurosciences, 23, 475–483.
Fiez, J. A. (1997). Phonology, semantics, and the role of the left inferior prefrontal cortex. Human Brain Mapping, 5 (2), 79–83.
Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: MIT Press.
Formisano, E., De Martino, F., Bonte, M., & Goebel, R. (2008). “Who” is saying “what”? Brain-based decoding of human voice and speech. Science, 322, 970–973.
Fox, R. A. (1984). Effect of lexical status on phonetic categorization. Journal of Experimental Psychology: Human Perception and Performance, 10, 526–540.


Frye, R. E., Fisher, J. M. G., Witzel, T., Ahlfors, S. P., Swank, P., Liederman, J., & Halgren, E. (2008). Objective phonological and subjective perceptual characteristics of syllables modulate spatiotemporal patterns of superior temporal gyrus activity. NeuroImage, 40 (4), 1888–1901.

Frye, R. E., Fisher, J. M., Coty, A., Zarella, M., Liederman, J., & Halgren, E. (2007). Linear coding of voice onset time. Journal of Cognitive Neuroscience, 19 (9), 1476–1487.
Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). The motor theory of speech perception reviewed. Psychonomic Bulletin and Review, 13 (3), 361–377.
Gandour, J., Wong, D., Hsieh, L., Weinzapfel, B., Van Lancker, D., & Hutchins, G. D. (2000). A crosslinguistic PET study of tone perception. Journal of Cognitive Neuroscience, 12 (1), 207–222.
Ganong, W. F. (1980). Phonetic categorization in auditory word perception. Journal of Experimental Psychology: Human Perception and Performance, 6 (1), 110–125.
Gaskell, M. G., & Marslen-Wilson, W. D. (1997). Integrating form and meaning: A distributed model of speech perception. Language and Cognitive Processes, 12, 613–656.
Gaskell, M. G., & Marslen-Wilson, W. D. (1999). Ambiguity, competition, and blending in spoken word recognition. Cognitive Science, 23 (4), 439–462.
Goldinger, S. D. (1998). Echoes of echoes? An episodic theory of lexical access. Psychological Review, 105 (2), 251–278.
Goldrick, M., & Blumstein, S. E. (2006). Cascading activation from phonological planning to articulatory processes: Evidence from tongue twisters. Language and Cognitive Processes, 21, 649–683.
Golestani, N., Molko, N., Dehaene, S., LeBihan, D., & Pallier, C. (2007). Brain structure predicts the learning of foreign speech sounds. Cerebral Cortex, 17 (3), 575.
Golestani, N., Price, C. J., & Scott, S. K. (2011). Born with an ear for dialects? Structural plasticity in the expert phonetician brain. Journal of Neuroscience, 31 (11), 4213–4220.
Golestani, N., & Zatorre, R. J. (2004). Learning new sounds of speech: Reallocation of neural substrates, 21 (2), 494–506.
Gow, D. W., Segawa, J. A., Ahlfors, S. P., & Lin, F.-H. (2008). Lexical influences on speech perception: A Granger causality analysis of MEG and EEG source estimates. NeuroImage, 43 (3), 614–623.
Grill-Spector, K., Henson, R., & Martin, A. (2006). Repetition and the brain: Neural models of stimulus-specific effects. Trends in Cognitive Sciences, 10 (1), 14–23.


Grill-Spector, K., & Malach, R. (2001). fMR-adaptation: A tool for studying the functional properties of human cortical neurons. Acta Psychologica (Amsterdam), 107 (1-3), 293–321.
Guediche, S., Salvata, C., & Blumstein, S. E. (2013). Temporal cortex reflects effects of sentence context on phonetic processing. Journal of Cognitive Neuroscience, 25 (5), 706–718.
Guenther, F. H., Nieto-Castanon, A., Ghosh, S. S., & Tourville, J. A. (2004). Representation of sound categories in auditory cortical maps. Journal of Speech, Language, and Hearing Research, 47 (1), 46–57.
Hickok, G. (2009). The functional neuroanatomy of language. Physics of Life Reviews, 6, 121–143.
Hickok, G., & Poeppel, D. (2004). Dorsal and ventral streams: A framework for understanding aspects of the functional anatomy of language. Cognition, 92 (1-2), 67–99.
Hutchison, E. R., Blumstein, S. E., & Myers, E. B. (2008). An event-related fMRI investigation of voice-onset time discrimination. NeuroImage, 40 (1), 342–352.
Indefrey, P., & Levelt, W. J. M. (2004). The spatial and temporal signatures of word production components. Cognition, 92 (1–2), 101–144.
Joanisse, M. F., Zevin, J. D., & McCandliss, B. D. (2007). Brain mechanisms implicated in the preattentive categorization of speech sounds revealed using fMRI and a short-interval habituation trial paradigm. Cerebral Cortex, 17 (9), 2084–2093.
Kasai, K., Yamada, H., Kamio, S., Nakagome, K., Iwanami, A., Fukuda, M., Itoh, K., Koshida, I., Yumoto, M., Iramina, K., Kato, N., & Ueno, S. (2001). Brain lateralization for mismatch response to across- and within-category change of vowels. NeuroReport, 12 (11), 2467–2471.
Kraljic, T., & Samuel, A. G. (2005). Perceptual learning for speech: Is there a return to normal? Cognitive Psychology, 51 (2), 141–178.
Kraljic, T., & Samuel, A. G. (2007). Perceptual adjustments to multiple speakers. Journal of Memory and Language, 56, 1–15.
Krishnan, A., Swaminathan, J., & Gandour, J. T. (2009). Experience-dependent enhancement of linguistic pitch representation in the brainstem is not specific to a speech context. Journal of Cognitive Neuroscience, 21 (6), 1092–1105.
Kuhl, P. K. (1991). Human adults and human infants show a “perceptual magnet effect” for the prototypes of speech categories, monkeys do not. Perception and Psychophysics, 50 (2), 93–107.


Leff, A. P., Iverson, P., Schofield, T. M., Kilner, J. M., Crinion, J. T., Friston, K. J., & Price, C. J. (2009). Vowel-specific mismatch responses in the anterior superior temporal gyrus: An fMRI study. Cortex, 45 (4), 517–526.
Levelt, W. J. M. (1992). Accessing words in speech production: Stages, processes, and representations. Cognition, 42, 1–22.
Liberman, A. M., Delattre, P. C., & Cooper, F. S. (1958). Some cues for the distinction between voiceless and voiced stops in initial position. Language and Speech, 1, 153–157.
Liberman, A. M., Harris, K. S., Hoffman, H. S., & Griffith, B. C. (1957). The discrimination of speech sounds within and across phoneme boundaries. Journal of Experimental Psychology, 54 (5), 358–368.
Liberman, A. M., & Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition, 21 (1), 1–36.
Liebenthal, E., Desai, R., Ellingson, M. M., Ramachandran, B., Desai, A., & Binder, J. R. (2010). Specialization along the left superior temporal sulcus for auditory categorization. Cerebral Cortex, 20, 2958–2970.
Liégeois-Chauvel, C., de Graaf, J. B., Laguitton, V., & Chauvel, P. (1999). Specialization of left auditory cortex for speech perception in man depends on temporal coding. Cerebral Cortex, 9 (5), 484–496.
Lisker, L., & Abramson, A. S. (1964). A cross-language study of voicing in initial stops: Acoustical measurements. Word, 20, 384–422.
Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and Hearing, 19 (1), 1–36.
Marslen-Wilson, W. D. (1987). Functional parallelism in spoken word-recognition. Cognition, 25 (1-2), 71–102.
Marslen-Wilson, W. D., & Welsh, A. (1978). Processing interactions and lexical access during word recognition in continuous speech. Cognitive Psychology, 10 (1), 29–63.
McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18 (1), 1–86.
McGee, T., Kraus, N., King, C., Nicol, T., & Carrell, T. D. (1996). Acoustic elements of speechlike stimuli are reflected in surface recorded responses over the guinea pig temporal lobe. Journal of the Acoustical Society of America, 99 (6), 3606–3614.
McQueen, J. M. (1991). The influence of the lexicon on phonetic categorization: Stimulus quality in word-final ambiguity. Journal of Experimental Psychology: Human Perception and Performance, 17, 433–443.


Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24, 167–202.
Miller, J. L., & Volaitis, L. E. (1989). Effect of speaking rate on the perceptual structure of a phonetic category. Perception and Psychophysics, 46 (6), 505–512.
Mottonen, R., & Watkins, K. E. (2009). Motor representations of articulators contribute to categorical perception of speech sounds. Journal of Neuroscience, 29 (31), 9819.
Myers, E. B. (2007). Dissociable effects of phonetic competition and category typicality in a phonetic categorization task: An fMRI investigation. Neuropsychologia, 45 (7), 1463–1473.
Myers, E. B., & Blumstein, S. E. (2008). The neural bases of the lexical effect: An fMRI investigation. Cerebral Cortex, 18 (2), 278.
Myers, E. B., & Blumstein, S. E. (2011). Individual differences in neural sensitivity to a novel phonetic contrast. Unpublished manuscript.
Myers, E. B., Blumstein, S. E., Walsh, E., & Eliassen, J. (2009). Inferior frontal regions underlie the perception of phonetic category invariance. Psychological Science, 20 (7), 895–903.
Naatanen, R., Lehtokoski, A., Lennes, M., Cheour, M., Huotilainen, M., Iivonen, A., Vainio, M., et al. (1997). Language-specific phoneme representations revealed by electric and magnetic brain responses. Nature, 385 (6615), 432–434.
Naatanen, R., Paavilainen, P., Rinne, T., & Alho, K. (2007). The mismatch negativity (MMN) in basic research of central auditory processing: a review. Clinical Neurophysiology, 118 (12), 2544–2590.
Obleser, J., Boecker, H., Drzezga, A., Haslinger, B., Hennenlotter, A., Roettinger, M., Eulitz, C., et al. (2006). Vowel sound extraction in anterior superior temporal cortex. Human Brain Mapping, 27 (7), 562–571.
Obleser, J., Elbert, T., Lahiri, A., & Eulitz, C. (2003). Cortical representation of vowels reflects acoustic dissimilarity determined by formant frequencies. Cognitive Brain Research, 15 (3), 207–213.
Obleser, J., Wise, R. J. S., Dresner, M. A., & Scott, S. K. (2007). Functional integration across brain regions improves speech perception under adverse listening conditions. Journal of Neuroscience, 27 (9), 2283–2289.
Okada, K., & Hickok, G. (2006). Identification of lexical-phonological networks in the superior temporal sulcus using functional magnetic resonance imaging. NeuroReport, 17 (12), 1293–1296.


Pallier, C., Bosch, L., & Sebastián-Gallés, N. (1997). A limit on behavioral plasticity in speech perception. Cognition, 64 (3), B9–B17.
Palmeri, T. J., Goldinger, S. D., & Pisoni, D. B. (1993). Episodic encoding of voice attributes and recognition memory for spoken words. Learning, Memory, 19 (2), 309–328.
Papanicolaou, A. C., Castillo, E., Breier, J. I., Davis, R. N., Simos, P. G., & Diehl, R. L. (2003). Differential brain activation patterns during perception of voice and tone onset time series: a MEG study. NeuroImage, 18 (2), 448–459.
Paulesu, E., Frith, C. D., & Frackowiak, R. S. J. (1993). The neural correlates of the verbal component of working memory. Nature, 362 (6418), 342–345.
Peramunage, D., Blumstein, S. E., Myers, E. B., Goldrick, M., & Baese-Berk, M. (2011). Phonological neighborhood effects in spoken word production: An fMRI study. Journal of Cognitive Neuroscience, 23 (3), 593–603.
Peterson, G. E., & Barney, H. L. (1952). Control methods used in a study of vowels. Journal of the Acoustical Society of America, 24, 175–184.
Pisoni, D. B., & Tash, J. (1974). Reaction times to comparisons within and across phonetic categories. Perception and Psychophysics, 15, 289–290.
Pitt, M. A., & Samuel, A. G. (1993). An empirical and meta-analytic evaluation of the phoneme identification task. Journal of Experimental Psychology: Human Perception and Performance, 19, 699–725.
Poeppel, D. (2001). Pure word deafness and the bilateral processing of the speech code. Cognitive Science, 25 (5), 679–693.
Poldrack, R. A. (2006). Can cognitive processes be inferred from neuroimaging data? Trends in Cognitive Sciences, 10, 59–63.
Poldrack, R. A., Wagner, A. D., Prull, M. W., Desmond, J. E., Glover, G. H., & Gabrieli, J. D. (1999). Functional specialization for semantic and phonological processing in the left inferior prefrontal cortex. NeuroImage, 10 (1), 15–35.
Prabhakaran, R., Blumstein, S. E., Myers, E. B., Hutchison, E., & Britton, B. (2006). An event-related fMRI investigation of phonological-lexical competition. Neuropsychologia, 44 (12), 2209–2221.
Remez, R. E., Rubin, P. E., Pisoni, D. B., & Carrell, T. D. (1981). Speech perception without traditional speech cues. Science, 212 (4497), 947–949.
Righi, G., Blumstein, S. E., Mertus, J., & Worden, M. S. (2010). Neural systems underlying lexical competition: An eye tracking and fMRI study. Journal of Cognitive Neuroscience, 22 (2), 213–224.


Salvata, C., Blumstein, S. E., & Myers, E. B. (2012). Speaker invariance for phonetic information: An fMRI investigation. Language and Cognitive Processes, 27 (2), 210–230.
Shankweiler, D., & Studdert-Kennedy, M. (1967). Identification of consonants and vowels presented to left and right ears. Quarterly Journal of Experimental Psychology, 19, 59–63.
Sharma, A., & Dorman, M. F. (1999). Cortical auditory evoked potential correlates of categorical perception of voice-onset time. Journal of the Acoustical Society of America, 106 (2), 1078–1083.
Smith, E. E., & Jonides, J. (1999). Storage and executive processes in the frontal lobes. Science, 283, 1657–1661.
Snyder, H. R., Feignson, K., & Thompson-Schill, S. L. (2007). Prefrontal cortical response to conflict during semantic and phonological tasks. Journal of Cognitive Neuroscience, 19, 761–775.
Steinschneider, M., Schroeder, C. E., Arezzo, J. C., & Vaughan, H. G. (1995). Physiologic correlates of the voice onset time boundary in primary auditory cortex (A1) of the awake monkey: Temporal response patterns. Brain and Language, 48 (3), 326–340.
Steinschneider, M., Volkov, I. O., Noh, M. D., Garell, P. C., & Howard, M. A. (1999). Temporal encoding of the voice onset time phonetic parameter by field potentials recorded directly from human auditory cortex. Journal of Neurophysiology, 82 (5), 2346–2357.
Stevens, K. N. (1960). Toward a model for speech recognition. Journal of the Acoustical Society of America, 32 (1), 47–55.

Stevens, K. N., & Blumstein, S. E. (1978). Invariant cues for place of articulation in stop consonants. Journal of the Acoustical Society of America, 64, 1358–1368.
Strange, W. (1989). Evolving theories of vowel perception. Journal of the Acoustical Society of America, 85 (5), 2081–2087.
Strange, W., Jenkins, J., & Johnson, T. L. (1983). Dynamic specification of coarticulated vowels. Journal of the Acoustical Society of America, 74 (3), 697–705.
Studdert-Kennedy, M., & Shankweiler, D. (1970). Hemispheric specialization for speech perception. Journal of the Acoustical Society of America, 48, 579–594.
Tallal, P., & Newcombe, F. (1978). Impairment of auditory perception and language comprehension in aphasia. Brain and Language, 5, 13–24.
Thompson-Schill, S. L., D’Esposito, M., Aguirre, G. K., & Farah, M. J. (1997). Role of the left inferior prefrontal cortex in retrieval of semantic knowledge: A reevaluation. Proceedings of the National Academy of Sciences U S A, 94, 14792–14797.


Thompson-Schill, S. L., D’Esposito, M., & Kan, I. P. (1999). Effects of repetition and competition on activation in left prefrontal cortex during word generation. Neuron, 23, 513–522.
Utman, J. A., Blumstein, S. E., & Sullivan, K. (2001). Mapping from sound to meaning: Reduced lexical activation in Broca’s aphasics. Brain and Language, 79, 444–472.
von Kriegstein, K., Eger, E., Kleinschmidt, A., & Giraud, A. L. (2003). Modulation of neural responses to speech by directing attention to voices or verbal content. Cognitive Brain Research, 17 (1), 48–55.
Wildgruber, D., Pihan, H., Ackermann, H., Erb, M., & Grodd, W. (2002). Dynamic brain activation during processing of emotional intonation: Influence of acoustic parameters, emotional valence, and sex. NeuroImage, 15 (4), 856–869.
Wilson, S. M., & Iacoboni, M. (2006). Neural responses to non-native phonemes varying in producibility: Evidence for the sensorimotor nature of speech perception. NeuroImage, 33 (1), 316–325.
Wilson, S. M., Saygin, A. P., Sereno, M. I., & Iacoboni, M. (2004). Listening to speech activates motor areas involved in speech production. Nature Neuroscience, 7 (7), 701–702.
Wong, P. C. M., Nusbaum, H. C., & Small, S. L. (2004). Neural bases of talker normalization. Journal of Cognitive Neuroscience, 16 (7), 1173–1184.
Wong, P., Warrier, C. M., Penhune, V. B., Roy, A. K., Sadehh, A., Parrish, T. B., & Zatorre, R. J. (2008). Volume of left Heschl’s gyrus and linguistic pitch learning. Cerebral Cortex, 18 (4), 828–836.
Xu, Y., Gandour, J., Talavage, T., Wong, D., Dzemidzic, M., Tong, Y., Li, X., et al. (2006). Activation of the left planum temporale in pitch processing is shaped by language experience. Human Brain Mapping, 27 (2), 173–183.
Yee, E., Blumstein, S. E., & Sedivy, J. C. (2008). Lexical-semantic activation in Broca’s and Wernicke’s aphasia: Evidence from eye movements. Journal of Cognitive Neuroscience, 20, 592–612.
Ylinen, S., Uther, M., Latvala, A., Vepsäläinen, S., Iverson, P., Akahane-Yamada, R., & Näätänen, R. (2010). Training the brain to weight speech cues differently: A study of Finnish second-language users of English. Journal of Cognitive Neuroscience, 22 (6), 1319–1332.
Zatorre, R. J., Belin, P., & Penhune, V. B. (2002). Structure and function of auditory cortex: Music and speech. Trends in Cognitive Sciences, 6 (1), 37–46.
Zevin, J. D., Yang, J., Skipper, J. I., & McCandliss, B. D. (2010). Domain general change detection accounts for “dishabituation” effects in temporal-parietal regions in functional magnetic resonance imaging studies of speech perception. Journal of Neuroscience, 30 (3), 1110.
Zhang, Y., Kuhl, P. K., Imada, T., Iverson, P., Pruitt, J., Stevens, E. B., Kawakatsu, M., et al. (2009). Neural signatures of phonetic learning in adulthood: A magnetoencephalography study. NeuroImage, 46 (1), 226–240.

Sheila Blumstein

Sheila Blumstein is the Albert D. Mead Professor of Cognitive, Linguistic and Psychological Sciences at Brown University.

Emily B. Myers

Emily B. Myers, Department of Psychology, University of Connecticut, Storrs, CT


Multimodal Speech Perception

Agnès Alsius, Ewen MacDonald, and Kevin Munhall
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0026

Abstract and Keywords

Spoken language can be understood through different sensory modalities. Audition, vision, and haptic perception each can transduce speech information from a talker as a single channel of information. The more natural context for communication is for language to be perceived through multiple modalities and for multimodal integration to occur. This chapter reviews the sensory information provided by talkers and the constraints on multimodal information processing. The information generated during speech comes from a common source, the moving vocal tract, and thus shows significant correlations across modalities. In addition, the modalities provide complementary information for the perceiver. For example, the place of articulation of speech sounds is conveyed more robustly by vision. These factors explain why multisensory speech perception is more robust and accurate than unisensory perception. The neural networks responsible for this perceptual activity are diverse and still not well understood.

Keywords: sensory modalities, spoken language, multisensory speech perception

Multisensory Integration: Cross-Talk Between the Senses

Evolution has equipped organisms with multiple senses, each one conveying a unique and particular aspect of the outside world. For example, when we eat we not only perceive gustatory and olfactory information of the food (Small et al., 2004), but we also often appreciate its visual appearance (imagine having to eat a green steak!), feel its texture in our mouths (i.e., somatosensation), or even hear the sounds it produces when we chew it. One important challenge facing researchers is to understand how the brain orchestrates the processing of all this multisensory information, synthesizing the massive and constant influx into coherent representations to produce a single perceptual reality.


The brain’s ability to combine sensory information from multiple modalities into a single, unified percept is a key feature of organisms’ successful interaction with the external world. Psychophysical studies have demonstrated that multisensory integration results in perceptual enhancement by reducing ambiguity and therefore enhancing our ability to react to external events (Ernst & Banks, 2002; Welch & Warren, 1986). Furthermore, given that information specified in each sensory modality reflects different features of the same stimulus, the integration of sensory inputs provides complementary information about the environment, increasing the likelihood of its accurate identification (O’Hare, 1991). In short, multisensory integration produces a much richer and more diverse sensory experience than that offered by each sensory modality in isolation (Stein & Meredith, 1993).

Whereas multisensory redundancy is known to be highly advantageous, its benefits can only arise if the information carried along the different modalities is perceived as belonging to the same external object or event. How this process is accomplished, given that information from the various sensory systems and transducers differs in resolution, neural processing time, and type of physical energy, has been the focus of much behavioral and neuroscientific research (see Calvert et al., 2004, for a review). Across all combinations of sensory modalities, both structural factors, such as temporal and spatial proximity, and cognitive factors, such as semantic congruency (i.e., “appropriateness” of intermodality match), have been defined as major determinants for co-registration. That is, semantically congruent stimuli occurring at approximately the same time and place are more likely to be treated as originating from a common external source, and therefore to be integrated by the perceptual system (see King, 2005).
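The perceptual enhancement described in this section is often formalized as reliability-weighted cue combination, the maximum-likelihood account associated with Ernst and Banks (2002). The sketch below is illustrative only and is not taken from this chapter: it assumes that each modality delivers an independent, Gaussian-distributed estimate of the same quantity, and the function name `combine_cues` and the numbers in the usage note are hypothetical.

```python
def combine_cues(estimate_a, variance_a, estimate_v, variance_v):
    """Combine two unimodal estimates by inverse-variance (reliability) weighting.

    Assumption (not from the chapter): the auditory and visual estimates are
    independent and Gaussian, so the statistically optimal combination is a
    weighted average with weights proportional to each cue's reliability.
    """
    reliability_a = 1.0 / variance_a
    reliability_v = 1.0 / variance_v
    weight_a = reliability_a / (reliability_a + reliability_v)
    weight_v = 1.0 - weight_a

    combined_estimate = weight_a * estimate_a + weight_v * estimate_v
    # The combined variance is never larger than the smaller unimodal variance.
    combined_variance = (variance_a * variance_v) / (variance_a + variance_v)
    return combined_estimate, combined_variance


if __name__ == "__main__":
    # Hypothetical values: a noisy auditory cue and a more reliable visual cue.
    estimate, variance = combine_cues(10.0, 4.0, 14.0, 1.0)
    print(estimate, variance)
```

Under these assumptions, an auditory estimate of 10.0 with variance 4.0 and a visual estimate of 14.0 with variance 1.0 combine to roughly 13.2 with variance 0.8: the combined estimate sits closer to the more reliable cue, and its variance is lower than that of either cue alone.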
Interpersonal communication provides a rich perceptual environment with information being available across the different sensory modalities. Although the auditory signal by itself is often sufficient for accurate speech comprehension, in face-to-face conversations the sight of the talker conveys an additional source of information that can be actively used by the listener during speech perception. Speech perceivers are, in the words of Carol Fowler (2004, p. 189), “informational omnivores”: When we see someone speaking, our processing is not limited to the linguistic analysis of the words, but also encompasses the perception of visible speech movements and other nonverbal information such as facial expression, body movements, posture, gaze, manual gestures, and the tone and the timing of the voice. This broad spectrum of audiovisual information is produced in parallel by the talker and must be processed and integrated by the listener in order to grasp the talker’s full intent. Thus human communication, in its most natural form (i.e., face to face), is multisensorial and multidimensional (Figure 26.1).


Figure 26.1 A portrait of Helen Keller (seated on the left), Alexander Graham Bell (seated on the right), and Anne Sullivan (standing). This remarkable photograph shows communication happening simultaneously in different modalities. Bell is listening to Sullivan while Keller performs tactile lip reading. Bell and Sullivan are also watching Keller, who is also using tactile finger spelling with Bell. From Parks Canada; Cultural Heritage Image Gallery.

Given the multiple levels of audiovisual speech, a comprehensive explanation of what information is available for speech requires a more specific statement of what unit of communication is being examined. For example, if the level of communication is the message and its meaning, then global features, such as posture and tone of speech, may be important. At times, how you say something, not what you say, conveys the important information. At a more micro level, the unit might be the individual sound or word, a dimension that, again, can be acquired acoustically (i.e., speech sounds) as well as optically (i.e., articulatory movements). To a great extent, this more fine-grained analysis of speech has been the major focus of research, and it will be the level at which we consider the existing literature.

Speech Production: Inherent Link Between Articulatory Movements and Vocal Acoustics

The production of speech is a mixture of cognitive-linguistic planning and the biophysics of sound production. Both levels leave their trace in the acoustics of speech. Since the middle of the last century, research has depicted sound production as an interaction between the sources of sound and their filtering by the vocal tract resonances. For vowels and some consonants, the primary source of sound is the vocal folds vibrating. As the sound propagates through the vocal tract, the spatial configuration of the articulator structures (e.g., velum, palate, tongue, lips) creates resonances that give each sound its final characteristic spectral shape. Thus, when talking, the articulators must move to achieve the spatial-temporal configurations necessary for each speech sound. Such time-varying configurations of the vocal tract not only deform the immediate regions around the oral aperture but also encompass a much larger region of the face (Preminger et al., 1998; Thomas & Jordan, 2004; Yehia et al., 1998). Because some of the articulators of the oral aperture are visible along with their distributed correlates around the face, there is an inherent relationship between the acoustic speech sounds and their visible generators.
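As a rough illustration of the source-filter account described above, the following Python sketch passes an impulse train (standing in for vocal fold vibration) through two resonators (standing in for vocal tract resonances). The fundamental frequency, the resonance frequencies and bandwidths, the sampling rate, and all function names are illustrative assumptions and are not taken from the studies cited here.

```python
import math

def impulse_train(f0_hz, fs_hz, n_samples):
    """Crude glottal source: one unit pulse per fundamental period."""
    period = round(fs_hz / f0_hz)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def resonator(signal, centre_hz, bandwidth_hz, fs_hz):
    """Two-pole resonator, a simple stand-in for one vocal tract resonance."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)      # pole radius
    theta = 2.0 * math.pi * centre_hz / fs_hz          # pole angle
    a1, a2 = 2.0 * r * math.cos(theta), -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out

def magnitude_at(signal, freq_hz, fs_hz):
    """Magnitude of the signal's Fourier component at a single frequency."""
    w = 2.0 * math.pi * freq_hz / fs_hz
    re = sum(x * math.cos(w * n) for n, x in enumerate(signal))
    im = sum(x * math.sin(w * n) for n, x in enumerate(signal))
    return math.hypot(re, im)

FS = 8000                                   # sampling rate in Hz (assumed)
source = impulse_train(100, FS, 4000)       # 100 Hz source, 0.5 s (assumed)
# Assumed resonances at 700 Hz and 1200 Hz, each 100 Hz wide.
vowel = resonator(resonator(source, 700, 100, FS), 1200, 100, FS)

for f in (300, 700, 2500):
    print(f, round(magnitude_at(source, f, FS), 1), round(magnitude_at(vowel, f, FS), 1))
```

In this sketch the source carries the same energy at every harmonic, so a peak in the output near 700 Hz reflects the filter alone, which is the sense in which the vocal tract configuration gives each sound its spectral shape.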

Properties and Perception of the Auditory Speech Signal

Spoken language comprehension is one of our most complex and striking cognitive abilities: A noisy, incomplete, and usually ambiguous acoustic waveform must be parsed into linguistic representational units, which will ultimately allow extraction and decoding of the meaning intended by the talker (Pisoni & Luce, 1987). A major challenge in speech perception, therefore, has been to provide an account of how these mappings (i.e., from a continuously varying speech waveform onto discrete linguistic units such as phonemes, and from discrete units to semantics) are accomplished. Following the development of the sound spectrograph¹ (Joos, 1948; Potter et al., 1947), much of the study of speech perception focused on the acoustic invariants of the complex signal underlying the perception of individual sound segments or phonemes (see Raphael, 2005, for a review of this body of work). This research failed to find consistent correspondences between physical properties of the acoustic signal and the perceived phonemes (Miller & Eimas, 1995). Rather, the speech signal is continuous and exhibits a considerable amount of variability (i.e., the “lack of invariance”). One of the sources of this variability is coarticulation: In natural, fluent speech, given the obvious need for rapid transitions from one articulatory configuration to another, the acoustic signal at any particular time reflects not only the current segment being produced but also previous and subsequent segments. For instance, even though /di/ and /du/ audibly share the /d/ phoneme, the acoustical characteristics (e.g., formant transitions) of /d/ vary considerably depending on the following vowel (Liberman et al., 1954).
Furthermore, in addition to coarticulation, differences between talkers’ vocal tracts, speaking rate, and dialect, along with variation in social contexts and environmental characteristics (e.g., reverberation, noise, transmission media), can produce large changes in the acoustic properties of speech (Klatt, 1986). The fact that listeners are able to identify phonemes categorically despite such acoustic/phonemic variability suggests that the acoustic signal is replete with numerous redundant features, which vary with the context of the spoken utterance. Individually, none of these acoustic cues would be necessary or sufficient to signal the phonemic identity (Liberman et al., 1967). For instance, Lisker (1986) identified sixteen possible cues to the opposition of stop consonants (e.g., /b/ and /p/) in an intervocalic lexical context (e.g., rabid vs. rapid). In general, however, a few basic features can be indexed to signal place, manner, and voicing of consonants or frontness versus backness and height of vowels.

The absence of reliable acoustic boundary markers in phoneme segments led a number of researchers to abandon the notion that a phonemic level of representation is activated during speech processing. Alternative accounts proposed, among others, syllables (see Dupoux, 1993, for a review), context-sensitive allophones (Wickelgren, 1969), and articulatory gestures (Fowler, 1986; Liberman et al., 1967; Liberman & Mattingly, 1985)² as the minimal units that listeners could use to parse the incoming input. Some alternative models have even proposed that lexical representations are compared directly with a transformed speech signal, with no intermediate stages of processing (Klatt, 1989). According to the great majority of psycholinguistic models of word recognition, therefore, listeners would extract prelexical representational units (in whatever form) that would then be matched to representations of words stored in long-term memory. These segmental cues would also convey information about a word’s functional, semantic, and syntactic roles that would help the listener to parse and interpret the utterance. Prosodic information (i.e., suprasegmental features such as rhythm, stress, and intonation of speech) is also processed to define word, phrase, and sentence boundaries, stress patterns, and syntactic structure (Cutler & Butterfield, 1992; Soto-Faraco et al., 2001). Besides the linguistic structure of the message, suprasegmental features also provide information about certain pragmatic aspects of the conversational situation, such as the emotional state or communicative intent.
Finally, when considering the mapping from prelexical segments to words stored in the mental lexicon, some models have proposed that information only flows in one direction: from sounds to words, without any backward influence (e.g., autonomous models such as the merge model; Norris et al., 2000). According to other, interactive models, however, the flow of information between prelexical and lexical stages of processing is bidirectional, with top-down feedback from lexical forms to earlier acoustic and phonemic processing contributing to word recognition (e.g., TRACE model; McClelland & Elman, 1986). The debate between autonomous and interactive models of word recognition has become a central topic in the psychology of speech and seems to be far from being resolved (see Norris et al., 1995, for a review).

Properties and Perception of the Visual Speech Signal

In comparison to what is now known about speech perception, the study of the specific contribution of visual information to speech processing in the absence of sound (also called speech reading or lip reading)³ has a more limited history, with far fewer studies of its features and its perception. Relative to auditory speech stimuli that are presented under good listening conditions, the ability to discriminate words from sight alone is generally rather limited (Bernstein et al., 1998) and can be influenced by factors such as linguistic context (e.g., Rönnberg et al., 1998; Samuelsson & Rönnberg, 1993) or training. Although there is great individual variability in the ability to speech-read, some highly skilled individuals are capable of reaching high comprehension levels (e.g., Andersson & Lidestam, 2005).

The impoverishment of visual phonetic signals relative to acoustic phonetic signals is mainly due to the ambiguity of the visual patterns. Even though visual speech contains highly salient cues to certain critical aspects of speech (e.g., place of articulation, vowel rounding), other aspects of speech (e.g., manner of articulation, voicing) are realized by articulators usually hidden from view (MacLeod & Summerfield, 1987). For example, the vibrations of the vocal folds, a critical feature to distinguish voiced and unvoiced consonants, cannot be perceived through the visual modality (Jackson, 1988; though see Mayer et al., 2011). Furthermore, those phonemes that can be easily seen are often indistinguishable from each other because the places of articulation may be very closely located (Lidestam & Beskow, 2006). As a result, only clusters of phonemes can be visually distinguished from each other. These are referred to in the literature as visemes (Fisher, 1968). Differences in clarity of articulation across talkers and the external conditions in which the sounds are produced influence the visual information that is present. Thus, there is a lack of agreement among researchers regarding the exact grouping of phonemes into visemes. Depending on the study, the approximately forty-two phonemes of American English have been grouped into as few as five to as many as fifteen visemes (see Jackson, 1988). Some clusters, however, have been consistently defined in all these studies.
For example, the phonemes /p/, /b/, and /m/ (bilabial group) are articulated at the same place (lips) and appear the same visually (Auer & Bernstein, 1997; Massaro, 1998; Summerfield, 1987). Consonants and vowels have been shown to map onto distinct visemes (Campbell & Massaro, 1997; Owens & Blazek, 1985; Rosenblum & Saldaña, 1998).

Because of this high degree of visual confusability, one would expect many words to be indistinguishable from each other on the basis of the visual information alone (a phenomenon called homophony; Berger, 1972; Nitchie, 1916). However, some studies have shown that a reduction in phonemic distinctiveness does not necessarily imply a loss of word recognition in visual speech perception. Indeed, lexical intelligibility during speech reading is much better than would be expected on the basis of the visemic repertoire alone (Auer, 2002; Auer & Bernstein, 1997; Bernstein et al., 1997; Mattys et al., 2002). Additional information such as knowledge of phonotactic constraints (i.e., phoneme patterns that constitute words in the language; Auer & Bernstein, 1997), reduced lexical density (i.e., the degree of visual similarity to other words in the lexicon), high frequency of occurrence (Auer, 2009), and semantic predictability (Gagné et al., 1991) may allow individuals to identify a word even in a reduced phonetic representation.

Visual primitives of speech have also been described based on time-varying features of articulation (see Jackson, 1988; Summerfield, 1987). For instance, Rosenblum & Saldaña (1996) showed that isolated visual time-varying information of the speech event is sufficient for its identification. In fact, recent research suggests that the information that can be retrieved from visual speech may be much more detailed than previously thought.

Growing evidence shows that subtle jaw, lip, and cheek movements can be perceived by a viewer, thus allowing finer distinctions within gestures belonging to the same viseme (Vatikiotis-Bateson et al., 1996; Yehia et al., 2002). This suggests that visible speech features can be described along kinematic as well as static dimensions, a point to which we shall return later (see the section Correlated and Complementary Nature of Facial Movements and Vocal Acoustics).

Besides providing information about phonemes, visual speech cues have been shown to improve the recognition of prosodic aspects of the message. For example, phonologically similar languages can be discriminated on the basis of the visual information alone, possibly owing to differences in the rhythmic pattern of the languages (Soto-Faraco et al., 2007; Weikum et al., 2007). Similarly, other prosodic dimensions, such as emphatic stress, sentence intonation (Fisher et al., 1969), and pitch changes associated with lexical tone (Burnham et al., 2000), can be processed by using visual cues alone, sometimes even when the lower part of the face is occluded (Cvejic et al., 2010; Davis & Kim, 2006). For instance, eyebrow raising (Granström et al., 1999) or eye widening (Massaro & Beskow, 2002) can serve as an independent prosodic cue to prominence, and observers preferentially look at these regions in prosody-related judgments (Buchan et al., 2004; Lansing & McConkie, 1999). In fact, visual cues such as movements of the head and eyebrows have been shown to correlate with the basic cues for prosody in the auditory domain, namely changes in voice pitch, loudness, or duration (Foxton et al., 2009; Munhall et al., 2003). However, the visual contribution to the prosodic aspects of the message is not necessarily limited to the upper part of the face. That is, head movements also include parts of the lower face (i.e., chin; Figure 26.2). The neck and chin movements can provide information about the identity of lexical tones (e.g., Chen & Massaro, 2008), and the magnitude of mouth movements can possibly be a cue for the perception of loudness (i.e., amplitude). Therefore, the different regions of the face can be informative for various dimensions and to various degrees.
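The viseme grouping and the resulting visual homophony discussed in this section can be made concrete with a small Python sketch. Only the bilabial group (/p/, /b/, /m/) comes from the text; the other viseme classes, the toy lexicon, and the function names are hypothetical and chosen purely for illustration, since the literature does not agree on a single grouping.

```python
# Toy phoneme-to-viseme table. Only the bilabial class is taken from the text;
# the remaining classes are illustrative assumptions.
VISEME_OF = {
    "p": "BILABIAL", "b": "BILABIAL", "m": "BILABIAL",
    "f": "LABIODENTAL", "v": "LABIODENTAL",
    "t": "ALVEOLAR", "d": "ALVEOLAR", "n": "ALVEOLAR",
    "a": "OPEN_VOWEL", "i": "SPREAD_VOWEL", "u": "ROUNDED_VOWEL",
}

def viseme_string(phonemes):
    """Map a phoneme sequence onto the sequence of viseme classes a viewer sees."""
    return tuple(VISEME_OF[p] for p in phonemes)

def homophonous_groups(lexicon):
    """Return the sets of words that collapse onto the same viseme sequence."""
    groups = {}
    for word, phonemes in lexicon.items():
        groups.setdefault(viseme_string(phonemes), []).append(word)
    return [sorted(words) for words in groups.values() if len(words) > 1]

# Hypothetical mini-lexicon with simplified transcriptions.
LEXICON = {
    "pat": ["p", "a", "t"],
    "bat": ["b", "a", "t"],
    "mat": ["m", "a", "t"],
    "fat": ["f", "a", "t"],
    "bit": ["b", "i", "t"],
}

print(homophonous_groups(LEXICON))
```

Under these assumed classes, "pat", "bat", and "mat" look alike on the lips while "fat" and "bit" remain visually distinct, which shows how the lexicon, rather than the viseme inventory alone, determines how many words are actually confusable.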

Visual Contributions to Speech Intelligibility

Figure 26.2 Two-dimensional plots of the motion of seventy markers on the face during speech production. Panel A shows the motion in space that results from the combined face and head motion. Panel B shows the facial motion with the head motion removed.

Figure 26.3 Perception of speech in noise for auditory-only presentation (blue line) and audiovisual presentation (black line) in a closed-set task with thirty-two words. Adapted from Sumby & Pollack, 1954. Reprinted with permission from Acoustical Society of America.

When speech is perceived bimodally, such as audiovisually or even visuotactually (see later section, Tactile Contributions to Speech), perception is often enhanced in a synergistic fashion, especially if the auditory input is somehow degraded. That is, speech-reading performance in bimodal conditions is better than the pooled performance from the unimodal conditions (Summerfield, 1987). In the first known experimental demonstration of this phenomenon (Cotton, 1935), the voice of a talker speaking inside a dark sound booth was filtered and masked with a loud buzzing noise, rendering it almost unintelligible. However, when the lights of the sound booth were switched on and participants could see the talker’s face, they correctly reported most of the words. Almost 20 years later, these findings were quantified by Sumby and Pollack (1954), who showed that when the intelligibility of acoustic speech is impoverished by adding noise, the concurrent presentation of its corresponding visual speech cues can improve comprehension to a degree equivalent to increasing acoustic signal-to-noise ratio by 15 to 20 dB (see also Rosenblum et al., 1996; Ross et al., 2006; Figure 26.3). Because this gain in performance is so large, the combination of multiple sources of information about speech has the potential to be extremely useful for listeners with hearing impairments (Berger, 1972). Indeed, the benefits derived from having access to the speaker’s facial speech information have been documented in listeners with mild to more severe types of hearing loss (Payton et al., 1994; Picheny et al., 1985) and even deaf listeners with cochlear implants (Rouger et al., 2007, 2008). Nonetheless, the visual contribution to speech perception has been demonstrated even in circumstances in which the auditory signal is not degraded.
For example, novice speakers of a second language often report that face-to-face conversation is easier than situations without visual support (Navarra & Soto-Faraco, 2007; Reisberg et al., 1987). Similarly, Reisberg et al. (1987) demonstrated that when listening to perfectly audible messages from a speaker with a heavy foreign accent or to a passage with difficult semantic content (e.g., Kant’s Critique of Pure Reason), the availability of the visual information enhances comprehension (see also Arnold & Hill, 2001).

This synergy effect in bimodal stimulation contexts may be attributed to at least two separate mechanisms. First, the perceiver might exploit time-varying features common to the physical signals of both the acoustical and the visual input. That is, in addition to matching onsets and offsets, the concurrent streams of auditory and visual speech information are usually invariant in terms of their tempo, rhythmical patterning, duration, intensity variations, and even affective tone (Lewkowicz et al., 2000). These common (“amodal”) properties have been proposed to be the critical determinants for integration by some models of audiovisual speech perception (Rosenblum, 2008; Studdert-Kennedy, 1989; Summerfield, 1987).⁴ Second, the auditory and visual signals are fused to a percept that is more than the sum of its parts. That is, because each modality contains partially independent phonological cues about the same speech event, recognition is boosted when both sources of information are available.
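To give a sense of scale for the 15 to 20 dB equivalence reported by Sumby and Pollack (1954), the following Python sketch converts a gain in decibels into the corresponding ratios of signal power and amplitude. It relies only on the standard definition of the decibel; the function names are illustrative.

```python
import math

def snr_db(signal_rms, noise_rms):
    """Signal-to-noise ratio in dB from root-mean-square amplitudes."""
    return 20.0 * math.log10(signal_rms / noise_rms)

def db_to_power_ratio(gain_db):
    """Factor by which signal power grows (relative to noise) for a dB gain."""
    return 10.0 ** (gain_db / 10.0)

def db_to_amplitude_ratio(gain_db):
    """Factor by which signal amplitude grows (relative to noise) for a dB gain."""
    return 10.0 ** (gain_db / 20.0)

for gain in (15, 20):
    print(gain, "dB:",
          round(db_to_power_ratio(gain), 1), "x power,",
          round(db_to_amplitude_ratio(gain), 1), "x amplitude")
```

On this definition, a 15 dB improvement corresponds to roughly a 32-fold increase in signal power relative to the noise, and a 20 dB improvement to a 100-fold increase.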

Correlated and Complementary Nature of Facial Movements and Vocal Acoustics

As the act of talking unfolds, the movement of articulators to produce the acoustic output and the speech-related facial motion results in a structural coupling between auditory and visual events. Whereas the primary locus of visual speech information is around the mouth and jaw, owing to their principal role in speech sound generation, the motion of articulation spreads across the entire face (Vatikiotis-Bateson et al., 1996). Observation of the articulators during speech can give direct information regarding not only the place of articulation but also the onset, offset, and rate of change of speech, as well as the overall amplitude contour (Grant & Seitz, 2000) and the spectral properties of the acoustic signal (Grant, 2001). In fact, investigators have found robust correlations between the spatiotemporal characteristics of vocal tract configurations, visual facial information, and the acoustic output. For instance, Yehia et al. (1998) showed that the spectral envelope of the speech signal can be estimated with more than 90 percent accuracy simply by tracking the position of the talker’s moving head or even by the motion of the tongue alone (an articulator that is not necessarily coupled with the face). Similarly, Munhall et al. (2004) showed a kinematic–acoustic relation between head motion alone (i.e., with no facial motion) and the pitch (fundamental frequency) and amplitude root mean square (RMS) of the speech sound during natural speech production. Chandrasekaran et al. (2009) computed the frequency spectra of the envelopes of the auditory signal and the corresponding mouth area function to identify the temporal structure of the auditory envelope and the movement of the mouth, and found that audiovisual speech has a stereotypical rhythm that is between 2 and 7 Hz (see also Ohala, 1975).
Given that there is so much coherence between speech production components, it makes sense that perceivers are sensitive to this structure and benefit from the time-varying redundancies across modalities to decode the spoken message more reliably (Grant & Seitz, 2000; Schwartz et al., 2002). This redundancy may permit early modulation of audition by vision. For example, the visual signal may amplify correlated auditory inputs (Schroeder et al., 2008). Grant and Seitz (2000) showed that the ability to detect the presence of auditory speech in a background of noise improved by seeing a face articulating the same utterance. Critically, this benefit depended on the correlation between the sound intensity (RMS energy in the mid- to high-frequency energy envelope) and the articulator movement (see also Bernstein et al., 2004; Eskelund et al., 2011; Kim & Davis, 2004). Similar detection advantages have been recently reported in the opposite direction, that is, when the visual input is presented in noise. Kim et al. (2010) showed that the detection of a point-light talking face among dynamic visual noise is improved when a temporally correlated, noninformative, auditory speech stream is present.
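The kind of audiovisual correlation described above, between the RMS intensity envelope of the sound and the movement of the articulators, can be sketched with synthetic data. In the Python example below, a hypothetical 4 Hz mouth-opening cycle modulates a 200 Hz tone; the RMS envelope of the tone is then correlated with the mouth trajectory. The signals, rates, frame length, and function names are all assumptions made for illustration and do not reproduce the analyses of the studies cited.

```python
import math

FS = 8000           # audio sampling rate in Hz (assumed)
FRAME = 200         # 25 ms analysis frames (assumed)
DURATION_S = 2.0

def rms_envelope(samples, frame_len):
    """RMS amplitude of consecutive, non-overlapping frames."""
    env = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        env.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return env

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def opening(t, phase=0.0):
    """Hypothetical mouth-opening trajectory: a 4 Hz cycle scaled to 0..1."""
    return 0.5 * (1.0 + math.sin(2.0 * math.pi * 4.0 * t + phase))

n = int(FS * DURATION_S)
# Synthetic "speech": a 200 Hz tone whose amplitude follows the mouth opening.
audio = [opening(i / FS) * math.sin(2.0 * math.pi * 200.0 * i / FS) for i in range(n)]
envelope = rms_envelope(audio, FRAME)

centres = [(k + 0.5) * FRAME / FS for k in range(len(envelope))]
mouth_matched = [opening(t) for t in centres]                 # in step with the sound
mouth_shifted = [opening(t, math.pi / 2.0) for t in centres]  # a quarter cycle out of step

r_matched = pearson_r(envelope, mouth_matched)
r_shifted = pearson_r(envelope, mouth_shifted)
print(round(r_matched, 3), round(r_shifted, 3))
```

When the mouth trajectory is in step with the sound the correlation is high, and when it is displaced by a quarter cycle the correlation falls to around zero; this is only a toy analogue of the finding that detection benefits depend on how well sound intensity and articulator movement covary.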

Figure 26.4 Confusion trees for (A) lip reading and (B) auditory perception of speech. At the top of each tree, all of the consonants can be distinguished. As the branches are followed downward, the tree displays the common confusions that are made in each modality as (A) the degree of visual distinctiveness decreases or (B) signal-to-noise level decreases. Reprinted with permission from Summerfield, 1987.

It has been suggested that the redundancy of information perceived from the talking head, in particular in the common dynamic properties of the utterance, may be the metric listeners use for perceptual grouping and phonetic perception (Munhall et al., 1996; Summerfield, 1987). In keeping with this idea, several studies have shown that the visual enhancement of speech perception depends primarily on dynamic rather than static characteristics of facial images. For example, Vitkovitch and Barber (1994) have demonstrated that speech-reading accuracy for auditory dynamic noise decreases as the frame rate (temporal resolution) of the video of the speaker’s face drops below 16 Hz. Furthermore, the temporal characteristics of facial motion can enhance phonetic perception, as demonstrated by use of dynamic point-light displays (Rosenblum & Saldaña, 1996). Neuropsychological data confirm the specific role of the dynamic, time-varying characteristics on audiovisual speech integration. Munhall et al. (2002) presented dynamic and static (i.e., single frame) visual vowels to a patient who had suffered selective damage to the ventral stream, a brain region involved in the discrimination of forms such as faces (a deficit known as agnosia). The authors found that, whereas the patient was unable to identify any speech gestures from the static photographs, she did not differ from controls in the dynamic condition (see also Campbell, 1997, for a similar case). Overall, these results suggest that local motion cues are critical for speech recognition.

The second way in which visible speech influences auditory speech is by providing complementary information. In this case, vision provides stronger cues than the auditory signal, or even information that is missing from the auditory signal. As Summerfield (1987) pointed out, the speech units that are most confusable visually are not those that are most confusable with auditory stimuli (Figure 26.4).
In Miller and Nicely’s (1955) classic study of consonant perception in noise, place of articulation is one of the first things to become difficult to perceive as the signal-to-noise ratio decreases. The auditory cues regarding manner of consonants (e.g., stops vs. fricatives) are less susceptible to noise, while voicing and the presence of nasality are even more robust. These latter cues are very weakly conveyed visually, if at all. Thus, audio and visual speech cues are complementary; visual cues may be superior in conveying information about the place of articulation (e.g., at the lips vs. at the back of the mouth), whereas auditory cues may be more robust for conveying other phonetic information, such as the manner of articulation and voicing. The visual input, therefore, provides redundant cues to reinforce the auditory stimulus and can also be used to disambiguate some speech sounds with quite similar acoustics, such as /ba/ versus /da/, which differ in place of articulation. The most famous multisensory speech phenomenon, the McGurk effect, probably has its roots in this difference in information strength.

Classic Demonstration of Audiovisual Speech Integration: McGurk Effect

Figure 26.5 Schematic showing the McGurk effect. In the standard version of the illusion, the face is presented saying the syllable /ga/ while the auditory syllable /ba/ is played simultaneously. Common perceptions are shown on the right.

The most compelling illustration of the consequences of audiovisual speech integration is the McGurk illusion (McGurk & MacDonald, 1976), which does not involve noisy acoustic conditions or complex messages. In this illusion, observers exposed to mismatched auditory and visual speech signals often experience (i.e., hear) a phoneme different from that originally presented in either modality (Figure 26.5). For instance, in the classic “fusion” version of the illusion, a visual /ga/ consonant is dubbed in synchrony with an acoustic /ba/. This new audiovisual syllable is perceived by most subjects as beginning with a different consonant. Depending on the particular tokens used, most people hear /da/, /tha/, or /ga/. The contradictory visual information alters perception of the acoustic signal, /ba/, and this has been characterized as a fusion of phonetic features. There are a number of important aspects to this illusion. First, the voicing category of the consonant is determined by the auditory signal. Second, the use of /b/ as the auditory consonant is important to the illusion. Although this stop consonant is correctly categorized when presented acoustically, this perception is not necessarily “strong.” The best McGurk illusions result from “weaker” /b/ tokens, that is, tokens for which the acoustic cues regarding place of articulation are more ambiguous. Finally, in the illusion, the perceived place of articulation changes from bilabial to something else. This may result either from the absence of visual bilabial movement, one of the clearest of visual speech cues, or from perceiving the visual information for other consonants. For most talkers, the visual cues for a /g/ are also somewhat ambiguous. Thus, the main visual cue received by the listener is that a nonlabial consonant is being produced. In summary, the illusory percept in the McGurk effect results from the presence of a weak auditory signal and the absence of the expected and complementary signal for /b/. Another version of the McGurk effect, the combination illusion, reinforces this interpretation. In the combination illusion (sometimes called audiovisual phonological fusion; Radicke, 2007; see also Troyer et al., 2010), two stimuli for which the cues are very strong are presented: a visual /ba/ and an acoustic /ga/. In this case, both consonants are perceived as a cluster /bga/. This version of the illusion is more reliable and more consistently perceived than the fusion illusion.

The McGurk effect has been replicated many times with different stimuli under a variety of manipulations (e.g., Green & Gerdeman, 1995; Green et al., 1991; Jordan & Bevan, 1997; MacDonald & McGurk, 1978; Massaro & Cohen, 1996; Rosenblum & Saldaña, 1996), and has often been described as being very compelling and robust. However, after extensive experience in the laboratory with this type of stimuli, one cannot help but notice that the effect and the illusory experience derived from it are not as robust as usually described in the audiovisual speech literature. For instance, it is not uncommon to find that the effect simply fails to occur for utterances produced by some talkers, even when the auditory and visual channels are accurately dubbed (Carney et al., 1999). Moreover, even for those talkers more amenable to producing the illusion, the effect is usually only observed in a percentage of trials over the course of an experiment (Brancazio, 2004; Brancazio & Miller, 2005; Massaro & Cohen, 1983), contrary to what was claimed in the original report (McGurk & MacDonald, 1976, p. 747). Furthermore, even when audiovisual discrepancy is not recognized as such, the phenomenological experience arising from a McGurk-type stimulus is often described as being different from the experience of a naturally occurring equivalent audiovisual event.
On the other hand, not experiencing the illusion does not necessarily mean that there is no influence of vision on audition (Brancazio & Miller, 2005; Gentilucci & Cattaneo, 2005). Brancazio and Miller, for instance, found that a visual phonetic effect involving the influence of visual speaking rate on perceived voicing (Green & Miller, 1985) occurred even when observers did not experience the McGurk illusion. According to the authors, this result suggests that the incidence of the McGurk effect as an index for audiovisual integration may be underestimating the actual extent of interaction between the two modalities. Along the same lines, Gentilucci and Cattaneo (2005) showed that even when participants did not experience the McGurk illusion, an acoustical analysis of the participants’ spoken responses revealed that these utterances were always influenced by the lip movements of the speaker, suggesting that some phonetic features present in the visual signal were being processed.

Another factor that often puzzles researchers is that the overall incidence of the McGurk effect for a given stimulus typically varies considerably across individuals (Brancazio et al., 1999), with some people not experiencing the illusion at all. Whereas these differences have been explained in terms of modality dominance (i.e., individual differences in the weighting of auditory vs. visual streams during integration; Giard & Peronnet, 1999), it is still surprising that some people can weigh the auditory and visual information in such a way that the information provided by one modality can be completely filtered out.

Given that this phenomenon has traditionally been used to quantify the necessary and sufficient conditions under which audiovisual integration occurs (some of these are discussed below), we believe a full understanding of the processes underlying the McGurk effect is required before extrapolating the results to naturally occurring (i.e., audiovisual matching) speech (see also Brancazio & Miller, 2005).

Tactile Contributions to Speech

Besides vision and hearing, the tactile modality has also been shown to be an efficient input channel for processing speech information. Strong support for the capacity to use touch as a communicative sense is provided by individuals born both deaf and blind who, by means of a variety of natural communication methods, have been able to acquire a full range of spoken language abilities. For instance, in tactile finger spelling, the manual alphabet of the local sign language is adapted, so that by placing the palms over the signer’s hands, the deaf-blind person can feel the shape, movement, and location of the different signs. Particularly noteworthy, however, is the Tadoma method of speech perception, which is based on the vibrotactile reception of the articulatory movements and actions that occur during the production of speech. In this method, the hand of the deaf-blind receiver is placed over the speaker’s face in such a way that the little finger, on the throat, detects laryngeal vibration, the ring and middle fingers pick up information on the jaw and cheek movement, the index finger detects nasal resonance, and the thumb detects lip movement and airflow changes (Weisenberger et al., 1989; Figure 26.6). Previous research has demonstrated that high levels of comprehension can be reached by using the Tadoma method. That is, proficient users achieve almost normal communication with this method (i.e., they can track 80 percent of the key words in running speech, at a rate of three syllables per second; Reed et al., 1982, 1985). This suggests that the sense of touch has sufficient capacity for decoding time-varying complex cues in the speech signal.

Figure 26.6 Drawing of an individual communicating using Tadoma. The perceiver places his hand on the face and throat of the talker as she talks and he can perceive her speech at a high level of intelligibility.


A few studies have shown that even untrained hearing individuals can benefit from this form of natural tactile speech reading when speech is presented in adverse listening conditions (Fowler & Dekle, 1991; Gick et al., 2008; Sato et al., 2010). For instance, Gick et al. (2008) found that untrained participants could identify auditory or visual syllables about 10 percent better when they were paired with congruent tactile information from the face. Similarly, Sato et al. (2010) presented participants with auditory syllables embedded in noise (e.g., /ga/), either alone or together with congruent (e.g., /ga/) or incongruent (e.g., /ba/) tactile utterances, the latter forming McGurk-like combinations (e.g., /bga/). The results showed that manual tactile contact with the speaker’s face coupled with congruent auditory information facilitated the identification of the syllables. Interestingly, they also found that although auditory identification was significantly decreased by the influence of the incongruent tactile information, participants did not report combined illusory percepts (e.g., /bda/). Instead, on some trials, they selected the haptically specified event. This result is in line with previous findings by Fowler and Dekle (1991), who showed no evidence of illusory McGurk percepts as a result of audiotactile incongruent pairings (only one of seven tested participants reported hearing a new percept under McGurk conditions). The fact that audiotactile incongruent combinations do not elicit illusory percepts, together with the fairly small audiotactile benefits observed in the matching conditions (in comparison to audiovisual matching conditions), raises theoretical questions regarding the combinative structuring of audiotactile speech information.
That is, these results suggest that rather than genuine perceptual interactions, audiotactile interactions may be the result of postperceptual decisions reached through probability summation of the two sources of information (see Massaro, 2009). Note that if visual and tactile information were not perceptually integrated but rather were processed in parallel and independently of each other, one should observe improvements in syllable identification (because of signal redundancy), but incongruent conditions should never lead to hearing new percepts. This is exactly the pattern of results found in these studies. The finding that integrated processing in speech perception is highly dependent on the modality through which information is initially encoded clashes with a recent hypothesis suggesting that audiotactile speech information is integrated in much the same way as synchronous audiovisual information (Gick & Derrick, 2009, p. 503). In a recent study, Gick and Derrick (2009) paired acoustic speech utterances (i.e., the syllables /pa/ or /ba/) with small bursts of air on participants’ necks or hands and found that participants receiving puffs of air were more likely to perceive both sounds as aspirated (i.e., /pa/). According to the authors, the fact that this effect occurred in untrained perceivers and at body locations unlikely to be reinforced by frequent experience suggests that tactile information is combined in a natural manner with the auditory speech signal. However, it is more likely that this interference is due to postperceptual analysis (Massaro, 2009).
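
The probability-summation account mentioned above can be made concrete with a small numerical sketch. It assumes the simplest textbook form of the idea: two independent channels, a correct response whenever at least one channel alone yields the right answer, and no correction for guessing. The function name and the example rates are hypothetical and are not taken from Massaro (2009) or from any of the studies cited.

```python
def probability_summation(p_auditory: float, p_tactile: float) -> float:
    """Predicted accuracy when either of two independent channels can succeed.

    Assumption (not from the source): the channels are statistically
    independent and there is no guessing correction.
    """
    for p in (p_auditory, p_tactile):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # The observer fails only if both channels fail.
    return 1.0 - (1.0 - p_auditory) * (1.0 - p_tactile)


# Hypothetical identification rates, chosen only for illustration.
p_a, p_t = 0.60, 0.25
combined = probability_summation(p_a, p_t)
print(f"auditory alone: {p_a:.2f}, tactile alone: {p_t:.2f}, combined: {combined:.2f}")
```

Under these assumptions the combined score exceeds either single-channel score, yet nothing in the model can produce a response that neither channel specifies, which is the qualitative pattern the text describes for incongruent audiotactile pairings.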

Tactile Aids

Given the substantial success of Tadoma and other natural methods for tactile communication, researchers have become interested in exploiting this modality as an alternative channel of communication in individuals with severe hearing impairments. The possibility of tactile information working as a speech-reading supplement is grounded on the complementary nature of visual and tactile cues, analogous to the previously described relationship between auditory and visual information in speech (Summerfield, 1987). That is, lip and jaw movements, two of the parameters Tadoma users exploit to extract linguistic information, are available to the sighted lip reader with hearing deficits.

The effort in developing tactile aids to assist speech reading and maximize sensory redundancy started in 1924, when Dr. Robert Gault built an apparatus, a long tube that the speaker placed in front of the mouth and the receiver held in her hands, that delivered unprocessed sound vibrations to the receiver (Gault, 1924). Although Gault’s vibrator transmitted only very limited cues for speech recognition (possibly only the temporal envelope; see Levitt, 1995), it was nevertheless found to be useful to supplement speech reading. The relative success of this early work boosted the development of more sophisticated sensory substitution systems,5 aiming at decoding different aspects of speech signals into patterns of tactile stimulation.

Haptic devices have varied with respect to the type of transducers (electrotactile vs. vibrotactile), the stimulated body site (e.g., finger, hand, forearm, abdomen, and thigh), and the number and configurations of stimulators (e.g., single-channel vs. multichannel stimulation; see Levitt, 1988, for a review). In the single-channel approach, for example, a single transducer directly presents minimally processed acoustic signals, thus conveying global properties of speech such as intensity, rhythm, and energy contours (Boothroyd, 1970; Erber & Cramer, 1974; Gault, 1924; Gault & Crane, 1928; Schulte, 1972).
In multichannel schemes, the acoustic input signal is decomposed into a number of frequency bands, the outputs of which drive separate tactile transducers that present a spectral display along the user’s skin surface. The skin is thus provided with an artificial place mechanism for coding frequencies. In most systems, moreover, the sound energy at a given locus of stimulation cues intensity in the corresponding channel.

Psychophysical evaluations of performance with these methods have usually been done by comparing speech reading alone, the tactile device alone, and speech reading plus the tactile device. These studies have shown that after extensive training, vocoders can provide enough feedback to improve speech intelligibility both when used in isolation and when presented together with visual speech information (Brooks & Frost, 1983; Brooks et al., 1986). Nevertheless, none of these artificial methods has reached the performance achieved through the Tadoma method. Such performance differences may be partly attributed to the superior overall richness that Tadoma displays. Whereas Tadoma conveys a variety of sensory qualities directly tied to the articulation process, tactile devices encode and display acoustic information in an arbitrary fashion (i.e., employing a sensory system not typically used for this sort of information; Weisenberger & Percy, 1995). Furthermore, as pointed out before (see earlier section, Properties and Perception of the Auditory Speech Signal), phonemes are signalled by multiple acoustic cues that vary greatly as a function of the context. It is possible, therefore, that some critical acoustic information is lost in the processing and transduction when using such an apparatus.
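
The multichannel scheme described in this section (split the sound into frequency bands, then let the energy in each band set the intensity of one transducer) can be illustrated with a short sketch. Everything specific here is an assumption made for illustration: the sample rate, the three band edges, the naive discrete Fourier transform, and the function names do not describe any of the devices cited above.

```python
import cmath
import math


def band_energies(frame, sample_rate, band_edges_hz):
    """Return the spectral energy of `frame` inside each [low, high) band."""
    n = len(frame)
    energies = [0.0] * (len(band_edges_hz) - 1)
    # Naive DFT over the non-negative frequencies (adequate for a short frame).
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        for b in range(len(energies)):
            if band_edges_hz[b] <= freq < band_edges_hz[b + 1]:
                energies[b] += abs(coeff) ** 2
                break
    return energies


def drive_levels(energies):
    """Scale band energies to 0..1; each value could set one transducer's intensity."""
    peak = max(energies)
    return [e / peak if peak > 0 else 0.0 for e in energies]


SAMPLE_RATE = 8000                    # Hz, assumed
BAND_EDGES = [100, 500, 1500, 4000]   # Hz, hypothetical three-channel layout

# A 1000-Hz test tone should excite mainly the middle channel.
tone = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(256)]
levels = drive_levels(band_energies(tone, SAMPLE_RATE, BAND_EDGES))
print([round(v, 2) for v in levels])
```

Each position in `levels` stands for one skin site, so frequency is coded by place and energy by intensity. A real aid would process successive frames in real time and would need far more care with filtering and dynamic range than this sketch takes.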


In terms of the information provided, single-channel devices have generally been described as superior in conveying supra-segmental information, such as syllable number, syllabic stress, and intonation (Bernstein et al., 1989; Carney & Beachler, 1986). Multichannel aids, on the other hand, show better performance for tasks requiring the identification of fine-structure phoneme information (both single-item and connected speech; Brooks & Frost, 1983; Plant, 1989; Summers et al., 1997; Weisenberger et al., 1991; Weisenberger & Russell, 1989). Other studies, however, have reported benefits of fairly similar size in extracting segmental and supra-segmental information regardless of the number of channels. For instance, Carney (1988) and Carney and Beachler (1986) compared a single-channel and a twenty-four-channel vibrotactile device in phoneme recognition tasks and reported similar levels of performance in both the tactile aid alone and the speech reading plus tactile aid conditions. Hanin et al. (1988) measured the perception of words in sentences by speech reading with and without tactile presentation of voice fundamental frequency (F0), using both multichannel and single-channel displays. Mean performance with the tactile displays was found to be slightly, but significantly, better than speech reading alone, but no significant differences were observed between the two displays.

Overall, what is clear from these studies is that providing complementary tactile stimulation, in whatever form, is effective in helping hearing-impaired individuals to decode different kinds of linguistic information, such as segmental features (e.g., Yuan et al., 2005), supra-segmental features (e.g., Auer et al., 1998; Bernstein et al., 1989; Grant et al., 1986; Thompson, 1934), closed-set words (Brooks & Frost, 1983), and even connected speech (Miyamoto et al., 1987; Skinner et al., 1989).

Sensory Factors in Audiovisual Speech Integration

Temporal Constraints in Audiovisual Speech Perception

As previously mentioned (see earlier section, Multisensory Integration: Cross-Talk Between The Senses), the temporal synchrony of audiovisual signals provides a powerful cue for linking multisensory inputs (see Vroomen & Keetels, 2010, for a tutorial review). One compelling example of the role that temporal synchrony plays in multisensory speech perception can be seen in the discomfort experienced when stimuli from different modalities occur separated by a large temporal gap, as can happen during television broadcast or video playback in which the face and the voice are out of sync (see Hamilton et al., 2006, for a study of a patient who consistently perceives natural speech in asynchrony). Studies investigating the temporal structure of multisensory events have shown, however, that whereas multisensory signals must be relatively synchronized to be perceived as a single event, strict temporal overlapping is by no means necessary. Instead, there is a temporal window of integration over which asynchronies between sensory modalities are not detected and multisensory effects (i.e., visual enhancement of speech intelligibility and the McGurk effect) are still observed. This temporal window of integration is thought to compensate for small temporal delays between modalities that may arise under natural conditions because of both the physical characteristics of the arriving inputs (i.e., differences in the relative time of arrival of stimuli at the eye and ear) and biophysical differences in sensory information processing (i.e., differences in neural transduction latencies between vision and audition; Spence & Squire, 2003; Vatakis & Spence, 2010).6

In the area of audiovisual speech perception, results suggest that the perceptual system can handle a relatively large temporal offset between auditory and visual speech signals. For example, studies measuring sensitivity to intermodal asynchrony (i.e., judgments based on the temporal aspects of the stimuli) for syllables (Conrey & Pisoni, 2006) or sentences (Dixon & Spitz, 1980) have consistently identified a window of approximately 250 ms over which auditory-visual speech asynchronies are not reliably perceived. Furthermore, this time window is often found to be longer when the visual input precedes the auditory input than when the auditory input precedes the visual. For instance, in one of the first attempts to find the threshold for detection of asynchrony, Dixon and Spitz (1980) presented participants with audiovisual speech streams that became gradually out of sync, and instructed them to press a button as soon as they noticed the asynchrony. They found that the auditory stream had to either lag by 258 ms or lead by 131 ms before the discrepancy was detected.
Later studies using detection methods less susceptible to bias (see Vatakis & Spence, 2006, for a discussion on this)7 have provided a sharper delimitation of this temporal window, showing limits ranging between 40 and 100 ms for auditory-leading stimuli (see Soto-Faraco & Alsius, 2007, 2009; Vatakis et al., 2006) and up to 250 ms for video-leading stimuli (Grant et al., 2004; Soto-Faraco & Alsius, 2009; Vatakis & Spence, 2006; Figure 26.7).
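
The asymmetry of the temporal window described above can be expressed as a simple interval test. The sketch below is only an illustration. The class name and the sign convention (negative stimulus onset asynchrony when the audio leads, positive when the video leads) are assumptions, and the two limits are picked from the ranges quoted in the text rather than from any single study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationWindow:
    """Asymmetric window of tolerated audiovisual asynchrony (hypothetical helper)."""
    max_audio_lead_ms: float   # largest tolerated asynchrony when audio precedes video
    max_video_lead_ms: float   # largest tolerated asynchrony when video precedes audio

    def contains(self, soa_ms: float) -> bool:
        """Assumed convention: soa_ms < 0 means audio leads, soa_ms > 0 means video leads."""
        return -self.max_audio_lead_ms <= soa_ms <= self.max_video_lead_ms


# Illustrative limits: 100 ms audio lead, 250 ms video lead.
window = IntegrationWindow(max_audio_lead_ms=100, max_video_lead_ms=250)

for soa in (-131, -60, 0, 240, 258):
    status = "within window" if window.contains(soa) else "outside window"
    print(f"SOA {soa:+d} ms: {status}")
```

A hard boundary is a simplification. The studies cited report graded changes in detection or fusion rates across asynchronies, so a fuller model would replace `contains` with a smooth psychometric function.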


Figure 26.7 Meta-analysis of the window of perceived synchrony for a range of different asynchronies and different stimuli. SC, syllable categorization; SJ, simultaneity judgment; TJ, temporal order judgment; 2IFC, two-interval forced choice. The white area in the temporal window of integration column corresponds to the stimulus onset asynchrony (SOA) range in which the auditory information is presented before the visual information. The shaded area corresponds to the SOA range in which the visual information is presented before the auditory information. The dark horizontal lines correspond to the window of audiovisual integration.

Consistent findings come from studies estimating the boundaries of this temporal window by quantifying the magnitude of multisensory effects, such as visual enhancement of speech intelligibility (Conrey & Pisoni, 2006; Grant & Greenberg, 2001; McGrath & Summerfield, 1985; Pandey et al., 1986) or the McGurk illusion (Jones & Jarick, 2006; Massaro & Cohen, 1993; Massaro et al., 1996; Miller & D’Esposito, 2005; Munhall et al., 1996; Soto-Faraco & Alsius, 2007, 2009; van Wassenhove et al., 2007). For instance, Munhall et al. (1996) presented McGurk syllables at different asynchronies (i.e., from a 360-ms auditory lead to a 360-ms auditory lag, in 60-ms steps) and found that, although illusory percepts prevailed at small asynchronies, a significant number of fused responses were still observed when the audio track lagged the video track by 240 ms, or when it led it by 60 ms. In a recent report, however, this temporal window was reported to be much wider (480 ms with an audio lag and 320 ms with an audio lead; Soto-Faraco & Alsius, 2009).

According to some researchers, this much greater tolerance for situations in which the audio signal lags the visual than for situations in which the visual signal lags the audio can be explained by perceivers’ experience with the physical properties of the natural world (see Munhall et al., 1996). That is, in audiovisual speech, as in many other natural audiovisual occurrences, the visual information (i.e., the visible byproducts of speech articulation) almost always precedes the acoustic output (i.e., speech sounds). It is conceivable, therefore, that after repeated exposure to visually leading occurrences in the external world, our perceptual system has adapted to tolerate and bind visual-leading multisensory events to a greater extent than audio-leading stimuli. Indeed, this hypothesis is supported by recent studies showing that the temporal window of integration is adaptable in size as a result of perceptual experience. For instance, repeated exposure to temporally misaligned speech stimuli can alter the perception of synchrony or asynchrony (Vatakis & Spence, 2007; see also Fujisaki et al., 2004, and Vroomen et al., 2004, for nonspeech related results).

Furthermore, the ability of the human perceptual system to deal with audiovisual asynchronies appears to vary as a function of the nature of the stimuli with which the system is being confronted. For instance, for stimuli with limited informational structure within either modality (e.g., beeps and flashes; see Hirsh & Sherrick, 1961), the temporal window for subjective simultaneity has been shown to be much narrower, with timing differences of less than 60 to 70 ms being detected (Zampini et al., 2003). When the complexity of multisensory information increases, as in audiovisual speech or highly ecological nonspeech audiovisual events (e.g., musical instruments or object actions; Vatakis & Spence, 2006), the information content (i.e., the semantics and the inherent structure/dynamics that are extended in time) of the unisensory inputs may serve as an additional factor promoting integration (Calvert et al., 1998; Laurienti et al., 2004), and hence lead to a widening of the temporal window for integration (i.e., larger temporal and spatial disparities are tolerated; Vatakis & Spence, 2008).
For speech, some authors have suggested, moreover, that once such complex sensory signals are merged into a single unified percept, the system interprets this multisensory stimulus as having a unique temporal onset and a common external origin (see Jackson, 1953), the “unity assumption hypothesis.” Consequently, the final perceptual outcome of multisensory integration reduces or eliminates the original temporal asynchrony between the auditory and visual signals (Vatakis & Spence, 2007; Welch & Warren, 1980). Support for the unity assumption has primarily come from studies showing that participants’ sensitivity to temporal misalignment in audiovisual speech signals is higher for mismatching audiovisual speech events (e.g., different face-voice gender or McGurk syllables) than for matching ones (Vatakis & Spence, 2007, and van Wassenhove et al., 2007, respectively). Other studies, however, qualified these findings by showing that such informational congruency effects (i.e., wider temporal windows for matching vs. nonmatching stimuli) were observed only for audiovisual speech stimuli, but not for other complex audiovisual events such as musical instruments or object actions (Vatakis & Spence, 2006) and even some vocalizations (Vatakis et al., 2008). This has been used to argue that temporal synchrony perception for speech stimuli may be somewhat special (Radeau, 1994; Vatakis et al., 2008).8

A critical methodological concern in Vatakis and Spence (2007) and van Wassenhove et al. (2007), however, is that mismatch conditions differed not only in the informational content of the stimuli but also in a number of physical (sensory) dimensions (e.g., structural factors; Welch, 1999). As a result, the reported effects may be explained by differences at the level of simple spatial-temporal attributes. In fact, two recent studies evaluating the temporal window of integration for different attributes and perceptual interpretations in a set of identical multisensory objects (thus controlling for low-level factors) have challenged the unity assumption hypothesis (Soto-Faraco & Alsius, 2009; Vroomen & Stekelenburg, 2011). Soto-Faraco and Alsius (2009) explored the tolerance of the McGurk combination effect to a broad range of audiovisual temporal asynchronies, while measuring, at the same time, the temporal resolution across the two modalities involved. Critically, they found that the McGurk illusion can arise even when perceivers are able to detect the temporal mismatch between the face and the voice (Soto-Faraco & Alsius, 2009). This suggests that the final perceptual outcome of multisensory integration does not overwrite the original temporal relation between the two inputs, as the unity assumption would predict. Instead, it demonstrates that the temporal window of multisensory integration has different widths depending on the perceptual attribute at stake.

In the Vroomen and Stekelenburg (2011) study, participants were required to detect asynchronies between synthetically modified (i.e., sine-wave speech) pseudowords and the corresponding talking face. Sine-wave speech is an impoverished speech signal that is not recognized as speech by naïve observers; however, when perceivers are informed of its speech-like nature, they become able to decode its phonetic content. The authors found that, whereas the sound was more likely to be integrated with lip-read speech if heard as speech than as nonspeech (i.e., the magnitude of the McGurk effect was dependent on the speech mode; see also Tuomainen et al., 2005), observers in speech and nonspeech modes were equally sensitive when judging the audiovisual temporal order of the events. This result suggests that previously found differences between speech and nonspeech stimuli were due to low-level stimulus differences, rather than reflecting the putative special nature of speech.
The wider temporal windows of integration for audiovisual speech (as compared with complex nonspeech stimuli or simple stimuli) observed in previous studies can be better explained by the increased low-level time-varying correlations that underlie audiovisual speech. That is, in contrast to simple transitory stimuli, where asynchronies can be detected primarily by temporal onset–offset cues, in slightly asynchronous matching audiovisual speech there is still a fine temporal correlation between sound and vision (Munhall et al., 1996). According to Vroomen and Stekelenburg, this (time-shifted) correlation would induce a form of “temporal ventriloquist” effect (Morein-Zamir et al., 2003; Scheier et al., 1999; Vroomen & de Gelder, 2004), by which the perceived timing of the auditory speech stream would be actively shifted (i.e., “ventriloquized”) toward the corresponding lip gestures, reducing differences in transmission and processing times of the different senses (therefore leading to wider temporal integration windows for this type of stimuli).

Spatial Constraints in Audiovisual Speech Perception

In addition to tolerance of temporal asynchronies, several studies have shown that extensive spatial discrepancy between auditory and visual stimuli does not influence the strength of the McGurk effect (Bertelson et al., 1994; Colin et al., 2001; Fisher & Pylyshyn, 1994; Jones & Jarick, 2006; Jones & Munhall, 1997), unless auditory spatial attention is manipulated (Tiippana et al., 2011). Jones and Munhall (1997) measured the magnitude of the illusion for up to 90-degree sound angles and found that the proportion of auditory-based responses was independent of loudspeaker location. Similar findings were reported by Jones and Jarick (2006), who measured the illusion under both temporal (from -360 to 360 ms) and spatial (five different locations) discrepancies. They found no indication of an additive relationship between the effects of spatial and temporal separations. In other words, the McGurk illusion was not reduced by the combination of spatial and temporal disparities.

Besides measuring the magnitude of the McGurk illusion, a few studies also instructed participants to make perceptual judgments of the relative location of the auditory stimulus (i.e., point to the apparent origin of the speech sounds; Bertelson et al., 1994; Driver, 1996; Jack & Thurlow, 1973). These studies showed that observers’ judgments of speech sound locations are biased toward the visual source, the so-called ventriloquist illusion (Connor, 2000; see Bertelson & de Gelder, 2004, for studies examining the effect in nonassociative, simple stimuli). This illusion, a classic illustration of multisensory bias in the spatial domain, underlies our perception of a voice emanating from actors appearing on the screen when, in reality, the soundtrack is physically located elsewhere. Just as with the temporal ventriloquist described earlier, the spatial ventriloquist effect is thought to occur as a result of our perceptual system assuming that the co-occurring auditory information and visual information have a single spatial origin. An interesting characteristic of the ventriloquist illusion is that vision tends to dominate audition in the computation of the location of the emergent percept.
This makes functional sense, considering that, in our daily life, we usually assign the origin of sounds, which may be difficult to localize, especially in noisy or reverberant conditions, to events perceived visually, which can be localized more accurately (Kubovy, 1988; Kubovy & Van Valkenburg, 1995).

Bertelson et al. (1994) measured both the ventriloquist and the McGurk illusions for the very same audiovisual materials in one experiment. On each trial, participants were presented with an ambiguous fragment of auditory speech, delivered from one of seven hidden loudspeakers, together with an upright or inverted face shown on a centrally located screen. They were instructed to do a localization task (i.e., point to the apparent origin of the speech sounds) and an identification task (i.e., report what had been said). Whereas spatial separations did not reduce the effectiveness of the audiovisual stimuli in producing the McGurk effect, the ventriloquist illusion decreased as the loudspeaker location moved away from the face (see Colin et al., 2001; Fisher & Pylyshyn, 1994; Jones & Munhall, 1997, for similar results). The inverted presentation of the face, in contrast, had no effect on the overall magnitude of the ventriloquist illusion, but it did significantly reduce the integration of auditory and visual speech (see Jordan & Bevan, 1997). This reversed pattern suggests that the two phenomena can be dissociated and, perhaps, involve different components of the cognitive architecture. Note, moreover, that such dissociation is also in disagreement with the unity assumption hypothesis described above because it suggests that in some trials, both the unified percept (i.e., the McGurk percept) and low-level sensory discrepancies (i.e., the spatial origin) can be perceived by participants.


The findings that the McGurk illusion is impervious to spatial discrepancies suggest that the spatial rule for multisensory integration (i.e., enhanced integration for closely located sensory stimuli) does not apply in the specific case of audiovisual speech perception unless the task involves judgments regarding the spatial attributes of the event. Time-varying similarities in the patterning of information might prove, in this case, a more salient feature for binding (Calvert et al., 1998; Jones & Jarick, 2006; Jones & Munhall, 1997).

Extraction of Visual Cues in Audiovisual Speech

The study of the visual aspects of speech perception has generally been addressed from two perspectives. Whereas some researchers have explored the essential visual input required for successful processing of linguistic information (Munhall et al., 2004; Preminger et al., 1998; Rosenblum & Saldaña, 1996; Thomas & Jordan, 2004),9 others have investigated how the observer actively selects this information by examining the pattern of eye movements during speech (e.g., Lansing & McConkie, 1994; Vatikiotis-Bateson et al., 1998).

Numerous studies attempting to isolate the critical aspects of the visual information have been carried out by occluding (or freezing) different facial regions and measuring the impact of the nonmasked areas on speech reading. These studies have generally shown that the oral region (i.e., talker’s mouth) offers a fairly direct source of information about the segmental properties of speech (Summerfield, 1979). For instance, Thomas and Jordan (2004) showed that the intelligibility of an oral-movements display was similar to that of a whole-face movements display (though see IJsseldijk, 1992). However, linguistically relevant information can also be extracted from extraoral facial regions when the oral aperture is occluded (Preminger et al., 1998; Thomas & Jordan, 2004), probably owing to the strong correlation between oral and extraoral movements described above (Munhall & Vatikiotis-Bateson, 1998). Furthermore, visual speech influences on the auditory component have also been shown to remain substantially unchanged across horizontal viewing angles (full face, three-quarter, profile; Jordan & Thomas, 2001), rotations in the picture plane (Jordan & Bevan, 1997), or when removing the color of the talking face (Jordan et al., 2000).
Even when the facial surface kinematics are reduced to the motion of a collection of light points, observers still show a perceptual benefit in an acoustically noisy environment and can perceive the McGurk effect (Rosenblum & Saldaña, 1996). This result suggests that pictorial information such as skin texture does not portray critical information for audiovisual improvements (although note that performance never reached the levels found in natural displays), and it highlights the importance of motion cues in speech perception.

However, is time-varying information of the seen articulators sufficient for audiovisual benefits to be observed? In Rosenblum’s study, the patch-light stimuli covered extensive regions of the face and the inside of the mouth (tongue, teeth), and the angular deformations of the point-lights could have been used to reveal the local surface configuration. It is possible, therefore, that such point-lights did not completely preclude configural information of the face forms (i.e., spatial relations between features), and that these cues were used together with the motion cues to support speech identification. Indeed, other studies have found that configural information of the face is critical for audiovisual benefits and integration to be observed. For instance, Campbell (1996) demonstrated that the McGurk effect can be disrupted by inverting the brightness of the face (i.e., photonegative images), a manipulation that severely degrades visual forms of the face while preserving time-varying information. In the same vein, in a speech-in-noise task, Summerfield (1979) presented auditory speech stimuli together with four point-lights tracking the motion of the lips (center of top and bottom lips and the corners of the mouth) or with a Lissajous curve whose diameter was correlated with the amplitude of the audio signal, and found no enhancement whatsoever. Similarly, auditory speech detection in noise is not facilitated by presenting a fine-grained correlated visual object (e.g., a dynamic rectangle whose horizontal extent was correlated with the speech envelope; also see Bernstein et al., 2004; Ghazanfar et al., 2005). This suggests that both the analysis of visual form and analysis of the dynamic characteristics of the seen articulators are important factors for audiovisual integration. Further studies are required to determine the contribution of each of these sources to speech identification.

The image quality of the face has also been manipulated by using various techniques that eliminate part of the spatial frequency spectrum (Figure 26.8). Studies using this type of procedure suggest that fine facial detail is not critical for visual and audiovisual speech recognition. For instance, Munhall et al. (2004) degraded images by applying different bandpass and low-pass filters and found that the filtered visual information was sufficient for attaining a higher speech intelligibility score than that of auditory-only signal presentation. Results showed that subjects had the highest levels of speech intelligibility in the midrange filter band with a center spectral frequency of 11 cycles per face, but that the band with 5.5 cycles per face also significantly enhanced intelligibility. This suggests that high spatial frequency information is not needed for speech perception. In this line, other studies have shown that speech perception is reduced, but remains effective, when facial images are spatially degraded by quantization (e.g., Campbell & Massaro, 1997; MacDonald et al., 2000), visual blur (Thomas & Jordan, 2002; Thorn & Thorn, 1989), or increased stimulus distance (Jordan & Sergeant, 2000).


Figure 26.8 Three versions of the same image of a talker are shown with different amounts of spatial frequency filtering. The image on the left contains only very-low-frequency content and would provide minimal visual speech information. The middle image has a higher spatial frequency filter cutoff but is still quite blurry. Video images with this degree of filtering have been found to enhance the perception of auditory speech in noise as much as unfiltered video like the image on the right.
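As a rough illustration of what a filter cutoff expressed in cycles per face means, the following Python sketch converts such a cutoff into cycles per image row and applies an ideal low-pass filter to a single row of pixel values. It is a minimal sketch under stated assumptions, not the procedure used in the studies cited above: the function names, the one-dimensional treatment, the brick-wall cutoff, and the example widths are all hypothetical.

```python
import cmath
import math


def cycles_per_face_to_cycles_per_row(cycles_per_face, face_width_px, row_width_px):
    """Convert a cutoff in cycles per face into cycles per full image row.

    Assumes (hypothetically) that the face spans face_width_px of a row
    that is row_width_px wide.
    """
    return cycles_per_face * row_width_px / face_width_px


def low_pass_row(row, cutoff_cycles):
    """Ideal (brick-wall) low-pass filter of one pixel row via a naive DFT."""
    n = len(row)
    # Forward DFT: spectrum[k] holds the component with k cycles per row.
    spectrum = [
        sum(row[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Bins k and n - k carry the same spatial frequency.
        if min(k, n - k) > cutoff_cycles:
            spectrum[k] = 0
    # Inverse DFT; the imaginary part is numerical noise for real input.
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * x / n) for k in range(n)) / n).real
        for x in range(n)
    ]


# Made-up example: a 32-pixel row in which the face covers 16 pixels.
n = 32
coarse = [math.cos(2 * math.pi * 2 * x / n) for x in range(n)]       # 2 cycles per row
fine = [0.5 * math.cos(2 * math.pi * 10 * x / n) for x in range(n)]  # 10 cycles per row
row = [c + f for c, f in zip(coarse, fine)]

cutoff = cycles_per_face_to_cycles_per_row(2.5, face_width_px=16, row_width_px=n)
blurred = low_pass_row(row, cutoff)
```

In this toy example the cutoff of 2.5 cycles per face corresponds to 5 cycles per row, so the coarse component survives and the fine component is removed. Real stimuli are two-dimensional video frames, and practical filters usually have gradual rather than brick-wall roll-offs.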

The results of these studies, moreover, are consistent with the observation that visual speech can be successfully encoded when presented in peripheral vision, several degrees away from fixation (Smeele et al., 1998). Indeed, Paré et al. (2003) showed that manipulations of observers’ gaze did not influence audiovisual speech integration substantially until their gaze was directed at least 60 degrees eccentrically. Thus, the conclusion from this finding is that high-acuity, foveal vision of the oral area is not necessary to extract linguistically relevant visual information from the face.

Altogether, results demonstrating that visual speech perception can subsist when experimental stimuli are restricted to low spatial frequency components of the images suggest that visual speech information is processed at a coarse spatial level. In fact, many studies examining eye movements of perceivers naturally looking at talking faces (i.e., with no specific instructions regarding what cues to attend to) have found that perceivers consistently look at the eye region more than the mouth (Klin et al., 2005). Nevertheless, if the task requires extracting fine linguistic information (e.g., phonetic details in high background noise, word identification or segmental cues), observers make significantly more fixations on the mouth region than the eye region (Buchan et al., 2007, 2008; Lansing & McConkie, 2003; Vatikiotis-Bateson et al., 1998).

Neural Correlates of Audiovisual Speech Integration

The brain contains many structures that receive projections from more than one sensory system. In some, these projections remain functionally and anatomically segregated (such as in the thalamus). However, in others there is a convergence of multisensory information onto the same neurons. These regions, therefore, are possibly involved in audiovisual integration operations. This latter type of area traditionally includes several structures in the high-level associative or heteromodal cortices, such as the superior temporal sulcus (STS), intraparietal sulcus (IPS), inferior frontal gyrus (IFG), insula, claustrum, and subcortical structures like the superior colliculus (SC). Among these brain areas, the majority of functional imaging studies emphasize the caudal part of the STS as a key region involved in audiovisual integration of speech because it exhibits increased activity for auditory speech stimulation (Scott & Johnsrude, 2003), visual speech articulation (Bernstein et al., 2002; Calvert et al., 1997; Campbell et al., 2001; MacSweeney et al., 2001), and congruent audiovisual speech (Calvert et al., 2000). Furthermore, several studies have demonstrated that this region shows enhancement to concordant audiovisual stimuli and depression to mismatching speech (Calvert et al., 2000). Finally, the STS and the superior temporal gyrus (STG) are involved with visual enhancement of speech intelligibility in the presence of an acoustic masking noise, in accordance with the principle of inverse effectiveness (i.e., multisensory enhancement is greatest when unimodal stimuli are least effective; Callan et al., 2003; Stevenson & James, 2009).

Whereas these brain regions have been repeatedly shown to be involved in audiovisual speech processing, the relatively large diversity of experimental settings and analysis strategies across studies makes it difficult to determine the specific role of each location in the audiovisual integration processes. Nevertheless, it is now becoming apparent that different distributed networks of neural structures may serve different functions in audiovisual integration (i.e., time, space, content; Calvert et al., 2000). In a recent study, for example, Miller and D’Esposito (2005; see also Stevenson et al., 2011) observed that different brain areas respond preferentially to the detection of sensory correspondence and to the perceptual fusion of speech events. That is, they found that while middle STS, middle IPS, and IFG are associated with the perception of fused bimodal speech stimuli, another network of brain areas (the SC, anterior insula, and anterior IPS) is differentially involved in the detection of commonalities of seen and heard speech in terms of its temporal signature (see also Jones & Callan, 2003; Kaiser et al., 2004; Macaluso et al., 2004).

It remains unknown, however, how these functionally distinct networks of neural groups work in concert to match and integrate multimodal input during speech perception. According to Doesburg et al. (2008), functional coupling between the networks is achieved through long-range phase synchronization (i.e., neuronal groups that enter into precise phase-locking over a limited period of time), particularly of oscillations in the gamma band (30 to 80 Hz; Engel & Singer, 2001; Fingelkurts et al., 2003; Senkowski et al., 2005; Varela et al., 2001).

Furthermore, other remarkable findings demonstrate that viewing speech can modulate the activity in the secondary (Wilson et al., 2004) and primary motor cortices, specifically the mouth area in the left hemisphere (Ferrari et al., 2003; Watkins et al., 2003). That is, the distributed network of brain regions associated with the perception of bimodal speech information also includes areas that are typically involved in speech production, such as Broca’s area, premotor cortex (Meister et al., 2007), and even primary motor cortex (Calvert & Campbell, 2003; Campbell et al., 2001; Sato et al., 2009; Skipper et al., 2007). These results appear in keeping with the long-standing proposal that audiovisual integration of speech is mediated by the speech motor system (motor theory of speech perception; Liberman et al., 1967). However, the role that these areas play in audiovisual speech integration remains to be elucidated (see Galantucci et al., 2006; Hickok, 2009, for discussions).
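The long-range phase synchronization mentioned above is often quantified with a phase-locking value (PLV), which approaches 1 when the phase difference between two signals stays constant and approaches 0 when it varies randomly. The following Python sketch computes a PLV from two sequences of instantaneous phases. It is offered only as an illustration under stated assumptions: the function name is hypothetical, the phases are assumed to have been extracted beforehand (for example, from signals band-pass filtered in the gamma range), and nothing here is claimed to reproduce the analysis of any study cited in this section.

```python
import cmath
import math


def phase_locking_value(phases_a, phases_b):
    """Return the phase-locking value of two equal-length phase series (radians).

    PLV = | mean over samples of exp(i * (phase_a - phase_b)) |.
    """
    if len(phases_a) == 0 or len(phases_a) != len(phases_b):
        raise ValueError("phase series must be non-empty and of equal length")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total / len(phases_a))


# Made-up example 1: the second signal lags the first by a fixed 0.7 rad.
phases_site_1 = [0.1 * k for k in range(100)]
phases_site_2 = [p - 0.7 for p in phases_site_1]
plv_locked = phase_locking_value(phases_site_1, phases_site_2)

# Made-up example 2: phase differences spread evenly around the circle.
spread = [2 * math.pi * k / 8 for k in range(8)]
plv_unlocked = phase_locking_value(spread, [0.0] * 8)
```

In the first example the PLV is close to 1 even though the two signals are not in phase, because the measure reflects the consistency of the phase lag rather than its size; in the second it is close to 0.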

Moreover, there have been a number of studies in humans and animals that have revealed early multisensory interactions within areas traditionally considered purely visual or purely auditory, such as MT/V5 and Heschl’s gyrus, respectively (Beauchamp et al., 2004; Callan et al., 2003, 2004; Calvert et al., 2000, 2001; Möttönen et al., 2004; Olson et al., 2002), and even at earlier stages of processing (i.e., brainstem structures; Musacchia et al., 2006). Using functional magnetic resonance imaging (fMRI), Calvert et al. (1997), for example, found that linguistic visual cues are sufficient to activate primary auditory cortex in normal-hearing individuals in the absence of auditory speech sounds (see also Pekkola et al., 2005; though see Bernstein et al., 2002, for a nonreplication).

The first evidence that visual speech modulates activity in the unisensory auditory cortex, however, was provided by Sams et al. (1991). Using magnetoencephalography (MEG) recordings in an oddball paradigm, Sams et al. found that infrequent McGurk stimuli (incongruent audiovisually presented syllables) presented among frequent matching audiovisual standards gave rise to magnetic mismatch fields (i.e., mismatch negativity [MMN]10; Näätänen, 1982) at the level of the supratemporal region about 180 ms after stimulus onset. Other studies using mismatch paradigms have corroborated this result (Colin et al., 2002; Kislyuk et al., 2008; Möttönen et al., 2002). Using high-density electrical mapping, Saint-Amour et al. (2007) revealed a dominance of left hemispheric cortical generators during the early and late phases of the McGurk MMN, consistent with the well-known left hemispheric dominance for speech reading (Calvert & Lewis, 2004; Capek et al., 2004).
Subsequent MEG and electroencephalography (EEG) studies using the additive model (in which the event-related potential [ERP] responses to audiovisual stimulation are compared with the sum of auditory and visual evoked responses) have estimated that the earliest interactions between the auditory and visual properties can occur approximately 100 ms after the visual stimulus onset (Besle et al., 2004; Giard & Peronnet, 1999; Klucharev et al., 2003; van Wassenhove et al., 2005) or even before (50 ms; Lebib et al., 2003), suggesting that audiovisual integration of speech occurs early in the cortical auditory processing hierarchy. For instance, van Wassenhove et al. (2005) examined the timing of audiovisual integration for both congruent (/ka/, /pa/, and /ta/) and incongruent (McGurk effect) speech syllables and found that the latency of the classic components of the auditory evoked potential (i.e., N1/P2) was shortened when congruent visual information was present. According to the authors, because the visual cues for articulation often precede the acoustic cues in natural audiovisual speech (sometimes by more than 100 ms; Chandrasekaran et al., 2009), an early representation of the speech event can be extracted from visual cues related to articulator preparation and used to constrain the processing of the forthcoming auditory input (analysis by synthesis model; van Wassenhove et al., 2005). Thus, importantly, this model proposes not only that cross-modal effects occur early in time, and in areas that are generally regarded as lower (early) in the sensory hierarchy, but also that the earlier-arriving visual information constrains the activation of phonetic units in the auditory cortex. Note that such a mechanism would only be functional if the auditory processing system allowed for some sort of feedback signal from higher order areas.

The idea of feedback modulating the processing in lower sensory areas by using predictions generated by higher areas is not novel. Evidence from fMRI experiments has shown modulation of activity in sensory-specific cortices during audiovisual binding (Calvert & Campbell, 2003; Calvert et al., 1997, 2000; Campbell et al., 2001; Sekiyama et al., 2003; Wright et al., 2003), and it has been proposed that unisensory signals of multisensory objects are initially integrated in the STS and that interactions in the auditory cortex reflect feedback inputs from the STS (Calvert et al., 1999). However, although this explanation is appealing, it is unlikely that this is the only way in which audiovisual binding occurs. The relative preservation of multisensory function after lesions to higher order multisensory regions (see Ettlinger & Wilson, 1990, for a review) and studies demonstrating that interactions in auditory cortex preceded activation in the STS region (Besle et al., 2008; Bushara et al., 2003; Möttönen et al., 2004; Musacchia et al., 2006) challenge a pure feedback interpretation. That is, current views on multisensory integration (Driver & Noesselt, 2008; Schroeder et al., 2008) suggest that there might be an additional direct cortical-cortical input to auditory cortex from the visual cortex (Cappe & Barone, 2005; Falchier et al., 2002; Rockland & Ojima, 2003) that might convey direct phase resetting by the motion-sensitive cortex, resulting in tuning of the auditory cortex to the upcoming sound (Arnal et al., 2009; Schroeder et al., 2008). Thus, in the light of recent evidence, a more elaborate network involving both unimodal processing streams and multisensory areas needs to be considered.
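The additive model referred to in this section can be made concrete with a short Python sketch. At each time point the audiovisual response is compared with the sum of the auditory-only and visual-only responses, and any nonzero remainder is taken as a sign of audiovisual interaction. This is a minimal illustration under stated assumptions: the function names, the fixed threshold, and all amplitude values are hypothetical, and published studies rely on statistical tests across participants rather than a single cutoff.

```python
def additive_model_interaction(erp_av, erp_a, erp_v):
    """Return AV - (A + V) at each time point of three equal-length ERPs."""
    if not (len(erp_av) == len(erp_a) == len(erp_v)):
        raise ValueError("all three responses must have the same length")
    return [av - (a + v) for av, a, v in zip(erp_av, erp_a, erp_v)]


def first_interaction_latency(interaction, times_ms, threshold):
    """Return the first latency at which |AV - (A + V)| exceeds threshold.

    Returns None if the threshold is never exceeded.
    """
    for value, t in zip(interaction, times_ms):
        if abs(value) > threshold:
            return t
    return None


# Made-up amplitudes (arbitrary units) sampled every 50 ms after stimulus onset.
times_ms = [0, 50, 100, 150]
erp_a = [0.0, 1.0, 2.0, 1.0]    # auditory-only response
erp_v = [0.0, 0.5, 0.5, 0.0]    # visual-only response
erp_av = [0.0, 1.5, 2.0, 1.0]   # audiovisual response

interaction = additive_model_interaction(erp_av, erp_a, erp_v)
latency = first_interaction_latency(interaction, times_ms, threshold=0.25)
```

With these invented values the audiovisual response equals the unimodal sum until 100 ms, where it falls below that sum (a subadditive interaction), so the sketch reports an interaction latency of 100 ms.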

Cognitive Influences on Audiovisual Speech Integration

Currently, one of the most controversial issues in the study of audiovisual integration concerns the question of whether audiovisual integration is an automatic process that occurs early and independently of top-down factors such as attention, expectations, and task sets, or can be modulated by cognitive processes (Colin et al., 2002; Massaro, 1983; McGurk & MacDonald, 1976; Soto-Faraco et al., 2004).

The considerable robustness of the McGurk illusion under conditions in which the observer realizes that the visual and auditory streams do not emanate from the same source makes it seem likely that higher cognitive functions have little access to the integration process and that the integration cannot be avoided or interrupted at will (Colin et al., 2002; Massaro, 1998; McGurk & MacDonald, 1976; Rosenblum & Saldaña, 1996; Soto-Faraco et al., 2004). Nevertheless, the most straightforward demonstration of the automatic nature of speech integration is provided by studies using indirect methodologies and showing that audiovisual integration emerges even when its resulting percept (e.g., McGurk illusion) is detrimental to the task at hand (Driver, 1996; Soto-Faraco et al., 2004). In the study by Soto-Faraco et al. (2004), participants were asked to make speeded classification judgments about the first syllable of audiovisually presented bisyllabic pseudowords, while attempting to ignore the second syllable. The paradigm, inspired by the classic Garner speeded classification task (Garner, 1974; see Pallier, 1994, for the syllabic version), is based on the finding that reaction times required to classify the first syllable (target) are slowed down when the second (irrelevant) syllable varies from trial to trial, in comparison to when it remains constant. Soto-Faraco et al. used audiovisual stimuli for which the identity of the second syllable was sometimes manipulated by introducing McGurk percepts. Critically, the authors found that the occurrence of the syllabic interference effect was determined by the illusory percept, rather than the actually presented acoustic stimuli. Overall, these results suggest that the integration of auditory and visual speech cues occurs before attentive selection of syllables because participants were unable to focus their attention on the auditory component alone and ignore the visual influence in the irrelevant syllable.

In another study, Driver (1996) showed that selective listening to one of two spatially separated speech streams worsened when the lip movements corresponding to the distractor speech stream were displayed near the target speech stream. The interpretation for this interference is that the matching (distractor) auditory stream is ventriloquized toward the location of the seen talker face, causing an illusory perception that the two auditory streams emanate from the same source, and thus worsening the selection of the target stream by spatial attention. The key finding is that participants could not avoid integrating the auditory and visual components of the speech stimulus, suggesting that multisensory binding of speech cannot be suppressed and arises before spatial selective attention is completed.

In line with the behavioral results discussed above, neurophysiological (ERP) studies using oddball paradigms have shown that an MMN can be evoked (Colin et al., 2002) or eliminated (Kislyuk et al., 2008) by McGurk percepts. Because the MMN is considered to reflect a preattentive comparison process between the neural representation of the incoming deviant trace and the neural trace of the standard, these results suggest that conflicting signals from different modalities can be combined into a unified neural representation during early sensory processing without any attentional modulation. Furthermore, the recent discovery that audiovisual interactions can arise early in functional terms (Callan et al., 2003; Calvert et al., 1997, 1999; Möttönen et al., 2004) seems to be consistent with the argument that binding mechanisms need to be rapid and mandatory. However, top-down modulation effects have been observed at very early stages in the neural processing pathway both for the auditory (e.g., Schadow et al., 2009) and the visual sensory modalities (Shulman et al., 1997). It cannot be ruled out, therefore, that top-down cognitive processes modulate the audiovisual integration mechanisms in an analogous fashion. In fact, attentional modulations of audiovisual interactions at hierarchically early stages of processing have been described before (Calvert et al., 1997; Pekkola et al., 2005).

Altogether, the studies reviewed above suggest that audiovisual speech integration is certainly resilient to cognitive factors. However, these findings should not be construed as conclusive evidence for automatic integration of the auditory and visual speech components in all situations. Recent studies have shown that increasing attentional load through a demanding, unrelated visual, auditory (Alsius et al., 2005; see also Tiippana et al., 2004), or tactile (Alsius et al., 2007) task decreases the percentage of illusory McGurk responses. This result conforms well to the perceptual load theory of attention11 (Lavie, 2005) and indicates that spare attentional resources are in fact needed for successful integration. Note that if audiovisual speech integration were completely independent of attention, it would remain unaltered when attentional resources were fully consumed.

Furthermore, other studies have shown that selective attention to the visual stimuli is required to perceptually bind audiovisually correlated speech (Alsius & Soto-Faraco, 2011; Fairhall & Macaluso, 2009) and McGurk-like combinations (Andersen et al., 2008). Fairhall and Macaluso (2009) measured blood-oxygen-level-dependent (BOLD) responses using fMRI while participants were covertly attending to one of two visual speech streams (i.e., talking faces on the left and right) that could be either congruent or incongruent with a central auditory stream. The authors found that spatial attention to the corresponding (matching) visual component of these audiovisual speech pairs was critical for the activation of cortical and subcortical brain regions thought to reflect neural correlates of audiovisual integration. Using a similar display configuration (i.e., two laterally displaced faces and a single central auditory stream) with McGurk-like stimuli, Andersen et al. (2008) found that directing visual spatial attention toward one of the faces increased the influence of that particular face on auditory perception (i.e., the illusory percepts related to that face).
Finally, Alsius and Soto-Faraco (2011) used visual and auditory search paradigms to explore how the human perceptual system processes spatially distributed speech signals (i.e., speaking faces) in order to detect matching audiovisual events. They found that search efficiency among faces for the match with a voice declined with the number of faces being monitored concurrently. This suggests that visual selective attention is required to perceptually bind audiovisually correlated objects. In keeping with Driver’s results described above, however, they found that search among auditory speech streams for the match with a face was independent of the number of streams being monitored concurrently (though see Tiippana, 2011). It seems, therefore, that whereas the perceptual system has to have full access to the spatial location of the visual event for the audiovisual correspondences to be detected, similar constraints do not apply to the auditory event. That is, multisensory matching seems to occur before the deployment of auditory spatial attention (though see Tiippana et al., 2011).

In addition to the above-mentioned results, other studies have shown that audiovisual integration can sometimes be influenced by higher cognitive properties such as observers’ expectations (Windman et al., 2003), talker familiarity (Walker et al., 1995), lexical status (Barutchu et al., 2008), or instructions (Colin et al., 2005; Massaro, 1998; Summerfield & McGrath, 1984). For instance, in one study by Summerfield and McGrath (1984; see also Colin et al., 2005), half of the participants were informed about the artificial nature of the McGurk stimuli and were instructed to report syllables according to the auditory input. The rest of the participants were not aware of the dubbing procedure and were simply instructed to repeat what the speaker had said.
The visual influence was weaker in the first group of participants, suggesting that knowledge of how the illusion is created can facilitate the segregation of the auditory signal from the visual information. Taken together, therefore, these results question an extreme view of automaticity and suggest instead some degree of penetrability in the binding process. Along the same lines, other studies have shown that identical McGurk-like stimuli can be responded to in qualitatively different ways, depending on whether the auditory stimulus is perceived by the listener as speech or as nonspeech (Tuomainen et al., 2005), on whether the visual speech gestures are consciously processed or not (Munhall et al., 2009), or on the set of features of the audiovisual stimuli participants have to respond to (Soto-Faraco & Alsius, 2007, 2009). Thus, the broad picture that emerges from all these studies is that, although audiovisual speech integration certainly seems robust to cognitive intervention, it can sometimes adapt to the specific demands imposed by the task at hand in a dynamic and malleable manner.

Therefore, it seems that rather than absolute susceptibility or absolute immunity to top-down modulation, the experimental conditions at hand will set the degree of automaticity of the binding process. In situations of low perceptual and cognitive processing load, where the visual speech stimuli may be particularly hard to ignore (see Lavie, 2005), audiovisual integration appears to operate quite early and in a largely automatic manner (i.e., without voluntary control). This is in line with a recent proposal by Talsma et al. (2010) stating that audiovisual integration will occur automatically (i.e., in a bottom-up manner) in low-perceptual-load settings, whereas in perceptually demanding conditions, top-down attentional control will be required.
In keeping with this idea, previous studies outside the speech domain have shown that only when attention is directed to both modalities simultaneously are auditory and visual stimuli integrated very early in the sensory processing flow (about 50 ms; Senkowski et al., 2005; Talsma & Woldorff, 2005; Talsma et al., 2007) and do interactions within higher order heteromodal areas arise (Degerman et al., 2007; Fort et al., 2002; Pekkola et al., 2005; van Atteveldt et al., 2007). However, top-down modulations may impose a limit to this automaticity under certain conditions. That is, having full access (at sensory and cognitive levels) to the visual stimuli (Alsius & Soto-Faraco, 2011) and processing both the auditory and visual signals as speech stimuli seem to be critical requisites for binding (Munhall et al., 2009; Summerfield, 1979; Tuomainen et al., 2005).

In summary, there seems to be a complex and dynamic interplay between audiovisual integration and cognitive processes. As mentioned earlier (see the section Neural Correlates of Audiovisual Speech Integration), the auditory and visual sensory systems interact at multiple levels of processing during speech perception (Hertrich et al., 2009; Klucharev et al., 2003). It is possible, therefore, that top-down modulatory signals exert an influence on some of these levels (see Calvert & Thesen, 2004, for a related argument), possibly depending on perceptual information available (Fairhall & Macaluso, 2009), resources required (Alsius et al., 2005, 2007), spatial attention (Alsius & Soto-Faraco, 2011; Tiippana et al., 2011), task parameters (Hugenschmidt et al., 2010), and the type of attribute encoded (Munhall et al., 2009; Soto-Faraco & Alsius, 2009; Tuomainen et al., 2005).


Summary

Speech is a naturally occurring multisensory event and we, as perceivers, are capable of using various components of this rich signal to communicate. The act of talking involves the physical generation of sound by the face and vocal tract, and the acoustics and movements are inextricably linked. These physical signals can be accessed in part by the visual, auditory, and haptic perceptual systems. Each of these perceptual systems preferentially extracts different information about the total speech event. Thus, the combination of multiple sensory channels provides a richer and more robust perceptual experience. The neural networks supporting this processing are diverse and extended. In part, the neural substrates involved depend on the task subjects are engaged in. Much remains to be learned about how low-level automatic processes contribute to multisensory processing and how and when higher cognitive strategies can modify this processing. Understanding behavioral data about task and attention is a prerequisite for a principled neuroscientific explanation of this phenomenon.

References

Alsius, A., Navarra, J., Campbell, R., & Soto-Faraco, S. (2005). Audiovisual integration of speech falters under high attention demands. Current Biology, 15, 839–843.
Alsius, A., Navarra, J., & Soto-Faraco, S. (2007). Attention to touch reduces audiovisual speech integration. Experimental Brain Research, 183, 399–404.
Alsius, A., & Soto-Faraco, S. (2011). Searching for audiovisual correspondence in multiple speaker scenarios. Experimental Brain Research, 213 (2-3), 175–183.
Andersen, T. S., Tiippana, K., Laarni, J., Kojo, I., & Sams, M. (2008). The role of visual spatial attention in audiovisual speech perception. Speech Communication, 51, 184–193.
Andersson, U., & Lidestam, B. (2005). Bottom-up driving speechreading in a speechreading expert: The case of AA (JK023). Ear and Hearing, 26, 214–224.
Arnal, L. H., Morillon, B., Kell, C. A., & Giraud, A. L. (2009). Dual neural routing of visual facilitation in speech processing. Journal of Neuroscience, 29, 13445–13453.
Arnold, P., & Hill, F. (2001). Bisensory augmentation: A speechreading advantage when speech is clearly audible and intact. British Journal of Psychology, 92 (2), 339–355.
Auer, E. T., Jr. (2002). The influence of the lexicon on speech read word recognition: Contrasting segmental and lexical distinctiveness. Psychonomic Bulletin and Review, 9, 341–347.
Auer, E. T., Jr. (2009). Spoken word recognition by eye. Scandinavian Journal of Psychology, 50, 419–425.


Auer, E. T., Jr., & Bernstein, L. E. (1997). Speechreading and the structure of the lexicon: Computationally modeling the effects of reduced phonetic distinctiveness on lexical uniqueness. Journal of the Acoustical Society of America, 102 (6), 3704–3710.
Auer, E. T., Jr., Bernstein, L. E., & Coulter, D. C. (1998). Temporal and spatio-temporal vibrotactile displays for voice fundamental frequency: An initial evaluation of a new vibrotactile speech perception aid with normal-hearing and hearing-impaired individuals. Journal of the Acoustical Society of America, 104, 2477–2489.
Barutchu, A., Crewther, S., Kiely, P., & Murphy, M. (2008). When /b/ill with /g/ill becomes /d/ill: Evidence for a lexical effect in audiovisual speech perception. European Journal of Cognitive Psychology, 20 (1), 1–11.
Beauchamp, M. S., Argall, B. D., Bodurka, J., Duyn, J. H., & Martin, A. (2004). Unraveling multisensory integration: Patchy organization within human STS multisensory cortex. Nature Neuroscience, 7, 1190–1192.
Berger, K. W. (1972). Visemes and homophenous words. Teacher of the Deaf, 70, 396–399.
Bernstein, L. E., Auer, E. T., Moore, J. K., Ponton, C., Don, M., & Singh, M. (2002). Visual speech perception without primary auditory cortex activation. NeuroReport, 13, 311–315.
Bernstein, L. E., Auer, E. T., & Takayanagi, S. (2004). Auditory speech detection in noise enhanced by lipreading. Speech Communication, 44, 5–18.

Bernstein, L. E., Demorest, M. E., & Tucker, P. E. (1998). What makes a good speechreader? First you have to find one. In R. Campbell, B. Dodd, & D. Burnham (Eds.), Hearing by eye. II: The psychology of speechreading and auditory–visual speech (pp. 211–228). East Sussex, UK: Psychology Press.
Bernstein, L. E., Eberhart, S. P., & Demorest, M. E. (1989). Single-channel vibrotactile supplements to visual perception of intonation and stress. Journal of the Acoustical Society of America, 85, 397–405.
Bernstein, L. E., Iverson, P., & Auer, E. T., Jr. (1997). Elucidating the complex relationships between phonetic perception and word recognition in audiovisual speech perception. In C. Benoît & R. Campbell (Eds.), Proceedings of the ESCA/ESCOP workshop on audio-visual speech processing (pp. 21–24). Rhodes, Greece.
Bertelson, P., & de Gelder, B. (2004). The psychology of multimodal perception. In C. Spence & J. Driver (Eds.), Crossmodal space and crossmodal attention (pp. 141–177). Oxford, UK: Oxford University Press.
Bertelson, P., Vroomen, J., Wiegeraad, G., & de Gelder, B. (1994). Exploring the relation between McGurk interference and ventriloquism. In Proceedings of ICSLP 94, Acoustical Society of Japan (Vol. 2, pp. 559–562). Yokohama, Japan.


Besle, J., Fischer, C., Bidet-Caulet, A., Lecaignard, F., Bertrand, O., & Giard, M.-H. (2008). Visual activation and audiovisual interactions in the auditory cortex during speech perception: Intracranial recordings in humans. Journal of Neuroscience, 28, 14301–14310.
Besle, J., Fort, A., Delpuech, C., & Giard, M.-H. (2004). Bimodal speech: Early suppressive visual effects in the human auditory cortex. European Journal of Neuroscience, 20, 2225–2234.
Boothroyd, A. (1970). Concept and control of fundamental voice frequency in the deaf: An experiment using a visible pitch display. Paper presented at the International Congress of Education of the Deaf, Stockholm, Sweden.
Brancazio, L. (2004). Lexical influences in audiovisual speech perception. Journal of Experimental Psychology: Human Perception and Performance, 30, 445–463.
Brancazio, L., & Miller, J. L. (2005). Use of visual information in speech perception: Evidence for a visual rate effect both with and without a McGurk effect. Perception and Psychophysics, 67, 759–769.
Brancazio, L., Miller, J. L., & Paré, M. A. (1999). Perceptual effects of place of articulation on voicing for audiovisually discrepant stimuli. Journal of the Acoustical Society of America, 106, 2270.
Brooks, P. L., & Frost, B. J. (1983). Evaluation of a tactile vocoder for word recognition. Journal of the Acoustical Society of America, 74, 34–39.
Brooks, P. L., Frost, B. J., Mason, J. L., & Gibson, D. M. (1986). Continuing evaluation of Queen’s University tactile vocoder I: Identification of open set words. Journal of Rehabilitation Research and Development, 23, 119–128.
Buchan, J. N., Paré, M., & Munhall, K. G. (2004). The influence of task on gaze during audiovisual speech perception. Journal of the Acoustical Society of America, 115, 2607.
Buchan, J. N., Paré, M., & Munhall, K. G. (2007). Spatial statistics of gaze fixations during dynamic face processing. Social Neuroscience, 2 (1), 1–13.
Buchan, J. N., Paré, M., & Munhall, K. G. (2008). The effect of varying talker identity and listening conditions on gaze behavior during audiovisual speech perception. Brain Research, 1242, 162–171.
Burnham, D., Ciocca, V., Lauw, C., Lau, S., & Stokes, S. (2000). Perception of visual information for Cantonese tones. In M. Barlow & P. Rose (Eds.), Proceedings of the Eighth Australian International Conference on Speech Science and Technology (pp. 86–91). Australian Speech Science and Technology Association, Canberra.
Bushara, K. O., Hanakawa, T., Immisch, I., Toma, K., Kansaku, K., & Hallett, M. (2003). Neural correlates of cross-modal binding. Nature Neuroscience, 6 (2), 190–195.

Callan, D., Jones, J. A., Munhall, K. G., Kroos, C., Callan, A., & Vatikiotis-Bateson, E. (2003). Neural processes underlying perceptual enhancement by visual speech gestures. NeuroReport, 14, 2213–2218.

Callan, D., Jones, J. A., Munhall, K. G., Kroos, C., Callan, A., & Vatikiotis-Bateson, E. (2004). Multisensory integration sites identified by perception of spatial wavelet filtered visual speech gesture information. Journal of Cognitive Neuroscience, 16, 805–816.

Calvert, G. A., Brammer, M., Bullmore, E., Campbell, R., Iversen, S. D., & David, A. (1999). Response amplification in sensory-specific cortices during crossmodal binding. NeuroReport, 10, 2619–2623.

Calvert, G. A., Brammer, M. J., & Iversen, S. D. (1998). Crossmodal identification. Trends in Cognitive Sciences, 2 (7), 247–253.

Calvert, G. A., Bullmore, E. T., Brammer, M. J., Campbell, R., Williams, S. C., McGuire, P. K., Woodruff, P. W., Iversen, S. D., & David, A. S. (1997). Activation of auditory cortex during silent lipreading. Science, 276, 593–596.

Calvert, G. A., & Campbell, R. (2003). Reading speech from still and moving faces: The neural substrates of visible speech. Journal of Cognitive Neuroscience, 15, 57–70.

Calvert, G. A., Campbell, R., & Brammer, M. J. (2000). Evidence from functional magnetic resonance imaging of crossmodal binding in the human heteromodal cortex. Current Biology, 10, 649–657.

Calvert, G. A., Hansen, P. C., Iversen, S. D., & Brammer, M. J. (2001). Detection of audiovisual integration sites in humans by application of electro-physiological criteria to the BOLD effect. NeuroImage, 14, 427–438.

Calvert, G. A., & Lewis, J. W. (2004). Hemodynamic studies of audiovisual interactions. In G. Calvert, C. Spence, & B. Stein (Eds.), Handbook of multisensory processes (pp. 483–502). Cambridge, MA: MIT Press.

Calvert, G. A., & Thesen, T. (2004). Multisensory integration: Methodological approaches and emerging principles in the human brain. Journal of Physiology, 98, 191–205.

Campbell, C. S., & Massaro, D. W. (1997). Visible speech perception: Influence of spatial quantization. Perception, 26, 627–644.

Campbell, R. (1992). The neuropsychology of lipreading. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 335, 39–45.

Campbell, R. (1996). Seeing brains reading speech: A review and speculations. In D. G. Stork & M. E. Hennecke (Eds.), Speechreading by humans and machines: Models, systems and applications (pp. 115–134). Berlin: Springer.

Campbell, R., MacSweeney, M., Surguladze, S., Calvert, G. A., Brammer, M. J., David, A. S., & Williams, S. C. R. (2001). Cortical substrates for the perception of face actions: An fMRI study of the specificity of activation for seen speech and for meaningless lower-face acts (gurning). Cognitive Brain Research, 12, 233–243.

Campbell, R., Zihl, J., Massaro, D., Munhall, K., & Cohen, M. (1997). Speechreading in the akinetopsic patient, LM. Brain, 120, 1793–1803.

Capek, C. M., Bavelier, D., Corina, D., Newman, A. J., Jezzard, P., & Neville, H. J. (2004). The cortical organization of audio-visual sentence comprehension: An fMRI study at 4 Tesla. Cognitive Brain Research, 20, 111–119.

Cappe, C., & Barone, P. (2005). Heteromodal connections supporting multisensory integration at low levels of cortical processing in the monkey. European Journal of Neuroscience, 22 (11), 2886–2902.

Carney, A. E. (1988). Vibrotactile perception of segmental features of speech: A comparison of single-channel and multichannel instruments. Journal of Speech and Hearing Research, 31, 438–448.

Carney, A. E., & Beachler, C. R. (1986). Vibrotactile perception of suprasegmental features of speech: A comparison of single-channel and multi-channel instruments. Journal of the Acoustical Society of America, 79, 131–140.

Carney, A. E., Clement, B. R., & Cienkowski, K. M. (1999). Talker variability effects in auditory-visual speech perception. Journal of the Acoustical Society of America, 106, 2270 (A).

Chandrasekaran, C., & Ghazanfar, A. A. (2009). Different neural frequency bands integrate faces and voices differently in the superior temporal sulcus. Journal of Neurophysiology, 101, 773–788.

Chandrasekaran, C., Trubanova, A., Stillittano, S., Caplier, A., & Ghazanfar, A. A. (2009). The natural statistics of audiovisual speech. PLoS Computational Biology, 5, e1000436.

Chen, T. H., & Massaro, D. W. (2008). Seeing pitch: Visual information for lexical tones of Mandarin Chinese. Journal of the Acoustical Society of America, 123 (4), 2356–2366.

Colin, C., Radeau, M., & Deltenre, P. (2005). Top-down and bottom-up modulation of audiovisual integration in speech. European Journal of Cognitive Psychology, 17 (4), 541–560.

Colin, C., Radeau, M., Deltenre, P., & Morais, J. (2001). Rules of intersensory integration in spatial scene analysis and speech reading. Psychologica Belgica, 41, 131–144.

Colin, C., Radeau, M., Soquet, A., Demolin, D., Colin, F., & Deltenre, P. (2002). Mismatch negativity evoked by the McGurk–MacDonald effect: A phonetic representation within short-term memory. Clinical Neurophysiology, 113, 495–506.

Connor, S. (2000). Dumbstruck: A cultural history of ventriloquism. Oxford, UK: Oxford University Press.

Conrey, B., & Pisoni, D. B. (2006). Auditory-visual speech perception and synchrony detection for speech and nonspeech signals. Journal of the Acoustical Society of America, 119, 4065–4073.

Cotton, J. C. (1935). Normal “visual hearing.” Science, 82, 592–593.

Cutler, A., & Butterfield, S. (1992). Rhythmic cues to speech segmentation: Evidence from juncture misperception. Journal of Memory and Language, 31, 218–236.

Cvejic, E., Kim, J., & Davis, C. (2010). Prosody off the top of the head: Prosodic contrasts can be discriminated by head motion. Speech Communication, 52 (6), 555–564.

Davis, C., & Kim, J. (2006). Audio-visual speech perception off the top of the head. Cognition, 100, B21–B31.

Degerman, A., Rinne, T., Pekkola, J., Autti, T., Jääskeläinen, I. P., Sams, M., & Alho, K. (2007). Human brain activity associated with audiovisual perception and attention. NeuroImage, 34, 1683–1691.

Dixon, N. F., & Spitz, L. (1980). The detection of auditory visual desynchrony. Perception, 9, 719–721.

Doesburg, S. M., Emberson, L. L., Rahi, A., Cameron, D., & Ward, L. M. (2008). Asynchrony from synchrony: Long-range gamma-band neural synchrony accompanies perception of audiovisual speech asynchrony. Experimental Brain Research, 185, 1.

Driver, J. (1996). Enhancement of selective listening by illusory mislocation of speech sounds due to lip-reading. Nature, 381, 66–68.

Driver, J., & Noesselt, T. (2008). Multisensory interplay reveals crossmodal influences on “sensory-specific” brain regions, neural responses, and judgments. Neuron, 57 (1), 11–23.

Dupoux, E. (1993). The time course of prelexical processing: The Syllabic Hypothesis revisited. In G. T. M. Altmann & R. Shillcock (Eds.), Cognitive models of speech processing: The second Sperlonga meeting (pp. 81–114). Hillsdale, NJ: Erlbaum.

Engel, A. K., & Singer, W. (2001). Temporal binding and the neural correlates of sensory awareness. Trends in Cognitive Sciences, 5, 16–25.

Erber, N. P., & Cramer, K. D. (1974). Vibrotactile recognition of sentences. American Annals of the Deaf, 119, 716–720.

Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415, 429–433.

Eskelund, K., Tuomainen, J., & Andersen, T. S. (2011). Multistage audiovisual integration of speech: Dissociating identification and detection. Experimental Brain Research, 208, 447–457.

Ettlinger, G., & Wilson, W. A. (1990). Cross-modal performance: Behavioural processes, phylogenetic considerations and neural mechanisms. Behavioural Brain Research, 40, 169–192.

Fairhall, S., & Macaluso, E. (2009). Spatial attention can modulate audiovisual integration at multiple cortical and subcortical sites. European Journal of Neuroscience, 29, 1247–1257.

Falchier, A., Clavagnier, S., Barone, P., & Kennedy, H. (2002). Anatomical evidence of multimodal integration in primate striate cortex. Journal of Neuroscience, 22, 5749–5759.

Ferrari, P. F., Gallese, V., Rizzolatti, G., & Fogassi, L. (2003). Mirror neurons responding to the observation of ingestive and communicative mouth actions in the monkey ventral premotor cortex. European Journal of Neuroscience, 17, 1703–1714.

Fingelkurts, A. A., Fingelkurts, A. A., Krause, C. M., Möttönen, R., & Sams, M. (2003). Cortical operational synchrony during audio–visual speech integration. Brain and Language, 85, 297–312.

Fisher, B. D., & Pylyshyn, Z. W. (1994). The cognitive architecture of bimodal event perception: A commentary and addendum to Radeau. Current Psychology of Cognition, 13 (1), 92–96.

Fisher, C. G. (1968). Confusions among visually perceived consonants. Journal of Speech and Hearing Research, 11 (4), 796–804.

Fisher, C. G. (1969). The visibility of terminal pitch contour. Journal of Speech and Hearing Research, 12, 379–382.

Fort, A., Delpuech, C., Pernier, J., & Giard, M. H. (2002). Early auditory-visual interactions in human cortex during nonredundant target identification. Cognitive Brain Research, 14, 20–30.

Fowler, C. A. (1986). An event approach to the study of speech perception from a direct-realist perspective. Journal of Phonology, 14, 3–28.

Fowler, C. A. (2004). Speech as a supramodal or amodal phenomenon. In G. Calvert, C. Spence, & B. E. Stein (Eds.), The handbook of multisensory processes. Cambridge, MA: MIT Press.

Fowler, C., & Deckle, D. J. (1991). Listening with eye and hand: Cross-modal contributions to speech perception. Journal of Experimental Psychology: Human Perception and Performance, 17, 816–828.

Foxton, J. M., Weisz, N., Bauchet-Lecaignard, F., Delpuech, C., & Bertrand, O. (2009). The neural bases underlying pitch processing difficulties. NeuroImage, 45, 1305–1313.

Fujisaki, W., Shimojo, S., Kashino, M., & Nishida, S. Y. (2004). Recalibration of audiovisual simultaneity. Nature Neuroscience, 7 (7), 773–778.

Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). The motor theory of speech perception reviewed. Psychonomic Bulletin and Review, 13, 361–377.

Gagné, J. P., Tugby, K. G., & Michaud, J. (1991). Development of a Speechreading Test on the Utilization of Contextual Cues (STUCC): Preliminary findings with normal-hearing subjects. Journal of the Academy of Rehabilitative Audiology, 24, 157–170.

Garner, W. R. (1974). The processing of information and structure. Hillsdale, NJ: Erlbaum.

Gault, R. H. (1924). Progress in experiments on tactual interpretation of oral speech. Journal of Abnormal Psychology and Social Psychology, 14, 155–159.

Gault, R. H., & Crane, G. W. (1928). Tactual patterns from certain vowel qualities instrumentally communicated from a speaker to a subject’s fingers. Journal of General Psychology, 1, 353–359.

Gentilucci, M., & Cattaneo, L. (2005). Automatic audiovisual integration in speech perception. Experimental Brain Research, 167, 66–75.

Ghazanfar, A. A., Maier, J. X., Hoffman, K. L., & Logothetis, N. K. (2005). Multisensory integration of dynamic faces and voices in rhesus monkey auditory cortex. Journal of Neuroscience, 25, 5004–5012.

Giard, M. H., & Peronnet, F. (1999). Auditory-visual integration during multimodal object recognition in humans: a behavioral and electrophysiological study. Journal of Cognitive Neuroscience, 11, 473–490.

Gick, B., & Derrick, D. (2009). Aero-tactile integration in speech perception. Nature, 462, 502–504.

Gick, B., Jóhannsdóttir, K., Gibraiel, D., & Muehlbauer, J. (2008). Tactile enhancement of auditory and visual speech perception in untrained perceivers. Journal of the Acoustical Society of America, 123, 72–76.

Granström, B., House, D., & Lundeberg, M. (1999). Prosodic cues in multimodal speech perception. In Proceedings of the International Congress of Phonetic Sciences (ICPhS99) (pp. 655–658). San Francisco.

Grant, K. W. (2001). The effect of speechreading on masked detection thresholds for filtered speech. Journal of the Acoustical Society of America, 109, 2272–2275.

Grant, K. W., Ardell, L. A., Kuhl, P. K., & Sparks, D. W. (1986). The transmission of prosodic information via an electrotactile speech reading aid. Ear and Hearing, 7, 328–335.

Grant, K. W., & Greenberg, S. (2001). Speech intelligibility derived from asynchronous processing of auditory-visual information. International Conference of Auditory-Visual Speech Processing (pp. 132–137). Santa Cruz, CA.

Grant, K. W., & Seitz, P.-F. (2000). The use of visible speech cues for improving auditory detection of spoken sentences. Journal of the Acoustical Society of America, 108, 1197–1208.

Grant, K. W., VanWassenhove, V., & Poeppel, D. (2004). Detection of auditory (cross-spectral) and auditory-visual (cross-modal) synchrony. Speech Communication, 44, 43–53.

Green, K. P., & Gerdeman, A. (1995). Cross-modal discrepancies in coarticulation and the integration of speech information: The McGurk effect with mismatched vowels. Journal of Experimental Psychology: Human Perception and Performance, 21 (6), 1409–1426.

Green, K. P., Kuhl, P. K., Meltzoff, A. N., & Stevens, E. B. (1991). Integrating speech information across talkers, gender, and sensory modality: Female faces and male voices in the McGurk effect. Perception and Psychophysics, 50 (6), 524–536.

Green, K. P., & Miller, J. L. (1985). On the role of visual rate information in phonetic perception. Perception & Psychophysics, 38, 269–276.

Hamilton, R. H., Shenton, J. T., & Coslett, H. B. (2006). An acquired deficit of audiovisual speech processing. Brain and Language, 98, 66–73.

Hanin, L., Boothroyd, A., & Hnath-Chisolm, T. (1988). Tactile presentation of voice frequency as an aid to the speechreading of sentences. Ear and Hearing, 9, 335–341.

Hertrich, I., Mathiak, K., Lutzenberger, W., & Ackermann, H. (2009). Time course of early audiovisual interactions during speech and nonspeech central auditory processing: A magnetoencephalography study. Journal of Cognitive Neuroscience, 21, 259–274.

Hickok, G. (2009). Eight problems for the mirror neuron theory of action understanding in monkeys and humans. Journal of Cognitive Neuroscience, 21, 1229–1243.

Hirsh, I. J., & Sherrick, C. E., Jr. (1961). Perceived order in different sense modalities. Journal of Experimental Psychology, 62, 423–432.

Hugenschmidt, C. E., Peiffer, A. M., McCoy, T. P., Hayasaka, S., & Laurienti, P. J. (2010). Preservation of crossmodal selective attention in healthy aging. Experimental Brain Research, 198, 273–285.

IJsseldijk, F. J. (1992). Speechreading performance under different conditions of video image, repetition, and speech rate. Journal of Speech and Hearing Research, 35, 466–471.

Jack, C. E., & Thurlow, W. R. (1973). Effects of degree of visual association and angle of displacement on the “ventriloquism” effect. Perceptual and Motor Skills, 37, 967–979.

Jackson, C. V. (1953). Visual factors in auditory localization. Quarterly Journal of Experimental Psychology, 5, 52–65.

Jackson, P. L. (1988). The theoretical minimal unit for visual speech perception: Visemes and coarticulation. Volta Review, 90 (5), 99–114.

Jones, J., & Callan, D. (2003). Brain activity during audiovisual speech perception: An fMRI study of the McGurk effect. NeuroReport, 14, 1129–1133.

Jones, J. A., & Jarick, M. (2006). Multisensory integration of speech signals: The relationship between space and time. Experimental Brain Research, 174, 588–594.

Jones, J. A., & Munhall, K. G. (1997). The effects of separating auditory and visual sources on audiovisual integration of speech. Canadian Acoustics, 25, 13–19.

Joos, M. (1948). Acoustic phonetics. Baltimore: Linguistic Society of America.

Jordan, T. R., & Bevan, K. M. (1997). Seeing and hearing rotated faces: Influences of facial orientation on visual and audiovisual speech recognition. Journal of Experimental Psychology: Human Perception and Performance, 23, 388–403.

Jordan, T. R., & Sergeant, P. (2000). Effects of distance on visual and audiovisual speech recognition. Language and Speech, 43, 107–124.

Jordan, T. R., McCotter, M. V., & Thomas, S. M. (2000). Visual and audiovisual speech perception with color and gray scale facial images. Perception and Psychophysics, 62, 1394–1404.

Jordan, T. R., & Thomas, S. (2001). Effects of horizontal viewing angle on visual and audiovisual speech recognition. Journal of Experimental Psychology: Human Perception and Performance, 27, 1386–1403.

Kaiser, J., Hertrich, L., Ackermann, H., Mathiak, K., & Lutzenberger, W. (2004). Hearing lips: Gamma-band activity during audiovisual speech perception. Cerebral Cortex, 15, 646–653.

Kim, J., & Davis, C. (2004). Investigating the audio-visual speech detection advantage. Speech Communication, 44, 19–30.

Kim, J., Kroos, C., & Davis, C. (2010). Hearing a point-light talker: An auditory influence on a visual motion detection task. Perception, 39 (3), 407–416.

King, A. J. (2005). Multisensory integration: Strategies for synchronization. Current Biology, 15 (9), 339–341.

Kislyuk, D. S., Möttönen, R., & Sams, M. (2008). Visual processing affects the neural basis of auditory discrimination. Journal of Cognitive Neuroscience, 20 (12), 2175–2184.

Klatt, D. H. (1986). The problem of variability in speech recognition and in models of speech perception. In J. Perkell & D. Klatt (Eds.), Invariance and variability in speech processes (pp. 300–319). Hillsdale, NJ: Erlbaum.

Klatt, D. (1989). Review of selected models of speech perception. In W. D. Marslen-Wilson (Ed.), Lexical representation and process (pp. 169–226). Cambridge, MA: MIT Press.

Klin, A., Jones, W., Schultz, R., Volkmar, F., & Cohen, D. (2005). Visual fixation patterns during viewing of naturalistic social situations as predictors of social competence in individuals with autism. Archives of General Psychiatry, 59, 809–816.

Klucharev, V., Möttönen, R., & Sams, M. (2003). Electrophysiological indicators of phonetic and non-phonetic multisensory interactions during audiovisual speech perception. Cognitive Brain Research, 18 (1), 65–75.

Kubovy, M. (1988). Should we resist the seductiveness of the space:time::vision:audition analogy? Journal of Experimental Psychology: Human Perception and Performance, 14, 318–320.

Kubovy, M., & Van Valkenburg, J. (1995). Auditory and visual objects. Cognition, 80, 97–126.

Lansing, C. R., & McConkie, G. W. (1994). A new method for speechreading research: Tracking observer’s eye movements. Journal of the Academy of Rehabilitative Audiology, 27, 25–43.

Lansing, C. R., & McConkie, G. W. (1999). Attention to facial regions in segmental and prosodic visual speech perception tasks. Journal of Speech, Language, and Hearing Research, 42, 526–538.

Lansing, C. R., & McConkie, G. W. (2003). Word identification and eye fixation locations in visual and visual-plus-auditory presentations of spoken sentences. Perception and Psychophysics, 65 (4), 536–552.

Laurienti, P. J., Kraft, R. A., Maldjian, J. A., Burdette, J. H., & Wallace, M. T. (2004). Semantic congruence is a critical factor in multisensory behavioral performance. Experimental Brain Research, 158, 405–414.

Lavie, N. (2005). Distracted and confused? Selective attention under load. Trends in Cognitive Sciences, 9 (2), 75–82.

Lebib, R., Papo, D., de Bode, S., & Baudonniere, P. M. (2003). Evidence of a visual-auditory cross-modal sensory gating phenomenon as reflected by the human P50 event-related potential modulation. Neuroscience Letters, 341, 185–188.

Levitt, H. (1988). Recurrent issues underlying the development of tactile sensory aids. Ear and Hearing, 9 (6), 301–305.

Levitt, H. (1995). Processing of speech signals for physical and sensory disabilities. Proceedings of the National Academy of Sciences U S A, 92 (22), 9999–10006.

Lewkowicz, D. J. (2000). The development of intersensory temporal perception: An epigenetic systems/limitations view. Psychological Bulletin, 126 (2), 281–308.

Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. (1967). Perception of the speech code. Psychological Review, 74 (6), 431–461.

Liberman, A. M., Delattre, P., Cooper, F. S., & Gerstman, L. (1954). The role of consonant–vowel transitions in the perception of the stop and nasal consonants. Psychological Monographs: General and Applied, 68, 1–13.

Liberman, A. M., & Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition, 21, 1–36.

Lidestam, B., & Beskow, J. (2006). Visual phonemic ambiguity and speechreading. Journal of Speech, Language, and Hearing Research, 49 (4), 835–847.

Lisker, L. (1986). “Voicing” in English: A catalog of acoustic features signaling /b/ versus /p/ in trochees. Language and Speech, 29, 3–11.

Macaluso, E., George, N., Dolan, R., Spence, C., & Driver, J. (2004). Spatial and temporal factors during processing of audiovisual speech: A PET study. NeuroImage, 21, 725–732.

MacDonald, J., Andersen, S., & Bachmann, T. (2000). Hearing by eye: How much spatial degradation can be tolerated? Perception, 29, 1155–1168.

MacDonald, J., & McGurk, H. (1978). Visual influences on speech perception processes. Perception & Psychophysics, 24, 253–257.

MacLeod, A., & Summerfield, Q. (1987). Quantifying the contribution of vision to speech perception in noise. British Journal of Audiology, 21, 131–141.

MacSweeney, M., Campbell, R., Calvert, G. A., McGuire, P. K., David, A. S., & Suckling, J. (2001). Dispersed activation in the left temporal cortex for speechreading in congenitally deaf speechreaders. Proceedings of the Royal Society of London B, 268, 451–457.

Massaro, D. W. (1998). Perceiving talking faces: From speech perception to a behavioral principle. Cambridge, MA: MIT Press.

Massaro, D. (2009). Caveat emptor: The meaning of perception and integration in speech perception. Available from Nature Precedings, http://hdl.handle.net/10101/npre.2009.4016.1.

Massaro, D. W., & Beskow, J. (2002). Multimodal speech perception: A paradigm for speech science. In B. Granström, D. House, & I. Karlsson (Eds.), Multimodality in language and speech systems (pp. 45–71). Dordrecht: Kluwer Academic Publishers.

Massaro, D. W., & Cohen, M. M. (1983). Phonological context in speech perception. Perception and Psychophysics, 34, 338–348.

Massaro, D. W., & Cohen, M. M. (1993). Perceiving asynchronous speech in consonant-vowel and vowel syllables. Speech Communication, 13, 127–134.

Massaro, D. W., & Cohen, M. M. (1996). Perceiving speech from inverted faces. Perception and Psychophysics, 58 (7), 1047–1065.

Massaro, D. W., Cohen, M. M., & Smeele, P. M. T. (1996). Perception of asynchronous and conflicting visible and auditory speech. Journal of the Acoustical Society of America, 100, 1777–1786.

Massaro, D. W., & Light, J. (2004). Using visible speech for training perception and production of speech for hard of hearing individuals. Journal of Speech, Language, and Hearing Research, 47 (2), 304–320.

Mattys, S. L., Bernstein, L. E., & Auer, E. T., Jr. (2002). Stimulus-based lexical distinctiveness as a general word-recognition mechanism. Perception and Psychophysics, 64, 667–679.

Mayer, C., Abel, J., Barbosa, A., Black, A., & Vatikiotis-Bateson, E. (2011). The labial viseme reconsidered: Evidence from production and perception. Journal of the Acoustical Society of America, 129, 2456–2456.

McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1–86.

McGrath, M., & Summerfield, Q. (1985). Intermodal timing relations and audio-visual speech recognition by normal-hearing adults. Journal of the Acoustical Society of America, 77, 678–685.

McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 265, 746–748.

Meister, I. G., Wilson, S. M., Deblieck, C., Wu, A. D., & Iacoboni, M. (2007). The essential role of premotor cortex in speech perception. Current Biology, 17 (19), 1692–1696.

Miller, L. M., & D’Esposito, M. (2005). Perceptual fusion and stimulus coincidence in the cross-modal integration of speech. Journal of Neuroscience, 25, 5884–5893.

Miller, J. L., & Eimas, P. (1995). Speech perception: From signal to word. Annual Review of Psychology, 46, 467–492.

Miller, G. A., & Nicely, P. E. (1955). An analysis of perceptual confusions among some English consonants. Journal of the Acoustical Society of America, 27, 338–352.

Miyamoto, R. T., Myres, W. A., Wagner, M., & Punch, J. I. (1987). Vibrotactile devices as sensory aids for the deaf. Journal of American Academy of Otolaryngology-Head and Neck Surgery, 97, 57–63.

Morein-Zamir, S., Soto-Faraco, S., & Kingstone, A. (2003). The capture of vision by audition: Deconstructing temporal ventriloquism. Cognitive Brain Research, 17, 154–163.

Möttönen, R., Krause, C. M., Tiippana, K., & Sams, M. (2002). Processing of changes in visual speech in the human auditory cortex. Cognitive Brain Research, 13, 417–425.

Möttönen, R., Schurmann, M., & Sams, M. (2004). Time course of multisensory interactions during audiovisual speech perception in humans: A magnetoencephalographic study. Neuroscience Letters, 363, 112–115.

Munhall, K. G., Gribble, P., Sacco, L., & Ward, M. (1996). Temporal constraints on the McGurk effect. Perception and Psychophysics, 58, 351–362.

Munhall, K. G., Jones, J. A., Callan, D., Kuratate, T., & Vatikiotis-Bateson, E. (2003). Visual prosody and speech intelligibility: Head movement improves auditory speech perception. Psychological Science, 15 (2), 133–137.

Munhall, K. G., Kroos, C., Jozan, G., & Vatikiotis-Bateson, E. (2004). Spatial frequency requirements for audiovisual speech perception. Perception and Psychophysics, 66, 574–583.

Munhall, K. G., Servos, P., Santi, A., & Goodale, M. (2002). Dynamic visual speech perception in a patient with visual form agnosia. NeuroReport, 13 (14), 1793–1796.

Munhall, K. G., ten Hove, M., Brammer, M., & Paré, M. (2009). Audiovisual integration of speech in a bistable illusion. Current Biology, 19 (9), 1–5.

Munhall, K. G., & Vatikiotis-Bateson, E. (1998). The moving face during speech communication. In R. Campbell, B. Dodd, & D. Burnham (Eds.), Hearing by eye: Pt. 2. The psychology of speechreading and audiovisual speech (pp. 123–139). London: Taylor & Francis, Psychology Press.

Munhall, K. G., & Vatikiotis-Bateson, E. (2004). Spatial and temporal constraints on audiovisual speech perception. In G. A. Calvert, C. Spence, & B. E. Stein (Eds.), The handbook of multisensory processes (pp. 177–188). Cambridge, MA: MIT Press.

Musacchia, G., Sams, M., Nicol, T., & Kraus, N. (2006). Seeing speech affects acoustic information processing in the human brainstem. Experimental Brain Research, 168 (1-2), 1–10.

Näätänen, R. (1982). Processing negativity: An evoked-potential reflection of selective attention. Psychological Bulletin, 92, 605–640.

Navarra, J., & Soto-Faraco, S. (2007). Hearing lips in a second language: Visual articulatory information enables the perception of L2 sounds. Psychological Research, 71 (1), 4–12.

Nitchie, E. B. (1916). The use of homophenous words. Volta Review, 18, 85–83.

Norris, D., McQueen, J. M., & Cutler, A. (1995). Competition and segmentation in spoken-word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1209–1228.

Norris, D., McQueen, J. M., & Cutler, A. (2000). Merging information in speech recognition: Feedback is never necessary. Behavioral and Brain Sciences, 23 (3), 299–324.

O’Hare, J. J. (1991). Perceptual integration. Journal of the Washington Academy of Sciences, 81, 44–59.

Ohala, J. (1975). Temporal regulation of speech. In G. Fant & M. A. A. Tatham (Eds.), Auditory analysis and perception of speech (pp. 431–453). London: Academic Press.

Olson, I. R., Gatenby, J. C., & Gore, J. C. (2002). A comparison of bound and unbound audio-visual information processing in the human cerebral cortex. Cognitive Brain Research, 14, 129–138.

Ouni, S., Cohen, M. M., Ishak, H., & Massaro, D. W. (2007). Visual contribution to speech perception: Measuring the intelligibility of animated talking heads. EURASIP Journal on Audio, Speech, and Music Processing, 2007 (Article ID 47891), 1–12.

Owens, O., & Blazek, B. (1985). Visemes observed by hearing-impaired and normal-hearing adult viewers. Speech and Hearing Research, 28, 381–393.

Pallier, C. (1994). Role de la syllabe dans la perception de la parole: Etudes attentionelles. Doctoral dissertation, Ecole des Hautes Etudes en Sciences Sociales, Paris.

Pandey, P. C., Kunov, H., & Abel, S. M. (1986). Disruptive effects of auditory signal delay on speech perception with lipreading. Journal of Auditory Research, 26, 27–41.

Paré, M., Richler, R., ten Hove, M., & Munhall, K. G. (2003). Gaze behavior in audiovisual speech perception: The influence of ocular fixations on the McGurk effect. Perception and Psychophysics, 65, 553–567.

Payton, K. L., Uchanski, R. M., & Braida, L. D. (1994). Intelligibility of conversational and clear speech in noise and reverberation for listeners with normal and impaired hearing. Journal of the Acoustical Society of America, 95, 1581–1592.

Pekkola, J., Ojanen, V., Autti, T., Jaaskelainen, I. P., Mottonen, R., Tarkiainen, A., & Sams, M. (2005). Primary auditory cortex activation by visual speech: An fMRI study at 3 T. NeuroReport, 16, 125–128.

Picheny, M. A., Durlach, N. I., & Braida, L. D. (1985). Speaking clearly for the hard of hearing I: Intelligibility differences between clear and conversational speech. Journal of Speech and Hearing Research, 28, 96–103.

Pisoni, D. B., & Luce, P. A. (1987). Acoustic-phonetic representations in word recognition. Cognition, 25, 21–52.

Plant, G. (1989). A comparison of five commercially available tactile aids. Australian Journal of Audiology, 11, 11–19.

Pöppel, E., Schill, K., & von Steinbüchel, N. (1990). Sensory integration within temporally neutral systems states: a hypothesis. Naturwissenschaften, 77, 89–91.

Potter, R. K., Kopp, G. A., & Green, H. C. (1947). Visible speech. New York: Van Nostrand. (Dover Publications reprint 1966).

Preminger, J. E., Lin, H., Payen, M., & Levitt, H. (1998). Selective visual masking in speech reading. Journal of Speech Language and Hearing Research, 41 (3), 564–575.

Radeau, M. (1994). Auditory-visual spatial interaction and modularity. Current Psychology of Cognition, 13 (1), 3–51.

Radicke, J. L. (2007). Audiovisual phonological fusion. Unpublished master’s thesis, Indiana University, Bloomington, IN.

Raphael, L. J. (2005). Acoustic cues to the perception of segmental phonemes. In D. B. Pisoni & R. E. Remez (Eds.), The handbook of speech perception (pp. 182–206). Oxford, UK: Blackwell.

Reed, C. M., Durlach, N. I., & Braida, L. D. (1982). Research on tactile communication of speech: A review. American Speech, Language, and Hearing Association, 20.

Reed, C. M., Rabinowitz, W. M., Durlach, N. I., Braida, L. D., Conway-Fithian, S., & Schultz, M. C. (1985). Research on the Tadoma method of speech communication. Journal of the Acoustical Society of America, 77, 247–257.

Reisberg, D., McLean, J., & Goldfield, A. (1987). Easy to hear but hard to understand: A lip-reading advantage with intact auditory stimuli. In B. Dodd & R. Campbell (Eds.), Hearing by eye: The psychology of lip-reading. Hillsdale, NJ: Erlbaum.

Rockland, K. S., & Ojima, H. (2003). Multisensory convergence in calcarine visual areas in macaque monkey. International Journal of Psychophysiology, 50, 19–26.

Rosenblum, L. D. (2008). Speech perception as a multimodal phenomenon. Current Directions in Psychological Science, 17 (6), 405–409.

Rosenblum, L. D., Johnson, J. A., & Saldaña, H. M. (1996). Visual kinematic information for embellishing speech in noise. Journal of Speech and Hearing Research, 39 (6), 1159–1170.

Multimodal Speech Perception Rosenblum, L. D., & Saldaña, H. M. (1996). An audiovisual test of kinematic primitives for visual speech perception. Journal of Experimental Psychology: Human Perception and Performance, 22 (2), 318–331. Rosenblum, L. D., & Saldaña, H. M. (1998). Time-varying information for visual speech perception. In R. Campbell, B. Dodd, & D. Burnham (Eds.), Hearing by eye: Pt. 2. The psy chology of speechreading and audiovisual speech (pp. 61–81). Hillsdale, NJ: Erlbaum. Rönnberg, J., Samuelsson, S., & Lyxell, B. (1998). Conceptual constraints in sentencebased lipreading in the hearing-impaired. In R. Campbell, B. Dodd, & D. Burnham (Eds.), Hearing by eye, II: Advances in the psychology of speechreading and auditory-visual speech (pp. 143–153). East Sussex, UK: Psychology Press. Ross, L., Saint-Amour, D., Leavitt, V., Jeavitt, D. C., & Foxe J. J. (2006). Do you see what I’m saying? Optimal visual enhancement of speech comprehension in noisy environments. Cerebral Cortex, 17 (5), 1147–1153. Rouger, J., Fraysse, B., Deguine, O., & Barone, P. (2008). McGurk effects in cochlear-im planted deaf subjects. Brain Research, 1188, 87–99. Rouger, J., Lagleyre, S., Fraysse, B., Deneve, S., Deguine, O., & Barone, P. (2007). Evi dence that cochlear-implanted deaf patients are better multisensory integrators. Proceed ings of the National Academy of Sciences U S A, 104 (17), 7295–7300. Saint-Amour, D., De Sanctis, S. P., Molholm, S., Ritter, W., & Foxe, J. J. (2007). Seeing voic es: High-density electrical mapping and source-analysis of the multisensory mismatch negativity evoked during the McGurk illusion. Neuropsychologia, 45, 587–597. Sams, M., Aulanko, R., Hamalainen, M., Hari, R., Lounasmaa, O. V., Lu, S. T., & Simola, J. (1991). Seeing speech: Visual information from lip movements modifies activity in the hu man auditory cortex. Neuroscience Letters, 127, 141–145. Samuelsson, S., & Rönnberg, J. (1993). 
Implicit and explicit use of scripted constraints in lip-reading. European Journal of Cognitive Psychology, 5, 201–233. Sato, M., Cavé, C., Ménard, L., & Brasseur, A. (2010). Auditorytactile speech perception in congenitally blind and sighted adults. Neuropsychologia, 48 (12), 3683–3686. Sato, M., Tremblay, P., & Gracco, V. L. (2009). A mediating role of the premotor cortex in phoneme segmentation. Brain and Language, 111 (1), 1–7. Schadow, J., Lenz, D., Dettler, N., Fründ, I., & Herrmann, C. S. (2009). Early gamma-band responses reflect anticipatory top-down modulation in the auditory cortex. NeuroImage, 47 (2), 651–658. Scheier, C. R., Nijhawan, R., & Shimojo, S. (1999). Sound alters visual temporal resolu tion. Investigative Ophthalmology and Visual Science, 40, 4169.


Schroeder, C. E., Lakatos, P., Kajikawa, Y., Partan, S., & Puce, A. (2008). Neuronal oscillations and visual amplification of speech. Trends in Cognitive Sciences, 12, 106–113.
Schulte, K. (1972). Fonator system: Speech stimulator and speech feedback by technically amplified one-channel vibration. In G. Fant (Ed.), International Symposium on Speech Communication Ability and Profound Deafness (pp. 351–353). Washington, DC: A.G. Bell Association for the Deaf.
Schwartz, J. L., Berthommier, F., & Savariaux, C. (2002). Audiovisual scene analysis: Evidence for a “very-early” integration process in audio-visual speech perception. Proceedings of ICSLP, 1937–1940.
Scott, S. K., & Johnsrude, I. S. (2003). The neuroanatomical and functional organization of speech perception. Trends in Neurosciences, 26 (2), 100–107.
Sekiyama, K., Kanno, I., Miura, S., & Sugita, Y. (2003). Auditory-visual speech perception examined by fMRI and PET. Neuroscience Research, 47 (3), 277–287.
Senkowski, D., Talsma, D., Herrmann, C., & Woldorff, M. G. (2005). Multisensory processing and oscillatory gamma responses: Effects of spatial selective attention. Experimental Brain Research, 166, 411–426.
Shulman, G. L., Fiez, J. A., Corbetta, M., Buckner, R. L., Miezin, F. M., Raichle, M. E., & Petersen, S. E. (1997). Common blood flow changes across visual tasks: II. Decreases in cerebral cortex. Journal of Cognitive Neuroscience, 9, 648–663.
Skinner, M. W., Binzer, S. M., Fredrickson, J. M., Smith, P. G., Holden, T. A., Holden, L. K., Juelich, M. E., & Turner, B. A. (1989). Comparison of benefit from vibrotactile aid and cochlear implant for post-linguistically deaf adults. Laryngoscope, 98, 1092–1099.
Skipper, J. I., van Wassenhove, V., Nusbaum, H. C., & Small, S. L. (2007). Hearing lips and seeing voices: How cortical areas supporting speech production mediate audiovisual speech perception. Cerebral Cortex, 17 (10), 2387–2399.
Small, D. M., Voss, J., Mak, Y. E., Simmons, K. B., Parrish, T., & Gitelman, D. (2004). Experience-dependent neural integration of taste and smell in the human brain. Journal of Neurophysiology, 92, 1892–1903.
Smeele, P., Massaro, D., Cohen, M., & Sittig, A. (1998). Laterality in visual speech perception. Journal of Experimental Psychology: Human Perception and Performance, 24, 1232–1242.
Soto-Faraco, S., & Alsius, A. (2007). Conscious access to the unisensory components of a crossmodal illusion. NeuroReport, 18, 347–350.
Soto-Faraco, S., & Alsius, A. (2009). Deconstructing the McGurk-MacDonald illusion. Journal of Experimental Psychology: Human Perception and Performance, 35 (2), 580–587.


Soto-Faraco, S., Navarra, J., & Alsius, A. (2004). Assessing automaticity in audiovisual speech integration: Evidence from the speeded classification task. Cognition, 92 (3), B13–B23.
Soto-Faraco, S., Navarra, J., Weikum, W. M., Vouloumanos, A., Sebastián-Gallés, N., & Werker, J. F. (2007). Discriminating languages by speech-reading. Perception and Psychophysics, 69, 218–231.
Soto-Faraco, S., Sebastián-Gallés, N., & Cutler, A. (2001). Segmental and suprasegmental mismatch in lexical access. Journal of Memory and Language, 45, 412–432.
Spence, C., & Squire, S. B. (2003). Multisensory integration: Maintaining the perception of synchrony. Current Biology, 13, R519–R521.
Stein, B. E., & Meredith, M. A. (1993). The merging of the senses. Cambridge, MA: MIT Press.
Stekelenburg, J. J., & Vroomen, J. (2007). Neural correlates of multisensory integration of ecologically valid audiovisual events. Journal of Cognitive Neuroscience, 19 (12), 1964–1973.
Stevenson, C. M., Brookes, M. J., & Morris, P. G. (2011). β-Band correlates of the fMRI BOLD response. Human Brain Mapping, 32, 182–197.
Stevenson, R. A., & James, T. W. (2009). Audiovisual integration in human superior temporal sulcus: Inverse effectiveness and the neural processing of speech and object recognition. NeuroImage, 44 (3), 1210–1223.
Studdert-Kennedy, M. (1989). Feature fitting: A comment on K. N. Stevens’ “On the quantal nature of speech.” Journal of Phonetics, 17, 135–144.
Sumby, W., & Pollack, I. (1954). Visual contribution to speech intelligibility in noise. Journal of the Acoustical Society of America, 26, 212–215.
Summerfield, Q. A. (1979). Use of visual information for phonetic perception. Phonetica, 36, 314–331.
Summerfield, Q. (1987). Some preliminaries to a comprehensive account of audio-visual speech perception. In B. Dodd & R. Campbell (Eds.), Hearing by eye: The psychology of lipreading (pp. 3–51). Hillsdale, NJ: Erlbaum.
Summerfield, Q. (1992). Lipreading and audio-visual speech perception. Philosophical Transactions of the Royal Society, Series B, Biological Sciences, 335, 71–78.
Summerfield, Q., & McGrath, M. (1984). Detection and resolution of audio-visual incompatibility in the perception of vowels. Quarterly Journal of Experimental Psychology, 36A, 51–74.


Summers, I. R., Cooper, P. G., Wright, P., Gratton, D. A., Milnes, P., & Brown, B. H. (1997). Information from time-varying vibrotactile stimuli. Journal of the Acoustical Society of America, 102, 3686–3696.
Talsma, D., Doty, T. J., & Woldorff, M. G. (2007). Selective attention and audiovisual integration: Is attending to both modalities a prerequisite for early integration? Cerebral Cortex, 17 (3), 679–690.
Talsma, D., Senkowski, D., Soto-Faraco, S., & Woldorff, M. G. (2010). The multifaceted interplay between attention and multisensory integration. Trends in Cognitive Sciences, 14, 400–410.
Talsma, D., & Woldorff, M. G. (2005). Selective attention and multisensory integration: Multiple phases of effects on the evoked brain activity. Journal of Cognitive Neuroscience, 17 (7), 1098–1114.
Thomas, S. M., & Jordan, T. R. (2002). Determining the influence of Gaussian blurring on inversion effects with talking faces. Perception and Psychophysics, 64, 932–944.
Thomas, S. M., & Jordan, T. R. (2004). Contributions of oral and extra-oral facial motion to visual and audiovisual speech perception. Journal of Experimental Psychology: Human Perception and Performance, 30, 873–888.
Thompson, D. M. (1934). On the detection of emphasis in spoken sentences by means of visual, tactual, and visual-tactual cues. Journal of General Psychology, 11, 160–172.
Thorn, F., & Thorn, S. (1989). Speechreading with reduced vision: A problem of aging. Journal of the Optical Society of America, 6, 491–499.
Tiippana, K., Andersen, T. S., & Sams, M. (2004). Visual attention modulates audiovisual speech perception. European Journal of Cognitive Psychology, 16, 457–472.
Tiippana, K., Puharinen, H., Möttönen, R., & Sams, M. (2011). Sound location can influence audiovisual speech perception when spatial attention is manipulated. Seeing and Perceiving, 24, 67–90.
Tremblay, C., Champoux, F., Voss, P., Bacon, B. A., & Lepore, F. (2007). Speech and nonspeech audio-visual illusions: A developmental study. PLoS ONE, 2 (8), e742.
Troyer, M., Loebach, J. L., & Pisoni, D. B. (2010). Perception of temporal asynchrony in audiovisual phonological fusion. Research on Spoken Language Processing, Progress Report, 29, 156–182.
Tuomainen, J., Andersen, T., Tiippana, K., & Sams, M. (2005). Audio-visual speech perception is special. Cognition, 96 (1), B13–B22.


van Atteveldt, N. M., Formisano, E., Goebel, R., & Blomert, L. (2007). Top-down task effects overrule automatic multisensory responses to letter–sound pairs in auditory association cortex. NeuroImage, 36 (4), 1345–1360.
van Wassenhove, V., Grant, K. W., & Poeppel, D. (2005). Visual speech speeds up the neural processing of auditory speech. Proceedings of the National Academy of Sciences U S A, 102, 1181–1186.
van Wassenhove, V., Grant, K. W., & Poeppel, D. (2007). Temporal window of integration in bimodal speech. Neuropsychologia, 45 (3), 598–607.
Varela, F., Lachaux, J. P., Rodríguez, E., & Martinerie, J. (2001). The brainweb: Phase synchronization and large-scale integration. Nature Reviews Neuroscience, 2, 229–239.
Vatakis, A., Ghazanfar, A. A., & Spence, C. (2008). Facilitation of multisensory integration by the “unity effect” reveals that speech is special. Journal of Vision, 9 (14), 1–11.
Vatakis, A., Navarra, J., Soto-Faraco, S., & Spence, C. (2008). Audiovisual temporal adaptation of speech: Temporal order versus simultaneity judgments. Experimental Brain Research, 185, 521–529.
Vatakis, A., & Spence, C. (2006). Audiovisual synchrony perception for speech and music assessed using a temporal order judgment task. Neuroscience Letters, 393, 40–44.
Vatakis, A., & Spence, C. (2007). Crossmodal binding: Evaluating the “unity assumption” using audiovisual speech and nonspeech stimuli. Perception and Psychophysics, 69, 744–756.
Vatakis, A., & Spence, C. (2008). Evaluating the influence of the “unity assumption” on the temporal perception of realistic audiovisual stimuli. Acta Psychologica, 127 (1), 12–23.
Vatakis, A., & Spence, C. (2010). Audiovisual temporal integration for complex speech, object-action, animal call, and musical stimuli. In M. J. Naumer & J. Kaiser (Eds.), Multisensory object perception in the primate brain (pp. 95–121). New York: Springer. Engineering in Medicine and Biology Society of the Institute of Electrical and Electronics Engineers.
Vatikiotis-Bateson, E., Eigsti, I.-M., Yano, S., & Munhall, K. G. (1998). Eye movement of perceivers during audiovisual speech perception. Perception and Psychophysics, 60, 926–940.
Vatikiotis-Bateson, E., Munhall, K. G., Kasahara, Y., Garcia, F., & Yehia, H. (1996). Characterizing audiovisual information during speech. In Proceedings of the 4th International Conference on Spoken Language Processing (ICSLP 96) (Vol. 3, pp. 1485–1488). New York: IEEE Press.


Vitkovitch, M., & Barber, P. (1994). Effect of video frame rate on subjects’ ability to shadow one of two competing verbal passages. Journal of Speech and Hearing Research, 37 (5), 1204–1211.
Vroomen, J., & de Gelder, B. (2004). Temporal ventriloquism: Sound modulates the flash-lag effect. Journal of Experimental Psychology: Human Perception and Performance, 30, 513–518.
Vroomen, J., & Keetels, M. (2010). Perception of intersensory synchrony: A tutorial review. Attention, Perception, & Psychophysics, 72, 871–884.
Vroomen, J., Keetels, M., de Gelder, B., & Bertelson, P. (2004). Recalibration of temporal order perception by exposure to audiovisual asynchrony. Cognitive Brain Research, 22 (1), 32–35.
Vroomen, J., & Stekelenburg, J. J. (2011). Perception of intersensory synchrony in audiovisual speech: Not that special. Cognition, 118, 78–86.
Walker, S., Bruce, V., & O’Malley, C. (1995). Facial identity and facial speech processing: Familiar faces and voices in the McGurk effect. Perception and Psychophysics, 57, 1124–1133.
Watkins, K. E., Strafella, A. P., & Paus, T. (2003). Seeing and hearing speech excites the motor system involved in speech production. Neuropsychologia, 41 (8), 989–994.
Weikum, W. M., Vouloumanos, A., Navarra, J., Soto-Faraco, S., Sebastián-Gallés, N., & Werker, J. F. (2007). Visual language discrimination in infancy. Science, 316, 1159.
Weisenberger, J. M., Broadstone, S. P., & Kozma-Spytek, L. (1991). Relative performance of single-channel and multichannel tactile aids for speech perception. Journal of Rehabilitation Research and Development, 28, 45–56.
Weisenberger, J. M., Broadstone, S. M., & Saunders, F. A. (1989). Evaluation of two multichannel tactile aids for the hearing impaired. Journal of the Acoustical Society of America, 86 (5), 1764–1775.
Weisenberger, J. M., & Kozma-Spytek, L. (1991). Evaluating tactile aids for speech perception and production by hearing-impaired adults and children. American Journal of Otology, 12 (Suppl), 188–200.
Weisenberger, J. M., & Percy, M. (1995). The transmission of phoneme-level information by multichannel tactile speech perception aids. Ear and Hearing, 16, 392–406.
Weisenberger, J. M., & Russel, A. F. (1989). Comparison of two single-channel vibrotactile aids for the hearing-impaired. Journal of Speech and Hearing Research, 32, 83–92.
Welch, R. B. (1999). Meaning, attention and the “unity assumption” in the intersensory bias of spatial and temporal perceptions. In G. Aschersleben, T. Bachmann, & J. Müsseler (Eds.), Cognitive contributions to the perception of spatial and temporal events (pp. 371–388). Amsterdam: Elsevier.
Welch, R. B., & Warren, D. H. (1980). Immediate perceptual response to intersensory discrepancy. Psychological Bulletin, 88 (3), 638–667.
Welch, R. B., & Warren, D. H. (1986). Intersensory interactions. In K. R. Kaufman & J. P. Thomas (Eds.), Handbook of perception and human performance. Vol. 1: Sensory processes and perception (pp. 1–36). New York: Wiley.
Wickelgren, W. A. (1969). Context-sensitive coding, associative memory, and serial order in (speech) behavior. Psychological Review, 76, 1–15.
Wilson, S. M., Pinar Saygin, A., Sereno, M. I., & Iacoboni, M. (2004). Listening to speech activates motor areas involved in speech production. Nature Neuroscience, 7 (7), 701–702.
Windmann, S. (2003). Effects of sentence context and expectation on the McGurk illusion. Journal of Memory and Language, 50 (2), 212–230.
Wright, T. M., Pelphrey, K. A., Allison, T., McKeown, M. J., & McCarthy, G. (2003). Polysensory interactions along lateral temporal regions evoked by audiovisual speech. Cerebral Cortex, 13, 1034–1043.
Yehia, H. C., Kuratate, T., & Vatikiotis-Bateson, E. (2002). Linking facial animation, head motion and speech acoustics. Journal of Phonetics, 30, 555–568.
Yehia, H., Rubin, P., & Vatikiotis-Bateson, E. (1998). Quantitative association of vocal-tract and facial behaviour. Speech Communication, 26, 23–43.
Yuan, H., Reed, C. M., & Durlach, N. I. (2005). Tactual display of consonant voicing as a supplement to lipreading. Journal of the Acoustical Society of America, 118 (2), 1003–1015.
Zampini, M., Guest, S., Shore, D. I., & Spence, C. (2003). Audio-visual simultaneity judgments. Perception and Psychophysics, 67 (3), 531–544.

Notes:
(1) A spectrograph is an instrument that separates an incoming wave into a frequency spectrum.
(2) Two main theories (motor theory: Liberman et al., 1967; Liberman & Mattingly, 1985; and direct realism: Fowler, 1986) propose that the process of speech perception involves the perceptual recovery and classification of articulatory gestures produced by the talker rather than the acoustic correlates of those articulations. Thus, according to these theories, articulatory movements are the invariant components in speech perception.


(3) Whereas some researchers have used these two terms interchangeably, others have distinguished them theoretically. Summerfield (1992; Footnote 1), for instance, defines lip reading as the perception of speech purely by observing the talker’s articulatory gestures, whereas speech reading would also include other linguistic aspects of nonverbal communication (e.g., the talker’s facial and manual gestures).
(4) According to this model, the processing of corresponding auditory and visual information is never functionally separated (at early stages of processing). Other models claim that audiovisual binding occurs after unisensory signals have been thoroughly processed, thus at a late stage of processing (e.g., FLMP; Massaro, 1998).
(5) Sensory substitution systems gather environmental energy that would normally be processed by one sensory system (e.g., acoustic energy in the present case) and translate this information into stimuli for another sensory system (e.g., electrotactile or vibrotactile energy).
(6) In air, light travels much faster than sound (approximately 300,000,000 m/second vs. 330 m/second, respectively). Neural transduction latencies, in contrast, are shorter for auditory stimuli than for visual stimuli (approximately 10 ms vs. 50 ms, respectively; Pöppel et al., 1990).
(7) Temporal resolution has often been determined by means of temporal order judgment (TOJ) or simultaneity judgment (SJ) tasks. In a TOJ task, multisensory stimuli are presented at various stimulus onset asynchronies (SOAs; Dixon & Spitz, 1980; Hirsh & Sherrick, 1961), and observers judge which stimulus came first or which came second. In an SJ task, observers must judge whether the stimuli were presented simultaneously or successively.
(8) The special nature of audiovisual speech perception, compared with nonspeech audiovisual binding, has been supported in other studies showing that audiovisual interaction is stronger if the very same audiovisual stimuli are treated as speech rather than nonspeech (Tuomainen et al., 2005) and that independent maturational processes underlie speech and nonspeech audiovisual illusory effects (Tremblay et al., 2007). At the physiological level, audiovisual speech and nonspeech interactions also appear to rely, at least in part, on distinct mechanisms (Klucharev et al., 2003; van Wassenhove et al., 2005). However, higher familiarity, extensive exposure to these stimuli in daily life, and the fact that audiovisual speech events may be somehow more attention grabbing (i.e., compared with nonspeech events) are potential confounds that may explain some of the differences reported in previous literature (see Stekelenburg & Vroomen, 2007; Vatakis et al., 2008).
(9) Among other things, determining which components of the face are important for visible speech perception will allow the development of applications with virtual three-dimensional animated talking heads (Ouni et al., 2007). These animated agents have the potential to improve communication in a range of situations by supporting auditory information (e.g., deaf population, telephone conversation, second language learning; Massaro & Light, 2004).
(10) The MMN is typically evoked by an occasional auditory change (deviant) in a homogeneous sequence of auditory stimuli (standards), even when it occurs outside the subjects’ focus of attention. That is, it constitutes an electrophysiological signature of auditory discrimination abilities.
(11) According to the perceptual load theory of attention, the level of attentional demands required to perform a particular process will determine, among other things, the amount of resources available to engage in the processing of task-irrelevant information. When a relevant task exhausts the available processing resources (i.e., under conditions of high perceptual load), other incoming stimuli/tasks will receive little, if any, attention. However, the theory also implies that, if the target-processing load is low, attention will inevitably spill over to the processing of distractors, even if they are task irrelevant.

Agnès Alsius

Agnès Alsius, Department of Psychology, Queen’s University, Ontario, Canada

Ewen MacDonald

Ewen MacDonald, Department of Psychology, Queen’s University, Ontario, Canada; Centre for Applied Hearing Research, Department of Electrical Engineering, Technical University of Denmark, Lyngby, Denmark

Kevin Munhall

Kevin Munhall is Professor and Coordinator of Graduate Studies, Queen's University.


Organization of Conceptual Knowledge of Objects in the Human Brain

Bradford Z. Mahon and Alfonso Caramazza
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0027

Abstract and Keywords

One of the most provocative and exciting issues in cognitive science is how neural specificity for semantic categories of common objects arises in the functional architecture of the brain. Three decades of research on the neuropsychological phenomenon of category-specific semantic deficits has generated detailed claims about the organization and representation of conceptual knowledge. More recently, researchers have sought to test hypotheses developed on the basis of neuropsychological evidence with functional imaging. From those two fields, the empirical generalization emerges that object domain and sensory modality jointly constrain the organization of knowledge in the brain. At the same time, research within the embodied cognition framework has highlighted the need to articulate how information is communicated between the sensory and motor systems and the processes that represent and generalize abstract information. Those developments point toward a new approach for understanding category-specificity in terms of the coordinated influences of diverse regions and cognitive systems.

Keywords: objects, category-specific semantic deficits, conceptual knowledge, organization, object domain, sensory modality

Introduction

The scientific study of how concepts are represented in the mind/brain extends to all disciplines within cognitive science. Within the psychological and brain sciences, research has focused on studying how the perceptual, motor, and conceptual attributes of common objects are represented and organized in the brain. Theories of conceptual representation must therefore explain not only how conceptual content itself is represented and organized but also the role played by conceptual content in orchestrating perceptual and motor processes.


Cognitive neuropsychological studies of brain-damaged patients provide strong evidence about the representation of conceptual knowledge and the relationship between conceptual knowledge and perceptual and motor processes. The cognitive neuropsychological approach ultimately seeks to evaluate models of cognitive processing through the proximate goal of explaining the profile of behavioral performance observed in brain-damaged patients. To the extent that it is possible to establish the functional locus of impairment in a patient within a given model of cognitive functioning, it is possible to test other assumptions of that model through further experiments with that patient. Dissociations of abilities in patients (and of processes in models) are central to the neuropsychological approach. This is because, if a given behavior/process X can be impaired while another behavior/process Y is preserved, then one may conclude that the former process is not causally involved in the latter process. Another important source of evidence from neuropsychology comes from aspects of cognitive functioning that are observed to be systematically impaired or spared together (for discussion of methodological issues in cognitive neuropsychology, see Caramazza, 1986, 1992; Shallice, 1988).

Scope of the Review

The modern study of the representation of concepts in the brain was initiated by a series of papers by Elizabeth Warrington, Tim Shallice, and Rosaleen McCarthy. Those authors described patients with disproportionate semantic impairments for one, or several, categories of objects compared with other categories (see Hécaen & De Ajuriaguerra, 1956, for earlier work). Since those initial investigations, a great deal has been learned about the causes of category-specific semantic deficits and, by extension, about the organization of object knowledge in the brain.


Figure 27.1 Representative picture naming performance of patients with category-specific semantic deficits. a. (Upper Left) Category-specific semantic deficits for living animate things. b. (Upper Right) Category-specific semantic deficits for fruit/vegetables. c. (Lower Left) Category-specific semantic deficits for conspecifics. d. (Lower Right) Category-specific semantic deficits for nonliving things.

The focus of this review is on neuropsychological research and, in particular, on the phenomenon of category-specific semantic deficits. Evidence from other fields within cognitive science and neuroscience, and from functional neuroimaging, is reviewed as it bears on the theoretical positions that emerge from the study of category-specific semantic deficits. In particular, we highlight findings in functional neuroimaging related to the representation of different semantic categories in the brain. We also discuss the degree to which conceptual representations are grounded in sensory and motor processes and the critical role that neuropsychological studies of patients with impairments to sensory and motor knowledge can play in constraining theories of semantic representation. However, the stated focus of this chapter also excludes important theoretical positions in the field of semantic memory (e.g., Patterson, Nestor, & Rogers, 2007).

Category-Specific Semantic Deficits: Introduction to the Phenomenon

Patients with category-specific semantic deficits present with disproportionate or even selective impairments for one semantic category compared with other semantic categories. Figure 27.1 illustrates cases of disproportionate impairment for animals (see Figure 27.1A; Blundo et al., 2006; Caramazza & Shelton, 1998), fruit/vegetables (see Figure 27.1B; Hart et al., 1985; Samson & Pillon, 2003), conspecifics (see Figure 27.1C; Miceli et al., 2000; Ellis et al., 1989), and nonliving things (see Figure 27.1D; Hillis & Caramazza, 1991; Laiacona & Capitani, 2001). There have been over 100 reported cases of category-specific semantic impairment (for review and discussion, see Capitani et al., 2003; Hart et al., 2007; Humphreys & Forde, 2001; Tyler & Moss, 2001). The majority of reported patients have disproportionate impairments for living things compared with nonliving things (Capitani et al., 2003).

One important aspect of the performance profile of patients with category-specific semantic impairment is that the impairment is to conceptual knowledge and not (only) to modality-specific input or output representations. The evidence for locating the deficit at a conceptual level is that the category-specific deficit does not depend on stimuli being presented, or on patients responding, in only one modality of input or output. For instance, patients KC and EW (see Figure 27.1A) were impaired for naming living animate things compared with nonliving things and fruit/vegetables. Both patients were also impaired for answering questions about living animate things, such as “does a whale have legs” or “are dogs domestic animals,” but were unimpaired for the same types of questions about nonanimals (Figure 27.2A).

Patients with category-specific semantic deficits may also have additional, and also category-specific, deficits at presemantic levels of processing. For instance, patient EW was impaired for judging whether pictures depicted real or unreal animals, but was unimpaired for the same task over nonanimal stimuli. The ability to make such decisions is assumed to index the integrity of the visual structural description system, a presemantic stage of object recognition (Humphreys et al., 1988). In contrast, patient KC was relatively unimpaired on an object decision task, even for the category of items (living animate) that the patient was unable to name. A similar pattern to that observed in patient KC was present in patient APA (Miceli et al., 2000). Patient APA was selectively impaired for conceptual knowledge of people (see Figure 27.1C). Despite a severe impairment for naming famous people, APA did not have a deficit at the level of face recognition (prosopagnosia).

Another important aspect of patients with category-specific semantic impairments is that they have difficulty distinguishing among basic-level items within the impaired category, but do not necessarily have problems assigning items they cannot identify to the correct superordinate-level category (e.g., they may know that a picture of a dog is an animal, but do not know which animal; see Humphreys & Forde, 2005, for a patient with greater difficulty at a superordinate than a basic level across all semantic categories).

A number of studies have now documented that variables such as lexical frequency, concept familiarity, and visual complexity may be unbalanced if items are sampled “randomly” from different semantic categories (Cree & McRae, 2003; Funnell & Sheridan, 1992; Stewart et al., 1992). In addition, Laiacona and colleagues (Barbarotto et al., 2002; Laiacona et al., 1998) have highlighted the need to control for gender-specific effects on variables such as concept familiarity (for discussion of differences between males and females in the incidence of category-specific semantic deficits for different categories, see Laiacona et al., 2006). However, the existence of category-specific semantic deficits is not an artifact of such stimulus-specific attributes. Clear cases have been reported while carefully controlling for those factors, and double dissociations have been reported using the same materials (e.g., Hillis & Caramazza, 1991; see also the separate case reports in Laiacona & Capitani, 2001, and Barbarotto et al., 1995).

Overview of Theoretical Explanations of the Causes of Category-Specific Semantic Deficits

Theories developed to explain category-specific semantic deficits fall into two broad groups (Caramazza, 1998). Theories within the first group, based on the neural structure principle, assume dissociable neural substrates are differentially (or exclusively) involved in representing different semantic categories. Theories within the second group, based on the correlated structure principle, assume that conceptual knowledge of items from different semantic categories is not represented in functionally dissociable regions of the brain.

According to theories based on the neural structure principle, category-specific semantic deficits are due to differential or selective damage to the neural substrate upon which the impaired category of items depends. Two broad classes of theories based on the neural structure principle are the sensory/functional theory (Warrington & McCarthy, 1983, 1987; Warrington & Shallice, 1984) and the domain-specific hypothesis (Caramazza & Shelton, 1998).


Figure 27.2 Relation between impairments for a type or modality of knowledge and category-specific semantic deficits. These data show that: a. (Upper Graph) category-specific semantic impairments are associated with impairments for all types of knowledge about the impaired category; b. (Middle Graph) differential impairments for visual/perceptual knowledge can be associated with (if anything) a disproportionate impairment for nonliving things compared to living things; and c. (Lower Graph) selective impairment for knowledge of object color is not associated with a corresponding disproportionate deficit for fruit/vegetables. References for patient initials from a. (Upper Graph): EW–Caramazza and Shelton 1998; GR and FM–Laiacona et al. 1993; DB–Lambon Ralph et al. 1998; RC–Moss et al. 1998.

The sensory/functional theory is composed of two assumptions. The first, the multiple semantics assumption, is that conceptual knowledge is organized into subsystems that parallel the sensory and motor modalities of input and output. The second assumption is that the critical semantic attributes of items from different categories of objects are represented in different modality-specific semantic subsystems.

The domain-specific hypothesis assumes that the first-order constraint on the organization of conceptual knowledge is object domain, with the possible domains restricted to those that could have had an evolutionarily relevant history: living animate, living inanimate, conspecifics, and “tools.”

Theories based on the correlated structure principle model semantic memory as a system that represents statistical regularities in the co-occurrence of object properties in the world (Caramazza et al., 1990; Devlin et al., 1998; McClelland & Rogers, 2003; Tyler & Moss, 2001). This class of models has been instrumental in motivating large-scale empirical investigations of how different types of features are distributed and correlated for different semantic categories. Several theories based on the correlated structure principle have been developed to explain the causes of category-specific semantic deficits (Caramazza et al., 1990; Devlin et al., 1998; Tyler & Moss, 2001).

This review is organized to reflect the role that different theoretical assumptions have played in motivating empirical research. Initial hypotheses that were developed to explain category-specific semantic deficits appealed to a single principle of organization (modality specificity, domain specificity, or correlated structure). The current state of the field of category-specific semantic deficits is characterized by complex models that integrate assumptions from multiple theoretical frameworks. This trajectory of theoretical positions reflects the fact that, although theories that have been developed based on the neural and correlated structure principles are mutually contrary as explanations about the causes of category-specific semantic deficits, the individual assumptions that constitute those theories are not necessarily incompatible as hypotheses about the structure of knowledge in the brain (for discussion, see Caramazza & Mahon, 2003).

Neural Structure Principle

Multiple Semantics Assumption

The proposal that the organization of the semantic system follows the organization of the various input and output modalities to and from the semantic system was initially advanced by Beauvois (Beauvois, 1982; Beauvois et al., 1978). The original motivation for the assumption of multiple semantics was the phenomenon of optic aphasia (e.g., Lhermitte & Beauvois, 1973; for review, see Plaut, 2002). Patients with optic aphasia present with impaired naming of visually presented objects, but relatively (or completely) spared naming of the same objects when presented through the tactile modality (e.g., Hillis & Caramazza, 1995). The fact that optic aphasic patients can name objects presented through the tactile modality indicates that the naming impairment to visual presentation is not due to a deficit at the level of retrieving the correct names. In contrast to patients with visual agnosia (e.g., Milner et al., 1991), patients with optic aphasia can recognize, at a visual level of processing, the stimuli they cannot name. Evidence for this is provided by the fact that some optic aphasic patients can demonstrate the correct use of objects that they cannot name (e.g., Coslett & Saffran, 1992; Lhermitte & Beauvois, 1973; see Plaut, 2002). Beauvois (1982) explained the performance of optic aphasic patients by assuming that the conceptual system is functionally organized into visual and verbal semantics and that optic aphasia is due to a disconnection between the two semantic systems.

Along with reporting the first cases of category-specific semantic deficit, Warrington and her collaborators (Warrington & McCarthy, 1983; Warrington & Shallice, 1984) developed an influential explanation of the phenomenon that built on the proposal of Beauvois (1982). Warrington and colleagues argued that category-specific semantic deficits are due to differential damage to a modality-specific semantic subsystem that is not itself organized by semantic category. Specifically, those authors noted that the patients they had reported with impairments for living things also had impairments for foods, plants, and precious stones (Warrington & Shallice, 1984); in contrast, a patient with an impairment for nonliving things (Warrington & McCarthy, 1983) was spared for living things, food, and plant life. Warrington and her collaborators reasoned that the association of impaired and spared categories was meaningfully related to the degree to which identification of items from those categories depends on sensory or functional knowledge. Specifically, they argued that the ability to identify living things differentially depends on sensory knowledge, whereas the ability to identify nonliving things differentially depends on functional knowledge.

Farah and McClelland (1991) implemented the theory of Warrington and colleagues in a connectionist framework. Three predictions follow from the computational model of Farah and McClelland (1991; for discussion, see Caramazza & Shelton, 1998). All three of those predictions have now been tested. The first prediction is that the grain of category-specific semantic deficits should not be finer than living versus nonliving. This prediction follows from the assumption that all living things differentially depend on visual knowledge. However, as represented in Figure 27.1, patients have been reported with selective semantic impairments for fruit/vegetables (e.g., Hart et al., 1985; Laiacona et al., 2005; Samson & Pillon, 2003) and animals (e.g., Blundo et al., 2006; Caramazza & Shelton, 1998). The second prediction is that an impairment for a given category of knowledge will be associated with a disproportionate impairment for the modality of knowledge that is critical for that category. At variance with this prediction, it is now known that category-specific semantic deficits are associated with impairments for all types of knowledge (sensory and functional) about items from the impaired category (see Figure 27.2A; e.g., Blundo et al., 2006; Caramazza & Shelton, 1998; Laiacona & Capitani, 2001; Laiacona et al., 1993; Lambon Ralph et al., 1998; Moss et al., 1998). The third prediction is that impairments for a type of knowledge will necessarily be associated with differential impairments for the category that depends on that knowledge type. Patients exhibiting patterns of impairment contrary to this prediction have been reported. For instance, Figure 27.2B shows the profile of a patient who was (1) more impaired for visual compared with functional knowledge, and (2) if anything, more impaired for nonliving things than living things (Lambon Ralph et al., 1998; see also Figure 27.2C, Figure 27.3, and discussion below).


Second-Generation Sensory/Functional Theories

Figure 27.3 Category-specific patterns of BOLD response in the healthy brain (data from Chao et al., 2002; graphics provided by Alex Martin). This figure shows, in red, a network of regions that are differentially activated for living animate things and, in blue, a network of regions that are differentially activated for nonliving things.

The original formulation of the sensory/functional theory was based on a simple division between visual-perceptual knowledge and functional-associative knowledge. Warrington and McCarthy (1987; see also Crutch & Warrington, 2003) suggested, however, that knowledge of object color is differentially important for fruit/vegetables compared with animals. Since Warrington and McCarthy, further sensory- and motor-based dimensions that may be important for distinguishing between semantic categories have been articulated (e.g., Cree & McRae, 2003; Vinson et al., 2003).

Cree and McRae (2003) used a feature-listing task to study the types of information that normal subjects spontaneously associate with different semantic categories. The semantic features were then classified into nine knowledge types: color, visual parts and surface properties, visual motion, smell, sound, tactile, taste, function, and encyclopedic (see Vinson et al., 2003, for a slightly different classification). Hierarchical cluster analyses were used to determine which semantic categories differentially loaded on which feature types. The results of those analyses indicated that (1) visual motion and function information were the two most important knowledge types for distinguishing living animate things (high on visual motion information) from nonliving things (high on function information); (2) living animate things were weighted lower on color information than fruit/vegetables, but higher on this knowledge type than nonliving things; and (3) fruit/vegetables were distinguished from living animate and nonliving things by being weighted the highest on both color and taste information.

Cree and McRae’s analyses support the claim that the taxonomy of nine knowledge types is effective in distinguishing among the domains living animate, fruit/vegetables, and nonliving. Those analyses do not demonstrate that the nine knowledge types are critical for distinguishing among items within the respective categories. However, and as noted above, patients with category-specific semantic impairments do not necessarily have difficulty distinguishing between different domains (i.e., they might know it is an “animal,” but cannot say which one). It is therefore not obvious that Cree and McRae’s analyses support the claim that category-specific semantic deficits may be explained by assuming damage to one (or more) of the nine knowledge types.

At a more general level, the open empirical question is whether the additional knowledge types and the corresponding further functional divisions that are introduced into the semantic system can account for the neuropsychological evidence. Clearly, if fruit/vegetables and animals are assumed to differentially depend on different types of information (and by inference, different semantic subsystems), it is in principle possible to account for the tripartite distinction between animals, fruit/vegetables, and nonliving. As for the original formulation of the sensory/functional theory, the question is whether fine-grained category-specific semantic impairments are associated with impairments for the type of knowledge upon which items from the impaired category putatively depend. However, patients have been reported with category-specific semantic impairments for fruit/vegetables, without disproportionate impairments for color knowledge (e.g., Samson & Pillon, 2003). Patients have also been reported with impairment for knowledge of object color without a disproportionate impairment for fruit/vegetables compared with other categories of objects (see Figure 27.2C; Luzzatti & Davidoff, 1994; Miceli et al., 2001).
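The logic of the cluster analyses described above, in which items are grouped by how heavily they load on each knowledge type, can be illustrated with a minimal sketch. It is not a reconstruction of Cree and McRae's (2003) analysis: the items, the four knowledge types retained, and every number in `PROFILES` are invented for illustration, and average-linkage clustering over Euclidean distance is an assumption rather than the method reported in that study.

```python
from itertools import combinations
from math import dist

# Hypothetical profiles: proportion of listed features falling under each
# knowledge type. All values are invented for illustration and are NOT
# taken from Cree and McRae (2003).
KNOWLEDGE_TYPES = ("color", "visual_motion", "taste", "function")
PROFILES = {
    "dog":    (0.10, 0.30, 0.00, 0.05),
    "horse":  (0.08, 0.28, 0.00, 0.10),
    "apple":  (0.25, 0.00, 0.20, 0.05),
    "carrot": (0.22, 0.00, 0.18, 0.06),
    "hammer": (0.02, 0.02, 0.00, 0.40),
    "saw":    (0.01, 0.03, 0.00, 0.42),
}


def average_linkage(a, b, profiles):
    """Mean Euclidean distance over all pairs with one item from each cluster."""
    total = sum(dist(profiles[x], profiles[y]) for x in a for y in b)
    return total / (len(a) * len(b))


def agglomerate(profiles, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [frozenset([name]) for name in profiles]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return set(clusters)


clusters = agglomerate(PROFILES, n_clusters=3)
for cluster in sorted(clusters, key=sorted):
    print(sorted(cluster))
```

With these invented profiles the three recovered clusters correspond to animals, fruit/vegetables, and tools, which mirrors the point made in the text: such an analysis shows that knowledge types separate domains, not that they separate items within a domain. In practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` would normally replace the hand-written loop.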
Another way in which investigators have sought to provide support for the sensory/functional theory is to study the semantic categories that are systematically impaired together. As noted above, one profile of the first reported cases that motivated the development of the sensory/functional theory (Warrington & Shallice, 1984) was that the categories of animals, plants, and foods tended to be impaired or spared together. Those associations of impairing and sparing of categories made sense if all of those categories depended on the same modality-specific system for their identification. Following the same logic, it was argued that musical instruments patterned with living things (because of the importance of sensory attributes; see Dixon et al., 2000, for relevant data), whereas body parts patterned with nonliving things (because of the importance of functional attributes associated with object use; e.g., Warrington & McCarthy, 1987). However, as was the case for the dissociation between living animate (animals) and living inanimate (e.g., plants) things, it is now known that musical instruments dissociate from living things, whereas body parts dissociate from nonliving things (Caramazza & Shelton, 1998; Laiacona & Capitani, 2001; Shelton et al., 1998; Silveri et al., 1997; Turnbull & Laws, 2000; for review and discussion, see Capitani et al., 2003).

More recently, Borgo and Shallice (2001, 2003) have argued that sensory-quality categories, such as materials, edible substances, and drinks, are similar to animals in that they depend on sensory information for their identification. Those authors reported that impairment for living things was associated with impairments for sensory-quality categories. However, Laiacona and colleagues (2003) reported a patient who was impaired for living things but spared for sensory-quality categories (for further discussion, see Carroll & Garrard, 2005).

Another dimension that has been argued to be instrumental in accounting for category-specific semantic deficits is differential similarity in the visual structure of items from different categories. Humphreys and Forde (2001; see also Tranel et al., 1997) argued that living things tend to be more structurally similar than nonliving things. If that were the case, then it could be argued that damage to a system not organized by object category would result in disproportionate disruption of items that are more “confusable” (see also Lambon Ralph et al., 2007; Rogers et al., 2004). Within Humphreys and Forde’s framework, it is also assumed that activation dynamically cascades from visual object recognition processes through to lexical access. Thus, perturbation of visual recognition processes could trickle through the system to disrupt the normal functioning of subsequent processes, resulting in a naming deficit (see Humphreys et al., 1988).

Laws and colleagues (Laws & Gale, 2002; Laws & Neve, 1999) also argued for the critical role of similarity in visual structure for explaining category-specific semantic deficits. However, in contrast to Humphreys and Forde (see also Tranel et al., 1997), Laws and colleagues argued that nonliving things tend to be more similar than living things.

Clearly, there remains much work to be done to understand the role that visual similarity and the consequent “crowding” (Humphreys & Forde, 2001) of visual representations have in explaining category-specific semantic deficits. On the one hand, there is no consensus regarding the relevant object properties over which similarity should be calculated, or regarding how such a similarity metric should be calculated. On the other hand, assuming an “agreed on” means for determining similarity in visual shape, the question remains open as to the role that such a factor might play in explaining the facts of category-specific semantic deficits.

Domain-Specific Hypothesis

The domain-specific hypothesis of the organization of conceptual knowledge in the brain (Caramazza & Shelton, 1998) assumes that the first-order constraint on the organization of information within the conceptual system is object domain. The semantic categories that may be organized by domain-specific constraints are limited to those that could have had an evolutionarily relevant history: living animate, living inanimate, conspecifics, and tools. On this proposal the phenomenon of category-specific semantic deficit reflects differential or selective damage to the neural substrates that support one or another domain of knowledge. Research from developmental psychology converges with the assumption that conceptual knowledge is organized, in part, by innately specified constraints on object knowledge (e.g., Baillargeon, 1998; Carey & Spelke, 1994; Gallistel, 1990; R. Gelman, 1990; Keil, 1981; Spelke et al., 1992; Wellman & S. Gelman, 1992; see Santos & Caramazza, 2002, for review; see, e.g., Kiani et al., 2007, for convergent findings using neurophysiological methods with nonhuman primates). Research in developmental psychology has also highlighted other domains of knowledge beyond those motivated by neuropsychological research on patients with category-specific deficits, such as number and geometric/spatial reasoning (e.g., Cantlon et al., 2009; Feigenson et al., 2004; Hermer & Spelke, 1994).

Unique predictions are generated by the original formulation of the domain-specific hypothesis as it was articulated in the context of category-specific semantic deficits. One prediction is that the grain of category-specific semantic deficits will reflect the grain of those categories that could plausibly have had an evolutionarily relevant history (see Figure 27.1). Another prediction is that category-specific semantic impairments will be associated with impairments for all types of knowledge about the impaired object type (see Figure 27.2A). A third prediction made by the domain-specific hypothesis is that it should be possible to observe category-specific impairments that result from early damage to the brain. Evidence in line with this expectation is provided by the case of Adam (Farah & Rabinowitz, 2003). Patient Adam, who was 16 years old at the time of testing, suffered a stroke at 1 day of age. Adam failed to acquire knowledge of living things, despite normal levels of knowledge about nonliving things. As would be expected within the framework of the domain-specific hypothesis, Adam was impaired for both visual and nonvisual knowledge of living things (Farah & Rabinowitz, 2003).

Correlated Structure Principle

Theories based on the correlated structure principle assume that the conceptual system has no structure that is specifically reflected in functional neuroanatomy. For instance, the organized unitary content hypothesis (OUCH; Caramazza et al., 1990) was initially formulated as an explanation of optic aphasia that did not invoke the assumption of multiple semantics. Caramazza and colleagues (1990; see also Riddoch et al., 1988) argued that there are privileged relationships between certain types of input representations (e.g., visual form) and certain types of output representations (e.g., knowledge of object manipulation), thus explaining how optic aphasic patients might be spared for gesturing to objects while impaired for naming them.

Other researchers subsequently developed highly specified proposals based on the correlated structure principle, all of which build on the idea that different types of features are differentially correlated across different semantic categories (Devlin et al., 1998; Rogers et al., 2004; Tyler & Moss, 2001). Those models of semantic memory have been implemented computationally, with simulated damage, to provide existence proofs that a system with no explicit functional organization may be damaged so as to produce category-specific semantic deficits. Because theories based on the correlated structure principle do not assume that the conceptual system has structure at the level of functional neuroanatomy, they are best suited to modeling the patterns of progressive loss of conceptual knowledge observed in neurodegenerative diseases, such as dementia of the Alzheimer type and semantic dementia. The type of damage in such patients is diffuse and widespread and can be modeled in connectionist architectures by removing, to varying degrees, randomly selected components of the network.
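The idea of simulating diffuse damage by removing randomly selected components to varying degrees can be made concrete with a small sketch. It assumes a single linear layer stored as a list of weight rows; the network size, the weights, the input pattern, and the function names `lesion` and `forward` are all hypothetical, and none of this reproduces the architecture or lesioning procedure of any published model.

```python
import random


def lesion(weights, proportion, rng):
    """Return a copy of `weights` with a random `proportion` of connections set to 0."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    rows, cols = len(weights), len(weights[0])
    all_connections = [(i, j) for i in range(rows) for j in range(cols)]
    # Choose which connections to remove, without replacement.
    removed = rng.sample(all_connections, round(proportion * len(all_connections)))
    damaged = [row[:] for row in weights]  # leave the intact network untouched
    for i, j in removed:
        damaged[i][j] = 0.0
    return damaged


def forward(weights, inputs):
    """Output of a single linear layer: one weighted sum per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


rng = random.Random(0)
# Hypothetical network: 4 output units by 5 input units, all weights nonzero.
weights = [[rng.uniform(0.1, 1.0) for _ in range(5)] for _ in range(4)]
pattern = [1.0, 0.0, 1.0, 1.0, 0.0]

# Increasing lesion severity, from intact to fully disconnected.
for proportion in (0.0, 0.25, 0.5, 1.0):
    damaged = lesion(weights, proportion, rng)
    surviving = sum(w != 0.0 for row in damaged for w in row)
    print(proportion, surviving, [round(v, 2) for v in forward(damaged, pattern)])
```

Varying `proportion` stands in for severity of damage. Whether such damage produces a category-specific pattern depends on how features are distributed and correlated in the trained network, which this sketch does not model.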

One important proposal is the conceptual-structure account of Tyler, Moss, and colleagues (Bright et al., 2005; Tyler & Moss, 2001). That proposal assumes that living things have more shared features, whereas nonliving things have more distinctive features. The model further assumes that the shared features of living things are highly correlated (has eyes/can see), whereas for nonliving things distinctive features are highly correlated (used for spearing/has tines). If distinctive features are critical for identification, and if greater correlation confers resilience to damage, then an interaction between the severity of overall impairment and the direction of category-specific semantic deficit is predicted. Mild levels of impairment should produce disproportionate impairments for living things compared with nonliving things. At more severe levels of impairment, the distinctive features of nonliving things will be lost, and a disproportionate impairment for this category will be observed. The opposite prediction regarding the severity of overall impairment and the direction of category-specific impairment is made by the account of Devlin and colleagues (1998) because it is assumed that as damage becomes severe, whole sets of inter-correlated features will be lost, resulting in a disproportionate impairment for living things. However, it is now known that neither prediction finds clear empirical support (Garrard et al., 1998; Zannino et al., 2002; see also Laiacona and Capitani, 2001, for discussion within the context of focal lesions; for further discussion and theoretical developments, see Cree and McRae, 2003; Vinson et al., 2003).

One issue that is not resolved is whether correlations between different features should be calculated in a concept-dependent or concept-independent manner (Zannino et al., 2006). For instance, although the (“distinctive”) information “has tines” is highly correlated with the function “used for spearing” in the concept of “fork” (correlated as concept dependent), the co-occurrence of those properties in the world is relatively low (concept independent). Sartori, Lombardi, and colleagues (Sartori & Lombardi, 2004; Sartori et al., 2005) have addressed a similar issue by developing the construct of “semantic relevance,” which is computed through a nonlinear combination of the frequency with which particular features are produced for an item and the distinctiveness of those features for all concepts in the database. Those authors have shown that living things tend to be lower, on average, than nonliving things in terms of their relevance, thus making living things on average “harder” than nonliving things. As is the case for other accounts of category-specific semantic deficits that are based on differences across categories along a single dimension, the existence of disproportionate deficits for the relatively “easy” category (nonliving things) is difficult to accommodate (see, e.g., Hillis & Caramazza, 1991; Laiacona & Capitani, 2001; see Figure 27.1D). Nevertheless, the theoretical proposal of Sartori and colleagues highlights the critical and unresolved issue of how to determine the “psychologically relevant” metric for representing feature correlations.

Another unresolved issue is whether high correlations between features will provide “resilience” to damage for those features, or rather will make damage “contagious” among them. It is often assumed that high correlation confers resilience to, or insulation from, damage; however, our understanding of how damage to one part of the brain affects other regions of the brain remains poorly developed. It is also not obvious that understanding the behavior of connectionist architectures constitutes the needed motivation for deciding, one way or the other, whether greater correlation confers greater resilience to damage. In fact, theoretical differences about the role of correlations in conferring resilience to damage are in part responsible for the contrasting predictions that follow from the models of Tyler and colleagues (Tyler & Moss, 2001) and Devlin and colleagues (1998) (see Zannino et al., 2006, for discussion).

Another example that illustrates our current lack of understanding of the role of correlation in determining the patterns of impairment is provided by dissociations between sensory, motor, and conceptual knowledge. For instance, the visual structure of objects is highly correlated with more abstract knowledge of the conceptual features of objects. Even so, patients with impairments to abstract conceptual features of objects do not necessarily have corresponding impairments to object recognition processes (see above, and Capitani et al., 2003, for review). Similarly, although manipulation knowledge (“how to” knowledge) is correlated with functional knowledge (“what for” knowledge), damage to the former does not imply damage to the latter (see Buxbaum et al., 2000; see Figure 27.3D and discussion below).

Theories based on the correlated structure principle are presented as alternatives to proposals that assume neural structure within the conceptual system. The implicit assumption in that argument is that the theoretical construct of a semantic feature offers a means for reducing different categories to a common set of elements (see Rogers et al., 2004, for an alternative proposal). There are, however, no semantic features that have been described that are shared across semantic categories, aside from very abstract features such as “has mass” (Strnad, Anzellotti, & Caramazza, 2011). In other words, to the extent that semantic features are the “substance” of conceptual representations, different semantic categories would be represented by non-overlapping sets of features. Thus, and as has been proposed on the basis of functional neuroimaging data (see, e.g., Haxby et al., 2001, and discussion below), it may be the case that regions of high feature correlation (e.g., within semantic category correlations in visual structure) are reflected in the functional neuroanatomy of the brain (see also Devlin et al., 1998, for a hybrid model in which both focal and diffuse lesions can produce category-specific effects, and Caramazza et al., 1990, for an earlier proposal along those lines).

Anatomy of Category Specificity

An important development in cognitive neuroscience that has paralleled the articulation of theories of semantic organization is the discovery of multiple channels of visual processing (Goodale & Milner, 1992; Ungerleider & Mishkin, 1982). It is now known that visual processing bifurcates into two independent but interconnected streams (for discussion of how best to characterize the two streams, see Pisella et al., 2006). The ventral visual object processing stream projects from V1 through ventral occipital and temporal cortices, terminating in anterior regions of the temporal lobe, and subserves visual object identification. The dorsal object processing stream projects from V1 through dorsal occipital cortex to posterior parietal cortex and subserves object-directed action and spatial analysis for the purposes of object-directed grasping. The two-visual-systems hypothesis has played a central role in understanding the neuroanatomy of category specificity.

Lesion Analyses

A natural issue to arise in neuropsychological research concerns which brain regions tend to be lesioned in association with category-specific deficits. The first study to systematically address this issue was by H. Damasio and colleagues (1996). Those authors found that name retrieval deficits for pictures of famous people were associated with left temporal pole lesions, a result confirmed by other investigators (see Lyons et al., 2006, for an overview). Damasio and colleagues also found that deficits for naming animals were associated with (more posterior) lesions of anterior left ventral temporal cortex. Subsequent research has confirmed that deficits for naming animals are associated with lesions to anterior regions of temporal cortex (e.g., Brambati et al., 2006). Damasio and collaborators also found that deficits for naming tools were associated with lesions to posterior and lateral temporal areas, overlapping the left posterior middle temporal gyrus. The critical role of the left posterior middle temporal gyrus for knowing about tools has also since been confirmed by other lesion studies (e.g., Brambati et al., 2006).

A subsequent report by H. Damasio and colleagues (2004) demonstrated that the same regions were also reliably damaged in patients with impairments for recognizing stimuli from those three categories. In addition, Damasio and colleagues (2004) found that deficits for naming tools, as well as fruit/vegetables, were associated with lesions to the inferior precentral and postcentral gyri and the insula. Consensus about the association of lesions to the regions discussed above with category-specific deficits is provided by Gainotti’s (e.g., 2000) analyses of published reports of patients with category-specific semantic deficits.

A number of investigators have interpreted the differential role of anterior mesial aspects of ventral temporal cortex in the processing of living things to reflect the fact that living things have more shared properties than nonliving things, such that more fine-grained discriminations are required to name them (Bright et al., 2005; Damasio et al., 2004; Simmons & Barsalou, 2003; see also Humphreys et al., 2001). Within this framework, the association of deficits to unique person knowledge and lesions to the most anterior aspects of the temporal lobe is assumed to reflect the greater discrimination that is required for distinguishing among conspecifics, compared with animals (less) and nonliving things (even less).


Functional Imaging

Figure 27.4 Congenitally blind and sighted participants were presented with spoken words referring to living things (animals) and nonliving things (tools, nonmanipulable objects) and were asked to make size judgments about the referents of the words. The sighted participants were also shown pictures corresponding to the same stimuli in a separate scan. For sighted participants viewing pictures, the known finding was replicated that nonliving things such as tools and large nonmanipulable objects lead to differential neural responses in medial aspects of ventral temporal-occipital cortex. This pattern of differential BOLD responses for nonliving things in medial aspects of ventral temporal-occipital cortex was also observed in congenitally blind participants and sighted participants performing the size judgment task over auditory stimuli. These data indicate that the medial-to-lateral bias in the distribution of category-specific responses does not depend on visual experience. For details of the study, see Mahon and colleagues (2009).

Data from functional imaging, and in particular functional magnetic resonance imaging (fMRI), have added in important ways to our understanding of how different semantic categories are processed in the healthy brain. In particular, although the lesion overlap approach is powerful in detecting brain regions that are critical for performing a given task, functional imaging has the advantage of detecting both regions that are critical and regions that are automatically engaged by the mere presentation of a certain type of stimulus. Thus, in line with the lesion evidence described above, nonliving things, and in particular tools, differentially activate the left middle temporal gyrus (Figure 27.4A; e.g., Martin et al., 1996; Thompson-Schill et al., 1999; see Devlin et al., 2002, for review). Other imaging data indicate that this region plays an important role in processing the semantics of actions (e.g., Martin et al., 1995; Kable et al., 2002; Kemmerer et al., 2008), as well as mechanical (i.e., unarticulated) motion (Beauchamp et al., 2002, 2003; Martin & Weisberg, 2003).

In contrast, and not as apparent in lesion studies, tools differentially activate dorsal stream regions that mediate object-directed action. The activation of some of those regions is independent of whether action information is necessary to perform the task in which participants are engaged (e.g., picture naming). For instance, regions within dorsal occipital cortex, posterior parietal cortex, through to the anterior intraparietal sulcus, are automatically activated when participants observe manipulable objects (e.g., Chao & Martin, 2000; Culham et al., 2003; Fang & He, 2005; Frey et al., 2005). Those regions are important for determining volumetric and spatial information about objects as well as shaping and transporting the hand for object grasping. However, those dorsal occipital and posterior parietal regions are not thought to be critical for object identification or naming (e.g., Goodale & Milner, 1992). Naming tools also differentially activates the left inferior parietal lobule (e.g., Mahon et al., 2007; Rumiati et al., 2003), a structure that is important for representing complex object-associated manipulations (for review, see Johnson-Frey, 2004; Lewis, 2006).

One clear way in which functional imaging data have contributed beyond lesion evidence to our understanding of category specificity in the brain is the description of highly consistent topographic biases by semantic categories in the ventral object processing stream (see Figure 27.4B and C; for reviews, see Bookheimer, 2002; Gerlach, 2002; Grill-Spector & Malach, 2004; Op de Beeck et al., 2008; Thompson-Schill, 2003). In addition to the anterior-posterior mapping of semantic categories within the ventral stream described by the lesion evidence (e.g., Damasio et al., 1996), there is also a lateral-to-medial organization. The fusiform gyrus on the ventral surface of the temporal-occipital cortex is critical for representing object color and form (e.g., Martin, 2007; Miceli et al., 2001). Living animate things such as faces and animals elicit differential neural responses in the lateral fusiform gyrus, whereas nonliving things (tools, vehicles) elicit differential neural responses in the medial fusiform gyrus (e.g., Chao et al., 1999; Mahon et al., 2007; Noppeney et al., 2006). Stimuli that are highly definable in terms of their spatial context, such as houses and scenes, differentially activate regions anterior to these fusiform regions, in the vicinity of parahippocampal cortex (e.g., Bar & Aminoff, 2003; Epstein & Kanwisher, 1998). Other visual stimuli also elicit consistent topographical biases in the ventral stream, for instance, written words (see Dehaene et al., 2005, for discussion) and images of body parts (e.g., Downing et al., 2001).

Distributed Domain-Specific Hypothesis

The basic phenomenon of consistent topographic biases by semantic category in the ventral stream sits more naturally with the distributed domain-specific hypothesis than with the sensory/functional theory. To explain those data within the context of the sensory/functional theory, further assumptions are necessary about why there would be an organization by semantic category within the (putative) visual modality. In short, a hybrid model is required that combines the assumption of multiple semantics with some claim about how information would come to be topographically segregated by semantic category. A number of such proposals have been advanced, although not always in the context of the sensory/functional theory, or more generally within the context of theories that emerge from category-specific semantic deficits (see, e.g., Gauthier et al., 2000; Haxby et al., 2001; Ishai et al., 1999; Levy et al., 2001; Mechelli et al., 2006; Rogers et al., 2003).

To date, the emphasis of research on the organization of the ventral stream has been on the stimulus properties that drive responses in a particular brain region, studied in relative isolation from other regions. This approach was inherited from well-established traditions in neurophysiology and psychophysics, where it has been enormously productive for mapping psychophysical continua in primary sensory systems. It does not follow that the same approach will yield equally useful insights for understanding the principles of the neural organization of conceptual knowledge. The reason is that, unlike in the peripheral sensory systems, the pattern of neural responses in higher order areas is only partially driven by the physical input; it is also driven by how the stimulus is interpreted, and that interpretation does not occur in a single, isolated region. The ventral object processing stream is the central pathway for the extraction of object identity from visual information in the primate brain, but what the brain does with that information about object identity depends on how the ventral stream is connected to the rest of the brain.

We have extended the domain-specific hypothesis, as developed in the context of category-specific semantic deficits, to explain the causes of category specificity in the ventral stream (Caramazza & Mahon, 2003; Mahon & Caramazza, 2009). This reformulation of the theory is referred to as the distributed domain-specific hypothesis. On that view, what is given innately is the connectivity; specialization by semantic category in the ventral stream is driven by that connectivity (Mahon & Caramazza, 2011). The implication of this proposal is that the organization of the ventral stream by category is relatively invariant to visually based, bottom-up constraints. This approach corrects an imbalance in explanations of the causes of the consistent topography by semantic category in the ventral object processing stream by giving greater prominence to endogenously determined constraints on brain organization.

An important characteristic of domain-specific systems is that the computations that must be performed over items from the domain are sufficiently “eccentric” (Fodor, 1983) to merit a specialized process. In other words, the coupling across different brain regions that is necessary for successful processing of a given domain is different in kind from the types of coupling that are needed for other domains of knowledge. For instance, the need to integrate motor-relevant information with visual information is present for tools and other graspable objects and less so for animals or faces. In contrast, the need to integrate affective information, biological motion processing, and visual form information is strong for conspecifics and animals, and less so for tools or places. Thus, our proposal is that domain-specific constraints are expressed as patterns of connectivity among regions of the ventral stream and other areas of the brain that process nonvisual information about the same classes of items. For instance, specialization for faces in the lateral fusiform gyrus (fusiform face area; Martin & Weisberg, 2003; Pasley et al., 2004; Vuilleumier et al., 2004) arises because that region of the brain has connectivity with the amygdala and the superior temporal sulcus (among other regions), which are important for the extraction of socially relevant information and biological motion. Specificity for tools and manipulable objects in the medial fusiform gyrus is driven, in part, by connectivity between that region and regions of parietal cortex that subserve object manipulation (Mahon et al., 2007; Noppeney et al., 2006; Rushworth et al., 2006; Valyear & Culham, 2009). Connectivity-based constraints may also be responsible for other effects of category specificity in the ventral visual stream, such as connectivity between somatomotor areas and regions of the ventral stream that differentially respond to body parts (extrastriate body area; Astafiev et al., 2004; Orlov et al., 2010; Peelen & Caramazza, 2010), connectivity between left-lateralized frontal language processing regions and ventral stream areas specialized for printed words (visual word form area; Dehaene et al., 2005; Martin, 2006), and connectivity between regions involved in spatial analysis and ventral stream regions showing differential responses to highly contextualized stimuli, such as houses, scenes, and large nonmanipulable objects (parahippocampal place area; Bar & Aminoff, 2003).

Role of Visual Experience

According to the distributed domain-specific hypothesis, the organization by category in the ventral stream not only is a reflection of the visual structure of the world but also reflects the structure of how ventral visual cortex is connected to other regions of the brain (Mahon & Caramazza, 2009; Mahon et al., 2007; Riesenhuber, 2007). However, visual experience and dimensions of visual similarity are also critical in shaping the organization of the ventral stream (Felleman & Van Essen, 1991; Op de Beeck et al., 2006); after all, the principal afferents to the ventral stream come from earlier stages in the visual hierarchy (Tanaka et al., 1991).

Although recent discussion has noted the possibility that nonvisual dimensions may be relevant in shaping the organization of the ventral stream (Cant et al., 2009; Grill-Spector & Malach, 2004; Martin, 2007), those accounts have given far greater prominence to the role of visual experience in their explanation of the causes of category-specific organization within the ventral stream. A number of hypotheses have been developed, and we merely touch on them here to illustrate a common assumption: that the organization of the ventral stream reflects the visual structure of the world, as interpreted by domain-general processing constraints. Thus, the general thrust of those accounts is that the visual structure of the world is correlated with semantic category distinctions in a way that is captured by how visual information is organized in the brain. One of the most explicit proposals is that there are weak eccentricity preferences in higher order visual areas that are inherited from earlier stages in the processing stream. Those eccentricity biases interact with our experience of foveating some classes of items (e.g., faces) and viewing others in the relative periphery (e.g., houses; Levy et al., 2001). Another class of proposals is based on the supposition that items from the same category tend to look more similar than items from different categories, and similarity in visual shape is mapped onto ventral temporal-occipital cortex (Haxby et al., 2001). It has also been proposed that a given category may require differential processing relative to other categories, for instance, in terms of expertise (Gauthier et al., 1999), visual crowding (Rogers et al., 2005), or the relevance of visual information for categorization (Mechelli et al., 2006). Still other accounts appeal to “feature” similarity and distributed feature maps (Tyler et al., 2003). Finally, it has been suggested that multiple, visually based dimensions of organization combine super-additively to generate the boundaries among category-preferring regions (Op de Beeck et al., 2008). Common to all of these accounts is the assumption that visual experience provides the necessary structure, and that a visual dimension of organization happens to be highly correlated with semantic category.

Although visual information is important in shaping how the ventral stream is organized, recent findings indicate that visual experience is not necessary for the same, or similar, patterns of category specificity to be present in the ventral stream. In an early positron emission tomography (PET) study, Büchel and colleagues (1998) showed that congenitally blind subjects have activation for words (presented in Braille) in the same region of the ventral stream that is activated in sighted individuals by visually presented words. Pietrini and colleagues (2004) used multivoxel pattern analyses to show that the pattern of activation over voxels in the ventral stream was more consistent across different exemplars within a category than across exemplars from different categories. More recently, we (Mahon et al., 2009) have shown that the same medial-to-lateral bias in category preferences on the ventral surface of the occipital-temporal cortex that is present in sighted individuals is present in congenitally blind subjects. Specifically, nonliving things, compared with animals, elicit stronger activation in medial regions of the ventral stream (see Figure 27.4).
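The within- versus between-category comparison that underlies multivoxel pattern analyses of this kind can be illustrated with a minimal sketch. It is not the analysis pipeline of Pietrini and colleagues (2004): the data are synthetic, and the function names `pearson` and `within_between_similarity`, the category labels, the voxel count, and the noise level are all assumptions introduced here for illustration. Each exemplar is modeled as a category "prototype" pattern plus independent noise, and the sketch compares the mean correlation between exemplars of the same category with the mean correlation between exemplars of different categories.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length activation patterns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def within_between_similarity(patterns, labels):
    """Return (mean within-category r, mean between-category r).

    patterns: list of voxel-activation vectors, one per exemplar.
    labels:   category label for each exemplar.
    """
    within, between = [], []
    # Visit every pair of distinct exemplars exactly once.
    for i, j in itertools.combinations(range(len(patterns)), 2):
        r = pearson(patterns[i], patterns[j])
        (within if labels[i] == labels[j] else between).append(r)
    return sum(within) / len(within), sum(between) / len(between)


# Synthetic data (hypothetical): two categories, four exemplars each.
random.seed(0)
n_voxels = 200
prototypes = {
    category: [random.gauss(0, 1) for _ in range(n_voxels)]
    for category in ("animal", "tool")
}
labels = ["animal"] * 4 + ["tool"] * 4
# Each exemplar = its category prototype + independent voxel noise.
patterns = [
    [value + random.gauss(0, 0.5) for value in prototypes[category]]
    for category in labels
]

within_r, between_r = within_between_similarity(patterns, labels)
print(f"within = {within_r:.2f}, between = {between_r:.2f}")
```

If category information is carried by the distributed pattern, the within-category mean should exceed the between-category mean, as it does by construction in these synthetic data.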
Although these studies on category specificity in blind individuals represent only a first-pass analysis of the role of visual experience in driving category specificity in the ventral stream, they indicate that visual experience is not necessary for category specificity to emerge in the ventral stream. This fact raises an important question: if visual experience is not needed for the same topographical biases in category specificity to be present in the ventral stream, then what drives such organization? One possibility, as we have suggested, is innate connectivity between regions of the ventral stream and other regions of the brain that process affective, motor, and conceptual information.

Connectivity as an Innate Domain-Specific Constraint

A critical component of the distributed domain-specific hypothesis is the notion of connectivity. The most obvious candidate to mediate such networks is white matter connectivity. However, it is important to underline that functional networks need not be restricted by the grain of white matter connectivity, and perhaps more important, task- and state-dependent changes may bias processing toward different components of a broader anatomical brain network. For instance, connectivity between lateral and orbital prefrontal regions and ventral temporal-occipital cortex (Kveraga et al., 2007; Miller et al., 2003) is critical for categorization of visual input. It remains an open question whether multiple functional networks are subserved by this circuit, each determined by the type of visual stimulus being categorized. For instance, when categorizing manipulable objects, connectivity between parietal-frontal somatomotor areas and prefrontal cortex may dominate, whereas when categorizing faces other regions may express stronger functional coupling to those same prefrontal regions. Such a suggestion would generate the expectation that although damage to prefrontal-to-ventral stream connections may result in difficulties categorizing all types of visual stimuli, disruption of the afferents to prefrontal cortex from a specific category-preferring area could lead to categorization problems selective to that domain.

Object-Associated Actions

The activation by tool stimuli of regions of the brain that mediate object-directed action has been argued to follow naturally from the sensory/functional theory. On that theory, the activation of dorsal structures by tool stimuli indexes the critical role of function knowledge in the recognition of nonliving things (e.g., Boronat et al., 2004; Kellenbach et al., 2003; Martin, 2000; Noppeney et al., 2006; Simmons & Barsalou, 2003). That argument is weakened, however, to the extent that it can be demonstrated that the integrity of action knowledge is not necessary in order to have other types of knowledge about tools, such as their function.

The neuropsychological phenomenon of apraxia offers a way of testing whether action knowledge is critical for supporting conceptual processing of tools. Apraxia refers to an impairment in using objects that cannot be explained by a deficit in visual object recognition or an impairment to low-level motor processes themselves. Figure 27.5A summarizes the performance profile of the patient reported by Ochipa and colleagues (1989), who was impaired for using objects but relatively preserved for naming the same objects (see also Figure 27.5B for similar dissociations in a series of single-case analyses; Negri et al., 2007; see also Rosci et al., 2003; for clear case studies, see Moreaud et al., 1998; Rapcsak et al., 2001; Rumiati et al., 2001; see Rothi et al., 1991, for an influential cognitive model). Apraxic deficits for using objects are often observed subsequent to lesions in the regions of the dorsal stream, reviewed above, that are automatically activated when participants name tools (in particular, the left inferior parietal lobule). The fact that patients are able to name objects that they cannot use indicates that the activation of those regions during naming tasks is not, in and of itself, necessary for successful completion of the task. At the same time, lesions to parietal cortex, in the context of lesions to the middle temporal gyrus or frontal motor areas, do modulate performance in object identification. In a recent analysis (Mahon et al., 2007), a group of unilateral stroke patients was separated into two groups according to the anatomical criterion of having lesions involving (see Figure 27.5C, left) or not involving parietal cortex (see Figure 27.5C, right). There was a relationship between performance in object identification and object use at the group level only in patients with lesions involving parietal cortex, suggesting that action knowledge associated with objects is not irrelevant for successful identification.


Other neuropsychological data indicate that the integrity of action knowledge is not necessary for patients to have accurate knowledge of object function. Figure 27.5D depicts the performance of patient WC (Buxbaum et al., 2000) on two picture-matching tasks. In a picture-matching task that required knowledge of object manipulation, performance was impaired; however, in a picture-matching task that required knowledge of object function, performance was spared. Functional imaging studies (Boronat et al., 2004; Canessa et al., 2008; Kellenbach et al., 2003) converge with those neuropsychological data in showing that manipulation, but not function, knowledge modulates neural responses in the inferior parietal lobule. There is also evidence, from both functional neuroimaging (e.g., Canessa et al., 2008) and neuropsychology (e.g., Sirigu et al., 1991), that temporal, and not parietal, cortex may be involved in the representation of function knowledge of objects.

The convergence between the neuropsychological evidence from apraxia and the functional imaging evidence indicates that although there is a dedicated system for knowledge of object manipulation, that system is not critically involved in representing knowledge of object function. This suggests that the automatic engagement of action processing by manipulable objects, as observed in neuroimaging, may have consequences for a theory of pragmatics or action, but not necessarily for a theory of semantics (Goodale & Milner, 1992; Jeannerod & Jacob, 2005). This in turn weakens the claim that automatic activation of dorsal stream structures by manipulable objects is evidence for the sensory/functional theory.

Relation Between Sensory, Motor, and Conceptual Knowledge

Early formulations of the sensory/functional theory assumed that conceptual content, although tied in important ways to the sensory and motor systems, was more abstract than the token-based information contained within the sensory and motor systems (Warrington & McCarthy, 1983, 1987; Warrington & Shallice, 1984; see also Crutch & Warrington, 2003). More recent formulations of the multiple semantics approach have argued, within the embodied cognition framework, that conceptual content can be reductively grounded in sensory and motor processes (e.g., Barsalou, 1999, 2008; H. Damasio et al., 2004; Gallese & Lakoff, 2005; Patterson et al., 2007; Pulvermüller, 2005; Prinz, 2002; Zwaan, 2004).

The first detailed articulation of the embodied cognition framework was by Allen Allport (1985). Allport (1985) proposed that conceptual knowledge is organized according to sensory and motor modalities and that the information represented within different modalities was format specific:

The essential idea is that the same neural elements that are involved in the coding the sensory attributes of a (possibly unknown) object presented to eye or hand or ear also make up the elements of the auto-associated activity-patterns that represent familiar object-concepts in “semantic memory.” This model is, of course, in radical opposition to the view, apparently held by many psychologists, that “semantic memory” is represented in some abstract, modality-independent, “conceptual” domain remote from the mechanisms of perception and motor organization. (Allport, 1985, p. 53, emphasis original)

Figure 27.5 Relation between knowledge of how to manipulate tools and other knowledge of tools. a. (Upper left). Ochipa and colleagues (1989) reported a patient with a severe impairment for manipulating objects but relatively preserved naming of the same objects. b. (Upper right). A multiple single-case study of unselected unilateral stroke patients asked patients to use and identify the same set of objects (Negri et al., 2007). Performance of the patients is plotted as t values (Crawford & Garthwaite, 2006) compared to control (n = 25) performance. c. Lesions to parietal cortex, in the context of lesions to lateral temporal and frontal regions, can be instrumental in modulating the relationship between performance in object identification and object use, at the group level (see Mahon et al., 2007, Figure 7, for details and lesion overlap analyses). Each circle in the plots represents the performance of a single patient in object identification and object use. The 95% confidence intervals around the regression lines are shown. Reproduced with permission from Mahon and colleagues (2007). d. (Lower graph). Patient WC (Buxbaum et al., 2000) was impaired for matching pictures based on how objects are manipulated but was spared for matching pictures based on the function of the objects.
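The caption above reports each patient's performance as a t value relative to a control sample. One widely used statistic of this kind is the modified t-test of Crawford and Howell (1998), which treats the control mean and standard deviation as sample estimates rather than population parameters: t = (x − M) / (s · √((n + 1) / n)), with n − 1 degrees of freedom, where x is the patient's score and M, s, and n are the control mean, standard deviation, and sample size. The sketch below assumes that statistic purely for illustration; the caption does not specify the exact computation used for the figure, and the function name `single_case_t` and all scores are hypothetical.

```python
import math
import statistics


def single_case_t(patient_score, control_scores):
    """Modified t for one patient against a small control sample.

    Returns (t, degrees_of_freedom). Assumes the Crawford-Howell form:
    t = (x - mean) / (sd * sqrt((n + 1) / n)), df = n - 1.
    """
    n = len(control_scores)
    control_mean = statistics.mean(control_scores)
    control_sd = statistics.stdev(control_scores)  # sample SD (n - 1 denominator)
    t = (patient_score - control_mean) / (control_sd * math.sqrt((n + 1) / n))
    return t, n - 1


# Hypothetical accuracy scores (proportion correct) on an object-use task.
control_use_scores = [0.96, 0.98, 0.94, 0.97, 0.99, 0.95, 0.96, 0.98]
patient_use_score = 0.62

t_value, df = single_case_t(patient_use_score, control_use_scores)
print(f"t({df}) = {t_value:.2f}")
```

A strongly negative t on object use together with a t near zero on object identification for the same patient would correspond to the kind of dissociation plotted in panel b.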

One type of evidence, discussed above, that has been argued to support an embodied representation of object concepts is the observation that regions of the brain that directly mediate object-directed action are automatically activated when participants observe manipulable objects. However, the available neuropsychological evidence (see Figure 27.5) reduces confidence in the claim that action knowledge plays a critical role in grounding the diverse types of knowledge that we have about tools. The strongest evidence for the relevance of motor and perceptual processes to conceptual processing is provided by demonstrations that the sensory and motor systems are automatically engaged by linguistic stimuli that imply action (e.g., Buccino et al., 2005; Boulenger et al., 2006; Glenberg & Kaschak, 2002; Oliveri et al., 2004). It has also been demonstrated that activation of the motor system automatically spreads to conceptual and perceptual levels of processing (e.g., Pulvermüller et al., 2005).

The embodied cognition hypothesis makes strong predictions about the integrity of conceptual processes after damage to sensory and motor processes. It predicts, necessarily, and as Allport wrote, that “…the loss of particular attribute information in semantic memory should be accompanied by a corresponding perceptual (agnostic) deficit” (p. 55, emphasis original). Although there are long traditions within neuropsychology of studying patients with deficits for sensory and motor knowledge, only recently have those deficits been of such clear theoretical relevance to hypotheses about the nature of semantic memory. Systematic and theoretically informed studies of such patients will play a pivotal role in evaluating the relation between sensory, motor, and conceptual knowledge. Central to that enterprise will be to specify how information is dynamically exchanged between systems, in the context of specific task requirements. This will be important for determining the degree to which sensory and motor activation is in fact a critical component of conceptual processing (see Machery, 2007; Mahon & Caramazza, 2008, for discussion). It is theoretically possible (and in our view, likely) that although concepts are not exhausted by sensory and motor information, the organization of “abstract” concepts is nonetheless shaped in important ways by the structure of the sensory and motor systems. It is also likely, in our view, that processing of such “abstract” conceptual content is heavily interlaced with activation of the sensory and motor systems. We have referred to this view as “grounding by interaction” (Mahon & Caramazza, 2008).

Grounding by Interaction: A Hypothesis about the Representation of Conceptual Content

Consider the hypothetical apraxic patient with whom one might have a conversation about hammers. The patient might be able to recount the history of the hammer as an invention, the materials of which the first hammer was made, or what hammers typically weigh. The patient may even look at a hammer and name it without apparent difficulty. But when presented with a hammer, the patient is profoundly impaired at demonstrating how the object is physically manipulated to accomplish its function. This impairment is not due to a peripheral motor deficit because the patient may be able to imitate meaningless gestures without difficulty. What is the functional locus of damage in the patient? Has the patient “lost” part of her concept of “hammer”?


On one level, the patient clearly does retain the concept of hammer, and in this sense, the concept of hammer is “symbolic,” “abstract,” and “qualitatively different” from the motor “knowledge” that is compromised in the patient. On another level, when the patient instantiates the abstract and symbolic concept of hammer, that instantiation occurs isolated from sensory-motor information that, in the normal system, would go along with the instantiation of the concept.

Thus, on the one hand, there is a level of representation of meaning that is sufficiently general and flexible that it may apply to inputs from diverse sensory modalities and be expressed in action through diverse output modalities. The abstract and symbolic representation of hammer could be accessed from touch, vision, or audition; similarly, that representation could be “expressed” by pantomiming the use of a hammer, producing the sounds that make up the word hammer, writing the word hammer, and so on. In short, there is a level of conceptual representation that is abstract and symbolic and that is not exhausted by information represented in the sensory and motor systems.

On the other hand, conceptual information that is represented at an abstract and symbolic level does not, in and of itself, exhaust what we know about the world. What we know about the world depends also on interactions between abstract conceptual content and the sensory and motor systems. There are two ways in which such interactions may come about. First, abstract and symbolic concepts can be activated by events in the world that are processed by the sensory systems, and realize changes in the world through the motor system (see Jeannerod & Jacob, 2005, for relevant discussion). Second, the instantiation of a given abstract and symbolic concept always occurs in a particular situation; as such, the instantiation of that concept in that situation may involve highly specific sensory and motor processes.

Within the grounding by interaction framework, sensory and motor information colors conceptual processing, enriches it, and provides it with a relational context. The activation of the sensory and motor systems during conceptual processing serves to ground abstract and symbolic representations in the rich sensory and motor content that mediates our physical interaction with the world. Of course, the specific sensory and motor information that is activated may change depending on the situation in which the abstract and symbolic conceptual representation is instantiated.

On the grounding by interaction view, the specific sensory and motor information that goes along with the instantiation of a concept is not constitutive of that concept. Of course, that does not mean that that specific sensory and motor information is not important for the instantiation of a concept, in a particular way, at a given point in time. Indeed, such sensory and motor information may constitute, in part, that instantiation. A useful analogy in this regard is to linguistic processing. There is no upper limit (in principle) on the number of completely novel sentences that a speaker may utter. This fact formed one of the starting points for formal arguments against the behaviorist paradigm (Chomsky, 1959). Consider the (indefinite) set of sentences that a person may utter in his life: Those sentences can have syntactic structures that are in no way specifically tied to the particular words through which the expression of those syntactic structures was realized. The syntax of a sentence is not exhausted by an account of the words of which it is composed; this is the case even though it may be the first time that that syntactic structure has ever been produced, and even though the expression of that particular syntactic structure clearly depended (de facto) on the presence of those particular words. To close the analogy: Concepts “wear” sensory and motor information in the way that the syntax of a sentence “wears” particular words.

Toward a Synthesis

We have organized this review around theoretical explanations of category specificity in the human brain. One theme that emerges is the historical progression from theories based on a single principle of organization to theories that integrate multiple dimensions of organization. This progression is due to the broad recognition in the field that a single dimension will not be sufficient to explain all aspects of the organization of object knowledge in the brain. However, not every dimension or principle of organization is of equal importance, because not all dimensions have the same explanatory scope. A relative hierarchy of principles is therefore necessary to determine which of the many known facts are theoretically important and which are of only marginal significance (Caramazza & Mahon, 2003).

Two broad findings emerge from cognitive neuropsychological research. First, patients have been reported with disproportionate impairments for a modality or type of knowledge (e.g., visual-perceptual knowledge, see Figure 27.2B; manipulation knowledge, see Figure 27.5). Second, category-specific semantic deficits are associated with impairments for all types of knowledge about the impaired category (see Figure 27.2A). Analogues to those two facts are also found in functional neuroimaging. First, the attributes of some categories of objects (e.g., tools) are differentially represented in modality-specific systems (i.e., motor systems). Second, within a given modality-specific system (e.g., ventral visual pathway) there is functional organization by semantic category (e.g., living animate vs. nonliving; see Figure 27.4 for an overview). Thus, across both neuropsychological studies and functional imaging studies, the broad empirical generalization emerges that there are two orthogonal constraints on the organization of object knowledge: object domain and sensory-motor modality.
This empirical generalization is neutral with respect to how one explains the causes of category-specific effects in both functional neuroimaging and neuropsychology. Many theoretical proposals of the causes of category specificity articulate dimensions along which semantic categories differ (e.g., Cree & McRae, 2003; Devlin et al., 1998; Gauthier et al., 2000; Haxby et al., 2001; Humphreys & Forde, 2001; Laws & Gale, 2002; Levy et al., 2001; Mechelli et al., 2006; Op de Beeck et al., 2008; Rogers et al., 2004; Sartori & Lombardi, 2004; Simmons & Barsalou, 2003; Tranel et al., 1997; Tyler & Moss, 2001; Warrington & Shallice, 1984; Zannino et al., 2006). Understanding the role that such dimensions play in the genesis of category specificity in a particular part of the brain, or a particular component of a cognitive model, will be central to characterizing the functioning of that component of the system. However, progress in understanding the causes of category specificity in one region of the brain, or one functional component of a cognitive model, will require an understanding of how category specificity is realized throughout the whole brain and throughout the whole cognitive model.

All current theories of the organization of conceptual knowledge assume that a concept is composed of distinct types of information. This shared assumption permits an explanation of how thinking about a single concept (e.g., hammer) can engage different regions of the brain that process distinct types of information (e.g., sensory vs. motor). It also allows for an account of how patients may present with impairments for a type or modality of knowledge (e.g., know what a hammer looks like, but not know how to use it). However, that assumption raises the question of how the different types of information that constitute a given concept are functionally unified. A central theoretical issue to be addressed by the field is to understand the nature of the mechanisms that unify different types of knowledge about the same entity in the world, and that give rise to a functionally unitary concept of that entity.

Our proposal, the distributed domain-specific hypothesis (Caramazza & Mahon, 2003; Mahon & Caramazza, 2009, 2011), is that the organization of conceptual knowledge in the brain reflects the final product of a complex tradeoff of pressures, some of which are expressed locally within a given brain region, and some of which are expressed as connectivity between that region and the rest of the brain. Our suggestion is that connectivity within a domain-specific neural circuit is the first, or broadest, principle according to which conceptual knowledge is organized.
For instance, visual motion properties of living animate things are represented in a different region or system than visual form properties of living animate things. In addition, affective properties of living animate things may be represented by other functionally and neuroanatomically distinct systems. However, all those types of information constitute the domain “living animate.” For that reason, it is critical to specify the nature of the functional connectivity that relates processing across distinct subsystems specialized for different types of information. The basic expectation of the distributed domain-specific hypothesis is that the functional connectivity that relates processing across distinct types of information (e.g., emotional value versus visual form) will be concentrated around those domains that have had evolutionarily important histories. The strong prediction that follows from that view is that it is those neural circuits that are disrupted or disorganized after brain damage in patients with category-specific semantic deficits.

Independently of whether the distributed domain-specific hypothesis is empirically confirmed, it serves to highlight two key aspects of human conceptual processing. First, humans do not have systems that support rich conceptual knowledge of objects just to have them. We have those systems because they serve action, and ultimately have been in the service of survival (Goodale & Milner, 1992). An understanding of the architecture of the conceptual system must therefore be situated in the context of the real-world computational problems that the conceptual system is structured to support. Second, human behavior arises as a result of the integration of multiple cognitive processes that individually operate over distinct types of knowledge. On the distributed domain-specific hypothesis, the distinct (and potentially modular) processes within the sensory, motor, and affective systems are components of broader structures within the mind/brain. This framework thus emphasizes the need to understand how different types of cognitive processes, operating over different types of information, work in concert to orchestrate behavior.

In the more than 25 years since Warrington and colleagues’ first detailed reports of patients with category-specific semantic deficits, new fields of study have emerged around the study of the organization and representation of conceptual knowledge. Despite that progress, the theoretical questions that currently occupy researchers are the same as those that were initially framed and debated two decades ago: What are the principles of neural organization that give rise to effects of category specificity? Are different types of information involved in processing different semantic categories, and if so, what distinguishes those different types of information? Future research will undoubtedly build on the available theories and redeploy their individual assumptions within new theoretical frameworks.

Author Note

Bradford Z. Mahon was supported in part by NIH training grant 5 T32 19942-13 and R21NS076176-01A1; Alfonso Caramazza was supported by grant DC006842 from the National Institute on Deafness and Other Communication Disorders. Sections of this article are drawn from three previous publications by the same authors: Mahon and Caramazza, 2008, 2009, and 2011. The authors are grateful to Erminio Capitani, Marcella Laiacona, Alex Martin, and Daniel Schacter for their comments on an earlier draft.

References

Allport, D. A. (1985). Distributed memory, modular subsystems and dysphasia. In S. K. Newman & R. Epstein (Eds.), Current perspectives in dysphasia. New York: Churchill Livingstone.
Astafiev, S. V., et al. (2004). Extrastriate body area in human occipital cortex responds to the performance of motor actions. Nature Neuroscience, 7, 542–548.
Baillargeon, R. (1998). Infants’ understanding of the physical world. In M. Sabourin, F. Craik, & M. Robert (Eds.), Advances in psychological science: 2. Biological and cognitive aspects (pp. 503–529). London: Psychology Press.
Bar, M., & Aminoff, E. (2003). Cortical analysis of visual context. Neuron, 38, 347–358.
Barbarotto, R., Capitani, E., & Laiacona, M. (1996). Naming deficit in herpes simplex encephalitis. Acta Neurologica Scandinavica, 93, 272–280.

Barbarotto, R., Capitani, E., Spinnler, H., & Trivelli, C. (1995). Slowly progressive semantic impairment with category specificity. Neurocase, 1, 107–119.
Barbarotto, R., Laiacona, M., Macchi, V., & Capitani, E. (2002). Picture reality decision, semantic categories, and gender: A new set of pictures, with norms and an experimental study. Neuropsychologia, 40, 1637–1653.
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22, 637–660.

Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59, 617–645.
Beauchamp, M. S., Lee, K. E., Haxby, J. V., & Martin, A. (2002). Parallel visual motion processing streams for manipulable objects and human movements. Neuron, 24, 149–159.
Beauchamp, M. S., Lee, K. E., Haxby, J. V., & Martin, A. (2003). FMRI responses to video and point-light displays of moving humans and manipulable objects. Journal of Cognitive Neuroscience, 15, 991–1001.
Beauvois, M.-F. (1982). Optic aphasia: A process of interaction between vision and language. Philosophical Transactions of the Royal Society of London B, 298, 35–47.
Beauvois, M.-F., Saillant, B., Meininger, V., & Lhermitte, F. (1978). Bilateral tactile aphasia: A tacto-verbal dysfunction. Brain, 101, 381–401.
Blundo, C., Ricci, M., & Miller, L. (2006). Category-specific knowledge deficit for animals in a patient with herpes simplex encephalitis. Cognitive Neuropsychology, 23, 1248–1268.
Bookheimer, S. (2002). Functional MRI of language: New approaches to understanding the cortical organization of semantic processing. Annual Review of Neuroscience, 25, 151–188.
Borgo, F., & Shallice, T. (2001). When living things and other “sensory-quality” categories behave in the same fashion: A novel category-specific effect. Neurocase, 7, 201–220.
Borgo, F., & Shallice, T. (2003). Category specificity and feature knowledge: Evidence from new sensory-quality categories. Cognitive Neuropsychology, 20, 327–353.
Boronat, C. B., Buxbaum, L. J., Coslett, H. B., Tang, K., Saffran, E. M., et al. (2004). Distinctions between manipulation and function knowledge of objects: Evidence from functional magnetic resonance imaging. Cognitive Brain Research, 23, 361–373.
Boulenger, V., Roy, A. C., Paulignan, Y., Deprez, V., Jeannerod, M., & Nazir, T. A. (2006). Cross-talk between language processes and overt motor behavior in the first 200 msec of processing. Journal of Cognitive Neuroscience, 18, 1607–1615.
Brambati, S. M., Myers, D., Wilson, A., Rankin, K. P., Allison, S. C., Rosen, H. J., Miller, B. L., & Gorno-Tempini, M. L. (2006). The anatomy of category-specific object naming in neurodegenerative diseases. Journal of Cognitive Neuroscience, 18, 1644–1653.

Bright, P., Moss, H. E., Stamatakis, E. A., & Tyler, L. K. (2005). The anatomy of object processing: The role of anteromedial temporal cortex. Quarterly Journal of Experimental Psychology B, 58, 361–377.
Buccino, G., Riggio, L., Melli, G., Binkofski, F., Gallese, V., & Rizzolatti, G. (2005). Listening to action related sentences modulates the activity of the motor system: A combined TMS and behavioral study. Cognitive Brain Research, 24, 355–363.
Büchel, C., Price, C. J., & Friston, K. (1998). A multimodal language region in the ventral visual pathway. Nature, 394, 274–277.
Buxbaum, L. J., Veramonti, T., & Schwartz, M. F. (2000). Function and manipulation tool knowledge in apraxia: Knowing “what for” but not “how.” Neurocase, 6, 83–97.
Canessa, N., Borgo, F., Cappa, S. F., Perani, D., Falini, A., Buccino, G., Tettamanti, M., & Shallice, T. (2008). The different neural correlates of action and functional knowledge in semantic memory: An fMRI study. Cerebral Cortex, 18, 740–751.
Cant, J. S., et al. (2009). fMR-adaptation reveals separate processing regions for the perception of form and texture in the human ventral stream. Experimental Brain Research, 192, 391–405.
Cantlon, J. F., Platt, M., & Brannon, E. M. (2009). The number domain. Trends in Cognitive Sciences, 13 (2), 83–91.
Capitani, E., Laiacona, M., Mahon, B., & Caramazza, A. (2003). What are the facts of category-specific deficits? A critical review of the clinical evidence. Cognitive Neuropsychology, 20, 213–262.
Caramazza, A. (1986). On drawing inferences about the structure of normal cognitive systems from the analysis of patterns of impaired performance: The case for single-patient studies. Brain and Cognition, 5, 41–66.
Caramazza, A. (1992). Is cognitive neuropsychology possible? Journal of Cognitive Neuroscience, 4, 80–95.
Caramazza, A. (1998). The interpretation of semantic category-specific deficits: What do they reveal about the organization of conceptual knowledge in the brain? Neurocase, 4, 265–272.
Caramazza, A., Hillis, A. E., Rapp, B. C., & Romani, C. (1990). The multiple semantics hypothesis: Multiple confusions? Cognitive Neuropsychology, 7, 161–189.
Caramazza, A., & Mahon, B. Z. (2003). The organization of conceptual knowledge: The evidence from category-specific semantic deficits. Trends in Cognitive Sciences, 7, 354–361.

Caramazza, A., & Mahon, B. Z. (2006). The organisation of conceptual knowledge in the brain: The future’s past and some future directions. Cognitive Neuropsychology, 23, 13–38.
Caramazza, A., & Shelton, J. R. (1998). Domain specific knowledge systems in the brain: The animate-inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.
Carey, S., & Spelke, E. S. (1994). Domain specific knowledge and conceptual change. In L. Hirschfeld & S. Gelman (Eds.), Mapping the mind: Domain specificity in cognition and culture (pp. 169–200). Cambridge, UK: Cambridge University Press.
Carroll, E., & Garrard, P. (2005). Knowledge of living, nonliving and “sensory quality” categories in semantic dementia. Neurocase, 11, 338–350.
Chao, L. L., Haxby, J. V., & Martin, A. (1999). Attribute-based neural substrates in posterior temporal cortex for perceiving and knowing about objects. Nature Neuroscience, 2, 913–919.
Chao, L. L., & Martin, A. (2000). Representation of manipulable man-made objects in the dorsal stream. NeuroImage, 12, 478–484.
Chao, L. L., Weisberg, J., & Martin, A. (2002). Experience-dependent modulation of category related cortical activity. Cerebral Cortex, 12, 545–551.
Coslett, H. B., & Saffran, E. M. (1992). Optic aphasia and the right hemisphere: A replication and extension. Brain and Language, 43, 148–161.
Crawford, J. R., & Garthwaite, P. H. (2006). Methods of testing for a deficit in single case studies: Evaluation of statistical power by Monte Carlo simulation. Cognitive Neuropsychology, 23, 877–904.
Cree, G. S., & McRae, K. (2003). Analyzing the factors underlying the structure and computation of the meaning of chipmunk, cherry, chisel, cheese, and cello and many other such concrete nouns. Journal of Experimental Psychology: General, 132, 163–201.
Crutch, S. J., & Warrington, E. K. (2003). The selective impairment of fruit and vegetable knowledge: A multiple processing channels account of fine-grain category specificity. Cognitive Neuropsychology, 20, 355–372.

Culham, J. C., Danckert, S. L., DeSouza, J. F. X., Gati, J. S., Menon, R. S., & Goodale, M. A. (2003). Visually guided grasping produces fMRI activation in dorsal but not ventral stream brain areas. Experimental Brain Research, 153, 180–189.
Damasio, H., Grabowski, T. J., Tranel, D., & Hichwa, R. D. (1996). A neural basis for lexical retrieval. Nature, 380, 499–505.
Damasio, H., Tranel, D., Grabowski, T., Adolphs, R., & Damasio, A. (2004). Neural systems behind word and concept retrieval. Cognition, 92, 179–229.

Dehaene, S., Cohen, L., Sigman, M., & Vinckier, F. (2005). The neural code for written words: A proposal. Trends in Cognitive Sciences, 9, 335–341.
Devlin, J., Gonnerman, L., Andersen, E., & Seidenberg, M. (1998). Category-specific semantic deficits in focal and widespread brain damage: A computational account. Journal of Cognitive Neuroscience, 10, 77–94.
Devlin, J. T., Moore, C. J., Mummery, C. J., Gorno-Tempini, M. L., Phillips, J. A., Noppeney, U., Frackowiak, R. S. J., Friston, K. J., & Price, C. J. (2002). Anatomic constraints on cognitive theories of category-specificity. NeuroImage, 15, 675–685.
Dixon, M. J., Piskopos, M., & Schweizer, T. A. (2000). Musical instrument naming impairments: The crucial exception to the living/nonliving dichotomy in category-specific agnosia. Brain and Cognition, 43, 158–164.
Duchaine, B. C., & Yovel, G. (2008). Face recognition. The Senses: A Comprehensive Reference, 2, 329–357.
Duchaine, B. C., Yovel, G., Butterworth, E. J., & Nakayama, K. (2006). Prosopagnosia as an impairment to face-specific mechanisms: Elimination of the alternative hypotheses in a developmental case. Cognitive Neuropsychology, 23, 714–747.
Eggert, G. H. (1977). Wernicke’s works on aphasia: A sourcebook and review (Vol. 1). The Hague: Mouton.
Ellis, A. W., Young, A. W., & Critchley, A. M. R. (1989). Loss of memory for people following temporal lobe damage. Brain, 112, 1469–1483.
Epstein, R., & Kanwisher, N. (1998). A cortical representation of the local visual environment. Nature, 392, 598–601.
Fang, F., & He, S. (2005). Cortical responses to invisible objects in the human dorsal and ventral pathways. Nature Neuroscience, 8, 1380–1385.
Farah, M., & McClelland, J. (1991). A computational model of semantic memory impairment: modality specificity and emergent category specificity. Journal of Experimental Psychology: General, 120, 339–357.
Farah, M. J., & Rabinowitz, C. (2003). Genetic and environmental influences on the organization of semantic memory in the brain: Is “living things” an innate category? Cognitive Neuropsychology, 20, 401–408.
Feigenson, L., Dehaene, S., & Spelke, E. S. (2004). Core systems of number. Trends in Cognitive Sciences, 8, 307–314.
Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in primate visual cortex. Cerebral Cortex, 1, 1–47.
Fodor, J. (1983). Modularity of mind. Cambridge, MA: MIT Press.

Funnell, E., & Sheridan, J. (1992). Categories of knowledge: Unfamiliar aspects of living and nonliving things. Cognitive Neuropsychology, 9, 135–153.
Gainotti, G. (2000). What the locus of brain lesion tells us about the nature of the cognitive defect underlying category-specific disorders: A review. Cortex, 36, 539–559.
Gallistel, C. R. (1990). The organization of learning. Cambridge, MA: Bradford/MIT Press.
Gallese, V., & Lakoff, G. (2005). The brain’s concepts: The role of the sensory-motor system in reason and language. Cognitive Neuropsychology, 22, 455–479.
Garrard, P., Patterson, K., Watson, P. C., & Hodges, J. R. (1998). Category-specific semantic loss in dementia of Alzheimer’s type: Functional-anatomical correlations from cross-sectional analyses. Brain, 121, 633–646.
Gauthier, I., Skudlarski, P., Gore, J. C., & Anderson, A. W. (2000). Expertise for cars and birds recruits brain areas involved in face recognition. Nature Neuroscience, 3, 191–197.
Gauthier, I., et al. (1999). Activation of the middle fusiform “face area” increases with expertise in recognizing novel objects. Nature Neuroscience, 2, 568–573.
Gelman, R. (1990). First principles organize attention to and learning about relevant data: Number and the animate-inanimate distinction as examples. Cognitive Science, 14, 79–106.
Gerlach, C. (2007). A review of functional imaging studies on category specificity. Journal of Cognitive Neuroscience, 19, 296–314.
Glenberg, A. M., & Kaschak, M. P. (2002). Grounding language in action. Psychonomic Bulletin and Review, 9, 558–565.
Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15, 20–25.
Grill-Spector, K., & Malach, R. (2004). The human visual cortex. Annual Review of Neuroscience, 27, 649–677.
Hart, J., Anand, R., Zoccoli, S., Maguire, M., Gamino, J., Tillman, G., King, R., & Kraut, M. A. (2007). Neural substrates of semantic memory. Journal of the International Neuropsychological Society, 13, 865–880.
Hart, J., Jr., Berndt, R. S., & Caramazza, A. (1985). Category-specific naming deficit following cerebral infarction. Nature, 316, 439–440.
Haxby, J. V., Gobbini, M. I., Furey, M. L., Ishai, A., Schouten, J. L., & Pietrini, P. (2001). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293, 2425–2430.

Hécaen, H., & De Ajuriaguerra, J. (1956). Agnosie visuelle pour les objets inanimés par lésion unilatérale gauche. Revue Neurologique, 94, 222–233.
Hermer, L., & Spelke, E. S. (1994). A geometric process for spatial reorientation in young children. Nature, 370, 57–59.
Hillis, A. E., & Caramazza, A. (1991). Category-specific naming and comprehension impairment: A double dissociation. Brain, 114, 2081–2094.
Hillis, A. E., & Caramazza, A. (1995). Cognitive and neural mechanisms underlying visual and semantic processing: Implications from “optic aphasia.” Journal of Cognitive Neuroscience, 7, 457–478.
Humphreys, G. W., & Forde, E. M. E. (2001). Hierarchies, similarity, and interactivity in object recognition: “Category-specific” neuropsychological deficits. Behavioral and Brain Sciences, 24, 453–475.
Humphreys, G. W., & Forde, E. M. E. (2005). Naming a giraffe but not an animal: Base-level but not superordinate naming in a patient with impaired semantics. Cognitive Neuropsychology, 22, 539–558.
Humphreys, G. W., Riddoch, M. J., & Quinlan, P. T. (1988). Cascade processes in picture identification. Cognitive Neuropsychology, 5, 67–103.

Ishai, A., Ungerleider, L. G., Martin, A., Schouten, J. L., & Haxby, J. V. (1999). Distributed representation of objects in the human ventral visual pathway. Proceedings of the National Academy of Sciences U S A, 96, 9379–9384.
Jeannerod, M., & Jacob, P. (2005). Visual cognition: A new look at the two-visual systems model. Neuropsychologia, 43, 301–312.
Johnson-Frey, S. H. (2004). The neural bases of complex tool use in humans. Trends in Cognitive Sciences, 8, 71–78.
Kable, J. W., Lease-Spellmeyer, J., & Chatterjee, A. (2002). Neural substrates of action event knowledge. Journal of Cognitive Neuroscience, 14, 795–805.
Keil, F. C. (1981). Constraints on knowledge and cognitive development. Psychological Review, 88, 197–227.
Kellenbach, M. L., Brett, M., & Patterson, K. (2003). Actions speak louder than functions: The importance of manipulability and action in tool representation. Journal of Cognitive Neuroscience, 15, 20–46.
Kemmerer, D., Gonzalez Castillo, J., Talavage, T., Patterson, S., & Wiley, C. (2008). Neuroanatomical distribution of five semantic components of verbs: Evidence from fMRI. Brain and Language, 107, 16–43.

Kiani, R., Esteky, H., Mirpour, K., & Tanaka, K. (2007). Object category structure in response patterns of neuronal population in monkey inferior temporal cortex. Journal of Neurophysiology, 97, 4296–4309.
Kveraga, K., et al. (2007). Magnocellular projections as the trigger of top-down facilitation in recognition. Journal of Neuroscience, 27, 13232–13240.
Laiacona, M., Barbarotto, R., & Capitani, E. (1993). Perceptual and associative knowledge in category specific impairment of semantic memory: A study of two cases. Cortex, 29, 727–740.
Laiacona, M., Barbarotto, R., & Capitani, E. (1998). Semantic category dissociation in naming: Is there a gender effect in Alzheimer disease? Neuropsychologia, 36, 407–419.
Laiacona, M., Barbarotto, R., & Capitani, E. (2005). Animals recover but plant life knowledge is still impaired 10 years after herpetic encephalitis: the long-term follow-up of a patient. Cognitive Neuropsychology, 22, 78–94.
Laiacona, M., Barbarotto, R., & Capitani, E. (2006). Human evolution and the brain representation of semantic knowledge: Is there a role for sex differences? Evolution and Human Behaviour, 27, 158–168.
Laiacona, M., & Capitani, E. (2001). A case of prevailing deficit on nonliving categories or a case of prevailing sparing of living categories? Cognitive Neuropsychology, 18, 39–70.
Laiacona, M., Capitani, E., & Caramazza, A. (2003). Category-specific semantic deficits do not reflect the sensory-functional organisation of the brain: A test of the “sensory-quality” hypothesis. Neurocase, 9, 3221–3231.
Lambon Ralph, M. A., Howard, D., Nightingale, G., & Ellis, A. W. (1998). Are living and non-living category-specific deficits causally linked to impaired perceptual or associative knowledge? Evidence from a category-specific double dissociation. Neurocase, 4, 311–338.
Lambon Ralph, M. A., Lowe, C., & Rogers, T. T. (2007). Neural basis of category-specific semantic deficits for living things: Evidence from semantic dementia, HSVE and a neural network model. Brain, 130, 1127–1137.
Laws, K. R., & Gale, T. M. (2002). Category-specific naming and the “visual” characteristics of line drawn stimuli. Cortex, 38, 7–21.
Laws, K. R., & Neve, C. (1999). A “normal” category-specific advantage for naming living things. Neuropsychologia, 37, 1263–1269.
Levy, I., Hasson, U., Avidan, G., Hendler, T., & Malach, R. (2001). Center-periphery organization of human object areas. Nature Neuroscience, 4, 533–539.

Lewis, J. W. (2006). Cortical networks related to human use of tools. Neuroscientist, 12, 211–231.
Lhermitte, F., & Beauvois, M.-F. (1973). A visual speech disconnection syndrome: Report of a case with optic aphasia, agnosic alexia and color agnosia. Brain, 96, 695–714.
Luzzatti, C., & Davidoff, J. (1994). Impaired retrieval of object-color knowledge with preserved color naming. Neuropsychologia, 32, 1–18.
Lyons, F., Kay, J., Hanley, J. R., & Haslam, C. (2006). Selective preservation of memory for people in the context of semantic memory disorder: Patterns of association and dissociation. Neuropsychologia, 44, 2887–2898.
Mahon, B. Z., & Caramazza, A. (2003). Constraining questions about the organisation and representation of conceptual knowledge. Cognitive Neuropsychology, 20, 433–450.
Mahon, B. Z., & Caramazza, A. (2005). The orchestration of the sensory-motor systems: Clues from neuropsychology. Cognitive Neuropsychology, 22, 480–494.
Mahon, B. Z., & Caramazza, A. (2008). A critical look at the embodied cognition hypothesis and a new proposal for grounding conceptual content. Journal of Physiology-Paris, 102, 50–70.
Mahon, B. Z., & Caramazza, A. (2011). The distributed domain-specific hypothesis. Trends in Cognitive Sciences. In production.
Mahon, B. Z., Milleville, S., Negri, G. A. L., Rumiati, R. I., Martin, A., & Caramazza, A. (2007). Action-related properties of objects shape object representations in the ventral stream. Neuron, 55, 507–520.
Mahon, B. Z., et al. (2009). Category-specific organization in the human brain does not require visual experience. Neuron, 63, 397–405.
Mahon, B. Z., & Caramazza, A. (2009). Concepts and categories: A cognitive neuropsychological perspective. Annual Review of Psychology, 60, 1–15.
Machery, E. (2007). Concept empiricism: A methodological critique. Cognition, 104, 19–46.
Martin, A. (2006). Shades of Déjerine: Forging a causal link between the visual word form area and reading. Neuron, 50, 173–175.
Martin, A. (2007). The representation of object concepts in the brain. Annual Review of Psychology, 58, 25–45.
Martin, A., Haxby, J. V., Lalonde, F. M., Wiggs, C. L., & Ungerleider, L. G. (1995). Discrete cortical regions associated with knowledge of color and knowledge of action. Science, 270, 102–105.

Martin, A., & Weisberg, J. (2003). Neural foundations for understanding social and mechanical concepts. Cognitive Neuropsychology, 20, 575–587.
McClelland, J. L., & Rogers, T. T. (2003). The parallel distributed processing approach to semantic cognition. Nature Reviews Neuroscience, 4, 310–322.
Mechelli, A., Sartori, G., Orlandi, P., & Price, C. J. (2006). Semantic relevance explains category effects in medial fusiform gyri. NeuroImage, 3, 992–1002.
Miceli, G., Capasso, R., Daniele, A., Esposito, T., Magarelli, M., & Tomaiuolo, F. (2000). Selective deficit for people’s names following left temporal damage: An impairment of domain-specific conceptual knowledge. Cognitive Neuropsychology, 17, 489–516.

Miceli, G., Fouch, E., Capasso, R., Shelton, J. R., Tomaiuolo, F., & Caramazza, A. (2001). The dissociation of color from form and function knowledge. Nature Neuroscience, 4, 662–667.
Miller, E. K., et al. (2003). Neural correlates of categories and concepts. Current Opinion in Neurobiology, 13, 198–203.
Milner, A. D., Perrett, D. I., Johnson, R. S., Benson, O. J., Jordan, T. R., et al. (1991). Perception and action in “visual form agnosia.” Brain, 114, 405–428.
Mitchell, J. P., Heatherton, T. F., & Macrae, C. N. (2002). Distinct neural systems subserve person and object knowledge. Proceedings of the National Academy of Sciences U S A, 99, 15238–15243.
Moreaud, O., Charnallet, A., & Pellat, J. (1998). Identification without manipulation: A study of the relations between object use and semantic memory. Neuropsychologia, 36, 1295–1301.
Morris, J. S., Öhman, A., & Dolan, R. J. (1999). A subcortical pathway to the right amygdala mediating “unseen” fear. Proceedings of the National Academy of Sciences U S A, 96, 1680–1685.
Moscovitch, M., Winocur, G., & Behrmann, M. (1997). What is special about face recognition? Nineteen experiments on a person with visual object agnosia and dyslexia but with normal face recognition. Journal of Cognitive Neuroscience, 9, 555–604.
Moss, H. E., Tyler, L. K., Durrant-Peatfield, M., & Bunn, E. M. (1998). “Two eyes of a see-through”: Impaired and intact semantic knowledge in a case of selective deficit for living things. Neurocase, 4, 291–310.
Negri, G. A. L., Rumiati, R. I., Zadini, A., Ukmar, M., Mahon, B. Z., & Caramazza, A. (2007). What is the role of motor simulation in action and object recognition? Evidence from apraxia. Cognitive Neuropsychology, 24, 795–816.

Noppeney, U., Price, C. J., Penny, W. D., & Friston, K. J. (2006). Two distinct neural mechanisms for category-selective responses. Cerebral Cortex, 16, 437–445.
Nunn, J. A., & Pearson, R. (2001). Developmental prosopagnosia: Should it be taken at face value? Neurocase, 7, 15–27.
Ochipa, C., Rothi, L. J. G., & Heilman, K. M. (1989). Ideational apraxia: A deficit in tool selection and use. Annals of Neurology, 25, 190–193.
Oliveri, M., Finocchiaro, C., Shapiro, K., Gangitano, M., Caramazza, A., & Pascual-Leone, A. (2004). All talk and no action: A transcranial magnetic stimulation study of motor cortex activation during action word production. Journal of Cognitive Neuroscience, 16, 374–381.
Op de Beeck, H. P., Haushofer, J., & Kanwisher, N. G. (2008). Interpreting fMRI data: Maps, modules, and dimensions. Nature Reviews Neuroscience, 9, 123–135.
Op de Beeck, H. P., et al. (2006). Discrimination training alters object representations in human extrastriate cortex. Journal of Neuroscience, 26, 13025–13036.
Orlov, T., et al. (2010). Topographic representation of the human body in the occipitotemporal cortex. Neuron, 68, 586–600.
Pasley, B. N., Mayes, L. C., & Schultz, R. T. (2004). Subcortical discrimination of unperceived objects during binocular rivalry. Neuron, 42, 163–172.
Patterson, K., Nestor, P. J., & Rogers, T. (2007). Where do you know what you know? The representation of semantic knowledge in the brain. Nature Reviews Neuroscience, 8, 976–988.
Peelen, M. V., & Caramazza, A. (2010). What body parts reveal about the organization of the brain. Neuron, 68, 331–333.
Pietrini, P., et al. (2004). Beyond sensory images: Object-based representation in the human ventral pathway. Proceedings of the National Academy of Sciences U S A, 101, 5658–5663.
Pisella, L., Binkofski, B. F., Lasek, K., Toni, I., & Rossetti, Y. (2006). No double-dissociation between optic ataxia and visual agnosia: Multiple sub-streams for multiple visuo-manual integrations. Neuropsychologia, 44, 2734–2748.
Plaut, D. C. (2002). Graded modality-specific specialization in semantics: a computational account of optic aphasia. Cognitive Neuropsychology, 19, 603–639.
Polk, T. A., Park, J., Smith, M. R., & Park, D. C. (2007). Nature versus nurture in ventral visual cortex: A functional magnetic resonance imaging study of twins. Journal of Neuroscience, 27, 13921–13925.

Page 38 of 42

Organization of Conceptual Knowledge of Objects in the Human Brain Prinz, J. J. (2002). Furnishing the mind: Concepts and their perceptual basis. Cambridge, MA: MIT Press. Pulvermüller, F. (2005). Brain mechanisms linking language and action. Nature Reviews Neuroscience, 6, 576–582. Pulvermüller, F., Hauk, O., Nikolin, V. V., & Ilmoniemi, R. J. (2005). Functional links be tween language and motor systems. European Journal of Neuroscience, 21, 793–797. Rapcsak, S. Z., Ochipa, C., Anderson, K. C., & Poizner, H. (1995). Progressive ideomotor apraxia: Evidence for a selective impairment in the action production system. Brain and Cognition, 27, 213–236. Riddoch, M. J., Humphreys, G. W., Coltheart, M., & Funnell, E. (1988). Semantic systems or system? Neuropsychological evidence re-examined. Cognitive Neuropsychology, 5, 3– 25. Riesenhuber, M. (2007). Appearance isn’t everything: News on object representation in cortex. Neuron, 55, 341–344. Rogers, T. T., Hocking, J., Mechelli, A., Patterson, K., & Price, C. J. (2003). Fusiform activa tion to animals is driven by the process, not the stimulus. Journal of Cognitive Neuro science, 17, 434–445. Rogers, T. T., Lambon Ralph, M. A., Garrard, P., Bozeat, S., McClelland, J. L., et al. (2004). Structure and deterioration of semantic memory: A neuropsychological and computation al investigation. Psychological Review, 111, 205–235. Rogers, T. T., et al. (2005). Fusiform activation to animals is driven by the process, not the stimulus. Journal of Cognitive Neuroscience, 17, 434–445. Rosci, C., Chiesa, V., Laiacona, M., & Capitani, E. (2003). Apraxia is not associated to a disproportionate naming impairment for manipulable objects. Brain and Cognition, 53, 412–415. Rothi, L. J., Ochipa, C., & Heilman, K. M. (1991). A cognitive neuropsychological model of limb praxis. Cognitive Neuropsychology, 8, 443–458. Rumiati, R. L., Zanini, S., & Vorano, L. (2001). A form of ideational apraxia as a selective deficit of contention scheduling. 
Cognitive Neuropsychology, 18, 617–642. Rushworth, M. F. S., et al. (2006). Connection patterns distinguish 3 regions of human parietal cortex. Cerebral Cortex, 16, 1418–1430. Sacchett, C., & Humphreys, G. W. (1992). Calling a squirrel a squirrel but a canoe a wig wam: A category-specific deficit for artifactual objects and body parts. Cognitive Neu ropsychology, 9, 73–86.

Page 39 of 42

Organization of Conceptual Knowledge of Objects in the Human Brain Samson, D., & Pillon, A. (2003). A case of impaired knowledge for fruit and veg etables. Cognitive Neuropsychology, 20, 373–400. (p. 577)

Santos, L. R., & Caramazza, A. (2002). The domain-specific hypothesis: A developmental and comparative perspective on category-specific deficits. In E. M. E. Forde & G. W. Humphreys (Eds.), Category-specificity in the brain and mind. New York: Psychology Press. Sartori, G., & Lombardi, L. (2004). Semantic relevance and semantic disorders. Journal of Cognitive Neuroscience, 16, 439–452. Sartori, G., Lombardi, L., & Mattiuzzi, L. (2005). Semantic relevance best predicts normal and abnormal name retrieval. Neuropsychologia, 43, 754–770. Shallice, T. (1988). From neuropsychology to mental structure. Cambridge, UK: Cam bridge University Press. Shallice, T. (1993). Multiple semantics: Whose confusions? Cognitive Neuropsychology, 10, 251–261. Shelton, J. R., Fouch, E., & Caramazza, A. (1998). The selective sparing of body part knowledge: A case study. Neurocase, 4, 339–351. Silveri, M. C., Gainotti, G., Perani, D., Cappelletti, J. Y., Carbone, G., & Fazio, F. (1997). Naming deficit for non-living items: Neuropsychological and PET study. Neuropsychologia, 35, 359–367. Simmons, W. K., & Barsalou, L. W. (2003). The Similarity-in-Topography Principle: Recon ciling theories of conceptual deficits. Cognitive Neuropsychology, 20, 451–486. Sirigu, A., Duhamel, J., & Poncet, M. (1991). The role of sensorimotor experience in object recognition. Brain, 114, 2555–2573. Spelke, E. S., Breinlinger, K., Macomber, J., & Jacobson, K. (1992). Origins of knowledge. Psychological Review, 99, 605–632. Stewart, F., Parkin, A. J., & Hunkin, N. M. (1992). Naming impairments following recovery from herpes simplex encephalitis. Quarterly Journal of Experimental Psychology A, 44, 261–284. Strnad, L., Anzellotti, S., & Caramazza, A. (2011). Formal models of categorization: In sights from cognitive neuroscience. In E. M. Pothos & A. J. Wills (Eds.), Formal approach es in categorization (pp. 313–324). Cambridge, UK: Cambridge University Press. Tanaka, K., et al. (1991). 
Coding visual images of objects in the inferotemporal cortex of the macaque monkey. Journal of Neurophysiology, 66, 170–189. Thompson-Schill, S. L. (2003). Neuroimaging studies of semantic memory: Inferring “how” from “where.” Neuropsychologia, 41, 280–292. Page 40 of 42

Organization of Conceptual Knowledge of Objects in the Human Brain Thompson-Schill, S. L., Aguirre, G. K., D’Esposito, M., & Farah, M. J. (1999). A neural ba sis for category and modality specificity of semantic knowledge. Neuropsychologia, 37, 671–676. Tranel, D., Logan, C. G., Frank, R. J., & Damasio, A. R. (1997). Explaining category-relat ed effects in the retrieval of conceptual and lexical knowledge of concrete entities: Opera tionalization and analysis of factor. Neuropsychologia, 35, 1329–1339. Turnbull, O. H., & Laws, K. R. (2000). Loss of stored knowledge of object structure: Impli cation for “category-specific” deficits. Cognitive Neuropsychology, 17, 365–389. Tyler, L. K., & Moss, H. E. (2001). Towards a distributed account of conceptual knowl edge. Trends in Cognitive Sciences, 5, 244–252. Tyler, L. K., et al. (2003). Do semantic categories activate distinct cortical regions? Evi dence for a distributed neural semantic system. Cognitive Neuropsychology, 20, 541–559. Valyear, K. F., & Culham, J. C. (2009). Observing learned object-specific functional grasps preferentially activates the ventral stream. Journal of Cognitive Neuroscience, 22, 970– 984. Vinson, D. P., Vigliocco, G., Cappa, S., & Siri, S. (2003). The breakdown of semantic knowledge: Insights from a statistical model of meaning representation. Brain and Lan guage, 86, 347–365. Vuilleumier, P., et al. (2004) Distant influences of amygdala lesion on visual cortical acti vation during emotional face processing. Nature Neuroscience, 7, 1271–1278. Warrington, E. K., & McCarthy, R. (1983). Category specific access dysphasia. Brain, 106, 859–878. Warrington, E. K., & McCarthy, R. A. (1987). Categories of knowledge: Further fractiona tions and an attempted integration. Brain, 110, 1273–1296. Warrington, E. K., & Shallice, T. (1984). Category specific semantic impairment. Brain, 107, 829–854. Wellman, H. M., & Gelman, S. A. (1992). Cognitive development: Foundational theories of core domains. 
Annual Review of Psychology, 43, 337–375. Zannino, G. D., Perri, R., Carlesimo, G. A., Pasqualetti, P., & Caltagirone, C. (2002). Cate gory-specific impairment in patients with Alzheimer’s disease as a function of disease severity: A cross-sectional investigation. Neuropsychologia, 40, 2268–2279. Zannino, G. D., Perri, R., Pasqualetti, P., Caltagirone, C., & Carlesimo, G. A. (2006). Analy sis of the semantic representations of living and nonliving concepts: A normative study. Cognitive Neuropsychology, 23, 515–540.

Page 41 of 42

Organization of Conceptual Knowledge of Objects in the Human Brain Zwaan, R. A. (2004). The immersed experiencer: Toward an embodied theory of language comprehension. In B. H. Ross (Ed.), The psychology of learning and motivation. New York: Academic Press.

Bradford Z. Mahon

Bradford Z. Mahon, Departments of Neurosurgery and Brain and Cognitive Sciences, University of Rochester, Rochester, NY

Alfonso Caramazza

Alfonso Caramazza is Daniel and Amy Starch Professor of Psychology at Harvard University.


A Parallel Architecture Model of Language Processing

Ray Jackendoff
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0028

Abstract and Keywords

The Parallel Architecture is a linguistic theory in which (1) the generative power of language is divided among phonology, syntax, and semantics; and (2) words, idioms, morphological affixes, and phrase structure rules are stored in the lexicon in a common format. This formalization leads to a theory of language processing in which the “competence” grammar is put to work directly in “performance,” and which is compatible with psycholinguistic evidence concerning lexical retrieval, incremental and parallel processing, syntactic priming, and the integration of visual context with semantic processing.

Keywords: parallel architecture, language processing

Goals of a Theory of Language Processing—and Goals of Language Processing

The Parallel Architecture (Jackendoff, 1997, 2002, 2010; Culicover & Jackendoff, 2005) is a theory of language that preserves all the mentalistic and biological aspects of mainstream generative grammar (e.g., Chomsky, 1965, 1981, 1995, 2000), but which employs a theoretical technology better in tune with linguistic and psycholinguistic discoveries of the past 30 years. This chapter shows how the Parallel Architecture lends itself to a direct relation between linguistic structure (or “competence”) and language processing (or “performance”).

A theory of language processing has to explain how language users convert sounds into meanings in language perception and how they convert meanings into sounds in language production. In particular, the theory has to describe what language users store in long-term memory that enables them to do this, and how the material stored in memory is brought to bear in understanding and creating novel utterances in real time.


Linguistic theory is an account of the repertoire of utterances available to a speaker, abstracting away from the real-time aspects of language processing and from the distinction between perception and production. I take it that one should seek a linguistic theory that embeds gracefully into an account of language processing, and that can be tested through experimental techniques as well as through grammaticality judgments.

Unfortunately, many linguists assert that a theory of performance has no bearing on a theory of competence, and many psycholinguists “retaliate” by asserting that a theory of processing has no need for a theory of competence. But a linguistic theory that disregards processing cuts itself off from valuable sources of evidence and from potential integration into cognitive science. From the other side, processing theories that claim to do without a theory of competence always implicitly embody such a theory anyway, usually a theory that severely underestimates the complexity and richness of the repertoire of utterances. The goal here is to develop competence and performance theories that are adequate on their own turf and that also interact meaningfully with each other.

All linguistic theories consider utterances to be structured in several domains: at least phonological (sound) structure, syntactic (grammatical) structure, and semantic (meaning) structure. Therefore, a plausible working hypothesis is that the goal of language processing is to produce a correlated set of phonological, syntactic, and semantic structures that together match sound to meaning. In perceiving an utterance, the starting point is an unstructured phonetic string being apprehended over time, possibly with some gaps or uncertainty; the end point is a meaning correlated with a structured string of sounds. In producing an utterance, the starting point is a meaning (or thought), possibly complete, possibly developing as the utterance is being produced; the end point is a fully structured meaning correlated with a structured string of motor instructions that produce sounds. Because the correlation of sound and meaning is mediated by syntactic structure, the processor must also develop enough syntactic structure in both perception and production to be able to make the relation of sound and meaning explicit.1

These observations already suffice to call into question connectionist models of language perception whose success is judged by their ability to predict the next word of a sentence, given some finite preceding context (e.g., Elman, 1990; MacDonald & Christiansen, 2002; and, as far as I can determine, Tabor & Tanenhaus, 1999). The implicit theory of language behind such models is that well-formed language is characterized only by the statistical distribution of word sequencing. To be sure, statistics of word sequencing are sometimes symptomatic of meaning relations, but they do not constitute meaning relations. Consider the previous sentence: (1) How could a processor predict the full succession of words, and (2) what good would such predictions do in understanding the sentence? Moreover, predicting the next word has no bearing whatsoever on an explanation of speech production, where the goal has to be to produce the next word in an effort to say something meaningful.


More generally, we have known since Chomsky (1957) and Miller and Chomsky (1963) that sequential dependencies among words in a sentence are not sufficient to determine understanding or even grammaticality. For instance, in (1),

(1) Does the little boy in the yellow hat who Mary described as a genius like ice cream?

the fact that the italicized verb is like rather than likes is determined by the presence of does, fourteen words away; and we would have no difficulty making the distance longer. What is significant here is not the distance in words; it is the distance in noun phrases (NPs): the fact that does is one NP away from like. This relation is not captured in Elman-style recurrent networks, which (as pointed out by many critics over the past twenty years) take account only of word sequence and have no representation of global structure.

Other issues with connectionist models of language processing will arise below. However, my main focus here is the Parallel Architecture, to which we now turn.

The Parallel Architecture

The Parallel Architecture differs from mainstream generative grammar (MGG) in three important respects.

• MGG is syntactocentric: The generative power of language is invested in the syntax, and phonology and semantics are “interpretive,” derived from syntactic structure. In the Parallel Architecture, phonology, syntax, and semantics are independent generative components, linked by interfaces.
• MGG is derivation-based: The structure of a sentence is produced through a step-by-step algorithmic process, and is inherently directional. The Parallel Architecture is constraint-based and nondirectional.
• MGG maintains a strict formal distinction between the lexicon and rules of grammar. In the Parallel Architecture, words are relatively idiosyncratic rules in a continuum of generality with more general grammatical structure.

I take up these aspects of the Parallel Architecture in turn.

Phonology as an Independent Generative Component

A major theoretical development in the 1970s (e.g., Goldsmith, 1979; Liberman & Prince, 1977) showed that phonology has its own units and principles of combination, incommensurate with syntactic units, though correlated with them. For instance, consider the sentence in (2).

(2) Syntax:
[NP Sesame Street] [VP is [NP a production [of [NP the Children’s Television Workshop]]]]
Phonology:
[Sesame Street is a production of] [the Children’s Television Workshop]
or
[Sesame Street] [is a production] [of the Children’s Television Workshop]

The syntactic structure of (2) consists of a noun phrase (NP), Sesame Street, followed by a verb phrase (VP), the rest of the sentence. The VP in turn consists of the verb is plus another NP. This NP has embedded within it a further NP, the Children’s Television Workshop. However, the way the sentence is pronounced does not necessarily conform to this structure: It can be broken up into intonation contours (or breath groups) in various ways, two of which are illustrated in (2). Some of these units, for instance, Sesame Street is a production of in the first version, do not correspond to any syntactic constituent; moreover, this unit cannot be classified as an NP or a VP, because it cuts across the boundaries of both. Another such example is the familiar This is the cat that chased the rat that ate the cheese. This has relentlessly right-embedded syntax; but the intonation is a flat structure with three parallel parts, only the last of which corresponds to a syntactic constituent.

The proper way to characterize the pronunciation of these examples is in terms of Intonation Phrases (IPs), phonological units over which intonation contours and the position of pauses are defined. The pattern of intonation phrases is to some degree independent of syntactic structure, as seen from the two possibilities in (2). Nevertheless it is not entirely free. For instance, (3) is not a possible pronunciation of this sentence.

(3) *[Sesame] [Street is a] [production of the Children’s] [Television Workshop]

Thus there is some correlation between phonological and syntactic structure, which a theory of intonation needs to characterize. A first approximation to the proper account for English appears to be the following principles, stated very informally (Gee & Grosjean, 1983; Hirst, 1993; Jackendoff, 1987; Truckenbrodt, 1999)2:

(4) a. An Utterance is composed of a string of IPs; IPs do not embed in each other.
b. An IP must begin at the beginning of a syntactic constituent. It may end before the syntactic constituent does, but it may not go beyond the end of the largest syntactic constituent that it starts.

Inspection will verify that (2) observes this matching. But (3) does not, because the second IP begins with the noun Street, and there is no constituent starting with Street that also contains is a.

These examples illustrate that phonological structure requires its own set of basic units and combinatorial principles such as (4a). It is generative in the same sense that syntax is, though not recursive. In addition, because units of phonological structure such as IPs cannot be derived from syntactic structure, the grammar needs principles such as (4b) that stipulate how phonological and syntactic structures can be correlated. In the Parallel Architecture, these are called interface rules.3
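The matching stated informally in (4) can be checked mechanically. The following is only a minimal sketch under stated assumptions, not part of the theory's formalism: constituents and IPs are encoded as half-open word-index spans, every single word and the whole sentence are assumed to count as constituents, and all function names are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open word-index interval [start, end)

# Words of example (2), indexed 0..9.
WORDS = ["Sesame", "Street", "is", "a", "production", "of",
         "the", "Children's", "Television", "Workshop"]

# Assumed encoding of the bracketing in (2), plus the whole sentence (S)
# and each single word as a constituent.
PHRASES: List[Span] = [(0, 10), (0, 2), (2, 10), (3, 10), (5, 10), (6, 10)]
CONSTITUENTS: List[Span] = PHRASES + [(i, i + 1) for i in range(len(WORDS))]


def ips_are_flat(ips: List[Span], n_words: int) -> bool:
    """(4a): the IPs form a flat, gap-free string covering the utterance."""
    position = 0
    for start, end in ips:
        if start != position or end <= start:
            return False
        position = end
    return position == n_words


def ip_respects_syntax(ip: Span, constituents: List[Span]) -> bool:
    """(4b): an IP starts where some constituent starts and does not run
    past the end of the largest constituent starting at that point."""
    start, end = ip
    ends = [c_end for c_start, c_end in constituents if c_start == start]
    return bool(ends) and end <= max(ends)


def well_formed(ips: List[Span]) -> bool:
    """Check both (4a) and (4b) for a proposed division into IPs."""
    return ips_are_flat(ips, len(WORDS)) and all(
        ip_respects_syntax(ip, CONSTITUENTS) for ip in ips)


first_version = [(0, 6), (6, 10)]             # first phonology line in (2)
second_version = [(0, 2), (2, 5), (5, 10)]    # second phonology line in (2)
starred = [(0, 1), (1, 4), (4, 8), (8, 10)]   # the impossible phrasing (3)
```

Under this encoding, the two phrasings in (2) pass and (3) fails at its second IP, because the largest constituent starting at Street is the word itself.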

Semantics as an Independent Generative Component

Several different and incompatible approaches to semantics developed during the 1970s and 1980s: formal semantics (Chierchia & McConnell-Ginet, 1990; Lappin, 1996; Partee, 1976), Cognitive Grammar (Lakoff, 1987; Langacker, 1987; Talmy, 1988), Conceptual Semantics (Jackendoff, 1983, 1990; Pinker, 1989, 2007), and approaches growing out of cognitive psychology (Collins & Quillian, 1969; Rosch & Mervis, 1975; Smith & Medin, 1981; Smith, Shoben, & Rips, 1974) and artificial intelligence (Schank, 1975). Whatever the radical differences among them, they implicitly agreed on one thing: Meanings of sentences are not made up of syntactic units such as verbs, noun phrases, and prepositions. Rather, they are combinations of specifically semantic units such as (conceptualized) individuals, events, times, places, properties, and quantifiers, none of which correspond one to one with syntactic units; and these semantic units are combined according to principles that are specific to semantics and distinct from syntactic principles. This means that semantics, like phonology, must be an independent generative system, not strictly derivable from syntactic structure, but only correlated with it. The correlation between syntax and semantics again takes the form of interface rules that state the connection between the two types of mental representation.

The Parallel Architecture, unlike MGG and most other linguistic theories incorporated into processing models, incorporates a rich and explicit theory of semantics: Conceptual Semantics (Jackendoff, 1983, 1990, 2002, chapters 9–12). This theory is what makes it possible to explore the ways in which syntax does and does not match up with meaning and the ways in which semantics interfaces with other sorts of cognitive capacities, both perception and “world knowledge.”

Figure 28.1 The Parallel Architecture.

Granting semantics its independence from syntax makes sense both psychologically and biologically. Sentence meanings are, after all, the combinatorial thoughts that spoken sentences convey. Thoughts (or concepts) have their own structure, evident even in nonhuman primates (Cheney & Seyfarth, 1990; Hauser, 2000), and language is at its basis a combinatorial system for expressing thoughts. (See Pinker & Jackendoff, 2005, and Jackendoff & Pinker, 2005, for discussion.)

To sum up, we arrive at an architecture for language along the lines of Figure 28.1. Here the interfaces are indicated by double arrows to signify that they characterize correlations of structures with each other rather than derivation of one structure from the other.

Constraint-Based Principles of Grammar

A second major feature of the Parallel Architecture is that it is constraint based and nondirectional. Instead of classical phrase structure rules such as (5), in which the symbol S is “expanded as” or “rewritten as” NP plus VP and so on, the Parallel Architecture lists available pieces of structure or “treelets,” as in (6).

A tree is built by “clipping together” these treelets at nodes they share, working from the bottom up, or from the top down, or from anywhere in the middle, as long as the resulting tree ends up with S at the top and terminal symbols at the bottom. No order for building trees is logically prior to any other. Alternatively, one can take a given tree and check its well-formedness by making sure that every part of it conforms to one of the treelets; the structures in (6) then function as constraints on possible trees rather than as algorithmic generative engines for producing trees. Hence the constraint-based formalism does not presuppose any particular implementation; it is compatible with serial, parallel, top-down, or bottom-up computation.

The constraint-based formalism is not confined to the Parallel Architecture. It is a major feature of several other non-mainstream versions of generative grammar, such as Lexical Functional Grammar (Bresnan, 2001), Head-driven Phrase Structure Grammar (Pollard & Sag, 1994), and Optimality Theory (Prince & Smolensky, 1993/2004). An important part of this formalism is that constraints can be violable and can compete with each other; it is beyond the scope of this article to describe the various theoretical approaches to resolving constraint conflict.

This approach is advantageous in making contact with models of processing. For example, suppose an utterance begins with the word the. This is listed in the lexicon as a determiner, so we begin with the subtree (7).

(7)


Det is the initial node in treelet (6b), which can therefore be clipped onto (7) to produce (8).

(8)

In turn, an initial NP fits into treelet (6a), which in turn can have (6c) clipped into its VP, giving (9),

(9)

and we are on our way to anticipatory parsing, that is, setting up grammatical expectations on the basis of an initial word. Further words in the sentence may be attached on the basis of the top-down structure anticipated in (9). Alternatively, they may disconfirm it, as in the sentence The more I read, the less I understand, in which case other treelets had better be available that can license the construction.4
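The two uses of treelets described above, checking a finished tree and projecting expectations from an initial word, can be sketched in a few lines. This is an illustrative sketch only: the three treelets are assumed to be [S NP VP], [NP Det N], and [VP V NP], inferred from the surrounding description of (6a–c), and the tiny lexicon and all function names are hypothetical.

```python
# Assumed treelets, inferred from the description of (6a-c) in the text.
TREELETS = [
    ("S", ("NP", "VP")),   # assumed (6a)
    ("NP", ("Det", "N")),  # assumed (6b)
    ("VP", ("V", "NP")),   # assumed (6c)
]

# Hypothetical toy lexicon: word -> syntactic category.
LEXICON = {"the": "Det", "cat": "N", "dog": "N", "chased": "V"}


def licensed(tree) -> bool:
    """Constraint checking: every local piece of the tree must match a
    treelet, and every word must carry its lexical category."""
    label, body = tree
    if isinstance(body, str):                      # preterminal over a word
        return LEXICON.get(body) == label
    daughters = tuple(child[0] for child in body)
    return (label, daughters) in TREELETS and all(licensed(c) for c in body)


def is_sentence(tree) -> bool:
    """A well-formed sentence has S at the top and is licensed throughout."""
    return tree[0] == "S" and licensed(tree)


def anticipate(category: str):
    """Anticipatory parsing: starting from an initial category, collect the
    treelets that take it (and then its mother) as leftmost daughter."""
    chain, seen, current = [], set(), category
    while current not in seen:
        seen.add(current)
        match = next((t for t in TREELETS if t[1][0] == current), None)
        if match is None:
            break
        chain.append(match)
        current = match[0]
    return chain


good = ("S", [("NP", [("Det", "the"), ("N", "cat")]),
              ("VP", [("V", "chased"),
                      ("NP", [("Det", "the"), ("N", "dog")])])])
bad = ("S", [("VP", [("V", "chased"),
                     ("NP", [("Det", "the"), ("N", "dog")])]),
             ("NP", [("Det", "the"), ("N", "cat")])])
```

Here `licensed` imposes no order of construction; it only verifies conformity, which is the sense in which the treelets act as constraints. `anticipate("Det")` climbs from the determiner to NP and then to S, roughly the expectations set up in (8) and (9).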

In psycholinguistics, the term constraint-based seems generally to be used to denote a lexically driven connectionist architecture along the lines of MacDonald et al. (1994). Like the constraint-based linguistic theories, these feature multidimensional constraint satisfaction and the necessity to resolve competition among conflicting constraints. However, as MacDonald and Christiansen (2002) observe, the constraint-based aspects of such processing theories can be separated from the connectionist aspects. Indeed, one of the earliest proposals for lexically driven constraint-based parsing, by Ford et al. (1982), is couched in traditional symbolic terms.

The constraints that govern structure in the Parallel Architecture are not all word-based, as they are for MacDonald et al. The projection of the into a Determiner node in (7) is indeed word-based. But all the further steps leading to (9) are accomplished by the treelets in (6), which are phrasal constraints that make no reference to particular words. Similarly, the use of the prosody–syntax interface constraint (4b) constrains syntactic structure without reference to particular words. In general, as will be seen later, the building of structure is constrained by a mixture of word-based, phrase-based, semantically based, and even pragmatically based conditions.


No Strict Lexicon Versus Grammar Distinction

Every mentalistic linguistic theory takes a word to be an association in long-term memory between pieces of phonological, syntactic, and semantic structure. The phonological and semantic structures of words are typically much richer than their syntactic structures. For example, the words dog, cat, chicken, kangaroo, worm, and elephant are differentiated in sound and meaning, but they are syntactically indistinguishable: they are all just singular count nouns. Similarly for all the color words and for all the verbs of locomotion such as walk, jog, swagger, slither, and so on.

In the Parallel Architecture, a word is treated as a small interface rule that plays a role in the composition of sentence structure. It says that in building the structure for a sentence, this particular piece of phonology can be matched with this piece of meaning and these syntactic features. So, for instance, the word cat has a lexical structure along the lines of (10a), and the has a structure like (10b).

(10) a. kæt₁ – N₁ – CAT₁
b. ðə₂ – Det₂ – DEF₂

The first component of (10a) is a phonological structure; the second marks it as a noun; the third is a stand-in for whatever semantic features are necessary to distinguish cats from other things. (10b) is similar (where DEF is the feature “definiteness”). The co-subscripting of the components is a way of notating that the three parts are linked in long-term memory (even if it happens that they are localized in different parts of the brain).

When words are built into phrases, structures are built in all three components in parallel, yielding a linked trio of structures like (11) for the cat.

(11)

Here the subscript 1 binds together the components of cat, and the subscript 2 binds together the components of the.

A word can stipulate contextual restrictions on parts of its environment; these include, among other things, traditional notions of subcategorization and selectional restrictions. For example, the transitive verb devour requires a direct object in syntactic structure. Its semantics requires two arguments: an action of devouring must involve a devourer (the agent) and something being devoured (the patient). Moreover, the patient has to be expressed as the direct object of the verb. Thus the lexical entry for this verb can be notated as (12). The material composing the verb itself is notated in roman type. The contextual restrictions are notated in italics: NP, X, and Y are variables that must be satisfied in order for a structure incorporating this word to be well formed. The fact that the patient must appear in object position is notated in terms of the subscript 4 shared by the syntactic and semantic structure.

(12) dəvawr₃ – V₃ NP₄ – [[X; ANIMATE] DEVOUR₃ [Y; EDIBLE]₄]

In the course of parsing, if the parser encounters devour, the syntactic and semantic structure of (12) will create an anticipation of a direct object that denotes some edible entity.

The word-based projection of structure illustrated in (12) is entirely parallel to that in lexically driven models of parsing such as that of MacDonald et al. (1994). However, MacDonald et al. claim that all structure is built on the basis of word-based contextual constraints. This strategy is not feasible in light of the range of structures in which most open-class items can appear (and it is questioned experimentally by Traxler et al., 1998). For example, we do not want every English noun to stipulate that it can occur with a possessive, with quantifiers, with prenominal adjectives, with postnominal prepositional phrase modifiers, and with relative clauses, and if a count noun, in the plural. These possibilities are a general property of noun phrases, captured in the phrasal rules, and they do not belong in every noun’s lexical entry. Similarly, we do not want every verb to stipulate that it can occur in every possible inflectional form, and that it can co-occur with a sentential adverbial, a manner adverbial (if semantically appropriate), time and place phrases, and so on. Nor, in German, do we want every verb to say that it occurs second in main clauses and last in subordinate clauses. These sorts of linguistic phenomena are what a general theory of syntax accounts for, and for which general phrasal rules like (6) are essential.5 Furthermore, the constraints between prosodic and syntactic constituency discussed earlier cannot be coded on individual words either. It is an empirical problem to sort out which constraints on linguistic form are word-based, which are phrase based, which involve syntactic structure, which involve semantic or prosodic structure, and which involve interface conditions.

An important feature of the treatment of words illustrated in (12) is that it extends directly to linguistic units both smaller and larger than words. For instance, consider the English regular plural inflection, which can be formalized in a fashion entirely parallel to (12).

(13) Wd₆+z₅ – N₆+aff₅ – [PLUR₅ (X₆)]

The phonological component of (13) says that the phoneme z is appended to a phonological word. The syntactic component says that an affix appears attached to a noun. The co-subscripting indicates that this affix is pronounced z and the noun corresponds to the phonological word that z is added to. The semantic component of (13) says that the concept expressed by this phonological word is pluralized. Thus the regular plural is formally similar to a transitive verb; the differences lie in what syntactic category it belongs to and what categories it attaches to in syntax and phonology.
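The idea that a lexical item is a linked triple, and that the regular plural combines freely with a noun, can be made concrete with a small data structure. The sketch below is an illustration under assumptions, not the notation of the theory: the `Entry` record, the bracketed string notation for syntax and semantics, and both function names are hypothetical, and phonological adjustments such as the devoicing of z after t are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """A stored piece of structure: linked phonology, syntax, semantics."""
    phon: str
    syn: str
    sem: str


CAT = Entry("kæt", "N", "CAT")   # after (10a)
THE = Entry("ðə", "Det", "DEF")  # after (10b)


def pluralize(noun: Entry) -> Entry:
    """Free combination of the regular plural, after (13), with a noun:
    z is appended in phonology, an affix in syntax, PLUR in semantics."""
    if noun.syn != "N":
        raise ValueError("the regular plural attaches to a singular noun")
    return Entry(noun.phon + "z", "N+aff", f"PLUR({noun.sem})")


def noun_phrase(det: Entry, noun: Entry) -> Entry:
    """Build the three structures of a Det + N phrase in parallel,
    in the spirit of (11); the bracket notation is a placeholder."""
    if det.syn != "Det" or noun.syn not in ("N", "N+aff"):
        raise ValueError("expected a determiner followed by a noun")
    return Entry(f"{det.phon} {noun.phon}",
                 f"[NP {det.syn} {noun.syn}]",
                 f"[{det.sem} {noun.sem}]")


the_cat = noun_phrase(THE, CAT)
the_cats = noun_phrase(THE, pluralize(CAT))
```

Each combining step updates all three components at once, which is the sense in which the structures are built in parallel rather than derived from syntax.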


This conception of regular affixation is somewhat different from Pinker’s (1999). Pinker would state the regular plural as a procedural rule: “To form the plural of a noun, add z.” In the present account, the regular plural is at once a lexical item, an interface rule, and a rule for combining an affix and a noun, depending on one’s perspective. However, the present analysis does preserve Pinker’s dichotomy between regular affixation and irregular forms. As in his account, irregular forms must be listed individually, whereas regular forms can be constructed by combining (13) with a singular noun. In other words, this is a “dual-process” model of inflection. However, the “second” process, that of free combination, is exactly the same as is needed for combining transitive verbs with their objects. Notice that every theory of language needs a general process for free combination of verbs and their objects; the combinations cannot be memorized. So parsimony does not constitute a ground for rejecting this particular version of the dual-process model.6

Next consider lexical entries that are larger than a word. For example, the idiom kick the bucket is a lexical VP with internal phonological and syntactic structure:

(14)

Here the three elements in phonology are linked to the three terminal elements of the VP (V, Det, and N). However, the meaning is linked not to the individual words but rather to the VP as a whole (subscript 10). Thus the words have no meaning on their own; only the entire VP has meaning. This is precisely what it means for a phrase to be an idiom: Its meaning cannot be predicted from the meanings of its parts, but instead must be learned and stored as a whole.

Once we acknowledge that pieces of syntactic structure are stored in long-term memory associated with idiomatic meanings in items like (14), it is a short step to also admitting pieces of structure that lack inherent meanings, such as the “treelets” in (6). This leads to a radical conclusion from the mainstream point of view: words, regular affixes, idioms, and ordinary phrase structure rules like (6) can all be expressed in a common formalism, namely as pieces of linguistic structure stored in long-term memory. The lexicon is not a separate component of grammar from the rules that assemble sentences. Rather, what have traditionally been distinguished as “words” and “rules” are simply different sorts of stored structure. “Words” are idiosyncratic interface rules; “rules” may be general interface rules, or they may be simply stipulations of possible structure in one component or another. Novel sentences are “generated” by “clipping together” pieces of stored structure, an operation called unification (Shieber, 1986).7

Under this interpretation of words and rules, the distinction between word-based parsing and rule-based parsing disappears. This yields an immediate benefit in the description of syntactic priming, in which the use of a particular syntactic structure such as a ditransitive verb phrase primes subsequent appearances (Bock, 1995; Bock & Loebell, 1990). As Bock (1995) observes, the existence of syntactic priming is problematic within mainstream assumptions: There is no reason that rule application should behave anything like lexical access. However, in the Parallel Architecture, where syntactic constructions and words are both pieces of stored structure, syntactic priming is to be expected, altogether parallel to word priming.

Summing up the last three sections, the Parallel Architecture acknowledges all the complexity of linguistic detail addressed by mainstream theory, but it proposes to account for this detail in different terms. Phonology and semantics are combinatorially independent from syntax; ordered derivations are replaced by parallel constraint checking; words are regarded as interface rules that help mediate between the three components of language; and words and rules are both regarded as pieces of stored structure. Jackendoff (2002) and Culicover and Jackendoff (2005) demonstrate how this approach leads to far more natural descriptions of many phenomena such as idioms and offbeat “syntactic nuts” that have been either problematic or ignored in the mainstream tradition. Culicover and Jackendoff (2005) further show how this approach leads to a considerable reduction in the complexity of syntactic structure, an approach called Simpler Syntax. From the point of view of psycholinguistics, this should be a welcome result.
The syntactic structures posited by contemporary MGG are far more complex than have been or could be investigated experimentally, whereas the structures of Simpler Syntax are for the most part commensurate with those that have been assumed in the last three decades of psycholinguistic and neurolinguistic research.
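The operation of “clipping together” stored pieces by unification can be illustrated with a minimal sketch. It assumes a deliberately simplified encoding: feature structures are nested Python dictionaries, atomic values are strings, and there are no shared variables or reentrancy, which a full unification formalism would include. The feature names (`cat`, `head`, `object`, `phon`, `sem`) and the example entries are hypothetical.

```python
def unify(a, b):
    """Return the unification of two feature structures, or None on a clash.

    Dictionaries are merged feature by feature; atomic values unify only
    with themselves. Neither input is modified.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # incompatible values: unification fails
                result[feature] = merged
            else:
                result[feature] = value
        return result
    return a if a == b else None


# A phrasal "treelet" and a word, both treated as stored structure.
vp_treelet = {"cat": "VP", "head": {"cat": "V"}, "object": {"cat": "NP"}}
devour = {"head": {"cat": "V", "phon": "devour", "sem": "DEVOUR"}}

combined = unify(vp_treelet, devour)
clash = unify(vp_treelet, {"head": {"cat": "N", "phon": "dog"}})
print(combined)
print(clash)
```

Because the treelet and the word are the same kind of object, one operation serves for both, which is the sense in which the distinction between word-based and rule-based combination disappears.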

Processing in the Parallel Architecture: General Considerations

The Parallel Architecture is motivated primarily on grounds of its ability to account for the phenomena addressed by traditional linguistic theory; that is, it is a “competence” model in the classical sense. However, we have begun to see that it also has implications for processing. We now turn more specifically to embedding the Parallel Architecture in a processing theory that helps clarify certain debates in psycholinguistics and that also allows psycholinguistic evidence to bear directly on issues of linguistic theory. (See also Jackendoff, 2002, chapter 7, which discusses production as well as perception.)

To begin, it is necessary to discuss two general considerations in processing. The first is serial versus parallel processing. When the parser encounters a local structural ambiguity, does it only pursue one preferred analysis, backing up if it makes a mistake, or does it pursue multiple options in parallel? Through the past three decades, as these two alternative hypotheses have competed in the literature and have been refined to deal with new experimental evidence, they have become increasingly indistinguishable (Lewis, 2000). On the one hand, a parallel model has to rank alternatives for plausibility; on the other hand, a serial model has to be sensitive to detailed lexically conditioned alternatives that imply either some degree of parallelism or a phenomenally fast recovery from certain kinds of incorrect analyses.

The Parallel Architecture cannot settle this dispute, but it does place a distinct bias on the choice. It has been clear since Swinney (1979) and Tanenhaus et al. (1979) that lexical access in language perception is “promiscuous”: An incoming phonological string activates all semantic structures associated with it, whatever their relevance to the current semantic context, and these remain activated in parallel for some time in working memory. As shown earlier, the Parallel Architecture treats syntactic treelets as the same formal type as words: Both are pieces of structure stored in long-term memory. A structural ambiguity such as that in (15) arises by activating different treelets and/or combining them in different ways, a treatment not so different in spirit from a lexical ambiguity.

(15) My professor told the girl that Bill liked the story about Harry.

This suggests that on grounds of consistency, the Parallel Architecture recommends parallel processing. Thus in developing a model of processing, I assume all the standard features of parallel processing models, in particular competition among mutually inhibitory analyses.

A second ongoing dispute in the language processing literature concerns the character of working memory. One view (going back at least to Neisser, 1967) sees working memory as functionally separate from long-term memory, a “place” where incoming information can be structured. In this view, lexical retrieval involves in some sense copying or binding the long-term coding of a word into working memory. By contrast, semantic network and connectionist architectures for language processing (e.g., Smith & Medin, 1981; Elman et al., 1996; MacDonald & Christiansen, 2002; MacDonald et al., 1994) make no distinction between long-term and working memory. For them, “working memory” is just the part of long-term memory that is currently activated (plus, in Elman’s recurrent network architecture, a copy of the immediately preceding input); lexical retrieval consists simply of activating the word’s long-term encoding, in principle a simpler operation.

Such a conception, though, does not allow for the building of structure. Even if the words of a sentence being perceived are activated, there is no way to connect them up; the dog chased a cat, the cat chased a dog, and dog cat a chased the activate exactly the same words. There is also no principled way to account for sentences in which the same word occurs twice, such as my cat likes your cat: the sentence refers to two distinct cats, even though there is (presumably) only one “cat node” in the long-term memory network. Jackendoff (2002, section 3.5) refers to this difficulty as the “Problem of 2” and shows that it recurs in many cognitive domains, for example, in recognizing two identical forks on the table, or in recognizing a melody containing two identical phrases. In an approach with a separate working memory, these problems do not arise: There are simply two copies of the same material in working memory, each of which has its own relations to other material (including the other copy).

An approach lacking an independent working memory also cannot make a distinction between transient and permanent linkages. For instance, recall that MacDonald et al. (1994) propose to account for structure by building into lexical items their potential for participating in structure. For them, structure is composed by establishing linkages among the relevant parts of the lexical entries. However, consider the difference between the phrases throw the shovel and kick the bucket. In the former, where composition is accomplished on the spot, the linkage between verb and direct object has to be transient and not affect the lexical entries of the words. But in the latter, the linkage between the verb and direct object is part of one’s lexical knowledge and therefore permanent. This distinction is not readily available in the MacDonald et al. model. A separate working memory deals easily with the problem: Both examples produce linkages in working memory, but only kick the bucket is linked in long-term memory.

Neural network models suffer from two other important problems (discussed at greater length in Jackendoff, 2002, section 3.5). First, such models encode long-term memories as connection strengths among units in the network, acquired through thousands of steps of training. This gives no account of one-time learning of combinatorial structures, such as the meaning of the sentence I’ll meet you for lunch at noon, a single utterance of which can be sufficient to cause the hearer to show up for lunch.
In a model with a separate working memory, the perception of this sentence leads to copying of the composite meaning into episodic memory (or whatever is responsible for keeping track of obligations and formulating plans), which is distinct from linguistic knowledge.

Finally, a standard neural network cannot encode a general relation such as X is identical with Y, X rhymes with Y,8 or X is the (regular) past tense of Y. Connectionists, when pressed (e.g., Bybee & McClelland, 2005), claim that there are no such general relations; there are only family resemblances among memorized items, to which novel examples are assimilated by analogy. But to show that there is less generality than was thought is not to show that there are no generalizations. The syntactic generalizations mentioned earlier, such as the multiple possibilities for noun modification, again can be cited as counterexamples; they require typed variables such as N, NP, V, and VP in order to be statable. Marcus (1998, 2001), in important work that has been met with deafening silence by the connectionist community,9 demonstrates that neural networks in principle cannot encode the typed variables necessary for instantiating general relations, including those involved in linguistic combinatoriality.

This deficit is typically concealed by dealing with toy domains with small vocabularies and a small repertoire of structures. It is no accident that the domains of language in which neural network architectures have been most successful are those that make minimal use of structure, such as word retrieval, lexical phonology, and relatively simple morphology. All standard linguistic theories give us a handle on how to analyze complex sentences like the ones you are now reading; but despite more than twenty years of connectionist modeling, no connectionist model comes anywhere close. (For instance, the only example worked out in detail by MacDonald et al., 1994, is the two-word utterance, John cooked.)

Accordingly, as every theory of processing should, a processing model based on the Parallel Architecture posits a working memory separate from long-term memory: a “workbench” or “blackboard” in roughly the sense of Arbib (1982), on which structures are constructed online. Linguistic working memory has three subdivisions or “departments,” one each for the three components of grammar, plus the capability of establishing linkages among their parts in terms of online bindings of the standard (if ill-understood) sort studied by neuroscience. Because we are adopting parallel rather than serial processing, each department is capable of maintaining more than one hypothesis, linked to one or more hypotheses in other departments.10

This notion of working memory contrasts with Baddeley’s (1986) influential treatment, in which linguistic working memory is a “phonological loop” where perceived phonological structure is rehearsed. Baddeley’s approach does not tell us how phonological structure is constructed, nor how the corresponding syntactic and semantic structures are constructed and related to phonology. Although a phonological loop may be adequate for describing the memorization of strings of nonsense syllables (Baddeley’s principal concern), it is not adequate for characterizing the understanding of strings of meaningful syllables, that is, the perception of real spoken language. And relegating the rest of language processing to a general-purpose “central executive” simply puts off the problem. (See Jackendoff, 2002, pp. 205–207, for more discussion.)
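The contrast between memory as mere activation and a separate working memory can be made concrete with a small sketch. It is a toy under stated assumptions, not a model from the literature: `activate` stands in for an account in which perceiving a sentence just switches on long-term entries, and `bind` stands in for copying entries into working memory as distinct tokens. Both function names and the `Token` class are introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One working-memory copy of a long-term lexical entry."""
    entry: str      # which long-term entry this is a copy of
    position: int   # where the copy sits in the current utterance


def activate(sentence: str) -> frozenset:
    """Activation-only memory: the set of entries that are switched on."""
    return frozenset(sentence.split())


def bind(sentence: str) -> list:
    """Separate working memory: one token per occurrence, in order."""
    return [Token(word, i) for i, word in enumerate(sentence.split())]


s1, s2, s3 = "the dog chased a cat", "the cat chased a dog", "dog cat a chased the"
same_activation = activate(s1) == activate(s2) == activate(s3)
different_bindings = bind(s1) != bind(s2)

two_cats = [t for t in bind("my cat likes your cat") if t.entry == "cat"]
print(same_activation, different_bindings, two_cats)
```

The three word strings are indistinguishable under `activate` but not under `bind`, and my cat likes your cat yields two distinct `cat` tokens, each free to enter its own relations.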

An Example

Figure 28.2 Linguistic working memory after phonetic processing of the first five syllables of (16a,b).

With this basic conception of working memory in place, we will now see how the knowledge structures posited by the Parallel Architecture can be put to use directly in the process of language perception. Consider the following pair of sentences.

(16) a. It’s not a parent, it’s actually a child.
     b. It’s not apparent, it’s actually quite obscure.

(16a,b) are phonetically identical (at least in my dialect) up to their final two words. However, they are phonologically different: (16a) has a word boundary that (16b) lacks. They are also syntactically different: a parent is an NP, whereas apparent is an adjective phrase (AP). And of course they are semantically different as well. The question is how the two interpretations are developed in working memory and distinguished at the end.

Suppose that auditory processing deposits raw phonetic input into working memory. Figure 28.2 shows what working memory looks like when the first five syllables of (16) have been so assimilated. (I beg the reader’s indulgence in idealizing away from issues of phoneme identification, which are of course nontrivial.)

At the next stage of processing, the lexicon must be called into play in order to identify which words are being heard. For convenience, let us represent the lexicon as in Figure 28.3, treating it as a relatively unstructured collection of phonological, syntactic, and semantic structures (sometimes linked) of the sort illustrated in (6), (10), and (12) to (14) above.

Working memory, seeking potential lexical matches, sends a call to the lexicon, in effect asking, “Do any of you in there sound like this?” And various phonological structures “volunteer” or are activated. Following Swinney (1979) and Tanenhaus, Leiman, and Seidenberg (1979), all possible forms with the appropriate phonetics are activated: both it’s and its, both not and knot, and both apparent and a + parent. This experimental result stands to reason, given that at this point only phonetic information is available to the processor. However, following the lexically driven parsing tradition, we also can assume that the degree and/or speed of activation of the alternative forms depends on their frequency.
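The “call to the lexicon” just described can be sketched as exhaustive matching of stored forms against the phonetic string. This is a toy illustration only: the lexicon fragment, its rough phonetic transcriptions, and the function name `segmentations` are assumptions made here, and frequency effects on activation are left out.

```python
# Hypothetical lexicon fragment: phonetic form -> all words that sound like it.
LEXICON = {
    "ɪts": ["it's", "its"],
    "nɑt": ["not", "knot"],
    "ə": ["a"],
    "pɛrənt": ["parent"],
    "əpɛrənt": ["apparent"],
}


def segmentations(phonetics: str, lexicon: dict) -> list:
    """Return every way of carving the phonetic string into lexical items.

    Each result is one candidate word sequence; no candidate is filtered
    out for plausibility, mirroring "promiscuous" lexical access.
    """
    if not phonetics:
        return [[]]
    candidates = []
    for form, words in lexicon.items():
        if phonetics.startswith(form):
            for rest in segmentations(phonetics[len(form):], lexicon):
                for word in words:
                    candidates.append([word] + rest)
    return candidates


fragment = segmentations("əpɛrənt", LEXICON)
full = segmentations("ɪtsnɑtəpɛrənt", LEXICON)
print(fragment)
print(len(full))
```

For the fragment alone, the two candidates are a + parent and apparent; for the whole string, every combination of the homophones is returned, which is why later stages must do the disambiguating.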

Figure 28.3 A fragment of the lexicon.

Figure 28.4 The lexicon after being called by working memory in Figure 28.2.


Figure 28.5 Activated lexical items are copied/bound into working memory, creating multiple “drafts.”

Phonological activation in the lexicon spreads to linked syntactic and semantic structures. If a lexical item’s semantic structure has already been primed by context, its activation will be faster or more robust, or both. Moreover, once lexical semantic structure is activated, it begins to prime semantically related lexical items. The result is depicted in Figure 28.4, in which the activated items are indicated in bold.

Figure 28.6 The status of working memory and the lexicon after syntactic integration.

Next, the activated lexical items are bound to working memory. However, not only the phonological structure is bound; the syntactic and semantic structures are also bound (or copied) to the appropriate departments of working memory, yielding the configuration in Figure 28.5. Because there are alternative ways of carving the phonological content into lexical items, working memory comes to contain mutually inhibitory “drafts” (in the sense of Dennett, 1991) of what is being heard. At this point in processing, there is no way of knowing which of the two competing “drafts” is correct. (For convenience in exposition, from here on we consider just the fragment of phonetics corresponding to apparent or a + parent, ignoring its/it’s and not/knot.)

The syntactic department of working memory now contains strings of syntactic elements, so it is now possible to undertake syntactic integration: the building of a unified syntactic structure from the fragments now present in working memory. Syntactic integration uses the same mechanism as lexical access: the strings in working memory activate treelets in long-term memory. In turn these treelets are unified with the existing strings. The string Det—N thus becomes an NP, and the adjective becomes an AP, as shown in Figure 28.6.

Figure 28.7 The status of working memory after semantic integration.

The other necessary step is semantic integration: building a unified semantic structure from the pieces of semantic structure bound into working memory from the lexicon. This process has to make use of at least two sets of constraints. One set is the principles of semantic well-formedness: unattached pieces of meaning have to be combined in a fashion that makes sense, both internally and also in terms of any context that may also be present in semantic working memory. In the present example, these principles will be sufficient to bring about semantic integration: INDEF and PARENT can easily be combined into a semantic constituent, and APPARENT forms a constituent on its own. The resulting state of working memory looks like Figure 28.7.

However, in more complex cases, semantic integration also has to use the syntax–semantics interface rules (also stored in the lexicon, but not stated here), so that integrated syntactic structures in working memory can direct the arrangement of the semantic fragments. In such cases, semantic integration is dependent on successful syntactic integration. Consider the following example:

(17) What did Sandy say that Pat bought last Tuesday?

Figure 28.8 Status of working memory after it’s actually a child is semantically integrated.

Figure 28.9 The semantics of the lower draft is extinguished.

In order for semantic integration to connect the meaning of what to the rest of the interpretation, syntactic integration must determine that it is the object of bought rather than the object of say. Syntactic integration must also determine that last Tuesday is a modifier of bought and not say (compare to Last Tuesday, what did Sandy say that Pat bought?). Thus in such cases we expect semantic integration to be (partly) dependent on the output of syntactic integration.11 In Figure 28.7, as it happens, the semantic structure is entirely parallel to the syntax, so there is no way to tell whether syntactic integration has been redundantly evoked to determine the semantics, and with what time course.

At the point reached in Figure 28.7, working memory has two complete and mutually inhibitory structures, corresponding to the meanings of the two possible interpretations of the phonetic input. How is this ambiguity resolved? As observed above, it depends on the meaning of the following context. In particular, part of the meaning of the construction in (16), It’s not X, it’s (actually) Y, is that X and Y form a semantic contrast. Suppose the input is (16a), It’s not a parent, it’s actually a child. When the second clause is semantically integrated into working memory, the result is then Figure 28.8.

At this point in processing (and only at this point), it becomes possible to detect that the lower “draft” is semantically ill formed because apparent does not form a sensible contrast with child. Thus the semantic structure of this draft comes to be inhibited or extinguished, as in Figure 28.9.

This in turn sets off a chain reaction of feedback through the entire set of linked structures. Because the semantic structure of the lower draft helps keep the rest of the lower draft stable, and because all departments of the upper draft are trying to inhibit the lower draft, the entire lower draft comes to be extinguished.

Meanwhile, through this whole process, the activity in working memory has also maintained activation of long-term memory items that are bound to working memory.
Thus, when apparent and its syntactic structure are extinguished in working memory, the corresponding parts of the lexicon are deactivated as well, and they therefore cease priming semantic associates, as in Figure 28.10. The final state of working memory is Figure 28.11.

Figure 28.10 The syntax and phonology of the lower draft and their links to the lexicon are extinguished.

Figure 28.11 The resolution of the ambiguity.

If the process from Figure 28.6 through Figure 28.11 goes quickly enough, the perceiver ends up hearing the utterance as It’s not a parent, with no sense of ambiguity or garden-pathing, even though the disambiguating information follows the ambiguous passage. Strikingly, the semantics affects the hearer’s impression of the phonology.

To sum up the process just sketched:

• Phonetic processing provides strings of phonemes in phonological working memory.
• The phonemic strings initiate a call to the lexicon in long-term memory, seeking candidate words that match parts of the strings.
• Activated lexical items set up candidate phonological parsings, often in multiple drafts, each draft linked to a lexical item or sequence of lexical items.
• Activated lexical items also set up corresponding strings of syntactic units and collections of semantic units in the relevant departments of working memory.
• Syntactic integration proceeds by activating and binding to treelets stored in the lexicon.
• When semantic integration depends on syntactic constituency, it cannot begin until syntactic integration of the relevant constituents is complete. (However, semantic integration does not have to wait for the entire sentence to be syntactically integrated, only for local constituents.)
• Semantic disambiguation among multiple drafts requires semantic integration with the context (linguistic or nonlinguistic). In general, semantic disambiguation will therefore be slower than syntactic disambiguation.
• The last step in disambiguation is the suppression of phonological candidates by feedback.
• Priming is an effect of lexical activation in long-term memory. Early in processing, semantic associates of all possible meanings of the input are primed. After semantic disambiguation, priming by disfavored readings terminates.
• Priming need not be confined to the semantics of words. Because syntactic treelets are also part of the lexicon, it is possible to account for syntactic or constructional priming (Bock, 1995) in similar terms.

There is ample room in this model to investigate standard processing issues such as effects of frequency and priming on competition (here localized in lexical access), relative prominence of alternative parsings (here localized in syntactic integration), influence of context (here localized in semantic integration), and conditions for garden-pathing (here, premature extinction of the ultimately correct draft) or absence thereof (as in the present example). The fact that each step of processing can be made explicit, in terms of elements independently motivated by linguistic theory, recommends the model as a means of putting all these issues in larger perspective.
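The final steps of the process summarized above, in which semantic integration of the second clause extinguishes one draft, can be sketched as follows. This is a rough illustration under strong assumptions: semantic well-formedness is reduced to a hypothetical lookup table of sensible contrasts, gradual mutual inhibition is reduced to a single on/off flag, and the names `Draft`, `SENSIBLE_CONTRASTS`, and `integrate_second_clause` are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    """One linked phonology-syntax-semantics hypothesis in working memory."""
    phonology: tuple
    syntax: str
    semantics: str
    active: bool = True


# Hypothetical stand-in for semantic well-formedness: pairs of concepts
# that form a sensible contrast in "It's not X, it's (actually) Y".
SENSIBLE_CONTRASTS = {
    frozenset({"PARENT", "CHILD"}),
    frozenset({"APPARENT", "OBSCURE"}),
}


def integrate_second_clause(drafts, contrast_concept):
    """Extinguish every draft whose semantics fails to contrast with Y."""
    for draft in drafts:
        pair = frozenset({draft.semantics, contrast_concept})
        if pair not in SENSIBLE_CONTRASTS:
            # Extinguishing the semantics takes the linked syntax and
            # phonology down with it; here that is a single flag.
            draft.active = False
    return [d for d in drafts if d.active]


def make_drafts():
    return [
        Draft(("a", "parent"), "NP", "PARENT"),
        Draft(("apparent",), "AP", "APPARENT"),
    ]


heard_16a = integrate_second_clause(make_drafts(), "CHILD")
heard_16b = integrate_second_clause(make_drafts(), "OBSCURE")
print([d.phonology for d in heard_16a], [d.phonology for d in heard_16b])
```

In both runs the surviving phonology is determined only after the later clause arrives, which is the sense in which semantics affects the hearer's impression of the phonology.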

Further Issues

This section briefly discusses two further sorts of phenomena that can be addressed in the Parallel Architecture’s model of processing. I am not aware of attempts to draw these together in other models.

Visually Guided Parsing

Tanenhaus et al. (1995) confronted subjects with an array of objects and an instruction like (18), and their eye movements over the array were tracked.

(18) Put the apple on *the towel in the cup.

At the moment in time marked by *, the question faced by the language processor is whether on is going to designate where the apple is, or where it is to be put, a classic PP attachment ambiguity. It turns out that at this point, subjects already start scanning the relevant locations in the array in order to disambiguate the sentence (Is there more than one apple? Is there already an apple on the towel?). Hence visual feedback is used to constrain interpretation early on in processing.

The Parallel Architecture makes it clear how this can come about. So far we have spoken only of interfaces between semantic structure and syntax. However, semantic structure also interfaces with other aspects of cognition. In particular, to be able to talk about what we see, high-level representations produced by the visual system must be able to induce the creation of semantic structures that can then be converted into utterances. Addressing this need, the Parallel Architecture (Jackendoff, 1987, 1996, 2002, 2012; Landau & Jackendoff, 1993) proposes a level of mental representation called spatial structure, which integrates visual, haptic, and proprioceptive inputs into the perception of physical objects in space (including one’s body). Spatial structure is linked to semantic structure by means of an interface similar in character to the interfaces within the language faculty.

Some linkages between semantic and spatial structure are stored in long-term memory. For instance, cat is a semantic category related to the category animal in semantic structure and associated with the phonological structure /kæt/ in long-term memory. But it is also associated with a spatial structure, which encodes what cats look like, the counterpart in the present approach to an “image of a stereotypical instance.” Other linkages must be computed combinatorially on line. For instance, the spatial structure that arises from seeing an apple on a towel is not a memorized configuration, and it must be mapped online into the semantic structure [APPLE BE [ON [TOWEL]]]. Such a spatial structure has to be computed in another department of working memory that encodes one’s conception of the current spatial layout.12

This description of the visual–linguistic interface is sufficient to give an idea of how example (18) works. In hearing (18), which refers to physical space, the goal of processing is to produce not only a semantic structure but a semantic structure that can be correlated with the current spatial structure through the semantic–spatial interface. At the point designated by *, syntactic and semantic integration have led to the two drafts in (19). (As usual, italics denote anticipatory structure to be filled by subsequent material; the semantics contains YOU because this is an imperative sentence.)

(19)    Syntax                                     Semantics
     a. [VP put [NP the apple] [PP on NP]]         YOU PUT [APPLE; DEF] [ON X]
     b. [VP put [NP the apple [PP on NP]] PP]      YOU PUT [APPLE; DEF; [Place ON X]] PLACE

Thus the hearer has performed enough semantic integration to anticipate finding a unique referent in the visual environment for the NP beginning with the apple, and starts scanning for one.

Suppose spatial structure turns up with two apples. Then the only draft that can be correlated consistently with spatial structure is (19b), with the expectation that the phrase on NP will provide disambiguating information. The result is that draft (19a) is extinguished, just like the lower draft in our earlier example. If it happens that the hearer sees one of the apples on something, say a towel, and the other is not on anything, the first apple can be identified as the desired unique referent, and the hearer ought to be able to anticipate the phonological word towel. Thus by connecting all levels of representation through the interfaces, it is possible to create an anticipation of phonological structure from visual input.

This account of (18) essentially follows what Tanenhaus et al. have to say about it. What is important here is how naturally and explicitly it can be couched in the Parallel Architecture, both in terms of its theory of interfaces among levels of representation and in terms of its theory of processing.
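The reasoning about (18) and (19) can be sketched as a check of the two drafts against a representation of the scene. This is a toy illustration: the scene encoding (a list of dictionaries with `kind` and `on` keys) and the function names are assumptions introduced here, and the treatment of the one-apple case, where both drafts are simply left active, is a placeholder that the text above does not address.

```python
def apples_in(scene):
    """Pick out the candidate referents for 'the apple'."""
    return [obj for obj in scene if obj["kind"] == "apple"]


def surviving_drafts(scene):
    """Return the labels of the drafts in (19) compatible with the scene."""
    if len(apples_in(scene)) > 1:
        # 'the apple' alone has no unique referent, so only the draft in
        # which 'on NP' modifies the noun (19b) can be correlated.
        return ["19b"]
    # Assumption: with a single apple, neither draft is ruled out here.
    return ["19a", "19b"]


def anticipated_word(scene):
    """Anticipate the noun after 'on' when the scene singles out one apple."""
    apples = apples_in(scene)
    supported = [a for a in apples if a.get("on") is not None]
    if len(apples) > 1 and len(supported) == 1:
        return supported[0]["on"]
    return None


two_apples = [
    {"kind": "apple", "on": "towel"},
    {"kind": "apple", "on": None},
    {"kind": "cup", "on": None},
]
one_apple = [{"kind": "apple", "on": "towel"}, {"kind": "cup", "on": None}]

print(surviving_drafts(two_apples), anticipated_word(two_apples))
print(surviving_drafts(one_apple), anticipated_word(one_apple))
```

The second function corresponds to the claim that visual input can create an anticipation of phonological structure: the word towel is predicted from the scene before it is heard.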


Semantic Structure Without Syntax or Phonology

The relationship between the Parallel Architecture and its associated processing model is a two-way street: It is possible to run experiments that test linguistic hypotheses. For example, consider the phenomenon of “aspectual coercion,” illustrated in (20) (Jackendoff, 1997; Pustejovsky, 1995; Verkuyl, 1993; among others). Example (20a) conveys a sense of repeated jumping, but there is no sense of repeated sleeping in the syntactically parallel (20b).

(20) a. Joe jumped until the bell rang.
     b. Joe slept until the bell rang.

Here is how it works: The semantic effect of until is to place a temporal bound on a continuous process. Since sleep denotes a continuous process, semantic integration in (20b) is straightforward. In contrast, jump is a point-action verb: a jump has a definite ending, namely when one lands. Thus it cannot integrate properly with until. However, repeated jumping is a continuous process, so by construing the sentence in this fashion, semantic integration can proceed. Crucially, in the Parallel Architecture, the sense of repetition is encoded in none of the words. It is a free-floating semantic operator that can be used to “fix up” or “coerce” interpretations under certain conditions. A substantial number of linguistic phenomena have now been explained in terms of coercion (see Jackendoff, 1997, for several examples; some important more recent examples appear in Culicover & Jackendoff 2005, chapter 12).

The Parallel Architecture makes the prediction that (20a) will look unexceptionable to the processor until semantic integration. At this point, the meanings of the words cannot be integrated, and so semantic integration attempts the more costly alternative of coercion. Thus a processing load should be incurred specifically at the time of semantic integration. Piñango et al. (1999) test for processing load in examples like (20a,b) during auditory comprehension by measuring reaction time to a lexical decision task on an unrelated probe. The timing of the probe establishes the timing of the processing load. And indeed extra processing load does show up in the coerced examples, in a time frame consistent with semantic rather than syntactic or lexical processing, just as predicted by the Parallel Architecture.13

Similar experimental results have been obtained for the “light verb construction” shown in (21).

(21) a. Sam gave Harry an order. (= Sam ordered Harry)
     b. Sam got an order from Harry. (= Harry ordered Sam)

In these examples, the main verb order is paraphrased by the combination of the noun an order plus the light verbs give and get. But the syntax is identical to a “nonlight” use of the verb, as in Sam gave an orange to Harry and Sam got an orange from Harry. The light verb construction comes to paraphrase the simple verb through a semantic manipulation that combines the argument structures of the light verb and the nominal (Culicover & Jackendoff, 2005, pp. 222–225). Thus again the Parallel Architecture predicts additional semantic processing, and this is confirmed by experimental results (Piñango et al., 2006; Wittenberg et al., forthcoming; Wittenberg et al., in revision).

Final Overview

The theoretical account of processing sketched in the previous three sections follows directly from the logic of the Parallel Architecture.

First, as in purely word-driven approaches to processing, this account assumes that words play an active role in determining structure at phonological, syntactic, and semantic levels: The linguistic theory posits that words, idioms, and constructions are all a part of the rule system. In particular, the interface properties of words determine the propagation of activity across the departments of working memory.

Second, unlike purely word-driven approaches to processing, the Parallel Architecture’s processing model builds hierarchical structure in working memory, using pieces of phrase structure along with structure inherent in words. This enables the processing model to encompass sentences of any degree of complexity and to overcome issues such as the Problem of 2.

Third, structural information is available in processing as soon as a relevant rule (or word) can be activated in the lexicon and bound into working memory. That is, processing is opportunistic or incremental, in accord with much experimental evidence. This characteristic of processing is consistent with the constraint-based formalism of the Parallel Architecture, which permits structure to be propagated from any point in the sentence: phonology, semantics, top-down, bottom-up, left to right. Contextual influences from discourse or even from the visual system can be brought to bear on semantic integration as soon as semantic fragments are made available through lexical access.

Fourth, the system makes crucial use of parallel processing: All relevant structures are processed at once in multiple “drafts,” in competition with one another. The extinction of competing drafts is carried out along pathways established by the linkages among structures and the bindings between structures in working memory and the lexicon. Because the Parallel Architecture conceives of the structure of a sentence as a linkage among three separate structures, the handling of the competition among multiple drafts is completely natural.

What is perhaps most attractive about the Parallel Architecture from a psycholinguistic perspective is that the principles of grammar are used directly by the processor. That is, unlike the classical MGG architecture, there is no “metaphor” involved in the notion of grammatical derivation. The formal notion of structure building in the competence model is the same as in the performance model, except that it is not anchored in time. Moreover, the principles of grammar are the only routes of communication between semantic context and phonological structure: context effects involve no “wild card” interactions using non-linguistic strategies. The Parallel Architecture thus paves the way for a much closer interaction between linguistic theory and psycholinguistics than has been possible in the past three decades.

Author Note

This article is an abridged version of Jackendoff, 2007. I am thankful to Gina Kuperberg and Maria Mercedes Piñango for many detailed comments and suggestions on previous drafts. Two anonymous reviewers also offered important suggestions. Martin Paczynski helped a great deal with graphics. My deepest gratitude goes to Edward Merrin for research support, through his gift of the Seth Merrin Professorship to Tufts University.

References

Arbib, M. A. (1982). From artificial intelligence to neurolinguistics. In M. A. Arbib, D. Caplan, & J. C. Marshall (Eds.), Neural models of language processes (pp. 77–94). New York: Academic Press.
Baddeley, A. (1986). Working memory. Oxford, UK: Clarendon Press.
Bock, K. (1995). Sentence production: From mind to mouth. In J. L. Miller & P. D. Eimas (Eds.), Handbook of perception and cognition, Vol. XI: Speech, language, and communication (pp. 181–216). Orlando, FL: Academic Press.
Bock, K., & Loebell, H. (1990). Framing sentences. Cognition, 35, 1–39.
Bresnan, J. (2001). Lexical-functional syntax. Oxford, UK: Blackwell.
Bybee, J., & McClelland, J. L. (2005). Alternatives to the combinatorial paradigm of linguistic theory based on domain general principles of human cognition. Linguistic Review, 22, 381–410.
Cheney, D., & Seyfarth, R. (1990). How monkeys see the world. Chicago: University of Chicago Press.
Chierchia, G., & McConnell-Ginet, S. (1990). Meaning and grammar: An introduction to semantics. Cambridge, MA: MIT Press.
Chomsky, N. (1957). Syntactic structures. The Hague: Mouton.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, N. (1981). Lectures on government and binding. Dordrecht: Foris.
Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.
Chomsky, N. (2000). New horizons in the study of language and mind. Cambridge, UK: Cambridge University Press.
Collins, A., & Quillian, M. (1969). Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 9, 240–247.
Culicover, P., & Jackendoff, R. (2005). Simpler syntax. Oxford, UK: Oxford University Press.
Dennett, D. C. (1991). Consciousness explained. Boston: Little, Brown.
Elman, J. (1990). Finding structure in time. Cognitive Science, 14, 179–211.
Elman, J., Bates, E., Johnson, M., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1996). Rethinking innateness. Cambridge, MA: MIT Press.
Ford, M., Bresnan, J., & Kaplan, R. C. (1982). A competence-based theory of syntactic closure. In J. Bresnan (Ed.), The mental representation of grammatical relations (pp. 727–796). Cambridge, MA: MIT Press.
Frazier, L. (1989). Against lexical generation of syntax. In W. Marslen-Wilson (Ed.), Lexical representation and process (pp. 505–528). Cambridge, MA: MIT Press.
Frazier, L., Carlson, K., & Clifton, C. (2006). Prosodic phrasing is central to language comprehension. Trends in Cognitive Sciences, 10, 244–249.
Gee, J., & Grosjean, F. (1983). Performance structures: A psycholinguistic and linguistic appraisal. Cognitive Psychology, 15, 411–458.

Goldsmith, J. (1979). Autosegmental phonology. New York: Garland Press.
Hagoort, P. (2005). On Broca, brain, and binding: A new framework. Trends in Cognitive Sciences, 9, 416–423.
Hauser, M. D. (2000). Wild minds: What animals really think. New York: Henry Holt.
Hirst, D. (1993). Detaching intonational phrases from syntactic structure. Linguistic Inquiry, 24, 781–788.
Jackendoff, R. (1983). Semantics and cognition. Cambridge, MA: MIT Press.
Jackendoff, R. (1987). Consciousness and the computational mind. Cambridge, MA: MIT Press.
Jackendoff, R. (1990). Semantic structures. Cambridge, MA: MIT Press.
Jackendoff, R. (1996). The architecture of the linguistic-spatial interface. In P. Bloom, M. A. Peterson, L. Nadel, & M. F. Garrett (Eds.), Language and space (pp. 1–30). Cambridge, MA: MIT Press.
Jackendoff, R. (1997). The architecture of the language faculty. Cambridge, MA: MIT Press.
Jackendoff, R. (2002). Foundations of language. Oxford, UK: Oxford University Press.
Jackendoff, R. (2006). Alternative minimalist visions of language. In Proceedings of the 41st meeting of the Chicago Linguistic Society, Chicago. Reprinted in R. Borsley & K. Börjars (Eds.) (2011), Non-transformational syntax (pp. 268–296). Oxford, UK: Wiley-Blackwell.
Jackendoff, R. (2007). A Parallel Architecture perspective on language processing. Brain Research, 1146, 2–22.
Jackendoff, R. (2010). Meaning and the lexicon: The Parallel Architecture 1975–2010. Oxford, UK: Oxford University Press.
Jackendoff, R. (2012). A user’s guide to thought and meaning. Oxford, UK: Oxford University Press.
Jackendoff, R. (2011). What is the human language faculty? Two views. Language, 87, 586–624.
Jackendoff, R., & Pinker, S. (2005). The nature of the language faculty and its implications for the evolution of language (reply to Fitch, Hauser, and Chomsky). Cognition, 97, 211–225.
Lakoff, G. (1987). Women, fire, and dangerous things. Chicago: University of Chicago Press.
Landau, B., & Jackendoff, R. (1993). “What” and “where” in spatial language and spatial cognition. Behavioral and Brain Sciences, 16, 217–238.
Langacker, R. (1987). Foundations of cognitive grammar (Vol. 1). Stanford, CA: Stanford University Press.
Lappin, S. (1996). The handbook of contemporary semantic theory. Oxford, UK: Blackwell.
Lewis, R. (2000). Falsifying serial and parallel parsing models: Empirical conundrums and an overlooked paradigm. Journal of Psycholinguistic Research, 29, 241–248.
Liberman, M., & Prince, A. (1977). On stress and linguistic rhythm. Linguistic Inquiry, 8, 249–336.
MacDonald, M. C., & Christiansen, M. H. (2002). Reassessing working memory: Comment on Just and Carpenter (1992) and Waters and Caplan (1996). Psychological Review, 109, 35–54.
MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S. (1994). Lexical nature of syntactic ambiguity resolution. Psychological Review, 101, 676–703.
Marcus, G. (1998). Rethinking eliminative connectionism. Cognitive Psychology, 37, 243–282.
Marcus, G. (2001). The algebraic mind. Cambridge, MA: MIT Press.
Miller, G. A., & Chomsky, N. (1963). Finitary models of language users. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology (Vol. 2, pp. 419–492). New York: Wiley.
Neisser, U. (1967). Cognitive psychology. Englewood Cliffs, NJ: Prentice-Hall.
Paczynski, M., Jackendoff, R., & Kuperberg, G. R. When events change their nature. Under revision.
Partee, B. (Ed.). (1976). Montague grammar. New York: Academic Press.
Phillips, C., & Lau, E. (2004). Foundational issues (review article on Jackendoff 2002). Journal of Linguistics, 40, 1–21.
Piñango, M. M., Mack, J., & Jackendoff, R. (2006). Semantic combinatorial processes in argument structure: Evidence from light verbs. Proceedings of the Berkeley Linguistics Society.
Piñango, M. M., Zurif, E., & Jackendoff, R. (1999). Real-time processing implications of enriched composition at the syntax-semantics interface. Journal of Psycholinguistic Research, 28, 395–414.
Pinker, S. (1989). Learnability and cognition. Cambridge, MA: MIT Press.
Pinker, S. (1999). Words and rules. New York: Basic Books.
Pinker, S. (2007). The stuff of thought. New York: Penguin.
Pinker, S., & Jackendoff, R. (2005). The faculty of language: What’s special about it? Cognition, 95, 201–236.
Pollard, C., & Sag, I. (1994). Head-driven phrase structure grammar. Chicago: University of Chicago Press.
Prince, A., & Smolensky, P. (1993/2004). Optimality theory: Constraint interaction in generative grammar. Technical report, Rutgers University and University of Colorado at Boulder, 1993. (Revised version published by Blackwell, 2004.)
Pustejovsky, J. (1995). The generative lexicon. Cambridge, MA: MIT Press.

Pylkkänen, L., & McElree, B. (2006). The syntax-semantics interface: On-line composition of sentence meaning. In M. Traxler & M. A. Gernsbacher (Eds.), Handbook of psycholinguistics (2nd ed.). New York: Elsevier.
Rosch, E., & Mervis, C. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7, 573–605.
Schank, R. (1975). Conceptual information processing. New York: Elsevier.
Shieber, S. (1986). An introduction to unification-based approaches to grammar. Stanford, CA: CSLI.
Smith, E., & Medin, D. (1981). Categories and concepts. Cambridge, MA: Harvard University Press.
Smith, E., Shoben, E., & Rips, L. (1974). Structure and process in semantic memory: A featural model for semantic decisions. Psychological Review, 81, 214–241.
Swinney, D. (1979). Lexical access during sentence comprehension: (Re)consideration of context effects. Journal of Verbal Learning and Verbal Behavior, 18, 645–659.
Tabor, W., & Tanenhaus, M. (1999). Dynamical models of sentence processing. Cognitive Science, 23, 491–515.
Talmy, L. (1988). Force-dynamics in language and thought. Cognitive Science, 12, 49–100.
Tanenhaus, M., Leiman, J. M., & Seidenberg, M. (1979). Evidence for multiple stages in the processing of ambiguous words in syntactic contexts. Journal of Verbal Learning and Verbal Behavior, 18, 427–440.
Tanenhaus, M., Spivey-Knowlton, M. J., Eberhard, K. M., & Sedivy, J. C. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268, 1632–1634.
Traxler, M. J., Pickering, M. J., & Clifton, C. (1998). Adjunct attachment is not a form of lexical ambiguity resolution. Journal of Memory and Language, 39, 558–592.
Truckenbrodt, H. (1999). On the relation between syntactic phrases and phonological phrases. Linguistic Inquiry, 30, 219–255.
Verkuyl, H. (1993). A theory of aspectuality: The interaction between temporal and atemporal structure. Cambridge, MA: Cambridge University Press.
Wittenberg, E., Jackendoff, R., Kuperberg, G., Paczynski, M., Snedeker, J., & Wiese, H. (forthcoming). The processing and representation of light verb constructions. In A. Bachrach, I. Roy, & L. Stockall (Eds.), Structuring the argument. John Benjamins.
Wittenberg, E., Paczynski, M., Wiese, H., Jackendoff, R., & Kuperberg, G. (in revision). The difference between “giving a rose” and “giving a kiss”: A sustained anterior negativity to the light verb construction.

Notes:

(1) What counts as “enough” syntactic structure might be different in perception and production. Production is perhaps more demanding of syntax, in that the processor has to make syntactic commitments in order to put words in the correct order; to establish the proper inflectional forms of verbs, nouns, and adjectives (depending on the language); to leave appropriate gaps for long-distance dependencies; and so on. Perception might be somewhat less syntax bound in that “seat-of-the-pants” semantic processing can often get close to a correct interpretation.

(2) Frazier et al. (2006) suggest that there is more to the prosody–syntax interface than rule (4b), in that the relative length of pauses between Intonation Phrases can be used to signal the relative closeness of syntactic relationship among constituents. This result adds a further level of sophistication to rule (4b), but it does not materially affect the point being made here. It does, however, show how experimental techniques can be used to refine linguistic theory.

(3) A further point: The notion of an independently generative phonology lends itself elegantly to the description of signed languages, in which phonological structure in the visual-manual modality can easily be substituted for the usual auditory-vocal system.

(4) Phillips and Lau (2004) find such anticipatory parsing “somewhat mysterious” in the context of the most recent incarnation of MGG, the Minimalist Program. Frazier (1989), assuming a mainstream architecture, suggests that the processor uses “precompiled” phrase structure rules to create syntactic hypotheses. Taken in her terms, the treelets in (6) are just such precompiled structures. However, in the Parallel Architecture, there are no “prior” algorithmic phrase structure rules like (5) from which the treelets are “compiled”; rather, one’s knowledge of phrase structure is encoded directly in the repertoire of treelets.

(5) Bybee and McClelland (2005), observing that languages are far less regular than MGG thinks, take this as license to discard general rules altogether in favor of a statistically based connectionist architecture. But most of the irregularities they discuss are word-based constraints. They completely ignore the sorts of fundamental syntactic generalizations just enumerated.

(6) I do not exclude the possibility that high-frequency regulars are redundantly stored in the lexicon. I would add that I do not necessarily endorse every claim made on behalf of dual-process models of inflection. For more discussion, see Jackendoff (2002, pp. 163–167).

(7) Unification is superficially like the Merge operation of the Minimalist Program (the most recent version of MGG). However, there are formal and empirical differences that favor unification as the fundamental generative process in language. See Jackendoff (2006, 2011) for discussion.

(8) Note that rhymes cannot be all memorized. One can judge novel rhymes that cannot be stored in the lexicon because they involve strings of words. Examples are Gilbert and Sullivan’s lot o’news/hypotenuse, Ira Gershwin’s embraceable you/irreplaceable you, and Ogden Nash’s to twinkle so/I thinkle so. Moreover, although embraceable is a legal English word, it is probably an on-the-spot coinage; and thinkle is of course a distortion of think made up for the sake of a humorous rhyme. So these words are not likely stored in the lexicon (unless one has memorized the poem).

(9) For instance, none of the connectionists referred to here cite Marcus; neither do any of the papers in a 1999 special issue of Cognitive Science entitled “Connectionist Models of Human Language Processing: Progress and Prospects”; neither was he cited other than by me at a 2006 Linguistic Society of America Symposium entitled “Linguistic Structure and Connectionist Models: How Good is the Fit?”

(10) In describing linguistic working memory as having three “departments,” I do not wish to commit to whether or not they involve different neural mechanisms or different brain localizations. The intended distinction is only that phonological working memory is devoted to processing and constructing phonological structures, syntactic working memory to syntactic structures, and semantic working memory to semantic structures. This is compatible with various theories of functional and neural realization. However, Hagoort (2005) offers an interpretation of the three parallel departments of linguistic working memory in terms of brain localization.

(11) Note that in sentence production, the dependency goes the other way: A speaker uses the semantic relations in the thought to be expressed to guide the arrangement of words in syntactic structure.

(12) Spatial structure in working memory also has the potential for multiple drafts. Is there a cat behind the bookcase or not? These hypotheses are represented as two different spatial structures corresponding to the same visual input.

(13) Another strand of psycholinguistic research on coercion (e.g., Pylkkänen & McElree, 2006) also finds evidence of increased processing load with coerced sentences. However, those authors’ experimental technique, self-paced reading, does not provide enough temporal resolution to distinguish syntactic from semantic processing load. Paczynski, Jackendoff, and Kuperberg (in revision) find ERP effects connected with aspectual coercion.

Ray Jackendoff

Ray Jackendoff is Seth Merrin Professor of Philosophy and Co-Director of the Center for Cognitive Studies at Tufts University. He was the 2003 recipient of the Jean Nicod Prize in Cognitive Philosophy and has been President of both the Linguistic Society of America and the Society for Philosophy and Psychology. His most recent books are Foundations of Language (Oxford, 2002), Simpler Syntax (with Peter Culicover, Oxford, 2005), Language, Consciousness, Culture (MIT Press, 2007), Meaning and the Lexicon (Oxford, 2010), and A User’s Guide to Thought and Meaning (Oxford, 2011).

Epilogue to The Oxford Handbook of Cognitive Neuroscience—Cognitive Neuroscience: Where Are We Going?

Kevin N. Ochsner and Stephen Kosslyn
The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology, Cognitive Neuroscience
Online Publication Date: Dec 2013
DOI: 10.1093/oxfordhb/9780199988693.013.0029

Abstract and Keywords

This epilogue looks at themes and trends that hint at future developments in cognitive neuroscience. It first considers how affective neuroscience merged the study of neuroscience and emotion, how social neuroscience merged the study of neuroscience and social behavior, and how social cognitive neuroscience merged the study of cognitive neuroscience with social cognition. Another theme is how the levels of analysis of behavior/experience can be linked with psychological process and neural instantiation. Two topics that have not yet been fully approached from a cognitive neuroscience perspective, but seem ripe for near-term future progress, are the study of the development across the lifespan of the various abilities described in the book, and the study of the functional organization of the frontal lobes and their contributions to behaviors (e.g., the ability to exert self-control). This epilogue also explores the multiple methods, both behavioral and neuroscientific, used in cognitive neuroscience, new ways of modeling relationships between levels of analysis, and the question of how to make cognitive neuroscience relevant to everyday life.

Keywords: cognitive neuroscience, emotion, social behavior, social cognition, functional organization, frontal lobes, behaviors, methods, analysis, neural instantiation

Whether you have read the two-volume Handbook of Cognitive Neuroscience from cover to cover or have just skimmed a chapter or two, we hope that you take away a sense of the breadth and depth of work currently being conducted in the field. Since the naming of the field in the backseat of a New York City taxicab some 35 years ago, the field and the approach it embodies have become a dominant, if not the dominant, mode of scientific inquiry in the study of human cognitive, emotional, and social functions. But where will it go from here? Where will the next 5, 10, or even 20 years take the field and its approach? Obviously, nobody can say for sure, but there are broad intellectual themes and trends that run throughout this two-volume set, and a discussion of them can be used as a springboard to thinking about possible directions future work might take.

Themes and Trends

Here we discuss themes and trends that hint at possible future developments, focusing on those that may be more likely to occur in the relatively near term.

What’s in a Name?

It is said that imitation is the sincerest form of flattery. Given the proliferation of new areas of research with names that seemingly mimic cognitive neuroscience, the original has reason to feel flattered. Consider, for example, the development of three comparatively newer fields and the dates of their naming: social neuroscience (Cacioppo, 1994), affective neuroscience (Panksepp, 1991), and social cognitive neuroscience (Ochsner & Lieberman, 2001).

Although all three fields are undoubtedly the products of unique combinations of influences (see, e.g., Cacioppo, 2002; Ochsner, 2007; Panksepp, 1998), they each followed in the footsteps of cognitive neuroscience. In cognitive neuroscience the study of cognitive abilities and neuroscience were merged, and in the process of doing so, the field has made considerable progress. In like fashion, affective neuroscience combined the study of emotion with neuroscience; social neuroscience, the study of social behavior with neuroscience; and social cognitive neuroscience, the study of social cognition with cognitive neuroscience.

All three of these fields have adopted the same kind of multilevel, multimethod constraints and convergence approach embodied by cognitive neuroscience (as we discussed in the Introduction to this Handbook). In addition, each of these fields draws from and builds on, to differing degrees, the methods and models developed within what we can now call “classic” cognitive neuroscience (see Vol. 1 of the Handbook). These new fields are siblings in a family of fields that have similar, if not identical, research “DNA.”

It is for these reasons that Volume 2 of this Handbook has sections devoted to affect and emotion and to self and social cognition. The topics of the constituent chapters in these sections could easily appear in handbooks of affective or social or social cognitive neuroscience (and in some cases, they already have; see, e.g., Cacioppo & Berntson, 2004; Todorov et al., 2011). We included this material here because it represents the same core approach that guides research on the classic cognitive topics in Volume 1 and in the latter half of Volume 2.


One might wonder whether these related disciplines are on trajectories for scientific and popular impact similar to that of classic cognitive neuroscience. In the age of the Internet, one way of quantifying impact is simply to count the number of Google hits returned by a search for specific terms, in this case, “cognitive neuroscience,” “affective neuroscience,” and so on. The results of an April 2012 Google search for field names are shown in the accompanying tables. The top table compares cognitive neuroscience with two of its antecedent fields: cognitive psychology (Neisser, 1967) and neuroscience. The bottom table compares the descendants of classic cognitive neuroscience that were noted above. As can be seen, cognitive psychology and neuroscience are the oldest fields and the ones with the most online mentions. By comparison, their descendant, cognitive neuroscience, which describes a narrower field than either of its ancestors, is doing quite well. And the three newest fields of social, affective, and social cognitive neuroscience, each of which describes a field even narrower than that of cognitive neuroscience, also are doing well, with combined hit counts totaling about one-third that of cognitive neuroscience, in spite of the fact that the youngest field is only about one-third of cognitive neuroscience’s age.

How Do We Link Levels of Analysis?

A theme running throughout the chapters concerns the different ways in which we can link the levels of analysis of behavior/experience, psychological process, and neural instantiation. Here, we focus on two broad issues that were addressed, explicitly or implicitly, by many of the authors of chapters in these volumes.

The first issue is the complexity of the behaviors that one is attempting to map onto underlying processes and neural systems. For example, one might ask whether we should try to map what might be called “molar” abilities, such as memory or attention, onto sets of processes and neural systems, or instead whether we should try to map “molecular” subtypes of memory and subtypes of attention onto their constituent processes and neural systems. As alluded to in the Introduction, for most of the abilities described in Volume 1, it was clear as early as 20 years ago that a more molecular, subtype method of mapping makes the most sense in the context of neuroscience data. The current state of the art in the study of perception, attention, memory, and language (reviewed in Volume 1 of this Handbook) clearly bears this out. All the chapters in these sections describe careful ways in which researchers have used combinations of behavioral and brain data to fractionate the processes that give rise to specific subtypes of abilities.

This leads us to the second issue, which concerns the fact that for at least some of the topics discussed in Volume 2, only recently has it become clear that more molecular mappings are possible. This is because for at least some of the Volume 2 topics, behavioral research before the rise of the cognitive neuroscience approach had not developed clearly articulated process models that specified explicitly how information is represented and processed to accomplish a particular task. This limitation was perhaps most evident for topics such as the self, some aspects of higher level social cognition such as mental state inference, and some aspects of emotion, including how emotions are generated and regulated. Twenty years ago, when functional neuroimaging burst on the scene, researchers had proposed few if any process models of these molar phenomena. Hence, initial functional imaging and other types of neuroscience studies on these topics had more of a “let’s induce an emotional state or evoke a behavior and see what happens” flavor, and often they did not attempt to test specific theories. This is not to fault these researchers; at the time, they did not have the advantage of decades of process-oriented behavioral research from cognitive psychology and vision research to help guide them (see, e.g., Ochsner & Barrett, 2001; Ochsner & Gross, 2004). Instead, researchers had to develop process models on the fly.

However, times have changed. As attested by the chapters in the first two sections of Volume 2, the incorporation of brain data into research on the self, social perception, and emotion has been very useful in developing increasingly complex, “molecular” theories of the relationships among behavior/experience, psychological process, and neural instantiation. Just as the study of memory moved beyond single-system models and toward multiple-system models (Schacter & Tulving, 1994), the study of the self, social cognition, and emotion has begun to move beyond simplistic notions that single brain regions (such as the medial prefrontal cortex or amygdala) are the seat of these abilities.

Looking Toward the Future

Without question, progress has been made. What might the current state of cognitive neuroscience research augur for the future of the field? Here we address this question in four ways.

New Topics

One of the ideas that recurs in the chapters of this Handbook is that the cognitive neuroscience approach is a general-purpose scientific tool. This approach can be used to ask and answer questions about any number of topics. Indeed, even within the broad scope of this two-volume set, we have not covered every topic already being fruitfully addressed using the cognitive neuroscience approach.

That said, of the many topics that have not yet been approached from a cognitive neuroscience perspective, do any appear particularly promising? Four such topics seem ripe for near-term future progress. These topics run the gamut from the study of specific brain systems to the study of lifespan development and differences in group or social network status, to forging links with the study of mental and physical health.

The first topic is the study of the functional organization of the frontal lobes and the contributions they make to behaviors such as the ability to exert self-control. At first blush, this might seem like a topic that already has received a great deal of attention. From one perspective, it has. Over the past few decades numerous labs have studied the relationship of the frontal lobes to behavior. From another perspective, however, not much progress has been made. What is missing are coherent process models that link specific behaviors to specific subregions of prefrontal cortex. Notably, some chapters in this Handbook (e.g., those by Badre, Christoff, and Silvers et al.) attempt to do this within specific domains. But no general theory of prefrontal cortex has yet emerged that can link the myriad behaviors in which it is involved to specific and well-described processes that in turn are instantiated in specific portions of this evolutionarily newest portion of our brain.

The second topic is the study of the development across the lifespan of the various abilities described in the Handbook. Although some Handbook sections include chapters on development and aging, many do not, precisely because the cognitive neuroscientific study of lifespan changes in many abilities has only just begun. Clearly, the development from childhood into adolescence of various cognitive, social, and affective abilities is crucially important, as are the ways in which these abilities change as we move from middle adulthood into older age (Casey et al., 2010; Charles & Carstensen, 2010; Mather, 2012). The multilevel approach that characterizes cognitive neuroscience holds promise of deepening our understanding of such phenomena. Toward this end, it is important to note that new journals devoted to some of these topics have appeared (e.g., Developmental Cognitive Neuroscience, which was first published in 2010), and various institutes within the National Institutes of Health (NIH) have called for research on these topics.

The third topic is the study of the way in which group-level variables impact the development and operation of the various processing systems described in both volumes of this Handbook. Notably, this is an area of research that is not yet represented in the Handbook, although interest in connecting the study of group-level variables to the study of the brain has been growing over the past few years. Consider, for example, emerging research suggesting that having grown up as a member of different cultural groups can dictate whether and how one engages perceptual, memory, and affective systems both when reflecting on the self and in social settings (Chiao, 2009). There is also evidence that the size of one’s social networks can impact the structure of brain systems involved in affect and affiliation, and that one’s status within these networks (Bickart et al., 2011) can determine whether and when one recruits brain systems implicated in emotion and social cognition (Bickart et al., 2011; Chiao, 2010; Muscatell et al., 2011). Forging links between group-level variables and the behavior/experience, process, and brain levels that are the focus in the current Handbook will prove challenging and may require new kinds of collaborative relationships with other disciplines, such as sociology and anthropology. As these collaborations grow to maturity, we predict this work will make its way into future editions of the Handbook.

The fourth topic is the way in which specific brain systems play important roles in physical, as well as mental, health. The Handbook already includes chapters that illustrate how cognitive neuroscience approaches are being fruitfully translated to understand the nature of dysfunction, and potential treatments for it, in various kinds of psychiatric and substance use disorders (see, e.g., Barch et al., 2009; Johnstone et al., 2007; Kober et al., 2010; Ochsner, 2008; and the section below on Translation). This type of translational work is sure to grow in the future. What the current Handbook is missing, however, is discussion of how brain systems are critically involved in physical health via their interactions with the immune system. This burgeoning area of interest seeks to connect fields such as health psychology with cognitive neuroscience and allied disciplines to understand how variables like chronic stress or disease, or social connection vs. isolation, can boost or diminish physical health. Such an effect would arise via interactions between the immune system and brain systems involved in emotion, social cognition, and control (Muscatell & Eisenberger, 2012; Eisenberger & Cole, 2012).
This is another key area of future growth that we expect to be represented in this Handbook in the future.

New Methods

How are we going to make progress on these questions and the countless others posed in the chapters of the Handbook? On the one hand, the field will undoubtedly continue to make good use of the multiple methods, both behavioral and neuroscientific, that have been its bread and butter for the past decades. As noted in the Introduction, certain empirical and conceptual advances were only made possible by technological advances, which enabled us to measure activity with dramatically new levels of spatial and temporal resolution. The advent of positron emission tomography and, later, functional magnetic resonance imaging (20–30 years ago) was a game-changing advance.

On the other hand, these functional imaging techniques are still limited in terms of their spatial and temporal resolution, and the areas of the brain they allow researchers to focus on reflect the contributions of many thousands of neurons. Other techniques, such as magnetoencephalography and scalp electroencephalography, offer relatively good temporal resolution, but their spatial localization is relatively poor. Moreover, they are best suited to studying cortical rather than subcortical regions.

We could continue to beat the drum for the use of converging methods: What one technique can’t do, another can, and by triangulating across methods, better theories can be built and evaluated. But for the next stage of game-changing methodological advances to be realized, either current technologies will need to undergo a transformation that enables them to combine spatial and temporal resolution in new ways, or new techniques that have better characteristics will need to be invented.

New Ways of Modeling Relationships Between Levels of Analysis

All this said, even the greatest of technological advances will not immediately be useful unless our ability to conceptualize the cognitive and emotional processes that lie between brain and behavior becomes more sophisticated.

At present, most theorizing in cognitive neuroscience makes use of commonsense terminology for describing human abilities. We talk about memory, perception, emotion, and so on. We break these molar abilities into more molecular parts and characterize them in terms of their automatic or controlled operation, whether the mental representations are relational, and so on. Surely, however, the computations performed by specific brain regions did not evolve to instantiate our folk-psychological ideas about how best to describe the processes underlying behavior.

One possible response to this concern is that the description of phenomena at multiple levels of analysis allows us to sidestep this problem. One could argue that at the highest level of description, it’s just fine to use folk-psychological terms to describe behavior and experience. After all, our goal is to map these terms, which prove extremely useful for everyday discourse about human behavior, onto precise descriptions of underlying neural circuitry by reference to a set of information processing mechanisms.

Unfortunately, however, many researchers do not restrict intuitively understandable folk-psychological terms to describing behavior and experience, but also use such terms to describe information processing itself. In this case, process-level descriptions are not likely to map in a direct way onto neural mechanisms.

Marr (1982) suggested a solution to this problem: Rely on the language of computation to characterize information processing. The language of computation characterizes what computers do, and this language often can be applied to describe what brains do. But brains are demonstrably not digital computers, and thus it is not clear whether the technical vocabulary that evolved to characterize information processing in computers can in fact always be appropriately applied to brains. Back in the 1980s, many researchers hoped that connectionist models might provide an appropriate kind of computational specificity. More recently, computational models from the reinforcement learning and neuroeconomic literatures have been advanced as offering a new level of computational specificity.

Although no existing approach has yet offered a computational language that is powerful enough to describe more than thin slices of human information processing, we believe that such a medium will become a key ingredient of the cognitive neuroscience approach in the future.

Translation

In an era in which increasing numbers of researchers are applying for a static or shrinking pool of grant funding, some have come to focus on the question of how to use cognitive neuroscience to solve problems that arise in everyday life (and therefore address the concerns of funding agencies, which often are pragmatic and applied).

Research is often divided into two categories (somewhat artificially): “Foundational” research focuses on understanding phenomena for their own sake, whereas “translational” research focuses on using such understanding to solve a real-world problem. Taking cognitive neuroscience models of abilities based on studies of healthy populations and applying them to understand and treat the bases of dysfunction in specific groups is one form of translational research. This will surely be an area of great future growth.

Already, a number of areas of psychiatric and substance use research have adopted a two-step translational research sequence (e.g., Barch et al., 2004, 2009; Carter et al., 2009; Ochsner, 2008; Paxton et al., 2008). The first step involves building a model of normal behavior, typically in healthy adults, using the cognitive neuroscience approach. The second step involves translating that model to a population of interest and using the model to explain the underlying bases of the disorder or other deviation from the normal baseline, a crucial step in eventually developing effective treatments. This population could suffer from some type of clinically dysfunctional behavior, such as the four psychiatric groups described in Part 4 of Volume 2 of the Handbook. It could be an adolescent or older adult population, as described in a handful of chapters scattered across sections of the Handbook. Or (although this was not covered in the Handbook, it might be in the future) it could be a vulnerable group for whom training in a specific type of cognitive, affective, or social skill would improve the quality of life.

The possibilities abound, and it would behoove researchers in cognitive neuroscience to capitalize on as many of them as possible. They should do so not just for the pragmatic reason that such work may be more likely to be funded but, more importantly, for the principled reason that it matters. It matters that we understand real-world, consequential behavior. Yes, we need to start by studying the ability to learn a list of words in the lab, and we need to understand the brain systems responsible for such relatively simple tasks. But then we need to move toward understanding, for example, how these brain systems do or do not function normally in a child growing up in an impoverished household compared with a child afforded every advantage (Noble et al., 2007).

Happily, there is evidence that federal funding agencies are beginning to understand the importance of this two-step, foundational-to-translational research sequence. In 2011, the National Institute of Mental Health (NIMH) announced the Research Domain Criteria (RDoC) framework as part of NIMH’s Strategic Plan to “Develop, for research purposes, new ways of classifying mental disorders based upon dimensions of observable behavior and neurobiological functioning” (http://www.nimh.nih.gov/about/strategic-planning-reports/index.shtml). In essence, the RDoC framework aims to replace the traditional symptom-based means of describing abnormal behavior (which characterizes traditional psychiatric diagnosis) with a means of describing the full range of normal to abnormal behavior in terms of fundamental underlying processes. The idea is that, over time, researchers will seek to target and understand the nature of these processes, the ways in which they can go awry, and the behavioral variability to which they can give rise, as opposed to targeting traditionally defined clinical groups. For example, a researcher could target processes for generating positive or negative affect, or their control, or the ways in which interactions between affect and control processes break down to produce anhedonia or a preponderance of negative affect, as opposed to focusing on a discretely defined disorder such as major depression (e.g., Pizzagalli et al., 2009).

The two-step approach allows initial research to focus on understanding core processes, considered in the context of different levels of analysis, but with an eye toward then understanding how variability in these processes gives rise to the full range of normal to abnormal behavior. Elucidating the fundamental nature of these cognitive and emotional processes, and their relation to the behavioral/experiential level above and to the neural level below, is the fundamental goal of cognitive neuroscience.

Concluding Comment

How do we measure the success of a field? By the number of important findings and insights? By the number of scientists and practitioners working within it? If we take that late 1970s taxicab ride, when the term cognitive neuroscience was first used, as the inception point for the field, then by any and all of these metrics, cognitive neuroscience has been enormously successful.

Compared with physics, chemistry, medicine, and biology, however (or even compared with psychology and neuroscience), cognitive neuroscience is just beginning to hit its stride. This is to be expected, given that it has existed only for a very short period of time. Indeed, the day for cognitive neuroscience is still young. This is good news. Even though cognitive neuroscience is entering its mid-30s, compared with these other broad disciplines that were established hundreds of years ago, this isn’t even middle age. The hope, then, is that the field can continue to blossom and grow from its adolescence to full maturity, and make good on the promising returns it has produced so far.

References

Barch, D. M., Braver, T. S., Carter, C. S., Poldrack, R. A., & Robbins, T. W. (2009). CNTRICS final task selection: Executive control. Schizophrenia Bulletin, 35, 115–135.
Barch, D. M., Mitropoulou, V., Harvey, P. D., New, A. S., Silverman, J. M., & Siever, L. J. (2004). Context-processing deficits in schizotypal personality disorder. Journal of Abnormal Psychology, 113, 556–568.
Bickart, K. C., Wright, C. I., Dautoff, R. J., Dickerson, B. C., & Barrett, L. F. (2011). Amygdala volume and social network size in humans. Nature Neuroscience, 14, 163–164.
Cacioppo, J. T. (1994). Social neuroscience: Autonomic, neuroendocrine, and immune responses to stress. Psychophysiology, 31, 113–128.
Cacioppo, J. T. (2002). Social neuroscience: Understanding the pieces fosters understanding the whole and vice versa. American Psychologist, 57, 819–831.
Cacioppo, J. T., & Berntson, G. G. (Eds.). (2004). Social neuroscience: Key readings (Vol. 14). New York: Ohio State University Psychology Press.
Carter, C. S., Barch, D. M., Gur, R., Pinkham, A., & Ochsner, K. (2009). CNTRICS final task selection: Social cognitive and affective neuroscience-based measures. Schizophrenia Bulletin, 35, 153–162.
Casey, B. J., Jones, R. M., Levita, L., Libby, V., Pattwell, S. S., et al. (2010). The storm and stress of adolescence: Insights from human imaging and mouse genetics. Developmental Psychobiology, 52, 225–235.
Charles, S. T., & Carstensen, L. L. (2010). Social and emotional aging. Annual Review of Psychology, 61, 383–409.
Chiao, J. Y. (2009). Cultural neuroscience: A once and future discipline. Progress in Brain Research, 178, 287–304.
Chiao, J. Y. (2010). Neural basis of social status hierarchy across species. Current Opinion in Neurobiology, 20, 803–809.
Eisenberger, N. I., & Cole, S. W. (2012). Social neuroscience and health: Neurophysiological mechanisms linking social ties with physical health. Nature Neuroscience, 15, 669–674.
Johnstone, T., van Reekum, C. M., Urry, H. L., Kalin, N. H., & Davidson, R. J. (2007). Failure to regulate: Counterproductive recruitment of top-down prefrontal-subcortical circuitry in major depression. The Journal of Neuroscience, 27, 8877–8884.
Kober, H., Mende-Siedlecki, P., Kross, E. F., Weber, J., Mischel, W., et al. (2010). Prefrontal-striatal pathway underlies cognitive regulation of craving. Proceedings of the National Academy of Sciences of the United States of America, 107, 14811–14816.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information (397 pp.). San Francisco: W. H. Freeman.
Mather, M. (2012). The emotion paradox in the aging brain. Annals of the New York Academy of Sciences, 1251, 33–49.
Muscatell, K. A., & Eisenberger, N. I. (2012). A social neuroscience perspective on stress and health. Social and Personality Psychology Compass, 6, 890–904.
Muscatell, K. A., Morelli, S. A., Falk, E. B., Way, B. M., Pfeifer, J. H., et al. (2012). Social status modulates neural activity in the mentalizing network. NeuroImage, 60, 1771–1777.
Neisser, U. (1967). Cognitive psychology (Vol. 16). New York: Appleton-Century-Crofts.
Noble, K. G., McCandliss, B. D., & Farah, M. J. (2007). Socioeconomic gradients predict individual differences in neurocognitive abilities. Developmental Science, 10, 464–480.
Ochsner, K. (2007). Social cognitive neuroscience: Historical development, core principles, and future promise. In A. Kruglanski & E. T. Higgins (Eds.), Social psychology: A handbook of basic principles (pp. 39–66). New York: Guilford Press.
Ochsner, K. N. (2008). The social-emotional processing stream: Five core constructs and their translational potential for schizophrenia and beyond. Biological Psychiatry, 64, 48–61.
Ochsner, K. N., & Barrett, L. F. (2001). A multiprocess perspective on the neuroscience of emotion. In T. J. Mayne & G. A. Bonanno (Eds.), Emotions: Current issues and future directions (pp. 38–81). New York: Guilford Press.
Ochsner, K. N., & Gross, J. J. (2004). Thinking makes it so: A social cognitive neuroscience approach to emotion regulation. In R. F. Baumeister & K. D. Vohs (Eds.), Handbook of self-regulation: Research, theory, and applications (pp. 229–255). New York: Guilford Press.
Ochsner, K. N., & Lieberman, M. D. (2001). The emergence of social cognitive neuroscience. American Psychologist, 56, 717–734.
Panksepp, J. (1991). Affective neuroscience: A conceptual framework for the neurobiological study of emotions. International Review of Studies on Emotion, 1, 59–99.
Panksepp, J. (1998). Affective neuroscience: The foundations of human and animal emotions. New York: Oxford University Press.
Paxton, J. L., Barch, D. M., Racine, C. A., & Braver, T. S. (2008). Cognitive control, goal maintenance, and prefrontal function in healthy aging. Cerebral Cortex, 18, 1010–1028.
Pizzagalli, D. A., Holmes, A. J., Dillon, D. G., Goetz, E. L., Birk, J. L., Bogdan, R., et al. (2009). Reduced caudate and nucleus accumbens response to rewards in unmedicated individuals with major depressive disorder. American Journal of Psychiatry, 166(6), 702–710.
Schacter, D. L., & Tulving, E. (1994). Memory systems 1994 (Vol. 8). Cambridge, MA: MIT Press.
Todorov, A. B., Fiske, S. T., & Prentice, D. A. (2011). Social neuroscience: Toward understanding the underpinnings of the social mind (Vol. 8). New York: Oxford University Press.

Kevin N. Ochsner

Kevin N. Ochsner is a professor in the Department of Psychology at Columbia University in New York, NY.

Stephen Kosslyn

Stephen M. Kosslyn, Center for Advanced Study in the Behavioral Sciences, Stanford University, Stanford, CA

Index

The Oxford Handbook of Cognitive Neuroscience, Volume 1: Core Topics
Edited by Kevin N. Ochsner and Stephen Kosslyn
Print Publication Date: Dec 2013
Subject: Psychology
Online Publication Date: Dec 2013

A
abstract knowledge, 365–366
acetylcholine, 301t
  orienting network, 302
achromatopsia, 80
acoustic frequencies, 145
acoustic properties
  facial movements and vocal acoustics, 529–531
  linking articulatory movements to vocal, 525–526
  speech perception, 509–512
acoustic shadowing effect, 150
action
  attention and, 255–256, 267–268
  attention for, 261
  divergence from attention, 261–263
  imagery, 123–124
  music, 122–124
  object-associated, 567–568
  performance, 123
  sensorimotor coordination, 122–123
  sensorimotor modality, 367
action-blindsight, neuropsychology, 322
action monitoring, attention, 299
activate-predict-confirm perception cycle, 62–63
adaptive coding, auditory system, 149–150
affective neuroscience, 599, 600
aging
  binding deficit hypothesis and medial temporal lobe (MTL) function, 468
  cognitive theories of, 457
  compensation-related utilization of neural circuits hypothesis (CRUNCH), 459, 462, 470
  double dissociation between medial temporal lobe (MTL) regions, 466f
  episodic memory encoding, 459–461, 464
  episodic memory retrieval, 461–463, 464–465

  functional neuroimaging of cognitive, 465–467
  future directions, 469–470
  healthy vs. pathological, 468–469
  hemispheric asymmetry reduction in older adults (HAROLD), 458–463, 470
  medial temporal lobes, 463–465
  prefrontal cortex (PFC) activity during episodic memory retrieval, 462f
  resource deficit hypothesis and PFC function, 467
  sustained and transient memory effects, 461f
  working memory, 458–459, 463
  working memory and episodic memory, 456
  working memory and prefrontal cortex (PFC) for younger and older adults, 459f
agnosia, 80, 530
  auditory, 201
  visual, 198–200, 204, 278, 279f
  visual, of patient D. E., 199
agnosia for letters, 200
agraphia, 497
  alexia without, 495
AIDS, 473
akinetopsia, 198
alerting network, attention, 300–301
alexia, 80
alexia without agraphia, 495
algorithm level, cognitive neuroscience, 2
allocentric neglect, 40
Allport, D. A., 356, 568
  sensorimotor model of semantic memory, 362
Alzheimer’s dementia, 203, 456, 562
amnesia, 474
amnesic patient, seizures, 4, 438
amnesic syndrome, 474
amplitude compression, sound, 140
amplitude modulation, sound measurement, 145–146
amusia, 127, 201
amygdala, 90, 93, 95f, 601, 602
  audition, 200
  emotion, 125
  gustation, 202, 203
  reconsolidation, 450
  sensory, 304
  sleep, 446
  social perception, 203
  visual experience, 566
anatomical, 61
anatomy
  attention networks, 301t
  music and brain, 126–127
angular gyrus (AG), 36, 128f, 491

  language, 173
  neglect syndrome, 331
  reading and spelling, 494, 499–500
  speech perception, 509
  speech processing system, 508f
anorexia, olfactory perception, 102
anterior cingulate cortex (ACC)
  executive attention, 301t, 302, 304
  learning, 419, 421
  retrieval, 384
anterior olfactory nucleus, 93, 95f
anterior piriform cortex, 95f, 96
anterograde amnesia, 376, 439, 443, 474
aphasia, 368n.3, 390
  verbal working memory, 402–403
apraxia, 327, 567, 568
  object-related actions, 361
Aristotle, 417
articulation
  linking to vocal acoustics, 525–526
  speech perception, 514–515
artificial neural networks, spatial judgments, 41
asomatognosia, 202
Association for Chemoreception Science, 97
associative phase, stage model, 417–418
astereopsis, 197
Atlas of Odor Character Profiles, Dravnieks, 97
attack, timbre, 114
attention, 255–256, 267–268, 268. See also auditory attention; spatial attention
  cognitive neuroscience, 313
  constructs of, and marker tasks, 297–300
  defining as verb, 256–257
  deployment, in time, 230, 231f
  development of, networks, 302–311, 312–313
  divergence from action, 261–263
  efficiency, 311
  as executive control, 299
  experience, 312
  genes, 311–312
  introduction to visual, 257–259
  musical processing, 119
  neuroanatomy of, 300–302
  optimizing development, 312–313
  parietal cells, 35
  perceptual load theory of, 545n.11
  Posner’s model of, 299–300
  relationship to action, 255–256, 267–268
  remapping, 344–345

  selective, 225–228
  as selectivity, 297–299
  as state, 297
  sustained, 224–225
  theories of visual, 259–263
  tuning, 258–259
attentional deficit
  neglect and optic ataxia, 337, 339–341
  oculomotor deficits, 343–344
  optic ataxia, 335–337
attentional disorders
  Bálint–Holmes syndrome, 326–328
  dorsal and ventral visual streams, 320, 322–323
  optic ataxia, 335–341
  psychic paralysis of gaze, 341–344
  saliency maps of dorsal and ventral visual streams, 321f
  simultanagnosia, 342–343
  unilateral neglect, 328–334
attentional landscape, reach-to-grasp movement, 276
attentional orienting, inhibition, 298–299
attention-for-action theory, 262–263
Attention in Early Development, Ruff and Rothbart, 302
attention networks
  childhood, 306–311
  developmental time course of, 311f
  development of, 302–311
  infancy and toddlerhood, 302–306
attention network task (ANT), 300, 306–307
attention window, spatial relation, 41–42
attentive listening, rhythm, 118
audiovisual phonological fusion, 531
audiovisual speech perception
  extraction of visual cues in, 538–540
  McGurk illusion, 531–532, 535, 537, 538
  neural correlates of integration, 540–542
  perception in noise, 529f
  spatial constraints, 538
  temporal constraints, 534–538
  temporal window of integration, 536f, 537
audition, 135–136, 200–201. See also auditory system
  amusia, 127, 201
  auditory agnosia, 201
  auditory scene analysis, 156–162
  auditory system, 136, 137f
  challenges, 136, 163–164
  frequency selectivity and cochlea, 136–137, 139–140
  future directions, 162–164
  inference problems of, 163

  interaction with other senses, 163
  interface with cognition, 162
  perceptual disorders, 201
  peripheral auditory system, 138f
  sound measurements, 136–142
  sound source perception, 150–156
auditory attention. See also attention
  brain areas, 229, 230f
  concurrent sound segregation, 218–219
  conjunction of pitch and location, 221f
  deployment of, 228–230
  divided attention, 228
  enhancement and suppression mechanism, 221–222
  environment, 216–220
  figure-ground segregation, 221–222
  future directions, 230–232
  intermodal, 227–228
  intramodal selective attention, 226–227
  mechanisms of, 220–223
  neural network of, 223–230
  object-based, 217–218
  selective attention, 225–228
  sequential sound organization, 219–220
  sharpening tuning curve, 222–223
  sustained attention, 224–225
  varieties of, 215–216
auditory brainstem response (ABR), evoked electrical potential, 149–150
auditory cues, speech production, 528f
auditory filters, pitch, 154f
auditory imagery, music, 124
auditory nerve, neural coding in, 140–142
auditory objects, 216
auditory oddball paradigm, 176, 177f
auditory scene analysis, 156–162
  cocktail party problem, 157f
  filling in, 159–160
  sequential grouping, 158–159
  sound segregation and acoustic grouping cues, 156–158
  streaming, 159
auditory speech perception, confusion trees for, 530f
auditory speech signal, properties and perception of, 526–527
auditory streams, 216
auditory system
  adaptive coding and plasticity, 149–150
  anatomy of auditory cortex, 144f
  auditory scene analysis, 156–162
  brain basis of sound segregation, 160–161
  feedback to cochlea, 142–143

  filling in, 159–160
  functional organization, 143–145
  future directions, 162–164
  phase locking, 141–142
  reverberation, 161–162
  schematic, 137f
  separating sound sources from environment, 161–162
  sequential grouping, 158–159
  sound segregation and acoustic grouping cues, 156–158
  sound source perception, 150–156
  sound transduction, 136, 137f
  streaming, 159
  structure of peripheral, 138f
  subcortical pathways, 142
  tonotopy, 143
auditory task, congenitally blind and sighted participants, 564f
autobiographical memory, temporal gradient of, 439–440
autonomous phase, stage model, 417–418

B
Bálint, Reszo, 326
Bálint–Holmes syndrome, 199, 326–328, 344
Bálint’s syndrome, 34, 36, 46, 200, 320f, 327, 334
bandwidth, loudness, 155
basal ganglia
  attention, 227
  vision, 289
  rhythm, 118
  selection, 332
  singing, 122
  skill learning, 425
  working memory, 408
Bayesian models, skill learning, 418–419
behavioral models, skill learning, 417–419
behaviorism, mental imagery, 74–75
Bell, Alexander Graham, 525f
biased competition, visual attention, 261
bilateral brain responses, speech and singing, 174–175
bilateral paracentral scotoma, 196
bilateral parieto-occipital lesions, 46
bilateral postchiasmatic brain injury, 196
binaural cues, sound localization, 150, 151
binding, 458
binding deficit hypothesis
  cognitive theory of aging, 457, 458, 464
  medial temporal lobe (MTL) function, 468
binocular vision, grasping, 275
bipolar receptor neurons, humans, 92
birds, brain organization, 41

blindsight, residual visual abilities, 283
blood oxygen level-dependent signal (BOLD)
  attention, 230f
  audiovisual speech, 543
  auditory sentence comprehension, 188f
  category-specific patterns in healthy brain, 559f
  congenitally blind and sighted participants, 564f
  episodic memory, 379
  intramodal selective attention, 226
  measuring early brain activity, 173
  mental imagery, 75, 85
  responses across neural populations, 19–20
  spatial working memory, 399f, 400
  sustained attention, 224
body perception disorders, 202
BOLD. See blood oxygen level-dependent signal (BOLD)
bottom-up inputs, 62, 60, 67f
brain
  activation in motor areas, 84
  audiovisual speech integration, 540–542
  auditory attention, 229, 230f
  auditory sentence comprehension, 187, 188f
  basis of sound segregation, 160–161
  bilateral responses for speech and singing, 174–175
  category specificity, 571–572
  cognitive neuropsychology, 554–555
  combining sensory information, 524–525
  division of labor, 31–32, 41, 50
  functional magnetic resonance imaging (fMRI) measuring activation, 12
  functional specialization of visual, 194–195
  mapping, 3
  measuring activity in early development, 172–173
  musical functions, 127–129
  object-, face- and place-selective cortex, 12, 13f
  olfactory information, 96
  semantic memory, 358
  topographic maps of, 29–31
brain injury
  bilateral postchiasmatic, 196
  memory disorders, 473–474
  mental imagery and perception, 79–80
  musical functions, 128
  perceptual disorders, 205
  sensorimotor processing, 361
  traumatic, 473, 478, 481
  visual disorders, 195
Broadbent, Donald, 297
Broca’s aphasia, verbal working memory, 402–403

Broca’s area, 111, 127
  auditory sentence comprehension, 188f
  infant brain, 173
  reading and writing, 497–499
  syntactic rules, 187
Brodmann areas, 36, 80
  reading and spelling, 494f
  visual cortex for areas 17 and 18, 76
buildup, 160

C
capacity, working memory, 400–402
capacity limitation theory, visual attention, 259, 260–261
carbon monoxide poisoning, visual agnosia, 278
categorical, spatial relations, 41–45
category-specific semantic deficits
  anatomy of category specificity, 563–565
  BOLD response in healthy brain, 559f
  congenitally blind and sighted participants, 564f
  connectivity as innate domain-specific constraint, 567
  correlated structure principle, 561–563
  distributed domain-specific hypothesis, 565–567
  domain-specific hypothesis, 556, 557, 561
  embodied cognition hypothesis, 569
  explanations of causes of, 556–558
  functional imaging, 563–565
  knowledge of tool use, 569f
  lesion analyses, 563
  multiple semantics assumption, 558–559
  object-associated actions, 567–568
  phenomenon, 555–556
  picture naming performance, 555f
  relation between impairments and, 557f
  relation between sensory, motor and conceptual knowledge, 568–570
  representation of conceptual content, 570–571
  role of visual experience, 566–567
  second-generation sensory/functional theories, 559–561
  sensory/functional theory, 556, 557
  toward a synthesis, 571–572
cellular consolidation, 441–443. See also consolidation
  long-term potentiation (LTP), 441–443
  slow-wave sleep and, 444–445
central executive, working memory, 392, 407, 475
central scotoma, 196
central vision, 322, 323f
centroid, 114
cerebellum, rhythm activity, 118
cerebral achromatopsia, 196–197
cerebral akinetopsia, 198

cerebral dyschromatopsia, 196 cerebral hemiachromatopsia, 197 change blindness, 322 phenomenon, 298 visual attention, 257–258, 260 chemosignaling accessory olfactory system, 90 human tears, 104f social interaction, 104–105 Cherry, Colin, 297 childhood alerting, 307–308 attention network development, 306–311 attention network task (ANT), 306f orienting, 308–309 selectivity, 308–309 Children’s Television Workshop, 579, 580 Chinese language, speech perception, 512–513 chroma of pitch, 113f, 115 cilia, olfactory receptors, 92 closed-loop, skill learning, 418 closure positive shift (CPS), 175, 183 cochlea cochlear amplification, 140 feedback to, 142–143 frequency selectivity, 136–137, 139f, 139–140 cocktail party problem, 156, 157f cognitive maps, spatial representation, 48–49 cognitive neuropsychology, 2 cognitive neuroscience, 1, 2, 600, 601, 604 advances for semantic memory, 366–367 attention, 313 audiovisual speech integration, 542–544 constraints and convergence, 3–5 locative prepositions, 44 looking to the future, 601–604 modeling relationships, 602–603 multiple levels of analysis, 2 neural underpinning, 75 skill learning, 416–417, 419–420 themes and trends, 599–601 translation, 603–604 use of multiple methods, 2–3 visual brain, 32 cognitive phase, stage model, 417–418 cognitive psychology, 600 cognitive subtraction, 396 cognitivist, 61

color imagery, visual, 79–80 color vision, 196–197 communication. See speech perception comodulation, 158 compensation-related utilization of neural circuits hypothesis (CRUNCH), aging, 459, 462, 470 competition for representation, visual attention, 260–261 computational analysis, 2 computational neuroscience, fast and slow learning, 424–425 conduction aphasia, verbal working memory, 403, 406f conscious incompetence, skill learning, 417, 418f consolidation, 436, 451 cellular, 441–443 early views, 436–438 hippocampal activity vs. neocortical activity, 441 modern views of, 438–443 reconsolidation, 450–451 resistance to interference, 436–438 role of brain rhythms in encoding and, states of hippocampus, 447–448 role of sleep in creative problem-solving, 450 sleep and, 443–450 sleep-related, of nondeclarative memory, 448–450 systems, 438–441 temporal gradient of episodic memory, 439–440 temporal gradient of semantic memory, 439 temporal gradients involving shorter time scale, 440–441 testing effect, 451 constraints, cognitive neuroscience, 3–5 context frames, visual processing, 65–66 contextual associations, visual processing, 65–66 contingent negative variation (CNV), 231f, 301, 307, 308 continuous performance task (CPT), 307–308 contrast sensitivity spatial, 196 spatial attention, 247–248 convergence, cognitive neuroscience, 3–5 convergence zone, 357 coordinate, spatial relations, 41–45 correlated feature-based account, semantic memory, 357 cortical activation, auditory sentence comprehension, 187, 188f cortical networks, spatial attention, 245 cortical neural pattern (CNP), 377 cortical reactivation, episodic memory, 376f, 382–384 covert attention eye-movement task, 266–267 shifts of attention without eye displacements, 323 covert orienting infancy, 303 spatial attention, 240–242, 251, 263–264

Cowan’s K, 401 creative problem-solving, role of sleep in, 450 Critique of Pure Reason, Kant, 50, 528 culture, pleasantness ratings by, 99, 100f D declarative memory, 353, 443 long-term, 475–478 sleep-related consolidation of, 444–448 delayed response task, working memory, 394–395, 396f dementia, 203, 481 depth perception, 197 diabetes mellitus, 203 diagonal illusion, 285 Diamond, Adele, 304 difference in memory (DM) paradigm, 379 diffusion tensor imaging (DTI), 428, 494 direction, gravity, 37–38 disconnected edges, response, 15f distributed, 23 division of labor analog vs. digital spatial relations, 45 brain, 31–32 brain organization, 41 visual processing, 277–278 domain-general binding, 380 domain-specific category-based models, semantic memory, 356 domain-specific encoding, 380 domain-specific hypothesis category-specific semantic deficits, 556, 557, 561 distributed, 565–567, 572 dopamine alerting network, 300, 301t executive attention, 301t, 302 dorsal frontoparietal network orientation, 308 spatial attention, 245–246, 250 dorsal premotor cortex (PMC), 118, 123, 397, 423 dorsal simultanagnosia, 199–200 dorsal system object map, 46 object recognition, 45–48 spatial information, 43 dorsal visual stream active vision, 323, 326 episodic retrieval, 383 interaction with ventral stream, 289–290 landmark test characterizing, 321f object identification, 32

perception and vision with, damage, 279–280 peripheral vision, 322 saliency map, 321f shape sensitivity, 33–34 spatial mental imagery, 81–82 vision, 194–195 visual processing, 277f dorsolateral prefrontal cortex (DLPFC), 128f absolute pitch, 121 attention, 267 divided attention, 228 executive attention, 302 sensorimotor coordination, 122 skill learning, 419, 422, 425 working memory, 119 double-pointing hypothesis, visual-motor system, 275, 286–288 double-step saccade task, 323, 325f Drosophila, 92 dual-channel hypothesis, visual-motor control, 274–275, 276 dual-task paradigm, saccade location and attention, 265 Dutch, word stress, 178 dynamicist, 61 dysexecutive syndrome, 475 dysgeusia, 203 dyslexia, 196 E early left anterior negativity (ELAN) syntactic phrase structure, 175 syntactic rules, 187 early right anterior negativity (ERAN) harmonic expectancy violation, 116 music and language, 121 eating, olfaction influence, 102–103 eating disorders, olfactory perception, 102–103 Ebbinghaus, Hermann, 1 Ebbinghaus illusion, 284, 285, 286 echoic memory, 484n.1 edges, response, 15f efference copy, 68 efficient coding hypothesis, auditory system, 149 egocentric spatial, frame of reference, 40 electroencephalography (EEG) audiovisual speech, 541 cortical reactivation, 385 executive attention, 304 measuring early brain activity, 172–173 phonetic categories of speech, 514 REM sleep, 444

sensorimotor coordination, 122 electrophysiology language processing, 175 neurons in monkey IT cortex, 19 object and position sensitivity, 16 object recognition, 11–12 prosodic processing, 183 word segmentation studies, 178 emotion, 2 brain regions, 128f memory impairment, 483–484 music and, 124–126 encephalitis, 473, 481 encoding episodic memory, 375 episodic memory mechanisms, 378 episodic memory with aging, 459–461, 464 functional imaging, 378–379 stage of remembering, 474 encoding specificity principle, episodic memory, 377 encyclopedic knowledge, 355 English, word stress, 177–179 enhancement and suppression mechanism, auditory attention, 221–222 entorhinal cortex, 48, 49, 93, 95f, 448, 469 envelope, amplitude capture, 145–146 environment, separating sound sources from, 161–162 environmental agnosia, 200 epilepsy, 203, 473 epiphenomenal, 363, 369n.6 episodic buffer, working memory, 392, 393–394, 475 episodic memory, 353, 375–376, 475–476 aging process, 456 coding and hemispheric asymmetry reduction in older adults (HAROLD), 459–461 cortex reactivating during retrieval, 382–384 early insights from patient work, 376–377 encoding using functional imaging, 378–379 functional neuroimaging as tool, 377–378 hippocampal activity during encoding, 379–381 hippocampal reactivation mediating cortical reactivation, 384–385 hippocampus activating during retrieval of, 381–382 medial temporal lobes (MTL) and encoding, 464 memory as reinstatement (MAR) model, 376f, 377 MTL and retrieval, 464–465 musical processes, 120–121 relationship to semantic memory, 354, 477 retrieval and hemispheric asymmetry reduction in older adults (HAROLD), 461–463 temporal gradient of, 439–440 error-dependent learning, 421

error detection and correction, skill learning, 420–425 errorless learning, 481–482 error-minimization mechanisms, 63 error-related negativity (ERN), 299, 310, 311, 421–422 event-related potentials (ERPs) attention, 298, 309 audiovisual speech, 541, 542–543 auditory attention, 219f auditory scene, 216 infant N400, 182 language development, 176f language perception, 175 learning, 421 measuring early brain activity, 172 musical training, 116 music and language, 121 N400 effect, 181, 182 phonetic learning, 518, 519 phonotactics, 179–180 prosodic processing, 183, 184f sentence-level semantics, 184–185, 185f spatial attention, 247 syllable discrimination, 177f syntactic rules, 186f word meaning, 180–182, 182f word stress, 178, 179f evolution, brain organization, 41 excitation pattern, pitch, 154f executive attention attention networks, 301t childhood, 309–311 network in infants, 304–305 executive control attention as, 299 top-down, for visual-motor system, 289 executive function, 457 Exner’s area, reading and writing, 494, 500 experimental psychology, resistance to new learning, 437 explicit memory, skill learning, 425–427 extinction, dissociation between neglect and, 328–331 extrastriate body area (EBA), 13 eye movements behavioral evidence linking spatial attention to, 264–266 overt attention, 323 parietal lobes, 35 and spatial attention, 242–244, 251 visual attention and, 256 visual tracking task, 288

F face agnosia, 200 face-selective cortex, fMRI signals, 13f facial movements, combination with vocal acoustics, 529–531 familiarity, 458 far transfer, working memory, 402 fault tolerance, 367 fear-conditioning task, sleep, 447 feature-based attention, 258–259, 261–262 feature integration theory, 46 feedback connections, visual world, 61–62 feedforward connections, visual world, 61–62 figure-ground segregation, auditory attention, 221–222 filling in, sound segregation, 159–160 filtering, separating sound sources from environment, 161–162 Fitts, Paul, 417 flanker task, 300, 310 Fowler, Carol, 525 frames of reference object-centered or word-centered, 38–39 spatial representation, 36–38 French syntactic structure, 186–187 word stress, 177–179, 179f frequency selectivity, cochlea, 136–137, 139f, 139–140 frontal eye field (FEF), 36, 226, 230f attention, 334 orienting network, 301 remapping, 325f spatial attention, 244 spatial attention and eye movements, 266, 267 spatial working memory, 398–399 frontal lobes, 601 functional imaging, encoding episodic memory, 378–379 functional magnetic resonance imaging (fMRI) abstract knowledge, 366 attention, 217 attentional orienting, 298 audiovisual speech, 542 auditory imagery, 124 category specificity, 563–565 dissociations with aging, 457 episodic memory, 379 grasping control, 281–282 intramodal selective attention, 226–227 lexical competition, 516 measuring brain activation, 12 measuring early brain activity, 172–173

mental imagery, 75–76 neural bases of object recognition, 11, 12 neural dispositions of language in infant brain, 173, 174f neuroimaging of patient’s dorsal stream, 281f neuroimaging of patient’s ventral stream, 280f piriform activity in humans, 96 pitch and melody, 115 reading and spelling, 494, 498f retinotopic mapping, 31 semantic memory, 439 sentence comprehension, 187 signals of object-, face- and place-selective cortex, 12, 13f slow-wave sleep, 446 spatial attention, 245 spatial mental imagery, 81 speech articulation, 511 tonal dynamics, 117 verbal working memory, 404–405 visually guided reaching, 282 visual mental imagery, 76–77 visual-spatial attention and eye movements, 266–267 visual-spatial working memory, 397–398 functional magnetic resonance imaging–adaptation (fMRI–A), 14, 361 neural representations, 17 responses of putative voxel, 20, 21f rotation sensitivity, 20f functional near-infrared spectroscopy (fNIRS) infant brain, 173 measuring early brain activity, 172–173 functional neuroimaging cognitive aging, 465–467 medial temporal lobe (MTL), 468 prefrontal cortex (PFC), 467 reading words and nonwords, 495, 500 tool for episodic memory, 377–378 working memory, 395–400 fusiform body area (FBA), 13, 22–23 fusiform face area (FFA), 13 domain-specific hypothesis, 566 sparsely distributed representation of, 22–23 visual mental imagery, 78–79 fusiform gyrus, 12, 13, 399, 491, 494–497 G Gabor patches, 76 gap effect, 238 Gault, Robert, 533 gaze, psychic paralysis of, 341–344 gaze ataxia, 327

gaze-contingent display, perception, 264, 264f Gazzaniga, Michael, 1 genes, attention networks, 301t, 311–312 geometry, cognitive maps, 48–49 German, word stress, 177–179, 179f Gerstmann’s syndrome, 322 gestalt, 34, 199 gestaltist, 61 Glasgow Outcome Scale, 478 global-to-local integrated model, visual processing, 63–64 glomeruli olfactory bulb, 93 patterns for rat, activation, 94f goal-driven attentional shifts, 258 Gottfried, Jay, 96 graceful degradation, 367 grammar constraint-based principles of, 581–582 mainstream generative, (MGG), 579 no strict lexicon vs., 582–584 grapheme-to-phoneme conversion, 492, 500 graphemic buffer, 493 grasping binocular vision, 275 illusory displays, 285–286, 287f objects, 274, 283 reach-to-grasp movements, 274–276 shape and orientation of goal object, 276 studies, 275 visual-motor system, 285–286, 288 gravity, sense of direction, 37–38 grip aperture grip scaling, 288 optic ataxia vs. visual agnosia, 279f reach-to-grasp movements, 274f sensitivity to illusions, 285–286 Weber’s law, 286, 288 grounding by interaction, 570–571 grouping cues sequential grouping, 158–159 sound segregation, 156–158 group therapy, memory impairment, 483–484 Grueneberg organ, 89, 90 gustation, 202 gustatory perceptual disorders, 203 H Haberly, Lew, 95–96 Handbook

linking analysis, 600–601 looking to future, 601–604 new methods, 602 new topics, 601–602 new ways of modeling relationships, 602–603 overview of, 5–6, 600 themes and trends, 599–601 translation, 603–604 haptic devices, 534 harmony, music, 112 head-related transfer function (HRTF), sound localization, 151–152 hearing. See also auditory system frequency selectivity and cochlea, 136–137, 139f, 139–140 research on pitch, 153 hearing science, future directions, 162–164 Hebb, Donald, 297 Hebbian plasticity, 424, 481 hemianopia, 80, 195, 196, 204 hemianopic dyslexia, 196 hemifield constraints, attention and action, 263 hemispheric asymmetry reduction in older adults (HAROLD), prefrontal cortex (PFC), 458–463, 470 Heschl’s gyrus (HG), 115, 117, 126–127, 128f acoustic properties of speech, 510 attention, 226 auditory sentence comprehension, 188f episodic retrieval, 383 language, 173 phonetic learning, 518–519 speech perception, 509, 510, 512 high spatial frequency (HSF) information, ventral visual pathway, 67f hippocampal cells, cognitive maps, 48–49 hippocampal neuro pattern (HNP), 377 hippocampal pattern completion, 376f, 377 hippocampus, 21, 32, 36, 44, 48, 49, 95f activity during encoding, 379–381 activity during retrieval, 381–382 activity vs. neocortical activity, 441 aging and condition interaction, 463f consolidation, 438, 440–445 longitudinal changes, 457f memory, 120, 124, 376 reactivation, 384–385 role of brain rhythms in encoding and consolidation, 447–448 skill learning, 425 homonymous visual field defect, 195 homophony, 527 homunculus, 31

honeybees, spatial information, 29 horizontal-vertical illusion, 285 hub, 357 human behavior, olfactory system, 101–105 human cerebral cortex, two streams of visual processing, 277f human cognitive psychology, verbal and working memory, 402 human leukocyte antigen (HLA), mating, 103 human olfaction. See also mammalian olfactory system primacy of, 88, 89f schematic of system, 89f human parietal cortex, spatial information, 49 humans chemosignaling, 104f microsmatic, 88 motor imagery, 82–83 olfactory cortex, 95f olfactory system, 90f schematic of nasal cavity, 91f human ventral stream. See also ventral visual stream functional organization of, 12–14 nature of functional organization, 21–23 Huntington’s disease, 203, 427 hypogeusia, 203 I iconic memory, 484n.1 image rotation, motor imagery, 83 imagery brain regions, 128f debate, 4–5 mental imagery and, 74–76 music, 123–124 image scanning paradigm, landmarks, 81 immediate memory, 475 implementation analysis, 2 implicit memory, skill learning, 425–427 improvised performance, music, 123 inattentional blindness, 260 independent-components account, 493 Infant Behavior Questionnaire, 305 infant brain, neural dispositions of language, 173–175 infants, attention network development, 302–306 inferior colliculus, 137f, 146, 148f, 149, 152, 227 inferior frontal gyrus (IFG) audiovisual speech, 540 lexical competition, 516–517 reading and writing, 497–499 speech processing system, 508f inferior parietal lobe (IPL), 319

neglect, 40, 334 polysensory areas, 36 inferior temporal lobe, 46, 78, 322, 358, 399 inferotemporal (IT) cortex, responses to shapes and objects, 11 information flow, speech processing, 517–518 information processing, 3 inhibition, attentional orienting, 298–299 inhibition of return (IOR), 299 inner hair cells, 136 integrative agnosia, 199 intelligibility, visual contributions to speech, 528–529 interaural level differences (ILDs), sound, 150–151 interaural time differences (ITDs), sound, 150–151 interference attention and action, 262–263 consolidation and resistance to, 436–438 intermodal auditory selective attention, 227–228 internal model framework, action generation, 68 intonational phrase boundaries (IPBs), 182–183 intramodal auditory selective attention, 226–227 intraparietal sulcus (IPS), 13f, 31, 81, 559f attention, 224, 230f, 319 audiovisual speech, 540 auditory attention, 224–225 neglect syndrome, 331 singing, 122, 128f spatial attention, 245 visual-spatial attention, 266, 278, 281 working memory, 399, 401f invariance, 18 invariance problem, speech perception, 513–514 invariant object recognition, neural bases of, 15–16 inverse retinotopy, 76 irrelevant sound effect, verbal working memory, 404 J Jacobson’s organ, 89–90 James, William, 296, 375, 417 Jost’s law of forgetting, 438 K Kahneman, Daniel, 297 Katz, Larry, 97 Keller, Helen, 525f key, music, 112, 113f key profile, music, 113 Khan, Rehan, 97 Korsakoff’s syndrome, 473 L landmarks, image scanning, 81

language. See also reading and writing information processing, 603 parallels between music and, 121–122 perception of lexical tone, 512–513 working memory maintenance, 405f language acquisition, 171–172 auditory sentence comprehension, 187, 188f developmental stages, 176f from sounds to sentences, 182–187 from sounds to words, 175–182 neural dispositions of language in infant brain, 173–175 phoneme characteristics, 176–177 phonological familiarity, 180 phonotactics, 179–180 sentence-level prosody, 182–183 sentence-level semantics, 184–185 syntactic rules, 185–187 word meaning, 180–182 word stress, 177–179 language processing, 507–509. See also Parallel Architecture; speech perception Parallel Architecture, 578–579 working memory, 584–586 lateral geniculate nucleus (LGN), spatial attention, 244 lateral inferior-temporal multimodality area (LIMA), 496 lateral intraparietal (LIP) area, 321f attention, 319–320 visual remapping, 325f lateralization, spatial representations, 40–45 lateral occipital complex (LOC), 12, 13f cue-invariant responses in, 14–15 object and position information in, 16–18 position and category effects, 17, 18f selective responses to objects, 14f viewpoint sensitivity across, 18–21 visual mental imagery, 78 lateral olivocochlear efferents, feedback to cochlea, 142 lateral superior olive (LSO), sound localization, 151 law of prior entry, spatial attention, 249–250 learning, memory-impaired people, 481–483 left hemisphere categorical perception after damage, 43 digital spatial relations, 41, 42 lesions in, 42–43 object recognition after damage, 47 lesion analyses. See also brain injury category-specific deficits, 563 object recognition, 11–12 letter-by-letter reading, 495

letter discrimination, optic ataxia, 340f lexical competition, speech perception, 516–517 lexical-semantic information, sentence-level, 184–185, 185f lexical tone, speech perception, 512–513 lexicon after working memory, 587f fragment of, 587f phonological activation, 586–588 speech processing system, 508f life cycle, memory, 391 linguistic theories, 578–579 linguistic working memory, 586f, 593n.10 lip reading, 527, 530f listening, separating sound from environment, 161–162 localization, sound source, 150–152 location-based attention, 259 locative prepositions, spatial relations, 44–45 Locke, John, 368n.3 long-term memory, 475 long-term potentiation (LTP), cellular consolidation, 441–443 long-term store (LST), working memory, 390–391 loudness constancy phenomena, 161 deviation tones, 229 sound, 155–156 love spots, 3 M McCarthy, Rosaleen, 555 McGurk illusion, 537, 538, 542, 543 audiovisual speech integration, 531–532, 535 schematic showing, 531f magnetic misreaching, 335 magnetic resonance imaging (MRI) diffusion-weighted, 499f perfusion-weighted image, 498f, 499f working memory, 422 magnetoencephalography (MEG) acoustic properties of speech, 510 audiovisual speech, 541 cortical reactivation, 385 measuring early brain activity, 172–173 reading and spelling, 494 mainstream generative grammar (MGG), Parallel Architecture, 579 maintenance, working memory, 391, 405f major histocompatibility complex (MHC), mating, 103 mammalian olfactory system eating, 102–103 human olfactory cortex, 95f

looking at human behavior through the nose, 101–105 looking at nose through human behavior, 97–101 mating, 103–104 mouse and human, 90f multiple sensing mechanisms, 89–90 neuroanatomy of, 88–96 olfactory bulb for odorant discrimination, 92–93 olfactory epithelium, 91–92 olfactory perceptual space, 97, 98f physicochemical space to perceptual space, 99f piriform cortex, 94–96 pleasantness across cultures, 99, 100f pleasantness identification, 97, 98f, 99–101, 100f primary olfactory cortex, 93–94 schematic of human olfactory system, 89f sniffing, 90–91, 91f social interaction, 104–105 marker tasks, attention, 297, 301t mating, olfaction influence, 103–104 medial olivocochlear efferents, feedback to cochlea, 142 medial prefrontal cortex (MPFC), 128f, 601 contextual processing, 66, 67f learning, 426 memory, 441 tonal dynamics, 117, 120, 123 medial superior olive (MSO), sound localization, 151 medial temporal lobe (MTL) aging and double dissociation between regions, 466f binding deficit hypothesis and MTL function, 468 consolidation, 438, 439–440 dysfunction in healthy and pathological aging, 468–469 episodic memory, 456 episodic memory encoding, 464 episodic memory retrieval, 464–465 hippocampal activity during encoding, 379–380 surgical removal, 376–377 working memory, 463 melody brain regions, 128f music, 112 tonality, 115 memory, 2. See also episodic memory; semantic memory; working memory aids in loss compensation, 480–481 assessment of, functioning, 479–480 audition, 162–163 auditory, 217 brain regions, 128f music and, 119–121

navigational, 48–49 Plato, 74 recovery of functioning, 478–479 rehabilitation, 480–484 stages of remembering, 474 systems, 474–478 memory as reinstatement (MAR) model, episodic memory, 376f, 377, 378, 381 memory disorders, 473–474, 484 anterograde amnesia, 474 assessment of memory functioning, 479–480 compensating with memory aids, 480–481 declarative long-term memory, 475–478 emotional consequences, 483–484 episodic, 475–476 memory systems, 474–478 modifying the environment, 483 new learning, 481–483 non-declarative long-term memory, 478 priming, 478 procedural memory, 478 prospective memory, 477–478 recovery of memory functioning, 478–479 rehabilitation of memory, 480–484 relationship between semantic and episodic memory, 477 retrograde amnesia, 474 semantic memory, 476–477 short-term and working memory, 474–475 stages of remembering, 474 understanding, 474–484 memory-guided saccade (MGS), spatial working memory, 399f memory impairment, 474 memory systems debate, 4 memory trace consolidation, 437 skill learning, 418 menstrual synchrony, olfaction influence, 103 mental exertion, 437 mental imagery, 74, 84–85 brain-damaged patients, 80–81 dorsal stream and spatial, 81–82 early studies of, 74–76 imagery debate, 74–76 visual, and early visual areas, 76–78 visual, and higher visual areas, 78–82 mental mimicry, 224 mental rotation tasks, strategies in, 83–84 mental rotation paradigm humans, 82–83

objects, 75 mesial temporal epilepsy, 203 meter, music, 112, 117–118 middle temporal gyrus, speech processing system, 508f, 509 Mikrokosmos, Bartok, 124 Milan square’s neglect experiment, 49 Miller, George, 1 mismatch negativity (MMN), 178 audiovisual speech, 541, 545n.10 chords, 116 language development, 176f oddball paradigm, 543 phonetic learning, 518 speech sounds, 176 timbre, 118–119 mismatch paradigm, 176 mismatch response (MMR), 176 missing fundamental illusion, 152 modulation frequencies, 145 modulation tuning, sound measurement, 146–148 Molaison, Henry (H.M.), 4, 376–377, 438, 478 monkeys navigational memory, 49 neurophysiological studies, 267 perception of space and object, 33 motion perception, 198 motor imagery, 82–84 functional role of area M1, 84 music, 124 physical movement, 82–83 strategies in mental rotation tasks, 83–84 motor systems, speech perception, 514–515 mouse, olfactory system, 90f moving window paradigm, perception, 264, 264f Müller, Georg, 436 Müller–Lyer illusion, 66, 285 multimodal speech perception. See speech perception, multimodal multiple sclerosis, 473 gustatory disorders, 203 olfactory perception, 203 multiple semantics assumption, 558–559 multivoxel pattern analysis (MVPA) episodic retrieval, 383 semantic memory, 366, 368 sensibility to position, 17, 18 music absolute pitch, 121 action, 122–124

amusia, 127, 201 anatomy, plasticity and development, 126–127 attention, 119 auditory imagery, 124 auditory perception, 200 brain’s functions, 127–129 building blocks of, 112–115 detecting wrong notes and wrong chords, 115–117 disorders, 127 emotion, 124–126 episodic memory, 120–121 imagery, 123–124 improvised performance, 123 memory, 119–121 motor imagery, 124 parallels between music and language, 121–122 perception and cognition, 115–121 performance, 123 pitch and melody, 115 psychology and neuroscience, 111–112 rhythm and meter, 117–118 score-based performance, 123 semantics, 121–122 sensorimotor coordination, 122–123 singing, 122–123 syntax, 121 tapping, 122 timbre, 114–115, 118–119 time, 112 tonal dynamics, 117 tonality, 112–114, 115–117 tonal relationships, 113f working memory, 120 working memory maintenance, 405f N National Institute of Mental Health (NIMH), 604 navigation, spatial information, 48–49 n-back task, sustained working memory, 224, 225f negative priming, phenomenon, 299 neglect. See also unilateral neglect reference frames, 39f space, 38–40 neglect dyslexia, 40 Neapolitan sixth, 116 neural networks invariant object recognition, 15–16 spatial attention, 244–246 neural play, slow-wave sleep, 445–446

neuroanatomy attention, 300–302 mammalian olfactory system, 88–96 neuroimaging cognitive aging, 465–467 cortical 3D processing, 36 musical processes, 120 navigational tasks, 49 reach-to-grasp actions, 34–35 shape-selective actions, 33 NeuroPage, memory aid, 480–481 neurophysiology monkeys, 267 visual-spatial attention and eye movements, 266 neuroscience, 111–112, 600 neurotransmitters, attention networks, 301t noise-vocoded speech, 146, 147f nondeclarative memory, 443, 448–450, 478 noradrenaline, alerting network, 301t norepinephrine, 300, 301 O obesity, olfactory perception, 102 object-centered, frame of reference, 38–39 object form topography, 22 object map, 46 object recognition category-specific modules, 22 cue-invariant responses in lateral occipital complex (LOC), 14–15 distributed object form topography, 22 electrophysiology measurements, 11–12 functional magnetic resonance imaging (fMRI), 11, 12 functional organization of human ventral stream, 12–14 future directions, 23–24 lesions of dorsal system, 46 neural bases of invariant, 15–16 neural sensitivity to object view, 20 open questions, 23–24 orbitofrontal cortex (OFC), 64 process maps, 22 representations of faces and body parts, 22–23 responses to shape, edges and surfaces across ventral stream, 15f theories of, 18, 20–21 variant neuron, 20–21 object recognition task, unilateral posterior lesions, 47f objects. See also category-specific semantic deficits ambiguous figures, 68–69 perceptual and conceptual processing of, 360–362 spatial attention, 239–240

spatial information within the, 45–48 visual attention, 258 visual working memory, 399–400 object-selective cortex, fMRI signals, 13f obstacle avoidance, visual-motor networks, 283 oculomotor deficits, 343–344 oculomotor readiness hypothesis (OMRH), spatial attention, 242–244 oddball paradigm, attention, 224 odorants discrimination at olfactory bulb, 92–93 pleasantness, 97, 98f, 99–101 sensing mechanisms, 89–90 sniffing, 90–91 transduction at olfactory epithelium, 91–92 odor coding, olfactory bulb, 93 odor space, 97 olfaction. See also human olfaction; mammalian olfactory system rating pleasantness, 99–101 receptor events in, 92f olfactory bulb, 95f odorant discrimination, 92–93 spatial coding, 94f olfactory cortex, primary, 93–94 olfactory epithelium, odorant transduction, 91–92 olfactory perception, disorders, 202–203 olfactory perceptual space, 97, 98f olfactory receptors, humans, 92 olfactory tubercle, 95f On the Motion of Animals, Aristotle, 417 open-loop control, skill learning, 418 optical topography (OT), early brain activity, 172 optic aphasia, 558 optic ataxia, 278, 279f, 327, 328 central cue vs. peripheral cue, 340f dissociation between attentional deficits in neglect and, 337, 339–341 errors for saccade and reach, 338f field effect and hand effect, 336f neuropsychology, 322 Posner paradigm, 339f reaching errors, 335–337 orbitofrontal cortex (OFC), visual information, 64, 67f organized unitary content hypothesis (OUCH), 561 organ of Corti, peripheral auditory system, 136, 138f orienting network attention, 301–302 childhood, 308–309 infants, 303–304, 305 outer hair cells, 137

overt attention, eye movements, 323 overt orienting, spatial attention, 240–242, 251 P parahippocampal cortex (PHC), 49, 66, 67f, 379, 465, 468, 565 parahippocampal gyrus, music, 125 parahippocampal place area (PPA), 13, 78–79, 383, 566 Parallel Architecture, 579–581, 581f, 592–593 activated lexical items, 587f constraint-based principles of grammar, 581–582 example, 586–590 fragment of lexicon, 587f goals of theory, 578–579 language processing, 578–579 lexicon after working memory, 587f mainstream generative grammar (MGG), 579 no strict lexicon vs. grammar distinction, 582–584 noun phrases (NPs), 579 phonology, 579–580 processing, 584–586 semantics as independent generative component, 580–581 semantic structure without syntax or phonology, 591–592 syntactic integration, 588 visually guided parsing, 590–591 working memory after semantic integration, 588f, 589f parietal cortex, rats, 49 parietal lesions, 40, 48 parietal lobe mapping, 35–36 position, 35 speech perception, 509 unilateral lesions, 46 parietal neglect, spatial working memory, 326f Parkinson’s disease, 203, 473 perceptual, 61 perceptual disorders, 193–194, 203–205 audition, 200–201 auditory agnosia, 201 body perception disorders, 202 color vision, 196–197 future research, 205 gustatory, 203 olfactory, 202–203 olfactory and gustatory perception, 202–203 social perception, 203 somatosensory perception, 201–202 spatial contrast sensitivity, 196 spatial vision, 197–198 vision, 194–200

visual acuity, 196 visual adaptation, 196 visual agnosia, 198–200 visual field, 195–196 visual identification and recognition, 198–200 visual motion perception, 198 perceptual odor space, 97 perceptual trace, skill learning, 418 performance, music, 123 periodicity, pitch, 153f peripheral vision, dorsal stream, 322 phase locking sound frequency, 141, 141f upper limit of, 141–142 phonemes. See also language acquisition auditory speech signal, 526–527 characteristics, 176–177 restoration, 160 speech sounds, 175 visual speech signal, 527 phonological loop verbal working memory, 403–404 working memory, 392–393, 407, 475 phonology, Parallel Architecture, 579–580 phonotactics, language learning, 179–180 phosphene threshold, 77 physical movements, motor imagery, 82–83 picture-matching task, 568 picture-naming performance, category-specific semantic deficits, 555f picture viewing, 564f picture-word priming paradigm, phonotactics, 180, 181 Pilzecker, Alfons, 436 piriform cortex, 93, 202 olfactory object formation, 94–96 understanding, 95–96 pitch absolute, 121 resolvability, 154f sound source, 152–155 tonality, 115 place cells, 48, 49f place models, pitch, 153 place-selective cortex, 13f plasticity auditory system, 149–150 music, 126–127 phonetic learning in adults, 518–519 Plato, 74

pleasantness
  music, 125
  odorants, 97, 98f, 99–101
Ponzo illusion, 286, 287f
positron emission tomography (PET)
  auditory attention, 223
  changes in regional cerebral blood flow, 360
  mental imagery, 75–76
  reading and spelling, 494
  semantic memory retrieval, 359
  spatial attention, 245
  visual information, 567
  visual mental imagery, 76–77
  visual-spatial working memory, 396–398
Posner, Michael, 297, 417
  model of attention, 299–300
Posner cueing paradigm, 259, 298, 308, 339f
posterior parietal cortex (PPC), 34–36, 49, 565
  category specificity, 563
  mental imagery, 81, 82, 84
  optic ataxia, 336f
  somatosensory perception, 201, 205
  spatial attention, 245
  unilateral neglect, 333–334
  vision for action, 277–278, 281–283, 289
  visual attention, 319, 323, 327
posterior piriform cortex, 95f, 96
precedence effect, 161
prediction, relevance in visual recognition, 62–65
prefrontal cortex (PFC)
  activity in young and older adults, 460f
  episodic memory, 456–457
  episodic memory encoding, 459–461
  episodic memory retrieval, 461–463
  hemispheric asymmetry reduction in older adults (HAROLD), 458–463
  longitudinal changes, 457f
  resource deficit hypothesis, 467
  sustained and transient memory effects by aging, 461f
  top-down control of working memory, 408–410
  verbal and spatial working memory for older and younger adults, 459f
  visual-spatial working memory, 397–398
  working memory, 394–395, 396f, 458–459
premotor cortex (PMC), 36, 37, 123, 128f
  attention, 230f
  body perception disorders, 202
  mental imagery, 83
  rhythm, 118
  semantic memory, 361, 364

  sensory functional theory, 559f
  skill learning, 423
  speech perception, 541
  visual control of action, 283, 289
  working memory, 397, 409
premotor theory
  attention, 344
  spatial attention, 242–244
pre-supplementary motor area (pSMA)
  rhythm discrimination, 118
  sensorimotor coordination, 122
primacy effect, 390
primary memory, 475
primary olfactory cortex, structure and function, 93–94
primary somatosensory cortex, 31
priming, memory, 478
principal components analysis (PCA), odor space, 97
procedural memory, 478
process maps, object recognition, 22
progressive semantic dementia, 477
prosody, sentence-level, 182–183
prosopagnosia, 80, 200
prospective memory, 477–478
psychic paralysis of gaze, 341–344
  Balint’s description, 341
  oculomotor deficits, 343–344
  simultanagnosia, 342–343
psychophysics, 273, 275
pulvinar, spatial attention, 238
pulvinar nucleus, spatial attention, 244
pure alexia, 200, 495, 497
putative voxels, functional magnetic resonance imaging (fMRI) responses, 20, 21f
Q
qualitative spatial relations, 44
quantitative propositions, 44
R
radical visual capture, 199
random stimuli, response, 15f
rapid eye movement (REM) sleep, sleep and consolidation, 443–444
rapid serial visual presentation (RSVP), spatial attention, 250
rats
  hippocampal lesions in, 438
  maze navigation, 48–49
  spatial coding of olfactory bulb, 94f
reach-to-grasp actions
  bilateral optic ataxia, 338f
  dual-channel hypothesis, 274–275
  neuroimaging, 34–35

  visual control of, 274–276
reaction time, spatial attention, 238–239
reactivation, working memory, 407–408
reading
  cognitive architecture, 492–493
  mechanisms, 493
  representational model, 492f
  visual field, 195–196
reading and writing, 491–492, 500–501
  angular gyrus (BA 39), 499–500
  Brodmann areas, 494f
  Broca’s area (BA 44/45), 497–499
  cognitive architecture of, 492–493
  Exner’s area (BA 6), 500
  functional magnetic resonance imaging (fMRI), 498f
  fusiform gyrus (BA 37), 494–497
  inferior frontal gyrus (IFG), 497–499
  neural correlates of, 493–500
  superior temporal gyrus (BA 22), 500
  supramarginal gyrus (BA 40), 499
recency effect, 390
recognition, listener, 163
recognition by components (RBC) model, 16, 19
recollection, 458
reconsolidation, 450–451
recovery
  mechanisms of memory, 479
  memory functioning, 478–479
reference frame, 37. See also frames of reference
reflexive control, spatial attention, 240–242
regional cerebral blood flow, positron emission tomography (PET), 360, 395
region of interest (ROI), episodic retrieval, 382–384
remapping, attention, 344–345
Remote Associations Test, 450
repetitive transcranial magnetic stimulation (rTMS), 77, 228, 422
representation of conceptual content, 570–571
Research Domain Criteria (RDoC), 604
resilience, 562
resource deficit hypothesis
  cognitive theory of aging, 457, 458, 461
  episodic memory retrieval, 461
  prefrontal cortex (PFC) function, 467
retina, image formation, 29–31
retinotopic mapping, stimuli and response times, 77f
retrieval
  episodic memory, 375, 377
  episodic memory with aging, 461–463, 464–465
  hippocampus activating during episodic, 381–382

  semantic memory, 358–360
  spaced, 481
  stage of remembering, 474
retrograde amnesia, 438, 474
retrograde facilitation, 443
retrosplenial complex (RSC), 66, 67f
reverberation, sound, 161–162
rhinal cortex, longitudinal changes, 457f
rhythm, music, 112, 117–118
right anterior negativity (RATN), 116
right hemisphere
  analog spatial relation, 41, 42
  coordinate space perception after damage, 43
  neglect, 38
  object recognition after damage, 47
  speed of response to stimuli, 43
right orbitofrontal cortex, music, 125
right temporal lobe (RTL)
  agnosia, 201
  episodic memory, 120, 476
  music, 115, 118, 120, 127, 128
rodents, macrosmatic, 88
S
saccadic eye movements
  bilateral optic ataxia, 338f
  central vision, 323f
  overt attention, 323
  spatial attention, 264–266, 265f
  visual perception, 324f
  visual tracking task, 288
saliency maps, dorsal and ventral visual streams, 321f
score-based performance, music, 123
seizures, amnesic patient, 4, 438
selectivity
  action and attention, 256–257
  attention as, 237, 297–299
  childhood, 308–309
  mechanisms, 256
self-regulation, late infancy, 304
semantic, 368n.1
semantic categorization task, 181
semantic dementia, 354–355, 562
semantic integration, 588–590, 592
semantic judgment task, 181
semantic knowledge, 355, 359–360
semantic memory, 353, 355, 367–368, 475, 476–477
  abstract knowledge, 365–366
  acquisition, 354

  advances in neuroscience methods, 366–367
  biasing representations, 363–364
  brain regions, 358
  cognitive perspectives, 356
  correlated feature-based accounts, 357
  differences in sensory experience, 364–365
  domain-specific category-based models, 356
  future directions, 367–368
  individual differences, 364–365
  interactions between category and task, 359–360
  models, 356–358
  neural regions underlying semantic knowledge, 362–363
  neural systems supporting, 358–363
  organization, 355–358
  perceptual and conceptual processing of objects, 360–362
  relationship to episodic memory, 354, 477
  semantic dementia, 354–355
  semantic space, 367, 368
  sensorimotor-based theory, 356–357
  sensory-functional theory, 356–357
  stimulus influence on retrieval, 358–359
  task influence on retrieval, 359
  temporal gradient of, 439
semantic relevance, 562
semantics
  music and language, 121–122
  Parallel Architecture, 580–581
  sentence-level, 184–185
semantic violation paradigm, 184
semitones, music, 112
sensorimotor adaptation
  explicit and implicit processes, 426–427
  skill acquisition, 417
sensorimotor-based theory, semantic memory, 356–357, 364–365
sensorimotor contingency, 204
sensorimotor coordination, music, 122–123
sensory-functional theory
  category-specific semantic deficits, 556, 557, 559–561
  semantic memory, 356–357
sensory memory, 475
sentence-level prosody, 182–183, 184f
sentence-level semantics, 184–185
septal organ, 89, 90
sequence learning
  explicit and implicit processes, 426
  models of, 422–423
  motor, paradigms, 423
  skill acquisition, 417

  working memory capacity and, 423–424
Sesame Street, Children’s Television Workshop, 579, 580
Shallice, Tim, 555
shape
  geometrical entity, 45–46
  hole, 15f
  information in dorsal system, 33–34
shared-components account, 493
short-term memory, 390, 474–475
short-term store (STS), working memory, 390–391
sight, 277
sign language, double dissociation, 44
simpler syntax, 584
simultanagnosia, 46, 199, 335, 342–343, 344
simultaneous agnosia, 199
singing, music, 122–123
single-photon emission computed tomography (SPECT), 78
size-contrast illusions
  Ebbinghaus illusion, 284, 285, 286
  Ponzo illusion, 286, 287f
  vision for perception and action, 284–288
skeletal image, 46
skill learning, 416
  Bayesian models of, 418–419
  behavioral models of, 417–419
  closed-loop and open-loop control, 418
  cognitive neuroscience, 416–417, 419–420
  error detection and correction, 420–425
  explicit and implicit memory systems, 425–427
  fast and slow learning, 424–425
  Fitts’ and Posner’s stage model, 417–418
  future directions, 428–429
  late learning processes, 424
  practical implications for, 427–428
  questions for future, 429
  role of working memory, 422–424
sleep
  consolidation, 443–450
  rapid eye movement (REM), 443–444
  role in creative problem-solving, 450
  slow-wave, and cellular consolidation, 444–445
  slow-wave, and systems consolidation, 445–447
slow-wave sleep
  cellular consolidation, 444–445
  systems consolidation, 445–447
Smart houses, 483
smell, 202. See also human olfaction; mammalian olfactory system
sniffing

  mechanism for odorant sampling, 90–91
  visualization of human sniff airflow, 91f
social cognitive neuroscience, 599
social interaction, olfactory influence, 104–105
social neuroscience, 599, 600
social perception, disorders in, 203
somatosensory perception, 201–202
sound measurements
  amplitude compression, 140
  amplitude modulation and envelope, 145–146
  auditory system, 137f
  frequency selectivity and cochlea, 136–137, 139f, 139–140
  mapping sounds to words, 515–518
  modulation, 145–148
  modulation tuning, 146–148
  neural coding in auditory nerve, 140–142
  peripheral auditory system, 136–142
  structure of peripheral auditory system, 138f
sound segregation
  acoustic grouping cues, 156–158
  auditory attention, 218–219, 219f
  brain basis of, 160–161
  separating from environment, 161–162
sound source perception. See also auditory system
  localization, 150–152
  loudness, 155–156
  pitch, 152–155
space, 28
  cognitive maps, 48–49
  models of attention, 238–239
  neglecting, 38–40
spaced retrieval, 481
sparsely, 23
sparsely distributed representations, faces and body parts, 22–23
spatial attention, 250–251. See also attention
  behavioral evidence linking, to eye movements, 264–266
  control of, 240–242
  cortical networks of, 245
  covert visual-, 263–264
  dorsal and ventral frontoparietal networks, 245–246
  early visual perception, 247–250
  effect on contrast sensitivity, 247–248
  effects on spatial sensitivity, 248
  effects on temporal sensitivity, 248–250
  eye movements and, 242–244, 263–266
  functional magnetic resonance imaging (fMRI) in humans, 266–267
  law of prior entry, 249–250
  neural sources of, 244–246

  neurophysiological effects of, 246–247
  object-based models, 239–240
  oculomotor readiness hypothesis (OMRH), 242–244
  parietal cells, 35
  premotor theory, 242–244
  reflexive and voluntary control, 240–242
  space-based models, 238–239
  subcortical networks of, 244
  zoom lens theory, 238–239
spatial coding, olfactory bulb, 94f
spatial contrast sensitivity, 196
spatial information
  coordinate and categorical relations, 42f
  shape recognition, 46
  within object, 45–48
spatial memory task, schematic, 397f
spatial representations, 28–29, 50
  actions, 34–36
  brain’s topographic maps, 29–31
  central vision, 323f
  cognition by humans, 28–29
  cognitive maps, 48–49
  distinction between analog and digital, 42
  lateralization of, 40–45
  neglecting space, 38–40
  spatial frames of reference, 36–38
  spatial information within the object, 45–48
  visual perception, 324f
  what vs. where in visual system, 31–34
  “where,” “how,” or “which” systems, 34–36
spatial sensitivity, spatial attention, 248
spatial vision, 197–198
spatial working memory, younger and older adults, 459f
spectrogram, 145
  noise-vocoded speech, 147f
  speech utterance, 147f
spectrotemporal receptive fields (STRFs), modulation tuning, 146–148
spectrum, pitch, 154f
speech. See also language acquisition
  left hemisphere dominance for processing, 174
  speech reading, 527
  tactile contributions to, 532–534
  visual contributions to, intelligibility, 528–529
  working memory maintenance, 405f
speech perception, 507–509
  acoustic properties, 509–512
  articulation in stop consonants, 511
  articulatory and motor influences on, 514–515

  functional architecture of auditory system for, 508f
  future directions, 519
  invariance for phonetic categories, 513–514
  lexical competition, 516–517
  lexical tone, 512–513
  mapping of sounds to words, 515–518
  nature of information flow, 517–518
  neural plasticity, 518–519
  phonetic category learning in adults, 518–519
  spectral properties, 511
  temporal properties, 510–511
  voice onset time (VOT), 511–512, 515
  voicing in stop consonants, 510–511
  vowels, 511–512
speech perception, multimodal, 544–545
  audiovisual, 534–538, 545n.8
  audiovisual speech integration, 531–532, 542–544
  auditory perception of speech, 530f
  auditory speech signal, 526–527
  cross-talk between senses, 524–525
  facial movements and vocal acoustics, 529–531
  linking articulatory movements and vocal acoustics, 525–526
  lip reading, 530f
  McGurk effect, 531–532
  simultaneous communication by modalities, 525f
  tactile aids, 533–534
  tactile contributions, 532–534
  visual contributions to speech intelligibility, 528–529
  visual speech signal, 527–528
spelling. See also reading and writing
  cognitive architecture of, 492–493
  mechanisms, 493
  representational model, 492f
split-brain patient, 45
spotlight, spatial attention, 238
stage model, skill acquisition, 417–418
state, attention as, 297
stimulus-driven, attentional shifts, 258
storage, stage of remembering, 474
streaming, sound segregation, 159
stroke, 481
  gustatory disorders, 203
  object identification, 568
  reading or spelling, 494
  spelling, 497, 498f
  visual field disorders, 195
Stroop task, 299, 302
subcallosal cingulate, 125

subcortical networks
  auditory system, 142
  neglect, 40
  spatial attention, 244
suicide, 473
Sullivan, Anne, 525f
superior colliculus, 30, 36
  attention, 322, 323
  audiovisual speech, 540
  orienting network, 301
  spatial attention, 238, 242, 244, 267
  visual processing, 277f, 283
superior parietal lobe
  agent operating in space, 36
  eye movement, 35
  limb movements, 37
superior parietal occipital cortex (SPOC), 282
superior temporal cortex, 36, 40, 187, 226, 327, 404–406, 512
superior temporal gyrus (STG), 40, 144f, 187
  auditory attention, 227
  auditory cortex, 144f
  language, 173, 174f
  memory, 225f, 402
  music, 115, 128f
  semantic memory, 365
  speech perception, 509
  speech processing system, 508f
  unilateral neglect, 333
  written language, 491, 494, 497, 500
superior temporal sulcus (STS), 144f, 566
  auditory brain, 200
  audiovisual speech, 540
  language, 173
  semantic memory, 366
  speech perception, 509
  working memory, 404, 405f
supplementary motor area (SMA)
  rhythm discrimination, 118
  sensorimotor coordination, 122
suppression, 160
supramarginal gyrus (SMG)
  reading and spelling, 494, 499
  speech perception, 509
  speech processing system, 508f
surfaces, response, 15f
syntactic phrase structure
  emerging ability to process, 175
  language, 185–187

syntactic violation paradigm, 186, 187
syntax, music and language, 121
systems consolidation, 438–441. See also consolidation
  slow-wave sleep and, 445–447
  temporal gradient of autobiographical memory, 439–440
  temporal gradient of semantic memory, 439
T
tactile contributions
  aids, 533–534
  speech, 532–534
Tadoma method
  individual communicating by, 533f
  sensory qualities, 534
tapping, music, 122
taste, qualities, 202
tears, chemosignaling, 104f
tele-assistance model, metaphor for ventral-dorsal stream interaction, 289
temporal lobe structures, speech perception, 509
temporally graded retrograde amnesia, 443
temporal order judgment (TOJ), 536f, 545n.7
temporal-parietal junction (TPJ), 238
temporal properties, speech perception, 510–511
temporal sensitivity, spatial attention, 248–250
temporal ventriloquist effect, 537
testing effect, consolidation, 451
Thai language, speech perception, 512–513
timbre
  music, 114–115
  music perception, 118–119
time
  deployment of attention in, 230, 231f
  music, 112, 113f
toddlers, attention network development, 302–306
Token Test, 44
tonality. See also music
  brain regions, 128f
  detecting wrong notes and wrong chords, 115–117
  dynamics, 117
  music, 112–114
  pitch and melody, 115
  tonal hierarchies, 113
tone deafness, amusia, 127, 201
tonotopy, auditory system, 143
top-down effects
  contextual, 65–66
  interfunctional nature, 67–69
  modulation, 61, 69
  visual perception, 60–61

  working memory, 408–410
topographagnosia, 200
topographic maps, brain, 29–31
trace hardening, consolidation, 436–437
transcranial direct current stimulation (tDCS), 366
transcranial magnetic stimulation (TMS)
  analysis method, 2
  mental imagery, 75–76
  motor imagery, 84
  object-related actions, 361
  phonetic categorization, 515
  prefrontal cortex function, 467
  repetitive TMS (rTMS), 77, 228, 422
  semantic memory, 366
  verbal working memory, 405
transient attention, perceived contrast, 250
traumatic brain injury (TBI), 473, 478, 481
trial-and-error learning, 481
trigeminal nerve, odorants, 89–90
Tulving, Endel, 353
tuning curve, attention, 222–223
tuning mechanism, attention, 261
tunnel vision, 196
U
unconscious competence, skill learning, 417, 418f
unification, 584
unilateral neglect, 328f, 328–334
  dissociation between neglect and extinction, 328–331
  lateralized and nonlateralized deficits within, 331–333
  Posner paradigm, 339f
  posterior parietal cortex organization, 333–334
unilateral posterior lesions, object recognition task, 47f
unimodal lexical-semantic priming paradigm, 181
V
ventral frontoparietal network, spatial attention, 245–246, 250
ventralmedial prefrontal cortex (VMPFC), attention, 229
ventral visual stream
  central vision, 322
  domain-specific hypothesis, 565–567
  episodic retrieval, 383
  functional organization of human, 12–14
  interaction with dorsal stream, 289–290
  landmark test characterizing, 321f
  linear gap vs. object in gap, 33f
  nature of functional organization in human, 21–23
  object identification, 32
  perception and vision with, damage, 279–280
  responses to shape, edges and surfaces, 15f

  saliency map, 321f
  vision, 194–195
  visual processing, 277f
ventrolateral prefrontal cortex (VLPFC), 128f, 364
  attention system, 119
  audition, 200
  musical syntax, 117
ventromedial prefrontal cortex (VMPFC), 123, 125, 229
verbal working memory, 402–407
  aphasia, 402–403
  language disturbances, 402–403
  phonological loop, 403–404
  prefrontal cortex (PFC) for younger and older adults, 459f
  short-term memory deficits, 406
vertical and horizontal axes, spatial vision, 197
vibration, pitch, 154f
vision, 2
  color, 196–197
  dorsal visual pathway, 194–195
  human representation of space, 29
  image formation on retina, 29–31
  motion perception, 198
  neural computations for perception, 283–288
  spatial, 197–198
  spatial contrast sensitivity, 196
  ventral visual pathway, 194–195
  visual acuity, 196
  visual adaptation, 196
  visual field, 195–196
visual acuity, 196, 324f
visual adaptation, 196
visual agnosia, 558
  bilateral damage, 32, 33f
  grasping for patient with, 279f
  neuropsychology, 322
  visual identification and recognition, 198–200
visual attention. See also attention
  capacity limitations, 259, 260–261
  change-blindness studies, 257–258
  feature-based, 258–259
  introduction to, 257–259
  limits of, 257–258
  location-based, 259
  object-based, 258
  tuning, 258–259
visual brain, functional specialization, 194–195
visual control of action, 273
  interactions between two streams, 289–290

  neural computations for perception and action, 283–288
  neural substrates, 277–280
  neuroimaging evidence for two visual streams, 280–290
  neuroimaging of DF’s dorsal stream, 281f
  neuroimaging of DF’s ventral stream, 280f
  reaching, 282–283
  reach-to-grasp movements, 274–276
  size-contrast illusion, 284–288
  superior parietal occipital cortex (SPOC) and visual feedback, 282
visual cortex, topographic representation, 30f
visual cues
  extraction of, in audiovisual speech, 538–540
  loudness, 156
  speech production, 528f
visual disorientation syndrome, 327
visual mental imagery
  early visual areas, 76–78
  higher visual areas, 78–82
  reverse pattern of dissociation, 80
  ventral stream, shape-based mental imagery and color imagery, 78–81
visual-motor psychophysics, 275
visual-motor system, 273. See also visual control of action
  cues for grasping, 275
  double-pointing hypothesis, 286–288
  grasping, 283, 285–286, 288
  grip aperture, 285, 288
  interactions of ventral–dorsal streams, 289–290
  neural computations for perception and action, 283–288
  Ponzo illusion, 286, 287f
  size-contrast illusion, 284f
visual object working memory, 399–400
visual perception, 4–5
  activate-predict-confirm perception cycle, 62–63
  ambiguous objects, 68–69
  bottom-up progression, 67f
  contextual top-down effects, 65–66
  error-minimization mechanisms, 63
  feedback connections, 61–62
  feedforward connections, 61–62
  global-to-local integrated model, 63–64
  importance of top-down effects, 60–61, 69–70
  interfunctional nature of top-down modulations, 67–69
  magnocellular (M) pathway, 64, 65
  prediction in simple recognition, 62–65
  spatial attention, 240–242, 247–250
  top-down facilitation model, 64
  triad of processes, 324f
  understanding visual world, 61–62

visual processing, division of labor, 277–278
visual sequence task, 305
visual spatial attention, 237
visual spatial localization, 197
visual-spatial sketchpad, working memory, 392, 393, 475
visual-spatial working memory, 396–399
visual speech signal, properties and perception of, 527–528
visual synthesis, 323
visual synthesis impairment, 344
visual tracking task, eye movements, 288
visual word form area (VWFA), 13, 495–496
visuo-spatial attention
  central vision, 323f
  visual perception, 324f
voice onset time (VOT), speech perception, 510–511, 515
voluntary control, spatial attention, 240–242
vomeronasal organ (VSO), 89–90
vowels, speech perception, 511–512
voxel-based morphometry
  brain analyses, 126, 127
  reading and spelling, 494
W
WAIS-R Block Design Test, 45
Wallach, Hans, 161
Warrington, Elizabeth, 354, 555
Weber’s law, 286, 288
Wernicke’s aphasia, 403
Wernicke’s area, 111
Williams’ syndrome, 32
Wilson, Don, 96
words
  frame of reference, 38–39
  mapping of sounds to, 515–518
  meaning in language acquisition, 180–182
  word-learning paradigm, 181
  word stress in language acquisition, 177–179
working memory, 389–390, 410
  aging process, 456
  audition, 162–163
  capacity, 400–402
  capacity expansion, 402
  central executive, 392, 407
  cognitive control of, 407–410
  compensation-related utilization of neural circuits hypothesis (CRUNCH), 459
  delayed response task, 394–395, 396f
  development of concept of, 391–394
  dorsolateral prefrontal cortex (DLPFC), 119
  emergence as neuroscientific concept, 394–395

  episodic buffer, 393–394
  event-related study of spatial, 399f
  frontal lobe, 34
  functional neuroimaging studies, 395–400
  hemispheric asymmetry reduction in older adults (HAROLD), 458–459
  lexical matches, 586, 587f
  linguistic, 593n.10
  maintenance, 391, 405f
  medial temporal lobes (MTL), 463
  musical processes, 120
  n-back task for sustained, 224, 225f
  neurological evidence for short- and long-term stores, 390–391
  phonological loop, 392–393, 407
  positron emission tomography (PET) studies, 397–398
  prefrontal cortex, 194
  prefrontal cortex (PFC), 394–395
  reactivation, 407–408
  recall accuracy as function of serial position, 390f
  role in skill learning, 422–424
  short-term and, 390, 474–475
  spatial, 326f
  spatial memory task, 397f
  syntactic department of, 588
  verbal, 402–407
  visual object, 399–400
  visual-spatial, 396–399
  visual-spatial sketchpad, 393
written language. See reading and writing
Z
zombie agent, dorsal system, 34
zoom lens theory, spatial attention, 238–239
