English Pages 338  Year 2019
Perception as Information Detection
This book provides a chapter-by-chapter update to and reflection on the landmark volume by J. J. Gibson, The Ecological Approach to Visual Perception (1979/2015). Gibson’s book was presented as a pioneering approach in experimental psychology; it was his most complete and mature description of the ecological approach to visual perception. Perception as Information Detection commemorates, develops, and updates each of the 16 chapters in Gibson’s original volume. The book brings together some of the foremost perceptual scientists in the field, from the United States, Europe, and Asia, to reflect on Gibson’s original chapters, expand on the key concepts discussed, and relate them to their own cutting-edge research. This connects Gibson’s classic with the current state of the field and provides a new generation of students with a contemporary overview of the ecological approach to visual perception. This book is an important resource for perceptual scientists as well as both undergraduates and graduates studying sensation and perception, vision, cognitive science, ecological psychology, and philosophy of mind.

Jeffrey B. Wagman is Professor of Psychology at Illinois State University, Normal, IL, USA. His research focuses on perception of affordances. He is a recipient of the Illinois State University Outstanding University Researcher Award and a Japan Society for the Promotion of Science Invitation Fellowship for Research in Japan. He is an Associate Editor of the journal Ecological Psychology.

Julia J. C. Blau is an Assistant Professor of Psychology at Central Connecticut State University, New Britain, CT, USA. She earned her doctorate in ecological psychology from the University of Connecticut. Her research focuses on the fractality of event perception, as well as the ecological approach to film perception.
Resources for Ecological Psychology A series of volumes edited by Jeffrey B. Wagman and Julia J. C. Blau [Robert E. Shaw, William M. Mace, and Michael Turvey, Series Editors Emeriti]
Social and Applied Aspects of Perceiving Faces
Alley

Perception and Control of Self-Motion
Warren/Wertheim

Michotte’s Experimental Phenomenology
Thinès/Costall/Butterworth

Perceiving Events and Objects
Jansson/Bergström/Epstein

Global Perspectives on the Ecology of Human-Machine Systems (Volume 1)
Flach/Hancock/Caird/Vicente

Local Applications of the Ecological Approach to Human-Machine Systems (Volume 2)
Hancock/Flach/Caird/Vicente

Dexterity and Its Development
Bernstein/Latash/Turvey

Ecological Psychology in Context: James Gibson, Roger Barker, and the Legacy of William James’s Radical Empiricism
Heft

Perception as Information Detection: Reflections on Gibson’s Ecological Approach to Visual Perception
Wagman/Blau
Perception as Information Detection Reflections on Gibson’s Ecological Approach to Visual Perception Edited by Jeffrey B. Wagman and Julia J. C. Blau
First edition published 2020 by Routledge 52 Vanderbilt Avenue, New York, NY 10017 and by Routledge 2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN Routledge is an imprint of the Taylor & Francis Group, an informa business © 2020 Taylor & Francis The right of the editors Jeffrey B. Wagman and Julia J. C. Blau to be identified as the authors of the editorial matter, and of the authors for their individual chapters, has been asserted in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988. All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers. Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe. Library of Congress Cataloging-in-Publication Data Names: Wagman, Jeffrey B., editor. | Blau, Julia J. C., 1982- editor. Title: Perception as information detection : reflections on Gibson’s Ecological approach to visual perception / edited by Jeffrey B. Wagman and Julia J.C. Blau. Description: New York, NY : Routledge, 2020. | Series: Resources for ecological psychology | Includes bibliographical references and index. Identifiers: LCCN 2019014293 (print) | LCCN 2019016466 (ebook) | ISBN 9780429316128 (eBook) | ISBN 9780367312954 (hardback) | ISBN 9780367312961 (pbk.) Subjects: LCSH: Gibson, James J. (James Jerome), 1904-1979. Ecological approach to visual perception. | Visual perception. | Environmental psychology. 
Classification: LCC BF241.G48 (ebook) | LCC BF241.G48 P47 2020 (print) | DDC 152.14–dc23 LC record available at https://lccn.loc.gov/2019014293 ISBN: 978-0-367-31295-4 (hbk) ISBN: 978-0-367-31296-1 (pbk) ISBN: 978-0-429-31612-8 (ebk) Typeset in Sabon by Wearset Ltd, Boldon, Tyne and Wear
Contents

List of Illustrations
List of Contributors
Foreword: Resources for Ecological Psychology
Preface

Introduction
  Jeffrey B. Wagman and Julia J. C. Blau

Part I
The Environment to Be Perceived

1 The Third Sense of Environment
  Edward Baggs and Anthony Chemero
2 The Triad of Medium, Substance, and Surfaces for the Theory of Further Scrutiny
  Tetsushi Nonaka
3 Ecological Interface Design Inspired by “The Meaningful Environment”
  Christopher C. Pagano and Brian Day
4 Challenging the Axioms of Perception: The Retinal Image and the Visibility of Light
  Claudia Carello and Michael T. Turvey

Part II
The Information for Visual Perception

5 Getting into the Ambient Optic Array and What We Might Get Out of It
  William M. Mace
6 The Challenge of an Ecological Approach to Event Perception: How to Obtain Forceful Control from Forceless Information
  Robert Shaw and Jeffrey Kinsella-Shaw
7 The Optical Information for Self-Perception in Development
  Audrey L. H. van der Meer and F. R. Ruud van der Weel
8 A Guided Tour of Gibson’s Theory of Affordances
  Jeffrey B. Wagman
9 Perceiving Surface Layout: Ground Theory, Affordances, and the Objects of Perception
  William H. Warren
10 Acting Is Perceiving: Experiments on Perception of Motion in the World and Movements of the Self, an Update
  L. James Smart Jr., Justin A. Hassebrock, and Max A. Teaford
11 Revisiting “The Discovery of the Occluding Edge and Its Implications for Perception” 40 Years On
  Harry Heft
12 Looking with the Head and Eyes
  John M. Franchak
13 James Gibson’s Ecological Approach to Locomotion and Manipulation: Development and Changing Affordances
  Karen E. Adolph, Justine E. Hoch, and Ori Ossmy
14 Information and Its Detection: The Consequences of Gibson’s Theory of Information Pickup
  Brandon J. Thomas, Michael A. Riley, and Jeffrey B. Wagman
15 The Use and Uses of Depiction
  Thomas A. Stoffregen
16 Revisiting Ecological Film Theory
  Julia J. C. Blau
Illustrations

Figures

2.1 Water velocity 60 s after an 86 mm-long fish (Lepomis gibbosus) passed the area
2.2 (a) Typical posture and movement of craftsmen during stone bead production. (b) Examples of ellipsoidal glass beads produced by expert (HQ) and non-expert (LQ) craftsmen. (c) Singularity spectrum estimated for expert (HQ) and non-expert (LQ) craftsmen
2.3 (a) Ventral surface of a flake detached by conchoidal fracture. (b) Flake terminology
2.4 Re-fitted elongated ovate fine-grained porphyritic phonolite cobble from Lokalalei 2C
3.1 A hand feeling a pair of scissors
4.1 (a) A diverging pencil of rays from a single reflecting point. (b) A few converging cones show how an optical image could be built from an infinite number of points on the object
4.2 The demonstration of a “retinal image” by Scheiner
4.3 Illumination is transparent to that which is illuminated
4.4 (a) Three holes drilled in a plywood box provided an illumination shaft. (b) Rods angled obliquely directly in front of the viewing aperture of the box
4.5 Photographs taken through the viewing aperture
4.6 Newton revisited
4.7 (a) Box with interior reflecting surfaces. (b) A top view schematic of the light in the interior of the foam-lined box. (c) A top view schematic of the light in the interior of the box shown in (a)
4.8 Light reflected from spacewalkers: observations
5.1 Pike’s Peak, Barr Trail
5.2 Ambient optic array diagrams
6.1 The transformation of the optic array defined by a locomotor movement
6.2 Illustrating the dual components of an ecosystem
6.3 The dual frame discrepancy hypothesis
6.4 Lee’s swinging room
7.1 A newborn baby participating in the weight-lifting experiment
7.2 A newborn boy only a few hours old is studying his hand intensely
7.3 A 4-month-old girl in deep concentration on the visual motion presented on the screen in front of her
7.4 Accelerating looming stimulus approaching the infants’ eyes resulting in increased theta-band oscillatory activity in the visual cortex of the extrinsic loom
8.1 Surfaces and affordances as interfaces
8.2 Affordances for support
8.3 Objects that afford sitting on (by humans)
8.4 Many animal species perceive affordances for reaching
8.5 Performing a given behavior creates and eliminates affordances
8.6 Relationship between physics and geometry and perception-action in standard (top) and ecological (bottom) approaches
8.7 Objects that afford grasping with one hand and with two hands
8.8 Perceptual experience vs. artificial measurement (M) devices
9.1 The horizon ratio
9.2 Geometry of the ground theory
9.3 Ground texture as an intrinsic scale for relative exocentric size and distance
9.4 Distance perception: experimental tasks, representative data, and the results of numerical simulation based on Equations 9.1 and 9.3
10.1 Depiction of the moving room used in Stoffregen and Smart (1998) and Smart, Stoffregen, and Bardy (2002)
10.2 (a) Depiction of the virtual hallway stimuli used by Warren, Kay, and Yilmaz (1996). (b) Depiction of large screen projection of optic flow (sinusoidal) used in Dijkstra, Schöner, and Gielen (1994)
10.3 Set-up and depiction of study for Littman (2011)
10.4 Potential alteration of perception and action when using VE
10.5 Depictions of optic flow conditions used in Smart et al. (2014)
12.1 (a) Mobile eye tracker devised by Land (1992). (b) Simultaneous eye and field of view recording for tracking eye gaze
12.2 (a) Lightweight, mobile adult eye tracker. (b) Mobile infant eye tracker
12.3 Different stages of eye, head, and body rotations during a 90° shift of gaze
12.4 Field of view for an average-height 8-year-old girl
12.5 Heat maps showing the frequency of face (top row) and toy (bottom row) locations within the head-centered field of view for 12- and 24-month-old infants during fixations
14.1 Top: Apparatus and procedure for the overhead reaching task. Bottom: Apparatus and procedure for the minimum pass-through-ability task
14.2 (a) No perceptual intent condition. (b) Perceptual intent condition of Experiment 1
15.1 A camera obscura
15.2 The chambered eye considered as a camera with a lens
15.3 Hand paintings in paleolithic rock art
15.4 Hand prints in development
15.5 The “City Churches” of Sir Christopher Wren, as seen from 35,000 feet
15.6 Locating the vanishing point
16.1 (a) A camera receives light from a scene and focuses it, then (b) a spinning shutter exposes each frame
16.2 (a) The original scene (in grayscale). (b) The negative. (c) The positive print
16.3 (a) A bright light is projected through the positive print. (b) A spinning shutter exposes, then blocks each frame three times
16.4 The scanning pattern of a television or computer screen
16.5 (a) No blur specifies the ball (and camera) are still. (b) Global blur specifies the camera is moving. (c) Localized blur specifies the ball is moving
16.6 Typical editing structure
16.7 Nested event structure

Table

1.1 Some distinctions between the world, habitat, and Umwelt
Contributors

Karen E. Adolph, Department of Psychology, New York University, New York, USA.
Edward Baggs, Bartlett School of Architecture, University College London, London, UK.
Julia J. C. Blau, Department of Psychology, Central Connecticut State University, New Britain, CT, USA.
Claudia Carello, Center for the Ecological Study of Perception and Action, University of Connecticut, Storrs, CT, USA.
Anthony Chemero, Departments of Philosophy and Psychology, University of Cincinnati, Cincinnati, OH, USA.
Brian Day, Department of Psychology, Butler University, Indianapolis, IN, USA.
John M. Franchak, Department of Psychology, University of California, Riverside, CA, USA.
Justin A. Hassebrock, Department of Psychology, Miami University, Oxford, OH, USA.
Harry Heft, Department of Psychology, Denison University, Granville, OH, USA.
Justine E. Hoch, Department of Psychology, New York University, New York, USA.
Jeffrey Kinsella-Shaw, Department of Kinesiology, University of Connecticut and Center for the Ecological Study of Perception and Action, University of Connecticut, Storrs, CT, USA.
William M. Mace, Department of Psychology, Trinity College, Hartford, CT, USA.
Tetsushi Nonaka, Graduate School of Human Development and Environment, Kobe University, Kobe, Japan.
Ori Ossmy, Department of Psychology, New York University, New York, USA.
Christopher C. Pagano, Department of Psychology, Clemson University, Clemson, SC, USA.
Michael A. Riley, Department of Psychology, Center for Cognition, Action, & Perception, University of Cincinnati, Cincinnati, OH, USA.
Robert Shaw, Center for the Ecological Study of Perception and Action, University of Connecticut, Storrs, CT, USA.
L. James Smart Jr., Department of Psychology, Miami University, Oxford, OH, USA.
Thomas A. Stoffregen, School of Kinesiology, University of Minnesota, Minneapolis, MN, USA.
Max A. Teaford, Department of Psychology, Miami University, Oxford, OH, USA.
Brandon J. Thomas, Department of Psychology, University of Wisconsin, Whitewater, WI, USA.
Michael T. Turvey, Center for the Ecological Study of Perception and Action, University of Connecticut, Storrs, CT, USA, and Haskins Laboratories, New Haven, CT, USA.
Audrey L. H. van der Meer, Developmental Neuroscience Laboratory, Department of Psychology, Norwegian University of Science and Technology (NTNU), Trondheim, Norway.
F. R. Ruud van der Weel, Developmental Neuroscience Laboratory, Department of Psychology, Norwegian University of Science and Technology (NTNU), Trondheim, Norway.
Jeffrey B. Wagman, Department of Psychology, Illinois State University, Normal, IL, USA.
William H. Warren, Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, USA.
Foreword Resources for Ecological Psychology
This series of volumes is dedicated to furthering the development of psychology as a branch of ecological science. In its broadest sense, ecology is a multidisciplinary approach to the study of living systems, their environments, and the reciprocity that has evolved between the two. Traditionally, ecological science has emphasized the study of the biological bases of energy transactions between animals and their physical environments across cellular, organismic, and population scales. Ecological psychology complements this traditional focus by emphasizing the study of information transactions between living systems and their environments, especially as they pertain to perceiving situations of significance to planning and execution of purposes activated in an environment. The late James J. Gibson used the term ecological psychology to emphasize this animal-environment mutuality for the study of problems of perception. He believed that analyzing the environment to be perceived was just as much a part of the psychologist’s task as analyzing animals themselves, and hence that the “physical” concepts applied to the environment and the “biological” and “psychological” concepts applied to organisms would have to be tailored to one another in a larger system of mutual constraint. His early interest in the applied problems of landing airplanes and driving automobiles led him to pioneer the study of perceptual guidance of action. The work of Nikolai Bernstein in biomechanics and physiology represents a complementary approach to problems of the coordination and control of movement. His work suggests that action, too, cannot be studied without reference to the environment, and that physical and biological concepts must be developed together. The coupling of Gibson’s ideas with those of Bernstein forms a natural basis for looking at the traditional psychological topics of perceiving, acting, and knowing as activities of ecosystems rather than isolated animals. 
The aim of this series is to form a useful collection, a resource, for people who wish to learn about ecological psychology and for those who wish to contribute to its development. The series will include original research, collected papers, reports of conferences and symposia, theoretical monographs, technical handbooks, and works from the many disciplines relevant to ecological psychology. Jeffrey B. Wagman and Julia J. C. Blau
Preface
James J. Gibson began the preface to The Ecological Approach to Visual Perception (1979/2015), by remarking “Vision is a strange and wonderful business.” This is no less true in the first few decades of the 21st century than it was in the last few decades of the 20th century. Vision continues to be a strange and wonderful business. We hope that this book and the chapters contained therein will serve as testament to this. Since the publication of The Ecological Approach to Visual Perception (and in large part because of it), Gibson’s ecological approach has flourished, with a book series (‘Resources for Ecological Psychology’), an international society (the International Society for Ecological Psychology) and an affiliated conference series, a dedicated journal (Ecological Psychology), and research centers and prominent researchers based in locations around the globe including the United States, Mexico, Brazil, the Netherlands, the United Kingdom, Japan, South Korea, Taiwan, and Australia, among others. We are certain that Gibson would be pleased with how far his ecological approach has progressed in the subsequent decades. Demonstrating this progress is one of our primary motivations for editing this volume. However, we are equally certain that Gibson would recognize that there is still much work yet to be done. Inspiring others to continue to (as he put it) “scramble through the underbrush” along with us in further developing the ecological approach is another of our primary motivations. In short, this book is our attempt to bring Gibson’s classic 1979 book into the 21st century and beyond. We are especially proud that this book will be among the first published in a revived ‘Resources for Ecological Psychology’ series that we will be co-editing. We thank the Series Editors Emeriti (Robert Shaw, William Mace, and Michael Turvey) for this opportunity. 
This volume is dedicated to all those who have been influenced by Gibson’s ecological approach to (visual) perception, and who have dutifully (and thoroughly!) shared this influence with us. In particular, it is dedicated to the faculty who shaped our thinking, research, and career paths during our respective graduate educations at the Center for the Ecological Study of Perception and Action at the University of Connecticut. This group includes (but is not limited to) Claudia Carello, Jay Dixon, Carol Fowler, Till Frank, Bruce Kay, Bill Mace, Claire Michaels, Bob Shaw, Michael Turvey, and many others.

We are grateful to the chapter authors. Quite literally, this project would not have been possible without their contributions. We also thank the following people who graciously served as external reviewers of the chapters: Joseph Anderson, Raoul Bongers, Felipe Cabrera, Pablo Covarrubias, Matthieu M. de Wit, Judy Effken, Alen Hajnal, Ángel Jiménez, Endre Kadar, Nam-Gyoon Kim, Kerry Marsh, Jon Matthis, Christopher Pagano, H. A. Sedgwick, Joanne Smith, Jim Todd, and Michael Turvey.

JBW also thanks Dawn McBride and Connor Wagman, whose love and support make all things possible. JJCB also thanks Eric, Grayson, and Gwendolyn Blau (my incredible family), Amy Rose Capetta (my magnificent QEF), and Alexandra Paxton (my brilliant sounding board).

We hope that our readers enjoy reading this collection as much as we enjoyed editing it.

Jeffrey B. Wagman and Julia J. C. Blau
Introduction Jeffrey B. Wagman and Julia J. C. Blau
James J. Gibson’s classic book The Ecological Approach to Visual Perception (1979/2015) was a landmark in experimental psychology. It was his most complete and mature description of the ecological approach to visual perception, an approach in which animals detect complex optical patterns that specify the behaviors that the animal can perform. It provided a fundamentally new way to conceive of both the process of visual perception and (perhaps more importantly) what there is to be perceived. Gibson’s book has influenced generations of scholars across many disciplines, but it has had its most significant and longest-lasting impact on scientists studying visual perception, especially those who, like Gibson, do so from an ecological perspective. As a testament to its enduring influence, the book was reprinted in 1986 and again in 2015 (as a ‘Classic Edition’ by Taylor & Francis). It has been translated into both German and Japanese and, as of this writing, has been cited nearly 35,000 times in the scientific literature.

Perception as Information Detection: Reflections on Gibson’s Ecological Approach to Visual Perception commemorates, but also builds on and updates, each of the 16 chapters in The Ecological Approach to Visual Perception. In this edited volume, representatives of multiple generations of perceptual scientists from the United States, Europe, and Asia each reflect on a particular chapter of Gibson’s classic 1979 book and describe how the concepts therein influenced both their respective research programs and the development of the science more generally. We hope that this volume both bridges the decades-long gap since the initial publication of Gibson’s book and brings his seminal work into the 21st century by providing an updated overview of the ecological approach to visual perception.
Perception as Information Detection: Reflections on Gibson’s Ecological Approach to Visual Perception is both a tribute to The Ecological Approach to Visual Perception and a reflection on the current state of the science.
The Environment to Be Perceived
1 The Third Sense of Environment Edward Baggs and Anthony Chemero
James Gibson begins his final book by making a distinction between two senses in which the world surrounds the animal. The whole project of The Ecological Approach to Visual Perception depends upon the distinction Gibson makes between the physical world surrounding animals and the environment surrounding them. The physical world contains everything from sub-atomic particles to galaxies, but it does not contain meaning. Perception does contain meaning. If the world that animals perceive is the physical world, it is mysterious why perception is meaningful. The traditional solution to this mystery, which Gibson rejected, is that animals make the meaning somehow. For animals like us, the traditional solution has it that our brains construct the meaning in private, and that is where the meaning lives.

In proposing that the world that animals perceive is the environment, not the physical world, Gibson proposes a different solution to this mystery. Perception is meaningful, on Gibson’s account, because there is meaning in the environment that animals perceive. Animals do not create meaning; they discover it in the environment. Gibson’s solution amounts to a radical rejection of the understanding of the nature of the world and its relationship to experience, perception, and knowledge that had been in place since the founding of modern science. Gibson’s solution is the key move that motivates the rest of his book, and indeed the whole of his ecological approach to psychology.

The distinction has, however, created new problems. The aim of this chapter is to argue that Gibson should have made a distinction between not just two, but three different senses in which the world surrounds the animal. The distinction Gibson (1979/2015) makes is between the following:

1. The physical world: Reality is assumed to be structured in various ways, prior to and irrespective of the existence of any animal living in it. This structure is properly the subject matter of physics and geology.
2. The environment: Animals themselves do not perceive the world of physics, of planets and atoms. Animals have evolved to perceive and act at the terrestrial scale, relative to objects and surfaces that are meaningful because they offer possibilities for action. To talk of an environment is to imply the existence of an animal that is surrounded. The two terms are mutually codependent, and it is in this sense that the animal and its environment constitute an animal–environment system.

This distinction is both appropriate and intuitively appealing. But it fails to include a further distinction: that between the evolutionary history of a species and the personal history of a particular individual animal. These are quite different things. We suggest that Gibson’s lumping together of these ideas has led to no small number of confusions, and to unnecessary controversy among proponents of ecological psychology. It is necessary to subdivide what Gibson called the environment into two quite distinct concepts. We propose to replace his two-way distinction with a three-way distinction between (1) the physical world; (2) the species habitat; and (3) the animal-specific Umwelt.
A Troublesome Observation

The additional distinction we are proposing here is not novel. Gibson himself was quite aware that his use of “environment” referred ambiguously to both the surroundings of an individual living animal (a token) and the surroundings of an idealized member of a species (a type), and he noted that this ambiguity could potentially cause confusion. Gibson writes (1979/2015, p. 3):

The environment consists of the surroundings of animals. Let us observe that in one sense the surroundings of a single animal are the same as the surroundings of all animals but that in another sense the surroundings of a single animal are different from those of any other animal. These two senses of the term can be troublesome and may cause confusion. The apparent contradiction can be resolved, but let us defer the problem until later. (The solution lies in the fact that animals are mobile.)

As is clear here, Gibson thought he could resolve the confusion by pointing out that animals move around. Gibson elaborates on this at the end of Chapter 3 (Gibson, 1979/2015, p. 38):

In the course of time, each animal moves through the same paths of its habitat as do other animals of its kind. Although it is true that no two individuals can be at the same place at the same time, any individual can stand in all places, and all individuals can stand in the same place at different times. Insofar as the habitat has a persisting substantial layout, therefore, all its inhabitants have an equal opportunity to explore it. In this sense the environment surrounds all observers in the same way that it surrounds a single observer.
What Gibson says here is more or less true. For most of the objects and surfaces around us, we can station our eyeballs relative to those surfaces at a point of observation that was previously occupied by one of our fellows. (There are exceptions to this: only you can see the end of your own nose from the position of your own eyeball; Gibson, 1967a.) But this does not really resolve the tension. There are many ways in which the world can look different to two different members of the same species, even if we all keep moving around in the appropriate way. This is not because we are trapped in our own private mental world, as the traditional view has it, but simply because we are different animals with different abilities.

Take a Chinese newspaper. Neither of the authors of this chapter knows how to read Chinese. We might be able to recognize the general shape of the characters and make the judgment, “That looks like Chinese writing,” or again we might recognize someone in a photograph printed on the front page. But we cannot read any of the headlines. The script is not meaningful for us in the way that it is for someone who can read Chinese. Similarly, the cockpit of a plane looks different to a trained pilot than it does to a novice or to a child. The pilot can do things with the buttons and levers that non-trained individuals cannot. Or, at an even more basic level, think about climbing a set of stairs. The stairs look different depending on whether you are a toddler or a long jumper or you are someone who has just had a hip operation or you are wearing high-heeled shoes. What’s missing, in Gibson’s elision of the two senses of environment here, is an acknowledgment that the way the world looks to us is to some extent a result of the way we currently are as individuals.
In claiming that we all share the same environment because we can in principle all see the same surfaces from the same set of positions, Gibson is treating perception as if it were an idealized process. He is suggesting that perception can be separated from the perception–action loop that comprises the activity of a living animal with a history of engagement and learning. This is odd, because Gibson’s overall project throughout the book is to deny that perception is a separate phenomenon in this way, and to promote the view that perception and action are inherently one and the same process. We suggest, then, that Gibson was right to point out the “troublesome” ambiguity of the term “environment,” and that he was prescient in noting that this ambiguity “may cause confusion.” Indeed, as we will show below, it has caused confusion. The tension can be resolved, but not in the manner that Gibson proposed. Instead, we need to make a further distinction, between: the environment as it exists for a typical member of a species, a habitat; and the environment as it exists for a particular living animal, an Umwelt. It is important to note that the three senses of environment do not define three distinct universes. Rather, the three senses are overlapping and nested. Briefly, the habitat is the physical world considered relative to a typical or ideal member of a species, i.e., it is a type of environment. It is a complementary term to a species. The habitat continues to exist even when
a given animal dies. The Umwelt, meanwhile, is the physical world considered relative to a particular living individual animal, i.e., it is a token instance of a habitat. Umwelt is a complementary term to a specific organism. When that organism dies, its Umwelt necessarily ceases to exist. (The term Umwelt originates in the work of Jakob von Uexküll, and refers to the world as it appears to a given animal. See Kull, 2009, for a discussion of the history of the term.)

We will not here attempt to provide watertight definitions of the three concepts of world, habitat, and Umwelt. Instead we offer a set of illustrative differences in Table 1.1. The upper part of Table 1.1 sets out some broad distinctions between the three concepts. The lower part summarizes the material below, in which we discuss how making the three-way distinction helps clarify the concepts of affordances, ecological information, learning, and the social.

Table 1.1 Some distinctions between the world, habitat, and Umwelt

Physical world | Habitat | Umwelt
Scale-free; atoms to galaxies | Terrestrial scale |
View from nowhere | Ideal perspective for typical member of a species | Has a first-person perspective
Exists prior to any animal encountering it | Exists for typical member of a species | Brought forth through development and active exploration; enacted
No complementary term | Complementary to “species” | Complementary to “organism”
Orthogonal to life | Complementary to a form of life | Complementary to a living creature
 | Affordance-resources; landscape of affordances; potential opportunities; affordances as dispositional properties of the habitat | Opportunities and solicitations; field of affordances; opportunities; affordances as relations in the Umwelt
Thermodynamic information; Shannon information | Information-about the habitat; conventional information (in the human habitat); saturated with linguistic behavior (in humans) | Information-for an exploring organism
History of pure structure, reconfigured by physical processes | Has a history in evolution; expanded by animal activities (niche construction); target of learning; differentiated in learning | Has a history in development; enriched by skill acquisition; location of learning; enriched by learning
Orthogonal to social processes; a set of points in space | Shared among members of a species; continues to exist when a given animal dies; behavior settings | Ceases to exist when a particular animal dies; places
Some Tensions and Their Resolutions

Replacing Gibson’s two-way distinction with the three-way distinction we propose here should allow us to re-evaluate how we think about the rest of Gibson’s book, and indeed about the 40 years of research that has followed. The entire program is predicated on treating “the environment” as an unproblematic fundamental. Ecological psychologists happily state that their unit of study is the “animal–environment system.” But this phrase maintains the ambiguity identified above: are we talking here about a typical member of a species residing in the environment (of that species), or are we talking about a particular animal in its environment (its Umwelt)? As a result of this ambiguity a number of contradictions and controversies have built up over the years. The fact that we are still wrestling with what are supposedly the core concepts of the field so many years after the program was instituted suggests that the basic distinctions that Gibson made might not have been quite fine enough.

We will focus here on four key areas where the contradictions bite most acutely. We suggest that the habitat–Umwelt distinction can help resolve the tensions that have arisen in each of these cases. We focus on affordances, information, learning, and the social.

Nobody Knows What Affordances Are Anymore

First, a tension has arisen in the literature over how we should think about affordances: are they dispositional properties, relational properties, or something else? Gibson does not get down to defining affordances until Chapter 8 of his book (see also Wagman, Chapter 8, in this volume). The way he initially defines affordances there is dependent on the distinction made in Chapter 1. “The affordances of the environment,” he writes, “are what it offers the animal …” (Gibson 1979/2015, p. 119). Notice that “the environment” is here treated as an unproblematic term.
Indeed, the way Gibson begins Chapter 8 seems to imply that affordances are properties of the habitat, in our terms: they are objective features to be found “out there.” They are a special way of describing the layout of surfaces and objects, perhaps. They are to be understood independently of any given animal, and are therefore not properties of the Umwelt. Thus, Gibson asks, “How do we get from surfaces to affordances? And if there is information in light for the perception of surfaces, is there information for the perception of what they afford?”
Consistent with this way of describing affordances as things that are “out there,” Turvey (1992) attempted to provide a formal definition of affordances as dispositional properties belonging to objects, surfaces, and the like. On this account, affordances are actualized when they come into contact with an animal that possesses a complementary dispositional property, namely, an effectivity. A flight of stairs has a property, “climbable,” that is actualized when a stair-climbing animal comes along and climbs the stairs. In a similar vein, Reed (1996a, p. 26) insists that affordances “are aspects of the environment of all organisms, not just features of the environment of one creature.” Reed proposes that affordances are in fact resources, or persisting features of the habitat of a species. Because of this, affordances are able to exert evolutionary selection pressure on populations across generations. So that is one story about affordances: they are “out there” properties that animals can encounter. The problem is that Gibson also suggested an entirely different way of thinking about affordances. A couple of pages into Chapter 8, he writes, “an affordance is neither an objective property nor a subjective property; or it is both if you like … It is equally a fact of the environment and a fact of behavior” (Gibson 1979/2015, p. 121). Again, summarizing the chapter, Gibson writes, “Affordances are properties taken with reference to the observer. They are neither physical nor phenomenal” (p. 135). This rather complicates matters. Now it seems as if Gibson is insisting that affordances are, after all, properties of the Umwelt of a given organism, in addition to being simply “out there” as properties of the habitat.
Attempting to demystify this neither-subjective-nor-objective quality of affordances, Chemero (2003) proposed that affordances should be treated not as simple “out there” properties of the habitat, but as relational properties that arise, or can potentially arise, in the encounter of an animal and its surroundings (see also Stoffregen, 2003a). One advantage of this way of thinking is that it allows us to talk about differences between individual members of a species. The plane affords flying to the pilot but not to the child because the affordance is defined relative to the individual’s skills or abilities (Chemero, 2009). Other theorists have expanded on this affordances-as-relations concept. Withagen et al. (2012) argued that affordances do not merely exist as bare possibilities for action, but can sometimes invite or solicit particular behaviors, as when a furniture store is arranged so as to guide customers through it along a particular path. Erik Rietveld and his colleagues propose that we should think of our surroundings as a “landscape of affordances,” that is, as a whole set of possibilities that constitute our ecological niche (Rietveld & Kiverstein, 2014). This is to be distinguished from the “field of affordances” or the finite set of possibilities that are available to us at a given moment, those that stand out as relevant to us (Bruineberg & Rietveld, 2014; van Dijk & Rietveld, 2017).
It remains an outstanding question whether a single formalization of the affordance concept can capture everything that Gibson intended it to explain (Michaels, 2003). For the present purposes, we suggest that at least some of the disagreement in the field can be resolved by pointing out that affordances serve a different purpose, depending on whether we are invoking them as properties of the habitat or as properties of the Umwelt. In the former case, they are dispositional properties or resources, they exist independently of any given animal, and they exert selection pressure. In the latter case, they are relational properties, they depend for their existence on the continued existence of a particular living animal, and they change as that animal develops new abilities and skills (and, conversely, they disappear as the animal “forgets” those skills or as its abilities degenerate with age or injury). We can have all of the features of affordances-as-resources without rejecting the features of affordances-as-relations, and vice versa. These two accounts of affordances are not in fact in conflict with one another. One is proper to the habitat, the other is proper to the Umwelt.

Nobody Knows What Information Is Anymore

A second outstanding tension has arisen in the context of attempts to understand Gibson’s ecological conception of information. Gibsonians have traditionally understood information as structure in energy arrays that serves a dual role: it is both about the structure in the world that caused the pattern in the array, and it is for an active organism, i.e., it serves to guide action in a concrete sense (Michaels & Carello, 1981, p. 37). This dual conception of information originates in the attempt by Gibson’s followers to formalize Gibson’s approach into a workable scientific program (Turvey, Shaw, Reed, & Mace, 1981).
Information, on this approach, should be understood as the central term in a 1:1:1 relation standing between structure in the world and structure in perception (Shaw & McIntyre, 1974; see Chemero, 2009, p. 111). Chemero (2009) refers to this two-way specifying relation as the symmetry principle. An important claim is that this relation can be read in both directions (Baggs & Chemero, 2018). Starting at one end: structure in the world specifies how energy (light, sound waves, etc.) is structured, and this in turn specifies what an animal perceives. Information in energy is information about the structure in the world, and the perceptual system samples this information. Starting at the other end: an animal is understood as an active explorer of its surroundings. In order to survive, an animal needs to seek out places and resources that are favorable to its ongoing existence. To do this, it must continually move about such that it is always perceiving a set of surroundings that are compatible with its needs. Perception can serve this role because of the chain of specifying relations between perception, information, and the world: the content of the animal’s perception specifies the
structure in surrounding energy arrays, and this structure in turn specifies whether or not there exists structure in the world to support the animal’s ongoing existence. As the animal moves around, it has to seek information for its own purposes.

A worry here is that this way of describing perception might be simply too rigid. Meaning is understood as a purely external phenomenon, as being located entirely “out there.” This seems to leave the animal with a limited role: to be a living animal is simply to respond to structure that one comes across. Indeed, this property of the ecological approach has proved to be off-putting even to some working in neighboring non-representational approaches to cognition, whom one might think would be sympathetic. For instance, Varela et al. (1991) dismiss Gibson’s approach on the grounds that Gibson’s conception of information relies on an unresolved dualism: because meaning is understood to be external, they suggest, there remains an explanatory gap between structure in the world and the biological and phenomenological processes that allow a living animal to make use of that structure. Gibsonians, these authors assert, are attempting to build a “theory of perception entirely from the side of the environment” (Varela, Thompson, & Rosch, 1991, p. 204). The suggestion is that a theory that contains both information-about and information-for is a theory that maintains a mind–body dualism.

One strategy for answering this critique has been to concede that information-about is a problematic concept, because it implies meaningful content “out there” that somehow has to be integrated into the living perception–action processes of an exploring animal. Van Dijk et al. (2015) argue that all we need is information-for:

“Information for” calls attention to the fact that ecological information need not be about anything—has no “aboutness”—prior to use, but it is for something to an active animal. It is for perceiving the environment, for acting on affordances and the likes.

We do not endorse this strategy.

Another strategy is to abandon the symmetry principle and allow that information can carry structure that is about more than simply physical surfaces and their layout. Several authors have proposed that the concept of information should be expanded so that it can account for things that we perceive but that do not specify structure in the world in the rigid, lawful ways demanded by the symmetry principle (Bruineberg, Chemero, & Rietveld, 2018; Chemero, 2009; Golonka, 2015; Withagen, 2004; Withagen & Chemero, 2009). According to these proposals, some information is conventional information. This allows, for example, that the label on a can of beer is informative about the can’s contents because the can exists within an ongoing system of social practices. This notion of conventional information means that information is in part dependent on our activities as a community of convention-makers and convention-observers. This seems like a necessary move if one wishes to develop an ecological approach to deal with cultural and language-involving phenomena. To capture the nature of linguistic phenomena, however, we will need an account of how we create and use linguistic structures, in addition to an account of how we recognize that objects such as beer labels have meanings about things beyond what they are made of. We will recommend below that the appropriate move is to treat language as a social phenomenon, i.e., as an interpersonal process not reducible to a perceptual process.

For now, we note that another possibility exists which may allow us to maintain all three concepts discussed here: information-about, information-for, and the symmetry principle (for non-conventional information). The solution is to understand information-about as a property of the habitat, while information-for should be understood as a property of the Umwelt. There is indeed structure in energy arrays that can potentially be used to guide activity. A fruit pie cooling on a windowsill emits a chemical trail into the air that can be detected by an animal with the appropriate olfactory system. There is information about the pie that can be said to exist in the habitat of animals of that type. The chemical composition of the pie is specified in the chemicals diffused in the air, as required by the symmetry principle and unlike in the case of the beer label. When a particular animal does detect the chemical trail and starts to navigate up the scent gradient toward the pie, we should understand the animal as using information for its own pie-seeking purposes. It is both proper to say that the pie is specified in patterns in the energy array, and that information is actively sought by a living animal.

How Do Animals Learn? How Do We Invent?
A third tension arises from a lack of clarity over the status of learning in Gibson’s account. This tension can be traced directly to the decision made by James J. Gibson and his wife Eleanor J. Gibson in the 1960s to divide their research efforts: the former would focus on the senses while the latter would focus on perceptual learning (Gibson, 1966a, p. viii). This division, made between the Gibsons for their own convenience, has had a long-lasting consequence: the developmental part of the ecological account has ever since remained somewhat separate from the core perception–action program (Rader, 2018). To understand the tension here, it is useful to turn to a foundational paper in perceptual learning. Gibson and Gibson (1955) argue that the traditional way of framing the learning process is incorrect. The traditional view has it that learning about the world is building up and maintaining an internal store of knowledge about external facts: learning is a process of enrichment. The Gibsons propose that a more useful way of thinking
about learning is to acknowledge that what the animal comes into contact with (the stimulus) is already richly structured, and the task that the learner is faced with is to seek out distinguishing features in the stimulus that allow the learner to recognize things in the world as being different from one another: learning is a process of differentiation. The go-to examples here are how wine experts come to recognize differences in taste between different wines, and how workers tasked with categorizing the sex of newly hatched chicks come to recognize male from female. In both instances, the learning process seems to be highly implicit: skilled perceivers in these tasks may not necessarily be able to articulate precisely what it is that allows them to carry out their categorizations. Yet they are discriminating, and they must be doing so by having learned to attend to some variable, or set of variables, in the perceptual stimulus.

The problem here is that while what these expert perceivers are doing is discriminating perceptual structure, not enriching an internal model, this does not actually constitute an account of the learning process. Indeed, in a sense, it is misleading, because from a phenomenological standpoint it remains true that learning results in a richer experience of the world. The pilot who has learned how to fly the plane has a richer set of behavioral possibilities open to her, she has a richer Umwelt, even if this is arrived at in part through learning to perceptually discriminate between the levers and buttons on the flight control panel. We should even acknowledge here that this enrichment is possible because of changes in the organization of the pilot’s body and nervous system, though of course this does not imply that the pilot has built a second, mental version of the control panel inside her head that she is using to control the plane. There is just more in the pilot’s Umwelt.
In recent years, several authors have developed a theory of “direct learning” which attempts to formalize the discrimination viewpoint and turn it into a workable empirical program (Jacobs & Michaels, 2007; Jacobs, Silva, & Calvo, 2009). The innovation here is the suggestion that the optimum variable that an actor should be attending to, in learning to carry out some task, is already specified in the information space of possible solutions to the task. To become an expert in some task, the actor has to explore this information space and seek a trajectory toward some optimum solution. This is a powerful view of learning, but it is a view that is still dependent on there being pre-existing structure “out there” that the animal is able to explore. It is hard to see how this same approach might be applied to the kind of generative behaviors that children routinely engage in, namely, those involving language or pretend play. Children do not merely explore a space of pre-given solutions, but invent and create. The developmental story remains an underdeveloped part of the ecological approach. For now we note only that the Gibsons’ dichotomy between differentiation and enrichment accounts of perceptual learning is not fine enough. Again it is useful to think of the differences between the
habitat and the Umwelt. In the habitat, a space of possible actions and discriminations exists before any particular organism comes into existence, and this information space may well serve as the target of learning. In the Umwelt, meanwhile, the world comes to have shape in part because of changes in the animal: learning does indeed result in an enrichment of possibilities from the perspective of the living animal.

Can We Have a Gibsonian Account of the Social?

The fourth issue we wish to discuss here is the problem of reconciling Gibson’s approach with an account of social phenomena. Owing largely to the work of Harry Heft, effort in this area has been focused on taking Gibson’s theory of perception as a starting point and trying to unite it with the “other” ecological psychology, that of Roger Barker and his colleagues (Barker, 1968; Heft, 2001; McGann, 2014). Barker’s project was to attempt to classify and understand the behaviors that a typical person engages in while carrying out their daily life. He and his team of researchers investigated this by observing people’s actual behavior in non-laboratory settings, namely, in a small town in Kansas over a number of years. The central insight of this project is that a person’s behavior is best predicted at a given instant not by what other events have just occurred, as is implied in the methodology of the stimulus–response psychology that was dominant at the time. Behavior is instead best predicted by the places where those behaviors occur. A given child’s behavior in a classroom or a candy store will closely resemble that of her peers when her peers are in the same behavior setting. To know how the child will likely behave, it is more useful to know where she is right now than it is to know what kinds of behaviors she has engaged in earlier in the day. The tension here is this.
Perceiving a behavior setting, say, a reading group, as a specific instance of that behavior setting seems to require that we as individual actors and observers in some sense stand outside that setting, that we categorize it as an instance of “reading group” before we are able to enter into it and participate. But this cannot be correct. Reading groups do not exist simply as things that are furnished by the natural world. Rather, they exist because we enact them. How can it be that such settings are at once things that we encounter and things that we create? In a recent paper, Heft (2018) briefly acknowledges this tension:

[A] behavior setting is an emergent property of interdependent patterns of action over time among individuals and “milieu,” using Barker’s terminology. From this vantage point, it can be seen that a particular child does not enter into the behavior setting, as an individual might enter an enclosure, but rather that the child joins a behavior setting as a participant and in doing so contributes to its ongoing functioning.
This description still does not quite resolve the tension, however. To join a behavior setting is still to enter into something that is already ongoing. This still leaves unexplained how it is that new behavior settings ever emerge, as they must. Here it is again helpful to invoke the habitat–Umwelt distinction. The resolution lies in allowing that there is indeed structure in the habitat: behavior settings exist as discoverable, observable units of collective behavior. There are post offices and train stations, and French classes. But in the Umwelt we do not need to perceive that we are in a given behavior setting. Rather, the behavior settings that exist around us are what we grow into. The Umwelt of a given individual is shaped by the places where that individual dwells, and by the history of interactions that the individual participates in. Costall (1995) argued that the social does not have to be so much incorporated into the ecological approach as demystified. This argument remains a valuable one. In the Umwelt of a child, everything is already social. In fact, the social comes first. The structure of the child’s surroundings is, prenatally, entirely constituted by and mediated through its mother’s body. Following birth, the child’s surroundings are shaped by the ongoing social practices of caregivers and community. Only through many years of education and learning can the child come to master its settings and take control of its interactions in a manner we are inclined to label “adult.” In the case of language, we are both confronted with a pre-existing set of linguistic conventions and resources (a community of speakers of a language), and we are enactors of that language.
A child language learner must learn to control her or his own linguistic behavior with reference to the activities she or he engages in, progressing from babbling to vocalizing words to skillfully using novel syntactic structures as a means of directing the attention of self and others. It is reasonable to say that certain aspects of a natural language are properties of the habitat: dictionaries, grammars, prejudices about particular dialects, and ways of talking, etc. We should not, however, jump to the conclusion that an individual’s “personal” language is simply a property of her or his Umwelt. Language is not simply “out there,” from the perspective of an individual speaker. It is not simply something we perceive, but also a form of acting (Baggs, 2015). As adults, our language is so thoroughly integrated into our self-regulation processes and thinking that it must be something other than simply a set of “external” resources. Take counting. When, as children, we learn to count, we start by reproducing an auditory sequence that is modeled by more skilled speakers around us, “one, two, three, four….” We reproduce this sequence in interaction with other people. Later, this same sequence can serve to structure our own activities: tracking sets of objects, telling time, measuring, etc. What began as a social phenomenon has entered into the process of self-regulated activity. In early childhood this behavior is
enacted “out loud.” As the child becomes more skilled, she or he learns to count “in her or his head.” Language has to be “internalized,” in the sense that what starts off as an entirely public behavior in the child comes, over time, to be integrated into her or his self-regulation processes, in turn, enabling new behaviors that were not previously available (Vygotsky, 1978). The problem of explaining how the social comes to enter into individual self-regulation processes is the problem of explaining the qualitative shift that separates the human form of life from that of other species.

The Ecological Approach as a Theory of the Structure of the Habitat

All of this leads us to a point where we must re-evaluate exactly what Gibson is trying to do in The Ecological Approach to Visual Perception. Of course, in a basic sense, he is simply writing a book about visual perception. But in another sense, he is attempting something much more than that. According to Reed (1988, p. 2), Gibson concentrated for so many years on direct perception because he believed “that a breakthrough in the understanding of perceptual awareness and knowledge would carry in its wake a new approach to the whole of psychology.” Gibson’s followers have pursued this broader aim: the ecological approach as a new approach to psychology itself. Indeed, when Gibsonians eventually set up their own journal, they called it Ecological Psychology, the suggestion being, perhaps, that the approach had already matured into a fully elaborated scientific research program. Gibson himself, though, never used the term “ecological psychology,” at least not as far as we are aware, although he did contribute to a collection put together by his followers whose subtitle was ‘Towards an Ecological Psychology’ (Shaw & Bransford, 1977). It remains unclear in what sense it is appropriate to talk of an “ecological psychology” at all.
In the opening sentence of his Principles of Psychology, William James provides the following definition: ‘Psychology is the Science of Mental Life, both of its phenomena and of their conditions’ (James, 1890). Following this definition, Gibson’s Ecological Approach can perhaps be read as an unprecedented account of the conditions of mental life: it is an account of the structure of things “out there”—the structure that an animal can potentially come into contact with. What it generally is not is an account of what animals actually do when they come into contact with their surroundings. Another way of putting this is to say that Gibson gives an account of the structure of the habitat. Surfaces, layout, and points of observation in an ambient optic array: these are all idealized descriptions of the structures available to a potential animal. Indeed, it is arguably the case that what Gibsonians have been doing ever since is describing the structure of behavior in the habitat. Take the well-studied example of braking to avoid a
collision (Fajen, 2005a). The aim of this research program has been to uncover optimal control laws or strategies that a person can use to control their braking. In other words, the aim is to characterize the information space of a task. Warren (2006) notes that this strategy imposes a limit on the level of predictive explanatory power that we can hope to achieve: “researchers may need to be satisfied with theories that capture the dynamical ‘deep structure’ of behavior—the morphology of attractors, repellers, and bifurcations for a given task—rather than one that can precisely predict individual behavior on particular occasions.” Another way to understand this description of the methodology in terms of “deep structure” is to appreciate that control laws reside in the habitat. What is not included is an account of what occurs whenever a particular living thing actually encounters some piece of worldly structure and attempts to act with reference to it (though see Fajen, 2008, for a discussion of learning in the specific case of braking to avoid a collision). In recent years, a number of authors have been attempting to combine Gibson’s insights with those of the enactivist approach instituted by Maturana and Varela (Maturana & Varela, 1980; Varela et al., 1991); the aim here is to establish a full-blown post-cognitivist science of the mind (Chemero, 2009; Di Paolo, Buhrmann, & Barandiaran, 2017; Fultot, Nie, & Carello, 2016; Hutto & Myin, 2017; McGann, 2014; van Dijk, Withagen, & Bongers, 2015). The central insight of the enactivist approach is that mind is a living process (Thompson, 2007).
That is, mental activity is self-producing, in the sense that the organism produces and maintains a boundary between itself and the world; it is asymmetrical in the sense that the organism does something to its surroundings across the boundary that it has itself established; and it is normative in the sense that the animal acts in accordance with norms that are established, for example, by the biological need to act in an adaptive manner (Baggs, 2018; Di Paolo et al., 2017). The combined ecological–enactive program has yet to fully take off. It is possible that long-standing confusion over the scope of the ecological approach has been the problem here. It may be important to recognize that the bulk of research in the ecological approach has been targeted at the habitat, that is, at describing structure as it exists for an ideal member of a species. The enactive approach, by contrast, aims to provide a theory of the animal-in-its-Umwelt. Properly understood, there is no reason why these two approaches cannot inform and support one another. (For a longer version of this argument, see Baggs & Chemero, 2018.) Can we still be realists, if we are to talk about individual animals in their Umwelts? Gibsonians have always considered themselves to be realists, in the sense that perception–action is said to be targeted directly at structure “out there” in the world, not mediated via some mental copy of that structure. Enactivists, by contrast, typically align themselves with constructivism, which is an explicitly anti-realist position, according to which
living organisms bring forth their surroundings from a subjective point of observation (Riegler, 2005). We believe it is possible to resolve even this contradiction. The solution will be to pursue a dialectical version of realism (cf. Chapter 6 of Di Paolo et al., 2017). Structure exists at the habitat level, and this structure exists independent of any single living organism. For a living organism, meanwhile, the world is viewed from a particular perspective. In the course of our activities we gradually uncover more of reality by a process of learning and interaction, and at the same time, through our activities, we alter the structure of the habitat for ourselves and those around us. The Umwelt is not an inner copy of the habitat, but it is the habitat considered from the point of view of the ongoing activity of a particular living organism.
Practical Ecological Interventions
The success of a behavioral science research program should ultimately be measured by what it can be used for. The distinction we are proposing provides a useful way of categorizing the kinds of things we might be able to do with such a science. The habitat/Umwelt distinction implies two ways in which the ecological approach can inform practical interventions in everyday life. First, we can reconfigure the habitat in order to make it easier for actors to carry out some task. Second, we can reconfigure the animal by educating them to attend to their surroundings, their Umwelt, in a particular way, while leaving the habitat unchanged.
We focus on a single example here: interventions designed to encourage safer road crossing. Existing work within the ecological approach has attempted to address this problem in the Umwelt. Lee et al. (1984) describe a training system to teach children how to safely cross existing roads. Lee and his colleagues set up a “pretend road” on the sidewalk alongside an actual road, and asked children to cross it “as if crossing the adjacent road in the face of oncoming vehicles.” The authors argue that this is a useful training regime, and they conclude that “children should be trained in crossing in the presence of traffic at an early age.”
Here is a curious point. The empirical program in the ecological approach has largely targeted the habitat, as described above. When Gibsonians have attempted to apply their insights outside the laboratory, however, they have tended to treat the habitat as unchanging, and have instead targeted the Umwelt as the locus of intervention. Lee et al. (1984) treat roads as if they were a natural feature of reality to which children must be trained to adapt. An underexplored type of intervention is in the habitat. 
Marshall (2018) asks a useful question here: why are roads so much safer in Australia than they are in the United States, as measured by fatality statistics (the fatality rate in the US is more than twice as high)? The discrepancy appears to be down to differences in the structure of the habitat. Partly, it is argued, there is stricter enforcement of regulations in Australia. But the roads are
also physically designed in a different way in the two countries. In Australia there are a great many more roundabouts at places where in the US there would be a light-controlled four-way intersection. Roundabouts are inherently safer because they do not give rise to opportunities for the most severe types of inter-vehicle collisions: head-on and T-bone crashes (Rodegerdts et al., 2010). The roads in Australia are also generally less wide, meaning that pedestrians have less width to cross, and drivers are more cautious in their speed because a narrower travel corridor makes greater demands on drivers in terms of accurately controlling their position in the road. The tools of the ecological approach are well suited to capturing why certain features of road infrastructure are more successful, i.e., safer, than others. More effort should be devoted to exploring interventions in habitats outside the laboratory using these tools.
Conclusion
In this chapter we have argued that Gibson’s distinction between the physical world and the animal environment should be refined. The latter term should be subdivided. We should appreciate the difference between the species-specific habitat and the animal-specific Umwelt. Making clear this distinction allows us to resolve several long-standing tensions in the field. The notions of affordance and information can be better understood if we allow that habitat resources are different from active relational engagements, and that arrays of energy are different from self-regulating organisms actively seeking out structured patterns. These notions are not in conflict with one another, but are complementary. Similarly, the phenomena of learning and social interaction can be better understood by keeping distinct the setting of behavior and the process through which that behavior comes to take shape over time as an individual becomes a skillful participant in a social practice.
One last question: does this mean that we have to stop talking about “animal-environment systems?” If the term “environment” is ambiguous, then should we abandon it in favor of more accurate terms—“species–habitat” and “animal-Umwelt” systems? This does not strike us as an attractive conclusion. We like the phrase “animal-environment system.” We are used to it. It is part of the family. For the purposes of this chapter, we have avoided talking about “the environment.” But perhaps the term need not be problematic as long as its meaning is appropriately constrained by context.
Acknowledgments
Edward Baggs was supported by funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No 706432. Anthony Chemero was supported by the Charles Phelps Taft Research Center.
2 The Triad of Medium, Substance, and Surfaces for the Theory of Further Scrutiny Tetsushi Nonaka
Logic works perfectly well once mankind has developed adequate language. But logic is helpless if it has to develop this adequate language … As for the formulation of adequate logic, there must be a language which does not impoverish the real situation. It is terrible that in our technocratic age we do not doubt the initial basic principles. But when these principles become the basis for constructing either a trivial or finely developed model, then the model is viewed as a complete substitute for the natural phenomenon itself. (from the Kyoto Prize Lecture by Israel M. Gelfand, 1989, p. 19. Copyright 2009, Tatiana V. Gelfand and Tatiana I. Gelfand)
Charles Sanders Peirce (1955, p. 6) once remarked that every great work of science affords some exemplification of the defective state of the art of reasoning of the time when it was written. Perhaps one of the great services of The Ecological Approach to Visual Perception (Gibson, 1979/2015) to psychological science was bringing to the fore a set of fundamental assumptions which have long been, and perhaps still are, invisible to psychologists who live amidst them. For generations, we have been taught that the apprehension of the world depends on the perception of space and time, and that “perception begins in receptor cells that are sensitive to one or another kind of stimulus energy” (Kandel, Schwartz, Jessell, Siegelbaum, & Hudspeth, 2013, p. 445). Such a narrative is so deep in psychology as to be almost unquestionable: Fish don’t talk about water. But what if these fundamental assumptions are irrelevant to the activity of perception that orients the organs of perception and explores the cluttered environment? What if our perceptual systems, through evolution, are coordinated to a particular scale of nature that is at a different level from the description provided by physics of atoms and objects in space? What if the level of the description of the world that our perceptual systems fit into is so unique that it deserves investigation in its own right? Gibson suggested that psychologists can approach the problem of perception in an entirely different light, once an adequate description of the
environment to be perceived which does not impoverish the real situation has been developed. In Chapter 2 of his 1979 book, “Medium, Substances, and Surfaces,” Gibson attempted to develop a “new” description of the environment that our perception and behavior fit into. “It will be unfamiliar, and it is not fully developed, but it provides a fresh approach where the old perplexities do not block the way” (Gibson, 1979/2015, p. xi). The aim of this chapter is two-fold: to reflect on the development of a “new” description of the environment—the triad of medium, substances, and surfaces, and to highlight its significance to the study of perception and behavior.
Useful Vision
Almost immediately after the publication of The Senses Considered as Perceptual Systems (Gibson, 1966a), Gibson began working on the revision of The Perception of the Visual World (Gibson, 1950a), which, as we now know retrospectively, ended up in an entirely new book (Gibson, 1979/2015). In the James J. Gibson papers (#14–23–1832) at the Division of Rare and Manuscript Collections of Cornell University Library, New York, there is a folder entitled “Notes for Revision of Visual World,” which contains over 50 pages of handwritten notes for the “new book” written by Gibson in 1967. One of the notes reads as follows:
For New Book (“Useful Vision”?) The ability to distinguish environmental substances, the forms of matter, the material composition of the world around us, what things are made of—the ability to distinguish among the main types and to identify earth, clay, rock, vegetation, bark, leaves, fur, feathers, skin—this capacity is extremely important to animals and the pickup of such information does not depend on a single sense organ or perceptual system. These natural substances should not be thought of chemically but ecologically. They are important to animals for what they afford. They are specified (1) by the texture of the surface and the pigment color of it (even better than by the “shape” of the “object”). They are specified (2) by their effluvia, if volatile. And they are specified (3) by their specific gravity (density), by hardness-softness (resistance to deformation) roughness-smoothness, and thermal conductivity. Accordingly, they can be identified by seeing the surface, by smelling the substance, and by palpating (“feeling”) the substance … (Gibson, 1967b)
Although this handwritten memo was probably not intended for publication, its message is clear. A theory of visual perception, first and foremost, needs to account for “useful vision.” It must account for our remarkably
veridical ability to deal with the surfaces and substances, objects and events of the environment that we are, like it or not, obliged to cope with and use as a species and as individuals (E. J. Gibson, 1994, p. 503). What different types of clay, rock, vegetation, bark, leaves, fur, feathers, or skin afford to us may not be distinguishable at first glance. But they can potentially be identified through the activity of perceiving—looking around, palpating, listening to, or sniffing them. Unlike neural signals converted from stimuli impinging on receptors, the activity of perceiving, involving adjustments of organs, is a function of a set of meaningful properties of the environment that are selectively attended to by animals. Naturally, an effort to study the activity of perception thus conceived demands the description of the functional referent of perception (Holt, 1915). Otherwise, it would be like watching a tennis match with half the court occluded from view. To study useful dimensions of perception, the next logical step would be to find out the adequate level of the description of the world that our perception and behavior fit into. An attempt to develop a “new” description of the environment, it seems, was a natural consequence of Gibson’s effort at understanding useful vision.
The Triad
It is likely that Gibson wrote the first draft of Part I of the new book—“The Environment to be Perceived”—between January and November 1971, and came up with the title of the book, An Ecological Approach to Visual Perception (albeit beginning with “An” instead of “The”) during this period. On January 12, 1971, Gibson (1971a) sent a tentative outline of the new book to the editor at the Houghton Mifflin Company that was quite different from the final version of the book: the book was then entitled Everyday Visual Perception, and Part I of the book was “A New Theory of Perception,” for which Gibson left notes that contrast a theory of sensation-based perception with one of information-based perception (e.g., Gibson, 1967c). 1971 seems to have been a key year in the development of ideas that culminated in Gibson’s (1979/2015) final book. It was the year in which a series of notes on affordances (Gibson, 1982a) and “Do We Ever See Light?” (Gibson, 1971b), which later formed important parts of the 1979 book, were written, and in which the phrase “an ecological approach to visual perception” appeared in a note (Gibson, 1971c). The use of the term “ecological psychology” by Gibson is also found in the note entitled “A Preliminary List of Postulates for an Ecological Psychology,” written in June 1971 (Gibson, 1971d). After writing a series of notes, by November 1971, Gibson had written up an early version of the new Part I, “The Environment to be Perceived,” which included all the basic contents of the finished version of the part, including the triad of medium, substances, and surfaces and the nine ecological laws of surfaces (Gibson, 1971e).
Normal perception involves the possibility of further exploration, which we are aware of whether or not the possibility is taken advantage of (Gibson, 1978a). But, what makes us aware of the possibility of further exploration in the first place? What makes us aware of the layout of the environment in and out of sight? What makes it possible for animals to discover the potentially meaningful features of the environment that have not yet been taken advantage of? These are the questions that share the same fundamental issue which cannot be resolved without restoring the active observer to the world in a way physics never did (Gibson, 1973). Gibson’s following thought experiment illustrates well what gets lost in the description of the world by physics in terms of space, time, matter, and energy: What if a wholly passive animal were in a wholly frozen world? Or, conversely, what if an animal were in “an environment that was changing in all parts and was wholly variant, consisting only of swirling clouds of matter” (Gibson, 1979/2015, p. 10)? In both cases, it would be impossible for the animal to disentangle a set of variables that are specific to the world out there (i.e., independent of the point of observation). These hypothetical worlds are not the environment for perceiving animals. But, note that “in both extreme cases there would be space, time, matter, and energy” (p. 10). Restoration of the active observer that scrutinizes the environment requires a fundamental reworking of the description of the world, or in Gibson’s (1971f) words, “the permutation of the orchard with Newton’s apple!” The crux of this permutation was the replacement of matter and bodies in empty space with the triad of (1) medium (the gaseous atmosphere); (2) substances that are more or less substantial; and (3) surfaces that separate the substances from the medium (Gibson, 1979/2015, p. 27).
The Medium
The recognition of the air as a medium (for terrestrial animals) allows the distinction between potential and effective stimulation. Radiant energy, acoustic energy, and chemical energy are propagated through the medium, which provides the ambient sea of stimulus energy in which animals can move about. Instead of inquiring whether one model of inferring the causes of sensation aroused by stimuli is better than another, with the notion of medium, we can now begin to study activity before sensations have been aroused by stimuli, an activity that orients the organs of perception and explores the sea of potential stimulation for the information external to the perceiver (Gibson, 1982b, p. 398). Unlike points in space defined by an arbitrary frame of reference, the ambient energy array surrounding each potential point of observation is unique (Gibson, 1979/2015, p. 13). As the observer moves from one point of observation to another, the optical array, the acoustic array, and the chemical array are transformed accordingly (p. 13). This provides the opportunities for an active observer to
move in the medium to detect invariants underlying the transforming perspectives in the ambient array surrounding a moving point of observation. Among the recent advances that have furthered our understanding of the notion of medium for perceiving animals is the insight provided by Turvey and Fonseca (2014), whose research, probably for the first time since Aristotle (1907), brought to light the problem of medium in haptic perception. They hypothesized that interconnected structural hierarchies composed of tensionally prestressed networks of our bodies that span from the macroscale to the microscale—from muscles, tendons, and other connective tissues to various micro-elastic structures such as a network of collagen fibers—constitute the medium for the haptic sense organs of animals (Turvey & Fonseca, 2014). Like the air being the medium for sound, odor, and reverberating flux of light, despite being on the other side of the skin, the presence of isometric tension distributed throughout all levels of interconnected, multiscale networks makes available the opportunities for an active perceiver to spontaneously transform the distribution of forces throughout the tensionally integrated system in such a way as to detect the invariant patterns that specify the source of mechanical disturbances. Turvey and Fonseca’s (2014) rediscovery of the medium of haptic perception resonates with the recent surge of interest in the mechanical basis of information and pattern formation in a wide range of fields—mechanobiology, soft robotics, sensory ecology, and rheology (e.g., Hanke, 2014; Ingber, 2006; Iwamoto, Ueyama, & Kobayashi, 2014; Rieffel, Valero-Cuevas, & Lipson, 2010). 
Because the form of any structure, whether a vortex flow of water or a living tissue, is determined through a dynamic interplay of physical forces, the distinct pattern of forces characteristic of a mechanical disturbance may convey a physical form of information that constrains perception and the behavior of an agent (Ingber, 2005). One good example of this is the hydrodynamic perception by aquatic animals (Hanke, 2014). Harbor seals, for instance, are known to use their vibrissae to haptically discriminate the water movements left behind by prey or predator that have passed by at an earlier point in time, and perceive the motion path, size, and shape of the object that caused the trail (Hanke, Wieskotten, Marshall, & Dehnhardt, 2013) (Figure 2.1). A point worth emphasizing is the fact that although the informative patterns of water movement are there to be perceived by an animal, there are many reasons that the animal may not attend to the information. The harbor seal running away from the white shark may not attend to a pattern of water movement that specifies the presence of salmon that can be preyed upon. Near the surface of clear water during daytime the animal may attend to optical information without taking advantage of hydrodynamic information. The notion of medium makes it possible for us to recognize this distinction between the existing information available to the animal and the information selectively picked up by the perceptual activity of the animal.
Figure 2.1 Water velocity 60 s after an 86 mm-long fish (Lepomis gibbosus) passed the area. Bold arrow indicates swimming direction. Source: From Bleckmann et al. (2014): Figure 1.4a, adapted with permission from Springer-Verlag.
According to Gibson (1979/2015, p. 13), “if we understand the notion of medium, … we come to an entirely new way of thinking about perception and behavior.” Recent years have witnessed a growing interest in the problem concerning the laws of variation in efficient exploratory behavior to obtain the information about aspects of the environment relevant to the task at hand (e.g., Boeddeker, Dittmar, Stürzl, & Egelhaaf, 2010; Nonaka & Bril, 2014; Stephen, Arzamarski, & Michaels, 2010; Viswanathan, Da Luz, Raposo, & Stanley, 2011). In general, exploration requires fluctuations, and fluctuations increase in time. A growing body of research suggests that the fluctuations in exploratory behaviors exhibit the property of superdiffusion, where the fluctuation grows faster than normal diffusion governed by a Gaussian probability density function (Nonaka & Bril, 2014; Stephen et al., 2010; Viswanathan et al., 2011). Nonaka and Bril (2014) studied the exploratory movement of expert stone beads craftsmen in India who shape a bead by a series of hammer strikes on a stone held against the pointed tip of an iron bar (Figure 2.2). In the field experiment, the craftsmen shaped the ellipsoidal beads made of two different materials (carnelian stone—a familiar material, and
glass—an unfamiliar, much more fragile material) in the workshops where they normally work. The use of the novel material must require the acute sensitivity to the properties of the material, where the finer the exploration, the better the probable outcome of the activities that follow. In the exploratory tapping movement of the craftsmen during the preparatory phase of the task, they found (1) the presence of long-range correlations where the variance of the displacement time series of the hand wielding the hammer grows superlinearly in time; and (2) underlying multiplicative interactions between fluctuations at different temporal scales indicated by the heterogeneity of scaling properties over time. When faced with the unfamiliar condition using unusual, fragile material, the exploratory hammer tapping movement of highly skilled experts who were able to cope with the situation exhibited a pronounced increase in the long-range temporal correlations. In contrast, the wielding behavior of less skilled experts—those who could not shape the glass beads—exhibited a significant loss of long-range correlations and reduced heterogeneity of scaling properties over time, which robustly discriminated the groups with different skill levels (Figure 2.2). Alterations in multiscale temporal structure of movement fluctuations were apparently associated with changes in the situation differently depending on the level of expertise (Nonaka & Bril, 2014). The empirical evidence derived from this field experiment, albeit a special case of an unusually complex skill, may well serve the purpose of constraining the possible accounts of active touch. Traditionally, active touch is explained by the activity of neurons that compare central motor commands with peripheral sensory feedback during manual exploration (Kandel et al., 2013, p. 
524)—an account which exclusively focuses on sensory receptors and the nervous system without reference to the architecture of the body where they are embedded. But the problem expert craftsmen face is unlikely to be that of associating central and peripheral signals. The presence of nonlinearity arising from multiplicative interaction across fluctuations at different timescales would greatly complicate such a process, with no simple correspondence between the central motor command, the generated movement, and the peripheral sensory feedback that arises as a consequence of the movement. Instead, the result is a much better fit to the alternative scenario of active touch that takes into account the medium for the haptic perceptual system (Turvey & Fonseca, 2014), in which efficacy of active touch depends on the tuning of the whole system including the multiscale tensile states of the body, the structures of which are transformed by exploratory behavior in such a way as to discriminate the invariant patterns that specify the source of mechanical disturbances from all the other patterns that do not specify the source (Gibson, 1966a, p. 55).
Source: From Nonaka and Bril (2014). Adapted with permission from the American Psychological Association.
Figure 2.2 (a) Typical posture and movement of craftsmen during stone bead production. (b) Examples of ellipsoidal glass beads produced by expert (HQ) and non-expert (LQ) craftsmen. (c) Singularity spectrum f(α(q)) (–1.4 ≤ q ≤ 3) estimated for expert (HQ) and non-expert (LQ) craftsmen in the conditions using carnelian stone (black spheres) and glass (gray spheres) as raw materials. The vertical and horizontal bars are standard errors of the means for f(q) and α(q), respectively, of multiple realizations of each condition.
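The notion of superdiffusion invoked above (fluctuations whose variance grows faster than linearly in time) can be made concrete with a toy simulation. The sketch below is only an illustration under stated assumptions: it is not the multifractal analysis reported by Nonaka and Bril (2014), the function names (`msd_exponent`, `persistent_steps`, and so on) are hypothetical, and the "persistent" walker is a deliberately simple stand-in whose steps keep their direction for long stretches. It estimates the exponent with which the mean squared displacement grows with the time lag; a value near 1 corresponds to normal diffusion, and a value above 1 to superlinear growth.

```python
import math
import random


def cumulative(steps):
    """Turn a list of increments into a displacement time series."""
    pos, out = 0.0, [0.0]
    for s in steps:
        pos += s
        out.append(pos)
    return out


def msd_exponent(series, lags):
    """Least-squares slope of log(mean squared displacement) on log(lag)."""
    xs, ys = [], []
    for lag in lags:
        sq = [(series[i + lag] - series[i]) ** 2
              for i in range(len(series) - lag)]
        xs.append(math.log(lag))
        ys.append(math.log(sum(sq) / len(sq)))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def uncorrelated_steps(n, rng):
    """Independent Gaussian increments: the normal-diffusion baseline."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def persistent_steps(n, rng, flip_prob=0.01):
    """Unit increments that rarely reverse direction (assumed toy model)."""
    direction, out = 1.0, []
    for _ in range(n):
        if rng.random() < flip_prob:
            direction = -direction
        out.append(direction)
    return out


rng = random.Random(0)  # fixed seed so the run is repeatable
lags = [1, 2, 4, 8, 16]
alpha_normal = msd_exponent(cumulative(uncorrelated_steps(20000, rng)), lags)
alpha_persistent = msd_exponent(cumulative(persistent_steps(20000, rng)), lags)
print("baseline exponent:", round(alpha_normal, 2))
print("persistent exponent:", round(alpha_persistent, 2))
```

Over the short lags used here, the persistent walker's exponent comes out well above that of the uncorrelated baseline. Its correlations decay at longer lags, so it should not be read as a model of the long-range correlations found in the craftsmen's movements, only as a picture of what "grows superlinearly in time" means.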
Substances and Surfaces
Substances refer to the portions of the environment that are in the solid or semisolid state, which are structured in a hierarchy of nested units—such as rock, soil, sand, mud, clay, oil, tar, wood, minerals, metal, and the various tissues of plants and animals (Gibson, 1979/2015, p. 15). Unlike the medium, the substances differ in all sorts of ways—in hardness, resistance to flow, mass per unit volume, resistance to breaking, the tendency to regain the previous shape after deformation, the tendency to hold the subsequent shape after deformation, or the degree to which they absorb light (p. 16). Different substances have different biochemical, physiological, and behavioral effects on the animal. For example, the affordances of a substance for human behavior, such as making useful things, depend on these properties of substances. The ability to distinguish among the different substances is extremely important to us, and we often succeed in picking up the relevant information through the process of scrutiny. The central question is: where is the action in our scrutiny of the substances of the environment (E. J. Gibson, 1994, p. 503)? Brick, for example, can be visually or haptically inspected. But no one has ever seen or touched the inside of a brick (Feynman, 1985). Every time you break the brick, you only see or feel its surface. Nevertheless, the surface of a brick has a characteristic texture that specifies what it is made of, and can be distinguished from other substances by seeing and touching its surface. Likewise, “the surfaces of the substances from which primitive men fashioned tools have different textures—flint, clay, wood, bone, and fiber” (Gibson, 1979/2015, pp. 21–22). The surface is something of fundamental importance to our perception, especially to visual perception. It is not only because we cannot see the inside of a thing, but also because our vision fails without the illuminated surfaces of a thing. 
In fact, all we ever see is the surfaces of substances, and the only way we see illumination is by way of “the surface on which the beam falls, the cloud, or the particles that are lighted” (p. 48; see also Carello & Turvey, Chapter 4, in this volume). There is experimental evidence that seeing the surfaces depends on the structure of the ambient array which has different intensities in different directions (Gibson, Purdy, & Lawrence, 1955). There is also evidence that the eye would be unfocusable in homogeneous ambient light (i.e., in the unusual case where light that surrounds a point of observation would not be different in different directions). In consequence, “the possessor of the eye could not fix it on anything, and the eye would drift aimlessly” (Gibson, 1979/2015, p. 47). Our visual system can only be adjusted to and oriented to surfaces, and this ability relies on the fundamental fact that “all persisting substances have surfaces, and all surfaces have a layout”—the first of the nine ecological laws of surfaces proposed by Gibson (1979/2015, p. 19). Furthermore, “the surface of a substance is where a mechanical action like collision is
located, where chemical reactions take place, where vaporization occurs, or solution, or diffusion into the medium” (p. 86). Then, the answer to the question raised in the beginning of the foregoing paragraph has to be as follows: The surface is where the action is—where the activity of perception is focused, adjusted to, and oriented toward. We pay great attention to the layouts of substantial surfaces and what they afford. To paraphrase Reed (1996b, p. 122), for many centuries, human beings have been chopping, cutting, tying, molding, sharpening, honing, dyeing, shaping, etching, washing, scrubbing, brushing, sweeping, raking, shaving, ironing, mowing, polishing, scraping, grinding, and much more. These are operations that modify the layout of a substantial surface so as to alter its affordance for human life—its utility or function—which turn rough into smooth, and make available the edge or the vertices that afford cutting or piercing. The general problem of how we perceive the meaningful features of the substances and surfaces of the environment is an important problem of psychology. There is, however, a lack of experiments that investigate this rich behavioral repertoire involving the scrutiny of substantial surfaces of the environment. While experimental testing of behavior with virtual surfaces displayed on a monitor is becoming increasingly common, it is still rare to discuss the ways in which our perception and behavior differ depending on the reality of the surfaces involved (i.e., real, material surfaces vs. virtual surfaces). 
In what follows, drawing on the research on the ancient skill of stone tool-making that takes advantage of fracture mechanics exhibited by a specific type of brittle substance, I hope to illustrate how the focus on substances and surfaces could shed light on the important problem of human cognition, and how it could expand the intellectual horizons of ecological psychology by connecting with other relevant developments in understanding human ways of life.
Flaking Stone
The earliest known evidence of the alteration of a surface layout by humans to change its affordances is the modification of a natural, hard, and rigid mineral material (cobbles, pebbles, and rock fragments) by means of percussion with hard stone hammers (Režek, Dibble, McPherron, Braun, & Lin, 2018). Thanks to its excellent durability, hard stone records very old traces of physical actions applied to it (Pelegrin, 2005). The archeological records clearly show that by at least 2.6 million years ago (and likely much earlier, e.g., Harmand et al., 2015; McPherron et al., 2010), early human species were already habitually fracturing stones so as to alter their utility or function—stone knapping, as archeologists call it. From the very beginning, the aim of stone knapping was to obtain a specific layout of surfaces—the razor-sharp edges—that afford a function which is absent or extremely rare in the natural world: the cutting function (Roche,
Blumenschine, & Shea, 2009). This is inferred from the fact that stone fragments (called flakes), detached from a block of stone (called a core) found in the old archeological sites, unequivocally display the characteristic layout of surfaces resulting from the specific fracture mechanism called a conchoidal fracture. This mechanism leaves razor-sharp cutting edges and conspicuous bulbs of percussion on the fracture plane that are unlikely to have been formed naturally (Roche, 2005; Semaw et al., 1997) (Figure 2.3a). Conchoidal fracture refers to the phenomenon producing a Hertzian cone. It arises from the fracture of a specific type of brittle substance—a homogeneous and isotropic crypto-crystalline structure (e.g., flint, fine-grained silicified sandstone) or glasses (e.g., obsidian) with no preferred planes of weakness (Pelegrin, 2005). Conchoidal fracture requires loading at a point near the angular edge of the block of raw material with its
Figure 2.3 (a) Ventral surface of a flake detached by conchoidal fracture. (b) Flake terminology. Source: (a) Adapted from “Lithic flake on flint, with its fundamentals elements for technic description,” Copyright 2006, José-Manuel Benito Álvarez. From José-Manuel Benito Álvarez, Wikimedia (https://creativecommons.org/licenses/by-sa/2.5/legalcode).
exterior angle (called the exterior platform angle) less than 90° (Figure 2.3b), and the to-be-flaked exterior surface (called the flaking surface) needs to be flat or slightly convex to allow for the propagation of the energy transmitted by the loading event (Pelegrin, 2005). Imagine you are a prehistoric human. What would you do to obtain precious cutting tools by fracturing stones? First, you need to look for natural homogeneous, isotropic, and brittle material that can be fractured conchoidally to obtain razor-sharp cutting edges, and a hammerstone that is suitable for aimed striking. In addition, the raw material needs to have a peculiar natural layout of surfaces with angular edges and a more or less flat surface (e.g., cobbles with a flat surface as opposed to a convex one) required for conchoidal fracture. Then, in order to make the most of the precious raw material (i.e., to produce as many usable flakes as possible), you need to make sure that the specific layout of surfaces of the core that allows further flake removals is preserved after each flake removal. The foregoing is what has been found in the in-depth analysis of the traces of stone tool-making behavior by early humans who lived near the western margin of present-day Lake Turkana in Kenya (the archeological site of Lokalalei 2C) around 2.34 million years ago (Delagnes & Roche, 2005; Roche et al., 1999). Surprisingly for stone fragments that are so old, it was possible to re-fit many of the pieces found in Lokalalei 2C back together into an original cobble. By re-fitting the flakes, Roche and her colleagues examined the sequence of flake removals, that is, the order in which these flakes were originally removed (Delagnes & Roche, 2005).
An early stone knapper who fractured the cobble in Figure 2.4, for example, selected a fine-grained phonolite cobble that has a flat surface (Face A) as opposed to a highly convex surface (Face B) resulting in edges with acute angles around the flat side of the cobble (indicated by the gray dotted line around Face A). Roman numerals in Figure 2.4 present the order of a series of flake removals, and the arrows show the direction of strike with a hammerstone. Even to a novice’s eyes, it is obvious that the knapper obtained these flakes not randomly but by following a certain set of rules. For example, the knapper carried out all the flake removals on the flat face (Face A) but the final attempt (V on Face B), and aimed the blows at the edges with acute angles shown by the gray dotted line. In addition, after a series of flake removals, the knapper switched to the opposite edge on the same face, and alternately struck flakes from the two opposing edges (Figure 2.4). Had flaking been carried out from one direction, the flat surface would have been quickly lost and the remaining core would have been wasted just after a couple of flake removals. In this example, the knapper maintained the surface flat by alternating the direction of strike, thus providing the opportunities for further flake removals to the very end. It was also found that the cores and flakes in Lokalalei 2C do not show any impact damage from failed percussions, such as might be caused by faulty estimation of the opportunities for flaking (Delagnes & Roche,
Figure 2.4 Re-fitted elongated ovate fine-grained porphyritic phonolite cobble from Lokalalei 2C. Flaking was carried out on a large and relatively flat natural face (Face A) from the longest available edge and a shorter adjacent edge, which are the only portions of the perimeter of the core with suitable natural striking angles (gray dotted line). The series of flakes were alternately struck from these two edges. Source: From Delagnes and Roche (2005), Figure 8. Adapted with permission from Elsevier.
2005). Such systematic flake production, seemingly following a certain set of rules (called débitage), was not the result of trying out multiple ways to flake stones and at last succeeding in some way.
Seeing the Outcome at the Outset How might we understand such regularity apparent in the archeological record of flake removal? Delagnes and Roche (2005, p. 465) wrote, “the débitage of successive series clearly hinges upon conscious planning.” But even if such an operational sequence was consciously planned in detail, if the outcome of each strike was unpredictable and out of control, it would have been impossible to carry out such an orderly sequence as shown in Figure 2.4. The French archeologist Texier (1995, p. 652), addressing this
point, once wrote, “if the knapper is able to organize a débitage, he has already a good skill and knows exactly the consequences of a strike given to such a core.” The fundamental question, then, is not so much how the early tool-makers consciously planned the order of the core reduction sequence, but how they foresaw the outcome of each hammerstone strike at the outset. Without this ability to foresee and control the path of fractures that would result from each strike, a phenomenon such as organized débitage could never have been realized. Nonaka, Bril, and Rein (2010) systematically tested the skill of modern stone knappers to foresee and control the path of fracture that would result from their own flaking actions. Nonaka et al.’s hypothesis was that expert knappers could foresee and control the outcome by tuning their actions to the lawful regularities that exist in a conchoidal fracture. Previous conchoidal fracture experiments using a protocol in which a steel ball is dropped on plate glass documented an invariant relation among particular variables (Dibble & Pelcin, 1995): The size of the flake depends on the combination of two variables, namely the angle of the edge and the distance between that edge and the point of the percussion (see Figure 2.3b). The force with which the hammer strikes the core does not affect the flake size once it is above the threshold of fracture initiation (i.e., the minimum amount of force needed to remove a flake of a given size). The threshold of fracture initiation depends on flake size, which in turn depends on the striking location (Dibble & Režek, 2009). Although they are by no means the only variables that affect the outcome of conchoidal fracture, importantly, these variables are all under the direct control of the knapper. Do knappers, prior to the detachment of a flake, attend to such lawful relations so as to foresee and control the consequence of a strike given to a core?
In the experiment, participants (including prominent replica craftsmen) were asked to draw on the surface of a flint core the outline of the fracture path expected to result from the blow they would deliver to the core (Nonaka et al., 2010). After drawing the outline of the fracture path, the participants were then asked to proceed to actually fracture the stone as they had expected. The result of the experiment indicated that the task was surprisingly difficult. Most knappers who could produce usable flakes could not control the path of the fracture resulting from the strike given to a core. But there were a few expert knappers who proved capable of controlling the fracture path almost exactly as they had expected. Confirming the aforementioned hypothesis, it turned out that the outline of the fracture path drawn prior to the flake removal by those experts who succeeded in the task already exhibited the lawful relation among the three variables that reflect the constraints of conchoidal fracture: the angle of the exterior edge, the distance from the point of percussion to the exterior edge, and the flake dimensions (see Figure 2.3b). In contrast, no such relation appeared in the outlines drawn by the other knappers. The result suggests
that those who are able to control conchoidal fracture are focusing their attention on specific salient features of the layout of stone surfaces that are lawfully related to the constraints of fracture mechanics. Simply put, novices intended to detach rather impossible flakes, while experts intended to detach feasible flakes from the outset. It was further found that the experts hit the core with a lower kinetic energy of the hammerstone at impact than the nonexperts, and that only experts varied the kinetic energy of the hammerstone at impact in relation to the to-be-detached flake size (Nonaka et al., 2010). Experts were aware of the properties of a core and the mass of the hammerstone at hand in the sense that they detached flakes without excessively overshooting the required kinetic energy at impact. This means that those who controlled the conchoidal fracture were tuning their action into yet another higher-order functional relation among the relevant variables of surface layout of the core, potential flake size, and the required kinetic energy determined by these variables, which also depended on the mass of the hammerstone. In summary, foresight in the control of stone flaking was shown to depend on the continuous participation of the knapper’s behavior in the lawful regularity of conchoidal fracture. This participation, in turn, is made possible by the perceptual attunement of the knapper to discriminate a specific layout of surfaces of the core that has consequences for the future fracture event, which is observable but not easily attended to by everyone (Nonaka et al., 2010). The path of fracture resulting in a flake is entailed by the natural unfolding of the system of which the behavior of the knapper is a component part (cf. Stepp & Turvey, 2010).
In other words, foresight in this ancient skill is a matter of focusing the perception and behavior of an actor on the informative structure that specifies the inevitability of the environmental event in which the actor takes part. This study directs us to reconsider the primacy of the rich opportunities provided by the substances and surfaces of the environment, toward the use of which our perceptual and action systems continue to evolve and develop (Nonaka, 2012).
Epilogue A “new” description of the environment to be perceived (the triad of medium, substances, and surfaces that allows for both persistence and change) provides a principled way to frame the existing possibility of further exploration, of scrutinizing, or of looking more carefully to extract invariants (Gibson, 1978a). Without taking this possibility into account, the activity of perceiving would easily get confounded with the activity of guessing, which occurs in a rather atypical situation where further scrutiny is wholly prevented. This confusion, in turn, would lead to the reduction of the laws of perception to those of guessing. Luminous, mechanical, or chemical energy is structured by the substantial environment and becomes ambient in the medium. The ambient sea of
energy around each of us is usually very rich in what we call pattern and change, which provides the inexhaustible reservoir of potentially informative invariants that lies open to further scrutiny (Gibson, 1979/2015, p. 233). At this level of description of the environment, what has been known tacitly is made explicit: The activity of perception is “open-ended,” and you can keep discovering new features and details about the environment by the act of scrutiny (p. 245). Unlike guessing based on a few cues or clues, normal perception is not based on “going beyond the data,” as long as one can look again, or go back and look again (Gibson, 1978a). What the triad of medium, substances, and surfaces offers us is a theory of unlimited further discovery for perception. We will have to make a fresh start.
Acknowledgments I thank Tatiana V. Gelfand for giving me the permission to quote I. M. Gelfand’s words from his Kyoto Prize Lecture as an epigraph. The writing of this chapter was in part supported by JSPS KAKENHI Grant Numbers JP18K12013 and JP18KT0079 from the Ministry of Education, Science, Sports and Culture, Japan, awarded to Tetsushi Nonaka.
3 Ecological Interface Design Inspired by “The Meaningful Environment” Christopher C. Pagano and Brian Day
Physics, optics, anatomy, and physiology describe facts, but not facts at a level appropriate for the study of perception. In this book I attempt a new level of description. (Gibson, 1979/2015, p. xi)
James J. Gibson’s 1979 book, The Ecological Approach to Visual Perception, remains the seminal piece for the development of ecological psychology. In Chapter 3, “The Meaningful Environment,” Gibson describes the environment in which animals perceive and act. The primary aim of the chapter is to develop an ecological nomenclature for surface layout and objects. Gibson believed that the environment is comprised of surfaces and mediums, which he describes in Chapter 2 of his book (see Nonaka, Chapter 2, in this volume), rather than an environment comprised of formal plane geometry. This distinction is very important for the development of an ecological psychology because it describes the environment as something that is both acted upon and acted within by an organism. In later chapters, he will define the concept of an affordance as a mutual relationship between the organism and the environment (see also Chemero, 2003; Stoffregen, 2003a; Turvey, 1992). Chapter 3 of Gibson’s book sets up the possibility of that mutuality by defining one half of that coupling: the environment. In what follows we discuss some of the important ideas in Gibson’s chapter that have been refined and extended in the decades following the original publication of the book. We relate these ideas to selected topics within human factors, with an emphasis on ecological interface design.
The Meaningful Environment What is typically seen as the principal contribution of Gibson’s ecological approach to perception is that his approach makes direct perception possible; that aspects of the world, such as its three-dimensional nature, can be perceived without mental representation or mental inference despite the
seemingly destructive mapping from the world to the two-dimensional retina (e.g., Lombardo, 1987). This point alone, however, can be made using traditional plane geometry and traditional physics (e.g., texture gradients, optic flow, etc.). In Chapter 3, “The Meaningful Environment,” Gibson goes beyond overcoming the destructive mapping for plane geometry and sets up the argument that what is available in the stimulation that the eye receives is not just information that is lawfully related to the actual distances and sizes of elements in three-dimensional space. Rather, this stimulation is lawfully related to the meaning that the terrain, inclines, enclosures, objects, etc., hold for an organism that interacts with those surfaces and objects. Gibson’s push toward meaning is made clear from the first few lines of the chapter, when he states, “If what we perceived were the entities of physics and mathematics, meanings would have to be imposed on them. But if what we perceive are the entities of environmental science, their meanings can be discovered” (1979/2015, p. 28, original emphasis). Rather than establishing a viable direct realism for perceptual properties that can be described by formal plane geometry, which would in itself be a daunting task, Gibson creates a direct realism for a meaningful environment. Indeed, defining the environment as meaningful is necessary for his direct perception. Meaning is given ontological status; it is part of the actual surrounding world rather than existing merely within the mind of the perceiver. Gibson’s description of ecological reality is a fitting way to end Part I of the book, which focuses on the environment that is coupled to the organism and perceived by the organism. The aim of Gibson’s third chapter is to highlight that the environment, comprised of surfaces, is laid out in a way that has inherent meaning (i.e., affordances) for the behavior of humans and animals.
This approach contrasts starkly with previous conceptions of the environment as comprised of the abstract concepts of mathematical space to which meaning must be added by the observer. Without a terminology to describe the inherently meaningful environment to be perceived, the development of direct perception in Part II of the 1979/2015 book would not be possible. Gibson conceived of the visual system as the eyes in the head on a body supported by the ground (e.g., Gibson, 1966a), but Chapter 3 stresses that this system is embedded in an inherently meaningful environment. If vision is for exploration and discovery, then the logical starting place is to describe what it is about the environment that needs to be discovered. Gibson provides this description of the environment in Chapter 3. By developing an ecological nomenclature for surface layout and objects, Gibson establishes that the object of perception is a meaningful environment. Gibson defines the environment in such a way that it can serve as one half of the animal-environment mutuality that he describes in later chapters. He goes beyond a direct perception of surface geometry by incorporating a requisite direct perception of meaning. In later chapters,
the “meaning” of environmental surfaces described in Chapter 3 will come to be understood as affordances (see Wagman, Chapter 8, in this volume). It is often assumed that Gibson’s idea of direct perception made possible his concept of affordances. However, Chapter 3 makes it clear that it is the meaningfulness of the environment that makes direct perception possible. Since this meaningfulness is inherent in affordances, it is actually Gibson’s concept of affordances that makes direct realism possible. That affordances precede direct perception is illustrated by the way some researchers have accepted a rudimentary concept of affordances, and applied it fruitfully, while continuing to reject the possibility of direct perception (e.g., Norman, 2013). Conversely, it does not seem possible to accept Gibson’s direct realism and reject the possibility of affordances. For Gibson, ecological reality consists of meaningful things to be discovered. Gibson consistently uses the word “discovered” to illustrate how meaning is uncovered in an environment, rather than having to be imposed upon an environment by the observer. Situating the environment as inherently meaningful lays the basis for a claim he makes later that value is inherent in the animal’s relationship with the environment: “The perceiving of an affordance is … a process of perceiving a value-rich ecological object. Any substance, any surface, any layout has some affordance for benefit or injury to someone. Physics may be value-free, but ecology is not” (Gibson, 1979/2015, pp. 131–132). Recognizing the environment as meaningful and describing aspects of the environment (e.g., surfaces) that can be perceived lays the groundwork for affordances. In Chapter 3, Gibson uses the concept of affordances to discuss the opportunities that are available to an actor in an environment.
The main aim of his chapter is to offer a description of the environment that leads logically to the concept of affordances so that his direct realism may follow.
A Nomenclature for Surface Layout Gibson contrasts the words used to describe a geometric layout with those used to describe an ecological environment or habitat to move from abstract geometry toward a surface geometry. For example, Gibson proposes a terminology for a theory of surface layout that includes ground, open environment, enclosure, detached and attached objects, edges, and place, among others. He then picks out certain features of the environment that are pertinent to all animals that use visual information to locomote and otherwise act upon the environment. Many of these features of the environment have been studied extensively, such as the ground (Matthis & Fajen, 2014), stairs (Konczak, Meeuwsen, & Cress, 1992; Warren, 1995), doorways and apertures (Wagman & Malek, 2008, 2009; Warren & Whang, 1987), and gaps in the support surface (Burton, 1992). Place has received much less attention. However, recently there has been a renewed call to subject place to the
same empirical scrutiny and investigation as other environmental features (Heft, 2018). Gibson also touches on fire and water. He treats fire as an ecological event, connecting it with his descriptions of other events that urge moving from a conception of time passage to that of event passage (Gibson, 1975; James, 1892). This shift is consistent with his call to describe the environment in ecological terms rather than in geometric terms. For Gibson, fire is simultaneously a substance and an event that has various meanings and uses. The same is true for water, which can be considered as a surface, a medium, or a substance, endowing it with a wide range of meanings. His discussion of fire and water illustrates that many aspects of our environment do not have a single classification, meaning, or value (a point Gibson makes explicitly in Chapter 3, when discussing objects). The fact that aspects of our environment take on many meanings and values is the perfect complement to the fact that animals can act in many different ways. The flexibility of action is a hallmark of human and animal agency, and its perfect complement is found in the ecological environment. In the example of fire, Gibson states, “There is a gradient of danger and a limit at which warmth becomes injury. The controlling of fire entails the control of motor approach to fire and the detecting of the limit” (1979/2015, p. 33). This type of thinking inspired Gibson’s ideas concerning the margin of safety and time to contact. In fact, a field theory of such gradients was discussed as early as Gibson and Crooks (1938).
The Margin of Safety An important aspect of what makes the environment meaningful has to do with hazards and the danger that they afford. In the middle of his discussion regarding detached objects, Gibson inserts an aside concerning “The Detecting of a Limit and Margin of Safety.” What Gibson is pointing out is that every action has a limit beyond which it can no longer be completed safely, in addition to a limit where it cannot be completed at all. Take, for example, someone who has a maximum stepping length of 100 cm. Any gap or obstacle in the environment longer than 100 cm will need to be crossed with a different action than stepping. Today ecologically minded scientists would say that the individual has an absolute critical boundary for stepping of 100 cm. However, this person likely does not act at, or even near, the upper limits of his or her stepping ability. Instead, this individual has a preferred critical boundary, where any gap or obstacle in the environment of, say, at least 80 cm, will cause him or her to alter their action. One conception of the margin of safety refers to the region between a person and the edge of danger, and this gap corresponds to the proximity of danger in space and the imminence of danger in time (Gibson, 1961a). However, there are competing ideas about the margin of safety (see Wagman, Bai, & Smith, 2016). Mark et al. (1997) contended that a
margin of safety for one type of action refers to the difference between an absolute critical and a preferred critical boundary for that action. Similarly, in a study of aperture crossing, Warren and Whang (1987) used safety margins to explain why people rotate their bodies to go through apertures that are 1.3 × shoulder width. Wagman and Malek (2009) used the same explanation when comparing safety margins for humans walking under foam and hard barriers. Franchak, Celano, and Adolph (2012) argue instead that safety margins are not about safety per se but about people accounting for the dynamic size of the body in motion. In this view, shoulder width is not the appropriate measure of the absolute critical boundary for passability through an aperture because the body oscillates from side to side while walking, thus increasing the effective width of the body. The concept of a margin of safety deserves further empirical investigation to understand how actors perceive the limits of their potential actions and stay within those limits to avoid injury.
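The arithmetic behind the Mark et al. (1997) reading of the margin of safety can be made concrete with a minimal sketch. It uses the stepping example given above (an absolute critical boundary of 100 cm and a preferred critical boundary of 80 cm) and expresses aperture width in body-scaled units, as in the aperture-crossing studies. The class name, function names, and response labels are hypothetical and chosen only for illustration; none of them comes from the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SteppingBoundaries:
    """Hypothetical container for one actor's stepping limits, in centimeters."""

    absolute_cm: float   # largest gap the actor can step across at all
    preferred_cm: float  # gap size at which the actor alters the action

    def margin_of_safety_cm(self) -> float:
        # Mark et al. (1997) reading: the difference between the absolute
        # critical boundary and the preferred critical boundary.
        return self.absolute_cm - self.preferred_cm

    def response_to_gap(self, gap_cm: float) -> str:
        # Labels are illustrative assumptions, not terms from the literature.
        if gap_cm > self.absolute_cm:
            return "cannot step"
        if gap_cm >= self.preferred_cm:
            return "can step, but alters the action"
        return "steps"


def aperture_ratio(aperture_cm: float, shoulder_cm: float) -> float:
    """Aperture width in body-scaled (dimensionless) units."""
    return aperture_cm / shoulder_cm


if __name__ == "__main__":
    actor = SteppingBoundaries(absolute_cm=100.0, preferred_cm=80.0)
    print(actor.margin_of_safety_cm())
    for gap in (70.0, 90.0, 110.0):
        print(gap, actor.response_to_gap(gap))
```

Expressing the aperture as a ratio to shoulder width, rather than in centimeters, reflects the body-scaled description used in the text; the 45 cm shoulder width used in any trial call of `aperture_ratio` would be an assumed value, not a reported one.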
Tools and the Embodied Action Schema Gibson (1979/2015) defines an object as “a persisting substance with a closed or nearly closed surface that can be either detached or attached” (p. 34). Any reference to an object is to a ‘concrete’ one, and not an ‘abstract’ one. In this sense, the surface of an object becomes paramount, because that surface is the conduit for information regarding the substance, reflectance, color, and layout of the object. Generally, detached objects are much more interesting for human beings and other animals who behave similarly because they are often of a size that makes them portable and manipulatable. Pencils, pens, headphones, watches, chairs, books, laptops, etc. all qualify as detached objects. Gibson states that “the list of examples (of objects) could go on without end” (p. 34), and he is right. At nearly every moment, we use and interact with detached objects. Gibson then begins a discussion of tools, which are a special sort of detached object. Gibson conceptualizes tools as graspable, portable, manipulatable, and usually rigid. Take, for instance, a wrench (spanner). We can use a wrench to turn nuts and bolts, strike something, or throw it. Each tool has multiple affordances. When we use tools, we incorporate the tools as extensions of our action capabilities in our body schemas. Work following Gibson’s has upended previous accounts of the body schema. Traditionally, the body schema was thought of as a stable internal model that includes the sizes, shapes, and masses of the body’s parts that is learned early in life (Head, 1920; Iodice, Scuderi, Saggini, & Pezzulo, 2015). Head (1920) postulated that any changes to the body are compared to a fixed body schema stored in memory. More recently, ecological scientists have theorized that the body schema is simultaneously malleable and stable, and that it is perceived in real time (Pagano & Turvey, 1998). A body schema is malleable in that it can be adjusted due to permanent or
temporary changes made to the body or the body’s abilities. For example, when people acquire tools (e.g., canes or artificial limbs to correct temporary or permanent changes in their physical capabilities), they calibrate to their new action capabilities quite quickly (Day, Ebrahimi, Hartman, Pagano, & Babu, 2017; Maravita & Iriki, 2004). Gibson states: When in use, a tool is a sort of extension of the hand, almost an attachment to it or a part of the user’s own body, and thus no longer a part of the environment of the user … This capacity to attach something to the body suggests that the boundary between the animal and the environment is not fixed at the surface of the skin but can shift. More generally it suggests that the absolute duality of the “objective” and “subjective” is false. When we consider the affordances of things, we escape this philosophical dichotomy. (1979/2015, p. 35, original emphasis) In the only figure included in the original chapter, Gibson illustrates this point with a drawing of a hand holding a pair of scissors with the caption “A tool is a sort of extension of the hand … and one can actually feel the cutting action of the blades” (p. 35; see Figure 3.1). This is used to illustrate how attachments to the body change the boundaries of the body and thus alter the affordances as one interacts with the environment.
Figure 3.1 The original caption reads: “A tool is a sort of extension of the hand. This object in use affords a special kind of cutting, and one can actually feel the cutting action of the blades” (p. 35). This figure is also employed as Figure 6.6 in Gibson (1966a). There its caption reads: “A hand feeling a pair of scissors. The pressures of the metal surfaces on the skin can be felt, but one is mainly aware of the separation and of the movements of the handle. In use, one actually feels the cutting action of the blades” (p. 112). Source: From Gibson (1979/2015), Figure 3.1. Copyright 2015. Reproduced by permission of Taylor & Francis Group, LLC, a division of Informa plc.
Rather than being a stored and relatively fixed representation, the body schema is perceived and calibrated continuously as limbs and their attachments change (Day et al., 2017; Maravita & Iriki, 2004; Pagano & Turvey, 1998). The conception of the body schema as continuously perceived from ongoing proprioceptive input runs counter to an assumption inherent in the traditional body schema literature that proprioception does not inform the central nervous system about the metric properties of the body and its parts (e.g., Longo & Haggard, 2010). To reflect the fluid nature of the body schema and the role of perceptual calibration in its maintenance, we have proposed the term embodied action schema (Day et al., 2017, 2019). The embodied action schema represents the body’s current action capabilities, including any effects of a tool, such as the lengthening of the arm to which it is attached (Cardinali et al., 2009; Day et al., 2017, 2019; Maravita & Iriki, 2004; Sposito et al., 2012). A key finding is that both limbs and hand-held objects are perceived through kinesthesis via the same mechanism, with the same mechanical principles underlying the perception of hand-held objects and the perception of the body (Pagano & Turvey, 1998). This explains both the malleability of the body schema and how attached objects become incorporated into the body schema to be perceived and controlled as if a part of the body. Incorporating a tool that extends one’s effectivities typically requires a period of learning or adjustment before proficiency in its use can be achieved. Recent research has shown that calibration to one’s tool-enhanced capabilities occurs relatively quickly, though there is variability in how much calibration is required for different conditions (Bingham, Pan, & Mon-Williams, 2014; Bourgeois & Coello, 2012; Day et al., 2017, 2019; Fajen, 2005b; Mon-Williams & Bingham, 2007).
This research has also shown that calibration is action-specific and it involves a mapping from embodied units of perception to embodied units of action (Bingham & Pagano, 1998; Coats et al., 2014; Pan, Coats, & Bingham, 2014). That calibration is action-specific means that calibrating for one action (e.g., reaching), will not generalize to another action that involves a different unit of action (e.g., throwing or walking) (Proffitt & Linkenauger, 2013; Rieser, Pick, Ashmead, & Garing, 1995; Witt, 2011; Witt, Proffitt, & Epstein, 2010). Rather than using external metrics such as inches or centimeters, the relevant units are intrinsic to the body’s scale and its action capabilities (Cutting, 1986; Gibson, 1979/2015; Mantel, Stoffregen, Campbell, & Bardy, 2015; Pagano, Grutzmacher, & Jenkins, 2001; Proffitt & Linkenauger, 2013). Thus, what is actually being calibrated is the mapping between intrinsic units of perception and intrinsic units of action (Bingham & Pagano, 1998; Bingham et al., 2014; Pan et al., 2014). The conclusions generated from the calibration research mentioned above suggest that we move from traditional concepts of a body schema, which implies a stable entity that is either innate or learned early in life, toward an embodied action schema. An embodied action schema represents the
44 Christopher C. Pagano and Brian Day ongoing effects of constant proprioception and continuous calibration. An embodied action schema also allows for changing action capabilities of an actor to be perceived in real time, and stresses the similar intrinsic scaling employed by the dual processes of perception and action (i.e., motor control). To this point, the focus has been on small, portable tools and their incorporation into the embodied action schema. But humans have created much larger tools; cars for driving, drones for surveillance, robots to enhance our physical strength and capabilities, nuclear power plants, etc. An area of study that ties together perception of tools and the margin of safety is affordance-based control of action (Warren, 2006; see also Gibson & Crooks, 1938). An example of this area of study is the control of braking by vehicle drivers (Fajen, 2005a, 2007, 2008). In this case, action capabilities are determined by the automobile. But, as with scissors or other hand-held tools that Gibson mentioned in Chapter 3, the new action capabilities are incorporated into the embodied action schema of the driver and can be both perceived and felt by the driver. In his 1961 paper on contributions of experimental psychology to safety research, Gibson mentions that when the margin of safety is not easily detected during the operation of automobiles or other devices, then one’s proximity to the margin of safety must be signaled via some type of display.
Ecological Interface Design
Figure 3.1 in Gibson’s 1979 book illustrates how attachments to the body change the boundaries of the body to alter the affordances of the agent-environment system. The same figure was used in his 1966 book to illustrate that, when using hand-held objects, one feels the interaction between the tool and environment to be at the distal tip of the tool, where the interaction is taking place, rather than at the skin surface that is in contact with the proximal end of the tool. Extensive research has shown that tools can be used as probes to explore the environment (e.g., Barac-Cikoja & Turvey, 1991; Burton, 1992; Carello, Fitzpatrick & Turvey, 1992; Wagman & Hajnal, 2014a). The fact that hand-held implements faithfully convey information regarding distal objects and surfaces means that a properly designed tool can serve as a medium for haptic perception, just as light can serve as a medium for visual perception. In this way tools are functionally transparent to properties of the environment (Barac-Cikoja & Turvey, 1991; Burton, 1993; Hartman, Kil, Pagano, & Burg, 2016; Heft, 2002; Mangalam, Wagman, & Newell, 2018; Mantel, Hoppenot, & Colle, 2012). When probing a surface, one can feel both properties of the probe and properties of the surface being probed (Carello et al., 1992). This finding parallels what happens in vision. When perceiving visually, it is possible to perceive both distal environmental surfaces and properties of the light that is serving as a medium, such as whether the overall light level
is bright or dim, whether viewing is occurring in daylight sun or via artificial light at night, whether that artificial light is “warm” and yellow or blue and “cool,” etc. Normally, however, one’s attention is focused on the task at hand involving objects in the environment, rather than on aspects of the illumination. Haptic probing is similar: a tool can become functionally transparent.
Our laboratory has investigated distal haptic perception by examining attunement and calibration in a simulated laparoscopic surgery task (Altenhoff, Pagano, Kil, & Burg, 2017; Hartman et al., 2016; Long, Pagano, Singapogu, & Burg, 2016). Laparoscopic surgery, also referred to as minimally invasive surgery (MIS), is preferable to traditional open surgery because the incisions are much smaller and thus trauma to the body and associated infection and recovery rates are greatly reduced. The surgeon, however, cannot directly view his or her tissue manipulations and these manipulations must be both performed and perceived via intervening tools. As a result, surgeons are more likely to accidentally break or tear healthy tissue during laparoscopic surgery than during traditional open surgery. Our laboratory identified haptic distance-to-break (DTB), an invariant that specifies to the observer when they are about to break a tissue. DTB is inspired by, and is mathematically analogous to, optical time-to-contact (TTC), which is specified by the rate of optical expansion as a surface is approached (e.g., Lee, 1976). We investigated the ability of subjects to perceive DTB through the use of a custom laparoscopic surgery simulator. The simulator consisted of a laparoscopic surgical tool attached to a torque motor that was directed by computer software to simulate artificial materials being pressed against and deformed by the tool, and ultimately broken if deformed too far.
As the tool is pushed inwards, the amount of force required to push farther grows exponentially until the material fails. In one experimental task, participants were asked to push the surgical tool as far as they could without breaking the tissue, effectively pushing right up to the point of breakage and stopping without actually breaking the tissue. With only 10–15 minutes of calibration training, novice participants with no experience in MIS were able to perform the task with a high degree of accuracy, coming to within a few millimeters of the tissues’ break point (Altenhoff et al., 2017; Hartman et al., 2016). Experienced surgeons with an average of 4.8 years practicing MIS improved their ability to detect DTB after the same brief calibration training (Long et al., 2016). Anecdotally, the surgeons reported that the simulator and the task were representative of what they experienced during actual MIS. Although the expert surgeons performed better than the novices, it was remarkable how proficient the novices (undergraduate psychology students) became after a brief period of calibration training. That such novices reached a commendable level of expertise, and that experts had their performance significantly improved, underscores the efficacy of calibration training for perceptual attunement to invariants in an applied task.
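The optical time-to-contact relation cited above (Lee, 1976) can be illustrated with a short numerical sketch. It assumes an object of fixed size approached head-on at constant speed and estimates TTC as the visual angle divided by its rate of expansion. The function names and the numbers are illustrative assumptions, not taken from the studies cited, and the sketch does not attempt to model haptic DTB itself.

```python
import math


def visual_angle(size_m, distance_m):
    """Visual angle (radians) subtended by an object of a given size at a given distance."""
    return 2.0 * math.atan(size_m / (2.0 * distance_m))


def tau_estimate(size_m, distance_m, speed_m_s, dt=1e-3):
    """Estimate time-to-contact as theta / (d theta / dt).

    The rate of optical expansion is approximated with a forward finite
    difference over a small time step dt, assuming constant approach speed.
    """
    theta_now = visual_angle(size_m, distance_m)
    theta_next = visual_angle(size_m, distance_m - speed_m_s * dt)
    theta_rate = (theta_next - theta_now) / dt
    return theta_now / theta_rate


# Hypothetical values: a 0.5 m wide surface, 20 m away, approached at 10 m/s.
size, distance, speed = 0.5, 20.0, 10.0
true_ttc = distance / speed            # physical time remaining
tau = tau_estimate(size, distance, speed)
print(f"true TTC = {true_ttc:.3f} s, optical estimate = {tau:.3f} s")
```

The point of the sketch is that the optical estimate requires neither the object's size nor its distance to be known separately; the ratio of the angle to its rate of change closely approximates the time remaining when the visual angle is small.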
Research on optical TTC and similar invariants is often criticized on the grounds that humans and animals do not always show natural or spontaneous use of such invariants (see Hecht & Savelsbergh, 2004; Jacobs & Michaels, 2007). This, however, is not a concern for the applied science of display design. A contribution made by Ecological Interface Design (EID) is the focus on what is meaningful in the domain application, the identification of invariants (i.e., constraints) in the domain, and the display of such invariants via artificial devices. If successful, such devices serve as a medium for the distal perception of the domain. Once the invariants have been identified and rendered in an appropriate display, users can be trained to attune to the invariants. The fact that experienced experts do not always use such invariants of their own accord, but demonstrate improved performance once trained to do so, actually underscores the importance of explicitly incorporating such invariants into training regimes. Training of this nature is referred to as “the education of attention” (E. Gibson, 1963, 1969; Gibson & Gibson, 1955) to the invariant that is most relevant to the successful completion of a given task (Hartman et al., 2016; Wagman, Shockley, Riley, & Turvey, 2001; Withagen & Michaels, 2005a).
Designing for functional transparency in visual, haptic, and acoustic displays is a hallmark of EID. The functional transparency that is easily achieved with hand-held tools serves as an example of what might be achieved with more complicated devices. Thus, the challenge is to achieve direct perception and control via artificial displays when dealing with complex human-made systems, such as process control plants, airplane cockpits, and hemodynamic monitors (Burns & Hajdukiewicz, 2004; Effken, Kim, & Shaw, 1997; Hajdukiewicz & Vicente, 2004; Momtahan & Burns, 2004).
Vicente and Rasmussen (1990) stress that: The goal of EID is to make the computer as transparent as possible, thereby minimizing the need for inference. Just as the nervous system and the energy medium are functionally transparent to the properties of world that are relevant to the survival of the organism (Shaw & Bransford, 1977), so the computer interface should be as functionally transparent as possible to the properties of the work domain that are relevant for effective control … The idea is to “make visible the invisible” so as to create the phenomenological feeling in operators that they are directly controlling the internal functions of the system, not dealing with a computer intermediary. (p. 230) The goal is to achieve for computer-mediated visual displays the level of functional transparency that can be achieved with haptic probing and haptic displays. For haptic probing, the ability of the tool to act as a transparent medium is determined by its material composition. The tool must, for example, be sufficiently rigid to be useful as a probe (Burton & Cyr,
2008). In a recent magazine advertisement for a fishing rod, the manufacturer described the material composition of the rod (e.g., the type of graphite in the rod and the type of cork in the handle) and then stated that “you’ll feel every tremor” and “you’ll think it has nerves inside it.” Feeling through the rod is critical for fly-fishing. It is critical for successfully fighting the fish as it is reeled in, and it enables the user to feel, through the rod and the line, the specific species of trout on the hook, its gender, and its approximate age (Rosenblum, 2010). In another magazine advertisement, Saab boasted that it had engineered its automobiles to be functionally transparent. The advertisement reads:
Saab vs Descartes: I think therefore I am slower. I feel therefore I am faster. Sensory cues are designed into a Saab 9-5 to enhance driver/car interaction. The center of gravity is near the driver’s hips, where the body first detects lateral movement. Tactile sensations of the road are intentionally not filtered out of the steering system. Control becomes intuitive. Driving becomes safer and more enjoyable. Car and driver are one. (“Saab Affiches,” 2018)
We are unsure of the accuracy of their claim regarding the hips being where the body first detects lateral movement (and we would use the term “haptic” instead of “tactile”), but the advertisement expresses the appropriate sentiment for creating artificial displays. The challenge is to achieve this with devices such as computer-mediated displays.
In the fourth chapter of Gibson’s 1979/2015 book (see Carello & Turvey, Chapter 4, in this volume), he claims that one does not see light as such and one does not see the retinal image as such. Instead one sees by means of the information that is contained in the light.
EID is founded on the assumption that if it is theoretically possible for direct perception to occur via the specificity of information in the light, then it is theoretically possible for direct perception to occur via artificial displays. Such displays must act as media by similarly conveying information lawfully specific to the affordances of the remote environment. It is important to note that perception is indirect when a psychological process, such as inference-making or mental integration, intervenes between the object in the world and one’s awareness of that object (Heft, 2002). Thus, the light reaching the eye or the sound reaching the ear does not result in indirect perception if these media are functionally transparent, as Gibson argued they tend to be under natural viewing conditions (e.g., Heft, 2002). Similarly, the existence of an intervening hand-held tool or a computer display does not necessarily have to result in indirect perception if these media do not cause an additional psychological process. The challenge is to turn this into reality. Providing users with information via a display in a way that does not result in indirect perception has been accomplished by designing the
display so as to convey the optic flow that would be available to the observer during natural perception and action. This has been useful for designing displays for robot teleoperation (Gomer, Dash, Moore, & Pagano, 2009; Mantel et al., 2012). In cases such as interfaces for process control plants, one begins by understanding the task-relevant invariants that exist in the dynamics of the system itself (e.g., Bennett & Flach, 2011; Vicente & Rasmussen, 1990). That is to say, human factors practitioners must begin, as Gibson does in Chapter 3, by understanding exactly what is meaningful about the environment. For process control, this involves understanding the physical constraints of the system and then displaying the higher-order parameters (i.e., invariants) that are specific to those constraints. In this way, the display acts as a smart perceptual instrument to convey higher-order information akin to TTC and DTB, rather than displaying the lower-order parameters of traditional physics that Gibson argues against in Chapter 3 (Vicente & Rasmussen, 1990). The lower-order parameters have to be mentally integrated to form a mental model of the system’s current state, and then mental projection must be employed to project the current state into the future to understand how one must act to affect coordination and control. Displaying higher-order parameters similar to TTC and DTB allows for direct perception and thus allows the display to act as a medium on a par with light and the retina.
Another goal of EID is to convey the constraints of a complex system in an intuitive manner. Constraints are one of the meaningful aspects of the domain environment, and their incorporation into displays has been particularly useful (Bennett & Flach, 2011; Borst, Flach, & Ellerbroek, 2015; Vicente & Rasmussen, 1990).
Effken and colleagues, for example, have created displays for patient care that depict the physiological constraints imposed on hemodynamics by fluid, force, and resistance, as well as the effects of clinical interventions on those constraints through particular combinations of medications (Effken, 2006; Effken et al., 1997; Effken, Loeb, Kang, & Lin, 2008). Additional constraints exist between the patient and the health care practitioner, or in more general terms, between the user and the work domain (Flach, Reynolds, Cao, & Staffell, 2017). A good display for any complex system needs to show the system’s constraints because the constraints are the targets of effective human action (Borst et al., 2015; Effken et al., 2008). In the Conclusion of the 1979/2015 book, Gibson discusses the use of displays in experimental science (see Stoffregen, Chapter 15, this volume). He writes: The information for a certain dimension of perception or proprioception can be displayed without interference from the accompanying information to specify the display. That is the lesson of research on pictures and motion pictures. What is required is only that the essential invariant be isolated and set forth. (p. 292)
Earlier, near the end of Chapter 3, Gibson speaks about “artificial objects” or devices that display optical information. He says:
A display … is a surface that has been shaped or processed so as to exhibit information for more than just the surface itself … images, pictures, and written-on surfaces afford a special kind of knowledge that I call mediated or indirect, knowledge at second hand. (p. 37; original emphasis)
Gibson seems to be speaking here of static images, not of the dynamic changing displays typically employed in EID. A “graphical” or “configural” display is not necessarily an ecological display, and the type of EID displays that have been most successful are those that contain dynamic changing elements whose motions and relationships are specific to the dynamic movements and constraints within the domain system (Bennett & Flach, 2011; Burns & Hajdukiewicz, 2004; Effken et al., 1997; Vicente & Rasmussen, 1990). Thus, visual EID displays are akin to haptic displays, where movement and dynamics are integral (e.g., Altenhoff et al., 2017). For both haptic and visual displays, the invariants of the system (the constraints that exist between component parts) are often only revealed over time through the movements of the displayed elements. Graphical or configural displays, such as alternative nutrition labels proposed for grocery products, are usually little better than traditional displays (e.g., Marino & Mahan, 2005). This is because a nutrition label is a static depiction of information that does not change over time. A hemodynamic display, in contrast, changes as the patient’s condition changes and it reacts to user inputs in a dynamic fashion. In addition, constraints do not exist between the separate elements displayed on a nutrition label (e.g., sodium, total fat, vitamin C, etc.), whereas the elements of hemodynamic displays are mutually constraining (Effken et al., 1997, 2008; Flach et al., 2017).
In short, EID as an approach to display design is most fruitful when it is applied to display information regarding a dynamic system. Display design and the broader study of human-machine interaction both fall within the overarching fields of human factors and ergonomics. The study of human factors and ergonomics begins with a basic understanding of human perception, cognition, motor function, and the environment (as with Gibson’s treatment of the environment in Chapter 3). This understanding is applied to the design of artifacts that are safe and intuitive to use. As modern humans have made all manner of tools and continue to alter their environment, there is now, more than ever, a need for human factors. While not widely recognized, ecological psychology provides a theoretical basis for human factors, and human factors can be considered as the real-world application of ecological psychology (e.g., Dainoff, 2008; Dainoff & Mark, 2001; Dainoff & Wagman, 2004;
Day et al., 2019; Effken, 2006; Flach, 1990; Flach et al., 1995; Hancock et al., 1995; Hartman et al., 2016; Mantel et al., 2012; Vicente & Rasmussen, 1990). EID is just one example of what ecological psychology has to contribute to the broader field of human factors and ergonomics, and it begins with an understanding of ‘The Meaningful Environment.’
4 Challenging the Axioms of Perception
The Retinal Image and the Visibility of Light
Claudia Carello and Michael T. Turvey
Visual perception can fail not only for lack of stimulation but also for lack of stimulus information. (Gibson, 1979/2015, p. 54)
A hallmark of James Gibson’s strategy for understanding perception was to be very clear on the meaning of particular terms that often serve as mutable placeholders for vague concepts. As his ecological approach was maturing, he took pains to pin down words that had a technical veneer but were used all too casually for the role they played in perceptual theory. Among them, stimulus, information, and light (Gibson, 1960a, 1960b, 1961b, 1963) figure prominently in “The Relationship Between Stimulation and Stimulus Information” (Gibson, 1979/2015, Chapter 4). The stated goal of Gibson’s Chapter 4 was to “describe the information available to observers for perceiving the environment” (p. 41). In his attempts to do so, he uses another Gibsonian stylistic signature, the careful contrast, in order to articulate what information is and what information is not. But the particulars of information (how the optic array is structured lawfully by surface layout and by ecological events) were left to subsequent chapters (Chapters 5 and 6 in Gibson, 1979/2015; Mace, Chapter 5, in this volume; Shaw & Kinsella-Shaw, Chapter 6, in this volume). The challenge taken up by Chapter 4 was the fundamental underpinnings of two doctrines that had shaped the science of visual perception for centuries: (1) vision begins with the retinal image; and (2) light is visible. While the former is typically explicit, the latter is typically implicit, though no less baneful because of it. Gibson’s logical dismantling of these doctrines was crucial to setting up his treatment of information. Lamentably, 40 years after The Ecological Approach to Visual Perception, and despite the growing influence of Gibson’s many insights on modern visual science (Nakayama, 1994), the need to parry these historical doctrines persists.
The Inevitability of Images
Although the irrelevance of the retinal image for vision has long been accepted wisdom among ecological psychologists, this view is not universally embraced. Contemporary scholars still proceed as if its role in perception is axiomatic. Image language is especially vivid in computer vision (e.g., Nixon & Aguado, 2012), where its physiological rationale figures prominently: “The photoreceptors sample the retinal image to produce a neural image representation” (Ferwerda, 2001, p. 24). Neuroscience handbooks are firmly entrenched in the mythology, providing expositions entitled “The beginnings of visual perception: the retinal image and its initial encoding” (Yellott, Wandell, & Cornsweet, 2011). And the textbooks we use to teach our undergraduates cannot resist spinning fact as folklore: “The pattern of light reaching the retina must mirror the distribution of light in the scene being viewed. This light distribution, or retinal image as it’s called, is the raw material on which the retina works” (Sekuler & Blake, 2002, p. 56). Light is, indeed, patterned by “the scene,” by the substances and surfaces that are being viewed; this is fact. But whereas Gibson sought an understanding of how that patterning or distribution of light specifies the substances and surfaces that structure it, orthodox theories persist with labeling the result of that structuring as an image; this is folklore. Certainly, the elegant characterization of image formation owing to Kepler (1604/1996) is irresistible for those who see a need to get a copy of the world inside the body (see Stoffregen, Chapter 15, this volume). Light rays diverge from every point on an object. A subset of these enters the eye where the diverging rays are made to converge by a lens. This diverging and converging occur at every point on the object, thereby painting a picture on the putative retinal surface (Figure 4.1).
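The geometry of diverging and converging rays described above is the kind of account that works well for optical devices. As a minimal sketch, the standard thin-lens relation of geometric optics, 1/f = 1/d_o + 1/d_i, gives the distance behind a lens at which a diverging pencil of rays from one object point reconverges. The 17 mm focal length is a rough illustrative assumption for a simplified eye, and the function names are hypothetical; nothing here is drawn from Kepler's or Gibson's text, and the calculation says nothing about whether such an image plays a role in seeing.

```python
def image_distance(focal_length_m, object_distance_m):
    """Solve the thin-lens equation 1/f = 1/d_o + 1/d_i for the image distance d_i."""
    if object_distance_m == focal_length_m:
        raise ValueError("Object at the focal point: rays emerge parallel, no focus point.")
    return 1.0 / (1.0 / focal_length_m - 1.0 / object_distance_m)


def lateral_magnification(focal_length_m, object_distance_m):
    """Lateral magnification m = -d_i / d_o; a negative value means an inverted image."""
    d_i = image_distance(focal_length_m, object_distance_m)
    return -d_i / object_distance_m


# Illustrative assumption: a 17 mm focal length and an object point 2 m away.
f, d_o = 0.017, 2.0
d_i = image_distance(f, d_o)
m = lateral_magnification(f, d_o)
print(f"focus point {d_i * 1000:.2f} mm behind the lens, magnification {m:.4f}")
```

Repeating the calculation for every point on an object is what "painting a picture" amounts to in this account: each point maps to one focus point, and the negative magnification reflects the inversion of the resulting pattern.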
Gibson seemed to limit his frustration with Kepler to the allure of the latter’s physical optics. While conceding the tidy success of the theory of image formation applied to optical devices, such as cameras and projectors, he lamented its application to visual perception: It works beautifully, in short, for the images that fall on screens or surfaces and that are intended to be looked at. But this success makes it tempting to believe that the image on the retina falls on a kind of screen and is itself something intended to be looked at, that is, a picture. It leads to one of the most seductive fallacies in the history of psychology—that the retinal image is something to be seen. (Gibson, 1979/2015, p. 53) Consistent with Gibson’s complaint, the astronomer’s theory of vision is unabashed in endorsing this implication: I say vision occurs when the image (idolum) of the whole hemisphere of the world which is in front of the eye, and a little more, is formed
Figure 4.1 (a) A diverging pencil of rays from a single reflecting point on an object is infinitely dense; a subset converges in a pencil of infinitely dense rays to a single focus point on the back of the eye. (b) A few converging cones show how an optical image could be built from an infinite number of points on the object.
on the reddish white concave surface of the retina (retina). I leave it to natural philosophers (physici) to discuss the way in which this image or picture (pictura) is put together by the spiritual Principles of vision residing in the retina and in the nerves, and whether it is made to appear before the soul or tribunal of the faculty of vision by a spirit within the cerebral cavities, or the faculty of vision, like a magistrate sent by the soul, goes out from the council chamber of the brain to meet this image in the optic nerves and retina, as it were descending to a lower court. For the equipment of opticians does not take them beyond this opaque surface which first presents itself in the eye. (Kepler, 1604/1996, De modo visionis, in Crombie, p. 338)
Two things are notable here. First, Kepler ignored the retinal anatomy that puts the lie to the assumption of an “opaque surface” upon which an image can form. Dissections by the physician and anatomist Herophilos (c.335–280 BCE) prompted his description of the retina as being reminiscent
of a “folded fishing net” (Swanson, 2015). The contemporary characterization is in terms of ten layers and a dense network of blood vessels. In neither case is there an opaque surface. Of its multiple layers, then, which should we interpret as screen-like? Second, Kepler appreciated that getting a retinal image is only the first step. That image must still be considered further by additional authoritative entities. One would think that the seduction of starting with an image, however misleading, would lose its appeal once ensnared in the infinite regress of a succession of images, each in need of its own viewer. But apparently, logic goes out the window when one is smitten. Witness the descriptions of compound eyes. As Gibson (1979/2015) pointed out, the compound eyes of arthropods are different from the chambered eyes of vertebrates in that they do not focus light. The cone of rays diverging from an object is not converted into a converging cone of rays by a lens; an image is not formed. However, those who study insect vision seem to retain the story initiated for humans. As Gibson lamented:
Zoologists who study insect vision are so respectful of optics as taught in physics textbooks that they are constrained to think of a sort of upright image as being formed in the insect eye. But this notion is both vague and self-contradictory. There is no screen on which an image could be formed. (1979/2015, p. 55)
Indeed, image language is routine in discussions of all kinds of eyes. One prominent review of the evolution of eyes, for example, highlights “the ways that the preexisting molecular and cellular building blocks have been assembled to provide various solutions to the problem of obtaining and transmitting an image” (Land & Fernald, 1992, p. 5, emphasis added).
Descriptions of the optical elements of compound eyes (both apposition-types, which are common to diurnal insects and arthropods, and superposition-types more typical of animals in dim environments, such as nocturnal insects and long-bodied crustaceans) emphasize how they can produce “good images” with mirrors, tubes, and analogues of two-lens telescopes (Land & Nilsson, 2006).
On the Myth of the Retinal Image
As intimated in the preceding remarks, the retinal image has long been the centerpiece of vision theory. It has played the theoretical role of an intermediary, a third entity that serves to relate two other entities, the viewer and the environment that is viewed. No doubt, the persistence of image-centric accounts of vision reveals great respect for the physics of image-formation. But optics is only the set-up for the appeal of the retinal image. The tangible payoff seems to have been provided by Descartes’ demonstration of an
actual image on the back of an eye, the very embodiment of Kepler’s “image … formed on the reddish white concave surface of the retina … this opaque surface which first presents itself in the eye” (noted above). There is no denying the dominion of the retinal image in centuries of theorizing about vision, even up to the present day. But we must emphasize that it is an inappropriate starting point for more than reasons of logic. Not only is there, as Gibson emphasized, no homunculus to view the retinal image; it seems that there is no retinal image. As noted, the notion of an image presupposes an opaque surface on which the image can appear and be observed. However, only a dead retina is opaque (with the degree of opacity increasing with time after death); a live retina is transparent (Kawashima et al., 2015). The most storied evidence for a retinal image comes from investigations of the eye of a dead animal following the eye’s removal from its embedding capsule (a process termed enucleation), investigations that began with Scheiner and Descartes in the early 1600s. Of singular importance to their exposition and investigation of the image was another intuition of Kepler’s, namely, that to comprehend the functioning of a single eye requires two eyes: one eye (with a putative image) to be looked at and one eye (that of the experimenter as observer illustrated in Figure 4.2d) harnessed to do the looking (Boring, 1942, p. 223). Descartes detailed the essentials of such investigations in Optics: Discourse V (Cottingham, Stoothoff, & Murdoch, 1985). As depicted in Figure 4.2, the back of the enucleated eye is removed and replaced by translucent material (e.g., paper or eggshell) referred to as a “white body,” the function of which is to allow light to pass through the eye from front to back. The enucleated eye is then fixed so as to face illuminated objects in an otherwise dark room.
Viewing the white body in its capacity as a screen at the rear of the enucleated eye, one will, in Descartes’ words: “see there, not without wonder and pleasure, a picture representing in natural perspective all the objects outside …” (Cottingham, Stoothoff, & Murdoch, 1985, p. 166). For Descartes, the images so formed in the back of the eye in the preceding way are images that “also pass beyond into the brain” (p. 167). It is not the case that Descartes’ authority went unchallenged. As elucidated by Campbell (1817, p. 21), the surgery in question, in conjunction with the viewer’s location behind the eye, transformed the enucleated eye-with-window into what contemporary optics would identify as a rear-view projector (Figure 4.2d). The philosopher Arthur Bentley appears to be the most forthright in arguing that the status of the retinal image is more than problematic. In an essay with the unambiguous title, “The fiction of ‘retinal image,’ ” he asserted that “Examination shows it to be a confused mixture of analogies” (Bentley, 1954/1975, p. 268). Bentley argued that Scheiner’s and Descartes’ procedure demonstrates no more than that “the eyeball contains an optical system … it has next to nothing to do with the process of vision in a living organism” (p. 271).
Figure 4.2 The demonstration of a “retinal image” by Scheiner, repeated by Descartes. (a) An intact eye presented with an object. (b) The enucleated eye with retina and rear coatings scraped off. (c) A “white body” attached to the rear of the enucleated eye. (d) The image viewed by Descartes is not on the retina but on the “white body” (cf. Campbell, 1817).
A core issue for Campbell and for Bentley was that an image can be located on a translucent screen (the post-surgery state of affairs, Figure 4.2c) but it cannot be located on a transparent screen (the pre-surgery state of affairs, Figure 4.2a).1 The foregoing has been expressed in the following terms:
When looking through the windshield of one’s car, or the windows of one’s office, or the lenses of one’s glasses, that which one sees is beyond the windshield, the windows, and the lenses, and is responded to behaviorally as such. This fundamental observation assigns the burden of proof for the existence of a retinal image to the domain of living vision, that is, the investigation of eyeballs that are alive not dead. (Turvey, 2019, p. 346)
Bentley suggests that Hermann von Helmholtz, one of the most influential scholars in the physiology and philosophy of perception over the past 150 years, may have been responsible for popularizing what seems to be weak evidence for the image inside a living eye. In the zealousness of his commitment to a role for the retinal image, for example, Helmholtz provided a second-hand recounting of what a colleague had told him about viewing the retina directly (i.e., by looking through someone’s pupil). Under proper lighting conditions, “It is even possible to see the image sometimes through the sclerotic of the eye of a live individual, especially if he is blond and has bright blue eyes which usually contain scant pigment in the choroid coating” (Helmholtz, 1866/2000, p. 91).2 Such an image, Bentley noted, was more likely simply a reflection from the cornea or the lens, so-called Purkinje-Sanson images. Helmholtz was convinced that his new invention, the direct ophthalmoscope, would provide proof of the image:
With the aid of this instrument, it is possible to look directly in the eye from the front and see clearly not only the retina itself and its blood vessels but the optical images that are projected on it. That this is actually the case is proved by the fact that if the eye under examination is focused on an object that is bright enough, a distinct and sharply defined image of it may be seen by the observer on the surface of the retina. (Helmholtz, 1866/2000, p. 92)
Convoluted wording notwithstanding (he is saying the proof that you can see an image on the retina is that you may see an image on the retina), this report has not been added to in the 130 or so years since it first appeared.
“Reports of observation are extremely rare in the texts, considering how notorious the fact itself is” (Bentley, 1954/1975, p. 273, note 11). It is likely that even these rare reports are simply repetitions of Helmholtz’s claim. Queries put to the internet with the search term “retinal image” engender multiple instances of what should be termed images of the retina but none of what we think of as images on the retina.3
The Persistence of Hypostasizing the Retinal Image
Although Bentley was convinced that by the end of his essay, “ ‘retinal image’ will turn out to be just a careless use of words” (1954/1975, p. 271, note 5), confidence in its status persists. A journal article with the forthright title, “How we perceive our own retina” (Kirschfeld, 2017) gives familiar voice to the presumed narrative: “At each saccade, we take an individual ‘snapshot’, as it were, of the optical target … If we consciously perceive our own retina, it is like looking at such a snapshot” (p. 1905). The twist in this new account is that the author purported to set up conditions that would allow viewers to consciously perceive their own retinas:
It is possible to observe one’s own retina by illuminating the eye side-on. In this case, light enters the eye behind the lens (‘retrolental illumination’, RLI; figure 4.3a) and the retinal vessels throw a shadow onto the photoreceptors so that the subject perceives a wonderful view of his own vessels, known as the entoptic image of the retina. (p. 1905)
In the reported experiment with four participants, such an entoptic image was produced in the left eye of the observer. Simultaneously, the right eye viewed a cardboard with variously sized gauges that the observer used to report on the sizes of the fovea and the blind spot in the entoptic image. The author was happy to note that one image was not distorted relative to the other, despite the overrepresentation of the fovea in Visual Area 1. However, the disconnect between retinal-image-as-snapshot and image-of-the-anatomical-retina was overlooked. The label for the specially produced image should have been a hint: Entoptic means “originating within the eyeball.” Shadows of the vessels of the eye are not what have been meant historically and theoretically by “retinal image.” The photograph of the author’s own retina, complete with axes and points superimposed on the photograph to show the location of the fovea, tellingly showed no projection of the cardboard seen by the other eye.
The claim for an image of the environment on the retina seems not to have been substantiated on anything other than a translucent retina. Nonetheless, vision scientists and neurobiologists have been beguiled by all of the focusing controls that seem to point to the importance of image quality qua image, even lamenting “special-purpose visual systems [that] failed to provide the comprehensive neural machinery that would allow these images to be fully exploited” (Land & Nilsson, 2006, pp. 167–168). But what struck Gibson as important across the diversity of eye designs is that they are well suited to registering the structure of ambient light.
Nevertheless, in both cases, in the convex eye as much as in the concave eye, the adjacent order in the external array is preserved in the order of stimulation at the receptor mosaic. What-is-next-to-what remains constant; there is no shuffling or permutation of order. It is true that the order is inverted by the concave eye and not by the convex eye but this is a matter of no consequence. (Gibson, 1966a, p. 165)
This is a very different take on adjacent order from that advanced by promoters of the image: “Adjacent ommatidia view adjacent solid angles, so that the image as a whole is built up from the contributions of all the ‘apposed’ ommatidia” (Land & Nilsson, 2006, p. 172). For some vision scientists, it seems, image status can be granted to the adjacent order itself, without requiring a landing site or a projection. As a case in point, Sekuler and Blake (2002, p. 57) suggest that the term “image” applies to any distribution of light that preserves the spatial ordering of locations in space. Whether or not one wants to consider the adjacencies to build up an image, however incidentally, Gibson emphasized:
The structural properties of a retinal image are what count, not its visible properties. Structurally, it should not be called an image at all, for that term suggests something to be seen. The structure of a retinal image is a sample of the structure of ambient light. (1966a, p. 172)
The structure of ambient light is central to the other major theme of Chapter 4 of Gibson (1979/2015), namely, his denial that perceivers actually perceive light, to which we now turn.
Do We See Light as Such?
Although the retinal image dominates orthodox theories of visual perception, the assumption that we perceive light itself is often lurking nearby. The inexorable discussions of how the intensity and wavelength of light affect the visual system; routine references to the visible spectrum; the psychophysical dedication to measuring thresholds, constancies, and metamers; and lamentations that the blind spot doesn’t respond to light, all seem to implicate seeing light as the starting point. They are symptoms of a failure to draw a fundamental distinction that Gibson highlighted as his approach developed over the decades. In 1979, he summarized it succinctly as: “the difference between receptors and a perceptual organ. Receptors are stimulated, whereas an organ is activated. There can be stimulation of a retina by light without any activation of the eye by stimulus information” (1979/2015, p. 47). This distinction is further buttressed by Gibson’s invitation to be clear about what we mean by light. In everyday visual circumstances, there is that which illuminates (light sources), that which is illuminated (surfaces), and illumination (light as such). A particularly bold statement that we perceive light as such was provided by Boynton, a prominent scholar in color vision and physiological optics, who was critiquing Gibson’s then-new claims for ecological optics: “we are not in visual contact with objects, or edges, faces, facets, or textures. We are in contact only with photons” (Boynton, 1974, pp. 300–301).
Gibson’s rejoinder continues to try to clarify the problem:
There is a misunderstanding of the metaphor of ‘visual contact,’ one that goes back to Johannes Mueller … It leads to the doctrine that all we can ever see (or at least all we can ever see directly) is light. (1974, p. 310)
Yet the visibility of photons (that is, the seeing of illumination, the seeing of pure non-reflected light) has been an issue of long standing (e.g., Baylor, Lamb, & Yau, 1979; Hecht, Shlaer, & Pirenne, 1942; Sakitt, 1972; van der Velden, 1946) and continues to be (Field & Rieke, 2002; Koenig & Hofer, 2011; Tinsley et al., 2016). Increasingly sophisticated methods have been dedicated to demonstrating single-photon detection by the human eye. Whether or not the results can be considered unequivocal, the goal of the refined methods continues to be characterized in the same way. One recent investigation aptly observes that demonstrating the response of a rod cell to an individual photon is not equivalent to demonstrating the perception by a human subject of a photon (Tinsley et al., 2016). Along with that admission, however, was the frustration that limitations of previous methodologies are to blame simply for “an inherent ambiguity about the exact number of photons required to elicit the perception of seeing light” (p. 2). It is instructive, for present purposes, to clarify what investigators mean when they try “to probe the absolute limit of light perception.” The experimental setting of Tinsley et al. was that of passing laser light through a 1 mm-thick beta-barium borate (BBO) crystal. When suitably illuminated, BBO gives rise to single-photon states. The participant’s task on each trial was to report whether a light was seen or not, and how confident the participant felt with respect to the report.
More specifically, on each trial, the participant watched for a dim flash, which occurred at one of two times, with both times indicated by a beep. A decision was then given as to which beep was associated with a dim flash, and what level of confidence should be assigned to the decision. In the 10% of trials when confidence was high, correct decisions, averaged over participants, were about 60% (compared to 51.6% for the 90% of trials engendering low or middling confidence). Tinsley et al. concluded: “To our knowledge, these experiments provide the first evidence for the direct perception of a single photon by humans” (p. 6). Our reading of direct perception suggests an alternative conclusion. With regard to the triad of that which illuminates, that which is illuminated, and illumination as such, it is perhaps more accurate to claim that the participants in the Tinsley et al. experiment occasionally saw a source of illumination (that which illuminates). They did not see illumination as such. They did not see photons.
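The arithmetic behind these percentages can be made explicit. The following is a minimal sketch and not part of Tinsley et al.'s analysis: it assumes that the reported shares of trials (10% and 90%) are exact, treats "about 60%" as 0.60, and pools the two accuracies to show how close overall performance sits to the 50% chance level of a two-interval task. The function name `pooled_accuracy` is hypothetical.

```python
def pooled_accuracy(groups):
    """Weighted mean of proportion correct.

    groups: list of (share_of_trials, proportion_correct) pairs.
    """
    total_share = sum(share for share, _ in groups)
    return sum(share * acc for share, acc in groups) / total_share


# Figures as reported in the text; 0.60 stands in for "about 60%" (an assumption).
reported = [
    (0.10, 0.60),   # high-confidence trials
    (0.90, 0.516),  # low- or middling-confidence trials
]

overall = pooled_accuracy(reported)
chance = 0.5  # two intervals, one of which contains the flash

print(f"pooled proportion correct: {overall:.4f}")
print(f"margin over chance: {overall - chance:.4f}")
```

On these assumed inputs the pooled proportion stays within a few percentage points of chance, which is worth keeping in mind when weighing what the participants can be said to have seen.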
Challenging the Axioms of Perception 61
On the Invisibility of Illumination
A simple mundane phenomenon highlights that whereas light sources and illuminated surfaces are visible, illumination is not. Consider a flashlight held vertically in a hand with the emitted light striking the floor in an otherwise ordinarily illuminated room (Figure 4.3). The hand with flashlight can be so adjusted in height that one can see the emitted light at the floor without seeing the flashlight’s bulb. That is, the illumination of the floor (in the form of a circular disk of light if the flashlight is perpendicular to the ground) is distinct from the source of illumination (a tungsten filament/incandescent bulb or a light emitting diode/solid state bulb). The latter could also be seen if the flashlight were raised or tilted. What cannot be seen is the light between the flashlight (the source of illumination) and the floor (the thing illuminated). The illumination itself (i.e., light as such) is not visible. Coloring or sharpening the light source does not change the experience (Figures 4.3c and 4.3d). The illuminated surface is seen; the illumination is not. Similar demonstrations are now common in high school physics classes, where the light is a bright laser which only becomes visible with the introduction of chalk dust. That the mundane phenomenon is nonetheless remarkable is illustrated by the puzzlement expressed by a 4-year-old film enthusiast who demanded “Where’s the movie?!” while looking overhead in a dark theater and gazing urgently between the projector and the screen (J. Blau, personal communication, 2014). Why does the assumption that light is perceived sit unexamined and unquestioned in perceptual theory? It fits seamlessly into the narrative developed for theories of indirect perception, namely, that our acquaintance with the surrounding environment is mediated by sensations of light.
Figure 4.3 Illumination is transparent to that which is illuminated. (a) The light from a flashlight can be seen on a carpet but not in the air between the flashlight and the floor; illumination is not visible between the observer and the wall. (b) The view of the wall is no different when the light is off. (c) The situation is identical for the sharp (red) beam from a laser pointer: The spot on the floor is visible but the beam is not apparent in front of the wall. (d) The view of the wall is no different when the laser pointer is off.
Photoreceptors are responsive to light; they yield sensations of light and perception of the environment is built from those sensations. In such an account, we are in direct contact with illumination; we are in indirect contact with that which is illuminated. In questioning the preceding characterization, Gibson challenges us:
What about the opposite assertion that we never see light? It may at first sound unreasonable, or perhaps false, but let us examine the statement carefully. Of all the possible things that can be seen, is light one of them? (1979/2015, p. 48)
Can we test this assertion? One possible test would be provided if we could arrange the circumstances so as to allow an observer to encounter light when it is not illuminating any surfaces. If Gibson is right, then observers should not experience that light; in the absence of illuminated surfaces, they should experience only darkness. The requisite demonstration, implemented as part of a science exhibit, has been described by Zajonc (1993). Using details provided by him, we constructed the apparatus illustrated in Figure 4.4 in order to conduct some systematic investigations. The basic design entails a light source at one end of an empty box, interior surfaces covered in light-absorbing material, and an aperture through which to view the interior of the box. With the light turned on, the interior of the box provides a setting in which there is illumination in the absence of things illuminate-able. Illuminate-able objects or surfaces can be introduced into the otherwise non-reflective box interior, either through a small slot or by opening the hinged rear wall.
Photographs of the interior of the box were taken through the viewing aperture for each of the following four conditions: (1) light source off, box empty (Figure 4.5a); (2) light source on, box empty (Figure 4.5b); (3) light source off, rod in box (Figure 4.5c); and (4) light source on, rod in box (Figure 4.5d). The photographs reveal that the fact of light in the box is insufficient to experience its consequence, which is restricted to Figure 4.5d. (These impressions from the photographs have been verified with a dozen observers under experimental conditions.)4 The implication is that illumination is visible only by way of that which is illuminated. The latter conclusion can be taken a step further with a fifth condition: light source on with smoke filling the box. Smoke reflects light so that it can be seen without a rod in the box (Figures 4.5e and 4.5f); it is a case of Rayleigh scattering, where the reflecting surfaces are particles in the medium. Without the particles, there would be nothing to see. The visibility of the smoke in the box at the viewing aperture (Figure 4.5f) should be underscored. It gives necessary evidence that the unseen light in Figures 4.5a–c is not a matter of the unavailability of light. The light is there but invisible. Smoke renders it visible.
[Figure 4.4 panel labels: (a) Front view; plywood exterior; illumination shaft]
Figure 4.4 (a) Three holes drilled in a plywood box provided an illumination shaft (extended with a PVC pipe to secure a flashlight), a monocular viewing aperture centered on the front wall and, to its right, an insertion slot to allow the introduction of a thin rod. (b) Rods angled obliquely directly in front of the viewing aperture of the box, which was lined with black, light-absorbing foam.
Figure 4.5 Photographs taken through the viewing aperture for (a) light off/no rod; (b) light on/no rod; (c) light off/rod present; (d) light on/rod present; (e) when the box is filled with smoke, the beam can be seen through the unlatched back wall; (f) the smoke as seen through the viewing aperture.
The observer’s experience with the light box is akin to the experience of an astronaut on a spacewalk. Light is all around but, if the spacecraft, Earth, and the moon are out of view, nothing is seen but darkness peppered with point-light sources (the distant stars). Although the astronaut’s eyes are continuously registering photons (the sun’s light is all around), the astronaut does not have sensations of light (Zajonc, 1993).
Colored Illumination Strictly as Such
For the purpose of generality, the light box can be used to extend the inquiry to color vision. For example, colored acetate strips can be hung inside the box in front of the light shaft. Colored light is thereby projected into the box. When a sheet of white paper is affixed to the rear interior surface of the box opposite the light shaft, a camera placed within the box captures the colors projected on it, affirming the presence of red, green, or blue illumination throughout the box. Observers’ experiences when looking through the viewing aperture are no longer unexpected: They see darkness whether the light is red, green, or blue. But once again, an unpainted wooden rod inserted through the slot reveals the color of the flashlight’s acetate strip: A red, green, or blue stick is seen against an otherwise black background. The light box can be taken outdoors, the sun can replace the flashlight, and we can do an outdoor analogue of Newton’s (1704/1952) original prism experiment. As Figure 4.6 shows, the box can capture a light spectrum. And in agreement with the prior experiments, when the box is
Figure 4.6 Newton revisited. (a) The box was placed outdoors, slanted so that the illumination shaft faced the sun, and an optical glass triangular prism was aimed so as to shine down the illumination shaft. (b) With a camera inside the box, the light spectrum (rendered here in shades of gray) can be seen on an opaque white surface placed against the rear wall of the box. (c) The spectrum that is visible on the rear wall is not seen at the viewing aperture. (d–f) When a rod is introduced, however, it is easily seen in the color of the part of the spectrum that it intersects.
closed, spectral qualities are registered by camera and observers only in the presence of the inserted rod, that is, only in the presence of a reflecting (illuminate-able) surface. We are claiming that observers experience the radiant light-filled box interior as black (i.e., as unlit) because the interior walls do not reflect light and vision is tied to reflected light not radiant light. One might contend, however, that observers’ experience of blackness on those occasions arises because the direction of light in the box (the direction of the illumination shaft) is perpendicular to the direction of the viewer’s line of sight (the viewing aperture). That is, one might contend that vision requires that illumination’s direction be parallel to, not perpendicular to, the viewer’s line of sight. A final demonstration with the light box, however, indicates otherwise. All of the foam walls were covered with light-reflecting paper (Figure 4.7a). In sharp contrast to the nonreflecting version of the box, the walls
Figure 4.7 (a) Box with interior reflecting surfaces. (b) A top view schematic of the light in the interior of the foam-lined box. (c) A top view schematic of the light in the interior of the box shown in (a).
of the light-reflecting version were plainly visible at the viewing aperture. The inserted rod was not needed to reveal the illumination of the box’s interior. We take the implication to be that light in the box is invisible in Figures 4.5a–c and 4.6c not because the projection of light is perpendicular to the line of sight but because it is not reflected. Figures 4.7b and 4.7c schematize the contrast between light’s behavior in the non-reflecting box and the reflecting box. The consequence of scattering in an enclosure, such as the box with reflecting walls, is multiple reflection or reverberation, an endless bouncing of light from surface to surface, a network of convergence and divergence that is indefinitely dense. It renders the light in a given environmental arrangement specific to the environmental arrangement (Gibson, 1966a, 1979/2015). For those who remain unconvinced, let us return to the situation of astronauts in deep space. Imagine two spacewalkers who can see each other in the surrounding blackness of space by virtue of the light each reflects to the other’s eyes (Figure 4.8a). If we restrict our attention to the light reflected from the face of one (Figure 4.8b), it is clear that she is bathed in the light she reflects, including light that is reflected from her eyes. If the other astronaut moves out of view, she sees only darkness where he had been. She does not see the light despite still being immersed (eyes included) in it (Figure 4.8c). The problem is not that light doesn’t make its way into the eye of the observer where it can affect the sensory apparatus; the problem is that there is no structured light.
Implications of the Invisibility of Light
The foregoing examples were previewed by Gibson (1979/2015):
The only way we see illumination, I believe, is by the way of that which is illuminated, the surface on which the beam falls, the cloud, or the particles that are lighted. We do not see the light that is in the air, or that fills the air. If all this is correct, it becomes quite reasonable to assert that all we ever see is the environment or facts about the environment, never photons or waves or radiant energy. (p. 48)
This assertion is echoed by the physicist, Zajonc: “without an object on which light can fall, one sees only darkness. Light itself is always invisible. We see only things, only objects, not light” (1993, p. 2). What about the opposite scenario: What would we experience if light itself were visible? For an illuminated environment in a transparent medium, we would have to assume that all points of observation would be completely filled with light. In consequence, the surrounding surfaces would effectively be occluded by light at every point of observation (A. Zajonc, personal communication).
Figure 4.8 (a) Light reflected from spacewalkers allows each astronaut to be seen by the other in the blackness of space where there are no other surfaces to reflect that light. (b) Limiting the depiction to what is reflected from the face of the astronaut on the left shows that light is available at his or her point of observation. (c) Even when the second astronaut is out of view, the light the first astronaut reflects is still available at her point of observation; yet she sees nothing: There is energy but no structure.
Do we ever see light as such? The demonstrations and experiments reported here suggest that we do not. What we see are illuminated surfaces. This is a point that Gibson returned to in each of his books when he discussed Metzger’s (1930) experiments on the Ganzfeld as well as his own experiments (Gibson & Waddell, 1952). A homogeneous field of dim illumination, such as from a fine-grained surface completely filling the visual field or covering the eyes with homogeneous light-diffusing caps, provides light but no structure. Observers do not perceive the surface before them:
What the observer saw, as I would now put it, was an empty medium … The purpose of the experiment is to control and vary the projective capacity of light. This must be isolated from the stimulating capacity of light. Metzger’s experiment points to the distinction between an optic array with structure and a nonarray without structure. To the extent that the array has structure, it specifies an environment. (Gibson, 1979/2015, p. 143)
Gibson (1966a) connected Ganzfelds to the occurrence of the special weather conditions of a whiteout in snow plains: “whiteout provides no information about the world because, although energy is present, structure is absent” (p. 293). And he drew an analogy to a blackout: “Blackout provides no information about the world because energy is absent” (p. 293). In neither case are surfaces perceived; energy alone is not enough. Echoing Bentley’s (1954/1975) characterization of the retinal image, we would say that casual claims of “seeing light” are another instance of “a careless use of words.”
Summary
Our critical evaluation of the commonplace discourse on the retinal image and the visibility of light bears on Gibson’s aim in Chapter 4: To “describe the information available to observers for perceiving the environment.” In our chapter, we have highlighted the historical inclination to accept the image formulation as a fitting characterization of the light to the eye, and the echoes in contemporary literature. As underscored, that inclination engenders a theory of vision that is necessarily inferential: Perception is a process of making inferences with respect to images on the retina. Correspondingly, in our chapter we have highlighted the historical inclination to accept visible light as a fitting ontological characterization of light as such, again with contemporary echoes. That inclination leads to a theory of vision necessarily expressed as having experiences of light quanta, that is, the view that visual experiences are first and foremost of light as such. In his Chapter 4, in contrast, Gibson highlighted the distribution of light (its structure, its order) as information about the surfaces and substances that structured it. In so doing, he provided a formulation in terms of information at nature’s ecological scale that rejects both of the storied historical implications.
Notes
1. We have not found citations of Bentley or Campbell in any of Gibson’s books or papers. Although calls to question the literal existence of the retinal image would likely be celebrated by Gibson, his own arguments were simply against how the retinal image has been used conceptually.
2. Helmholtz’s footnote 1 attributes this report to A. W. Volkmann, a renowned scholar in physiological optics.
3. Anecdotally, we have asked our own optometrist what he sees when he uses the ophthalmoscope to peer into our eyes. He listed the features related to retinal anatomy and health (vessels, floaters, etc.). Trying not to lead the witness too much, we artfully encouraged him to say more. He responded, “Do you mean do I see what you see? No, that’s just silly” (S. McKeown, personal communication, April 27, 2018).
4. Identical impressions were reported by Zajonc (1993), whose light source was a powerful projector rather than a flashlight.
The Information for Visual Perception
5 Getting into the Ambient Optic Array and What We Might Get Out of It
William M. Mace
James Gibson labored carefully over the development of the concept of the ambient optic array. This chapter will present and elaborate Gibson’s most mature articulation of the ambient optic array concept but with enough history to dramatize Gibson’s constant effort to appreciate relevant empirical results, to adjust all aspects of his theorizing to one another and to keep the system focused on the overall task of explaining perception of the environment in a coherent way. With the development of the ambient optic array concept in hand, the chapter will then survey what has happened in research on vision from 1979 to the present in order to discover what people have and have not appreciated in Gibson’s work.
The Ambient Optic Array (for Real)
At the beginning of Chapter 5 of the 1979/2015 book, Gibson says:
The central concept of ecological optics is the ambient optic array at a point of observation. To be an array means to have an arrangement, and to be ambient at a point means to surround a position in the environment that could be occupied by an observer. The position may or may not be occupied; for the present, let us treat it as if it were not. (1979/2015, p. 58)
Thus, the optic array is a theory of the optical structure of the environment. So far as I can tell, there has never been anything quite like this, that is, the “of the environment” part. Ecological optics says that in an environment with an atmosphere, opaque surfaces, and light, the light bouncing in all directions will come to an equilibrium such that there will be patterned structure that is specific to the structuring environment. The minimum structure for Gibson is an intensity difference. Light, as streams of photons, is not the relevant description. The relations of difference are relevant. Light is a medium to carry structure. Structured light makes light focusable. Unstructured (uniform) light is unfocusable. A surround of homogeneous intensity is a Ganzfeld, and both scientists (Avant, 1965; Gibson & Waddell, 1952) and artists, e.g., Robert Irwin (Weschler, 2009) and James Turrell (Adcock, 1990), have fashioned versions. In order to have accommodation adjustments of an eye’s lens, the adjustments have to be to something. There has to be an intensity difference created by arrangements in the optic array that can be brought into focus. Accommodation is not something an optical system can do in memory or imagination. There is something in the array, structured by an environmental arrangement, to be focused on. This (accommodative adjustment) is the first of Gibson’s list of criteria for reality that will be presented below. Now consider this next move of Gibson’s in presenting the structure in the ambient optic array:
We obtain a better notion of the structure of ambient light when we think of it as divided and subdivided into component parts. For the terrestrial environment, the sky-earth contrast divides the unbounded spherical field into two hemispheres, the upper being brighter than the lower. Then both are further subdivided, the lower much more elaborately than the upper and in quite a different way. (Gibson, 1979/2015, p. 60)
The ambient optic array is divided into two parts: the upper (sky) and the lower (earth), creating a brightness difference at the horizon. Note that this is an extreme simplification but it is the opposite of atomism, which simplifies by reduction to the smallest parts. Gibson’s simplification, the division of the whole optic array into two parts, is a simplification that is more like a Gestalt whole. Simplification begins with the largest part. But unlike Gestalt entities, this whole is not a mental organization. It is the environment that is divided into sky and earth. It is a large containing envelope that life is inside of (Mace, 1977). This very large containing envelope does not require observers for its existence, although there is no specific containing envelope without something to be contained.
Minimally, to do the geometry, a point of observation must be declared. Both halves of the array function as background for all that is contained in the world. The sky, with little detailed structure, is not a very determinate background, whereas the richness of surface layout on the earth makes for a far more determinate background for the nested components that it contains. Now consider the subdivisions of each hemisphere––cloud configurations in the sky, rich varieties of surface layout on the earth. The two enveloping hemispheres have nested structures of adjacent patches that project to points of observation. Gibson said that the parts of an array are given as solid angles projecting to possible points of observation. The solid angles are faces and facets of surfaces as well as the gaps between them (like looking through tree leaves into a background sky). He stressed that the perspective lines in his diagrams (see Figure 5.2) indicated differences
Getting into the Ambient Optic Array 75 in intensity of light, differences that would be preserved (invariant) over absolute light levels. Thus, the lines in his diagrams are not rays of light. For Gibson, the ambient optic array is a plenum. It is structurally dense, like a jigsaw puzzle. Each “form” is a face of a surface, a facet, or even an opening onto the sky. There are no gaps in this dense structure. The entire set projects to a point of observation to yield a structured optic array at that point of observation. One consequence of this initial image of a structured array is to avoid any thinking based on single points in empty space. Motion perception, considered as a change of position of a point relative to the retina, is not a natural topic from the standpoint of a packed optic array. Rather, for Gibson, the changing structure of the optic array (the disturbance of the structure) had to be fundamental and types of change, not simple motions, need to be studied as such. Because of this nested puzzle structure, for Gibson, the natural way to locate some portion of the array was by inclusion (nestedness), not by systems of coordinates. As puzzle pieces, each solid angle would be unique, and the total arrangement at each point of observation would be unique— in contrast to a coordinate system, in which every point is the same as every other point. It follows from this uniqueness that the exact location in the world of a photograph can be determined, where “exact location” refers both to the world setting and the camera location in that setting. Using Google Earth to zoom in on and then zoom back can provide multiple nested levels, to underscore the distinctiveness of otherwise similar- looking locations. See Figure 5.1. Zooming back to increase the nested structure makes the location more definite. 
When Gibson described the contrast between radiant and ambient light, and the key features of the ambient optic array, he began with an unoccupied array without motion. Figure 5.2a shows Gibson’s diagrammatic optic array of a room with one window and no person. Remember, as mentioned above, that the lines do not indicate rays of light, but differences of intensity. Gibson found projective geometry useful, even if limited. The geometry he advocated was that of Euclid and Ptolemy, termed “natural perspective.” Natural perspective is constructed from solid angles formed by the faces of environmental surfaces projected to a point. Artificial perspective introduces projection surfaces as slices of the solid angles. The projection surfaces of artificial perspective contributed greatly to the mischief of picture theories of vision that Gibson strenuously avoided. The projective geometry of the top of the table (or stool) for the seated and standing observer in Figure 5.2c would have the conventional texture compression differences of slant variation. Most significant for Gibson, the dotted lines in Figures 5.2a and 5.2b represent occluded surfaces, which are not an intrinsic part of perspective geometry. In Figure 5.2c, Gibson emphasized the occlusion difference between optic arrays of a seated person and the same person standing. When a point (place) in an array is occupied, then the surfaces of the viewer’s body are added to the structure of that array. Because the body is opaque, roughly
half of the optic array of a human is occluded. The edges of the eye sockets, the nose, and some of the body are visible, as in the famous diagram by Ernst Mach (see the four panels of Figures 7.1 and 7.2 in Gibson, 1979/2015).
Figure 5.1 Pike’s Peak, Barr Trail. Used with permission, Google Earth. Source: Google Earth Pro 126.96.36.19991. Pike’s Peak, Barr Trail. 38° 50′ 51.37″ N 105° 01′ 54.35″ W elev 12792 ft eye alt 12727 ft Imagery date 8/2017. Note: Google and the Google logo are registered trademarks of Google LLC, used with permission.
Structure in the Optic Array
Although Gibson was enthralled by the geometry of perspective, he was increasingly concerned by what it left out. He stressed that the geometric notion of a station point at the convergence of solid angles does not capture: (1) the facts of nestedness (the angles do not add up to 360°); (2) the change of structure induced by moving observers; (3) the optical consequences of opacity; (4) the reflectance of surfaces; (5) the spectral reflectance (color) of surfaces; and (6) the shadow structure. The structure of the ambient optic array must include all of these.
Figure 5.2 Ambient optic array diagrams. Source: From Gibson (1979/2015), Figures 5.2, 5.3, and 5.4. Copyright 2015. Reproduced by permission of Taylor & Francis Group, LLC, a division of Informa plc.
Invariants: The Foundation of Realism
Gibson’s core insight that allowed him to locate a basis for realism in the optic array was that information for both change and non-change could be present and detected together. Abstractly, there could be invariance under transformation. An early example that taught him this lesson came from his “shadow caster” studies. These studies were still rooted in perspective geometry and were very much within the scope of the “ground theory” perceptual psychophysics of the program laid out in Gibson (1950a). See p. 236 of Gibson (1979/2015). Changing shadows of a moving rigid object can look like just that: shadows of a rigid object moving. Gibson and Gibson (1957) showed observers the projected shadow of a rigidly rotating patterned planar object. The dominant impression was a three-dimensional rigid rotation of the object. The projected form stretches and compresses in two dimensions but the observed event is constant rigid rotation in three dimensions: a change in slant of a constant shape. Hence, invariance under transformation. The phenomenon, but not the explanation, follows Wallach and O’Connell (1953). Related geometry was pursued by John Hay (1966) who showed, contra Johansson (1964), that translation, shearing, stretching, and foreshortening all had to be taken into account to capture rigidity, that is, the constant shape. Even though the power of the idea of invariance in the optic array is well illustrated in rigid rotation, it is most emphatically shown in occlusion, especially dynamic occluding edges (see Heft, Chapter 11, in this volume). (There are 20 headings and subheadings in Gibson’s chapter on the Ambient Optic Array. Nine of these are related to occlusion.) One instance of occlusion occurs when a relatively uniform texture wipes out another at a margin. This creates what Gibson called kinetic disruption in the optic array, an instance of topological breakage. When one opaque surface hides another, from some point of view, Gibson argued that the mode of disappearance specifies only a change in point of view and not a change in the existence of a hidden surface. 
A surface in the world that is seen to be hidden by rotating or by another surface coming between the observer and the hidden surface is seen to persist if occlusion is the only change defined. For humans, the most pervasive occluding edges are the eye sockets and face. As a head turns, it brings new texture into view in one direction and hides texture on the trailing side. If those changes are specific to changes in the view, and not surface existence, then it is possible to sweep the occluding edges of eye sockets, nose, and face through the optic array and, over time, see much of the full surround without sampling it all simultaneously. With locomotion, one changes optic arrays. In Gibson’s (1979/2015) terms, the perspective structure changes. See also the legend of Figure 5.4 of Gibson (1979/2015). New surfaces come into view, others go out of view. The ones that come into view by uncovering are seen to pre-exist (and not come into existence). Those that go out of view are seen to persist (and not go out of existence). These changes are reversible and that reversibility specifies the existence of a surface separate from the self. I can turn my head to bring into view parts of a room to my left, turn to the right where I cannot see those parts, and then turn back to see
them again. This is the core structure in the optic array, whose full significance has been pursued only by Gibson. Gibson, Kaplan, Reynolds, and Wheeler (1969), and the accompanying film, showed a variety of types of disappearance (occlusion, melting, evaporating, being eaten, going into the distance) of optical texture. Some specified the destruction of a surface, others (occlusion and going into the distance) did not. Those distinctions are crucial. Changes that specify destruction of surfaces (e.g., evaporation, eating away, melting) are not reversible. The hallmark of an irreversible change, presented in film or video, is that an observer can tell if a film is being shown forward or backward (Gibson & Kaushall, 1973). Aerosol spraying out of a can is one example: as a visible event, it is not reversible. A cloud of aerosol collecting and going back into a can is obviously impossible, and it looks strange when film of the spray is shown backward.
Setting the Stage for Gibson’s Position that Structure in the Optic Array Is Self-Warranting The first part of Gibson (1979/2015) is about how the ambient optic array gets its structure from the nature of the world’s surfaces, media, and illumination, together with active observers themselves. When perceiving occurs, Gibson’s position is that structure is detected and that the specificity of the relation between the environment and the optic array guarantees that the environment is thereby detected. Many people who describe Gibson’s ideas would agree with this statement. Fewer people may have wrestled with the issues in the next section, which are required to finish the thought. Option 1 The Ambient Optic Array and the World Are Categorically Different When one says that the optic array is informative, what does it mean? Speaking colloquially, we can note that globs of substance in the environment have three dimensionality and mass. Material can be weighed. Seemingly, light patterns do not have such properties. Therefore, light structure and “objects” in the world are very different kinds of things. It presumably is easy for most people to accept that arrangements of light can point to (i.e., be “cues for”) material parts of the world, but surely arrangements of light cannot be such things. Thus, there is a common distinction between information and “things.” Gibson himself stressed that the structure in the optic array should not be confused with what was in the environment: Separate terms needed to be devised for physical motions and for the optical motions that specified them, for events in the world and for events in the array, for geometry did not provide the terms … Perhaps
80 William M. Mace the best policy is to use the terms persistence and change to refer to the environment but preservation and disturbance of structure to refer to the optic array. (1979/2015, p. 236) And, of course, Part I (The Environment to Be Perceived) of the 1979 book is distinct from Part II (The Information for Visual Perception). If we were to follow the framing of Option 1, then the structure in the optic array and the world specified by that structure are very different kinds of things. I take this to be the most common view in vision science. The distinction between proximal and distal, assumed by most inferential approaches, contains this assumption. It is from this assumption that one is faced with what Warren (2005, p. 338) and Turvey (2019, Chapter 7) described as “Hume’s problem.” That is, if we immediately experience the optic array, but the world that “causes” the optic array is a different kind of thing, why would animals ever think to hypothesize causes of their experience, and where would the causes that animals hypothesize come from? This problem was discussed at greater length by several of us in the early 1980s (e.g., Shaw, Turvey, & Mace, 1982) under the rubrics “incommensurability of natural kinds” and “intractable nonspecificity.” Option 2 The Ambient Optic Array and the World Are Different Perspectives on the Same Thing (Gibson) A glimpse of an answer to “Hume’s problem,” to help develop this Option 2 comes from William James. James, like Gibson, thought that a variety of deep puzzles originated in scholars’ lack of imagination about the richness of experience. The foundation experiences in traditional British empiricism were simple sensations, far from “the world” as such. Thus, the world had to be “outside” of experience as something constructed or inferred. In contrast, James embraced a phenomenally rich empiricism that he called Radical. 
He argued that this allowed “the world,” as content and object of experience, inside experience to begin with. To illustrate, James asked what the object of a mental image might be. In the case of his thinking of “Memorial Hall,” he asked if it really is “Memorial Hall’ ” that he is thinking of. He concludes: [I]f I can lead you to the hall, and tell you of its history and present uses; if in its presence I now feel my idea, however bad it may have been, to be continued; if the associates of the image and of the felt hall run parallel, so that each term of the one context corresponds serially, as I walk, with an answering term of the others; why then my soul was prophetic, and my idea must be, and by common consent would be, called cognizant of reality.
Getting into the Ambient Optic Array 81 … That percept was what meant, for into it my idea has passed by conjunctive experiences of sameness and fulfilled intention … Knowledge thus lives inside the tissue of experience … (James, 1904, 539–540) Thus, to James, “Memorial Hall” is a family of experiences sufficient to count as the referent of the name of the building. And it remains “inside the tissue of experience.” Two other points worth mentioning in light of James’s quote. First, the “end” that is indicated is a family of experiences that do not have a sharply defined end, but rather a pragmatic one. At some point James has seen enough to be fully satisfied that he is in the presence of Memorial Hall. We might call this “clarification to sufficiency,” or “clarification until sufficient.” One never sees ALL of anything in the world. This is consistent with how Gibson defined perceiving in 1979 (Gibson, 1979/2015, p. 244). Notice immediately the similarity of James’s description to Gibson’s “rules for the visual control of locomotion” (Gibson, 1979/2015 p. 222). For example, one Gibson rule for the control of locomotion is, “To permit scrutiny, magnify the patch in the array to such a degree that the details can be looked at. To manipulate something graspable, magnify the patch to such a degree that the object is within reach” (pp. 222–223). These “rules,” in turn, are very like the descriptions Gibson used throughout his career, from Gibson and Crooks (1938) to flying (Gibson, 1947, 1950a). The descriptions of flying, in turn, are close to those of Langewiesche (1944). Gibson’s descriptions are very phenomenological, although he always used the word introspection (for phenomenology connections, see Heft, Chapter 11, in this volume).
Automatic Tests for Reality Within the Optic Array The major leap that allowed Gibson to realize what James partially intuited was the role of invariance. By recognizing the existence and role of invariance in the optic array, Gibson established a basis for realism that avoided the old conundrums of incommensurability (Gibson, 1967a). Gibson argued that his discoveries, which surely center on invariance, actually allowed for “automatic tests for reality” within the structure of the ambient optic array. Consider the following three Gibson quotations: A criterion for an existing surface is that it can project a visual solid angle to some point of observation in the illuminated medium of a cluttered environment. If it does not, it is nonexistent (unreal, imaginary). An existing surface can therefore come into and go out of sight by a shift of the point of observation. A nonexistent surface cannot. Perception need not be sustained by a persisting sensory input, since it consists of registering invariants and variants, not of processing
82 William M. Mace inputs. One proof is that we perceive a layout of surfaces, only some of which are in sight at any given moment. That is, the perceiving of persistence depends on information not on the persistence of vision after stimulation ceases. (Gibson, 1977; original emphasis) I suggest that perfectly reliable and automatic tests for reality are involved in the working of a perceptual system … A surface is seen with more or less definition as the accommodation of the lens changes; an image is not. A surface becomes clearer when fixated; an image does not. A surface can be scanned; an image cannot. When the eyes converge on an object in the world, the sensation of crossed diplopia disappears, and when the eyes diverge, the “double image” reappears; this does not happen for an image in the space of the mind. An object can be scrutinized with the whole repertory of optimizing adjustments … No image can be scrutinized––not an afterimage, not a so-called eidetic image, not the image in a dream, and not even a hallucination. The most decisive test for reality is whether you can discover new features and details by the act of scrutiny. Can you obtain new stimulation and extract new information from it? Is the information inexhaustible? Is there more to be seen? The imaginary scrutiny of an imaginary entity cannot pass this test. (Gibson, 1979/2015, p. 245) If one re-reads what James said about connecting his “mental image” to Memorial Hall by unfolding the experiential sequence to an acceptable conclusion, it is a short (but significant) step to add Gibson’s criteria to get to the “real world” through perceptual experience. As an example, consider adding the reversibility of occlusion to what James said. Then, James’s “soul” need not only be “prophetic,” but it can extract the invariance of existing surfaces by the reversibility of the changes in the optic array. 
The route to Memorial Hall, and the surfaces of the Hall itself, consist of persisting surfaces that can be brought into view much as James described. Those surface textures that go out of view can, reversibly, be brought back into view if they persist. Reversibility is the guarantor of the independent existence of the persisting surfaces. One could then add to James’s description an allusion to the return trip to his home, a later trip back to Memorial Hall, and so on. The reversibility of those experiences establishes the independent reality of the surfaces. James maintained that an object of experience can still be within experience. As important as James’s move is, it still leaves an opening for a strong subjectivism, even if that is anathema to James. Gibson showed that one could preserve the coherence of James’s formulation by staying within experience, but also establish that independently existing, persisting surfaces, could be revealed through invariance.
Artistic Emphasis on the Ambient Optic Array The retinal image was a bête noire for Gibson. The ambient optic array was his alternative. It provided something overarching or surrounding to be sampled and could be a basis for explaining what larger whole successive sampling could converge on. Without a basis for convergence (clarification, equilibration, adjustment) larger than a single sample, processes, such as convergence, clarification, or adjustment, are left unexplained. Other ways to draw attention to the ambient array as a structure overarching retinal images can be illustrated through non-pictorial art, including architecture. Robert Irwin The contrast between the retinal image and the ambient optic array may have no better dramatization than the career of the artist, Robert Irwin (Weschler, 2009). His work literally progressed through stages from painting pictures to obsessing over the settings for displaying pictures, to the environments themselves, leading us out of the temptations of the retinal image and delivering us to the ambient optic array, which is the point of my drawing attention to Irwin’s career. Robert Irwin was a talented conventional artist when he began his career in the early 1950s. However, he soon established the goal of creating work that was not about anything but itself: I began to recognize the difference between imagery and physicality, and furthermore that for me, the moment a painting took on any kind of image, the minute I could recognize it as having any relationship to nature, of any kind, to me the painting went flat. … Imagery for me constituted representation, ‘re-presentation’, a second order of reality, where I was after a first order of presence. (Weschler, 2009, p. 65) In pursuit of this elusive goal, he experimented with using paintings of lines, then tiny dots. He invented something he called his “disc” paintings to defeat the confines of a frame, challenging himself to dissolve the frame completely. 
Twelve slides of a range of Irwin’s work can be viewed on the Pace Gallery website, www.pacegallery.com/artists/211/robert-irwin. A major turning point in Irwin’s career occurred after an apparently failed installation at the Museum of Modern Art in New York in 1970. He decided that working out of his California studio was a dead end. In a period of reflection, he made trips to the Mojave desert and mused: The Southwest desert attracted me, I think, because it was the area with the least kinds of identifications or connotations. It’s a place
84 William M. Mace where you can go along for a while and nothing seems to be happening. It’s all just flat desert, no particular events, no mountains or trees or rivers. And then, all of a sudden, it can just take on this sort of … magical quality. It just stands up and hums, it becomes so beautiful, incredibly, the presence is so strong. Then twenty minutes later it will simply stop. And I began wondering why, what those events were really about, because they were so close to my interests, the quality of phenomena. (pp. 163, 164) Here are truly ambient optic arrays, structured by extended surface, sky, and lighting at a specific time of day. Much of Irwin’s subsequent work concentrated on finding minimal ways to heighten viewers’ awareness of environments (both indoors and outdoors) that already exist. An example is “Running Violet V” in a Eucalyptus Grove near the Faculty Club of the University of California at San Diego. Because Irwin’s concerns are the effects of a whole surround, pictures cannot capture the effects he created. To avoid any misunderstanding, he did not allow photos of his work to be published. He later relented. Consequently photos and video explorations of many of his works have been made and are not hard to find online. James Turrell A second major artist, who surely does know something about Gibson (having studied psychology as an undergraduate at Pomona) and who once worked with Robert Irwin, is James Turrell (Adcock, 1990). Turrell also has spent a career using surface layout and light to create non-picturebased visual experiences. Many of Turrell’s works involved sharply cutting apertures in walls and ceilings, framing the sky to create variants of what Katz (1935) called “film color.” In these situations, the sky seems to fill the frame but to be flush with it, as a mysterious part of the surface. The ultimate frame for the sky, built by Turrell, is a sharply sculpted crater, Roden Crater, in northern Arizona. 
A site for numerous arrangements for framing light, including celestial objects, the central experience for observers is to lie down on the floor of the crater and observe the dome-like appearance of the framed sky. Much of Turrell’s work, shown in museums, consists of careful arrangement of lights, sometimes to make a well-delineated, shaped volume of light, sometimes evenly distributed to make a Ganzfeld. For art that exploits occluding edges to control what viewers see as they move relative to the work, see the constructions of Yaacov Agam. Internet sources are easy to find. Architecture Going beyond the retinal image to the ambient optic array seemingly would not need to stretch to artists like Irwin and Turrell and could just as
well simply cite sculpture and architecture. The architect Michael Benedikt is best known to ecological psychologists for pursuing and promoting a geometric concept called the isovist (Benedikt & Burnham, 1985; Benedikt, in preparation, part II). Isovist software is available at isovists.org. An isovist is the shaped space of all surfaces visible from a given location: draw all lines of sight from that location to all visible points. Gibson’s Figure 11.2 in the 1979/2015 book is essentially an isovist. Numerous measures can then be made on an isovist, for example, minimum distance, maximum distance, average distance, area (isovists are usually examined in two dimensions), perimeter, and jaggedness (perimeter²/area); the inverse of jaggedness gives compactness. The isovist is a kind of abstraction of Gibson’s concept of a vista, where the vista is all that is unoccluded from a given place of observation, but the vista contains all of the nested structure, as illustrated in Figure 5.1. An isovist is abstracted for the purpose of numerical measures. Since Benedikt introduced his ideas to the community of ecological psychology in 1981, the use of isovists has increased dramatically in architecture. “Space Syntax” is the name of the prime sub-field begun as early as 1976 by Bill Hillier at University College London. See Hillier (1996) for a full presentation that includes the use of isovists and Turner et al. (2001) for additional analysis. There are now multiple websites showing how isovists have been used in building and urban design.
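The isovist measures listed above can be sketched in a few lines of code. What follows is only an illustration, not Benedikt's software or the isovists.org implementation. It assumes the visible region has already been found (for example, by casting lines of sight) and is supplied as a simple two-dimensional polygon. The function name `isovist_measures` and the dictionary keys are hypothetical, and the radial distances are sampled only at the polygon's vertices, which is a simplification.

```python
import math


def isovist_measures(viewpoint, polygon):
    """Summarize a 2D isovist given as an ordered list of (x, y) vertices.

    Hypothetical helper: the visible region is assumed to be already
    computed and to be a simple (non self-intersecting) polygon.
    """
    vx, vy = viewpoint
    # Assumption: radial distances are sampled only at the polygon vertices.
    radials = [math.hypot(x - vx, y - vy) for x, y in polygon]

    twice_area = 0.0
    perimeter = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1          # shoelace formula term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    area = abs(twice_area) / 2.0

    jaggedness = perimeter ** 2 / area           # perimeter squared over area
    return {
        "min_distance": min(radials),
        "max_distance": max(radials),
        "mean_distance": sum(radials) / len(radials),
        "area": area,
        "perimeter": perimeter,
        "jaggedness": jaggedness,
        "compactness": 1.0 / jaggedness,         # inverse of jaggedness
    }


# A square 4 x 4 room seen from its center: nothing is occluded,
# so the isovist is the room itself.
room = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
measures = isovist_measures((2.0, 2.0), room)
print(measures)
```

For the square room the area and the perimeter are both 16, so jaggedness is 16 and compactness is 0.0625. A room with alcoves or partitions would yield a longer perimeter for a similar area and therefore a higher jaggedness.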
Developments in Vision Science, Post 1979 What are the developments related to the ambient optic array since 1979 that we might look to? What might speak to developing the optic array concept as distinct from the retinal image? Research outdoors For some years after Gibson’s 1950 vintage studies, we saw little interest by other researchers in the ground as visual content and context. He, Ooi, and their colleagues and students have changed that and done extensive research using the ground as a background for perceptual judgments (e.g., He et al., 2004; Ooi & He, 2007). Farley Norman has contributed his share as well (Norman et al., 2017). Hartman (1994) reported size judgments of a truck parked on a hill seen against a cloudy sky compared to a clear sky. Jim Todd (1982) and Koenderink (1986) provided nice progress on analysis of the rigid vs. non-rigid distinction. Spröte and Fleming (2016) have shown a more recent interest in investigating non-rigid transformations. Their studies here were with static (albeit stereo) graphic images to see how much might be conveyed about properties of pictured objects. Their interest is in inferences and judgments about causal processes that led to observed shapes, as well as material properties of the shapes.
86 William M. Mace Mingolla and his students (Ruda et al., 2015) have used occlusion as both an independent and a dependent variable. They have investigated the optical information for occlusion, but also related their results to depth perception. Tanrikulu et al. (2018) continue to examine occlusion as tied to depth perception. Hancock and Manser (1997), using a driving simulator, compared estimates of time to contact (see Smart et al., Chapter 10, in this volume) with vehicles that simply disappeared compared to those that “disappeared” by occlusion. The most common heirs to that work assimilate it to “prediction motion” (PM) tasks (Makin, 2018). Nam Gyoon Kim (2008) examined abilities to indicate heading direction in the context of occlusion from an undulating ground plane. Recent work using occlusion by the Bingham lab is covered in Heft’s Chapter 11, in this volume. Readers understand well, I imagine, that research tends to cluster around empirical manipulations and phenomena more than the issues framed Gibson’s way. I mentioned “prediction motion” above. In the same vein, Gibson and Gibson (1957) tend to be extended now under the rubric of “structure from motion” more than invariance under transformation. See Koenderink (1986) for a critique of some “structure from motion” work within a broad presentation of “optic flow.” Another area that Jim Todd contributes to prominently is called “shape-from-shading.” Alan Gilchrist (2018) has persistently examined lightness perception in real conditions, inveighing against exclusive use of computer displays for such research. Seokhun Kim completed a dissertation using an optical tunnel (Kim, Carello, & Turvey, 2016), inspired by Gibson, Purdy, and Lawrence (1955). See Figures 9.1–9.3 in Gibson (1979/2015), where the discussion begins on page 145. Gibson (1982c) says, “this experiment led to the hypothesis of an optic array as contrasted with the retinal image” (p. 98). 
In the optical tunnel, Gibson manipulated whether or not there was focusable light, and whether or not spaced partitions could be arranged to specify a solid surface. He was manipulating what was available to an eye rather than what was “on” the eye at a given moment. In order to advance our knowledge of the optic array, we should look for clues anywhere people are doing serious work related to perceiving the environment. What Gibson can help with most is the framing vision of what the overall project can be about in order to avoid debilitating puzzles and blind alleys. Research Outside Gibson, but Related What is discussed in the next sections refers to a good bit of work that could be seen as related to Gibson’s ambient optic array although that is more like a whiff of Gibson. It is hard to see “Hume’s problem” taken seriously.
Computer Graphics and Video Games
The biggest story for ambient optic array research since 1979 is in visual technology. The optic array can be manipulated as never before. The development of computer hardware and software has not only brought dramatic new capabilities for labs with very large budgets, but dramatic price drops have brought extraordinary technology within the reach of a very large number of labs. The video game industry and animation in feature films have fueled much of the relevant media technology. Recall that for James Gibson and his students to make a short black-and-white computer animation to investigate occlusion (Gibson et al., 1969), they had to make a special trip to Bell Labs. Now, artists and researchers have software tools like Python and Blender. They have commercial software like RealFlow by Next Limit available to them: www.nextlimit.com/realflow/latest-release/#prettyphoto/0/. Behind this kind of work are scientists like Robert Bridson (2016), who took on the challenge of depicting fluids. He is the developer of Naiad software widely used in animation for feature films. See also Bridson and Batty (2010) and the Bridson website, which links a large number of papers: www.cs.ubc.ca/~rbridson/. Adelson (2001) described forays into the study of materials, including seminal work by the Koenderink group. The Adelson group (Xiao et al., 2016) more recently reported research on folding in fabric.
Virtual Reality
Closely related to computer graphics are virtual reality (VR) displays, some of which are immersive and some not. Immersive virtual reality promises to give us great purchase on the ambient optic array because it provides optical structure that surrounds and varies with head motion. It provides a visual environment that can be explored. Smartphone implementations can simulate windows on another environment that can be explored. See Scarfe and Glennerster (2015) for a review. 
Among ecological psychologists who have employed virtual reality are Geoff Bingham (Bingham et al., 2001), Frank Zaal, Claire Michaels, and Reinoud Bootsma (Zaal & Bootsma, 2011; Zaal & Michaels, 2003), Bill Warren (Warren et al., 2017), and Jiang et al. (2018). Loomis (2016) reflected on 25 years of VR and the topic of “presence” for the eponymous journal. Tom Stoffregen and students (Munafo, Diedrick, & Stoffregen, 2017) have evaluated the Oculus Rift and found that it can induce discomfort and sickness in some people. Plenoptic Function The “ambient” part of Gibson’s ambient optic array has not been taken seriously by many, except in the technology of virtual reality (see below). One notable exception is the “Plenoptic function” of Adelson and Bergen
(1991). This also is called the “light field” (Ng et al., 2005). Like Gibson, the authors imagine a point of observation with light coming from all 360° around the point. However, they presume that the proper surround for a point is light, not surfaces: The significance of the plenoptic function is this: The world is made of three-dimensional objects, but these objects do not communicate their properties directly to an observer. Rather, the objects fill the space around them with the pattern of light rays that constitutes the plenoptic function, and the observer takes samples from this function. The plenoptic function serves as the sole communication link between physical objects and their corresponding retinal images. It is the intermediary between the world and the eye. (Ng et al., 2005, p. 6) Research like this is fashioned with computer graphics, visual physiology, and optical devices very much in mind. Perceiving the world would come later. Put another way, whether or not “the world” in any sense appears naturally in the theory, or comes in outside of the theory, is in question. Where does “the world” come from in the theory of perception? Where does it make its appearance? Natural Scenes Following the work of Field (1987), people working within traditional areas of vision research ventured to extend their work beyond simple displays to “natural scenes” (nearly always photos of scenes, not the scenes themselves). Geisler (2008) reviews about 20 years of such work. See also Ruderman (1994). This approach intersects with the procedures and goals of Brunswik (1956) much more than Gibson. People are mainly interested in efficient digital image compression and in rationalizing features of the visual system. Thus, they ask what is statistically distinctive in a picture that makes it look like a “natural scene.” Predictability, then, suggests digital compression strategies that would not require storing values for every pixel in a digital image.
A recent example can be found in McCann, Hayhoe, and Geisler (2018). They took 96 stereo photographs (of which they could use 81) around the Austin campus of the University of Texas for their studies of depth perception accuracy in natural scene images. Like most vision science work, the proximal-distal distinction is taken for granted and “Hume’s problem” is not taken seriously.
Summary of Current Research The remaining challenge is to develop the coherent ecological approach that Gibson charted for us. What has happened so far is that some of
Gibson’s ideas have been patched on to existing programs that are taking up most of the publication space in vision science (psychophysics and neuroscience, both experimental and modeling), computer graphics, human factors, and design. Each of these will play out their own agendas. We’ll see what happens as they collide with internal contradictions or just give up on grandiose goals and fall back to engineering practicalities.
Formal Framework for an Ecological Program My emphasis in this chapter has been to show that invariance in the ambient optic array allowed Gibson to bring the external world into the optics. Robert E. Shaw has argued that the formal necessities to carry out the ecological program are best carried out at the level of symmetries (see Shaw & Kinsella-Shaw, Chapter 6, in this volume). The ecological program is replete with paired concepts: animal–environment, perceiving–acting, information–energy, observation–control, affordance–effectivity. Shaw argues that those must be related as mathematical duals and that the needed invariance to hold it all together is what he calls generalized action. The duals can be composed as duals of duals, allowing for differentiation without disintegration. The comprehensive structure of duals, which Shaw calls a coalition, is an alternative to hierarchies and heterarchies. A comprehensive list of references to this work is found in Shaw, Kinsella-Shaw, and Mace (2019).
6 The Challenge of an Ecological Approach to Event Perception How to Obtain Forceful Control from Forceless Information Robert Shaw and Jeffrey Kinsella-Shaw For the academic year (1969–1970), James J. Gibson invited Robert Shaw, who was at that time an assistant professor at the University of Minnesota, to visit Cornell University, so that he could learn more about Gibson’s ecological approach to psychology. Shaw had met Gibson at Minnesota in the summer of 1968 while taking his ecological psychology seminar at the Center for Human Learning’s Summer Institute. It was from their lively interactions in Gibson’s seminar that they had, so to speak, “a meeting of minds.” Shaw’s duties at Cornell would be to help teach Gibson’s graduate seminar and to teach the undergraduate perception course. Gibson expressed the hope that Shaw, who had a background in mathematics and logic, might see ways in which fundamental principles of ecological psychology might be made explicit—even formalized. Shaw gratefully accepted because he had an inkling that the fundamental significance of invariants in Gibson’s theory of perceptual information might be amenable to a group symmetry interpretation—a concentration of Shaw’s. The purpose of this prologue is to impress upon the reader that everything discussed in this chapter benefited from that year of close, almost daily, interaction with Gibson and having been granted license by Gibson to query him about any facts, concepts, or principles of his approach that Shaw might need help with (Shaw, 2002).
Some Challenges to an Ecological Approach to Event Perception The biggest challenge to explaining Gibson’s ecological approach to event perception is not understanding the problem he addresses but understanding the problem he fails to address. Although the role of ecological information, as expressed in terms of his optic array construct, deals perfectly well with kinematic event properties (i.e., geometric change), it scarcely addresses the problem of kinetic events at all (forceful change). This is not surprising since the physical concepts needed are not generally known outside of physics (e.g., d’Alembert’s inertial force principle; see Shaw & Kinsella-Shaw, 2007).
Consequently, this chapter will attempt to achieve three goals: (1) to assay the problems that a complete theory of event perception must resolve if progress is to be made; (2) to show that Gibson’s opening gambits on the event perception problem were promising if incomplete; and (3) to show how a physical concept, d’Alembert’s principle, introduced into ecological physics by the authors (Shaw & Kinsella-Shaw, 2007), can provide a method to solve the resulting problem of dimensional inhomogeneity, that is, the mismatch between forceless kinematic information and forceful kinetic events. What seems to be needed is a major innovation in the way event space-time geometry is conceptualized. Gibson helps to clarify what this new approach must overcome. This chapter is intended to complement Gibson’s event perception chapter by suggesting specifically how information might be available for forces on which action control depends.
What Are Ecological Events? James J. Gibson (1979/2015) tells us that to make progress on the problem of event perception, we should begin thinking of events at the ecological scale as being the primary realities for perceptual theory—with time and space being abstractions from them. For Gibson, time and space are distinct orders in the global layout of events and the structure over which they are defined. “Time” is an abstraction from successive-order and “space” from the adjacent-order of structures in the ecological-scaled event space. There were no competing theories of event perception available at the time—apart from one. Early last century, Einstein (1905/1923) built his Special Theory of Relativity on the fundamental notion of point-events (unextended points with both spatial and temporal coordinates) and points of observation on those events as defined at locations in space-time. The virtue of using 4D space-time geometry over ordinary 3D space geometry is that dynamics is achieved intrinsically by the tracery of points in space-time (called a “world-line”) without the need to add change “cinematographically” by sequences of frames (see Blau, Chapter 16, in this volume). However, Einstein’s space-time (Minkowski geometry) event theory was not what Gibson (1979/2015) had in mind, for, given that the overall environment is stationary, he did not think relativity theory was relevant for perception at the ecological scale. He asserted: The optical information for distinguishing locomotion from nonlocomotion is available, and this is extremely valuable for all observers, human or animal. In physics the motion of an observer in space is “relative,” inasmuch as what we call motion with reference to one chosen frame of reference may be nonmotion with reference to another frame of reference. In ecology this does not hold, and the locomotion of an observer in the environment is absolute. 
The environment is simply that with respect to which either locomotion or a state of rest occurs, and the problem of relativity does not arise. (p. 68, emphasis added)
As mentioned above, Gibson offers a different take on the nature of the temporal and spatial dimensions over which Einstein’s space-time is defined. Gibson offers the concept of successive order as a replacement for the temporal dimension and the concept of adjacent order as a replacement for the spatial dimension. In addition, the ecological approach also recognizes that events exhibit both structural changes, or variants, and structural persistence, or invariants, over these orders. Evidently, animals and humans have the general perceptual capability not only to distinguish between change and nonchange but also to classify styles of change, as well as the objects that undergo the styles of change (Kim, Effken, & Shaw, 1995). Consequently, event perception can be defined as the detection of information specifying a style of change that a structure undergoes over some determinate region of space-time, or better, over determinate adjacent places in a succession of durations. Two fundamental problems of event perception are, first, how one perceives change at all, and, second, how one perceives particular styles of change as such. A shortcoming of our field was (and still is) the lack of consensus on terminology for discussing what it is about events that might be perceived. Which abstract aspects of events are specified in optic array information that are likely to be detected?
Gibson’s Principles of Event Perception In his approach to event perception, Gibson (1979/2015, Chapter 6) declares his focus to be on just three kinds of events: 1. Changes in surface layout include translations and rotations of an object, collisions, deformations, and disruptions. 2. Changes of surface color or texture, significant alterations of the surfaces of plants and animals. 3. Changes in the existence of surface transitions of evaporations, dissipation, melting, dissolving, and decay. Dynamics is the branch of physics that studies time-dependent processes, and includes both kinetics and kinematics. Kinetics is the branch of classical mechanics that is concerned with the relationship between motion and causes, specifically, forces and torques. Kinematics, on the other hand, is a branch of classical mechanics that describes the motion of points (e.g., particles), bodies (objects), and systems of bodies (groups of objects) without considering the mass of each or the forces that caused the motion. Consequently, kinematics, as a field of study, is often referred to as the “geometry of motion,” while kinetics might be referred to as the “causes of the geometry of motion.” An additional complication, Gibson notes, is that some of these events are reversible, but many are not. Such invertibility plays havoc with causal structure since the inverting of cause → effect to
effect → cause violates the requirement that effects cannot precede their causes. This fact alone would make one question whether causality has any role to play in scientific explanation (d’Abro, 1951). For clearly, since the advent of quantum theory with its superposition principle, information specifying such inverted events cannot be restricted to causal structure alone. (A telling subtitle to d’Abro’s book is “The Decline of Mechanism.”) Traditional mechanical causation requires that for a cause to give rise to a noncontiguous effect in a nonlocal fashion, there must be a causal chain linking the two. In many cases, unfortunately, mechanical causal chains may run aground because some of the mediating linkages involve nonlocal constraints, such as least action, conservation laws, or field effects. Gibson and his followers have recognized such nonlocal constraints as making possible both retrospective and prospective control (Shaw & Kinsella-Shaw, 1988; Turvey, 1992). Processes exhibiting such nonlocal control were called “entelechal” by Aristotle and have also been called conative processes; they were recently discussed in detail by Shaw, Kinsella-Shaw, and Mace (2019).
The Optic Array: Information for Visual Perception It is most important to note that all three kinds of events mentioned by Gibson involve only kinematic dimensions, i.e., space and time, without kinetic dimensions (mass, energy, force, or momentum). There is a reason for this emphasis on the kinematic side of dynamics. Gibson introduces a theoretic construct, called the optic array, as an aid in characterizing the optical information that is detected during visual perception. The optic array is a cone of light rays defined at any point of observation (moving or stationary) that might be occupied by an actor (but need not be). More precisely, an optic array is a 360 º solid angle of adjacent light contrasts concentrated at a point of observation (Figure 6.1, from Gibson, 1979/2015, p. 65). Although, intuitively speaking, optic array geometry seems to be a kind of projective geometry construct, it is not intended to be. Gibson uses its projected rays only as a heuristic to facilitate communication, and refuses, on principle, to embrace projective geometry as a general method for ecological optics. For examples of Gibson’s use of the ray concept, see Gibson (1966a, pp. 12, 15) where he mentions rays and uses rays in his figures. “At every point in the illuminated medium there will be a sheaf or “pencil” of rays converging from all directions” (p. 15). And in Gibson (1979/2015, p. 52; emphasis added), he says: “A focused pencil of rays consists of two parts, the diverging cone of radiant light and the converging cone of rays refracted by the lens, one cone with its …”. He cautions us to beware of the important distinction between projective geometry descriptions of the optic array and what they should
Figure 6.1 The transformation of the optic array defined by a locomotor movement. The solid lines represent the optic array before the observer stands up, the dashed lines after s/he has moved. The path of locomotion of the head is forward and upward by which the whole array is transformed. Source: From Gibson (1979/2015), Figure 5.4. Copyright 2015. Reproduced by permission of Taylor & Francis Group, LLC, a division of Informa plc.
be. When speaking of the array structure, we should refer to optical forms (projected images), not to environmental forms (shapes of objects), and to a change in the array structure, as a transformation, not as a motion of objects or surfaces. There are no objects or surfaces in the optic array corresponding to those in the environment! Hence the use of the term here is more closely allied to some kind of (as yet unknown) “perspective” geometry than to projective geometry since abstract mathematics has not yet been applied to the problems of ecological optics. Gibson admonishes us to be circumspect in treating the optic array as ordinary geometry because the constituents of ordinary geometry, such as point, line, or plane, lack realistic dimensions that actual material objects possess (Pagano & Day, Chapter 3, in this volume). Points are deemed to be “ghostly” zero dimensional, lines to have unrealistic zero width, and areas as having no thickness at all. The lack of thickness to surface area is inherited from its generator—the moving line segment, and the lack of width to a line is inherited from its generator—the moving point. Since
surface opaqueness is lacking, there can be no occlusion of one surface by another, nor the self-occlusion needed to specify an object’s shape. For Gibson (1979/2015), there is also always some degree of recurrence (transformational invariance) and some degree of nonrecurrence (transformational variance) in the flow of ecological events. And, although Gibson alludes to points of observation in the environment, he is careful to explain that they are not to be thought of as being stationary, discrete, “ghostly” points in space but rather are pauses in the movement of an observer along a path through the environment. Also, all lines of projection pass through a point (the point of projection) with the resultant loss of the topological property of orientability. This loss renders the projected object’s orientation ambiguous, so that a projective geometry cannot distinguish the front of an object from its back (thus precluding self-occlusion) or tell its left side from its right side, making all surfaces unrealistically “see-through.” Because of these impossible properties, projective geometry and its Euclidean bases provide unrealistic models of the ecological world (Shaw & Mace, 2005). For these reasons, one might accept that the proper use of projective geometry lies not in constructing facsimiles of opaque objects but only in describing heuristic guidelines for their placement in geometric space. For instance, it can specify centers of objects by points and margins surrounding areas of light contrasts or object edges by lines. Such use is extrinsic and post hoc at best. Gibson (1957) asserts this admonition in a paper on optical motions and optical transformations: The notion of point-to-point correspondence in projective geometry, simple and powerful as it is, does not apply to the optics of events any more than it applies to the optics of opaque surfaces, for it leaves occlusion out of account.
The fallacy lies deep in our conception of empty space, especially the so-called third dimension of space. Whatever the perception of space may be, if there is any such thing, it is not simply the perception of the dimension of depth. (p. 289) Keeping these cautionary notes in mind, scientific prudence dictates that we approach the problem of event perception with a degree of trepidation.
Gibson’s Event Perception Gibson (1979/2015, Chapter 6) presents the rudiments for an ecological approach to event perception. (For the reasons given, he preferred to call it “an approach” rather than a theory.) Since his book is limited to vision, we should not be surprised that his discussion concentrates on the optical information for events. Consequently, our discussion in this exegesis is also
restricted to vision. It goes without saying that no adequate event perception theory can ignore the likelihood that all the senses act together as a perceptual system, such that one sensory system not only complements the others by adding redundant information but also supplements their information with additional event information specific to its own sensory system. In the place of cooperating sensory systems, the idea of amodal information that lacks sense “modality” specificity has been raised (e.g., Michotte, Thinès, & Crabbé, 1964). More recently, some contemporary psychologists have argued that in the place of multiple modality-specific arrays there should be only a single general sensory system (Stoffregen, Mantel, & Bardy, 2017). Under this view, the senses remain relatively independent under loss: when one sensory subsystem is lost or weakened, not all are lost, and the global array is not compromised. There are two fundamental assumptions of the optic array: First, the information for recognizing an event is captured through “projection” of the physical disturbances (i.e., mechanics) into optical disturbances of the ambient optic array. Second, the information that directly specifies an event’s identity resides in the invariant properties of the optic array disturbances caused by the actual event and resides at the place of observation, whether that place is occupied by an observer or not. By placing the optic array information source outside the observer, an important goal of ecological optics is automatically achieved. Since the place of observation may not be occupied by an observer, no cognitive abilities of an observer (memory, inference, mental images) can be relevant to the make-up of the information of interest. Thus, information is sui generis and, since there can be no mediator, it must be directly apprehended.
Gibson (1979/2015) argues that optic array events and real-world events are so dissimilar that they do not even deserve to be called by the same name: These disturbances in the optic array are not similar to the events in the environment that they specify. The superficial likenesses are misleading. Even if the optical disturbances could be reduced to the motions of spots, they would not be like the motions of bodies or particles in space. Optical spots have no mass and no inertia, they cannot collide, and in fact, because they are usually not spots at all but forms nested within one another, they cannot even move. This is why I suggested that a so-called optical motion had so little in common with a physical motion that it should not even be called a motion. (p. 101) In short, Gibson informs us that disturbances in the optic array lack mass, inertia, and even motion, and therefore do not resemble events in the world involving material objects with those properties. Event perception
presumably still works because, as impoverished as the array disturbances may be, they somehow still share common event invariants with real-world events. The fundamental problem of information as specification is revealed in this assertion. If so, then this way of posing the problem leaves us facing a conundrum of how purely kinematic optical information can specify kinetic events. And emphasizing the extreme dissimilarity of optical array disturbances to the actual events, except for sharing invariants, as true as it may be, seems to obscure the path to a solution. The laws of optics and the laws of mechanics provide the bases for determining all the invariant properties involved in each event, and must somehow be the means by which we recognize the physical event and the optical event as having the same referent. To be recognized as being about the same event, the force-driven disturbances in the environment and the forceless disturbances in the light projected from them into the optic array must share the same invariant information. How they might do so is the major puzzle to be addressed later in this chapter. Specifically, we will ask how Gibson’s theory of event perception which assumes only optic array kinematics might be expanded to include optic array kinetics. Formally, the issue is one of dimensional inhomogeneity, a mismatch in dimensionality between events with dimensions of mass, length, and time versus events with only dimensions of length and time. This problem is profoundly serious because a theory of living systems based on a mismatch in dimensionality can never, even in principle, solve Bernstein’s degrees of freedom problem that must be solved if a perceiving-acting system is to be capable of adaptive interactions with the environment.
For instance, without kinetic information, the negative affordances of lethal or injurious encounters with surfaces, missiles, or other life forms could not be recognized and thus not avoided. For example, there would be no information to distinguish the extreme danger of a charging bull from the friendly encounter with a running child. Their difference in mass makes the impact force from colliding with the bull extremely dangerous, while the impact force from colliding with a small child might even be fun. Likewise, the relative danger of stepping-off places would not be informationally distinguished from falling-off places since the difference in the danger of impact due to the force of gravity would be unspecified. This is not to say that the optic array might not register some useful kinematic information, such as a global transformation of the optic array which specifies to the actor that they are moving rather than some part of the environment, which is specified by local patches of change.
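The bull-and-child contrast can be put as a toy calculation. What follows is a minimal sketch and not part of the authors' treatment: the masses, approach speed, and stopping time are illustrative assumptions, and the function names are hypothetical. It shows two approaches that are kinematically identical (the same positions at the same times) and yet differ in average impact force by the ratio of the masses once mass enters the description.

```python
def approach_track(speed, dt, steps):
    """Purely kinematic description: distance covered at each time step (length and time only)."""
    return [speed * dt * k for k in range(steps + 1)]


def average_impact_force(mass, speed, stop_time):
    """Kinetic description: average force needed to bring `mass` from `speed` to rest in `stop_time`."""
    return mass * speed / stop_time


# Illustrative assumptions, not values taken from the chapter.
SPEED = 6.0        # m/s, identical for both runners
STOP_TIME = 0.1    # s, duration of the collision
BULL_MASS = 600.0  # kg
CHILD_MASS = 20.0  # kg

bull_track = approach_track(SPEED, 0.1, 10)
child_track = approach_track(SPEED, 0.1, 10)

# The two tracks are indistinguishable in length and time alone.
same_kinematics = bull_track == child_track

# Adding the mass dimension separates the two events.
bull_force = average_impact_force(BULL_MASS, SPEED, STOP_TIME)
child_force = average_impact_force(CHILD_MASS, SPEED, STOP_TIME)
force_ratio = bull_force / child_force

print(same_kinematics, round(force_ratio, 6))
```

The point of the sketch is only that nothing in `approach_track` can recover the force ratio. Whatever specifies the difference must come from somewhere other than position and time alone.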
The Challenge of Abstractness One thing that makes Gibson’s theory of perception difficult to grasp is its degree of abstractness, that, in general, information for x is not x per se
but how x fits into the layout of the environment and changes with it on some occasions but remains unchanged on others. For instance, to begin with, we need to explain the abstract role of the ambient optic array in characterizing ecological information, and how being defined at the ecological scale entails it being directly detectable. Gibson’s introduction of the optical array as the chief information construct was a brilliant theoretical move for four reasons: (1) by optically interfacing the observer with its environment, it means the information is automatically ecologically scaled; (2) it provides a place where the invariant structure of the environment and the perspective structure of the actor are brought together; (3) it thus allows the information specific to both the environmental context and the perceiving actor’s situation to be directly picked up at the same time; and, finally, (4) it objectifies the information by locating it outside the observer and therefore assuring it is devoid of any possible cognitive or mentalistic contributions. This is perforce the case since no actual observer even needs to be at the point of observation toward which the optical array is projected, thus no contribution from the observer’s memory or inference is even, in principle, possible. It merely needs to be directly detected. And being both of the environment and of the place where perception might occur, but need not, the optical array is most thoroughly ecological, respecting the observer no more than the observed. And, like affordances, whose information it may frame, it “points both ways,” toward environment and organism alike. An example of information about x not being x per se is the fact that a square shape and a trapezoid shape may project the same 2D form, a trapezoid, while they project different isometric invariants when each is rotated around the same axis.
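The square-and-trapezoid example can be checked numerically. The sketch below is an illustration under stated assumptions and is not the authors' method. It uses a pinhole projection onto the plane z = 1, which is exactly the kind of projective heuristic the chapter cautions against treating as ecological optics. The slant, viewing distance, and rotation angle are arbitrary choices, and all names are hypothetical. A slanted square and a frontal trapezoid are constructed to cast the same static image, and their images are then compared after both are rotated about the same vertical axis.

```python
import math


def project(point):
    """Pinhole projection of a 3D point onto the image plane z = 1 (assumed camera model)."""
    x, y, z = point
    return (x / z, y / z)


def rotate_about_vertical(point, center, angle):
    """Rotate `point` by `angle` radians about a vertical (y) axis passing through `center`."""
    x, y, z = (point[i] - center[i] for i in range(3))
    c, s = math.cos(angle), math.sin(angle)
    return (center[0] + c * x + s * z, center[1] + y, center[2] - s * x + c * z)


CENTER = (0.0, 0.0, 5.0)   # both shapes are centred 5 units from the point of observation
SLANT = math.radians(60)   # slant of the square about the horizontal axis

# A square of side 2, slanted away from the observer.
square = [(sx, sy * math.cos(SLANT), CENTER[2] + sy * math.sin(SLANT))
          for sx, sy in [(-1, -1), (1, -1), (1, 1), (-1, 1)]]

# A frontal trapezoid built to cast the same image: each image point of the
# square is pushed back out along its ray to the plane z = 5.
trapezoid = [(u * CENTER[2], v * CENTER[2], CENTER[2]) for u, v in map(project, square)]


def image(shape, angle=0.0):
    """Projected form of `shape` after rotating it by `angle` about the shared vertical axis."""
    return [project(rotate_about_vertical(p, CENTER, angle)) for p in shape]


def same_image(a, b, tol=1e-9):
    return all(abs(p[0] - q[0]) < tol and abs(p[1] - q[1]) < tol for p, q in zip(a, b))


static_match = same_image(image(square), image(trapezoid))
turn = math.radians(30)
rotated_match = same_image(image(square, turn), image(trapezoid, turn))

print(static_match, rotated_match)  # prints: True False
```

The single frozen form does not separate the two shapes, while the way the form changes under a shared transformation does. In this toy setting, that is one reading of the claim that information for x lies in how x changes with the layout and not in any one projected form.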
Since symmetry theory is our strategy for explaining the invariant information specifying events as made available by the optic array, it would be useful to provide some scientific background on symmetry group theory. We do this next.
The Proven Importance of Symmetry Theory in Science At the first conference on ecological optics at Cornell University in 1970, as agreed, Shaw presented a paper entitled “The Role of Symmetry in Event Perception,” in which he attempted to introduce symmetry principles to ecological theory (Shaw, McIntyre, & Mace, 1974). Here is a précis of his introduction: Shaw began by presenting the views of Ernst Cassirer (1944) because Gibson had cited him twice in his 1950 book (e.g., pp. 153 and 193). Cassirer argued for the fundamental role of group theory in perception, asserting that the primitive form of understanding is that of the intuitive concept of a group. The usefulness of the group concept in contemporary
mathematics and physics offers strong support to the validity of this insight. For instance, one of the chief functions of group theory in mathematics and physics, as shown by Noether (1918), has been to describe what properties of objects, events, or even natural laws remain invariant, or symmetrical, across different domains, or under modification by transformations (e.g., rotations and translations). This work by Noether (1882–1935) showing group theory to be the basis of the conservation laws has proven to be a watershed moment for physics. Noether’s theorem (1918) shows that there is a one-to-one correspondence between each conservation law and a differentiable symmetry of nature. For example, energy conservation follows from the time-invariance of physical systems, and angular momentum conservation arises from the fact that physical systems behave the same, regardless of how they are oriented in space. These results make Noether, arguably, the most important contributor to the mathematical foundations of physical science in the modern era.
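The two examples just given can be written out in the standard Lagrangian notation of classical mechanics. The symbols used here (L for the Lagrangian, q_i for generalized coordinates, E for the energy, and φ for an angle of rotation with conjugate momentum p_φ) are textbook conventions introduced for illustration; they are not notation taken from the chapter.

```latex
% Time-translation symmetry and energy.
% Along solutions of the Euler--Lagrange equations,
\[
  E \;=\; \sum_i \dot q_i \,\frac{\partial L}{\partial \dot q_i} \;-\; L ,
  \qquad
  \frac{dE}{dt} \;=\; -\,\frac{\partial L}{\partial t} ,
\]
% so if L has no explicit dependence on time, E is conserved:
\[
  \frac{\partial L}{\partial t} = 0
  \;\;\Longrightarrow\;\;
  \frac{dE}{dt} = 0 .
\]

% Rotational symmetry and angular momentum.
% If L does not depend on the rotation angle \varphi, the Euler--Lagrange
% equation for \varphi gives
\[
  p_\varphi \;=\; \frac{\partial L}{\partial \dot\varphi},
  \qquad
  \frac{d p_\varphi}{dt} \;=\; \frac{\partial L}{\partial \varphi} \;=\; 0 .
\]
```

In both cases the conserved quantity appears because the Lagrangian is indifferent to a continuous change (a shift in time or a turn in space), which is the sense of "symmetry" at issue here.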
The Dual Role of Symmetry in Event Perception Symmetry has a double role to play in theories that make it a highly prized commodity in science. It is by nature a duality in that it refers equivocally and simultaneously to both a property left invariant under a transformation (an SI) and to the invariant specifying the transformation that leaves the property invariant (a TI) (Pittenger & Shaw, 1975). Hence if you base your approach on a symmetry principle, your perspective on the phenomenon of interest is, philosophically speaking, necessarily a double aspectism, an ontology well designed for analyzing systems founded on duality symmetries. For example, a rotation of a circle is a symmetry operation since both the order of points and the distances between them remain invariant after the application of the operation. As already mentioned, Shaw, McIntyre and Mace (1974) argued that symmetry group theory provides one way of making clear what invariant properties all events must share by virtue of being events. In an attempt to address this problem, using the terms TI and SI, an event (E) can be said to be perceptually specified when both of these aspects of invariant information are available to be detected, that is, when the two-variable function, E(TI, SI), can be evaluated. For instance, an event involving a bouncing ball might be denoted as E(TI = bouncing, SI = ball) = bouncing ball.
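The circle example lends itself to a direct check. The following minimal sketch assumes eight sample points on a unit circle and an arbitrary rotation angle; the function names are hypothetical and the code is an illustration, not a formalization of the TI and SI constructs. The rotation plays the part of the transformation, and the two quantities it leaves unchanged (pairwise distances and cyclic order) play the part of the structure preserved under it.

```python
import math


def circle_points(n, radius=1.0):
    """n equally spaced points on a circle centred at the origin."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


def rotate(points, angle):
    """Rotate every point by `angle` radians about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def pairwise_distances(points):
    """Distances between every pair of points, in a fixed pair order."""
    return [math.hypot(p[0] - q[0], p[1] - q[1])
            for i, p in enumerate(points) for q in points[i + 1:]]


def cyclic_order(points):
    """Point indices in counterclockwise order, listed starting from point 0."""
    by_angle = sorted(range(len(points)),
                      key=lambda i: math.atan2(points[i][1], points[i][0]) % (2 * math.pi))
    start = by_angle.index(0)
    return by_angle[start:] + by_angle[:start]


before = circle_points(8)
after = rotate(before, 0.7)  # 0.7 rad is an arbitrary illustrative angle

distances_preserved = all(
    math.isclose(a, b, abs_tol=1e-12)
    for a, b in zip(pairwise_distances(before), pairwise_distances(after)))
order_preserved = cyclic_order(before) == cyclic_order(after)

print(distances_preserved, order_preserved)  # prints: True True
```

Every individual coordinate changes under the rotation, yet both checks pass. That is what qualifies the rotation as a symmetry operation on the circle in the sense used above.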
How to Build a Physically Lawful System Two unicycle wheels are free to roam anywhere unconstrained by what the other one is doing. But when coupled by a constraint, say, a rigid frame to
make them into a bicycle, the two wheels must act symmetrically and follow the same course. Thus, we see that symmetry is the expression of a constraint. For forceless, kinematic, optic array information to become forceful, kinetic, optic array information requires the addition of constraints. A system that is kinematic is free to take on all possible temporal and spatial values. This freedom allows the system to enter any states unencumbered by which other states it might enter, so long as the next states are successively ordered (i.e., are temporal) and adjacently ordered (i.e., spatially ordered). For a kinematic system to become kinetic, it must take on the symmetry constraints it lacks. In other words, it must give up some freedom of making changes to its state configurations. Hence a move from being merely kinematic to being kinetic is to assume some constraints that were missing. Which new constraints are adopted will determine what kind of new dynamics the refashioned system will have. If the refashioned system is to satisfy physical laws, then it must assume exactly the right constraints or it will be unrealistic and not physically lawful in the usual way. But what is the “usual way”? Fortunately, Noether (1918) answered this question for us by proving that from symmetries in Nature the conservation laws arise intrinsically. If a physical system behaves indifferently to its orientation in space, then its Lagrangian (i.e., its mechanical action), which governs its laws of motion, will be symmetric under continuous rotations, and Noether’s theorem will dictate that the angular momentum of the system will be conserved. Similarly, if a physical system behaves the same, regardless of place or time, its Lagrangian is symmetric (invariant) under continuous translations in space and time, respectively. Then, according to Noether’s theorem, the system will obey the conservation laws of linear momentum and energy.
(Noether’s theorems showed how symmetries and conservation laws are mathematically synonymous.) And, finally, if the behavior of a physical system does not change upon spatial or temporal reflection, then its Lagrangian has reflection symmetry and time reversal symmetry, respectively. Then, according to Noether’s theorem, a system with these symmetries will exhibit parity and entropy conservation laws, respectively.
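As a rough illustration of the continuous-symmetry cases described above, the link between symmetry and conservation can be written out for a single generalized coordinate. The notation is an assumption introduced for this sketch and is not the authors' own: $q$ is the coordinate, $L(q, \dot q, t)$ the Lagrangian, and $p$ and $E$ the conserved quantities. The argument is the standard textbook one for a single coordinate, not Noether's general proof, and it does not cover the discrete reflection symmetries.

```latex
% Euler-Lagrange equation for one generalized coordinate q
\frac{d}{dt}\frac{\partial L}{\partial \dot q} = \frac{\partial L}{\partial q}

% Symmetry under continuous shifts of q: L does not depend on q
\frac{\partial L}{\partial q} = 0
\;\Longrightarrow\;
\frac{dp}{dt} = 0, \qquad p \equiv \frac{\partial L}{\partial \dot q}

% Symmetry under continuous shifts of t: L does not depend explicitly on t
\frac{\partial L}{\partial t} = 0
\;\Longrightarrow\;
\frac{dE}{dt} = 0, \qquad E \equiv \dot q\,\frac{\partial L}{\partial \dot q} - L
```

When $q$ is a position, $p$ is the linear momentum; when $q$ is an angle of rotation, $p$ is the angular momentum; $E$ is the energy.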
How to Build a Physically Lawful Ecosystem

You will need a toolbox with a minimal set of conceptual tools, among them duality, reference frame, and discrepancy. Here we provide an intuitive introduction to these concepts, eschewing formal treatment until a later time.
Duality

In mathematics, a duality, generally speaking, translates concepts, theorems or mathematical structures into other concepts, theorems or structures, in a one-to-one fashion, often (but not always) by means of an involution (e.g., reflection) operation: if the dual of A is B, then the dual of B is A. Such involutions sometimes have fixed points, so that the dual of A is A itself. The clearest case of this can be seen in digraph theory where the dual is obtained simply by reversing the arrows that couple its states (see Figure 6.2). Gibson (1979/2015) treats affordances as dual aspects of an ecosystem that refer mutually and reciprocally to both the organism and the environment components. He tells us:

An affordance cuts across the dichotomy of subjective-objective and helps us to understand its inadequacy. It is equally a fact of the environment and a fact of behavior. It is both physical and psychical, yet neither. An affordance points both ways, to the environment and to the observer. (p. 121)

Gibson’s insight that an affordance provides two perspectives, one from the organism on the environment and one from the environment on the
An ecosystem comprises an organism and its environment, and includes the affordances and effectivities defined on O and E, independently, as well as interdependently between O and E.
Figure 6.2 Illustrating the dual components of an ecosystem.
organism, is an example of a duality (dual perspectives) that reminds us of a quote by a major mathematician: “Fundamentally, duality gives two different points of view of looking at the same object. There are many things that have two points of view [agent, patient; organism, environment] and in principle they are all dualities” (Atiyah, 2007, p. 69). The key to understanding the ecological approach is to see that, like Noah after the flood, it construes everything as coming in pairs: organism-environment, affordance-effectivity, information-control. These dual relationships can be conveniently summarized in a directed graph (i.e., a digraph), as shown in Figure 6.2.

The Dual Frame Discrepancy Hypothesis: Finding Forces from Forceless Information

The most fundamental fact to recognize is that in an ecological approach every major concept has a dual partner. Concepts necessarily come in dual pairs because, by definition, there are always two points of view: one from the organism (O) to the environment (E) and another from the E to the O. Consequently, we begin with the organism’s frame of reference with respect to the environment and immediately recognize that there is an environmental frame of reference with respect to the organism. If the two reference frames are in total agreement, such that information about x in E-terms and information about x in O-terms are in thorough agreement, perception, memory, and inference will reveal no discrepancies. In such a case, the O-frame and the E-frame will be dual isomorphs, and perception and action will be perfectly coordinated.

Reference Frame

The notion of a reference frame is not the same as a coordinate system or traditional reference system with a point origin, (0, 0, 0, 0), and metric coordinates, (x, y, z, t). Instead, a reference frame, as construed under the ecological approach, begins with a point of view (POV) around which perspectives are variously organized (see Figure 6.3).
The POV might be global in being the perspectives surrounding O taken with respect to all of E—an open vista delimited by the visual horizons alone, or something more focal, ranging from an object and how it is situated in E being at a nearby place or at a place some intermediate distance away, or, even most locally, being just defined on the self alone. A reference frame is not located just by places surrounding a POV or lying at various distances away but is also taken relative to immediate-to sustained-encounters of various durations. Most importantly, the surround is always filled by distributions of affordances toward which actions might be taken more or less easily. The metrics are pragmatic, being restricted to action limits, such as being easily reachable (e.g., arm’s length, steps away), or navigable over a measured
duration (e.g., a few minutes, an hour or so, a day trip), or reachable by locomotory treks of certain durations (e.g., walking, running, by bicycle, car, train, etc.). The POV may also be dynamically delineated as revealed in the “field of safe travel” surrounding automobiles or pedestrians, as explained in Gibson and Crooks (1938). If O and E belong to the same ecological frame, then they are mutually and reciprocally dual (as signified by ‘’), but the dual relations (e.g., aa, ab) may be ordered or unordered. Here O > E: O’s primal perspective on E vs E’s dual perspective on O. E > O: E’s primal perspective on O vs O’s dual perspective on E.
A key duality is that between the primal affordance an actor intends to realize and the dual action effectivity by which the actor does so (e.g., catches the ball thrown, lifts the baby down from its highchair, trims the bushes with the hedge clippers).

Discrepancy

In general, discrepancy theory describes the deviation of a situation from the state one would like it to be in, say, to be the dual action to some primal affordance goal. You intend to hit the bullseye with the dart but your throw is errant. Consequently, on the next throw you adjust the direction of the dart’s release by a slight hand rotation. The kinetics of neuromuscular control is felt directly through kinesthetic information. Put differently, the information frame of the situated dartboard and the situated control frame of the hand holding the dart dynamically share a common force basis, one that is rooted in visual and neuromuscular kinesthetics (felt weight and momentum of arm, stiffness parameter, etc. in the context of visual information about target parameters).
The Dual Frame Discrepancy Hypothesis The work-to-be-done as specified in the primal visual information frame must be matched by the work-actually-done in the dual neuromuscular control frame. If not, then there is a discrepancy to be eradicated by reactive adjustments. For the information and control frames to coalesce into the proper ecological frame vis-à-vis the perceiving-acting cycle, a synergy comprising the two frames must emerge that has both the intended specificity (goal-path accuracy) and efficacy (properly focused dynamics, or ecological work) (Shaw & Kinsella-Shaw, 1988). Gibson formulated this idea in his 1966 book. This idea is so important and central to the ecological approach, we should have the authority of Gibson’s own words (Gibson, 1979/2015): There are various ways of putting this discovery, although old words must be used in new ways since age-old doctrines are being contradicted. I suggested that vision is kinesthetic in that it registers movements of the body just as much as does the muscle-joint skin system and the inner ear system. Vision picks up both movements of the whole body relative to the ground and movement of a member of the body relative to the whole. Visual kinesthesis goes along with muscular kinesthesis. The doctrine that vision is exteroceptive, that it obtains “external” information only, is simply false. Vision obtains information about both the environment and the self. In fact, all the senses do so when they are considered as perceptual systems. (p. 175)
Applying the Dual Frame Discrepancy Hypothesis Case 1 Consider two trains standing next to each other on adjacent tracks in a train station. On each train there is a person standing in the aisle, facing forward, holding a full cup of coffee. Call them Bob and Alice. Unaware that Bob is watching her through the adjacent train car windows, Alice is lost in thought when her train jerks into motion. Bob’s train remains stationary. The sudden jerk naturally causes Alice to spill her coffee and Bob seeing Alice’s train’s abrupt motion, even though his train remains at rest, also spills his coffee at the same time. Why? (See Figure 6.3). While there is no mystery regarding what caused Alice to spill her coffee, it remains surprising that Bob being on a train at rest should spill his coffee just from watching Alice’s minor calamity. This puzzle is instructive and solving it will make clear one way that forceless optic array information about a forceful action can induce a forced outcome. Or, stated differently, how can a strictly informational coupling between an event taking place in one local reference frame somehow induce a secondhand forceful outcome to take place in another distant reference frame?
Figure 6.4 Lee’s swinging room. When local and global information agree (a), then posture is not compromised. But when local and global information disagree (b), then there is a dual frame discrepancy and posture is upset by an optical “push.”
angle, she or he sees only the wall and nothing else in the environment, especially not the floor. The motion of the wall projects a global optical transformation into the person’s optic array information that specifies to the perceiver that she or he has moved from being upright. The information does not cause the person’s reaction—information is forceless and, therefore, cannot be a mechanical cause. Since the information-to-control coupling is forceless, we need an answer to this question: By what means, metaphorically speaking, does the language of information get translated into the language of control? The answer is clear. The kinetics are supplied by the person’s own neuro-muscular system whose postural equilibrium is upset by an optical push. It is known that optical disturbances may trigger involuntary reactions from the perceivers (Shaw & Kinsella-Shaw, 2007). Here it was found that a movement of the wall so subtle that it goes unnoticed can still induce the person to sway in phase with the wall’s movement. Although instructed to stand still without moving, precise (goniometric) measurements at the person’s ankle joint show she or he still sways in phase with the room’s movement.
d’Alembert’s Principle and Inertial Forces

We follow the arguments given in Shaw and Kinsella-Shaw (2007). In Newton’s original analysis, his second law was based on impressed forces, F = mA; inertial forces were omitted. Newton’s laws of motion only apply to frames of reference in which a body remains at rest or moves uniformly at a constant speed when no forces are impressed upon it. This is called an inertial frame of reference. The frame itself need not be at rest; it can be moving at a constant speed relative to another frame of reference. No inertial forces are felt when a frame is inertial. In all these cases, an optical “push” arises whenever an abrupt change in optical structure occurs that transforms the person’s inertial frame of reference into a non-inertial frame. Put simply: Optical pushes arise from information specifying “frame discrepancy.” By examining its physical foundations, we shall see what this hypothesis means. Newton’s laws as originally formulated, however, do not apply to objects in non-inertial frames, that is, in frames that are accelerating. But they may be reformulated so that they do, as shown by the French physicist, Jean le Rond d’Alembert (1717–1783). His principle states: When any object is acted on by an impressed force that imparts an acceleration to the object, an inertial force is produced as a reaction. Inertial forces differ from impressed forces in how they are produced. An inertial force is created by the accelerating frame moving out from beneath the objects it contains, temporarily leaving them behind, until the frame of the impressed force drags them along as well. In keeping with the Principle of Virtual Work, the resultant of this impressed force and the inertial force is zero. In other words, when a car is at rest or moving uniformly at a constant velocity, no impressed forces act on it or its driver and thus no inertial forces. However, when the driver depresses the accelerator, the car’s motor impresses a force on the car that accelerates it.
At the same time, in reaction to the impressed force, an equal but counter-directed inertial force is produced that acts on the driver, pushing him back against the seat. As observed earlier, Newton’s laws only apply to objects in inertial frames; therefore they do not apply to the accelerating car, which is a non-inertial frame. But by invoking d’Alembert’s Principle, Newton’s laws can be generalized to cover this case too. Behind this principle was a simple but brilliant insight by d’Alembert, which is clearly revealed in four steps (for a general discussion, see Lanczos, 1970):

First Step: We start with Newton’s Second Law of motion, which asserts that mass multiplied by acceleration equals an impressed force, the familiar mA = F.
Second Step: Rearrange the equation as follows: F − mA = 0.
Third Step: Define a new vector, I = −mA. This is called an inertial force. Notice, this is a counter-force; its sign is the opposite of the sign on the impressed force vector.
Fourth Step: We can now reformulate Newton’s Law as F + I = 0.

The third step looks a bit trivial, being nothing more than giving a new name to the negative product of mass × acceleration. In fact, it allows the expression of an important principle in the next step. In Newtonian mechanics, the concept of a system being in equilibrium entails the nullification of all impressed forces acting on it. Static equilibrium applies to objects not in motion. With this reformulation of Newton’s law, d’Alembert showed us how to generalize the concept of equilibrium to objects in motion. To make this generalization required a brilliant insight: d’Alembert had to see that inertia itself is a force that can be included with impressed forces to make up the total effective force of the system, i.e., effective in the sense of summing to zero. This now allows us to extend any criterion for a mechanical system being in static equilibrium to a moving mechanical system being in dynamic equilibrium. Inertial forces are experienced daily by those of us whose bodies are carried along with a variety of accelerated frames: automobiles, trains, buses, airplanes, swings, carnival rides, horses, or rocket ships to the moon. The origin of these “unimpressed” forces is the tendency for objects to resist change of their state of motion or state of rest, in accordance with Newton’s Second Law, which asserts that a force is anything that accelerates a mass, i.e., F = mA. To reiterate, inertial forces differ from impressed forces in how they are produced. An inertial force is created by the accelerating frame moving out from beneath the objects it contains, temporarily leaving them behind, until the train’s impressed force drags them along as well. Armed with d’Alembert’s principle, we can now show how it is possible, at least in one case, to transform kinematic optic array information into kinetic optic array information.
Given that dual frames of reference are involved in a situation, such as Alice and Bob being on the two trains (or seeing both the wall and the floor simultaneously in Lee’s room), the two frames of reference must be confusable by an observer (e.g., Bob). There must also be two potential energy sources, say, A and B (A for the impressed force and B for the reactive force), that are informationally coupled (e.g., Bob sees Alice’s train start up and mistakes it for his own). If the shades on Bob’s train car were pulled down, then he would have no information regarding Alice’s train and would experience no optical push. (Or, likewise, there would be no optical push while standing in the Lee room with one’s eyes shut.) Again, study Figure 6.3. This is how a kinematic display becomes the control for kinetic forces. The information coupling of the two-observer, two-train frames into an ecological physics field lends support to the dual frame discrepancy hypothesis.
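The four-step reformulation of Newton’s second law given earlier in this section can be checked with a small numerical sketch of the accelerating-car example. The mass and acceleration values and the function names `impressed_force` and `inertial_force` are hypothetical, chosen only for illustration. The sketch does no more than confirm that the impressed force F = mA and the inertial force I = −mA sum to zero in one dimension.

```python
# Minimal sketch of d'Alembert's reformulation, F + I = 0, in one dimension.
# All numbers and names are illustrative assumptions, not taken from the chapter.


def impressed_force(mass_kg: float, accel_ms2: float) -> float:
    """Newton's second law: F = m * A, in newtons."""
    return mass_kg * accel_ms2


def inertial_force(mass_kg: float, accel_ms2: float) -> float:
    """d'Alembert's inertial force: I = -m * A, opposing the acceleration."""
    return -mass_kg * accel_ms2


driver_mass = 70.0  # kg, hypothetical driver
car_accel = 3.0     # m/s^2, hypothetical forward acceleration of the car

F = impressed_force(driver_mass, car_accel)  # drags the driver along with the car
I = inertial_force(driver_mass, car_accel)   # felt as a push back into the seat

print(f"F = {F} N, I = {I} N, F + I = {F + I} N")
```

With zero acceleration both forces vanish, which corresponds to the inertial-frame case in which no inertial force is felt.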
Conclusion One aim in this chapter was to critically review Gibson’s approach to event perception, as discussed in Chapter 6 of his 1979/2015 book. We stressed the importance of event perception for having a generally adequate ecological approach to visual perception. A second aim was to discuss the importance of symmetry theory as a precise way to conceptualize Gibson’s invariants approach to information. Here we followed Gibson in recognizing that successive order and adjacent order were useful replacements for time (temporal order), an abstraction from the former, and space (spatial order), an abstraction from the latter. A third aim, and one we consider most significant, was to review the problem of how forceless kinematic optic array information could also specify forceful kinetic information so that the language of control might somehow be a direct translation of optic array information. We argued that it is possible to do so by means of the dual frame discrepancy hypothesis. A perceived discrepancy between dual frames that should be congruent causes the perceiver to make neuromuscular adjustments to eradicate the discrepancy that manifests as a self-produced inertial force. Then, by applying d’Alembert’s principle in the usual way, the reactive response can be shown to be dimensionally homogeneous with a Newtonian force when the second law is reformulated in the manner of d’Alembert to include inertial forces. Our interpretation of the most fruitful aspect of Gibson’s approach to event perception is the intrinsic duality of the affordance concept. For this allows a natural way to have dual frames between which a discrepancy can arise, and is the insight needed to map kinematics into kinetics, in the context provided by Gibson’s construct of the optic array.
7 The Optical Information for Self-Perception in Development Audrey L. H. van der Meer and F. R. Ruud van der Weel
According to Gibson (1979/2015), information about the self accompanies information about the environment, and the two cannot be separated. Information to specify the self, including the head, body, arms, and hands, goes together with information to specify the environment. Self-perception and environment perception are inseparable: one perceives the environment and perceives oneself at the same time. Infants spend hours looking at their hands, and so they should, for many lessons in ecological optics need to be learned before they realize that their hands are not ordinary objects, but belong to themselves and can be used to touch and grasp objects in the environment. In order to successfully reach out and catch moving toys, infants need to develop prospective control––the ability to perceive what is going to happen in the near future. They also need to perceive objects in terms of their own bodily dimensions and action capabilities, whether something is within reach or out of reach, graspable or too big to be grasped, in terms of their affordances for action (see Wagman, Chapter 8, in this volume).
Early Perceptuo-Motor Development There is unity and continuity between the development of perceptual and motor skills. Rather than assuming that the baby, under the influence of cortical maturation, develops voluntary motor skills to which perception then needs to be mapped, there is ample evidence that perceptual and motor skills develop hand in hand. This suggests that development is best conceived as a gradual, continuous process that begins at, or even before, birth. Acting successfully entails perceiving environmental properties in relation to oneself (Gibson, 1979/2015). Organisms do not perceive objects per se, but what these objects afford for action. What any given object affords depends on the size and action possibilities of the perceiver. However, before babies can reach out and successfully grasp objects in the environment, they first have to learn they have an arm––and what it can do. Traditionally, the emergence of successful reaching and grasping is described as a discrete step in development that suddenly appears at
around four months of age (e.g., Gesell, 1928). As a result of the strong influence of the maturation perspective, newborn babies are still usually considered reflexive creatures, responding to physical stimuli in a compulsory and stereotyped manner and incapable of performing intentional movements (Van der Meer & Van der Weel, 1995). Cortical maturation and the resulting inhibition of reflexive movements are thought to take up most of the first four months of a baby’s life. Until that time, movements made by very young babies are typically dismissed as reflexive, involuntary, and purposeless. However, our behavioral research with neonates indicates that newborns show voluntary control over their feeding behavior, resulting in precise coordination between sucking, breathing, and swallowing (Van der Meer, Holden, & Van der Weel, 2005), and that they can move their arms in a purposeful way. Moving a limb or the whole body in a controlled manner requires acting together with gravity and other non-muscular forces (Bernstein, 1967; Profeta & Turvey, 2018). Consequently, movements cannot be represented simply as patterns of efference to the muscles, nor in any preprogrammed context-insensitive way. Accurate control requires online regulation of muscular activation based on perceptual information about the dynamics of the limb movement and the external force field, as well as about the movement of the limb relative to objects or surfaces to which it is being guided. Are neonates capable of such perceptuo-motor control or are their movements to be seen as simply reflexive or due to spontaneous patterned efference to the muscles, as is commonly believed? Next, we will describe a series of experiments on neonatal arm movements and discuss their possible functional significance for later reaching and grasping.
If we could demonstrate that newborns take into account gravity when making arm movements, then this would show that these spontaneous movements cannot be characterized as stereotyped or reflexive. If we could also show that these early spontaneous arm-waving movements were under perceptual control, then it is very likely that these movements have an exploratory function essential to establish a bodily frame of reference for action.
Waving or Reaching? The Functional Significance of Early Arm Movements To test whether newborn babies take account of external forces in moving their limbs, we recorded spontaneous arm-waving movements while the baby lay supine with its head turned to one side (Van der Meer, Van der Weel, & Lee, 1995a, 1996). Free-hanging weights, attached to each wrist by strings passing over pulleys, pulled on the arms in the direction of the toes. The babies were allowed to see only the arm they were facing, only the opposite arm on a video monitor (see Figure 7.1), or neither arm because of occluders. The babies opposed the perturbing force to keep an
Figure 7.1 A newborn baby participating in the weight-lifting experiment. Despite small weights pulling the arms away from the face in the direction of the toes, infants opposed the perturbing force and kept the arm up and moving in their field of view, but only if they could see the arm, either directly or, as here, on the video monitor.
arm up and moving normally, but only when they could see the arm, either directly or on the video monitor. Thus, newborn babies purposely move their hand to the extent that they will counteract external forces applied to their wrists to keep the hand in their field of view. In order to investigate whether newborns are also able to adjust their arm movements to environmental demands in a flexible manner, we investigated whether manipulating where the baby sees the arm has an influence on where the baby holds the arm (Van der Meer, 1997a). Spontaneous arm-waving movements were recorded in the semi-dark while newborns lay supine facing to one side. A narrow beam of light 7 cm in diameter was shone in one of two positions: high over the baby’s nose or lower down over the baby’s chest, in such a way that the arm the baby was facing was only visible when the hand encountered the, otherwise, invisible beam of light. The babies deliberately changed arm position depending on the position of the light and controlled wrist velocity by slowing down the hand to keep it in the light and thus clearly visible. In addition, we found that the babies were able to control deceleration of the hand in a precise manner. For all instances where the baby’s hand entered the light and remained there for 2 seconds or longer, the onset of deceleration (point of peak velocity) of the hand was noted with respect to the position of the light. Remarkably, in 70 out of all 95 cases (almost 75%), the babies started to
decelerate the arm before entering the light, showing evidence of anticipation of, rather than reaction to, the light. More than 70% of the occasions on which the babies appeared not to anticipate the position of the light occurred within the first 90 seconds after starting the experiment or after changing the position of the light. Thus, by waving their hand through the light in the early stages of the experiment, the babies were learning about and remembering the position of the light. This very quickly allowed them to accurately and prospectively control the deceleration of the arm into the light and remain there, while effectively making the arm clearly visible. From an ecological perspective, the information for self-perception is not restricted to a specific perceptual system. This brings us to the question: Would newborn babies be able to control their arm movements by means of sound? In order to answer this question, newborn babies between 3 and 6 weeks of age were placed on their backs with the head kept in the midline position with a vacuum pillow (Van der Meer & Van der Weel, 2011). In this position, both ears were uncovered and available for sound localization. Miniature loudspeakers were attached to the baby’s wrists. The baby’s mother was placed in an adjacent room where she could see her baby through a soundproof window. The mother was instructed to speak or sing to her baby continuously into a microphone, while the sound of her voice in real time was played softly over one of the loudspeakers attached to the baby’s wrist. In order to hear the mother’s voice, the baby would have to move the “sounding” wrist close to the ear, and change arms when the mother’s voice was played over the other loudspeaker.
The results showed that newborn babies were able to control their arms in such a way that the distance of the left and the right wrist to the ear was smaller when the mother’s voice was played over that wrist than when it was not. Further analyses showed that there were significantly more reductions than increases in distance between wrist and ear when the sound was on, while when the sound was off, the number of reductions and increases in distance between wrist and ear was about the same. Thus, sighted newborn babies can precisely control their arms with the help of both sight and sound. This implies that arm movements are not simply reflexive, nor can they be explained away as excited thrashing of the limbs. Neonates can act intentionally from the start, and they come equipped with perceptual systems that can be used to observe the environmental consequences of their actions. At the same time, actions provide valuable information about oneself. This dual process of perceiving oneself and perceiving the consequences of self-produced actions provides very young infants with knowledge about themselves that is crucial for producing adaptive behavior (E. J. Gibson, 1988).
Establishing a Frame of Reference for Action It seems plausible that the spontaneous arm waving of neonates of the kind measured in our experiments is directed and under precise control. Neonates will purposely counteract external forces applied to their wrists to keep the hand in their field of view. They can also precisely control the position, velocity, and deceleration of their arms to keep them clearly visible. Moreover, they can direct their arms to their ears with the help of sound. Their level of arm control, however, is not yet sufficiently developed so that they can reach successfully for toys. Young babies have to do a lot of practising over the first four or five months, after which they can even catch fast-moving toys (Von Hofsten, 1983). What could be the functional significance of neonatal arm movements for later successful reaching and grasping? To successfully direct behavior in the environment, the infant needs to establish a bodily frame of reference for action (Van der Meer & Van der Weel, 1995). Since actions are guided by perceptual information, setting up a frame of reference for action requires establishing informational flow between perception and action. It also requires learning about body dimensions and movement possibilities. Thus, while watching their moving arms, newborn babies pick up important information about themselves and the world they move in––information babies need for later successful reaching and grasping, beginning at around four or five months of age. It is widely known that young infants spend many hours looking at their hands (see Figure 7.2). And so they should, for infants have to learn many lessons in ecological optics in those early weeks before they can successfully reach for and pick up toys in the environment. First of all, infants have to learn that the hands belong to the self, that they are not simply objects, but that they can be used to touch all sorts of interesting objects in the environment. 
In order to successfully reach out and grasp toys, infants also have to familiarize themselves with their own body dimensions in units of some body-scaled or, more generally, action-scaled metric (Warren, 1984). In other words, infants have to learn to perceive the shapes and sizes of objects in relation to the arms and hands, as within reach or out of reach, as graspable or not graspable, in terms of their affordances for manipulation (Gibson, 1979/2015). All this relational information has to be incorporated (Merleau-Ponty’s term, see Marratto, 2012) into a bodily frame of reference for action in those early weeks before reaching for objects develops. We have all experienced this process of incorporation, namely, when learning new perceptuo-motor skills (Tamboer, 1988). For instance, tennis rackets, skis, golf clubs, and other extensions of the human body, such as false teeth and new cars, first have to be incorporated into our habitual frame of reference, or embodied action scheme (Day, Ebrahimi, Hartman, Pagano, & Babu, 2017), before we can use them to our full potential. At first, we experience
Optical Information for Self-Perception 115
Figure 7.2 A newborn boy only a few hours old is studying his hand intensely.
those instruments as unmanageable barriers between the environment and ourselves. However, once incorporated into our bodily frame of reference, they increase our action possibilities considerably and are almost regarded as our own body parts. In this context, it is possible to speculate about the role of early arm movements for distance perception in general. The late Professor Henk G. Stassen was a mechanical engineer from Delft University of Technology in The Netherlands. He specialized in man-machine systems, cybernetics, and ergonomics, and was involved in designing artificial arms for babies who were born with two stumps because of genetic disorders or because their mothers had taken the drug thalidomide in the 1960s during pregnancy to prevent miscarriage. Stassen (personal communication, 1994) observed that if you fit babies with artificial arms early, at around two to three months, they do not seem to have any problems avoiding obstacles as soon as they learn to walk. However, if the arms are fitted too late, the babies will have tremendous problems perceiving affordances for navigation, and they will initially bump into walls and obstacles when they start walking around their first birthday. Gibson (1979/2015) suggested that the nose provides an absolute baseline for distance perception (p. 110). Stassen’s observations would add that we perceive distance in terms of our arm length, as within reach or out of reach. During infancy, new skills are constantly appearing and bodily dimensions are changing rapidly. In general, the bodily frame of reference has to
be updated during life to accommodate changes in action capabilities and body characteristics. Sudden changes in action capabilities, as after a stroke, show this very clearly, as do rapid changes in body size in pregnancy (Franchak & Adolph, 2014) and adolescence. Teenagers, for example, can be notoriously clumsy; they undergo such sudden growth spurts that their bodily frames of reference need to be recalibrated nearly daily.
Successfully reaching out and grasping objects in the environment requires infants to be familiar with their own body dimensions. As infants wave their arms while supine, they learn about their own body and its dimensions through vision. It seems likely that a fast-growing organism will constantly need to calibrate the system controlling movement, and visual proprioceptive information is least susceptible to growth errors (Lee & Aronson, 1974).
This being so, our findings could have practical implications for babies with visual deficits and for the early diagnosis of premature babies at risk of brain damage. If early arm movements have an important function for later reaching, then infants with signs of hypoactivity and/or spasticity of the arms should be monitored closely for delays in the development of reaching and possibly other perceptuo-motor skills. In such cases, early intervention should concentrate on helping the baby to explore its arms and hands, both visually and non-visually. A simple intervention for babies with a visual deficit is to use brightly colored, high-contrast mittens, or to place sounding proximity sensors, of the kind used in cars for parking, around the baby’s wrists. Reaching out is typically the first developmental milestone that blind babies fail to reach on time.
The distinct sound always accompanying that particular proprioceptive feeling when the arms move around the face might enable the blind baby to establish a stable bodily frame of reference for reaching based on auditory exploration of the self.
Getting Around with Light and Sound: The Role of Prospective Control and Affordances
In order to act successfully, a baby needs not only to perceive environmental properties in relation to the self, in terms of their affordances for action, as explained above, but also to control its movements by perceiving what is likely to happen next, which requires prospective control (Lee, 1993; Turvey, 1992). To avoid colliding with objects, the consequences of continuing the present course of action, such as heading in a particular direction (Warren, Kay, Zosh, Duchon, & Sahuc, 2001) or braking with a particular force (Lee, 1976, 2009), must be perceived and evasive action taken in time. However, collisions are not necessarily to be avoided. Getting around the environment in fact depends on bringing about collisions of the feet with the ground, the hands with a (moving) ball, and so on. These collisions have to be carefully controlled if
they are to secure the required propulsive force while avoiding injury to the body. Thus, precise prospective control is needed. The where, when, and how of the collision must be perceived ahead of time, and the body must be prepared for it. Therefore, a crucial aspect of animal-environment coupling concerns the pickup of predictive perceptual information.
This type of predictive information for prospective control is, in principle, available in the ambient (optic) array via tau (Lee, 2009). The variable tau (τ) and its rate of change specify the time-to-contact between an approaching object and any perceptual system. We will return to the concept of tau in more detail later. The ambient array is of fundamental importance in understanding direct perception because it stands outside particular perceptual systems. It is the input available to each and every one of them. “It matters little through which sense I realize that in the dark I have blundered into a pigsty” (Von Hornbostel, 1927, p. 83). The job of the perceptual systems is to pick up the information in the flow field (Gibson, 1979/2015; Lee, 1980).
But are infants capable of picking up such higher-order control information straight from birth? One possibility is that at first, the infant’s perceptual systems are only sensitive to lower-order variables, such as distance or velocity, and that, with learning, the infant becomes more and more sensitive to the higher-order variables, such as tau (Fajen, 2005b). It does not seem likely that infants perform at random until they stumble onto the proper higher-order variable, and thereafter “perceive directly.” Rather, merely correlated variables may be used early, and perhaps guide the search for a higher-order informational complex (Jacobs & Michaels, 2007). Take, for example, reaching for and intercepting a moving toy. Catching requires quite advanced timing and anticipation skills.
It makes no sense to move the hand to the place where the toy was last seen because by the time the hand gets there, the toy will have moved further along its trajectory. Thus, reaching for a moving object requires prediction of the future location of the object, which in turn requires prospective control of head, eye, and arm movements. To test how prospective control of reaching develops, we investigated reaching abilities in both full-term and preterm infants catching a toy moving at different speeds (Van der Meer, Van der Weel, & Lee, 1994). The toy was occluded from view by a screen during the last part of its approach to force the infants to make use of predictive information. Babies were tested at four-weekly intervals, between the ages of 16 and 48 weeks. At 16 weeks, all babies showed visual interest in the toy, but none of them were able to direct their arms to the toy to intercept it. In addition, as soon as the toy disappeared behind the occluder, they lost interest in the toy and seemed surprised when it reappeared at the other end. However, as soon as the infants started reaching around 20–24 weeks of age, they anticipated the reappearance of the moving toy with their gaze, suggesting that this ability is a prerequisite for the onset of reaching for moving toys. From
about 40 weeks, infants showed advanced prospective control and geared their actions of shifting gaze and moving the hand forward to certain times, rather than distances, before the toy would reappear, consistent with tau. In this way, infants give themselves the same amount of time to catch the toy whether it is moving slowly or quickly.
Our data from preterm infants show that the onset of reaching, the anticipation of the moving toy with gaze and hand, and the switch to a more efficient timing strategy were delayed in almost all premature babies (Van der Meer, Lee, Liang, & Lin, 1995b), as was the development of their tau-coupling skills (Kayed & Van der Meer, 2009; Lee, 1998). However, most preterms had caught up with their full-term peers by the time they were one year of age, corrected for prematurity. Two preterm infants who showed the poorest anticipation and had not switched to the time strategy at one year of age were diagnosed with cerebral palsy 12–18 months later, indicating that poor development of prospective skill on the catching task might serve as an indicator of brain damage. Some of the teenagers born preterm that we tested on a similar task were also using a less sophisticated strategy for timing their catching movements (Aanondsen et al., 2007). Greater understanding of both normal and abnormal development of the use of perceptual information in prospectively guiding action might therefore have important diagnostic and therapeutic consequences (cf. Vaz, Silva, Mancini, Carello, & Kinsella-Shaw, 2017).
The mastering of reaching and grasping normally develops very early, and it provides a foundation for more specific perceptuo-motor skills based on these abilities. Catching moving toys is such a case, requiring the pickup of predictive information and quite advanced timing skills.
Thus, if there is a problem with basic interceptive skills, then more complex skills that are highly dependent on prospective control, such as balancing and speaking, are also likely to be affected later on in life. When studying dynamic balance on force plates in toddlers, children, and students (Austad & Van der Meer, 2007), the elderly (Spencer & Van der Meer, 2012), and patients suffering from fibromyalgia and chronic fatigue syndrome (Rasouli, Stensdotter, & Van der Meer, 2016), we found that prospective control of balance during gait initiation gradually improves during childhood and middle adulthood, but then quickly deteriorates in old age and is also poorer in the patient groups.
So far, we have seen that from very early on in life, babies show an interest in their moving arms. By looking at their waving arms, they discover and learn about all the relationships essential for successful reaching and grasping: They establish a stable bodily frame of reference for reaching. Once established, the frame of reference then allows the baby to negotiate actions that require taking into account properties of the environment, such as toys moving at different speeds, requiring prospective control. Prospective control demands that the affordances of the environment be perceived. The properties of the environment are perceived in terms of their affordances for action, and depend on the size, the level of development,
and the level of skill of the organism. Affordances are therefore not fixed and, consequently, perception of affordances needs recalibrating during life to accommodate changes in action capabilities and bodily characteristics. This is particularly apparent during infancy, when new skills are constantly appearing and bodily dimensions are changing rapidly (Adolph, Eppler, & Gibson, 1993; see Adolph, Hoch, & Ossmy, Chapter 13, this volume).
Organisms do not perceive objects per se, but what these objects afford for action. Acting successfully thus entails perceiving environmental properties in relation to oneself. When studying ducking under a barrier in adults, nursery school children, and toddlers, we found that the affordance of passability of a barrier is influenced not only by static variables such as body size, but also by dynamic variables, such as speed of locomotion, degree of motor control, and level of development. When subjects have less control over their own vertical position in space, because they are running, have cerebral palsy, or have only just learned to walk, a barrier affords more cautious ducking behavior (Van der Meer, 1997b). Interestingly, in a study where children with hemiparetic cerebral palsy had to knock an approaching ball off a track, we found that the children started the hitting action earlier with their affected arm, thus compensating for the fact that this arm moves more slowly than the unaffected arm. Because cerebral palsy is a congenital disorder, the children were used to the affected arm being slower and more difficult to control, and made allowances for this when initiating and continuously controlling their arm movements using tau-coupling (Van der Weel, Van der Meer, & Lee, 1996).
The perceptual richness of a task also turns out to be crucial for affordance perception, and has an effect on a child’s movement control.
We found that children with hemiparetic cerebral palsy find it easier to pronate and supinate their forearm in an informationally rich and concrete “bang-the-drum” task than in an abstract “move-as-far-as-you-can” task requiring the same movement (Van der Weel, Van der Meer, & Lee, 1991). The factor discriminating concrete and abstract tasks is the degree to which the act required is directed to controlling physical interaction with the environment, as opposed to producing movement for its own sake. Concrete tasks generally have greater informational support from the environment. In the concrete “bang-the-drum” task, the movement was controlled by visual, auditory, and tactile information about the child’s relation to the drum, and the attainment of the goal was readily perceptible by the child.
Perception of the environment has mostly been considered through visual information. There is generally little research about the use of acoustic information for guided movement in the environment (but see Jenison, 1997; Lee, Van der Weel, Hitchcock, Matejowsky, & Pettigrew, 1992; Russell & Turvey, 1999). Similar to vision, audition provides us with spatial information over extended distances. Hearing may be even more important than vision in orienting toward distant events. We often hear things before we see them, particularly if they take place behind us or on
the other side of opaque objects, such as walls. One of infants’ first opportunities to move in the environment is through the skill of rotating in a prone position. This skill emerges when infants are six or seven months old, and requires them to use their arms and legs to rotate around their body axis. Emergence of rotation skill provides the first opportunity for infants to detect what is behind them, and to perform adequate whole-body movements based on auditory information. Based on affordance research, we investigated whether non-crawling infants would be able to use auditory information to rotate along the shortest way to a sound source, relative to their own position in space (Van der Meer, Ramstad, & Van der Weel, 2008). We found that in 88% of all trials, young infants were able to rotate along the shortest way to their calling mother, who was sitting behind them. The proportion of trials in which infants chose the shortest way ranged from 75% for the largest angle (157.5 degrees) to 96% for the smallest angle (90 degrees), suggesting that infants experience increased difficulty differentiating more ambiguous auditory information for rotation.
In general, use of auditory perception for action has been a neglected research area in the ecological tradition. Our results can contribute to the understanding of the auditory system as a functional listening system where auditory information is used as a perceptual source for prospectively guiding behavior in the environment.
Toward a Developmental Neuroscience from an Ecological Perspective
Indirect theories of perception take for granted the distinction between sensation and perception and go on to explain the difference, for example, when arguing whether the ability to enrich incoming sensations is innate or learned, or a combination thereof (Michaels & Carello, 1981). The ecological approach to perceptual learning and development (E. J. Gibson & Pick, 2000), on the other hand, is based on a direct theory of perception. Here, the information for perception is indefinitely rich and detailed and, as a result, perceptual learning is a lifelong process of differentiating information variables, rather than enriching bare stimulus input (Gibson & Gibson, 1955). Traditionally, ecological psychology has opposed the concept of enrichment, neglecting the brain in the process because that is where the enrichment is supposed to take place and the mental representations reside. However, with the rise in neuroscience and brain imaging techniques, the challenge for ecological psychologists is to study typical and atypical functional brain development in a way that is consistent with ecological theory (see also Chemero, 2009; De Wit, De Vries, Van der Kamp, & Withagen, 2017).
Central to our developmental neuroscience approach are the ecological concepts of prospective control and time-to-collision. For the past 25 years, we have been conducting developmental research on visual motion perception from an ecological perspective (Agyei, Van der Weel, & Van der Meer, 2016a). We have studied, at both the behavioral and brain level, the visual motion paradigms of optic flow, looming, and occlusion in infants. Our developmental studies show that the onset of self-produced locomotion coincides with a surge in infants’ perceptual and brain development. We show experimentally that the structure present in the optic flow field helps participants of all ages to detect visual motion. We also present evidence that invariants present in perceptual information are merely reflected in the infant brain, consistent with Gibson’s (1966a) concept of resonance. Recently, we have started to look for evidence of vicarious use of brain tissue (Gibson, 1966a), where neurons temporarily assemble to enable a given task.
The Visual Motion Paradigms of Optic Flow, Occlusion, and Looming
Optic Flow
In a series of experiments, we simulated self-motion with structured optic flow and compared it to unstructured random motion, while we measured cortical responses to visual motion with high-density electroencephalography (HD EEG; see Figure 7.3). Our studies show that both adults and infants find it easier to pick up visual motion, as measured by shorter latencies, when the information available to them is structured, as in optic flow (Van der Meer, Fallet, & Van der Weel, 2008). We also find that at the neural level, four-month-olds do not differentiate between simulated forward, backward, and random visual motion, whereas older infants with some weeks of crawling experience do (Agyei, Holth, Van der Weel, & Van der Meer, 2015). Preterm infants at one year of age (corrected for prematurity) show very little development in cortical activity in response to visual motion (Agyei, Van der Weel, & Van der Meer, 2016b), which leads us to suspect a dorsal stream vulnerability. As opposed to the ventral stream that develops mainly after birth, the dorsal visual processing stream develops during the last three months of pregnancy (Atkinson & Braddick, 2007). Preterm birth during this period seems to disturb the normal development of the dorsal stream (e.g., Van Braeckel, Butcher, Geuze, Van Duijn, Bos, & Bouma, 2008), possibly affecting the typical dorsal stream functions of timing, prospective control, and visuo-motor integration.
In addition to direction of motion, we studied perception of motion speed in a naturalistic setting where a vehicle driving down a virtual road was simulated with optic flow (Vilhelmsen, Agyei, Van der Weel, & Van der Meer, 2018; Vilhelmsen, Van der Weel, & Van der Meer, 2015).
Adult participants differentiated between direction and speed of motion when they watched a road that was simulated by poles moving from near the center of the screen and out (or in) toward the edges of the screen, creating a realistic simulation of an optic flow field. Older infants between 8 and 11 months with several weeks of crawling experience differentiated between simulated backwards and forwards self-motion, but only at low driving speeds of 17 km/h. Four-month-old infants without any locomotor experience, on the other hand, did not show any cortical evidence of being able to discriminate between direction and speed of simulated self-motion. This suggests that perceptual development does not happen in a vacuum, but that perception and action develop hand in hand (Campos, Anderson, Barbu-Roth, Hubbard, Hertenstein, & Witherington, 2000). As soon as infants become mobile, the need for perceptual sensitivity to, for example, whether an object is approaching on a collision course and, if so, when it will collide, becomes much more urgent, and our brain data suggest that perception of motion direction and speed takes several weeks of self-produced locomotor experience to develop.
Figure 7.3 A 4-month-old girl in deep concentration on the visual motion presented on the large screen in front of her, while the corresponding electrical brain activity (EEG) with a sensor net consisting of 128 electrodes is measured.
Occlusion
In most of our behavioral catching studies with infants described above, the moving toy was occluded from view by a screen during the last part of its approach to force the infants to make use of predictive visual information for prospective control. Catching a toy moving at different speeds that temporarily disappears behind an occluder before it can be caught requires prospective control of head, eye, and arm movements. Our catching studies show that toward the end of the first year, most infants shift their gaze to the other side of the occluder and start moving their hand forward at a fixed time-to-contact before the toy reappears from behind the occluder, as opposed to a fixed distance to the reappearance point (Van der Meer et al., 1994, 1995b). This suggests that at first, infants’ perceptual systems are only sensitive to lower-order variables such as distance or velocity and that, with perceptual learning, infants become more and more sensitive to the higher-order perceptual variable of time-to-contact captured by tau (Jacobs & Michaels, 2007; Van der Weel, Craig, & Van der Meer, 2007).
We have just started to study brain electrical activity during the perception of temporarily occluded moving objects in 8- to 12-month-old infants, using analyses in the time-frequency domain (Slinning, Rutherford, & Van der Meer, 2018). Similar to Bache, Kopp, Springer, Stadler, Lindenberger, and Werkle-Bergner (2015), we report gamma oscillations in response to occlusion. In addition to brain data, we also collect data on eye movements. When working with EEG, one has to find a way of “aligning” the enormous amount of data. Stimulus onset, a measure completely outside the participant’s control, is traditionally used for this purpose in research on event-related potentials.
When studying timing in a visual tracking task involving occlusion with adults (Holth, Van der Meer, & Van der Weel, 2013), we show the importance of including behavioral data when studying the neural correlates of prospective control. From an ecological neuroscience perspective, we strongly recommend incorporating behavioral measures that are actively controlled by the participant, such as eye or reaching movements (Makeig, Gramann, Jung, Sejnowski, & Poizner, 2009), into the EEG analysis, as this will make the brain data clearer and easier to interpret.
Looming
How does the infant brain deal with information about imminent collisions? By simulating a looming object on a direct collision course toward infants, it is possible to investigate brain activities in response to looming information. Looming refers to the last part of the approach of an object that is accelerating toward the eye. To prevent an impending collision with the looming object, infants must use a timing strategy that ensures they
have enough time to estimate when the object is about to hit them in order to perform the appropriate evasive action. Defensive blinking is widely considered an indicator of sensitivity to information about looming objects on a collision course. Infants must use time-to-collision information to precisely time a blinking response so that they do not blink too early and reopen their eyes before the object makes contact, or blink too late when the object may already have made contact. For an accurate defensive response to avoid collisions and prevent injury, development of prospective control is important. Infants must use looming visual information to correctly time anticipatory responses to avoid impending collisions.
We investigated the timing strategies that 5- to 7-month-old infants use to determine when to make a defensive blink to a looming virtual object on a collision course in a series of behavioral studies (Kayed, Farstad, & Van der Meer, 2008; Kayed & Van der Meer, 2000, 2007). To time their defensive blinks, the youngest infants used a strategy based on visual angle analogous to the distance strategy described in the catching studies above. As a result, they blinked too late when the looming object approached at high accelerations. The oldest infants, on the other hand, blinked at a fixed time-to-collision, allowing them to blink in time for all the approach speeds of the looming virtual object. When precise timing is required, the use of the less advantageous visual angle strategy may lead to errors in performance, compared to the use of a strategy based on time-to-collision that allows for successful performance irrespective of object size and speed.
With the presentation of a looming virtual object on a direct collision course, we also studied the developmental differences in infants longitudinally at 4 and 12 months using EEG and the visual evoked potential (VEP) technique (Luck, 2005).
The looms approached the infant with different accelerations, and finally came up to the infant’s face to simulate an optical collision. We measured the electrical signal generated in the visual cortex in response to visual looming and analyzed peak VEP responses using source dipoles in occipital areas. Results showed a developmental trend in the prediction of an object’s time-to-collision in infants. With age, average VEP duration decreased, with peak VEP responses closer to the loom’s time-to-collision (Van der Meer, Svantesson, & Van der Weel, 2012). Infants around 12 months of age with up to three months of crawling experience used the more sophisticated and efficient time-to-collision strategy when timing their brain responses to the virtual collision. Their looming-related brain responses occurred at a fixed time of about 500 ms before the optical collision, irrespective of loom speed. The use of such a timing strategy based on a fixed time close to collision may reflect the level of neural maturity in terms of myelinization as well as the amount of experience with self-produced locomotion (Held & Hein, 1963). Both are important factors required for accurate timing of evasive (and interceptive) actions, and need to be continuously incorporated into the baby’s frame of reference for action.
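The difference between the two blink-timing strategies described above can be illustrated with a small simulation. The following is a minimal sketch and not the analysis used in the cited studies. It assumes a hypothetical virtual object of fixed radius that starts at rest and approaches the eye with constant acceleration; the object size, starting distance, and both thresholds are arbitrary illustrative values. The visual angle strategy is modeled as blinking when the object subtends a fixed angle, and the time strategy as blinking when a fixed time remains before collision.

```python
import math

# All values below are hypothetical, chosen only for illustration.
RADIUS = 0.1                          # radius of the virtual object (m)
START_DISTANCE = 10.0                 # distance of the object at loom onset (m)
ANGLE_THRESHOLD = math.radians(30.0)  # visual angle that triggers a blink
TIME_THRESHOLD = 0.5                  # time-to-collision that triggers a blink (s)


def blink_margin(accel, strategy, dt=0.001):
    """Return the time (s) left before collision when the blink is triggered.

    The object starts at rest and approaches with constant acceleration
    `accel` (m/s^2). `strategy` is "angle" or "time".
    """
    t_hit = math.sqrt(2.0 * START_DISTANCE / accel)  # moment of collision
    t = 0.0
    while t < t_hit:
        distance = START_DISTANCE - 0.5 * accel * t * t
        remaining = t_hit - t
        if strategy == "angle":
            # Blink once the object subtends the threshold visual angle.
            triggered = 2.0 * math.atan(RADIUS / distance) >= ANGLE_THRESHOLD
        else:
            # Blink once a fixed time remains. The true remaining time
            # stands in here for optically specified time-to-collision.
            triggered = remaining <= TIME_THRESHOLD
        if triggered:
            return remaining
        t += dt
    return 0.0


for accel in (1.0, 4.0, 9.0):
    angle_margin = blink_margin(accel, "angle")
    time_margin = blink_margin(accel, "time")
    print(f"accel={accel}: angle strategy {angle_margin:.3f} s, "
          f"time strategy {time_margin:.3f} s")
```

In this sketch, the margin left by the visual angle strategy shrinks as the acceleration of the loom increases, whereas the time strategy leaves the same margin for every loom. This mirrors the pattern of late blinks at high accelerations reported for the youngest infants.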
By localizing the brain source activity for looming stimuli approaching at different speeds and using extrinsic tau-coupling analysis, the temporal dynamics of neuronal activity in the first year of life were further investigated (see Figure 7.4). Tau-coupling analysis calculated the tau of the peak-to-peak source waveform activity and the corresponding tau of the looms. Source dipoles that modeled brain activities within the three occipital areas of interest were fitted around peak looming VEP activity to give a direct measure of brain source activities on a trial-by-trial basis. Testing prelocomotor infants at 5–7 and 8–9 months and crawling infants at 10–11 months of age, we reported synchronized theta oscillations in response to visual looming. Extrinsic tau-coupling analysis between the external looms and the source waveform activities showed evidence of strong and long tau-coupling in all infants, but only the oldest infants showed brain activity with a temporal structure that was consistent with the temporal structure present in the visual looming information (Van der Weel & Van der Meer, 2009). Thus, the temporal structure of the different looms was merely reflected in the brain, and not added to the brain, as indirect theories of perception would have it. As infants become more mobile with age, their ability to pick up the looms’ temporal structure may improve and provide them with increasingly accurate time-to-collision information about looming danger. Unlike young infants who treated all the looms the same, older infants with several weeks of crawling experience differentiated well in their brain activity between the three loom speeds with increasing values of the tau-coupling constant, K, for the faster looms, as shown in Figure 7.4E–G.
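The idea of a coupling constant can be made concrete with a short numerical sketch. It is an illustration under stated assumptions and not the procedure used in the cited study. It takes the tau of a gap to be the gap's current size divided by its rate of change, treats two gaps as tau-coupled when their taus stay in constant ratio K, and estimates K as the slope of a linear regression of one tau on the other. The synthetic gap functions, the sampling step, and the names `tau_of_gap` and `coupling_constant` are hypothetical.

```python
def tau_of_gap(values, dt):
    """Tau of a gap at each interior sample: gap size / rate of change.

    The rate of change is estimated with central differences.
    """
    taus = []
    for i in range(1, len(values) - 1):
        rate = (values[i + 1] - values[i - 1]) / (2.0 * dt)
        taus.append(values[i] / rate)
    return taus


def coupling_constant(tau_x, tau_y):
    """Slope K and r^2 of the least-squares regression of tau_x on tau_y."""
    n = len(tau_x)
    mean_x = sum(tau_x) / n
    mean_y = sum(tau_y) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(tau_x, tau_y))
    sxx = sum((x - mean_x) ** 2 for x in tau_x)
    syy = sum((y - mean_y) ** 2 for y in tau_y)
    return sxy / syy, sxy ** 2 / (sxx * syy)


# Synthetic, hypothetical data: gap_y closes at constant speed and gap_x is
# constructed so that tau_x = TRUE_K * tau_y holds exactly.
DT = 0.001
T_END = 1.0
TRUE_K = 0.4
times = [i * DT for i in range(900)]  # stop before the gaps fully close
gap_y = [T_END - t for t in times]
gap_x = [(T_END - t) ** (1.0 / TRUE_K) for t in times]

k_est, r_squared = coupling_constant(tau_of_gap(gap_x, DT), tau_of_gap(gap_y, DT))
print(f"estimated K = {k_est:.3f}, r^2 = {r_squared:.4f}")
```

Because the synthetic gaps are coupled by construction, the regression recovers a slope close to `TRUE_K` with a near-perfect fit. With recorded data, the fit would show how much of the signal is actually tau-coupled and with what value of K.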
The finding that changing patterns in the optical looming information were reflected in the changing patterns in the neurological flow may shed some light on the concept of resonance introduced by Gibson (1966a). The variable tau (τ) and its rate of change specify the time-to-contact between an approaching object and the visual system (Lee, 2009). The same variable was found to be operating in the neural flow when looming-related activity was progressing through the infant brain. Thus, oscillatory activity in the visual cortex was tau-coupled to the approaching looms, that is, the change in the theta rhythm’s temporal structure was linearly correlated with the value of tau of the looms. This, in our view, may indicate a process of resonance in which informational and electrical flow are successfully coupled in terms of the same variable tau via the coupling constant K.
However, how are these intricate processes of resonance further organized in the infant brain? Traditionally, it is assumed that there exists a one-to-one mapping between brain structure and function, implying some kind of modular organization of the brain (Fodor, 1981). In the case of our looming experiments, this would involve a specific mapping procedure between the incoming looming information and a specialized, encapsulated module in the brain dealing with looming-related neural activity. Gibson (1966a),
Optical Information for Self-Perception 127 however, suggested an alternative for this type of modular organization when he introduced Lashley’s (1922) concept of vicarious use of brain tissue, explaining that the same neural tissue can be involved in different temporarily assembled structures suitable to a task. In other words, the functioning of the neurons depends on the context in which they are operating. In this view, neurons can change function completely when incorporated in different systems; they temporarily assemble to enable a given task. Reed (1996b) introduced a different concept to stress the high degree of flexibility of organization of the nervous system, namely, that of degeneracy. Degeneracy is the ability of elements that are structurally different to perform the same function or yield the same output. These two concepts express the highly flexible organization of the brain. Bullmore and Sporns (2009) refer to this type of flexible organization as functional connectivity as opposed to structural connectivity. In our latest longitudinal looming results on 25 infants, we observed both structural and functional organization principles in the infants’ brain responses to the approaching looms (Van der Weel & Van der Meer, 2019). The location of electrical looming-related activity was stable across all subjects and trials and occurred within a 1 cm3 area of the visual cortex. These findings hint at a rather structural organization of brain activity in response to looming. However, these findings may be explained by the strict retinotopic organization of the visual system. However, when it came to orientation of electrical looming-related activity, the results tell an entirely different story, showing a high degree of variability of activity that, in addition, was spread across a much larger
Figure 7.4 (A) Accelerating looming stimulus approaching the infants’ eyes resulting in increased theta-band oscillatory activity in the visual cortex. A four-shell ellipsoidal head model was created for every trial and used as a source montage to transform the recorded EEG data from electrode level into brain source space. The results of this analysis for dipole VCrL (visual cortex radial left, depicted in head model in light gray) are shown for the three infant age groups in B–D. Each graph shows averaged, peak-aligned source waveform (SWF) activity at dipole VCrL for the three looms (in nanoampere, nA). Overall shape of the SWFs was similar at the different ages, but their duration was about twice as long in the 5- to 7-month-olds as compared to the 10- to 11-month-olds. Note that SWF activity did not discriminate well between slow, medium, and fast looms. Therefore, peak-to-peak SWF brain activity was tau-coupled onto the corresponding part of the extrinsic loom to study the temporal dynamics of neuronal activity. (E–G) Average tau-coupling plots, τSWF vs. τloom, for each infant age group for the three loom speeds, showing that crawling 10- to 11-month-olds differentiated well between slow (in light gray), medium (in gray), and fast looms (in black), with significantly higher values for the coupling constant, K, for faster looms, whereas younger prelocomotor infants did not. Source: From Van der Weel and Van der Meer (2009).
area of the visual cortex. This reveals a much more functional form of organization with connectivity patterns emerging in various directions and changing radically from trial to trial. With this type of flexible organization, there is no need for a one-to-one mapping between brain structure and function, as suggested by Fodor (1981). Instead, brain organization can be flexible in the sense that structurally different neural tissue can be involved in flexible temporarily assembled structures and the functioning of the neurons depends on the context in which they are operating. In this view, neurons adhere to flexible principles; they temporarily assemble to reveal the typical temporal characteristics of the approaching looms to the brain. The main objective of our developmental neuroscience research based on the principles of Gibson’s ecological approach to visual perception has always been to show that the brain is not adding to, structuring, or otherwise enriching the incoming perceptual information, but that crucial higher-order informational variables, about, for example, time-to-collision via tau, are merely reflected by the brain. Our findings show that infants, around their first birthday and after several weeks of self-produced crawling experience, clearly display in their looming-related brain activity a temporal structure that is consistent with that present in the visual looming information. Thus, invariants in the perceptual information specifying an imminent optical collision are merely reflected in the more mature infant brain, consistent with Gibson’s (1966a) concept of resonance.
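The tau-coupling relation referred to above can be illustrated with a small numerical sketch. The tau of a gap is the gap's current size divided by its current rate of change, and two taus are coupled when one remains in constant proportion K to the other. The Python below is only an illustration under assumed values, not the analysis pipeline used in the studies: the function names, the constant-acceleration loom, the noise level, and the value K = 0.4 are all hypothetical.

```python
import random


def tau(gap, gap_rate):
    """Tau of a gap: its current size divided by its current rate of change."""
    return gap / gap_rate


def loom_taus(start_distance, acceleration, times):
    """Tau of the distance gap for an object approaching from rest.

    Hypothetical stimulus model (an assumption for this sketch):
    distance(t) = start_distance - 0.5 * acceleration * t**2.
    Times must be greater than zero, because the closing rate is zero at t = 0.
    """
    taus = []
    for t in times:
        distance = start_distance - 0.5 * acceleration * t ** 2
        closing_rate = -acceleration * t  # negative because the gap is shrinking
        taus.append(tau(distance, closing_rate))
    return taus


def coupling_constant(tau_y, tau_x):
    """Least-squares slope through the origin for the relation tau_y = K * tau_x."""
    numerator = sum(x * y for x, y in zip(tau_x, tau_y))
    denominator = sum(x * x for x in tau_x)
    return numerator / denominator


def r_squared(tau_y, tau_x, k):
    """Proportion of variance in tau_y accounted for by K * tau_x."""
    mean_y = sum(tau_y) / len(tau_y)
    ss_res = sum((y - k * x) ** 2 for x, y in zip(tau_x, tau_y))
    ss_tot = sum((y - mean_y) ** 2 for y in tau_y)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    times = [0.5 + 0.05 * i for i in range(30)]
    tau_loom = loom_taus(start_distance=10.0, acceleration=4.0, times=times)

    # Synthetic "neural" tau: coupled to the loom with an assumed K plus noise.
    assumed_k = 0.4
    tau_swf = [assumed_k * x + rng.gauss(0.0, 0.01) for x in tau_loom]

    k = coupling_constant(tau_swf, tau_loom)
    print("estimated K:", round(k, 3))
    print("r squared:", round(r_squared(tau_swf, tau_loom, k), 3))
```

In a sketch of this kind, a strong linear relation between the two tau series is what would be read as coupling, and the slope of that relation is the coupling constant K.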
Our latest developmental findings on looming provide evidence for degeneracy (Reed, 1996b) or vicarious use of brain tissue (Gibson, 1966a), where brain organization is flexible, as neurons temporarily assemble to enable a given task and change function completely when incorporated in different systems (Van der Weel & Van der Meer, 2019).
Conclusion
In writing this chapter, we have tried to highlight that over the past 35 years, ever since we met as undergraduate students in Amsterdam, we have been inspired and influenced by J. J. Gibson’s ecological approach. Our developmental research on early arm movements and interceptive timing skills emphasizes the importance of establishing a bodily frame of reference for action as an anchor for both affordance perception and prospective control of adaptive behavior. For us, Dave Lee’s concept of tau (2009), an example of an informational variable that can be picked up directly, is instrumental in linking prospective control to affordances. The ecological approach is often accused of neglecting the brain when explaining perception and action. We would argue here that the brain is part and parcel of the perceptual and motor systems and therefore deserves to play a role within ecological theory. However, departing from an
ecological approach to perception and action, the questions asked and the answers that are considered satisfactory will be very different from those arising from traditional perspectives. Therefore, the challenge for an ecological or Gibsonian neuroscience is to study the brain in a way that is consistent with ecological theory. Over the past 15 years, we have collected evidence that the brain is not adding to, structuring, or otherwise enriching the information coming in through the perceptual systems. Instead, the (temporal) structure already present in the information appears to be simply better reflected in the more mature infant brain after several weeks of experience with self-produced locomotion. In our brain research, we find the ecological concepts of resonance and vicarious function (Gibson, 1966a) and degeneracy (Reed, 1996b) increasingly useful.
8 A Guided Tour of Gibson’s Theory of Affordances Jeffrey B. Wagman
Although James J. Gibson contributed in myriad ways to experimental psychology (and beyond) over a 50-year career, the concept of “affordance” is arguably his most enduring legacy and his most influential contribution. The concept has taken hold in experimental psychology, environmental psychology, human factors, communication, neuroscience, and artificial intelligence. Moreover, the renowned philosopher Daniel Dennett (2017) identified “Affordances” as the singular scientific concept that ought to be more widely known by the general public. There have been multiple reviews of the empirical literature on perception of affordances (e.g., Dotov, de Wit, & Nie, 2012; Fajen, Riley, & Turvey, 2009) as well as multiple rounds of theoretical development that have refined the understanding of the concept (e.g., Chemero, 2003; Fajen, 2007; Rietveld & Kiverstein, 2014; Turvey, 1992; Withagen & van Wermeskerken, 2010). Consequently, I will not attempt either here. Rather, I will provide a guided tour of Gibson’s (1979/2015) chapter, unpacking and explicitly connecting its contents with the theoretical and empirical developments that have emerged in its substantial wake. “The Theory of Affordances” is Chapter 8 in The Ecological Approach to Visual Perception. That the chapter is included in Part II of the book (“The Information for Visual Perception”) rather than in the subsequent sections that more explicitly describe the process of detecting such information reflects Gibson’s commitment that affordances are real, objective, and ecological facts to be perceived. And Gibson gets right to this point in the very first paragraph: I have described the environment as the surfaces that separate substances from the medium … But I have also described what the environment affords … How do we go from surfaces to affordances? And if there is information in light for the perception of surfaces, is there information for the perception of what they afford? 
Perhaps the composition and layout of surfaces constitute what they afford. If so, to perceive them is to perceive what they afford. This is a radical hypothesis, for it implies that the “values” and “meanings” of things in the environment can be directly perceived. (1979/2015, p. 119)
The short answer to Gibson’s first question is that we go from surfaces to affordances when a point of observation in the optic array becomes occupied by a perceiver (with a given set of action capabilities). Subsequently, Gibson (1979/2015) writes, “affordances imply the complementarity of the animal and environment” (p. 119). It might also be said that they are the result of the complementarity between animal and environment. If surfaces are the interface between substance and medium (see Nonaka, Chapter 2, in this volume; Pagano & Day, Chapter 3, in this volume), affordances are the interface between animal (qua action capabilities) and surfaces (i.e., environment) (see Figure 8.1). Affordances are possibilities for behavior emerging from relations between properties of animals and properties of the environment. To some extent, the answer to Gibson’s second question is the topic of the entire book. Later, Gibson refers to this as the “central question for the theory of affordances” (p. 132). It may very well also be the central question for the ecological approach in general. What makes this hypothesis so radical is that the origin of meaning (which has long bedeviled philosophers and psychologists) is neither physical nor psychological, but, rather, ecological. Affordances (and hence meanings) emerge from relations between animal and environment. They are activity-specific meanings of the surroundings (Turvey, 2013). If
Figure 8.1 Surfaces and affordances as interfaces.
affordances are perceived, then meanings are perceived. For Gibson, perception is cognitive, not because it is representational or computational, but because perception is of meaningful, complex, and emergent relationships between perceiver and environment. Moreover, this is so because there is complex and emergent information about such relationships available in structured energy arrays. Gibson then provides examples of affordances, starting with perhaps the most basic of all affordances for a terrestrial animal: that of support (see Figure 8.2). He writes:
If a terrestrial surface is nearly horizontal (instead of slanted), nearly flat (instead of convex or concave), and sufficiently extended (relative to the size of the animal) and if its substance is rigid (relative to the weight of the animal), then the surface affords support. (1979/2015, p. 119)
Perception and actualization of affordances for support and changes in such processes with development, learning, and expertise in both typical and atypical settings have been fertile topics of investigation in subsequent decades (e.g., E. Gibson et al., 1987; Joh, Adolph, Narayanan, & Dietz, 2007; Walter, Wagman, Stergiou, Erkman, & Stoffregen, 2017). Returning to the theme of the emergence of affordances (and meanings), Gibson writes:
[I]f a surface is horizontal, flat, extended, rigid, and knee-high relative to a perceiver, it can in fact be sat upon. If it can be discriminated as having just these properties, it should look sit-on-able … If the surface properties are seen relative to the body surfaces, the self, they constitute a seat and have meaning. (1979/2015, p. 120)
Substance and surface properties structure reflected light (see Mace, Chapter 5, in this volume). The structured light encountered at a point of observation provides information about the possible relationships between the perceiver and those substances and surfaces.
Thus, when the relationship between a surface and a perceiver is such that sitting is possible (Figure 8.3), the reflected light at a point of observation occupied by that perceiver will provide information about this affordance (see Mark, 1987). Moreover, restrictions on the ability to explore such structure will impair the ability to perceive such affordances (Mark, Balliet, Craver, Douglas, & Fox, 1990; see Stoffregen, Yang, Giveans, Flanagan, & Bardy, 2009). Gibson also mentions affordances for climbing, falling off, getting under, and bumping into as well as those for nutrition, manufacture, and
Figure 8.2 Affordances for support.
manipulation. Finally, he describes what are among the most complex affordances of all—those of other animals and other people. With these first few paragraphs, Gibson created research fodder for generations of scientists to come.
Figure 8.3 Objects that afford sitting on (by humans).
The Niches of the Environment
“The Niches of the Environment” and “Man’s Alteration of the Natural Environment” (the two subsequent sections) might seem to be distinct or even unrelated, but they are inextricably linked (continuous, really) in Gibson’s ecological approach. A niche is a way of life, that is, a set of affordances. Animals vary in their ways of life and in the sets of affordances they encounter. They also vary in the complexity of their nervous systems (and brains, if they possess one). Importantly, though, this ought not to matter in perception of affordances. From Gibson’s ecological perspective, a particular animal-environment relationship lawfully structures patterned energy arrays, such that this structure is specific to (provides information about) this relationship. With respect to perceiving a given affordance, the details of a given animal’s nervous system (and brain) may be irrelevant, so long as that animal can detect the structure in that array that specifies the relevant animal-environment relationship.
Accordingly, research has shown that species representing phyla across the animal kingdom (including worms, spiders, mollusks, crabs, frogs, snakes, mice, rats, hamsters, and dogs) are sensitive to affordances (Branch, 1979; Heyser & Chemero, 2012; Jayne & Riley, 2007; Jiménez, Sanabria, & Cabrera, 2017; Reed, 1982a; Sonoda, Asakura, Minoura, Elwood, & Gunji, 2012; Wagman, Langley, & Farmer-Dougan, 2017), in many cases in ways analogous to humans. For example, for both humans and frogs, perception of affordances for fitting through an aperture depends on the size of the body in motion (Franchak, Celano, & Adolph, 2012; Ingle, 1973), and for humans, dogs, rats, and hamsters, perception of affordances for reaching depends on the length of the relevant effector and the relative comfort of a given reaching mode (Cabrera, Sanabria, Jiménez, & Covarrubias, 2013; Carello, Grosofsky, Reichel, Solomon, & Turvey, 1989) (Figure 8.4).
Man’s Alteration of the Natural Environment
Why has man (sic) changed the shapes and substances of his environment? To change what it affords him (sic) … It is a mistake to separate the natural from the artificial as if there were two environments. It is also a mistake to separate the cultural … from the natural environment. (Gibson, 1979/2015, p. 122)
Just as there is a continuity between perception of affordances by human and non-human animals, there is a continuity between perception of affordances in the natural and built environments (see Heft, 2007;
Figure 8.4 Many animal species perceive affordances for reaching. Source: Dog photo from Wagman et al. (2017), Figure 1. Reprinted with permission from Springer Nature. Rat and hamster photos, courtesy of Felipe Cabrera.
Withagen & van Wermeskerken, 2010). Parts I and II of The Ecological Approach to Visual Perception are devoted to developing an ecological physics, a description of how animal-environment relationships, understood as affordances, lawfully structure energy arrays such that they provide information about those relationships to a perceiver. The continuity between natural and built environments implies that it is possible to design environments such that they give rise to information about affordances (see Figure 8.3). Accordingly, this has been described as an inverse ecological physics (Flach, 1990). Such principles have been used to design display interfaces for complex work environments such as battlefields, cockpits, power plants, and hospitals (see Bennett, 2017; Effken, 2006; Vicente, 2002) as well as for architectural design (e.g., Prieske, Withagen, Smith, & Zaal, 2015; Rietveld, 2016; Sporrel, Caljouw, & Withagen, 2017). The concepts of niche and human alteration of the environment are continuous in that all animal species alter their environments, thereby constructing their own niches and creating their own affordances (see Withagen & van Wermeskerken, 2010). Humans do so more than any other species, especially in the creation of communication and representation systems, artifacts, dwellings, and social structures. In this way, the continuity between the natural and built environments extends to the cultural environment (Costall, 2012; Heft, 2007, 2017; Rietveld & Kiverstein, 2014).
Some Affordances of the Terrestrial Environment
The Surfaces and Their Layouts
Equilibrium and posture are prerequisite to other behaviors, such as locomotion and manipulation … the ground is quite literally the basis of the behavior of land animals. And it is also the basis of their visual perception … (1979/2015, p. 123; original emphasis)
Gibson makes two important points about affordances in the above quote, both somewhat subtly. First, he describes behaviors (and hence, affordances) as hierarchically nested over both space and time (Reed, 1996b; Stoffregen, 2003a; Wagman & Miller, 2003), laying the groundwork for the description of affordances as “quicksilvery” (Chemero & Turvey, 2007), emerging and dissolving from moment to moment as behavior unfolds (see Figure 8.5). Second, he argues that direct contact with a ground surface entails direct contact with information about that ground surface. That is, direct behavior entails direct perception (Turvey, 2013).
Figure 8.5 Performing a given behavior creates and eliminates affordances.
The earth has “furniture” … It is cluttered … The solid, level, flat surface extends behind the clutter … This is not, of course, the earth of Copernicus; it is the earth at the scale of the human animal, and on that scale it is flat, not round. (Gibson, 1979/2015, p. 123)
Gibson developed an ecological physics because he felt that standard physics and geometry were inappropriate for understanding relationships between animals and environments. Newton’s physics and Euclid’s geometry are useful for describing properties of abstract, imaginary, idealized, scale-free, or closed systems. Yet, animal-environment systems are concrete, real, actualized, scale-dependent, and open. For Gibson, laws of geometry and physics should not be the a priori basis from which an understanding of perception-action achievements of animals is developed. Rather, it should be that perception-action achievements of animals are the a priori basis from which the laws of geometry and physics are developed (Turvey, 2004) (see Figure 8.6). Whereas a nearly horizontal, nearly flat, and sufficiently extended surface affords support and unimpeded locomotion:
[a] vertical, flat, extended rigid surface such as a wall or cliff face is a barrier to pedestrian locomotion. Slopes between vertical and horizontal afford walking, if easy, but only climbing, if steep, and in the latter case the surface cannot be flat; there must be “holds” for the hands and feet. (Gibson, 1979/2015, p. 124)
Accordingly, people can perceive affordances for walking up inclined surfaces (e.g., Fitzpatrick, Carello, Schmidt, & Corey, 1994; Kinsella-Shaw, Shaw, & Turvey, 1992; Regia-Corte & Wagman, 2008) and for climbing up vertical surfaces (e.g., Pijpers, Oudejans, & Bakker, 2007; Seifert, Cordier, Orth, Courtine, & Croft, 2017). An insufficiently extended surface affords tumbling down or falling off. In Gibson’s words: “the brink of a cliff is a falling-off place. It is dangerous and looks dangerous. The affordance of a certain layout is perceived if the layout is perceived” (1979/2015, p. 124). Critically, what is a falling-off place, and, hence, what looks like a falling-off place depend on the animal and its action capabilities.
Along these lines, infants’ perception of affordances of surfaces depends on their experience moving and keeping their balance in different postures in the course of development. Between 6 and 12 months of age, infants learn to sit, then crawl, and finally walk. Infants who are experienced sitters but novice crawlers refuse to reach across impossibly wide gaps when tested in their experienced sitting posture but attempt to do so when tested in their novice crawling posture. Experienced crawling infants refuse to crawl down impossibly steep slopes or high drop-offs but when tested in a novice walking posture, they attempt to walk straight over the brink (Adolph, Kretch, & LoBue, 2014; Kretch & Adolph, 2013; see Adolph, Hoch, & Ossmy, Chapter 13, in this volume). In short, perceivers are aware of (and become attuned to) affordances—not to metric properties, such as distances, heights, or slopes.
Figure 8.6 Relationship between physics and geometry and perception-action in standard (top) and ecological (bottom) approaches.
Returning to the concepts of niche, human alteration of the natural environment, and niche construction, Gibson writes: “People have altered the steep slopes of their habitat by building stairways so as to afford ascent and descent. What we call the steps afford stepping, up or down, relative to the size of a person’s legs” (1979/2015, p. 124). The first part of this
passage, of course, inspired the very first experimental investigation of affordances. In a paradigm-establishing study, Warren (1984) found that the boundary between riser heights that are perceived to be climbable and those that are not occurs at a body-scaled ratio (of riser-height-to-leg-length) predicted by biomechanical modeling. What is often underappreciated is that Warren also found that the preferred riser height occurs at a body-scaled ratio predicted by the metabolic costs of stair climbing. Follow-up work in the subsequent decades focused on age-dependent changes in perception of climbability (Konczak, Meeuwsen, & Cress, 1992) and age-independent invariants that support such perception (Cesari, Formenti, & Olivato, 2003). In many cases, locomotion is still afforded even when obstacles are present. As Gibson writes: “ordinarily, there are paths between obstacles and these openings are visible” (1979/2015, p. 132). Along these lines, Warren and Whang (1987) found that the boundary between aperture widths that are perceived to be pass-through-able and those that are not occurs at a body-scaled ratio (of aperture-width-to-shoulder-width) and that optical information at a point of observation (at the eye-height of the perceiver) specifies this affordance. Subsequent work has shown that perception of affordances for fitting through horizontal or vertical apertures is better predicted by the size of the body in motion (e.g., dynamic walking height) than by static body measurements (e.g., standing height) (Franchak, Celano, & Adolph, 2012).
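The body-scaled logic of these studies can be made concrete with a short sketch. An environmental dimension is expressed as a dimensionless ratio to a relevant body dimension, and the affordance boundary falls at a critical value of that ratio rather than at a fixed metric size. In the Python below, the function names and every numerical value (including the critical ratios) are illustrative placeholders chosen for this sketch, not figures reported in this chapter.

```python
def body_scaled_ratio(environment_dimension, body_dimension):
    """Express an environmental dimension in units of a body dimension."""
    if body_dimension <= 0:
        raise ValueError("body_dimension must be positive")
    return environment_dimension / body_dimension


def affords_climbing(riser_height, leg_length, critical_ratio):
    """True when the riser-height-to-leg-length ratio is at or below the boundary."""
    return body_scaled_ratio(riser_height, leg_length) <= critical_ratio


def affords_passage(aperture_width, shoulder_width, critical_ratio):
    """True when the aperture-width-to-shoulder-width ratio is at or above the boundary."""
    return body_scaled_ratio(aperture_width, shoulder_width) >= critical_ratio


if __name__ == "__main__":
    # Placeholder boundary value, assumed only for illustration.
    stair_boundary = 0.85
    riser = 0.70  # metres; the same stair for both perceivers

    for label, leg in [("shorter-legged perceiver", 0.75),
                       ("longer-legged perceiver", 0.95)]:
        ratio = body_scaled_ratio(riser, leg)
        verdict = affords_climbing(riser, leg, stair_boundary)
        print(label, "ratio:", round(ratio, 2), "climbable:", verdict)
```

The point of the usage example is that a single metric riser height yields different answers for perceivers with different leg lengths, whereas the boundary expressed as a ratio stays the same across perceivers.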
Other research has investigated perception of affordances for passing through apertures when body size is altered by an external object (e.g., a wheelchair or athletic equipment) (Higuchi, Cinelli, Greig, & Patla, 2006; Higuchi, Takada, Matsuura, & Imanaka, 2004) or body growth (Franchak & Adolph, 2014; van der Meer, 1997b), and changes in perception that accompany practice performing such behaviors (Franchak, van der Zalm, & Adolph, 2010; Higuchi et al., 2011; Stoffregen, Yang, Giveans, Flanagan, & Bardy, 2009).
The Objects
Gibson again highlights the differences between ecological physics and traditional physics by differentiating between attached and detached objects: “We are not dealing with Newtonian objects in space, all of which are detached but with the furniture of the earth, some items of which are attached to it and cannot be moved without breakage” (1979/2015, p. 124). Detached objects afford manipulation, and this is Gibson’s focus in the remainder of this section. Graspable objects “have opposite surfaces separated by a distance less than the span of the hand” (p. 125). Subsequently, choices about whether and how an object is reachable as well as which grasp configuration should be used (e.g., number of digits in a single-hand grasp, whether the object is grasped with one or two hands, or with a hand-held tool) are scaled to the person’s anthropometric properties (Cesari & Newell, 2000; Richardson, Marsh, & Baron, 2007; Wagman & Morgan, 2010) (Figure 8.7).
Figure 8.7 Objects that afford grasping with one hand and with two hands.
Graspable detached objects afford manipulation. “An elongated object of moderate size and weight affords wielding. If used to hit or strike, it is a club or a hammer” (Gibson, 1979/2015, p. 125). The first sentence foreshadowed an entire program of research on perception by effortful or dynamic touch, showing that perception of geometric and functional properties of a wielded object is constrained by how that object resists being manipulated by muscular forces about a joint (see Carello & Turvey, 2017). The second sentence foreshadowed specific research showing that a wielded object is perceived to afford striking-with to the extent that it facilitates the transference of appropriately scaled forces (Wagman, Caputo, & Stoffregen, 2016; Wagman & Carello, 2001). In most cases, an object that can be manipulated can be thrown—“a graspable rigid object of moderate size and weight affords throwing” (Gibson, 1979/2015, p. 125). Even among object manipulation tasks, throwing is a skilled behavior requiring particularly tuned perception-action skills. Part of this tuning is determining which objects can be thrown to which distances. To this end, objects that have a particular felt heaviness are perceived to optimally afford long-distance throwing (e.g., Bingham, Schmidt, & Rosenblum, 1989; Zhu & Bingham, 2010, 2011). The ability to strike a target with a projectile requires additional skill. To this end, targets appear larger to an archer when his or her form affords precision shooting than when it does not (Lee, Lee, Carello, & Turvey, 2012). Gibson dedicates the rest of this section to establishing the primacy of perception of affordances as opposed to physical properties: “Orthodox psychology asserts that we perceive these objects insofar as we discriminate their properties or qualities … But I now suggest that what we perceive when we look at objects are their affordances, not their qualities …” (1979/2015, p. 125) (Figure 8.8).
Figure 8.8 Perceptual experience vs. artificial measurement (M) devices.
In Gibson’s ecological approach, the animal-environment relations that define affordances are perceived not as a collection of discrete (lower-order) properties but rather as an emergent, higher-order “complex particular” (Turvey, 2015). Affordances are perceived as such, without necessitating prior independent perception of properties of either animal or environment (e.g., Mark, 1987; Stoffregen, 2000a, 2003a). For example, perception of maximum jumping-reach-height is not reducible to a combination of perception of constituent lower-order affordances (i.e., maximum-jump-height plus maximum-reach-height; Thomas, Hawkins, & Nalepka, 2017; Thomas & Riley, 2014; see Thomas, Riley, & Wagman, Chapter 14, in this volume). Moreover, there is a dissociation between improvements in abilities to perceive a given affordance and abilities to perceive physical properties constituent of that affordance (Higuchi et al., 2011; Mark, 1987; Thomas, Wagman, Hawkins, Havens, & Riley, 2017; Yasuda, Wagman, & Higuchi, 2014). Relatedly, people with psychological disorders such as schizophrenia show impaired ability to perceive affordances of a given object but not the physical properties of that object, a result that is consistent with the reduced sense of agency in this population (Kim & Kim, 2017).
To Perceive an Object Is Not to Classify an Object
Before moving on, Gibson provides an interesting sidebar on what has become an emerging topic of investigation in ecological psychology:
The fact that a stone is a missile does not imply that it cannot be other things as well. It can be a paperweight, a bookend, a hammer, or a pendulum bob … The theory of affordances rescues us from the philosophical muddle of assuming fixed classes of objects … (1979/2015, p. 126)
Gibson is stating the plainly obvious but theoretically challenging fact that any given object has many affordances, even for the very same animal at the very same moment. A stone can be thrown, stacked, pounded with, or stepped on by a given animal with appropriate action capabilities. Alternatively, a stone can afford none of these behaviors for a different animal with inappropriate action capabilities. This leads to the seemingly paradoxical state of affairs in which a given object simultaneously does and does not afford a particular set of behaviors. However, this is only paradoxical from the perspective of traditional physics in which a given object is a set of animal-independent properties. It is not so from the perspective of ecological physics in which a given object is a superposition of animal-dependent properties that become actualized in specific contexts (Turvey, 2015) (see Figures 8.5 and 8.6). The unstated, but equally true (and equally theoretically challenging) corollary to this fact is that a given affordance can be actualized with many different objects. Throwing can be performed with a stone, a ball, or a coffee mug (if so inclined). In any given situation, there is a multiplicity of affordances, a many-to-many relationship between action capabilities and environmental properties. At any given moment, there are multiple affordances and multiple means by which to actualize each affordance. Consequently, it is often insufficient for a perceiver to merely choose whether to actualize a given affordance. Rather, he or she must choose which affordance to actualize as well as when and how to do so.
Such choices are based not on the ability to perform an isolated behavior but rather on the ability to perform that behavior in the context of overarching goals and task constraints, including metabolic costs, penalties for errors, and comfort (Comalli, Franchak, Char, & Adolph, 2013; Comalli, Persand, & Adolph, 2017; Mark et al., 1997; Wagman, Bai, & Smith, 2016). In short, people are sensitive to hierarchically nested affordances (Wagman et al., 2016b).
Other Persons and Animals
Gibson’s ecological approach applies not only to the understanding of perception of affordances in animal-environment systems, but also in animal-animal-environment systems (see Marsh, Richardson, Baron, & Schmidt, 2006; Richardson, Marsh, & Schmidt, 2010). A niche is a set of affordances, and the human niche includes other people:
The richest and most elaborate affordances of the environment are provided by other animals and for us, other people … Behavior affords behavior and the whole subject matter of psychology and of the social sciences can be thought of as an elaboration of this fact. (Gibson, 1979/2015, pp. 126–127)
Accordingly, people can perceive whether another person can perform a given behavior and, if so, how and when that person ought to do so. Importantly, such sensitivity reflects the action capabilities of the other person in the context of overarching goals and task constraints. In short, people are also sensitive to hierarchically nested affordances for another person (Cordovil, Santos, & Barreiros, 2012; Passos, Cordovil, Fernandes, & Barreiros, 2012; Ramenzoni, Davis, Riley, & Shockley, 2010; Ramenzoni, Riley, Shockley, & Davis, 2008; Wagman, Stoffregen, Bai, & Schloesser, 2018). Continuous changes in an environmental property can lead to spontaneous, abrupt, and nonlinear transitions between (perception of) affordances for another (or group) as well as for the self. For example, when people are asked to move objects from one location to another, they spontaneously transition from using a one-hand grasp to a two-hand grasp to a two-person grasp as the object size increases (Richardson et al., 2007). That is, just as in other self-organizing phenomena, individual components of a system spontaneously coordinate behavior at a critical value to satisfy a goal under constraints. Such results show that the principles that underlie perception of affordances for an individual are continuous with those that underlie perception of affordances for another person or group (see Marsh, Richardson, Baron, & Schmidt, 2006). Gibson explicitly makes this point here:
The perceiving of these mutual affordances is enormously complex but it is nonetheless lawful and it is based on the pickup of the information in touch, sound, odor, taste, and ambient light.
It is just as much based on stimulus information as the simpler perception of the support that is offered by the ground under one’s feet. (1979/2015, p. 127) That is, the continuity between the perception of affordances across human and non-human animals and in the natural and built environments extends to the perception of affordances for the self and other. In all such cases, higher-order, complex, and emergent patterns in structured energy arrays provide information about affordances. For example, the ability to perceive another person’s maximum reach-with-jump-height is dependent on the detection of kinematic patterns informative about that person’s ability to produce task-specific forces with the legs. Moreover, the ability to perceive this affordance for another person improves after watching that
person (or a point-light representation of that person) perform task-relevant behaviors (e.g., walking or squatting) but not task-irrelevant behaviors (e.g., twisting or standing) (Ramenzoni, Riley, Davis, Shockley, & Armstrong, 2008; Ramenzoni et al., 2010). Moreover, athletes are better attuned to information about sport-specific abilities of others than are non-athletes but are no better attuned to information about non-sport-specific abilities of others (Weast, Shockley, & Riley, 2011; Weast, Walton, Chandler, Shockley, & Riley, 2014).

Places and Hiding Places

The habitat of a given animal contains places. A place is not an object with definite boundaries but a region. The different places have different affordances … Animals are skilled at what the psychologist calls place learning. (Gibson, 1979/2015, p. 127)
Of course, part of place learning is learning how to get to and from a particular place. Along these lines, this passage foreshadowed work showing that human odometry—nonvisual (kinesthetic) perception of places and their distances—is based on detection of variables that remain invariant over exploratory locomotion (Harrison & Turvey, 2010; Turvey et al., 2009).
The Origin of the Concept of Affordances: A Recent History

For Gibson, the affordance concept differs from the Gestalt concepts of “demand character,” “invitation character,” and “valence” in that “The affordance of something does not change as the need of the observer changes. The observer may or may not perceive or attend to the affordance … but the affordance, being invariant, is always there to be perceived” (1979/2015, p. 130). However, with an argument based on industrial design, architecture, and phenomenology, Withagen and colleagues (Withagen, Araújo, & de Poel, 2017; Withagen, de Poel, Araújo, & Pepping, 2012) have argued that affordances can, in fact, invite behavior but in a way that is rooted in more permanent evolutionary, social, cultural, and personal history (e.g., social norms) than in momentary psychological states (see Heft, 2017; Rietveld & Kiverstein, 2014).
The Optical Information for Perceiving Affordances

The perceiving of an affordance is not a process of perceiving a value-free object to which meaning is somehow added … it is a process of perceiving a value-rich ecological object … Physics may be value free, but ecology is not. (Gibson, 1979/2015, pp. 131–132)
Here, Gibson reasserts that affordances (and hence meanings and values) emerge in the relations between animal and environment: they are inherent neither in the animal nor in the environment but only in animal-environment systems (Chemero, 2009). Affordances for climbing stairs, for example, only emerge in a system that includes an animal, stairs, the mutual compatibility of the two with respect to stair climbing, and a given occasion. Therefore, perceiving an affordance means perceiving a system of which the animal is a part (Turvey, 2013). At first pass, this seems paradoxical. But Gibson reminds the reader that:

[A]n affordance, as I said, points two ways, to the environment and to the observer. So does the information to specify an affordance … to perceive the world is to co-perceive oneself … The awareness of the world and of one’s complementary relations to the world are not separable. (1979/2015, pp. 132–133)

In other words, an animal perceives a system of which it is a part because what it perceives is inextricably tied to the surroundings, the animal itself, its point of observation, and its movements, which are inextricably tied to its action capabilities (Petrusz & Turvey, 2010). Relations are perceived because the information is relational: it is also determined by the animal’s surroundings, the animal itself, its point of observation, and its movements. Though it often goes unrecognized, Gibson was the original proponent of embodied, embedded, and situated cognition (Chemero, 2009; Fultot, Nie, & Carello, 2016). For Gibson, the complexity of the information specifying affordances is not a stumbling block for the perceiver, nor should it be for the scientist:

[A] compound invariant is just another invariant.
It is a unit and the components do not need to be combined or associated … I have described the invariants that enable … two or more children to perceive the same shape at different points of observation … to perceive the common affordance of the solid shape despite the different perspectives. (1979/2015, p. 133) That the same affordance can be perceived from multiple points of observation foreshadowed and inspired work on the perceptual constancy of affordances—that perception of affordances for a given behavior reflects a person’s action capabilities over the variety of circumstances in which that affordance is encountered (Turvey, 1992; Wagman & Day, 2014). Affordances for a given behavior can be perceived by means of different anatomical components, from different points of observation, and under different task constraints (Cole, Chan, Vereijken, & Adolph, 2013; Wagman & Hajnal, 2014a, 2014b, 2016).
Misinformation for Affordances

Before concluding the chapter, Gibson comments on what he calls the “misinformation for affordances.” He writes: “According to the theory being developed, if information is picked up, perception results; if misinformation is picked up, misperception results” (1979/2015, p. 133). This is a subtle but profound statement about the ecological approach. In traditional approaches, perception is the result of a computational or interpretive process, and perception is accurate (or inaccurate) to the degree that the outcome of this process matches that of an artificial measuring device (e.g., a scale, a ruler, or a protractor). In Gibson’s ecological approach, however, perception is a lawful relationship between perceiver and environment. Consequently, the “accuracy” of perception cannot be evaluated, and misperception is not an error. Moreover, if perception is primarily (or exclusively) of affordances, then the experience of the perceiver is very much unlike the output of artificial measuring devices (see Figure 8.8). Therefore, so-called ‘illusions’ do not invalidate the ecological claim that perception is direct so much as they challenge researchers to discover lawful relationships between the information available at a point of observation and affordances.

Gibson argues that a theory of perception should be developed from the countless everyday successes of perception rather than from the rare (and often artificially induced) so-called failures of perception. For an infant who refuses to crawl across a visual cliff and an adult who walks into a sliding glass door, in neither case is perception in error. Affordances of the respective surfaces were (not) perceived, but this is only because the information specifying those affordances was (not) detected (or not present). “These two cases are instructive. In the first, a surface of support was mistaken for air because the optic array specified air.
In the second a barrier was mistaken for air for the same reason” (p. 134). Both perception and misperception are the detection of information. In some (rare) cases, the information specifies one state of affairs when another state of affairs is so. In other cases, perceivers are not sufficiently attuned to relevant information (see Adolph, 2008; Adolph et al., 2014; Kretch & Adolph, 2013) or are prevented from exploring the structured energy array such that this information can be detected (Mark et al., 1990; Stoffregen et al., 2009; Yu, Bardy, & Stoffregen, 2010).
Conclusion

In the concluding paragraphs of the chapter, Gibson argues that affordances are among the most fundamental relationships between animal and environment and play primary roles in shaping the evolution of species and ontogenetic development of individuals:
The medium, substances, objects, places, and other animals have affordances for a given animal … They offer benefit or injury, life or death. This is why they must be perceived. The possibilities of the environment and the way of life of the animal go together inseparably … (1979/2015, pp. 134–135)

He then returns to the central hypothesis of the theory of affordances and perhaps of the ecological approach in general:

The hypothesis of information in ambient light to specify affordances is the culmination of ecological optics. The notion of invariants that are related at one extreme to the motives and needs of an observer and at the other extreme to the substances and surfaces of a world provides a new approach to psychology. (p. 135)

Such a notion has indeed provided a new approach to psychology, one that has helped to reveal the lawful bases of perception and actualization of affordances. To a large extent, scientific psychology is the science of agency: the ability to select, perceive, and actualize affordances appropriately based on intention. Investigating perception of affordances over the variety of circumstances in which they are encountered, and by the variety of species that encounter them, will continue to make progress toward Gibson’s goal of bringing scientific psychology into closer alignment with the natural sciences (Turvey, 2013; Wagman, 2010).
9 Perceiving Surface Layout
Ground Theory, Affordances, and the Objects of Perception
William H. Warren
The Ecological Approach to Visual Perception was James Gibson’s (1979/2015) final statement of his theory of perception, which had steadily evolved over a lifetime of research and writing (Reed, 1988). In Chapter 9, Gibson critiques his earlier views and almost—but not quite—embraces a fully ecologized theory of the direct perception of surface layout. The question that emerges from this chapter is whether the objects of perceptual awareness are geometric properties, such as size, distance, and shape, or ecological properties, such as the affordances of surface layout they underwrite, or both. To put it provocatively (as Gibson was wont to do), one might ask whether affordances are perceivable but the geometric layout per se is not (cf. Gibson, 1975). Gibson’s theory of layout perception was prescient in many ways. His functional analysis of optical information became central to modern research in human and machine vision (Marr, 1982). The fundamental importance of the ground and the horizon for layout perception has been empirically vindicated (Bian, Braunstein, & Andersen, 2005; Sinai, Ooi, & He, 1998). His concept of affordances has been taken up in psychology, robotics, and design (see Hsu, 2019; Wagman, Chapter 8, in this volume). Yet a number of vexing puzzles about perceiving surface layout persist, to which I will suggest some resolutions here. Following Gibson’s Chapter 9, I will focus on static monocular perception of layout in the open field.1 Reconsidering the problem of layout perception will force us to face the question, what, exactly, is perceived?
Gibson’s Theory of Layout Perception, Updated

Chapter 9 elaborates two essential aspects of Gibson’s unfolding perceptual theory. First, it offers a pithy statement of what he meant by “direct perception”: perceiving the environment by picking up optical information, unmediated by an awareness of retinal or mental images, or by a process of inference (Gibson, 1979/2015, pp. 139, 141). Second, it develops this information-based account for the perception of surface layout. These two threads are historically entangled, for it was the failure
of the traditional cue-based account of depth perception for testing pilots during World War II that drove Gibson to the view that perception is direct. Based on that experience, Gibson (1950a) rejected what he called an air theory of perception, in which the size and distance of objects in empty space are inferred from 2D retinal images with the aid of traditional depth cues. He replaced it with a ground theory, in which the observer perceives not space per se but recession along the ground and objects arrayed on the ground. Moreover, Gibson believed he had identified stimuli for geometric properties of layout such as size, distance, and slant, in higher-order patterns of optical structure. Such observations led him to formulate a new psychophysics of perception in place of classical sensory psychophysics, a “stimulus-response psychology” (Gibson, 1959), in which “a percept was an automatic response” to such higher-order stimuli (Gibson, 1979/2015, p. 141). But in Chapter 9 he dismantles the psychophysical stimulus-response (S-R) formulation of the 1950s. This shift began with his conception of ecological optics and information as specificity to environmental properties (Gibson, 1961b), and continued with a reformulation of perception as an act of exploring, attending to, and picking up such information (Gibson, 1966a). Chapter 9 presents his mature information-based theory in which an active agent seeks and detects information and perceives layout in the service of goal-directed action. Some 40 years later, I want to update this story in some, er, depth.

The Ground Surface

The foundation of layout perception in the wild, according to Gibson, is a homogeneously textured ground surface with an explicit or implicit horizon. This is the terrestrial context of constraint in phylogeny and ontogeny: a continuous horizontal ground surface in a gravitational field, on which the observer stands and other objects rest.
In this context, visual information is available to specify an object’s location on the ground surface, distance from the observer, distance from other objects, relative size, and slant. Even if space per se is not perceivable, spatial relations in the ground plane are visually specified. Consequently, if terrestrial constraints are violated, predictable errors will ensue, as observed with floating objects, an elevated observer, and a sloping or discontinuous ground surface. An enchanted laboratory of such effects is The Mystery Spot, a roadside attraction in Santa Cruz, CA, where a tilted shed has been built on a sloping hillside. These alterations to the ground surface and the visual framework yield startling distortions of perceived size and orientation (Shimamura & Prinzmetal, 1999), including people who grow and shrink and water that flows uphill. Although such illusions are routinely invoked to demonstrate the untrustworthiness of
perception (Palmer, 1999), these perceptual effects are precisely what Gibson’s ground theory predicts. The role of the ground and horizon is also demonstrated in more controlled experiments. For example, Philbeck and Loomis (1997) asked standing observers to view a target in the dark at a distance of 1–5 m, then either close their eyes and walk to its remembered location (“blind walking”), or verbally estimate its distance in feet (“magnitude estimation”). With targets at eye level, distances of 1–2 m were overestimated while farther distances were indistinguishable, whether viewing was monocular or binocular, with a fixed or free head. With targets on the implicit ground plane, on the other hand, judgments were close to accurate under all viewing conditions, with a similar pattern of results for blind walking and magnitude estimation.2 Such findings indicate that even an implicit ground plane is sufficient for perception of a target distance, whereas binocular disparity and motion parallax are not, at least beyond a meter or two. Moreover, if there is a discontinuity in the ground surface due to a gap, a texture boundary, or a change in reflectance, distance judgments are systematically biased (Feria, Braunstein, & Andersen, 2003; Meng & Sedgwick, 2002; Sinai et al., 1998; Wu, He, & Ooi, 2007). If the experimenter pitches the visual framework upward, the perceived size of objects resting on the ground plane is predictably smaller, and vice versa (Matin & Fox, 1989; Stoper & Bautista, 1992), just as at the Mystery Spot. A continuous, more-or-less homogeneous, roughly horizontal ground surface thus makes successful distance and size perception possible.

What visual information is provided by the ground? In Chapter 9, Gibson proposes three key invariants for the perception of exocentric (world-centered) layout: (1) optical contact; (2) the horizon ratio; and (3) the ground texture as an intrinsic spatial scale.
Location: Optical Contact

The exocentric location of an object on the ground surface is specified by the point at which its base occludes the ground texture, known as optical contact, which is invariant over viewing position. Gibson demonstrated the effectiveness of this variable by invisibly suspending a card above the ground surface (Gibson, 1950a, Figure 72). When viewed from the front, the card appears to rest on the ground at the point of optical occlusion (Epstein, 1966); if the observer moves, however, the shearing of optical texture at its base reveals it to be floating. As first noted by Leonardo, a cast shadow in contact with the object’s base also nails it down to a location on the ground (Ni, Braunstein, & Andersen, 2004; Yonas, Goldsmith, & Hallstrom, 1978). Moreover, objects piled on top of one another are similarly located on the ground by what Meng and Sedgwick (2001) called nested contact relations. The ground surface thus provides the fundamental reference frame for perceiving the locations of objects in the environment.
Size: The Horizon Ratio

In his influential dissertation, Sedgwick (1973, 1986) showed that the relative size (height) of terrestrial objects is specified by the horizon ratio. As Gibson (1979/2015) put it, the horizon intersects all objects of the same height at the same ratio, which thus constitutes an invariant for size constancy. For example, if the horizon intersects a set of telephone poles and a tree at one-third of their height, they are all the same height (see Figure 9.1). Gibson (1950a) first showed that size judgments of a stake in an open field, compared to a standard set before the observer (15–99 in. tall), were accurate out to half a mile; only the variable error increased with distance. Although this experiment did not isolate the horizon ratio, it is the only plausible explanation of the data. Using pictorial stimuli, Rogers (1996) subsequently demonstrated highly accurate relative size judgments based on the horizon ratio, within certain pictorial constraints (the horizon is not too far from the middle of the picture, and the adjusted line is not more than three times the length of the standard line). The horizon ratio also specifies egocentric (viewer-centered) size, relative to the observer (Figure 9.2a). The height at which the horizon intersects an object corresponds to the observer’s eye height (E) on that
Figure 9.1 The horizon ratio. The horizon intersects all objects of the same height at the same ratio, providing an invariant for size constancy. For example, the telephone poles and the tree all have the same horizon ratio of 0.36, and are thus the same size. The horizon line also corresponds to the observer’s eye height on an object: each telephone pole is thus about three eye heights tall. Source: From Gibson (1979/2015), Figure 9.6. Copyright 2015. Reproduced by permission of Taylor & Francis Group, LLC, a division of Informa plc.
Figure 9.2 Geometry of the ground theory. (a) Horizon ratio specifies frontal extent (H) in eye height units (E). (b) Declination angle (α) specifies distance (Z) in eye height units; overestimated declination angle (1.5α) yields perceived linear distance compression. (c) Declination angle from raised horizon (α + ε) yields perceived nonlinear distance compression.
object, and thus specifies object height (H) in units of eye height. This can be formalized as

$$\frac{H}{E} = \frac{\tan\alpha + \tan\gamma}{\tan\alpha}$$

where α is the visual angle between the horizon and the base of the object and γ is the visual angle between the horizon and the top of the object. Note that the horizon ratio specifies not only the height but any frontal dimension of an object, such as its width (W):

$$\frac{W}{E} = \frac{2\tan\beta}{\tan\alpha}$$

where β is half the horizontal visual angle of the object. Eye height thus provides a body-scaled measure of an object’s frontal extent. Contrary to the longstanding belief that perceived size depends on perceived distance (size-distance invariance), the horizon ratio specifies size independent of distance. This claim was confirmed by Haber and Levin (2001) in an open-field experiment, which found that verbal estimates of the size and distance of the same objects by the same observers were completely uncorrelated. Specifically, over distances of 3–100 m and vertical sizes of 0.2–2.0 m, estimates of the distance of unfamiliar geometric shapes accounted for zero variance in the estimates of their size, and vice versa. The perception of size from the horizon ratio does not depend on the perception of distance. This is an exemplary case of how to test whether perception of one property depends on the explicit perception of another property.

The horizon ratio holds whether the horizon is explicit (visible) or implicit, specified by the limit of optical compression and the convergence of ground texture and wall texture. People are highly accurate and precise when estimating their own eye level (indicative of the perceived horizontal or horizon) in the light, and only slightly less so in the dark (Stoper & Cohen, 1986). When a minimal visual framework consisting of two parallel lines or a rectangular box is pitched upward or downward, it biases perceived eye level in the same direction by around 50%, demonstrating both visual and gravitational influences (Matin & Fox, 1989; Stoper & Cohen, 1989). In outdoor scenes, O’Shea and Ross (2007) observed that sloping ground elicits a similar bias of 40% in perceived eye level, although it saturates at +3° to +4° when looking at uphill slopes.
As the horizon ratio predicts, these manipulations of perceived eye level produce corresponding biases in perceived object size (Matin & Fox, 1989; Stoper & Bautista, 1992). Specifically, raising the perceived eye level (hence, the implied horizon) by pitching the visual framework upward reduces the judged vertical size of an object resting on the floor, and vice versa. Manipulating the observer’s effective eye height has similar effects on size-related affordance judgments. Warren and Whang (1987) first showed that reducing the visually specified eye height by raising a false floor increased the perceived width of a doorway, so that a narrow aperture was now judged to be passable. Mark (1987) likewise found an eye height effect on the perceived vertical height that afforded sitting. Wraga (1999) reported a similar influence of effective eye height on perceived vertical size by using a false floor and by varying the eye height in a virtual reality (VR) head-mounted display (Dixon, Wraga, Proffitt, & Williams, 2000). Note, however, that the eye height manipulation only affected the perceived size of objects between 0.2–2.5 eye heights tall (Wraga & Proffitt, 2000).
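The horizon-ratio relations in this section can be checked numerically. The following is a minimal sketch, not part of the original chapter: the helper `angles_for_object`, the 1.6 m eye height, the object sizes and distances, and the 3° horizon shift are all illustrative assumptions. It shows that H/E comes out the same at every distance, and that raising the implied horizon lowers the size specified for an object on the ground, in the direction reported for a pitched visual framework.

```python
import math


def height_in_eye_heights(alpha_deg, gamma_deg):
    """Horizon ratio: H/E = (tan(alpha) + tan(gamma)) / tan(alpha).

    alpha: angle from the horizon down to the object's base (degrees).
    gamma: angle from the horizon up to the object's top (degrees);
           negative when the top lies below the horizon.
    """
    a = math.radians(alpha_deg)
    g = math.radians(gamma_deg)
    return (math.tan(a) + math.tan(g)) / math.tan(a)


def width_in_eye_heights(alpha_deg, beta_deg):
    """Frontal width: W/E = 2 tan(beta) / tan(alpha), beta = half the horizontal angle."""
    a = math.radians(alpha_deg)
    b = math.radians(beta_deg)
    return 2.0 * math.tan(b) / math.tan(a)


def angles_for_object(eye_height, distance, height):
    """Hypothetical helper: forward geometry giving (alpha, gamma) in degrees."""
    alpha = math.degrees(math.atan2(eye_height, distance))
    gamma = math.degrees(math.atan2(height - eye_height, distance))
    return alpha, gamma


EYE_HEIGHT = 1.6    # metres, assumed
POLE_HEIGHT = 4.8   # metres, assumed (three eye heights)

# The same pole viewed at three distances yields the same horizon ratio.
ratios_by_distance = {
    z: height_in_eye_heights(*angles_for_object(EYE_HEIGHT, z, POLE_HEIGHT))
    for z in (5.0, 20.0, 80.0)
}

# Raising the implied horizon by an assumed 3 degrees enlarges alpha and
# shrinks gamma by the same amount, which lowers the specified H/E.
HORIZON_SHIFT_DEG = 3.0
alpha, gamma = angles_for_object(EYE_HEIGHT, 5.0, 1.0)
veridical = height_in_eye_heights(alpha, gamma)
shifted = height_in_eye_heights(alpha + HORIZON_SHIFT_DEG, gamma - HORIZON_SHIFT_DEG)

print(ratios_by_distance)
print(veridical, shifted)
```

Treating the pitched framework as a pure angular offset of the horizon is a simplification chosen for the sketch; the cited experiments report only a partial shift in perceived eye level.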
Relative Size and Distance: Ground Texture as an Intrinsic Scale

Gibson emphasized that the ground surface texture provides an intrinsic scale for exocentric size and distance. If the surface texture is “stochastically regular,” then “equal amounts of texture” correspond to “equal stretches of distance along the ground” everywhere in the scene. Thus, the amount of texture covered by the base of an object provides an intrinsic scale for size, and the amount of texture between two objects provides an intrinsic scale for the distance interval between them. This is a potentially powerful variable, for it provides a basis for both size and distance constancy. For example, any objects that cover T texture are the same width, while an object that covers 2T is twice as wide. This is nicely illustrated by Figure 9.3 (from Gibson 1979/2015), in which two cylinders resting on a tiled floor look to be the same size, for the base of each covers the width of one floor tile. For such a scale to be invariant over translation and rotation in the ground plane, satisfying a Euclidean metric, the texture elements must be symmetric (isotropy) and have a constant size over the whole surface (homogeneity). Ordinary textures, however, are often anisotropic (e.g., bricks, paving stones, wood grain), undermining comparisons in different directions. Indeed, the floor tiles in Gibson’s own figure are anisotropic,
Figure 9.3 Ground texture as an intrinsic scale for relative exocentric size and distance. Objects that cover the same amount of ground texture are the same width, assuming the texture is homogeneous across the surface. Thus the two cylinders appear to be the same size. Equal amounts of texture correspond to equal stretches of distance along the ground, assuming the texture is isotropic. That is not the case in this figure, so the depth interval of four texture units between the cylinders is not equal to a frontal interval of four texture units. Source: From Gibson (1979/2015), Figure 9.5. Copyright 2015. Reproduced by permission of Taylor & Francis Group, LLC, a division of Informa plc.
such that an equal amount of texture corresponds to a larger stretch of distance in depth than in the frontal dimension (see Figure 9.3). The texture scale hypothesis predicts that manipulating this anisotropy should affect perceived exocentric distance. In addition, Gibson’s student W. P. Purdy (1958) proved that the egocentric distance of an object from the observer is specified by the optical size gradient of ground texture at the object’s base. This hypothesis predicts that manipulating the size gradient should affect perceived egocentric distance. To my knowledge, neither of these experiments has been done. If ground texture is an effective scale, perceived stretches of distance in the open field should be quite accurate, precise, and invariant over changes in viewing distance and direction. But as we will see, there are systematic biases in distance perception. Note that ground texture can only provide a scale when it is visually resolvable, and common textures (e.g., grass, sand, gravel, asphalt) become indistinct at farther distances, where the optical density surpasses the spatial frequency threshold (around 25 cycles/degree in daylight). This implies that perceived size over large distances depends on the horizon ratio, and perceived distance likely depends on other information.

Egocentric Distance: Declination Angle

Sedgwick (1973, 1986) showed that the egocentric distance (Z) of an object resting on the ground is specified by the declination angle (α) of its base from the horizon, in units of eye height (E):

$$\frac{Z}{E} = \frac{1}{\tan\alpha}$$

The depth cue formerly known as “height in the visual field” actually depends on optical contact with the ground plane (Epstein, 1966; Gibson, 1950a, p. 180) and the declination from the horizon. To experimentally test this distance information, Wallach and O’Leary (1982) used a minifying lens to decrease the declination angle, yielding the predicted increase in perceived distance, as indicated by judgments of the size of a square target. Conversely, Ooi, Wu, and He (2001) used a base-up wedge prism to increase the declination angle, yielding the expected decrease in perceived distance, as measured by blind walking. There is thus converging evidence that the effective information for egocentric distance over the ground is the declination angle.

Slant: Texture Gradients

One of Gibson’s (1950a) early discoveries was that optical texture gradients specify the slant of a surface to the line of sight, known as optical slant. It is not the density gradient that carries the effective information, as Gibson thought, but the size gradient. W. P. Purdy (1958) showed mathematically that, for homogeneous textures, local slant is indeed specified by the relative optical size of adjacent elements. However, beginning with Gibson (1950b), there is a long literature demonstrating that perceived slant is significantly underestimated, biased into the frontal plane (Todd, Thaler, & Dijkstra, 2005). Todd, Christensen, and Guckes (2010) showed that the effective information for perceiving slant from texture is the scaling contrast (normalized difference in optical size for the range of elements on the surface), which covaries with slant but does not uniquely correspond to it. Thus, even though available information specifies optical slant, in this case, the visual system appears not to take advantage of it.

To summarize, updated evidence indicates that (1) optical contact and cast shadows are effective information for exocentric location on the ground, which is invariant under changes in viewing position; (2) the horizon ratio is effective information for the frontal extent of objects over a large range of distances, at least for objects ranging from 0.2–2.5 E; (3) ground texture appears not to serve as an intrinsic scale for relative distance or relative size over a large range of distances; and (4) texture scaling contrast determines perceived slant to the line of sight, which tends to be flattened into the frontal plane.
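The declination-angle relation described earlier in this section (Z/E = 1/tan α) lends itself to a short numerical illustration. The sketch below is not from the chapter, and its specific values are assumptions: a 1.6 m eye height, an 8 m target, a prism modeled as a fixed 2° increase in declination, and a minifying lens modeled as a 0.8 scaling of the angle. Under those assumptions it reproduces the direction of the two manipulations described: a larger declination angle specifies a nearer target, and a smaller one a farther target.

```python
import math


def distance_in_eye_heights(alpha_deg):
    """Z/E = 1 / tan(alpha), with alpha the declination of the base below the horizon."""
    return 1.0 / math.tan(math.radians(alpha_deg))


def declination_deg(eye_height, distance):
    """Declination angle (degrees) of a ground target at the given distance."""
    return math.degrees(math.atan2(eye_height, distance))


EYE_HEIGHT = 1.6       # metres, assumed
TARGET_DISTANCE = 8.0  # metres, assumed

alpha = declination_deg(EYE_HEIGHT, TARGET_DISTANCE)
specified = EYE_HEIGHT * distance_in_eye_heights(alpha)

# Base-up wedge prism, modeled here as a fixed angular offset (assumed 2 deg).
PRISM_OFFSET_DEG = 2.0
with_prism = EYE_HEIGHT * distance_in_eye_heights(alpha + PRISM_OFFSET_DEG)

# Minifying lens, modeled here as scaling the angle (assumed factor 0.8).
MINIFICATION = 0.8
with_lens = EYE_HEIGHT * distance_in_eye_heights(alpha * MINIFICATION)

print(f"declination: {alpha:.2f} deg")
print(f"specified distance: {specified:.2f} m")
print(f"with prism: {with_prism:.2f} m, with minifying lens: {with_lens:.2f} m")
```

Modeling the lens as a scaling of the angle rather than of its tangent is a small-angle convenience; it does not change the direction of the predicted effect.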
Paradoxes of Distance Perception

We now confront the inconvenient fact that, despite what Gibson and his followers (including me) expected, perception of distance in the open field is distorted and inconsistent.3 These visual distortions are not dispelled by motion parallax or binocular disparity (Foley, Ribeiro-Filho, & DaSilva, 2004; Philbeck & Loomis, 1997). Indeed, related distortions are observed in “binocular space” with disparity alone (Hardy, Rand, & Rittler, 1951; Luneburg, 1947), although they are presumably driven by different variables. Consider the following paradoxes (refer to Figure 9.4, Task and Data).

1. Egocentric distance (Figure 9.4a). If you view a target on the ground 4–20 m away, then blind walk to it, you are highly accurate (Loomis, DaSilva, Fujita, & Fukusima, 1992; Rieser, Ashmead, Talor, & Youngquist, 1990). But if you verbally estimate the egocentric distance of the same target, you underestimate it by 20–30% (Foley et al., 2004; Knapp & Loomis, 2004; Loomis & Philbeck, 2008).
2. Egocentric aspect ratio (Figure 9.4b). Perhaps, you say, action tasks are more natural and less cognitive than magnitude estimation. Yet if you match a frontal interval between two targets by walking toward one of them and stopping at a point that appears to form an equilateral ‘L,’ egocentric distance is also underestimated by about 30% (Li,
[Figure 9.4 graphics not reproduced. Recoverable axis labels: target distance (m); actual egocentric distance (m); frontal interval (m); physical distance (m); midpoint (m). Recoverable legend entries: frontal matches to egocentric distances; typical verbal estimates (Loomis & Philbeck, 2008); Gilinsky (1951) model; predicted verbal; predicted walk; accurate; open field; hallway; horizon + 3°.]
Source: (a) Data from Knapp and Loomis (2004), Figure 3; (b) from Li et al. (2011), Figure 4. Adapted with permission from Springer Nature; (c) from Loomis et al. (1992), Figure 5a. Adapted with permission from American Psychological Association; (d) from Bodenheimer et al. (2007), Figure 2; (e) from Gilinsky (1951), Figure 5.
Figure 9.4 Distance perception: experimental tasks, representative data, and the results of numerical simulations based on Equations 9.1 and 9.3. (a) Egocentric distance: blind walking task and verbal estimates. (b) Egocentric aspect ratio: equilateral ‘L’ task yields linear depth compression. (c) Exocentric aspect ratio: equilateral ‘+’ task, ditto. (d) Egocentric bisection: accurate judgments. (e) Exocentric depth increments: marking off equal intervals in depth yields nonlinear compression, simulated with Equation 9.4. Note: ‘o’ indicates targets, ‘x’ indicates participant’s final position, arrows indicate adjustments.
Perceiving Surface Layout 163 Phillips, & Durgin, 2011). In other words, perceived egocentric distance is a linear function of physical distance, but compressed in depth, with a slope of about 0.7. 3. Exocentric aspect ratio (Figure 9.4c). Well, perhaps exocentric distances between objects behave differently, thanks to the ground texture scale. But if you attempt to adjust a depth interval between two targets to match a frontal interval between two targets, forming an equilateral ‘+’ or ‘L,’ the depth interval is underestimated relative to the frontal interval, increasingly so with distance, until it levels off around 30–40% at 6–10 m (Loomis et al., 1992; Loomis & Philbeck, 1999). This presents the deepest paradox: one would think that if you can blind walk accurately to each of these targets, you must be able to perceive their spatial locations and the distances between them. Yet physically-equal intervals in the open field are perceived as unequal when viewed from different directions, thereby violating a Euclidean metric for perceptual constancy (Foley et al., 2004; Toye, 1986; Wagner, 1985). 4. Egocentric bisection (Figure 9.4d). Perhaps this is only so when comparing distance intervals in different directions (e.g., frontal vs depth intervals). And happily, if you adjust a marker (Z1) to bisect the egocentric distance between you and a target (Z2) in the open field (setting tana1 = 2tana2), you are highly accurate and precise—up to 300 m! (Bodenheimer et al. (2007); Lappin, Shelton, & Rieser (2006); J. Purdy & Gibson (1955); Rieser et al. (1990); but see Gilinsky’s, 1951, two observers).4 5. Exocentric depth increments (Figure 9.4e). On the other hand, if you try to mark off a series of equal increments in depth, starting near your feet, equal intervals look progressively smaller with distance, that is, they appear nonlinearly compressed (Gilinsky, 1951; Ooi & He, 2007). 
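The quantitative patterns in these paradoxes can be sketched numerically. The Python fragment below is a minimal illustration under stated assumptions, not the chapter's own simulation. It takes the angles in the bisection relation tan a1 = 2 tan a2 to be declination angles below the horizon on flat ground, uses a hypothetical eye height of 1.6 m, treats linear compression as a slope of 0.7, and uses Gilinsky's (1951) hyperbolic formula d = AD/(A + D) with A = 28.5 m, the value legible in the Figure 9.4 legend. That formula is not necessarily identical to Equation 9.4, and all function names are invented for the example.

```python
import math

EYE_HEIGHT_M = 1.6  # hypothetical eye height; not a value from the chapter


def declination(distance_m, eye_height_m=EYE_HEIGHT_M):
    """Angle below the horizon of a ground point (assumes flat ground)."""
    return math.atan2(eye_height_m, distance_m)


def bisection_marker(target_m, eye_height_m=EYE_HEIGHT_M):
    """Marker distance Z1 that satisfies tan(a1) = 2 * tan(a2)."""
    tan_a2 = math.tan(declination(target_m, eye_height_m))
    tan_a1 = 2.0 * tan_a2
    return eye_height_m / tan_a1  # ground distance from declination angle


def linear_compression(distance_m, slope=0.7):
    """Paradox 2: perceived distance as a compressed linear function."""
    return slope * distance_m


def gilinsky(distance_m, a_m=28.5):
    """Paradox 5: hyperbolic compression, d = A * D / (A + D)."""
    return a_m * distance_m / (a_m + distance_m)


def perceived_increments(step_m, count, model):
    """Perceived size of successive equal physical increments in depth."""
    edges = [model(i * step_m) for i in range(count + 1)]
    return [edges[i + 1] - edges[i] for i in range(count)]


if __name__ == "__main__":
    print("marker for a 20 m target:", bisection_marker(20.0))
    print("linear increments:", perceived_increments(5.0, 4, linear_compression))
    print("hyperbolic increments:", perceived_increments(5.0, 4, gilinsky))
```

With these assumptions the bisection relation places the marker at half the target distance regardless of eye height, the linear model shrinks every increment by the same factor, and only the hyperbolic model makes equal physical increments appear progressively smaller with distance.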
This is apparent when viewing the dashed lines running down the middle of a highway: equal-length dashes look increasingly compressed with distance. Another way of saying this is that perceived incremental egocentric distance is a negatively accelerating function of physical distance, such as a hyperbolic curve or a power law with an exponent