Cognitive Penetrability and the Epistemic Role of Perception 978-3-030-10445-0

This book is about the interweaving between cognitive penetrability and the epistemic role of the two stages of percepti


English Pages 384 Year 2019



Table of contents :
Front Matter ....Pages i-xix
Cognitive Penetrability and the Epistemic Role of Perception (Athanassios Raftopoulos)....Pages 1-83
Cognitive Penetrability (Athanassios Raftopoulos)....Pages 85-158
Early Vision and Cognitive Penetrability (Athanassios Raftopoulos)....Pages 159-221
The Cognitive Effects on Early and Late Vision and Their Epistemological Impact (Athanassios Raftopoulos)....Pages 223-250
Early and Late Vision: Their Processes and Epistemic Status (Athanassios Raftopoulos)....Pages 251-338
Back Matter ....Pages 339-368


Cognitive Penetrability and the Epistemic Role of Perception Athanassios Raftopoulos

Palgrave Innovations in Philosophy

Series Editors
Vincent Hendricks, University of Copenhagen, Copenhagen, Denmark
Duncan Pritchard, University of Edinburgh, Edinburgh, UK

Palgrave Innovations in Philosophy is a new series of monographs. Each book in the series will constitute the ‘new wave’ of philosophy, both in terms of its topic and the research profile of the author. The books will be concerned with exciting new research topics of particular contemporary interest, and will include topics at the intersection of Philosophy and other research areas. They will be written by up-and-coming young philosophers who have already established a strong research profile and who are clearly going to be leading researchers of the future. Each monograph in this series will provide an overview of the research area in question while at the same time significantly advancing the debate on this topic and giving the reader a sense of where this debate might be heading next. The books in the series would be of interest to researchers and advanced students within philosophy and its neighboring scientific environments. More information about this series at http://www.palgrave.com/gp/series/14689


Athanassios Raftopoulos Department of Psychology University of Cyprus Nicosia, Cyprus

Palgrave Innovations in Philosophy ISBN 978-3-030-10444-3 ISBN 978-3-030-10445-0  (eBook) https://doi.org/10.1007/978-3-030-10445-0 Library of Congress Control Number: 2018965225 © The Editor(s) (if applicable) and The Author(s) 2019 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Cover illustration: Bashutskyy shutterstock.com This Palgrave Macmillan imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

For Ali

Preface

In this book, I aim, first, to analyze the concept of cognitive penetrability (CP). To do so, I discuss the epistemic role of perception in grounding perceptual beliefs and the importance of this role for the discussions about CP. I reframe the problem of CP so as to take into account the repercussions of the cognitive effects on perception for the epistemic role of perception. I contend that cognitive effects on perception constitute cases of CP only if (a) they affect perceptual processing in a direct manner, and (b) they affect the epistemic role of perception. I argue that early vision is not CP because it is not affected directly by cognition, nor do any cognitive effects affect its epistemic role.

Second, I examine in detail the processes of late vision and their role in visual perception. This is important because although it is widely argued that early vision is cognitively impenetrable (CI), there is, in addition to early vision, a later perceptual stage of visual processing, namely, late vision. Unlike early vision, late vision is not much discussed in the literature. Questions that arise in that respect concern the processes and role of late vision, and the ways these processes interrelate (if at all) with cognition. I argue that late vision is CP because it is affected directly by cognitive states and its epistemic role is affected by cognition as well. If cognition affects late vision, an explanation must be given as to how cognitive states could affect perceptual processes in view of the fact that perceptual states have iconic content, whereas cognitive states have symbolic/conceptual content; given this significant difference in representational format, how could cognition interact with perception?

There are several differences that conjunctively differentiate this book from other accounts about the CP of perception and its epistemic impact. First, in the literature, visual perception is usually treated as a unified process and the discussions pertain to whether visual perception is CP or not, despite Pylyshyn’s (1999) early warnings that visual perception as a whole should be distinguished from early vision and that the claim about CI concerns early vision and not visual perception as a whole. In this book, I divide visual perception into two parts, namely early vision and late vision. The effects of cognition on perception are analyzed and discussed, thereby, in terms of the ways cognition affects early vision and late vision separately. It turns out that cognition not only affects early vision differently from late vision, but also affects the respective epistemic roles differently. This has important repercussions for the discussions concerning CP that other books miss because they do not differentiate between early and late vision.

Second, there are various definitions of CP in the literature. Usually, these definitions state that a perceptual process is CP if it is directly affected by cognition, where this ‘directly affects’ is explicated in various ways. Some more recent accounts of CP, however, introduce a new condition with a pragmatic, consequentialist flair, namely that a perceptual process is CP if the cognitive effects have some philosophically important or interesting consequences concerning the epistemic role of perception, namely its role as a justifier of perceptual beliefs.
The relations between the two conditions for CP are not discussed and they are usually left to run in parallel. In this book, I examine closely their interrelations and argue that they should be treated as the two faces of the same coin, with the one face bootstrapping the other. Third, although early vision is sometimes discussed, late vision is not discussed or analyzed at all. There are some researchers, both philosophers and cognitive scientists, who think that late vision, owing to cognitive influences, is not a perceptual stage properly speaking. I argue that despite the cognitive influences, late vision is a perceptual stage.


Finally, among those who espouse the view that cognition affects perception, there are those who think that cognition can affect perception because both have conceptual/symbolic contents. A significant number of researchers, however, think that cognition affects perception even though perceptual states have iconic/analog contents, whereas cognitive states have conceptual/symbolic contents. No explanation has been given as to how states with symbolic contents could affect states with analog contents. In fact, some researchers argue that exactly because of the difference in representational format, cognition and perception could not interact in a way that entails the CP of perception. Here, I provide an account of how cognitive states could affect perceptual states in a way that entails that late vision is CP.

This book is structured as follows. In Chapter 1, I examine the problems that CP raises for the epistemic role of perception in justifying empirical beliefs. I assess both internalistic and externalistic accounts of perceptual justification and argue that only the latter, especially when they involve a reference to the sensitivity of perception to the environmental input and the way this sensitivity is influenced by CP, offer a promising start to understanding the effects of CP on the epistemic role of perception. This discussion will serve as the basis for my revisiting the problem of CP.

In Chapter 2, I assess the definitions of CP in the literature and synthesize them to propose a new definition of CP that incorporates the heated discussion about the effects of CP on the epistemic role of perception. I distinguish this definition from the other definitions that I have examined, underlining at the same time the commonalities with them. Then, I propose to approach CP by factoring in the epistemic role of perception in justifying perceptual beliefs.
This means that one should determine and assess the epistemic role of each stage of visual processing separately, in view of the different roles that the two stages play in perception and in view of the fact that cognition affects early and late vision differently. In view of the two threads that exist in the definitions of CP, one imposing the demand that for a perceptual process to be CP it must be directly affected by cognition, and the other imposing the demand that for a perceptual process to be CP cognition should affect in an interesting way its epistemic role, I discuss the relation between these two conditions. Finally, in view of my thesis that late vision is CP because cognitive states affect hitherto purely perceptual processes, I propose a way in which states with cognitive contents that are symbolically structured could affect states with purely iconic or analog contents.

In Chapter 3, I defend the thesis that early vision is CI against very recent criticism, aimed specifically at my arguments, which states that neurophysiological evidence shows that early vision is affected in a top-down manner by cognitive states. This criticism comes from (a) studies on fast object recognition; (b) pre-cueing studies; (c) imaging studies that examine the recurrent processes in the brain during visual perception. I argue that upon closer examination, all this evidence supports rather than defeats the thesis that early vision is CI, because it shows that (a) the information used in early vision to recognize objects very fast is not cognitive information; (b) the processes of early vision do not use the cognitive information that issues cognitive demands guiding attention or expectation in pre-cueing studies; and (c) the recurrent processes in early vision are purely stimulus-driven and do not involve any cognitive signals.

In Chapter 4, I examine the repercussions of the cognitive impenetrability of early vision and cognitive penetrability of late vision for the epistemic role of visual perception and for the constructivist claim that our access to the world is mediated through our concepts.

In Chapter 5, finally, I elaborate my thesis that a stage of visual processing, namely, late vision, is CP. I explain why, notwithstanding the conceptual modulation of the processes of late vision, late vision should be considered a perceptual stage rather than a stage of pure thought that is fed by perceptual information.

Nicosia, Cyprus

Athanassios Raftopoulos

Series Editors’ Preface

Palgrave Innovations in Philosophy is a new series of monographs. Each book in the series will constitute the ‘new wave’ of philosophy, both in terms of its topic and the research profile of the author. The books will be concerned with exciting new research topics of particular contemporary interest, and will include topics at the intersection of Philosophy and other research areas. They will be written by up-and-coming young philosophers who have already established a strong research profile and who are clearly going to be leading researchers of the future. Each monograph in this series will provide an overview of the research area in question while at the same time significantly advancing the debate on this topic and giving the reader a sense of where this debate might be heading next. The books in the series would be of interest to researchers and advanced students within philosophy and its neighboring scientific environments. Copenhagen, Denmark Edinburgh, UK

Vincent Hendricks Duncan Pritchard


Acknowledgements

This book continues my work on visual perception and its relations to cognition that started 19 years ago when, in 2000, I presented at the annual conference of the Cognitive Science Society in Philadelphia a paper about the cognitive penetrability of early vision. Since then, in a book and many papers, I have argued that early vision is not cognitively penetrated, based mainly on empirical evidence concerning the details of visual processing, its processes and the brain areas it involves. Since the empirical evidence in a fast-growing area is always variegated and highly dynamic, challenges constantly emerge for my theses. This book is, partly, an account of these new challenges against my previous work and an attempt to defend my view that early vision is cognitively impenetrable in the light of these challenges. It is also an attempt to elaborate on themes that in my previous work were treated in a rather superficial way, themes such as the relation between cognitive impenetrability and nonconceptual content, the notion of cognitive penetrability itself, and late vision. It also addresses some philosophical issues that lay latent in my previous work and explicitly arose relatively recently in the literature, such as the way in which cognition could interact with perception, and the suggestion that the problem of the cognitive penetrability of perception should be addressed in terms of the epistemological consequences of the cognitive penetration.

In view of these, I would like to thank all those philosophers and cognitive scientists (too many to mention by name) who read my work and criticized it on both philosophical and empirical grounds in papers, talks, or conversations. Their criticism made me rethink and sometimes reconsider my previous views and arguments and allowed me, on the one hand, to elaborate on and, I hope, deepen them, and, on the other hand, to consider factors and issues that had gone unnoticed in my previous work. I would also like to thank all those who invited me to present papers either in talks or in workshops. So, special thanks to Andreas Demetriou, Athanasios Gagatsis, Philippos Kargopoulos, Dimitris Koliopoulos, Cristoph Limbeck, Lorenzo Magnani, Peter Machamer, Bence Nanay, John Norton, Konstantinos Pagondiotis, Woosuk Park, Li Ping, Dimitis Portides, Stathis Psillos, Konstantinos Ravanis, Susanna Siegel, Maria Venieri, Alberto Voltonini, Stella Vosniadou, John Votsis, Liu Xiao Tao, John Zeimbekis. It goes without saying that I am indebted to all the participants in these meetings whose stimulating questions and conversations helped me to develop further my arguments and views.

I am grateful to Brendan George, publisher and subject head in Philosophy at Palgrave Macmillan. Brendan embraced the project from the start and helped me go through its preparation stages. Finally, I am grateful, to say the least, to my wife Ali and our cats, Ettore, Natasha, and Zogia, who tolerated my all too frequent absences abroad when I was traveling to present my work at conferences, workshops, or talks, and who also supported me in invaluable ways through the whole process of thinking about, and writing, the book.

Contents

1 Cognitive Penetrability and the Epistemic Role of Perception
  1 Introduction
  2 Cognitive Effects on Perception and the Epistemic Problems They Pose for Perception
  3 Siegel’s Inferentialism
    3.1 Ways in Which Cognition Affects Perception
    3.2 Illicit Perception and Illicit Inferences
    3.3 Inferences in Perception
  4 Externalism: Perceptual Justification vs. Perceptual Grounding
  References

2 Cognitive Penetrability
  1 Introduction
  2 Assessing the Definitions of Cognitive Penetrability
    2.1 Pylyshyn
    2.2 Macpherson
    2.3 Stokes
    2.4 Siegel
    2.5 Wu
  3 A New Definition of CP
  4 The Epistemic Role of Early and Late Vision
  5 How Do Cognition and Perception Interact?
    5.1 The Argument
    5.2 How Do Cognitive States Modulate Perceptual Processing in Late Vision?
    5.3 IEV and EEV: Direct and Indirect Cognitive Effects on Perception
  References

3 Early Vision and Cognitive Penetrability
  1 Introduction
  2 The Operational Constraints in Perception and Fast Categorization Owing to Perceptual Learning Do Not Entail That Perception Is CP: Why Early Vision Is Not Affected Directly by Cognition Part 1
    2.1 Operational Constraints
    2.2 Perceptual Learning
  3 Early Vision: Why Early Vision Is Not Affected Directly by Cognition Part 2
    3.1 The MT/V5 to V1, V2 Interaction in Early Vision (First Pass)
    3.2 Assessing the Evidence Thus Far
    3.3 What About Modules?
    3.4 Is the Content of Early Vision Philosophically Speaking Significant?
  4 Recurrent Processes of Early Vision Do Not Involve Cognitive Information: Why Early Vision Is Not Affected Directly by Cognition Part 3
    4.1 Recurrent Processing Between MT/V5 and V1, V2 (Second Pass)
    4.2 Other Types of Early Recurrent Interactions in Early Vision and the Role of FEF
    4.3 Does Early Object Recognition Entail the CP of Early Vision (Again)?
  5 Pre-cueing Effects in Perception: Why Early Vision Is Not Affected Directly by Cognition Part 4
  References

4 The Cognitive Effects on Early and Late Vision and Their Epistemological Impact
  1 Introduction
  2 Indirect Cognitive Effects on Early Vision and Their Epistemic Impact
  3 The CP of Late Vision Does Not Justify Constructivism
  4 Concluding Discussion
  References

5 Early and Late Vision: Their Processes and Epistemic Status
  1 Introduction
  2 Early Vision
  3 Late Vision
  4 Is Late Vision a Visual Stage or a Discursive Thought-Like Stage?
    4.1 The Problem
    4.2 Beliefs
    4.3 Inference
    4.4 Late Vision, Hypothesis Testing, and Inference
  5 Late Vision and Discursive Understanding
    5.1 Late Vision Is More Than Object Recognition
    5.2 Late Vision as a Synergy of Bottom-Up and Top-Down Information Processing
  6 Beliefs: Take Two
  7 Late Vision, Amodal Completion, and Inference
  8 Concluding Discussion
  References

References
Index

Abbreviations

CAA   Cognitive Access Awareness
CDAP  Cognition Directly Affects Perception
CI    Cognitively Impenetrable
CP    Cognitively Penetrated
EEV   External Effect View
ERP   Event Related Potentials
FEF   Frontal Eye Fields
FFS   Feed Forward Sweep
GRP   Global Recurrent Processing
HSF   High Spatial Frequency
IEV   Internal Effect View
LGN   Lateral Geniculate Nucleus
LRP   Local Recurrent Processing
LSF   Low Spatial Frequency
LTM   Long Term Memory
NCC   Non-Conceptual Content
RF    Receptive Fields
TMS   Transcranial Magnetic Stimulation
VSTM  Visual Short Term Memory
WM    Working Memory

1 Cognitive Penetrability and the Epistemic Role of Perception

© The Author(s) 2019. A. Raftopoulos, Cognitive Penetrability and the Epistemic Role of Perception, Palgrave Innovations in Philosophy, https://doi.org/10.1007/978-3-030-10445-0_1

1 Introduction

Feyerabend (1981), Hanson (1958), and Kuhn (1962) argued that what someone thinks, hopes, or desires determines what they perceive. Perception became theory-laden, conceptually modulated, and cognitively penetrated (CP). (The relations among theory-ladenness, conceptual modulation, and cognitive penetrability are not as clear as they seem, but as I have argued [Raftopoulos 2009], under some independent assumptions, the equivalence holds true.) Furthermore, Sellars (1956) attacked one of the main ‘dogmas’ of classical empiricism, to wit, the view that perception functions independently of concepts and delivers to us the world in its own guise without any conceptual influences. This ‘given’, empiricists thought, could be used as a neutral basis to provide justification and determine the truth of both perceptual beliefs and scientific theories, exactly because by being free from any conceptual influences it reflects only the environment. The rejection of this assumption undermined the epistemic role of perception in justifying perceptual beliefs, since the fact that prior beliefs affect perception makes it possible that prior beliefs, by shaping the percept (i.e., the content or character of perceptual experience)1 that is subsequently used to support rationally a belief, provide indirectly (through the intermediate perception) support either for themselves or for beliefs that are congruent with them. This is a form of confirmation bias that is, at best, epistemically highly problematic or, at worst, a simple vicious justificatory circle whereby a belief justifies itself through the intermediary of perception; this clearly undermines the epistemic role of perception.

Siegel (2016, 2) explains the threat that the cognitive effects on perception pose for the traditional view concerning the epistemic role of perception, according to which perception yields receptively and passively a given that is neutral with respect to one’s conceptual frameworks and can be used as a neutral ground to justify beliefs and relate one with the world:

Experiences are not uniformly receptive. They are not always a landing pad for information (or misinformation) that tumbles in along arational channels, naively open to the objects, properties, and events that are there for us to perceive. Experiences can fail to be receptive in these ways, not because they are off the grid of rational assessment, as an undirected ‘raw feel’ would be, but because the grid has a place for experiences that arise in some of the same ways as irrational beliefs do. And this opens the possibility that the notions of ill-foundedness that we use to convict some beliefs might convict some experiences as well.

Indeed, if prior beliefs affect perceptual processing, one might worry about how this affects the justificatory role of perception. It is intuitive to argue that if the belief that X is F causally affects the perceptual processing of a visual scene in which an X is present and, as a result of this process, viewers have an experience with content ‘X is F’ on which they subsequently base their belief that X is F, one should suspect that the role of the prior belief in affecting the content of perception diminishes the rational support for the perceptually based belief, undermining the rational standing of the belief; the belief is epistemically compromised.

1 I write content or character because I wish to remain neutral with respect to the (representationalist) thesis that the phenomenal character of a perceptual experience is identical to, or supervenes on, the representational content of the experience.


Siegel (2011, 2013a, 702–703) calls the phenomenon in which CP leads to epistemically compromised beliefs the downgrade principle. The experience E that results from, among other factors, the causal influence of a prior belief and, owing to this causal influence, has its justificatory role diminished, is epistemically downgraded.

The undermining of the justificatory role of perception paved the way to constructivism. Constructivists claim that mind-independent objects are epistemically inaccessible. Epistemological constructivism undermines realism by arguing that our experience of the world is mediated by our concepts. One cannot examine directly which aspects of objects belong to them independently of our conceptualizations because perception is cognitively penetrable and, thus, one cannot compare one’s representations of objects with the mind-independent objects these representations represent. This view clashes with epistemological realism’s thesis that perception relates us to mind-independent objects (Kitcher 2001).

I draw a distinction between cognitive effects on perception and CP. The reason is that it is widely accepted (see the next chapter for a discussion) that not all cognitive effects on perception constitute cases of CP. It is also true that the cognitive effects on perception that are not cases of CP may threaten the justificatory role of perception. CP is not the only sort of cognitive influence on perception that could affect the epistemic role of perception; other cognitive effects that are not usually taken to constitute cases of CP could downgrade perception. When, for example, cognitively driven attention causes the eyes to focus on a certain part of the environment, selecting the visual scene to be perceived, or when attention selects in the same way a particular object among the objects in a visual scene, boosting its perceptual processing, this may affect the epistemic role of perception.
It could make one ignore relevant evidence, or give credence to irrelevant evidence, so that an initial hypothesis concerning the identity of the object (that is constructed in late vision, as we shall see in Chapter 5) is supported against the testimony of the environment. At the same time, because this effect of attention introduces an external link in the causal chain from the penetrating cognitive state to perception, it is not deemed to be a case of CP, a claim endorsed by most philosophers with few exceptions.


I will confine my discussion to the epistemic effects of CP. I will return in the next two chapters to discuss the indirect cognitive effects on perception in detail, and I will argue that the epistemic repercussions of indirect cognitive effects on perception differ from the epistemic effects of CP since they affect a different perceptual stage. Cognition directly affects late vision, rendering it CP, and may downgrade the percept, but it does not directly affect early vision. Indirect cognitive effects do impact early vision but do not affect its epistemic role. The indirect cognitive effects, by selecting the environmental scene to be perceptually processed, impact on the epistemic role of perception and may downgrade this role. Siegel (2016, 5) calls this sort of effect ‘the selective mode’ in which perception can depend on cognition (Siegel talks about desires but her discussion can be extended to all sorts of cognitive effects on perception), and in other works she calls them the effects of global selection (Siegel 2013a, 2016). The downgrade, however, is different in the direct and indirect cases, since the indirect cognitive effects do not affect perceptual processing itself but only the selection of perceptual input, while the direct cognitive effects affect perceptual processing itself. As we shall see, this is a significant difference because the epistemic role of early vision, which is influenced by indirect cognitive effects only, is not affected by these indirect cognitive effects. In contrast, late vision, owing to the fact that it is directly affected by cognition, has its epistemic role influenced by cognition. This chapter concerns the repercussions of the CP of perception for the epistemic role of perception in founding empirical beliefs, which calls for a definition of CP first.
CP is thought to encompass cognitive influences on perception, where cognition is broadly understood so as to include emotive states, such as desires, hopes, etc., but since not all cognitive effects on perception are considered to be cases of CP, a principled way to distinguish those cognitive effects that do signify the CP of perception from those that do not is required. I will discuss ways to make this distinction in the next chapter, where I attempt to define CP. In this chapter, I use a very generic notion of CP that is also used by Siegel (2013a, 2013b, 2016) and Pylyshyn (1999), according to which CP occurs when some cognitive states affect perceptual processing itself and not some pre-perceptual or post-perceptual stage. In the former case, cognition just selects where or on what one focuses and what one takes in perceptually, while in the latter case cognition affects the interpretation of the perceptual output. Should this be the case, the contents of the affecting cognitive states influence the contents of the affected perceptual states and some sort of semantic relation between the two is established. Macpherson (2012) and myself (Raftopoulos 2009) have argued that when this happens, concepts enter the perceptual contents.

Discussions concerning the effects of the CP of perception for the epistemic role of perception in grounding perceptual beliefs center on whether the cognitive effects affect the justificatory role of perception and, especially, on whether the cognitive effects hinder the epistemic role of perception either by rendering perception less sensitive to the data and, thus, less reliable, for those with externalistic epistemological inclinations, or, for those with epistemological internalistic tendencies, by introducing an illicit etiology in perception whereby the percept is the result of an illicit, ill-founded perceptual inference, which makes the percept in which the inference issues the result of an irrational process.

There are many views concerning the way perception justifies perceptual beliefs that are roughly divided into two main categories: those that fall within internalism and those that fall within externalism. According to internalism, the justification of perceptual beliefs by perception is independent of truth-related factors. Externalists reject this thesis, tying perceptual justification to externalist, relational factors that are truth-related. The two camps differ on the way they interpret, and account for, the problems that CP engenders for the epistemic role of perception. The disagreement follows mainly from a difference about the content of mental states in general and perceptual states in particular.
For the internalist, perceptual content is inherently intrinsic to the viewer and does not constitutively depend on the viewer’s relation to the environment; the environment is causally implicated in the formation of this content but it is not constitutively involved in this content. For the externalist, the perceptual content is inherently extrinsic, that is, it constitutively depends on the viewer’s relation to the environment at the time of the viewing act. For some externalists, the representational content of perception (throughout this book I assume that perception has representational content) includes both phenomenal content, which

6     A. Raftopoulos

is the phenomenal character (or part of it) of the relevant perceptual experience, and also a different kind of representational content, let us call it externalist content, which depends constitutively on the perceptual relation of the viewer to the external world. This discussion is underlaid by the distinction between narrow and wide content. The term ‘narrow content’ denotes the purely subjective or psychological aspect of the thought or perceptual state, the sense of the thought or perceptual state or the mode of presentation of the object the thought or the perceptual state is about, the part that “is in the head.” The term ‘wide content’ denotes the object or the worldly state of affairs that the thought or perceptual state is about. Internalists hold that the semantic content of a mental state is narrow content that is independent of its truth-conditions. Some externalists hold that the semantic content of a mental state is externalist or wide content that is an objective truth-conditional content (externalism1). Other externalists think that a mental content should be analyzed along two dimensions, an objective truth-conditional, the wide content, and a subjective psychological dimension, the narrow content (externalism2). The difference between externalism2 and externalism1 roughly corresponds to Smithies’s (2016, 1133) distinction between moderate and radical externalism. According to moderate externalism, the externalist contents that affect the justificatory role of perception do so by virtue of their impact on the representational content of perception. “Perceptual experience represents externalistic contents in virtue of representing phenomenal contents in a specific environment” (Smithies 2016, 1135) and, thus, the former are partially grounded in the latter. For radical externalism, on the other hand, the justificatory externalist facts do not depend on the phenomenal representational content of perception. 
For an externalist1 who denies that mental states have narrow semantic content, the justificatory impact of the externalist (wide) content of the mental state cannot depend on the phenomenal representational content, since the latter is the narrow content of the mental state that is not part of the semantic content of the state and, therefore, plays no role in the epistemological role of the state. An externalist1 is a radical externalist. For a moderate externalist, it is conceivable that the

justificatory role of the wide content is grounded on the representational (narrow) content, which presupposes that the mental state has such a narrow content in addition to its externalist content. A moderate externalist is, thus, an externalist2. In fact, a moderate externalist adds the requirement that the wide content be implicated in the epistemic role of a mental state only in so far as it has an impact on the phenomenal content of the mental state. I do not think that the epistemic role of the wide content depends on the phenomenal content of a mental state because in the case of perception, at least, the representational content of a perceptual state may play an epistemic role even if it is outside the scope of awareness, i.e., at a subpersonal level (see also Lyons 2005, 2009, 2016, and Siegel 2013a, for a related view).2 Subpersonal perceptual contents may justify perceptual beliefs provided that perception functions reliably, or adequately performs its proper function, which requires that it be sensitive to the environmental input. I agree, therefore, with Lyons (2009) that someone need not have experiences in order to have perceptual beliefs, but they must have a perceptual system that reliably produces beliefs. Note that this view does not necessarily commit one to the view that blindsighters, for example, who systematically, for some reason that is unrelated to the function of their perceptual system, form correct beliefs about what happens in their blind side, are justified in holding these beliefs. The reason is that these beliefs are not issued by their perceptual system, since no signals from their blind side reach V1 in order for the perceptual processes that eventually lead to perceptual beliefs to get off the ground. In Raftopoulos (2009, ch. 6; 2011), I argued that both early vision and late vision are constitutively related to the environment, in that the

2 Things are more complicated. Naïve realists are externalists1 since for them the contents of perceptual states are worldly states of affairs and, thus, the perceptual states do not represent phenomenal contents but relate us directly with the world; in fact they represent nothing, they just relate viewers to the world. As such, only the (external) relations to the world matter for the justificatory role of perception. Naïve realists are clearly among the radical externalists. There are, however, philosophers (Tye 2000, 2002, 2006) who hold that perception has representational content that is a form of Russellian content and who try to analyze perception in terms of perceptual modes of presentation that concern only properties but not objects (but see Tye [2009] for a different object-involving approach).

viewer is in a de re relation with it. Being related in perception with an object means that one is in direct (i.e., without any conceptual intermediaries) contact with the object itself and retrieves information regarding that very object from the object itself and not through a description that would individuate or identify the object and thus, by depicting it, would secure reference to it. Perception puts us in a de re relationship with the object (as opposed to a descriptivist relationship), and the ensuing perceptual judgments are de re judgments; when one forms a de re belief, one stands in “appropriate nonconceptual, contextual relations to objects the belief is about” (Burge 1977, 346). Recanati (1997) calls the de re relation, or relation of acquaintance, between viewers and the objects they perceive in the visual scene “a fundamental epistemic relation,” a term also used by Perry (2001). This constitutive dependence of perceptual content on the visual scene is reflected in the fact that a perceptual mode of presentation is object involving and inherently contextual (Raftopoulos 2009, ch. 6). For the externalist, it is intuitive to assume that CP downgrades perception because it affects perceptual processing in a way that renders the percept, the output of perception on which perceptual beliefs are grounded, epistemically suspect. It does so by raising concerns about whether this percept reflects, or more or less accurately represents, the environmental evidence, or whether it reflects more the contents of the cognitive states that penetrate perception (Lyons 2005, 2011; Siegel 2016).
Since for the externalist, the epistemic impact of perception hinges also on the relation of the content of the perceptual state with the environment (its wide content), it follows that two viewers who, facing the same visual scene, share the same phenomenal content (they have the same narrow content) may differ in the degree of justification their respective states confer on them on account of the fact that they may be related differently to the visual scene. If one endorses the view that a perceptual state justifies a belief because it is evidence for it, in the preceding situation the two viewers, their sharing the same narrow content notwithstanding, differ with respect to what evidence they possess; this is Silins’ (2005) ‘evidential externalism’. One of them, for example, may have a veridical perception of O being F, while the other one hallucinates that O is F. Their narrow contents are, allegedly, indistinguishable but since only the

former is related to O, while the latter is not, only the former is justified in believing that O is F. As we shall see, some externalists are sympathetic to the internalist’s intuition that since both viewers share the same phenomenal content, there is a sense in which both are justified in believing that O is F. To accommodate their externalist allegiance, they introduce another dimension of justification that makes it possible to say that the two viewers are differently justified in believing that O is F. For the internalist (McGrath 2013a, b; Siegel 2011, 2013a), cognitive effects on perception may have bad epistemic effects that lead to ill-founded perceptual beliefs because they may introduce an irrational etiology of perception, that is, an etiology that introduces an epistemologically suspicious inference in perception, rendering the perceptual process irrational. Therefore, CP undermines the justificatory role of perception by vitiating the epistemic credentials of perceptual inferences, which justifies calling this class of views ‘inferentialism’. According to Siegel (2013a, b), this irrational etiology diminishes the sensitivity of perception to distal data, which distorts perception’s role in forming a percept that reflects the environment. This is the point at which externalistic considerations enter Siegel’s otherwise internalistic discussion of the epistemic effects of CP. Recently, Siegel (2016) seems to combine an even more externalistically inclined approach with her inferentialism, arguing that the epistemic problems that cognitive effects in general and CP in particular pose for the justificatory role of perception are due to the fact that these effects undermine the epistemically proper relation of perception to the evidence, where this evidence constitutively depends on the environment (Siegel 2016, 3, 5, 19). When this happens, the rationality of perception is undermined.
I have tied the epistemic downgrade of perception owing to cognitive effects on perception to the externalist view that these effects tend to reduce the reliability of perception by rendering it less sensitive to the available evidence. Intuitively speaking, one expects that the evidence should control the percept in the sense that a perceiver should form a percept that best reflects the environmental input. Should the evidence indicate that F-ness is present in a visual scene, the perceiver should form a percept including F-ness, and this should happen independently of whether the perceiver wishes, or has a prior belief, that non-F-ness be

present. CP may affect the sensitivity of perception to the environmental data in different ways. Cognition may affect it by causing perception to select as evidence those data from the environment that confirm, or are congruent with, the contents of the penetrating cognitive states and ignore other data that do not. Let me stress that framing the threat posed by CP in terms of perception’s ‘sensitivity to the environmental data’ purports, first, to clarify to which sort of evidence perception must be sensitive in order for cognitive effects not to affect its epistemic role, and, second, to distinguish, as a first pass, discussions about the epistemic side effects of CP from discussions concerning the role of background information in allowing seemings to act as justifiers of beliefs. We know, for example, that the lines in the Müller-Lyer illusion have the same length, and, yet, our perceptual systems ignore this piece of counterevidence and we keep experiencing two unequal lengths. In this case perception is evidence-insensitive, but the evidence at issue is not about some environmental data and our perceptual make-up but concerns some beliefs. The point is that a seeming may be evidence-sensitive to the environmental data and our perceptual make-up, and at the same time, it may be evidence-insensitive to a set of beliefs held by the viewer. Both are conditions for the cognitive impenetrability (CI) of perception, but the coexistence of sensitivity and insensitivity in the same condition does not entail an inconsistency because they range over different domains: sensitivity to the environmental data and insensitivity to beliefs. Alternatively, cognition may affect perception when during late vision (in Chapter 5 we shall see that in late vision hypotheses concerning the identity of objects in a visual scene3 are tested against the

3 Sometimes I write that perception selects information from the environment and other times I write that perception selects information from the visual scene. The distinction I have in mind is the following. By environment I mean the whole space in front of viewers from which they choose a part on which to focus attention. The visual scene is that part of the environment on which they focus and which will be perceptually taken in; selecting it is the role of peripheral attention. It follows that the viewer’s perceptual processes prioritize for processing information from the visual scene, although peripheral vision is also sensitive to information outside the visual scene.

iconic information contained in the ‘iconic image’ (an iconic image contains information retrieved from the visual scene by early vision and stored for a limited amount of time in early visual circuits)) cognition guides the search for the information contained in the iconic image so that only confirming evidence for the hypothesis about the identity of an object that is congruent with the content of the penetrating cognitive states be selected and recalcitrant information be ignored. Cognitive effects bias perceptual processes to favor the viewer’s expectations, vitiating the role of perception to depict the environment as faithfully as possible. These two ways roughly correspond to Siegel’s (2016, 5–6) distinction between the selective and the responsive mode in which cognitive states may affect perception. In the selective mode, the cognitive states select the distal stimuli that will be perceptually processed and, hence, which evidence perception will use to form a perceptual belief, a selection that takes place through the effects of cognitively driven spatial or object/feature-centered attention. It is widely acknowledged that this sort of effect is not a case of CP; CP purports to cover cases in which cognition affects the formation of the percept given the same input. In the responsive mode, the cognitive states control which beliefs a perceiver forms in response to a body of evidence. In perception, this means that the cognitive states controlling the formation of the percept do so by controlling the way the evidence, in the form of low-level perceptual input, is handled; this is a typical case of CP. This low-level perceptual input is the content of pre-experiential perceptual states (Siegel 2016, 19), which constitutively depend on the distal scene. Not all cognitive effects on perception, including cases of CP, downgrade perception.
Some cognitive effects, such as perceptual expertise and familiarity, not only do not downgrade perception but, instead, enhance its epistemic role. Other cognitive effects do not downgrade perception even though they are ubiquitous in everyday visual perception. Late vision, for example, is CP through and through and it is the locus at which cognitive effects directly affect perception. Most of the time, however, late vision is reliable despite the cognitive effects on its processes and allows the viewer to form the right percept. Thus, the CP of late

vision notwithstanding, perception in most cases of everyday perception is not downgraded. Any account of CP and of its epistemic repercussions should be able to explain why sometimes CP does not downgrade perception and other times does. Siegel (2011, 2013a, b) proposes that the cognitive effects that downgrade perception do so because they introduce an irrational etiology. Lyons (2011) argues that it is the nature of the penetration that determines whether CP reduces the reliability of perception by increasing the probability that the percept is false. The challenge is to provide a principled account of why some cognitive effects downgrade perception and others either do not affect, or even enhance, the epistemic role of perception. The discussion thus far suggests a two-pronged problem concerning the impact of the cognitive effects on perception for its epistemic role in justifying perceptual beliefs. First, one has to provide an account of CP that offers a principled way to distinguish between cognitive effects that constitute cases of CP and cognitive effects that do not. Second, one should be able to explain why some cases of CP downgrade perception, while others do not. In the next chapter, I will address the first concern and offer a new definition of CP that reframes the problem of CP. In this chapter, I address the second concern, i.e., the problems that CP poses for the epistemic role of perception and the various attempts to answer the questions raised above. I will not discuss here the epistemic effects of the indirect cognitive influences on perception. I start with the core thesis of phenomenal conservatism or seemings internalism, namely that if O seems F to S, then S is prima facie justified in believing that O is F, because how things look to perceivers certainly plays a central role in the justification of perceptual beliefs they may form.
I present and assess a recent internalistic attempt to explain the effects of CP on the epistemic role of perception offered by Siegel (2011, 2013a, b, 2016). I argue that seemings internalism cannot by itself account for the various problems that CP creates for perception, and that externalist considerations should be brought to bear. Such externalistic factors are the reliability of perception, that

is, whether perception can be deemed reliable in view of the cognitive effects on perception, and the sensitivity to the visual facts. These factors are also tightly related to externalist considerations, such as the proper function of perception, the fact that perception is a competence or capacity that fulfills this proper function, and others. It is beyond the scope of this chapter to examine the intricate relations among these externalist factors, although I will have a few things to say when I discuss Schellenberg’s (2013, 2014, 2016a, b) views. Externalistic factors make it possible to discuss the extent to which perception grounds beliefs as opposed to merely justifying them, and allow one to argue that although CP perception may perhaps justify beliefs, only non-CP perception can ground them by connecting beliefs with the environment. My aim is to discuss the conditions under which CP downgrades perception, and whether this downgrade can be systematic and irrevocable, although this latter problem will be covered in Chapter 4. This chapter is structured as follows. In the first section, I discuss briefly the problem that the view that perception is CP has created for the epistemic role of perception in justifying perceptual beliefs. In the second section, I discuss and assess a specific attempt, namely inferentialism, to provide an answer to the problem concerning the epistemic impact of CP. I concentrate mainly on Siegel’s account and claim that the discussion of Siegel’s inferentialism indicates three main theses underlying her assessment of the repercussions of the existence of cognitive effects on perception for the epistemic role of perception. I examine each of these theses starting from the last one since it is the starting point of Siegel’s analysis of the cognitive effects on perception. 
In the third section, I argue that, in view of the failure of inferentialism to provide an adequate account of the epistemic repercussions of CP, externalistic considerations such as the sensitivity of perceptual processes to environmental data and the reliability of perception should be brought to bear. I elaborate on externalistic accounts that purport to explain why CP undermines the justificatory role of perception, and in which cases it does so.

2 Cognitive Effects on Perception and the Epistemic Problems They Pose for Perception

It is intuitive to think that perceptual experience provides defeasible evidence for beliefs and that it does so directly, without any intermediate mental states, just because it is perceptual experience. Perception is the faculty par excellence that brings perceivers into contact with the world and, thus, what better way is there to get to know the world than perception itself? Thus, perceiving p provides prima facie justification, i.e., rational support, for the proposition p. This thesis constitutes the core of the experientialist theories of perceptual justification (Ghijsen 2016, 2). The term ‘experientialism’ purports to convey the sense that states of perceptual experience that are not doxastic, that is, are not held as beliefs by S, can nevertheless justify some of S’s beliefs; non-doxastic experience can serve as evidence for some beliefs. In the most classical form of this view, which is called perceptual or phenomenal dogmatism4 or conservatism, or seemings internalism (Audi 2003; Lyons 2005, 2011, 2016; Huemer 2007; Markie 2005, 2006, 2013; McGrath 2013a, b; Pryor 2005; Siegel 2011, 2013a; Tucker 2010, 2014), it is held that if it perceptually seems to S that p, then, thereby, S has prima facie perceptual justification for the proposition p. Differently put, having an experience with content p suffices to give S immediate (meaning that S does not have to believe anything else) prima facie justification for proposition p. Pryor (2000, 538) calls these propositions ‘perceptually basic propositions’. There are

4 There is some confusion with respect to the term ‘conservatism’ since it can also be used to describe the view that if the CP of perception engenders defeaters that prevent a perceptual belief from being justified even when it is based on perception, it undermines the reliability of perception and, hence, it blocks perception from providing propositional justification as well (Steup 2018, 2912–2913). It follows that perceptual experience is a source of justification only if evidence of unreliability is absent (Steup 2018, 2913). I will keep using dogmatism and conservatism to describe what for Steup should be properly called ‘dogmatism’. Notice that for Steup (2018, 2911), conservatism is an internalist theory of perceptual justification since it does not “take de facto reliability to be necessary or sufficient for a sense experience to have justificational force.”

several motives underlying this view and one of them, closely related to the function of perception, is the so-called transparency of perceptual experience; perceptual experience is transparent in the sense that when someone attends to their perceptual experience, they attend to the objects and properties the experience presents to them as in their environment; the phenomenology of the experience, its phenomenal content, or character of the experience, presents the world as being a certain way. Since perceptual experience presents to viewers worldly states of affairs as in their environment, it is rational that these perceivers take what their perceptual experience offers them at face value and form, prima facie, the belief whose content corresponds (more about this correspondence in a while) to the phenomenal content or character of their experience.

It is because perceptual experience has the phenomenal character of confronting one with objects and properties in the world around me that it justifies forming beliefs about those objects and properties. (Smithies 2014, 103)

One could add the fact that since on the basis of their experience viewers are justified in believing a proposition whose content matches that of the experience, it is rational for them to heed the testimony of their perceptual state; the perceptual state provides evidence for the proposition expressed by the corresponding belief (Schellenberg 2014, 96). Thus, when it perceptually seems to S that p, this provides S with reason to believe that p. (As a matter of course, S’s having some reason to believe that p does not entail that S believes that p. If, however, the fact that it seems to S that p is a reason for which S believes that p, then S believes that p for that reason, that is, because it seems to S that p.) So, if S is in a perceptual state with content P, or a content that could be conceptualized as P, S has prima facie evidence for P because S is in that particular state with that particular content; the state and its content have an autonomous epistemic force such that they justify a belief with content P. This makes perceptual experience a holder of an epistemic property, which Siegel (2015) calls epistemic charge.

Phenomenal dogmatism is about the justificatory role of the phenomenal content of a perceptual state, that is, the way the visual scene is presented to the viewer or how the visual scene seems to the viewer. If one bears in mind (i) the role of attention in affecting perceptual phenomenal content, (ii) that cognitive states drive the so-called cognitively driven, or endogenous, attention, (iii) that cognitively driven attentional effects are delayed in time and affect perceptual processing about 170 ms after stimulus onset (see Chapter 3, and Raftopoulos 2009), and (iv) that there is a distinction between phenomenal awareness and phenomenal nonconceptual perceptual content, on the one hand, and cognitive access awareness and conceptual perceptual content, on the other hand, one might wonder what the range of ‘phenomenal’ in phenomenal dogmatism is. Is it the phenomenal content before cognitively driven attention intervenes and modulates perceptual processing, affecting the phenomenology of the perceptual state, or is it the phenomenal content associated with the cognitively modulated perceptual states? I have addressed this problem elsewhere (Raftopoulos 2015b). This problem is inextricably related to the problem of whether attention is required for perceptual justification. Siegel and Silins (2014) distinguish between ‘attention needed’ and ‘attention optional’ views concerning the role of attention in perceptual justification, within the context of phenomenal dogmatism. According to the former, only perceptual experiences that are attended could justify beliefs. According to the latter, a perceptual state that is not attended can still have a phenomenal content, provided that there is phenomenal consciousness without attention, and, therefore, it could still play a justificatory role.
The two problems are interrelated because if attention is required for perceptual justification, the phenomenal content that justifies a belief is the attentionally modulated content of perception. If, however, attention is not needed for justification, the phenomenal content that justifies is the pre-attentional content of perception. The problem, in this case, is that this content is not cognitively accessible by the viewer and, thus, one wonders whether one could properly assign to it the capacity of being able to justify. In what follows, I proceed under the assumption that for internalists, including inferentialists, what justifies is a seeming, which for most

philosophers is the phenomenal content of a propositionally/conceptually structured perceptual state. Tucker (2010) and McGrath (2013a, b) specifically endorse this view and argue that any perceptual downgrade is located in the quasi-inference from a receptive seeming to the propositionally structured states that contain the nonreceptive seeming. Although Siegel acknowledges the role of subpersonal processes in perceptual justification (Siegel 2013a), her recent views (Siegel 2016) show that she thinks that the downgrade due to CP is located at the level at which cognition affects the handling in perception of the evidence provided by some pre-experiential, subdoxastic states. Since all these issues will be addressed in what follows, I do not elaborate here. By rejecting internalism and adopting a form of externalism according to which perception can justify beliefs if it is sensitive to the environmental data, that is, if it (re)presents them as accurately as possible given the limitations of our perceptual systems, I will recast the problem in terms of what is the impact of cognition on the epistemic role of the two perceptual stages, namely early and late vision and I will argue that cognition does not impact on the epistemic role of early vision but it does so for late vision. It follows that any perceptual downgrade is due to the cognitive influences on late vision. This entails that if there is an epistemic blame this falls on the shoulders of the cognitively modulated phenomenological content of a state of late vision (the nonreceptive seemings of McGrath’s), and not on the nonconceptual contents (NCCs) of early vision. 
Let me also pause for a while to discuss the matter of whether it is appropriate to use the term ‘evidence’ to describe the supporting relation between perceptual experience and beliefs through which perceptual experience justifies basic (perceptual) beliefs.5 Some philosophers (Austin 1962, 115; Byrne 2014; McDowell 2011) explicitly or implicitly object to the usage of the term ‘evidence’ to express the rational support perception can provide to beliefs. One of the reasons for denying that perception provides evidence for beliefs (although not for Byrne or

5 The relationship between evidence and justification is discussed in Byrne (2014), Fantl and McGrath (2009), Lyons (2015, 2016), and McGrath and Fantl (2002).

McDowell) is that, while evidence consists of facts that are expressed by propositions, many philosophers have argued that perception has nonconceptual content (NCC) or, at least, that its contents are not the contents of propositional attitudes. Propositional contents, however, cannot stand in logical relations with non-propositional contents (Williamson 2000, 194–195).6 In Chapter 5 (see also Raftopoulos 2011), I argue that late vision, in which the seeming is formed (and whose role is pivotal in justifying beliefs), has states with hybrid contents, including propositional contents. Therefore, this objection does not pose an insurmountable obstacle to the view that perception provides evidence for beliefs. Things are different with early vision, however, because early visual contents are purely iconic and have purely nonconceptual content (Burge 2010; Burnston 2017; Heck 2007; Raftopoulos 2009, 2014). Therefore, should evidential relations be restricted to propositional contents, early vision cannot be said to provide any sort of evidence. Recall that the core of experientialism is that non-doxastic experiential states can justify doxastic states such as beliefs. This entails nothing concerning the nature of the content that does the justifying. It does not entail, for example, that some NCC justifies or evidences a belief. One could argue (McAllister 2018; Tucker 2010) that the justifiers are the seemings, which are perceptual states that have propositional/conceptual contents that are not held as beliefs by the viewer and, thus, are non-doxastic. In Chapter 5, I argue that such perceptual states with hybrid iconic/conceptual contents, akin to seemings, do indeed exist in late vision and may well fill the justificatory role of perception.
In this framework, the epistemic role of early vision and how it contributes to the formation of the seeming needs to be determined, least one face the problem that the epistemic role of perceptual experience per se, that is, of pure perception without any conceptual involvement, becomes redundant and should be deleted.7 I will take up this problem in Chapters 3 and 4 and I will argue for the indispensable and crucial role of early vision

6 As Davidson (1986, 3111) expresses it, “The relation between a sensation and a belief cannot be logical, since sensations are not beliefs or other propositional attitudes.”
7 See Lyons (2016, 1075) for a discussion of this problem.

1  Cognitive Penetrability and the Epistemic Role of Perception     19

in the justificatory process whereby perception in general and perceptual experience in particular can justify or support or ground the corresponding beliefs. Another reason to doubt the appropriateness of using ‘evidence’ to explicate the perceptual justification of beliefs is that evidential relations between sets of propositions, or contents in general, function either in a deductive way, whereby from a set of premises (that is, the [propositional] content of perceptual states) one deduces a conclusion (the perceptual belief rationally supported by the relevant perceptual experience), or as inferences to the best explanation, whereby one considers which proposition, if true, would best explain the evidence, and this proposition is the one that the evidence supports; to paraphrase Byrne (2014, 103), evidence is something one is in a position to reason from. In Chapter 5, I argue that neither early nor late vision supports inferences in this sense, which I call ‘discursive inferences’. Even though the non-inferential process that results in the formation of a recognitional thought/belief could be recast in the form of an argument from some premise to a conclusion, this does not entail that the formation of the perceptual belief is a case of reasoning, that is, a transition from a set of premises that act as a reason or as evidence for holding the thought (the content of the belief) to the thought itself. Perceivers may be asked on what grounds they hold the thought that O is F, in which case they may reply “because I saw it”. However, the reason they cite as a justification of their thought is not a premise from which they inferred the thought. They do not argue from the thought “I saw it to be thus and so” to the thought “It is thus and so”. They just form the thought on the basis of the information included in their perceptual state in a non-inferential way. 
What warrants the recognitional thought “O is F” is not the thought held by a perceiver that they see O to be F but the perceptual state that presents to them the world as being such and such. “When one knows something to be so by virtue of seeing to be so, one’s warrant for believing it to be so is that one sees it to be so, not one’s believing that one sees it to be so” (McDowell 2011, 33). Externalists, such as Ghijsen (2016) and Lyons (2009) to name some recent accounts, also argue that perception is not evidence for beliefs; perceptual justification is not an evidential relation. The reason is that


the evidence itself must be justified in order to be able to confer any sort of justification on some propositions. A set of beliefs constitutes evidence for some other beliefs if, among other things, the beliefs in the set are justified, and it is this justification that bestows rationality on the justification relation in the sense that the ensuing beliefs have been arrived at rationally. This, however, paves the way for the well-known infinite regress that besets internalist accounts of justification. If perception were to provide evidence for beliefs, it too should be justified, and this would require the existence of some prior mental states that justify perception, or an account of why perception could be taken as evidence without the need for any further justification of its contents. This has led some externalists to reject the view that perception yields evidence for beliefs; instead, perception justifies beliefs by being part of a process that is in its entirety reliable (Lyons 2009).8 Be that as it may, since nothing important concerning the discussion about CP hinges on whether one calls the relation between perceptual states and beliefs an evidential relation, I will keep using both ‘justification’ and ‘evidence’ to express the rational support perception can provide to propositions or beliefs. Some among the philosophers who propose phenomenal dogmatism (e.g., Huemer 2007) hold that perception can justify beliefs and, thus, provide reasons for holding them, only if it has propositional and, thus, conceptual content. Similarly, Tucker (2010) argues that perceptual justification is carried over by the seemings, which are propositional, non-doxastic states. In this he joins a series of philosophers who, like McDowell (1994, 10), think that since perception can justify beliefs, admitting that perception has NCC would mean that the ‘space of

8 The transition from the thesis about the reliability of perception to the thesis that perception can justify beliefs may take many forms. According to standard reliabilism, the epistemic force of perception relies on the fact that perception is reliable; if perception relates us reliably with the environment, the contents of our perceptual states justify beliefs with matching contents. According to Burge’s (2003) externalism, further conditions should be added to standard reliabilism, conditions that refer to the ontogenetic and phylogenetic history of perceivers in their environment and explain why perception functions the way it does and why it entitles perceivers to hold certain beliefs. This would allow one to define the proper function of perception, a term that can be found in externalist discussions of perceptual justification (Plantinga 1993).


reasons’ should be extended beyond the ‘space of concepts’, which they find problematic and reject. One might think that this view clashes with one of my main claims throughout this book, namely, that early vision has NCC; it does not, however, because if the onus of perceptual justification lies in the percept, this is formed in late vision, the states of which have hybrid contents that include conceptual contents that, however, are not represented in a propositional format. In Chapter 5, I argue that the nonconceptual output of early vision is conceptualized in late vision, which allows for conceptual contents eventually to be formed although the format in which the contents of late vision are represented is not propositional, with, perhaps, the exception of the recognitional belief “that O is F” that arises when the percept is formed. My views, however, are compatible with the view that this belief emerges spontaneously in the space of reasons. Heck (2000, 511) and Millar (2011, 338) also offer the view that in perception at some point concepts are immediately, that is, noninferentially, applied in response to seeing something. Of course, the conceptualization does not create anew the content and the structure of the states of late vision. Both are crucially based on the content that early vision outputs. The conceptualization, however, has two effects on this content. First, it enriches it with semantic information and, second, since the conceptualization is accompanied by attention, it changes some of the qualitative aspects of the content, as well as the structure of the experience, that is, the way the world looks to the perceiver (see Chapter 5 and Raftopoulos [2015] for the ways attention affects phenomenology). It follows that for perception to be able to support beliefs concerning objects and their properties, early vision should output contents with the requisite iconic or analogue structure. 
These contents, for perception to be reliable, should to some extent reflect the content of the distal visual scene. As I have argued (Raftopoulos 2009, 2015a), early vision does have the required content to support the justification of beliefs despite the fact that it has nonconceptual and, thus, non-propositional contents. Two caveats are needed at this juncture. First, I assume, and in this I join a long line of philosophers and cognitive scientists, that perception (or a part of it) has NCC. This content, whatever it may turn out


to be, is certainly not propositional content and, thus, differs radically from the content of beliefs, which is propositional. Some philosophers think that there is another part of perception that has a hybrid content, partly conceptual content and partly NCC. In both cases, neither the non-propositional NCC of perception nor its hybrid content can be the same as the purely propositional content of beliefs. Therefore, it is not correct to say that a perceptual state with content P prima facie justifies a belief with content P, because the two states have different sorts of contents. However, it is intuitively correct to affirm that an experience as of a red cat justifies the belief that there is a red cat, whereas it does not justify the belief that there is a black dog. One could say that the content P of a perceptual state justifies a belief that has a correspondent or matching content P*; P* may be construed as the conceptualization of P (Burge 2010; Raftopoulos 2009), or as the content that is in a canonical correspondence with P (Peacocke 2004). As nothing significant hinges on the relationship between the two contents that affects the role of CP in perceptual justification, I will say that a perceptual experience with content P provides prima facie justification for the belief that P*, where P* is the propositional content that matches P in some appropriate sense of matching. Second, I have said that perceptual content P provides prima facie justification for proposition P or that it provides justification for believing that P is the case. These two, however, are not equivalent. A perceptual experience of subject S with content P can prima facie justify a proposition P*, but S may not form the belief that P* is the case. The former is called propositional justification and the latter doxastic justification (Lyons 2016; Siegel 2011, 2013a, 704; Steup 2018; Tucker 2014, 35). 
Someone may have propositional justification for P independently of whether they believe that P*; this is so because experience as of P prima facie justifies P* even when the viewer has defeaters for P* and, therefore, abstains from forming the belief that P*. Steup (2018, 2912) argues that propositional justification is involved when one asks the question “When is a perceptual experience a source of justification?” If there are no defeaters (I will not discuss the issue of whether the viewer must be aware or not of the presence of the defeaters to avoid entering into the discussion between internalism and externalism


in epistemology that will only complicate things), the viewer may pass from propositional justification to doxastic justification for P* and, thus, form the belief that P*; that is, doxastic justification results when S believes something that is propositionally justified and S bases the belief on that which propositionally justifies it (Kvanvig 2003). According to Steup, doxastic justification is involved in questions of the type “when does a perceptual experience succeed in justifying a perceptual belief?” This last remark is important because it shows that there is a link between propositional and doxastic justification.

If an experience provides perceptual justification for a proposition P, then if a belief in P based on E isn’t doxastically justified, that status (as not doxastically justified) will be due to some factor other than E. (Siegel 2013a, 706)

In other words, if experience E propositionally justifies P* then it also doxastically justifies P* unless some defeaters make the viewer abstain from forming the belief that P*. Lyons (2011, 292) similarly claims “one can draw conclusions about prima facie justification, starting with claims about ultima facie justification, having ruled out the presence of defeaters.” Let me also note that perception is intended here to include both unconscious subpersonal processes and personal level perceptual experience that presents to the viewer the world as being some way or other. The latter is the phenomenology associated with the experience and is called the phenomenal content/character of perception. The reason I include subpersonal processes in the factors that may affect the epistemic impact of perception is that I agree with Lyons (2005, 2009, 2015, 2016) and Siegel (2013a) that subpersonal perceptual processes do play such a role. There is a marked difference between Lyons and Siegel, of course, in that for Lyons subpersonal processes could by themselves justify a belief, whereas for Siegel they do so inasmuch as they contribute to the formation of states with phenomenal content. In my discussion, however, I concentrate on the epistemic impact of the content of perceptual experience that presents the world to the viewer as being in some way. Moreover, when I talk about the phenomenology


of an experience, I include both what is called the phenomenal content of perception (Block 2007; Burge 2010; Lamme 2005; Raftopoulos 2009), which is associated with the NCC of perception, and the phenomenology associated with the cognitive access content of perception, which is the content that is available to the cognitive, consumer parts of the brain and which is the conceptualized content of perception. In Raftopoulos (2015b), I called the phenomenology of perception that has been affected by cognitively driven attention and, hence, by the contents of the cognitive states that drive attention, ‘conceptually modulated visual awareness’.
Hanson (1958), Kuhn (1962), Churchland (1988), and other philosophers of science interpreted findings in psychology and neuropsychology as showing that cognitive states involving propositional (conceptual) contents affect perception. This was used as a springboard to mount an attack on the received view in the philosophy of science that there is a theory-neutral observational basis on which a rational choice for empirical adequacy between competing theories could be made. Sellars (1956), for his part, sought to undermine one of the tenets of classical empiricism, to wit, the view that we can have access through introspection to the contents of perception independently of concepts, which, this way, deliver to us the world in its own guise without any conceptual influences and, thus, allow us a direct access to the world. This ‘given’, empiricists thought, can be used as a neutral basis on which to determine the adequacy of both perceptual beliefs and scientific theories. Since the CP of perception undercuts the possibility of such a given, the justificatory role of perception is undermined. The problem that the CP of perception poses for perception, therefore, is that it endangers the justificatory role of perception, that is, the role of perception in justifying perceptual beliefs; it may downgrade perception. 
If prior beliefs affect perceptual processing, one might wonder how this affects the justificatory role of perception. It is intuitive to argue that if the belief that X is F causally affects the perceptual processing of a visual scene in which an X is present and as a result of this process a viewer has an experience with content “X is F” on which she subsequently bases her belief that X is F, one has a right to suspect that the role of the prior belief in affecting the content of perception


undermines or diminishes the rational support for the perceptually based belief; the belief is epistemically compromised. Phenomenal dogmatism or conservatism does not perceive CP as a threat to its main tenet, namely that if it perceptually seems to S that P, then, thereby, S has prima facie perceptual justification for the proposition P, because even if CP undermines the epistemic standing of the perceptual output, i.e., the percept, by decreasing its probability of being true, S still has prima facie perceptual justification for the belief based on the percept. Thus, CP may block doxastic justification afforded by perception, but it does not affect the propositional justification afforded by perception. In view of these, dogmatists may bite the bullet of CP because CP is not perceived as threatening their internalist epistemological views concerning prima facie justification. Perceptual experiences are given and, thus, perceivers are not epistemologically speaking responsible for them. As such, experiences are the starting point of the epistemic evaluation of perceptual beliefs but they are not themselves epistemically evaluable and thus, they cannot be justified or unjustified. It follows that a perceiver that has a penetrated experience with some content and a perceiver that has a non-penetrated experience with the same content are in the same epistemic position, in that they are equally prima facie justified for having a belief with the content that matches the content of the perceptual experience. Other philosophers, while anchored within the internalist camp, think that the CP of perception poses a real threat to the epistemic role of perception in grounding perceptual beliefs. To counteract the threat, an internalist way must be found to account for the downgrading role of cognitive effects on perception in general and CP in particular. 
Markie (2005, 2006, 2013), McGrath (2013a, b) and Siegel (2011, 2013a, b) propose a version of an internalistic account of perceptual justification, which could be called ‘inferentialism’, according to which the perceptual process of the production of the percept is a sort of inference, which can be undermined at various stages and for various reasons. As such, perception becomes epistemically evaluable. CP is just one such reason, one that introduces an irrational ingredient in the justificatory process and, hence, downgrades the epistemic role of perception.


3 Siegel’s Inferentialism

In discussing inferentialism, I will concentrate on Siegel’s work and will not discuss the very interesting variations of Markie’s and McGrath’s. According to Siegel (2013a, 707), when the CP of an experience epistemically downgrades the experience by diminishing its justificatory role, this happens because the experience is formed through an irrational process, or equivalently, it is the irrational etiology of the experience that epistemically downgrades it (Siegel 2013a, 699–700). The irrational etiology of experience makes it serve as a carrier for forms of influences on beliefs that are epistemically bad. The experiences that are generated through an irrational process, i.e., those that are causally affected by prior mental states in a way that diminishes their justificatory role, generate ill-formed beliefs on account of their etiology. These are the checkered experiences. Up to this point, Siegel uses the irrational etiology of experience to explain why the epistemic power of the experience is downgraded. One could argue, however, that the irrational etiology of the experience has repercussions for the epistemic character of the experience itself. An experience formed through rational processes is of different epistemic standing from an experience formed through irrational processes and, therefore, exactly as with beliefs, whether a subject has rationally or irrationally formed experiences relates to the subject’s rational standing. This makes perceptual experience a holder of an epistemic property, which Siegel (2015) calls epistemic charge. It is this property that is transmitted to the beliefs grounded in the experience and renders them less or better justified by the experience; the lesser the epistemic charge of an experience, the lesser its epistemic power to justify beliefs. 
Before I proceed, let me point out that some of my arguments against Siegel’s inferentialism, specifically those concerning the failure of inferentialism as a branch of internalism to account for the epistemic downgrade of perception by CP, parallel those of Ghijsen (2016) and Lyons (2009, 2015). This is why I do not expand much on these. The arguments, however, put forth here and in the next chapters against Siegel’s distinction between CP and attention and the way this is reflected in


the discussion of the selective and responsive modes in which cognition may affect perception, as well as the criticism of Siegel’s more general claim that perception involves discursive inferences, are relatively novel, although I am certainly not the first one to argue that perception does not involve discursive inferences (Hatfield 2002). To discuss the problems associated with the CP of perception, one should first define CP. Since Siegel has extensively addressed this problem, I will use Siegel’s definition of CP. (I will return to discuss various problems of Siegel’s view and other definitions of CP in the next chapter.) For Siegel, CP covers all cases of influences on the contents of experience by prior mental states, including cognitive and emotive states. CP signifies, thus, the causal influences on the contents of perception by prior mental states such that these contents are altered as a result of the cognitive influences. There are also cognitive influences that are not cases of CP, such as those associated with the selective mode in which cognition may affect perception (Siegel 2016). Not all forms of CP lead to epistemic downgrade. Some forms of CP are beneficial for the viewer in that they increase the viewer’s sensitivity to the visual information in the environment (for the externalist), or in that they render perceptual inferences more rational or better grounded since the evidential basis is strengthened (for the inferentialist). CP that results from expertise and familiarity owing to perceptual learning, for example, does not undermine the justificatory role of experience but, rather, increases it (Siegel 2011, 2013a, b, 702; see also Lyons 2011). If CP downgrades experience because of the irrational etiology it introduces, then some forms of CP do not introduce an irrational etiology. 
If some cases of CP lead to epistemic downgrade owing to the fact that they introduce an irrational etiology but others do not, it is crucial to define what constitutes the irrational etiology of experience in such a way that the appeal to this etiology could explain why some forms of CP introduce irrationality in belief formation but others do not. It is obvious that the etiologies of experiences are processes that lead to the formation of the experiences. The question is how to distinguish between an epistemically bad and an epistemically innocuous or even beneficial etiology of experience and, thus, between an epistemically bad


kind of CP and an epistemically neutral or beneficial kind of CP. To answer this, Siegel (2013a, 714) proposes the following definition of rational assessability:

An etiology X of experience E with content C is rationally assessable iff etiology X* of belief B with content C is rationally assessable, where X* has similar psychological elements as X, except that it leads to a belief without intervening experience.

Having defined rational assessability, Siegel (2013a, 716) proceeds to define checkered experiences as follows:

An experience E with content C and etiology X is checkered with respect to C iff
• X is rationally assessable, and
• A belief with content C and etiology X* would be doxastically unjustified, where the output of X* is a belief with no intervening experience, and X* has psychological elements sufficiently similar to X’s.

Underlying these views is the Analogy thesis (Siegel 2016, 4), according to which

It is possible in principle for an experience to depend on a desire, in ways that are structurally analogous to modes in which a belief that P depends on a desire, where the mode of dependence makes the belief ill-founded.

Although Siegel talks about the effects of desires on perception, her views can naturally be extended to the effects of all sorts of propositional states on perception. Let us suppose that we have solved all sorts of problems related to the psychological similarities between etiologies of beliefs, that is, formation processes of beliefs, and etiologies of experiences, that is, formation processes of experiences, and let us also suppose that we have dealt with the problem of assigning the same content to an experience and to a belief in a way similar to one of those suggested above when


I discussed the problem of the matching contents between perception and thought. Siegel proposes a way to determine whether the influence of a prior mental state on an experience, on which another (token) mental state is based, epistemically downgrades the experience. One should find, first, a belief with the same content as that of the experience. Then, one should find an etiology for this belief that is psychologically similar to the etiology of the experience. If this belief with this specific etiology is doxastically unjustified, then the experience has an irrational etiology and has its justificatory role diminished. In other words, one should ask whether the processes leading to the experience, and from there to the belief that is based on this experience, are of the kind that, when applied (or rather, when their corresponding psychological processes that pertain to beliefs are applied) to beliefs and not to perceptual contents, lead to well-founded beliefs or to ill-founded ones. As Siegel writes (2013a, 717), in view of the fact that it is difficult to define a checkered experience in terms of a sort of CP that is bad, one should rely on one’s sense “of which processes [of direct belief formation, parenthesis added] lead to ill-founded beliefs, and of which etiologies of experience are structurally similar to those.” To give an example, consider Siegel’s Jill and Jack case where Jill’s prior belief that Jack is angry makes her have a perceptual experience of an angry Jack. The corresponding belief, that is, the belief whose content matches the experience of an angry Jack, is the belief that Jack is angry. Since this belief is formed through a process that includes the prior belief that Jack is angry, one ends up in a situation in which the belief that Jack is angry is based on the prior belief that Jack is angry. 
The newly formed belief that Jack is angry is clearly an ill-founded belief based on an irrational etiology, since it is based on itself; it is a paradigmatic case of belief perseverance or circular reasoning, where one’s belief that p at time t causes them to believe that p at time t + 1. It follows that the experience of an angry Jack that is CP by the prior belief that Jack is angry is a case of CP that downgrades the experience. In a nutshell, what Siegel advises us to do to determine whether a case of CP leads to epistemic downgrade is to start from the perceptual


case in which a prior belief cognitively penetrates an experience, find a belief whose content matches that of the experience, subtract the intervening experience, and decide whether the psychological process that starts from the prior belief and gives rise to the belief whose content matches that of the experience is such that the resulting belief is ill-founded and the process of perceptual belief formation is irrational. Returning to the example of Jill and Jack, Siegel starts from an indirect case of wishful thinking in which the prior belief causes an experience or seeming, and on the basis of this experience, another token belief of the same type as the prior belief is formed; it is indirect because the belief based on the experience receives its justification from the prior belief through the intermediary of experience. Then, Siegel proposes that we subtract the intervening experience and examine whether the resulting direct process of belief formation is rational or irrational, that is, whether it leads to well-founded or ill-founded belief. Since, clearly, a belief whose justification is based on itself is ill-founded, CP downgrades the experience. (For a discussion on direct and indirect basing relation, albeit in another argumentative context, see Tucker 2014, 45.) Siegel is clear that not all cases of CP downgrade experience. Familiarity, expertise, and perceptual learning in general facilitate rather than hinder the justificatory role of perception. These are cases in which prior perceptual knowledge changes the way a scene looks, which is a case of CP, by affecting the features in a visual scene that become salient and, thus, are selected for further processing. This is the second of the four modes of influence through which, according to Lyons (2011, 302), cognition may affect perception since expertise and familiarity facilitate pop-out of certain patterns that allow or speed up object recognition. 
This selection is effectuated by attention that is cognitively driven by the prior knowledge acquired through perceptual learning. (As I argue in Chapter 3, this is one of the two ways that perceptual expertise may operate and it is made possible because late vision is CP by prior knowledge; the other way occurs much earlier in early vision but does not entail the CP of early vision.) Since these cases of CP are not epistemically harmful, the CP at issue does not downgrade experience because it does not introduce an irrational etiology.


We saw that CP denotes the cognitive effects on perception. Not all cognitive effects on perception, however, are thought to be instances of CP. Siegel (2011, 2013a, b, 2016, 4) thinks that real CP occurs when cognitive effects influence how things look, or, more generally, when the cognitive effects affect not the selection of the input but the perceptual processing itself.

If visual experience is cognitively penetrable, then it is nomologically possible for two subjects (or for the same subject in different counterfactual circumstances, or at different times) to have visual experiences with different contents while seeing and attending to the same distal stimuli under the same external conditions, as a result of differences in other cognitive (including affective) states. (Siegel 2011, 5–6)

Sometimes, when attention guides the selection of some features in the visual scene, which eventually causes the percept, this is not a case of CP but a mere selection effect, because

[W]hen prior mental states influence what you look at or attend to, without influencing how things look to you when you see them, the result might seem to be a mere selection effect. If you want the Necker cube or the duck-rabbit to shift, you can make it shift by adjusting your focus to the relevant part of the figure, thereby affecting the contents of your experience. (Siegel 2013a, 717)

Siegel (2016, 4–5) relates the distinction between CP and mere selection effects to two modes in which cognition can affect perception, the responsive and the selective mode. The responsive mode, which is the way perception responds to the evidence contained in some pre-experiential states, is a form of CP of experience. The selective mode is a selection effect on experience in which previously held beliefs or desires, by guiding attention, select objects and properties for perception to process, without necessarily affecting the content of the ensuing perceptual states. Siegel thinks that this sort of cognitive influence on perception does not amount to CP of perception. In the selective mode, the attentional selection concerns items in the environment, which are external

32     A. Raftopoulos

to perception and may serve as its inputs. In the responsive mode, in contrast, what perception responds to is something that is contained in the pre-experiential perceptual states that are mental, internal items. The distinction between the selective and responsive modes corresponds to Siegel’s (2013b, 240–241) distinction between CP and attentional effects that select some features rather than others, thereby selecting the input and co-determining the percept. When CP occurs, subjects who are given an appropriate prime see, say, some pliers (the distal object), but the pliers look to them (the percept) like a gun. This is a genuine case of CP because cognition affects the way things look by influencing perceptual processing and not just by selecting the input. Selection effects that merely select the input to be perceptually processed, on the other hand, should be excluded from being instances of CP, and since selection effects are the hallmark of the attentional effects on perception, attentional effects should not be considered cases of CP. In general, throughout her work, Siegel maintains that attention in any of its forms affects perception only indirectly, which means that attention affects pre-perceptual or post-perceptual stages but not perceptual processing itself and, thus, that its effects are not cases of CP. Pylyshyn (1999) and Firestone and Scholl (2016) share this view. In other words, perceptual processing is independent of attention, which acts externally to perception. In the next chapter, I will argue that cognitively driven attention does influence a stage of perceptual processing directly and is inherently involved in it and, thus, that some attentional effects are genuine cases of CP. Despite the fact that selection effects in general are not considered by Siegel to be cases of CP, Siegel hastens to note that some selection effects are epistemically harmful. Indeed, Siegel (2016) argues that the selective mode may downgrade perception.
There are several problems with Siegel’s conception of CP and of its relation to attention to which I return in the next chapter. For now, let us apply Siegel’s distinction to concrete perceptual cases. A way to explain the perception of a set of pliers as a gun as the result of selection effects, which as such does not involve CP, is that the prime generates a selection effect whereby some features of the distal object (the set of pliers) that the object shares with guns are selected

1  Cognitive Penetrability and the Epistemic Role of Perception     33

and perceptually processed and, as a result, the distal object looks like a gun. In Jill’s and Jack’s story, similarly, Jill selects some of the characteristics of Jack’s expression (those that corroborate the hypothesis that Jack is angry) and ignores some others (those that contradict this hypothesis) and, on the basis of this selection, she forms the experience of an angry Jack. It should be pointed out that in these cases the selection concerns not the distal visual scene that will be perceptually taken in, but which features of the selected scene will be given processing priority. Herein lies a problem afflicting Siegel’s account, because this selection may very well take place in late vision, where attention guides perceptual processes in order to test hypotheses concerning the identity of the distal object(s) by revisiting information contained in the iconic image. In this case, the information selected is not in the environment but in a set of perceptual mental states, and the selection is effectuated by cognitively driven attention, which means that the link from cognition to perception is internal, causal, and purely mental. In addition, the cognitive states affect perceptual processes and the contents of the affected perceptual states; they do not merely select the input before perceptual processing begins, as Siegel seems to suppose. It follows that these attentional effects meet Siegel’s own criteria for CP and, thus, should be deemed cases of genuine CP. This objection notwithstanding, Siegel is right that even when selection effects do not constitute cases of CP, they may downgrade perception because they intervene in the way perception handles the available evidence and affect perception in ways that decrease its sensitivity to the data (for the various ways cognition through selection may downgrade perception, see Siegel 2016, 23).
A way to understand the difference between the selective mode of evidence and the response mode of evidence would be to build on Siegel’s view that only in the responsive mode, which is a case of CP, does cognition affect the contents of some perceptual states; in the selective mode it does not affect perceptual contents. In the selective mode, therefore, attention selects from the environment which evidence the perceptual system will use (it handpicks the evidence) but does not determine the content of the evidence thus selected. It selects, for example, some features of pairs of scissors that mimic features of guns without changing these features. In the

responsive mode, in contrast, cognition affects the content of the evidence; it may change the way some of the features of a pair of scissors look so that they resemble features of guns. But why does the selection of visual information in Jill’s wishful thinking case downgrade experience, while the selection that occurs as a result of perceptual learning is epistemically innocuous or even beneficial? The situation is more complicated because Jill may be quick to pick up angry cues as a result of her perceptual learning with angry faces (suppose she was raised in such an environment). Why, then, are some cases of perceptual learning epistemologically innocuous and other cases of perceptual learning harmful? This is where Siegel asks us to convert the indirect case of formation of a belief through the CP of perception to a direct case of belief formation in which a belief is based on some other beliefs, and decide whether this latter process leads to ill-founded or well-founded beliefs; as long as there is an irrational doxastic etiology structurally similar enough to an experience etiology, the latter is epistemically downgraded. In the Jack and Jill case, the corresponding doxastic etiology involves a case of belief perseverance, which, being epistemically bad, entails that the corresponding perceptual etiology is equally epistemically bad, and this renders Jill’s experience downgraded. A way to stick with etiologies to explain the differences between bad and good or innocuous CP is to unravel the perceptual processes underlying the etiology of the percept and attempt to pinpoint at that level the cause of bad CP and good or innocuous CP. As we shall see in Chapter 5, object recognition relies on matching the predictions concerning the identity of the distal object that the perceptual system makes during late vision with information retrieved from the current visual scene and stored in the visual circuits (the iconic image).
In one scenario, favored by Kosslyn (1994), which we will discuss in detail in Chapter 5, the perceptual system builds representations of the distal objects in the visual scene and compares them with templates of objects that have been stored in visual long-term memory. These representations, which are hypotheses regarding the identity of objects, provide top-down feedback to the visual buffer where it is matched against the input image to test the hypothesis against the fine pictorial details registered in the retinotopical areas of the visual buffer. If the match is

satisfactory, the category pattern activation subsystem sends the relevant pattern code to Working Memory (WM), where the object is tentatively identified. Occasionally the match in the pattern activation subsystems is enough to select the appropriate representation in WM. On other occasions, the input to the ventral system does not match well with a visual memory in the pattern activation subsystems. Then, another hypothesis is formed in WM. This hypothesis is tested with the help of other subsystems (including cognitive ones) that access representations of such objects and highlight their most distinctive features. The information gathered shifts attention to a location in the iconic image where an informative characteristic is most likely to be found. If it is found, its pattern code is sent to the pattern activation subsystem and the buffer, where a second cycle of matching commences. Suppose that a process like this occurs in the visual system. CP may act in such a way as to lower the threshold of the matching criteria for the detector of a certain object O. This means that a wider class of stimuli could trigger the detector, which results in many objects being perceived as O even though they are not; these are the false positives. This is clearly a case in which CP epistemically downgrades perception. Or, CP may act in such a way that information relevant to the identification of O is quickly distinguished from other information in the visual scene and is processed more quickly and efficiently, which, as I said (Raftopoulos 2015b; Chapter 3), is one of the ways in which familiarity affects perceptual processing. In this case, CP epistemically upgrades rather than downgrades perception. This account, which appeals to perceptual mechanisms to explain why some cases of CP downgrade perception while others do not, cannot, however, be used by inferentialists. The reason is that it is hard to imagine what the doxastic analogue of these perceptual processes would be.
What doxastic inferential mechanisms could support discursive inferences that correspond to the perceptual processes described above, with a structure sufficiently similar to theirs to allow one to conclude, from the irrational character of the discursive inference (supposing that there is one), that the corresponding perceptual processes introduce an irrational etiology that downgrades perception?
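
The point about the matching threshold can be made concrete with a toy illustration. The following is only a minimal sketch under stated assumptions: the detector is reduced to a single comparison of a match score against a criterion, and every name and number in it (detects_o, the scores, the two thresholds) is hypothetical and is not drawn from Kosslyn's model or from any experimental data.

```python
# Illustrative sketch only: all names and numbers here are hypothetical.

def detects_o(match_score: float, threshold: float) -> bool:
    """The detector for O fires when the match score reaches the threshold."""
    return match_score >= threshold


def false_positive_rate(non_o_scores: list[float], threshold: float) -> float:
    """Fraction of stimuli that are NOT O but still trigger the detector."""
    hits = sum(detects_o(score, threshold) for score in non_o_scores)
    return hits / len(non_o_scores)


# Hypothetical match scores for five stimuli, none of which is actually O.
non_o_scores = [0.20, 0.35, 0.50, 0.55, 0.70]

neutral_threshold = 0.80  # assumed criterion without any prime
lowered_threshold = 0.50  # assumed criterion after CP lowers it

baseline_rate = false_positive_rate(non_o_scores, neutral_threshold)
primed_rate = false_positive_rate(non_o_scores, lowered_threshold)

print(f"false positives, neutral criterion: {baseline_rate:.2f}")
print(f"false positives, lowered criterion: {primed_rate:.2f}")
```

With these assumed values, no non-O stimulus passes the neutral criterion, whereas three of the five pass the lowered one. The sketch shows only the structural point made above, that lowering the criterion widens the class of triggering stimuli; it says nothing about how such a criterion is actually implemented in the visual system.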

Another way to stick with etiologies is to create around the prior belief a context of other cognitive states and, perhaps, emotive states that, in conjunction with the belief, creates the irrational etiology. This context could function, for example, so that the agent ignores disconfirming evidence and selects only confirmatory evidence in Jill’s and Jack’s case, while the snake expert case does not lead to ill epistemic effects because the context makes the agent objective, that is, sensitive to the data, by not ignoring disconfirming evidence, even though the snake expert’s sensitivity to snake cues is increased relative to an amateur’s. Bringing in the notion of sensitivity to the facts, however, is an externalist move. Siegel (2013a, 716) herself seems to allow it when talking about processes that make Jill better at detecting facial cues or good at figuring out what will make Jack angry, and Siegel (2016) clearly adopts it, although she ties sensitivity to the evidence to the notion of resistance to evidence, and both to the rationality of perceptual processes, since she argues that a percept that is formed through processes that resist evidence is the result of irrational perceptual processes. In this vein, one could attempt to defend the claim that the etiology of the percept determines whether CP leads to ill-founded beliefs or not in the following way. Emotional effects act separately from attentional effects and provide an additional bias to the processes of sensory representations that lead to the selection of some among the items in the input, either adding to or competing with attention.
Thus, in cases of wishful thinking where the percept is biased in favor of the viewer’s expectations, these expectations act as strong desires or hopes that may compete with the regular business of attention, preventing it from effectively searching the iconic image for all the relevant clues independently of their confirmatory or disconfirmatory role. Alternatively, they may add to the attentional effects (driven by the beliefs that share the same or similar contents as the emotive states) with the same biasing result, say, by increasing the activation of the neurons that encode confirming evidence for the biasing belief or desire, making the perceptual system reach the confidence level required to confirm the corresponding percept. That is, the harmful cognitive penetration comes from emotive states that act in accordance with some beliefs. On the

other hand, where no such emotive bias exists, the CP of perception in the form of expertise does not have harmful epistemic effects. One could, thus, claim that the synergy of emotive and properly cognitive effects creates the irrational etiology that gives rise to the epistemic downgrade of perception. Of course, one should work out the details in full, but I think that in principle this may be a promising start.

Siegel’s defense of inferentialism unearths three main theses underlying her assessment of the repercussions of the existence of cognitive effects on perception for the epistemic role of perception. The first and most general is her view that perception is a kind of inference structurally similar to the inferences characterizing discursive reasoning. The second thesis is that because perception is a kind of inference, it can be rendered irrational, thus failing to support perceptual beliefs by outputting an ill-founded percept. The third thesis is that cognition can affect perception either through CP or through selection effects. In the rest of this section, I examine these theses starting from the last one since it is the starting point of Siegel’s analysis of CP.

3.1 Ways in Which Cognition Affects Perception

CP may affect the sensitivity of perception to the environmental data or may render perceptual inferences illicit in different ways. It could cause perception to select as evidence those data from the environment that are congruent with the contents of the penetrating cognitive states and ignore other data that are not. In this case, cognition selects the input to perception. Cognition could also affect perception in a more direct way during late vision. In late vision, hypotheses concerning the identity of objects in a visual scene are tested against the iconic image, which contains all the iconic information retrieved from the visual scene by early vision and is stored for a limited amount of time in early visual circuits. Recent work on perceptual processing emphasizes the role of the brain as a predictive tool. To perceive is to use what you know to explain away the sensory signal across multiple

spatial and temporal scales. Perception aims to enable perceivers to interact with their environment successfully. Success relies on inferring or predicting correctly (or nearly so) the nature of the source of the incoming signal, that is, the identity of the distal objects and their properties from the signal itself, an inference that may well be Bayesian. Attention within late vision contributes to testing hypotheses concerning the putative distal causes of the sensory data encoded in the lower neuronal assemblies in the visual processing hierarchy. This testing assumes the form of matching predictions, made on the basis of a hypothesis, about the sensory information that the lower levels should encode assuming that the hypothesis is correct, with the current, actual sensory information encoded at the lower levels. To this aim, attention selects from the iconic image the information that is relevant to the testing of these hypotheses. Attention does this by enhancing the activity of neurons in the cortical regions that encode the stimuli that most likely contain information relevant to the testing of the hypothesis. Therefore, cognition, through attention, may guide the search of the iconic image so that only evidence confirming a hypothesis about the identity of an object in the visual scene, a hypothesis that is congruent with the content of the penetrating cognitive states, is selected, while recalcitrant information is ignored. In this case, the perceptual system ‘handles’ the evidence contained in the iconic image that is formed through the retrieval of information from the environment during early vision. These two ways, namely, searching for relevant information in the distal scene or in the iconic image, correspond to a certain extent to Siegel’s (2016, 4–6) distinction between the selective mode and the responsive mode, respectively, in which cognitive states may affect perception.
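
To illustrate how searching the iconic image only for confirming evidence could distort an inference of the Bayesian kind mentioned above, here is a minimal sketch. It assumes, purely for illustration, that each feature in the iconic image can be summarized by a pair of likelihoods (its probability if the hypothesis is true and if it is false) and that features are conditionally independent; the function names and all numerical values are hypothetical.

```python
# Illustrative sketch only: the likelihood values are invented for the example.

def posterior(prior: float, p_if_true: float, p_if_false: float) -> float:
    """Bayes' rule for one hypothesis H given one observed feature."""
    numerator = prior * p_if_true
    return numerator / (numerator + (1.0 - prior) * p_if_false)


def update_on(prior: float, features: list[tuple[float, float]]) -> float:
    """Update the degree of belief in H on each feature in turn."""
    belief = prior
    for p_if_true, p_if_false in features:
        belief = posterior(belief, p_if_true, p_if_false)
    return belief


# Each pair is (P(feature | H), P(feature | not H)) for one feature
# stored in the iconic image. The first two favor H, the last two do not.
iconic_features = [(0.6, 0.3), (0.7, 0.35), (0.2, 0.8), (0.1, 0.5)]

# Biased search: attention picks out only the features that favor H.
confirming_only = [f for f in iconic_features if f[0] > f[1]]

belief_all = update_on(0.5, iconic_features)
belief_selected = update_on(0.5, confirming_only)

print(f"belief in H using all the evidence:      {belief_all:.3f}")
print(f"belief in H using confirming cues only:  {belief_selected:.3f}")
```

Under these assumptions, the full body of evidence leaves the hypothesis improbable, while the biased selection raises confidence in it well above the starting point. The rule of inference is the same in both runs; what differs is only which part of the stored evidence is allowed to count, which is the kind of ‘handling’ of the evidence at issue here.
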
Let me point out that the main difference between the way cognition affects perception by selecting the input before the perceptual processing starts, and the way cognition affects perception during late vision is not that the one but not the other involves attention because, as we saw, cognitively driven attention operates both when the environmental input is selected and during late vision when it guides the hypothesis testing. The main difference consists in the nature of the selected evidence. In the former case the evidence is in the environment,

whereas in the latter it is stored in the perceptual circuits of the viewer and is, thus, the content of some mental perceptual states. As I noted above, not all cognitive effects are cases of CP. Siegel (2016, 4–6) ties the distinction between CP and mere selection effects to two modes in which cognition can affect perception, to wit the responsive mode and the selective mode. The responsive mode, that is, the way perception responds to the evidence contained in the pre-experiential states, is a form of CP of experience. The selective mode, on the other hand, is a selection effect on experience in which prior-held beliefs or desires, by guiding attention, select objects and properties in the environment for perception to process. We have also seen that Siegel thinks that attention is at work only in the selective mode, which prompts her to claim that attentional influences, even when they are cognitively driven, do not entail the CP of perception. In the selective mode, attention, indeed, does not affect the content of the ensuing perceptual states (these are the cognitive effects that I have described as indirect cognitive effects on perception [Raftopoulos 2009]) in the sense that the cognitive states that guide attention determine either what or where one looks, but once they have done so they do not otherwise affect the perceptual processes that lead to the formation of the percept. In the selective mode, cognition selects which evidence perception will use in the form of selection of the distal stimuli that will be perceptually processed, a selection that takes place through the effects of cognitively driven spatial or object/feature-centered attention that acts externally or indirectly on perception in the sense explained above. It is widely acknowledged, and Siegel (2011, 2013a, b, 2015, 2016) concurs, that this sort of effect is not a case of CP.
In the responsive mode, the cognitive states control which beliefs a viewer forms in response to a body of evidence. In the case of perception, this control assumes the form of the cognitive states controlling the percept in response to low-level perceptual input; this is a typical case of CP. The low-level perceptual input is the content of pre-experiential perceptual states (Siegel 2016, 19), and constitutively depends on the distal visual scene. Siegel (2013a) claims that unconscious processes at the basement of the mind give rise to perceptual experiences and this is why they affect the rational role of an experience in supporting

perceptual beliefs, which indicates that for Siegel the pre-experiential states are the result of unconscious processes. It is not clear to me whether Siegel thinks that the processes in the basement of the mind could be blamed for any irrationality that might infringe in the process of belief justification by the percept, or whether such irrationality can come only on account of the way cognition affects the handling by perception of the evidence provided by the basement. In the latter case, the processes in the basement affect the epistemic role of perception only in the sense that they provide the information on the basis of which the percept will be built and, thus constrain the processes that issue the percept. Be that as it may, in Chapter 5, we shall see that the percept is formed in late vision that receives as input the low-level, iconic, perceptual output of early vision. The responsive mode of Siegel’s, therefore, amounts to the way the visual information that early vision outputs is used by late vision in the construction of the percept, which, as I claimed above, is the second way in which cognition may affect the sensitivity of perception to the evidence. Note that the evidence, Siegel’s low-level perceptual input, consists in the information contained in the iconic image that is formed as early vision retrieves information from the visual scene. I have argued (Raftopoulos 2001a, b, 2009, 2014; and also Chapters 2 and 3), that early vision retrieves information from the environment in a direct, conceptually encapsulated manner. The iconic image thereby formed not only constitutively depends on the distal visual scene (its mode of [re]presentation is object involving) but also reflects in an accurate way the distal scene in the sense that all information in it is retrieved in parallel from the environment. Thus, Siegel (2016, 19) is right to argue that the content of pre-experiential perceptual states constitutively involves the distal scene. 
Notice also that Siegel talks of the cognitive control over the way perceivers respond to the low-level perceptual input, which leaves open the problem of whether cognition could also control the low-level input in a way other than the selection of which environmental information will perceptually be taken in. As I will address this problem in the next two chapters, let me just mention here that what Siegel seems to imply, and I fully agree with her assessment, is that in the responsive mode the perceptual processes that

receive and process low-level input are affected by cognition, but the processes of the perceptual stage that provides the low-level perceptual information are not affected by cognition, at least not in a way that could undermine the justificatory role of perception. Siegel’s discussion parallels that of McGrath (2013b, 237–238), who distinguishes between receptive and nonreceptive seemings and argues that it is only the receptive seemings that have the property of justifying their contents and can transmit this property to the nonreceptive seemings. Receptive seemings are the personal level contents of perceptual states that are not the result of quasi-inferential transitions in perception and provide foundational justification for their contents. Nonreceptive seemings are the personal level contents that are the results of such transitions and, as such, have clearly propositional contents. The quasi-inferred nonreceptive seemings have ‘inference-like’ dependence on either other seemings or beliefs and, thus, can derivatively but not foundationally justify beliefs. A quasi-inferred seeming that P derivatively justifies a belief with content P if and only if its ‘basis’, that is, the receptive seeming from which the nonreceptive seeming is inferred, also justifies P. An experience is downgraded if the nonreceptive seeming is the result of an epistemologically bad quasi-inference from a receptive seeming, in the same way that the conclusion of a bad inference from a set of beliefs is ill-founded. Given that the inferred nonreceptive seeming is ill-founded, it cannot adequately support a belief with the same content. Two remarks are in order to better understand McGrath’s views. First, McGrath models quasi-inferences from receptive seemings to nonreceptive seemings on criteria attributed to inferences among beliefs.
Hence, both the receptive and the nonreceptive seemings are mental states of the viewer and not states of a subpersonal system, and in order for the transition to take place in a rational way, the content of the receptive seeming should support or evidence the content of the nonreceptive state. It follows from this that both the receptive and the nonreceptive seemings have conceptual contents, a thesis also defended by Huemer (2013) and Tucker (2010). This results directly from the demand that there be a quasi-inferential relation between the two seemings. As McGrath (2013b, 242) states, we do not have a theory of the transitions

from states with NCC to states with conceptual contents, and a theory of what counts as epistemic norms governing them. Therefore, both seemings must have conceptual contents to allow for inferential relations between them. When the receptive state does not adequately support the nonreceptive state, the formation of the nonreceptive state on the basis of the evidence provided by the receptive state is unjustified and is a kind of jumping to conclusions, that is, a case of drawing a conclusion based on weak evidence. Second, McGrath relates the receptive seemings with the experience of low-level perceptual contents (colors, shapes, size, texture, etc.), and the nonreceptive seemings with the experience of high-level contents (being gold or mustard). Although McGrath talks about low-level and high-level experiences, it is not clear whether by this he is committed to the view that high-level contents are phenomenally present (in which case S does not see that X is a piece of gold but S sees a piece of gold, which is Siegel’s view), or to the view that they are only represented in the perceptual experience. McGrath’s account commits him only to the view that nonreceptive seemings can have high-level contents. It is not clear therefore that he is subject to the criticism that it is debatable whether high-level contents are phenomenally present in perceptual experience, as Long (2017, 8–9) claims. His account only requires that such contents be represented in the experience, and representational matters should not be confused with phenomenal matters (Speaks 2005). This does not mean to say that there is no controversy about whether high-level properties are represented in perceptual experience. High-level properties are conceptual in nature, in the sense that they are formed only when some concepts are activated. Thus, if they are perceptual properties, they are properties of a perceptual stage that is not conceptually encapsulated, such as late vision. 
Now, Siegel (2012) leaves open this possibility (except that she adds that they have a phenomenal presence), and Pylyshyn (2003, 2007) and Raftopoulos (2009) think that high-level properties may be represented in late vision, given that late vision has hybrid, nonconceptual, and conceptual contents. As we shall see in Chapter 5, other philosophers (Jackendoff 1989; Tye 1995) hold that such hybrid states are not perceptual states at all. For these philosophers, conceptual contents

cannot be part of the representational content of perception properly speaking. Suppose S believes on the basis of perception that O is gold. S’s seeming that O is gold (P) is quasi-inferred from S’s seeming that O is yellowish (Q). Thus, the former is a nonreceptive seeming, while the latter is a receptive seeming and the nonreceptive seeming is quasi-inferred from the receptive seeming. Since Q does not evidentially support P, in the sense that the quasi-inference from ‘yellowish’ to ‘gold’ is illicit, S’s seeming that P is unjustified and fails to justify, even derivatively, S’s belief that P. McGrath, therefore, distinguishes between a propositionally structured and non-doxastic seeming (seeming that P) that is quasi-inferred from a receptive seeming (and whose content is conceptual, because “X is gold” is a conceptual content) and a belief with the same content that is based on the corresponding seeming (believing that P). If the nonreceptive seeming is not justified, neither is the corresponding belief. The picture that McGrath seems to have in mind is that perceptual beliefs are formed as follows. First, some receptive, personal level, perceptual states with contents (seemings) are formed. Second, nonreceptive perceptual states (seemings) with conceptual contents are formed in a quasi-inferential manner on the basis of some receptive seemings, but they are not yet perceptual beliefs because they have not yet been endorsed by the viewer; they are subdoxastic occurrent thoughts, if you will. Third, apparently in the absence of defeaters, doxastic states (beliefs) are formed when the content of the nonreceptive seeming is endorsed; recall that the quasi-inferred seemings have ‘inference-like’ dependence on either other seemings or beliefs and this allows for consideration of possible defeaters.
McGrath’s receptive seemings seem to correspond to Siegel’s contents of pre-experiential perceptual states that are somewhat directly related to the distal scene, as they both precede the perceptual states that are ‘inferred’ from them, and form the basis on which the quasi-inferences take place. The receptive seemings have intrinsically the property of justifying their contents. More importantly, any irrationality in the perceptual etiology that downgrades perception is traceable back to the quasi-inferential process that outputs McGrath’s nonreceptive seemings or Siegel’s percept on which perceptual beliefs are based. Siegel and McGrath differ, in that Siegel’s pre-experiential

perceptual states are at a subpersonal level, while McGrath’s receptive seemings are at the personal level. This, however, does not preclude the possibility that McGrath’s receptive seemings may themselves be the result of a non-inferential processing on some pre-experiential perceptual states whereby they receive the property of justifying their contents. In addition, Siegel is noncommittal with respect to whether the pre-experiential states have NCC or conceptual contents. The reader should note that McGrath’s views are a significant elaboration on Siegel’s earlier views concerning the sources of the potential irrational etiology that CP engenders for perception. Initially, the source of the irrationality was an illicit perceptual inference that, at least structurally, parallels illicit discursive inferences in thought; this is the ‘Analogy thesis’: “it is possible in principle for an experience to depend on a desire, in ways that are structurally analogous to modes in which a belief that P depends on a desire, where the mode of dependence makes the belief ill-founded” (Siegel 2016, 4). For McGrath, the inference to blame for the irrationality of perception is restricted to the way cognition intervenes and controls the contents of the receptive seemings in order to form the percept on which a perceptual belief is based; differently put, the illicit step is at the level of the quasi-inferential transition from the receptive seeming to the percept (McGrath’s nonreceptive seeming). The receptive seemings, by not being the product of a quasi-inference from the input, are epistemically blameless since the viewer plays no active, agentive role in their formation insofar as they are receptive.
(The reader should draw the parallel with McDowell’s [1994, 2011] discussion concerning the receptivity or distinctive passivity of perception, bearing in mind that for McDowell the distinctive passivity of perceptual experience must be understood as a conceptual shaping of sensory consciousness, whereas it is not clear whether McGrath would subscribe to the view that the pre-experiential contents are intrinsically conceptual.) This elaboration could alleviate some of the worries concerning whether the Analogy thesis holds throughout perception, by making it easier to argue for a parallel between perceptual quasi-inferences and discursive inferences. The task is easier because one need not search for a parallel between the perceptual processes that produce the receptive seeming and inferences in thought, since these processes are not responsible for the downgrade

1  Cognitive Penetrability and the Epistemic Role of Perception     45

of experience. This option, however, is not available to Siegel because she thinks that the processes that lead to the pre-experiential states do affect the epistemic status of the experience. I have said that I agree with Siegel that CP is the result of the responsive mode in which cognition affects perception. There are, however, differences between my views and Siegel’s. First, Siegel refers to the low-level perceptual states as ‘pre-experiential’ perceptual states, which entails that the contents of these states are at a subpersonal level, that is, the perceiver has no awareness of them. In my view, some of the contents of the states of early vision, Siegel’s low-level perceptual contents, are within the perceivers’ phenomenal awareness but outside the purview of their cognitive access consciousness. Second, the description in the fifth chapter of the way late vision tests hypotheses concerning the identity of a visual object in order to construct the percept suggests that even in the responsive mode there is selection of evidence, except that this time the evidence at issue is contained in the iconic image and not in the distal visual scene. I think, thus, that the difference between the responsive and the selective mode in which cognition may affect perception by controlling the evidence is better captured by pointing out that in the responsive mode there is selection of evidence that concerns information contained in the iconic image, whereas in the selective mode the selection concerns evidence contained in the environment. A further difference is that in the responsive mode it is attention that controls the way the evidence is handled or responded to in late vision. Since, for Siegel, the responsive mode accounts for CP, I disagree with Siegel’s claim that CP is not the result of attention and that attentional influences on perception are not cases of CP. Recall that Siegel (2013b, 240–241) distinguishes between CP and attention effects.
Selection effects cause some features to be selected over others, thereby selecting the input and, in this way, co-determining the percept. This selection, however, does not affect the perceptual processing itself, and this is why the selection effects that merely select the input to be perceptually processed should be excluded from being instances of CP. Since selection effects are the hallmark of the attentional effects on perceptual processing, attentional effects should not be considered cases of CP.

When CP occurs, on the other hand, subjects see, say, some pliers (the distal object) but they look to them (the percept) like a gun when they are given an appropriate prime. This is a genuine case of CP because cognition affects the way things look by influencing perceptual processing and not just by selecting the input to perception. In view of the way late vision functions to construct the percept, however, it is plausible that cognitive-driven attention is a decisive factor in the way the evidence provided by early vision is used in late vision, since attention controls the way hypotheses concerning the identities of the distal objects are tested. To put it differently, selection may well take place in late vision, where attention guides perceptual processes to test hypotheses concerning the identity of the distal object(s) by revisiting information contained in the iconic image; this, however, is a clear case of CP as the cognitive states affect the perceptual processes during the perceptual act and do not merely select the input. This is why I claimed in the introduction that the distinction between the responsive mode and the selective mode in which cognition can affect perception is not as clear-cut as Siegel suggests. This objection notwithstanding, Siegel is certainly right that even when selective effects do not constitute cases of CP, they may downgrade perception because they intervene in the way perception handles the available evidence and affect perception in ways that decrease its sensitivity to the data—for the different ways cognition through selection may downgrade perception, see Siegel (2016, 23).

3.2 Illicit Perception and Illicit Inferences

Siegel tries to find an internalistic midway between phenomenal dogmatism’s view that the CP of perception does not affect perception’s justificatory role and the view that CP states do not provide any evidence at all, and suggests that CP downgrades the justificatory force of perception by diminishing it, which means that this force does not vanish altogether. In the same vein, Schellenberg, with some more externalist overtones, attempts to find a midway between the same opposite views

by arguing that the CP of perception undermines and, thereby, reduces the evidential force of perception, but does not abolish it altogether. Let us return to the reasons that render the percept ill-founded and downgrade perception and ask: why does the selection in Jill’s wishful thinking case downgrade experience, while the selection that occurs as a result of perceptual learning is epistemically innocuous or even beneficial? As I said above, things are more complicated since Jill may be quick to pick up angry cues as a result of her perceptual learning with angry faces (suppose she was raised in such an environment). What makes some cases of perceptual learning innocuous and other cases of perceptual learning harmful? This is where Siegel asks us to proceed through the procedure introduced before, convert the indirect case of CP of perception to a direct case of belief formation in which a belief is based on some other beliefs, and decide whether this latter process leads to ill-founded or well-founded beliefs; as long as there is an irrational, epistemically bad, doxastic etiology structurally similar enough to an experience etiology, the latter is epistemically downgraded. In the Jack and Jill case, the corresponding doxastic etiology involves a case of belief perseverance, which, being epistemically bad, entails that the corresponding perceptual etiology is equally epistemically bad, and this renders Jill’s experience downgraded. This strategy seems, at first glance, not to work here. Just as Jill’s prior belief that Jack is angry makes her search for perceptual clues confirming this belief while ignoring others disconfirming it, so an expert in snakes, to borrow Lyons’s (2011) example, when walking in the forest and believing that there are snakes around, looks for clues signaling the presence of a snake and ignores other clues in the environment that are irrelevant to snake-ness.
If one subtracts the intervening experience and forms the direct, discursive argument with an etiology structurally similar to the experiential case, one gets a structurally similar belief formation process in both cases. Similarly, what is the difference between Jill’s bad case and the case in which Jill’s unjustified prior belief makes her scrutinize Jack’s expression more carefully and discover that he is indeed angry (Ghijsen 2016, 1463; Lyons 2011, 295; Vahid 2014)? In these cases, the corresponding doxastic inferences, which Siegel takes to be cases of ‘belief preservation’, are the same. Why is, then, Jill’s belief ill-founded in the

bad case, whereas Jill’s belief in the second case and the snake expert’s belief are not? Lyons and I would claim that in the first case CP biases perceptual processing in favor of the viewer’s expectations, whereas in the second case CP facilitates pop-up of certain relevant patterns, facilitating, thus, the confirmation of the tested hypothesis about the identities of the distal objects, and speeding up object recognition. The difference between the two cases consists in that in the latter but not in the former case, prior knowledge increases the sensitivity to the environmental input, rendering perception more reliable. This option, however, is not in general available to the internalist, although, as we saw, Siegel on several occasions seems to allude to this factor to explain why CP downgrades perception. Siegel (2013a, 716) acknowledges that the fact that in these cases the doxastic, discursive inferences and the corresponding perceptual inference share the structure of belief preservation does not suffice to render the two etiologies similar, and that belief perseverance by itself does not suffice for epistemic downgrade, since there is an irrational etiology that downgrades experience in Jill’s bad case, but not in Jill’s good case or in the snake expert’s case, even though all three cases involve belief preservation. The question remains, then: where does the difference in structure come from, and how is this related to the CP that causes either the harmful bias or the facilitating bias? Siegel (2013a) attempts to explain the differences by claiming that the etiologies in the two cases concerning Jill are different because in the good case Jill’s theory of mind is improved owing initially to the unjustified belief and this leads her to understand Jack better and come to realize that Jack is indeed angry for some reason or other.
Thus, the etiology of Jill’s experience in the good case, but not in the bad case, involves the intermediate step consisting in Jill’s being more sensitive to Jack’s posture. Furthermore, this additional step would also figure in the corresponding belief-etiology and this makes the belief-etiologies in the two cases different. One could point out, however, that in these cases both the unjustified prior belief and the improved theory of mind act on Jill’s perception by making her concentrate on the anger-correlated features of Jack’s face, which means that the perceptual etiologies are still similar.

The point is not only that taking recourse to the psychological correlates of the perceptual process and of the doxastic belief formation process in the aforementioned cases does not reveal any structural differences between the corresponding etiologies and, thus, cannot explain the fact that the first case is epistemically problematic, while the second case is not; a prior belief guides perceptual processing in all cases but it downgrades perception only in the first case. There is also the problem that it is not quite clear what the doxastic analogues of many perceptual processes could be. What is the doxastic analogue of pattern matching mechanisms that are abundant in perception? Or what is the doxastic analogue of processes whereby the pop-up of certain relevant patterns is facilitated? Our epistemic intuitions say that there is a difference but Siegel fails to reveal it since the same etiology underlies both cases. The problem is accentuated if one considers what transpires in late vision. The role of attention in internally affecting perceptual processing in late vision is a ubiquitous characteristic of everyday perception and, for the non-constructivists, these are not cases in which CP leads to an epistemic downgrade of the experience. However, the same perceptual processing occurs in all these cases as in cases of wishful thinking in which the percept is biased in favor of the viewer’s expectations; a hypothesis is tested and confirmed and, moreover, the corresponding direct doxastic argument is structurally similar in ordinary perception and wishful thinking. The difference lies in the fact that in the biasing scenario perception ignores recalcitrant information, while in everyday perception only the irrelevant data are ignored.
Information disconfirming a tested hypothesis is not an irrelevant datum and, therefore, perception is sensitive to it, which means that should such information be detected the perceptual system would reject the tested hypothesis; perceptual learning does not make one see things that are not there. A snake expert is more sensitive to snake clues and quicker to see snakes, but she does not see snakes where there are none. How does this difference translate to visual processing? How could perceptual processes subserve this analysis in a way that corresponds, at least structurally, to discursive inferences? Siegel (2016) attempts to alleviate some of these worries proposing certain similarities between the ways evidence is handled in discursive

reasoning and ways perception responds to evidence in relation to the influences that prior cognitive states and desires, may have on the handling of the evidence, thereby affecting the sensitivity of perception and of held beliefs to the evidence. Siegel (2016, 4) offers some observations that aim at bolstering the Analogy thesis: “It is possible in principle for an experience to depend on a desire, in ways that are structurally analogous to modes in which a belief that P depends on a desire, where that mode of dependence makes the belief ill-founded.” Siegel (2016, 20–21) proposes three ways in which CP may reduce the sensitivity of perception to the evidence by increasing the resistance to the pre-experiential perceptual states that carry this evidence. Notice, first, that thus far I have talked about the sensitivity of perception to the evidence as an index by which the reliability of perception could be measured since this reliability depends on how much perception reflects accurately the environment; the sensitivity of perception to the evidence ensures that perception reflects the environment (Siegel, as we have seen, also talks about sensitivity to the evidence). The discussion now is cast in terms of resistance to the pre-experiential states but this is a different way to make the same point, since the most common way in which this insensitivity to evidence manifests itself is by resisting incongruent evidence, incongruent either with a held belief or a perceptual hypothesis that has been formed, so that the subject retains her belief or forms the percept that conforms to that belief despite recalcitrant evidence that the belief or the percept should not be held or formed. 
Indeed, as Siegel (2016, 15) points out, resistance to evidence that is incongruent with a perceptual hypothesis or a belief means that the evidence does not control the percept or the belief, since the subject (whether she perceives a visual scene and forms a percept, or argues from a set of premises and reaches a conclusion, the belief) is, to varying extents (which indicate the degree to which a mental state ill-founds a percept or a belief), not disposed to adjust the percept or the belief properly in response to the incongruent evidence. Since, intuitively speaking, the rational thing to do is to revise the perceptual hypothesis or the belief in view of recalcitrant evidence, when a subject does not do so, this introduces an element of irrationality.

Let us suppose that perceivers have an F-experience owing to their desire to see F-ness. The first way in, or dimension along, which resistance to pre-experiential states varies is the quantity of different incongruent stimuli to which a perceiver responds by retaining the F-experience. For example, the more resistant a perceiver is to the pre-experiential states bearing information concerning happiness or sadness expressed in the face of a person owing to the perceiver’s desire to see happiness expressed in the face of this person, the more expressions in the sad–happy facial expression range will be deemed to signify happiness. The second way in which a percept may resist information contained in the pre-experiential states concerns the extent to which the incongruent information that supports non-F-ness may be responded to in epistemically speaking inappropriate ways. Put differently, the more the evidence in the pre-experiential states supports non-F-ness, the more the perceivers’ resistance to it is, should they retain the percept. When the evidence concerning a facial expression clearly suggests that the person is not happy, if a perceiver retains a perception of happiness, the percept severely distorts the available visual evidence. If the incongruent information is not as strong as before, then if the perceiver retains the perception of F-ness, the resistance to the evidence is less. The third way in which the effects of a desire influence the percept concerns what happens when the perceiver for some reason scrutinizes the scene, and, yet, despite the incongruent evidence, she still perceives F-ness. The fact that a closer examination of the scene failed to make the perceiver consider the recalcitrant information and accordingly revise the percept shows the extent to which the desire ill-founds the percept. 
Siegel (2016, 22) argues that these three ways have natural analogues in discursive reasoning, where a belief is formed on the basis of the available evidence that provides the premises in the argument that issues the belief, and this supports the Analogy thesis. This, in turn, reinforces Siegel’s strategy for determining when a mental state ill-founds the percept rendering it irrational. For example, just as resistance to preexperiential states formed from information engendered by non-F distal objects is more extreme when perceivers are disposed to retain their perception of F-ness in response to non-F stimuli even when these stimuli

are far from being F, or rather when the information contained in the pre-experiential states that store information incoming from these stimuli is far from indicating F-ness, so too the resistance to evidence in discursive reasoning is more extreme when subjects are disposed to keep believing P even when the evidence is overwhelming against P. Let us assume that Siegel is right in her assessment of the ways a desire may undermine a discursive inference to some belief by controlling the evidence provided by the premises of the inference, although there is some dispute as to whether control by desire is actually ill-founding in wishful thinking (Long 2017, 12–18). Also let us put aside worries about whether, even if successful, Siegel’s strategy could be adopted by hardcore internalists; after all, Siegel has made several concessions to externalism and, therefore, seems to have abandoned hardcore internalism even though she strives to maintain a basic tenet of it, namely that an account of the downgrade of perception owing to cognitive influences should be based on considerations concerning the rationality of the relevant perceptual inferences. A first problem in Siegel’s account is that it purports to strengthen the thesis that thought-like inferences are involved in perception by offering analogies between the way the evidence is handled in wishful thinking and the way the evidence (whether it be in the form of pre-experiential states or the visual scene itself) is handled in perception. However, as I succinctly argue in the next subsection and develop in detail in Chapter 5, there are no discursive-like inferences in perception of the sort that Siegel’s (and McGrath’s) account requires. Thus the analogy cannot get off the ground to begin with.
Thus, even if Siegel is right (and for what it is worth, I think she is) that the downgrade of perception is due to the way the information in a visual scene (whether it be considered as evidence or as a justifier is not important because what matters is that perceptual states in general and experiences in particular are epistemically relevant to perceptual beliefs) is handled by the perceptual system, an internalist account, even a hybrid one like Siegel’s, that emphasizes illicit inferences cannot adequately explain this downgrade. Another problem in Siegel’s argument, related to the previous one, is that even if the analogy between resistance to the evidence contained in pre-experiential states and resistance to the evidence contained in the

premises of discursive reasoning is correct, this does not lend as much support to the Analogy thesis as Siegel thinks. The reason is that for the Analogy thesis to hold, the analogy must be at the level of mechanisms that subserve the transitions between the relevant states. Recall that the Analogy thesis posits that it is possible in principle for an experience to depend on a desire in ways that are structurally analogous to modes in which a belief that P depends on a desire. It follows that the analogy would hold only if the mechanisms that subserve the causal influences of desires or other cognitive states on perceptual processes and states were structurally analogous to the mechanisms that subserve the logical or epistemic relations between the premises/evidence and conclusion of a discursive argument that issues a belief. These mechanisms, however, are situated at the algorithmic or representational level of analysis and what Siegel provides us with is an analogy at the computational or cognitive level of analysis. The fact that one could describe the results of the function of the causal mechanisms through which desires affect perceptual states and their contents in a way that is analogous to the description of the results of the function of the mechanisms subserving transitions in discursive reasoning does not entail that the mechanisms themselves are structurally similar, just as the fact that a connectionist network could be described as performing algorithmic manipulations of data does not entail that the network functions by performing algorithmic transformations: the network performs algebraic and not algorithmic transformations. Similarly, while the state transformations occurring in the perceptual system could be to a certain extent recast as discursive, abductive inferences, this does not entail that perception works through such inferences. In the next section, I will argue that, indeed it does not.

3.3 Inferences in Perception

Siegel (2015) addresses directly the problem of the sort of inference involved in perception. She draws from the work of psychologists like Helmholtz (1878/1925) and Rock (1983) to argue that perception involves inferences similar to those found in discursive reasoning

and, thus, that the percept is the result of an inference. The inferences characterizing discursive reasoning are also called belief-like inferences, or cognitive inferences, in that they require the presence of conceptual representations that encode the processing rules and the premises used in the inference (Hatfield 2002). These inferences may be inferences from low-level perceptual input (of the sort, for example, delivered by early vision) and stored assumptions (whether they be conscious or unconscious) to the perceptual experience (Clark 2013). Helmholtz (1878/1925) maintained that perception is a form of inference; the brain uses probabilistic knowledge-driven inferences to induce the causes of the sensory input from this input, that is, to extract from the bodily effects of the light emanating from the objects in a visual scene as it impinges on transducers the various aspects of the world that cause the input; the brain integrates computationally the retinal properties of the image of an object with other relevant sources of information to determine the object’s intrinsic properties. Rock (1983) claimed that the perceptual system combines information inferentially to form the percept. From visual angle and distance information, for example, the perceptual system infers and perceives size. This inference may be automatic and outside the authority of the viewer, who does not have control over it, but it is nonetheless an inference. Similarly, Spelke (1988) suggests “perceiving objects may be more akin to thinking about the physical world than to sensing the immediate environment.” The reason is that the perceptual system, to solve the underdetermination problem of both the distal object from the retinal image and of the percept from the retinal image, employs a set of object principles (the Spelke principles) that reflect the geometry and the physics of our environment.
Since the principles can be thought of as some form of knowledge about the world, perception engages in inferential processes from some pieces of worldly knowledge and visual information to the percept, that is, the object of our ordinary visual encounters with the world. In all these views, the visual system constructs the percept in the way thinking constructs new thoughts on the basis of thoughts that are already entertained. Thus, vision is a cognitive, a thought-involving, process, cast perhaps in some form of a language of thought.

If perception is to be thought of as some sort of thinking, its processes must necessarily, first, include transformations of states that are expressed in symbolic or propositional form, and, second, these transformations must be inferences from some states that function as premises to a state that is the conclusion of the inference. That is to say, visual processes must be inferences or arguments, exactly like the processes of rational belief formation. These two conditions follow directly from the claim that perception is some sort of thinking, since the characteristic trait of thinking is drawing inferences (whether it be deductive, abductive, or inductive) operating on symbolic forms by means of inference rules that are represented in the system, although thinking is not reduced to drawing inferences this way. In view of these considerations, the principles guiding the transformations of perceptual states, that is, the principles (such as Spelke’s objects principles) acting as the inference rules in perceptual inferences, must be expressed in the system and, specifically, must be represented in a symbolic form. Whenever the system needs some of the principles to draw an inference, it simply activates and uses them. In addition, the premises and the conclusion of a visual argument may be represented in the viewer in a propositional-like, symbolic form. If these conditions are met, perception involves discursive inferences, that is, drawing propositions/conclusions from other propositions acting as premises by applying (explicitly or implicitly) inferential rules that are also represented in the system. Given the propositional/symbolic form of the format in which the states of the visual system must be represented if vision is akin to thinking, the contents of these states, that is the information carried by the states, consists of concepts that roughly correspond to the symbols implicated; it is conceptual content. 
If vision is some sort of thinking, therefore, its contents must be conceptual contents. This means two things. Either the visual circuits store conceptual information that they use to process the incoming information, or they receive from the inception of their function such information from the cognitive areas of the brain while they are processing the information impinging on the retina. I have argued that early vision has NCC (Raftopoulos 2009, 2014) that is, thus, nonsymbolic in nature. This undercuts the road

for Siegel’s pre-experiential information or McGrath’s receptive seemings to act as premises or evidence to a propositionally structured perceptual belief, which belief is being derived from this evidence through some inference that is structurally analogous to a discursive inference, if Siegel and McGrath intended the pre-experiential states or the receptive seemings, respectively, to have NCC. Second, in Chapter 5, I argue in detail that no perceptual stage involves the discursive inferences that Siegel’s account of the epistemic role of perception requires. Specifically, the processes in late vision are not discursive inferential processes, where discursive inferences should be distinguished from “inferences” as understood by vision scientists, according to whom any transformation of signals carrying information according to some rule is a form of inference. I also argue that another notion of inference, advocated by Cavanagh (2011), which makes use of the various assumptions employed by the perceptual system, falls short of being a discursive inference. It follows that the main edifice on which Siegel bases the entire enterprise of inferentialism rests on very shaky ground to begin with, and Siegel has failed to provide us with a persuasive argument for her Analogy thesis.

4 Externalism: Perceptual Justification vs. Perceptual Grounding

One of the theses underlying phenomenal conservatism is the view that perceptual states are given and perceivers are not actively or agentively implicated in the perceptual processes and, thus, are not responsible for their perceptual contents. In discursive inferences, explicit or implicit, agents can go beyond, or ignore (part of) the evidence expressed by the premises of the inference and reach the wrong conclusion, for which the agents are to blame since they bear the responsibility for going beyond or ignoring the evidence available to them. Since perceptual states and their contents are given, however, none of the above applies. In perception, perceivers cannot go beyond or ignore the evidence. Even if one

thinks that the perceptual system operates by filling up gaps in the available evidence, owing to the many indeterminacies that arise during its processing at all levels, and, thus, performs some sort of abductive reasoning, this reasoning is automatic and outside the epistemic purview of the perceiver. As such, it has no bearing on the perceiver’s rationality as far as the handling of the evidence of the senses is concerned. If one adds to this consideration the intuitive, at least on the part of internalists, view that the evidence on which a perceiver bases prima facie a perceptual belief, or the reasons perceptual experience provides a perceiver for the perceptual belief caused by the experience, consists in the way the experience presents to the perceiver the world as being, that is, in its phenomenal content/character, it follows that the justificatory potential and force of a perceptual state (that is, the range of perceptual beliefs this state can justify, and the degree to which it does so) depends solely on the phenomenal content/character of the experience; this sort of evidence is called phenomenal evidence. The phenomenal character of the experience seems to be an intrinsic property of it, which means that the phenomenal character of an experience is constitutively independent of what the experience is about and, hence, of the intentional object of perception, although, of course, there is a causal relation between the object of the experience and the perceptual state and its phenomenal character. This is why, on the internalistic view, one can have different experiences (experiential states with different phenomenal character) of the same object, or one can have experience with the same phenomenal character of different objects. A consequence of this notion of phenomenal evidence is that perceptual states with the same phenomenal character have the same justificatory potential and force. 
In view of the characteristics of the phenomenal character of experience, the justificatory force and justificatory potential of an experience is independent of the intentional content of the experience; the beliefs an experience can justify do not depend on what the experience is about but only on how the experience is phenomenally taken in. A veridical, an illusory, and a hallucinatory state that present to the subjects in these states that “a is F” justify equally well the belief that “a is F” formed on these different occasions

58     A. Raftopoulos

(always taking into consideration that we are talking about matching contents and not the same contents, since perceptual contents and belief contents are most likely of different sorts). Suppose that in reality “a is F” but for some reason, another object b that is also F seems to perceiver S to be the object a, which means that S is in a perceptual state whose phenomenal character is “a is F” (let us call it the illusory experience). Note that the intentional content of the illusory experience is “b is F” since the experience is about b independent of the fact that S mistakes (perceptually) b for a. Let us suppose that the illusory experience causes S to form the belief that “a is F”. On another occasion S has a veridical experience with content “a is F” and forms the belief that “a is F”. According to phenomenal conservatism and for the reasons explained above, both the belief that is formed on the basis of the illusory experience and the belief that is formed on the basis of the veridical experience are true and equally well-justified, because on both occasions S had two perceptual states with the exact same phenomenal character (two different tokens of the same type of state) and, thus, with the same justificatory force and potential.9 This view of phenomenal evidence presupposes a conception of perception according to which the fact that S sees a as F does not guarantee that a is indeed F or even that an a is involved, which means that seeing a as F is noncommittal as to how things really are. S could have an experience with the same phenomenal character even if a is G or even if b but not a is F, or even if there is no a. Phenomenal conservatism simply adds to this the view that despite being noncommittal, perceptual states can justify perceptual beliefs by providing evidence for them; “experiences conceived in the non-committal sense, can be justifiers” as Millar (2011, 340) remarks while proceeding to explicate in which sense this could be true. 
In addition to the view that the evidence a perceptual state provides is phenomenal evidence, there is also the view that perceptual

9Not all philosophers agree with this assessment. Byrne (2014, 104, 109), for instance, argues that a hallucinatory state with content p does not rationally support the belief that p.

experience yields evidence of another sort, which relies on the representational content10 of the experience, that is, the way the perceptual state represents the world as being. Recall from the introduction that the debate concerning the sort of evidence a perceptual state yields is interlinked with the views about the sort of content that a perceptual state has. For the internalist, perceptual content is intrinsic to the viewer and does not constitutively depend on the viewer’s relation to the environment; the environment is causally implicated in the formation of this content but plays no constitutive role in it. For the externalist, the perceptual content is extrinsic, that is, it constitutively depends on the viewer’s relation to the environment. For some externalists, the representational content of perception includes both phenomenal content, which is the phenomenal character (or part of it) of the relevant perceptual experience and also a different kind of representational content, the externalist content, which depends constitutively on the perceptual relation of the viewer to the external world. As we had the opportunity to discuss above, this is the traditional distinction in the Philosophy of Mind between narrow and wide contents. The sort of evidence that wide content provides is called ‘factive evidence’ (Schellenberg 2013). Accordingly, a perceptual state causes a perceiver who is in this state to believe the representational content of the perceptual state. The difference between the phenomenal evidence and the factive evidence provided by the same perceptual state reveals itself in the case of veridical hallucination when compared with a veridical perceptual state from which it is subjectively indistinguishable. The two states provide the same phenomenal evidence because they have the same phenomenal content. 
The two states, however, do not have the same representational content; the veridical perceptual state represents some worldly state of affairs, while the hallucinatory state does not represent a worldly state of affairs; this entails that the veridical perceptual state provides factive evidence, while the hallucinatory state does not.

10Burge (2010), Raftopoulos (2009), Schellenberg (2011), Siegel (2010), Smith (2002), and Tye (2000), among many others, argue that perceptual states have representational contents.

If one moves from illusory states to perceptual states that are CP, the epistemological analysis of the justificatory role of perception in the light of phenomenal conservatism is the same. Both a CP experience and a non-CP experience with the same phenomenal character have the same justificatory force and potential and justify equally well the same beliefs. If Jack is really angry and Jill perceives him as such and forms the belief that Jack is angry, this belief is as well justified as when the same belief is caused by a CP perceptual state of Jill’s that presents Jack as angry, where Jill’s belief that Jack is angry has penetrated her experience and made Jack appear angry to Jill even though in reality Jack is not angry. We saw that many philosophers feel uncomfortable with this analysis because they think that the non-CP perceptual state and the CP perceptual state cannot have the same justificatory force; CP surely undermines this role and downgrades perception. Siegel, Tucker, Markie, McGrath, and others attempted to address these worries from within the internalist camp, that is, while retaining the thesis that the justificatory force of an experience depends crucially on the way the experience presents the world to the perceiver, and on the inferences from this presentation or seeming to perceptual beliefs. To this thesis they added further requirements in order to explain why in cases of CP these requirements may not be met, thus allowing the claim that CP does, on certain occasions, downgrade perceptual experience. Our assessment of Siegel’s work, however, suggests that unless one moves to incorporate into the discussion externalistic elements, such as the veridicality of perception, the reliability of perception, and the sensitivity of perception to the data, it is difficult to place on a solid epistemological basis the intuition that a CP experience and a non-CP experience have different justificatory forces because CP downgrades perception.
Let me note that henceforth I discuss cases where CP downgrades perception, but CP may also enhance the epistemic role of perception (as the cases of perceptual learning characterizing experts make clear). Once externalistic notions such as the sensitivity to the data and the veridicality of perception are allowed, one can revisit the analysis of the justificatory role of hallucinatory, illusory, and CP experiences provided

by the phenomenal conservatist and show its shortcomings. Even if one accepts that the belief that “a is F” that is based on the illusory perceptual state that “a is F”, where S mistakes b for a, is prima facie justified by the illusory state and, in fact, it is equally justified as the same belief that is based on a veridical perceptual state that “a is F”, one could still insist that there is an important difference between the two cases because only the veridical perceptual state grounds the belief. A similar claim can be made with regard to CP perceptual experiences. In fact, Siegel’s analysis aimed to establish a sense in which the beliefs that CP perception justifies are not well-founded. Sosa (2011) argues that a veridical and an illusory perceptual state with the same phenomenal character may in some sense justify the same belief but only the former grounds the belief because only a veridical perceptual state could ground a belief. Sosa (2011, 479) starts his analysis by claiming that there are two logically independent conceptions of the veridicality of perception: (a) an experience is veridical when it does not mislead a perceiver, and (b) an experience is veridical if the phenomenal character of the experience resembles the reality experienced, in which case there is a match between the phenomenal character of the experience and its intentional content. To explicate Sosa’s views, let us return to the illusory perceptual state that presents a as being F to a viewer, where a is indeed F but the viewer perceives in reality b (which is F) and perceptually mistakes it for a. The belief that “a is F” based on the illusory state is not well-grounded because it is not “in virtue of being based on that experience that the acquisition of that belief was the acquiring of a true belief ” (Sosa, ibid.). 
The illusory experience does not ground, although it justifies, the true belief because the intentional content of the illusory experience (which is that b is F since the object in the visual scene is b and not a) differs from the content of the belief (which is that a is F), which means that the viewer forms the belief for the wrong reasons since the content of the acquired belief is not based on the intentional content of the experience. In this sense, the illusory state misleads the viewer into believing something that she should not believe. This, according to the first conception of veridicality espoused by Sosa, entails that the illusory perceptual state is not veridical and cannot ground a belief.

In addition, the intentional content of the illusory experience (“b is F”) also differs from the phenomenal character of the experience (which presents a to be F), which means that the phenomenal character of the experience does not resemble the experienced reality, and, in turn, that the illusory experience is non-veridical according to the second conception of veridicality. For this reason, the illusory perceptual state does not ground the belief whose content matches the phenomenal character of the perceptual state, because the belief is not grounded in reality but through an illusion. None of these applies to the case when a viewer perceives a to be F, where a is the intentional object of the perceptual state. In this case, the experience is veridical in both senses of the term and the experience grounds the belief, in addition to justifying it. Moving to CP states, the analysis of the difference between the epistemological roles of a CP experience and of a non-CP experience is the same. The CP experience may justify a belief (if one wishes to back the intuition according to which the justificatory role of a perceptual state is exhausted by its phenomenal character) but it does not ground that belief for the same reasons that an illusory state does not ground a belief. Jill’s belief that Jack is angry due to her CP experience is not grounded in her experience because, first, the content of her belief does not match the intentional content of her experience (which is that Jack is not angry) and, therefore, it is not on account of her experience that she forms the belief; she forms the belief for the wrong reasons since the content of the acquired belief is not based on the content of the experience. The experience misleads Jill to form the belief that Jack is angry, which entails that the experience cannot ground the belief. 
Second, there is not a match between the phenomenal character of Jill’s experience and its intentional content, and the phenomenal character of the experience does not resemble the reality experienced. These problems do not afflict the case of the non-CP perceptual state that justifies the same belief as the CP perceptual state and, thus, the non-CP perceptual state grounds, in addition to justifying, the corresponding belief. The difference between the epistemological role of a CP experience and a non-CP experience is not in their justificatory role (this is the same) but in their grounding role; only the latter can ground beliefs.

An appeal to a notion of grounding, in order to avoid some of the problems besetting phenomenal conservatism’s thesis that a CP perceptual state and a non-CP perceptual state confer the same degree of justification on the same belief, is also adopted by Brogaard (2013). Brogaard (2013, 279) defines ‘content grounding’ as follows:

A seeming of the form [It seems to A as if q] is grounded in a content p of a particular perceptual, introspective, or memory-related experience e had by A iff [reliably (if p is a content of e, then it seems to A as if q) and reliably (it seems to A as if q, then q)]

Having defined content grounding, Brogaard (ibid.) proceeds to define what she calls Sensible Dogmatism, according to which:

If it seems to S as if p and the seeming is grounded in the content of S’s perceptual, introspective, or memory-related experience, then, in the absence of defeaters, S thereby has at least some degree of justification for believing that p.

and

A seeming to S that p provides prima facie justification only if it is grounded in the content of S’s perceptual, introspective or memory-related experience.
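
The logical shape of these definitions can be made explicit in a short sketch. The notation is an assumption introduced here purely for illustration and is not Brogaard’s own: R(φ ⇒ ψ) abbreviates “reliably, if φ then ψ” and is left unanalyzed, C(e, p) says that p is a content of experience e, Seems_A(q) says that it seems to A as if q, D_S(p) says that S has a defeater for p, and PJ_S(p) says that S has at least some degree of prima facie justification for believing that p.

```latex
% Illustrative notation only (assumed here, not taken from Brogaard 2013).
% Requires \usepackage{amsmath} for align* and \text.
\begin{align*}
% Content grounding: two reliability conditions, both required.
\mathrm{Grounded}_A(q, p, e)
  &\iff \mathcal{R}\bigl(C(e,p) \Rightarrow \mathrm{Seems}_A(q)\bigr)
   \;\wedge\; \mathcal{R}\bigl(\mathrm{Seems}_A(q) \Rightarrow q\bigr) \\[4pt]
% Sensible Dogmatism, sufficiency clause.
\bigl[\mathrm{Seems}_S(p) \wedge \exists e\,\exists r\;\mathrm{Grounded}_S(p, r, e)
   \wedge \neg D_S(p)\bigr]
  &\;\rightarrow\; \mathrm{PJ}_S(p) \\[4pt]
% Sensible Dogmatism, necessity clause.
\bigl[\mathrm{Seems}_S(p) \text{ provides } \mathrm{PJ}_S(p)\bigr]
  &\;\rightarrow\; \exists e\,\exists r\;\mathrm{Grounded}_S(p, r, e)
\end{align*}
```

On this rendering a seeming can fail to be grounded in two distinct ways: the link from the content of the experience to the seeming may be unreliable, or the link from the seeming to the fact may be unreliable. Failure of either conjunct suffices, by the necessity clause, to deprive the seeming of prima facie justificatory force.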

Let us restrict our discussion to perceptual experience. Note first that for Brogaard, a seeming provides prima facie justification immediately, i.e., it serves as an immediate justifier of beliefs and, as such, does not need supporting background beliefs (Brogaard and Gatzia 2017, 1). The definition of content grounding suggests that Brogaard attempts to establish a connection between the representational content of perceptual experience, what the experience is about, and the appearance related to this experience, which reminds us of Sosa’s (2011) demand that a veridical experience that justifies a belief is also one that grounds the belief when the phenomenal character of the experience resembles the intentional content of the perceptual experience and,

thus, the experienced reality. Brogaard reserves the capacity to justify beliefs for those experiences in which the phenomenal character of the experience is grounded in the content of the experience. Brogaard goes beyond Sosa in that while Sosa invokes the notion of ‘resemblance’ between the phenomenal character and the experienced reality to explain ‘grounding’, Brogaard defines the grounding of the seeming in the content of the experience by using the modifier ‘reliably’ to connect the seeming related to an experience with the content of that experience. It is easy to see that Sensible Dogmatism escapes the problems concerning cases of beliefs justified by CP experiences that beset phenomenal conservatism. Consider a snake expert and a novice. The snake expert sees a certain shape at a distance, which is in fact a snake, and because of her experience with snakes sees a snake and forms the belief that a snake is present. Suppose that the novice is afraid of snakes and that, when she sees the same shape at a distance, her fear penetrates her perceptual processes and, as a result, she sees a snake and forms the belief that a snake is present. According to phenomenal conservatism, both beliefs are equally well-justified. If one applies the grounding criterion, however, one gets different results. In the case of the expert, the content of the expert’s experience grounds the seeming because there is a reliable connection between the contents of the experience of snakes each time the expert sees a snake and the relevant seemings and, furthermore, there is a reliable connection between the seeming and the experienced reality. Things differ in the case of the novice.
There is not a reliable connection both between the seeming and the experienced reality and between the content of the experience and the seeming, because on another occasion in which the novice has no reason to expect snakes and, as a result, her fear does not penetrate her perceptual processes, she would not see a snake. The seeming will be different in this case because the lack of expertise prevents the novice from detecting the snake. It seems, thus, that the justificatory role of perceptual experience cannot be exhausted by its phenomenal character and more is needed to account for the support that perception provides to beliefs that are formed on its basis. This additional element is the extent to which the

CP experience reflects the world; whether its phenomenal character resembles, to use Sosa’s (2011) term, the experienced reality because only then does it make sense to talk about perception’s sensitivity to the facts and perception’s reliability. This demand is similar to the demand that perception can justify or ground beliefs only when it is sensitive to the data that are used as evidence in perceptual processes on the assumption that the perceptual processes transform states to other states where the transformed states constitute the evidence on which the transformation operates. Perception should justify or ground beliefs only when it is sensitive to the environmental data that serve as input to perception because this is the only way perception could be said to reflect the world more and the cognitive and emotive states of a perceiver less. Seen in this light, the problem with CP experiences is that they may limit the sensitivity to the data, making perception conform more to the penetrating states than to the world. In view of the way perception functions, the sensitivity to the data is determined by two factors: (a) what information gets into the transducers and is processed by the early visual system (I have called the information that early vision delivers the iconic image/stimulus), and (b) the way this iconic image is explored by late vision, to which it serves as an evidential or support or grounding basis in order for perception to construct the percept, that is, the way the visual scene is presented to the perceiver. These two factors entail that the sensitivity of perception to the data depends on how well both the environmental input and the iconic image are handled by the perceptual processes of early vision and late vision, respectively.
Thus, unlike the main underlying assumption of phenomenal dogmatism, perceptual experience is not passive and the percept is not simply given; it is, instead, constructed and during the construction process the visual system uses various pieces of evidence and in various ways. When CP occurs, the visual system may mishandle the data in the incoming environmental information, should CP affect early vision whose role is to retrieve information from the visual scene. (The viewer may also ignore relevant information by selecting the input focusing attention accordingly, but, as we have seen, this is not a case of CP and I will not pursue this possibility any further.) It could also mishandle

the information contained in the iconic image should late vision be CP, since it is in late vision that hypotheses concerning the identities of the distal objects are tested against the information contained in the iconic image. It follows that viewers may either ignore relevant evidence in the environment if it does not conform to what they expect or believe, or ignore and perhaps go beyond the available evidence in the iconic image to supplement it in a top-down way either with visual information that does not exist in the iconic image but is activated top-down, or with nonvisual information that does not conform to the information contained in the iconic image. It remains true, however, that the perceiver has no control over these processes and is not responsible for any mishandling of the evidence; hence, the perceiver cannot be faulted when CP downgrades perception. Still, CP does downgrade perception. The externalistic factor of the sensitivity to the data and of the reliability of perception is at odds with another assumption underlying phenomenal dogmatism, namely the thesis that the “normative status of a belief formed on the basis of an experience is logically independent of what the experience is about” (Sosa 2011, 476). Indeed, externalism requires, and internalism denies, the thesis that the justification of a belief by a perceptual state depends constitutively on truth-related factors. This is so because the sensitivity to the data that is required in order for a perceptual state to be able to justify or ground a belief depends on the extent to which the perceptual state that purports to ground the belief is a state whose content resembles the experienced worldly state of affairs.
A phenomenal dogmatist could acknowledge the role that the sensitivity of perception to the data plays for the reliability of perception, but insist that considerations pertaining to the normative status of the justification that experience can provide to beliefs should be distinguished from considerations concerning claims to knowledge that experience may sanction, because the former is exhausted by the phenomenal character of the experience, which is intrinsic to it and logically independent of the distal cause of the experience. It seems that everything hinges on whether one is willing to bite, as phenomenal conservatism is, the bullet that a CP experience justifies prima facie a belief equally well as a non-CP experience with the same phenomenal character, and that

independent of whether the perceiver is aware or not of the CP. This, in turn, depends on one’s construal of ‘justification’. We have seen that a notion of perception grounding beliefs is introduced that is inextricably linked with the reliability of perception, and which is more potent than the notion of simple justification in that it involves the reliability of perception in representing accurately the environment. In this view, a CP experience may justify beliefs to the same extent as a non-CP experience with the same phenomenal character, but only the non-CP experience grounds the belief. Schellenberg (2013, 2014, 2016a, b) proposes an account of perceptual evidence that allows internalists to retain the thesis that the phenomenal character of subjectively indistinguishable perceptual states justifies, or provides (phenomenal) evidence for, those beliefs whose contents match the phenomenal character (Schellenberg 2014, 97), but at the same time does not face the problem afflicting phenomenal conservatism that is at issue in our discussion. This is so because Schellenberg’s account can accommodate, if appropriately extended, the view that a CP experience does not justify equally well, or does not provide the same amount of evidence for, a belief as a non-CP experience does for the same belief, and it can also explain why a CP experience can mislead one into espousing a belief that is wrong despite the fact that the phenomenal character of the experience justifies this belief. What allows Schellenberg (2014, 97–98) to accomplish this is a distinction between the phenomenal evidence provided by the phenomenal character of a perceptual state in which the perceptual capacities are put to good use, and the phenomenal evidence provided by the phenomenal character of an experience in which the perceptual capacities fail to accomplish their function, as in hallucinatory or in illusory states.
In the former case, the phenomenal evidence puts someone in an epistemic position in which they may know the truth, while in the latter case the evidence, although tangible, is misleading. This distinction is made possible by the way Schellenberg (2014, 90) views the sensory character of a perceptual state, which she contrasts with the standard view. According to the standard view, the sensory character is constituted by awareness relations to worldly (external, mind-independent) particulars (whether it be entities, instances of properties or relations, or

events). Accordingly, in hallucinations where no such entities exist the awareness relation that constitutes the sensory character of a hallucinatory state is a relation to entities such as intentional objects, clusters of properties, or qualia. Schellenberg proposes to understand the sensory character of a perceptual state in terms of employing certain perceptual capacities that function to differentiate and single out particulars in the environment. According to Schellenberg (2014, 91–92), these capacities are low-level mental capacities realized in early vision that allow a proto-conceptual analogue of referring to worldly particulars, in the sense of perceptually singling out particulars in a way that does not require any conceptual capacities. (Raftopoulos and Muller [2006] have argued that early vision parses and individuates objects in a visual scene in a nonconceptual way, and they have called this function of perception through which reference to objects in the world is secured, nonconceptual perceptual demonstrative reference.) Schellenberg (2014, 90) proposes that in mental states in which perceptual capacities are employed but fail to individuate particulars because these particulars are absent (as when one hallucinates or has illusions), these capacities are employed baselessly. Schellenberg (2014, 97) argues that a perceiver could have phenomenal perceptual evidence for some belief even when the perceptual capacities that individuate the sensory character of the perceptual state are being employed baselessly. Thus, someone who hallucinates has phenomenal evidence for a belief that matches the phenomenal character of their hallucinatory state, evidence that is tangible but misleading.
Although Schellenberg (2014) does not discuss in detail illusions except to include illusions among the bad cases one can easily extrapolate; in an illusion it seems to perceivers that there is a property instantiated in their environment, which is not in fact instantiated where it seems to them to be. In this case, the phenomenal character of the perceivers’ experience may mislead them into forming the belief that this property is instantiated where it seems to them to be, while in fact it is not. One can assume that in cases of illusions, the perceptual evidence the illusory states provide on account of their phenomenal character is misleading because the perceptual capacities are baseless not in the sense that the objects, as they figure in the

phenomenal character of the perceptual state, that these perceptual capacities individuate are absent, as in hallucinations, but in the sense that the instances of the properties, as they figure in the phenomenal character of the illusory perceptual state, are absent. Schellenberg (2013, 2014, 2016a, b) argues that the reason that a person who hallucinates has some justification, in the form of phenomenal evidence, for believing the proposition that matches the content of the hallucination, is exactly that in hallucinations the same capacities that are at work in perception operate. In the hallucinatory case, however, they fail to single out particulars in the environment. Thus, the justificatory force of hallucinations depends, and in this sense is parasitic, on the justificatory force of perception that is due to the discriminatory capacities that characterize perception; this is, as we shall discuss in some detail in a while, the thesis of the explanatory and metaphysical asymmetry between hallucinations and perception. Schellenberg (2014, 98) notes that even though her views are externalistic, they do not entail reliabilism, but are certainly compatible with it. Schellenberg (2016a, b) insists that her account is neither phylogenetic nor ontogenetic (and, thus, does not require an explanation of why perception has the discriminatory capacities that it has, and of how these capacities make perception successful in allowing perceivers to interact successfully with their environment), but makes a metaphysical point about perception.
Nevertheless, one should bear in mind that according to Schellenberg, the phenomenal evidence afforded by the phenomenal character of a perceptual state in which the perceptual capacities function well and perform successfully their function to individuate particulars in the visual scene is such that the “notion of phenomenal evidence makes room for the idea that having evidence is a matter of being in an epistemic position that is well situated to tell us the truth” (Schellenberg 2014, 98). It could hardly be otherwise, since Schellenberg’s analysis of perceptual states hinges on the notion of certain perceptual capacities as defined above. These capacities have the function to perform certain tasks, namely to individuate particulars in a visual scene. Even if one does not wish to explain these capacities in evolutionary terms and adopts a minimal notion of a natural function, it is difficult to see how

one could explain the natural character of the function without appealing to the function being reliable, and thus yielding perceptual states that are, approximately, veridical. To put it differently, Schellenberg sooner or later in her argument needs at least a notion of normal function if she wishes to explain the perceptual capacities that characterize perception in the good case. And, as McGrath (2016, 901–902) notes, such an explanation, for an externalist, has at some point to rely either on standard reliabilism or on some Burgean form of it according to which the justificatory force of perception or its capacities is grounded not in the reliability of perception or its capacities per se, but, rather, in the reliability in conditions that explain why one has perception or the capacities that go with it. Moreover, Schellenberg (2013, 2014, 2016a, b) acknowledges that the externalistic character of her notion of perceptual states emanates from the fact that these states are individuated by means of perceptual capacities that have a specific proper function. Specifically, perceptual states are individuated by the discriminatory capacities of perception, which, in turn, are individuated by the types of the mind-independent particulars that they function to single out in a visual scene (Schellenberg 2016a, 878). This individuation entails that a perceptual state and its phenomenal character constitutively involve external, mind-independent objects and property-instances, because these capacities are themselves analyzed in terms of perceptual relations to external, mind-independent objects and property-instances. Thus, perceptual states and their phenomenal characters are inherently related to the objects and instances of properties that the perceptual capacities individuate when they function successfully.
This view, which goes against the notion of phenomenal content as an intrinsic property of perceptual experience that is logically independent of the perceiver’s environment, is thus an externalist factor. Schellenberg (2014, 94) argues that

There is a metaphysical primacy of the good over the bad case insofar as one can possess the discriminatory, selective capacities employed in the bad case only in virtue of being the kind of being that could employ those very capacities in the good case. Call this the metaphysical primacy thesis. (The metaphysical primacy thesis)

This thesis entails two counterfactuals. First, if one possesses a perceptual capacity to individuate particulars in a visual scene, then, ceteris paribus, one would be able to individuate a certain particular, were one perceptually related to such a particular. Second, if one possesses such a capacity, one would fail to individuate some particular, if one was not perceptually related to such a particular. The last counterfactual means that if a property-instance is not present in a visual scene, and therefore a perceiver is not perceptually related to this property then, when the perceptual capacities function normally, a perceiver should not perceive this property. Schellenberg (2014) uses the metaphysical primacy thesis to support the view that when persons hallucinate, they do have phenomenal evidence for the belief whose content matches the phenomenal character of the hallucinatory state. The reason is that when someone hallucinates that there is a white cup in front of them, even though the hallucinating subjects fail to single out any white cup, they are in a sensory state that is as of a white cup in virtue of employing the capacity to discriminate and single out white from other colors and cup-shapes from other shapes. These considerations bring us closer to applying Schellenberg’s notion of ‘misleading evidence’ that results from the phenomenal character of perceptual states in which the perceptual capacities are employed baselessly to cases of CP of perception. Take Jill’s belief that Jack is angry, which penetrates her perceptual processes and partly causes a perceptual state that presents Jack as being angry, which in turn causes Jill to form the belief that Jack is angry. 
Since the content of the belief matches the phenomenal character of her perceptual state, Jill exercises her epistemic skills correctly. Thus, as phenomenal dogmatists and Schellenberg think, Jill’s perception justifies the belief that Jack is angry or, equivalently, the phenomenal character of the perceptual state provides Jill with phenomenal evidence for her belief. It is rational for Jill, after all, to heed the testimony of her perceptual states, since these states are systematically linked to what they are of in the good case (veridical perception): the perceptual capacities employed in the bad case (to which I add cases of CP perception, because CP misleads Jill into forming a wrong belief) are explanatorily and metaphysically parasitic on their employment in the good case. In virtue of this systematic linkage, it is rational for Jill to heed the testimony of her perceptual states even when they are, unbeknownst to her, CP (Schellenberg 2014, 94–95). Schellenberg’s account allows us to treat cases of CP of perception as bad cases, because when perceptual processes are CP, the perceptual capacities employed do not function properly and fail to pick out the particulars that really exist in the visual scene. In Jill’s case, they pick out an instance of the property ‘being angry’, but Jack is not angry; the perceptual capacities are employed baselessly. The explanatory and metaphysical asymmetry between the good and the bad cases makes possible the claim that bad cases provide phenomenal evidence for beliefs, but this evidence, albeit tangible, is misleading. Thus, there is some phenomenal evidence,¹¹ but this evidence is not as good as the phenomenal evidence perception provides in the good cases, though Schellenberg does not discuss how weak or strong the evidence is in the bad cases. Schellenberg (2013, 2014) also discusses the notion of factive evidence. Schellenberg (2014) argues that a perceiver who is in a veridical perceptual state and a hallucinating subject who has a veridical hallucination with exactly the same phenomenal character as the perceiver’s perceptual state have the same phenomenal evidence, although the evidence of the hallucinating subject is misleading, because the perceptual capacities employed in hallucination are baseless, while the evidence of the perceiver eventually allows her to know something about the world.
11 “I will argue that we have at least some evidence provided directly through experience regardless of whether we are perceiving, hallucinating, or suffering an illusion” (Schellenberg 2013, 700).

Unlike phenomenal evidence, which is constituted by the phenomenal content or character of the experience (whether it be perceptual or hallucinatory), factive evidence depends on the representational content of the experiential states. This entails that the veridical perceptual state and the (phenomenally, subjectively, indistinguishable) hallucinatory state do not provide the same factive evidence or, in fact, that only the perceptual state provides factive evidence, while the hallucinatory state provides none. This is so because only the veridical perceptual state has representational content. Schellenberg (2016a, b) elaborates further on the sort of contents that are related to factive evidence and phenomenal evidence. Factive and phenomenal evidence are related to two levels of perceptual content. The former is individuated by the factive content of perception, which is a token content, in the sense that it is a singular proposition that makes specific reference to some particulars in the environment. The latter is individuated by a type of content, which is a gappy proposition in that it has the form ⟨MOPo(_), MOPf(_)⟩, where MOPo(_) is a gappy, object-related de re mode of presentation, and MOPf(_) is a gappy, property-related de re mode of presentation (Schellenberg 2016b, 944). Let me pause for a while to elaborate on the notion of a de re mode of presentation. This will also give us the opportunity to relate these modes of presentation to the discriminatory capacities that play such a crucial role in Schellenberg’s work, and to make better sense of what Schellenberg has in mind when she claims (Schellenberg 2014, 91–92) that the perceptual capacities are low-level mental capacities realized in early vision that allow a proto-conceptual analogue of referring to worldly particulars, in the sense that perceptually singling out particulars requires no conceptual capacities. As I said in the introduction, I have argued (Raftopoulos 2009) that in perception we are acquainted with objects in the sense that one is in direct (that is, without any conceptual intermediaries) contact with the object itself and retrieves information about that very object from the object itself and not through a description that would individuate or identify the object and thus, by depicting it, would secure reference to it.
On that account, perception puts us in a de re relationship with the object (as opposed to a descriptivist relationship), and the ensuing perceptual judgments are de re judgments; when one forms a de re belief, one stands in “appropriate nonconceptual, contextual relations to objects the belief is about” (Burge 1977, 346). In perception, the reference of perceptual demonstratives is fixed through the nonconceptual


information retrieved directly from the environment in early vision that allows one to individuate the objects in a visual scene by assigning to them object-files. This nonconceptual information constitutes the nondescriptive mode of presentation of the demonstratum in perception. Thus, the NCC of the perceptual states functions as a de re mode of presentation of the object in perception. This allows one to be in a position to make de re judgments or form de re beliefs about the world. Perception individuates or singles out objects in a visual scene by assigning them object-files based on spatio-temporal and, occasionally, other featural information. The content of an object-file, which is not encoded but is only momentarily used and cannot be reused—and thus is not descriptive—is the nondescriptive, de re mode of presentation of the object/demonstratum in perception. When one perceives an object, the natural analogue in the perceptual act of the term “that,” which occurs in the linguistic expression of the demonstrative that the perceiver could use to point to that object, is the occurrence of the perception itself. The perception itself constitutes a demonstrative reference to the world and, thus, the perception of the object has the cognitive force of “that object.” The objects of perceptual demonstratives are determined relationally, in a very similar sense to that defined by Bach (1987, 12), which means that for an object to be an object of a perceptual state, it must stand in a certain kind of relation to that same perceptual state. The NCC of an object-file is the nondescriptive de re mode of presentation of the object/demonstratum in perception. Since object-files index objects and secure reference to them through their content, one could claim that perceptual de re modes of presentation function as mental demonstratives. 
Furthermore, in accordance with Evans (1982), the causal relationship between a visual scene and the content of the object-files the perceptual system assigns to the objects it parses in the scene is such that it allows the perceiver to individuate (discriminate is the term used by Evans, but in the case of perception the term “individuate” is more apt) the objects in perception. Now, the content of an object-file is idiosyncratic to the relationship of the viewer with the visual scene, which means that different viewers may use different information to parse a scene or that the same viewer may use different information to individuate the same


objects, depending on her perspective on the scene. This entails that the de re relationship of the perceiver with a visual scene, a relationship that allows her to retrieve information regarding the scene from the scene itself and not from a description of it, is highly contextual. Let us return to the notion of factive evidence, which supervenes on the representational and not on the phenomenal content of the perceptual state. This time things are clearer. In the good cases, there is factive evidence. In bad cases such as hallucinations, there is no factive evidence because hallucinatory states do not have representational contents, even though they have phenomenal characters. What is the account of bad cases, such as illusions or cases in which perception is CP, in which the states have representational contents but do not represent the visual scene correctly? Schellenberg (2013, 700) thinks that factive evidence is necessarily determined by the content of the visual scene to which a perceiver is perceptually related and, as a result, factive evidence is guaranteed to yield truth in accurately representing the visual scene. Moreover, perceivers have factive evidence that is determined by the visual scene (in this way Schellenberg attempts to screen off cases, such as veridical illusions or hallucinations, in which the experience is accurate but only accidentally so) only if they accurately perceive the visual scene to which they are perceptually related (Schellenberg 2013, 723). It follows that Jill’s CP perceptual state as of an angry Jack does not provide factive evidence for the belief that Jack is angry, independently of the fact that it provides some phenomenal evidence for it. To explain why only in the good case does perception ground beliefs well, Millar (2011, 343) analyzes the prerequisites for perception that are, or constitutively involve, the exercise of certain recognitional abilities.
Having a visual–perceptual recognitional ability requires one to be very highly reliable in a specific way—in particular, with respect to correct applications of a concept in response seeing something … one can be thus reliable with respect to judging correctly of things from the way they look that they are tomatoes, only if the visual appearance that is characteristic of tomatoes is also distinctive of them, and so only if this appearance is such that its possession by a thing is a very highly reliable indicator that it is a tomato.


It follows that the justification perceivers have in the good case for believing that the object in a visual scene is a tomato should not be thought to derive simply from the visual appearance of the tomato, that is, simply from the phenomenal character of the perceptual experience, as the internalistic assumption underlying phenomenal dogmatism requires. The justification depends on the perceivers’ coming to know that the object is a tomato from the way it looks and on their coming to know that they see that the object is a tomato. Thus, whether a belief is justified in the sense of being well-grounded in, or -founded on, the perception depends on the abilities that make it possible for the perceiver to know that the object is a tomato, which, in turn, depends on the environment being favorable to having those abilities with respect to tomatoes (that is, by allowing perception to pick out and recognize tomatoes in the environment reliably). It follows that S’s coming to believe, in the bad case, that the thing is a tomato is not the exercise of the same recognitional ability that is exercised in the good case. That a perceptual experience justifies a belief in the sense that it is reasonable for perceivers to form the belief in view of their experience, and by doing so they are not acting with doxastic irresponsibility, Millar (2011, 345) argues, is not, or should not be, the epistemologically central notion of being justified. If S believes that a is F because S falsely believes that an object that is F is the object a, while in reality it is the object b, no matter how reasonable this may sound, the belief that a is F is not well-grounded because S is wrong in thinking that a is F (in reality b is F). What S takes to be the reason for believing that a is F, namely that the object at issue is the object a and is F, is really not a reason to believe that a is F, and, thus, the belief that a is F is not justified in the sense of being well-grounded or founded. 
S’s “belief might be reasonable, but this notion of being reasonable is parasitic on the notion of being justified in the sense of being well founded” (Millar 2011, 345). All the accounts of perceptual justification that we have encountered in the preceding paragraphs emphasize the distinction between the phenomenal character of perceptual states and their intentional or representational content, and undermine the view that the content of a perceptual state is an intrinsic property of that state that, as such, is


logically independent of what the perceptual state is about; in other words, they adopt an externalist view of mental content. This challenges phenomenal dogmatism or conservatism in that it paves the way for a notion of perceptual justification that is not determined solely by the phenomenal character of the perceptual experience viewed as an intrinsic property of the experience. This challenge, in turn, allows conditions such as the reliability of perception and its sensitivity to the environmental data to enter into discussions of perceptual justification. What unites Sosa, Brogaard, and Schellenberg is a conception of the justification that a veridical perceptual state that grounds the relevant belief provides for that belief, which is stronger than the justification that an illusory, hallucinatory, or CP perceptual state provides for a belief with matching content that the state causes. This conception is stronger in that it imposes a further requirement on justification in addition to the condition that the content of the justified belief should match the phenomenal content of the perceptual experience. These philosophers, therefore, are sympathetic to the internalist’s intuition that when two viewers share the same phenomenal content, there is a sense in which both are justified in believing that O is F. (This is perhaps why Steup [2018, 2911, fn. 4] thinks that Brogaard’s theory is an internalist/externalist hybrid.) Let us grant that when perceivers form, or are caused to have, a belief whose content matches the phenomenal character of their experience, they are doing the epistemically correct thing and exercising their epistemic capacities appropriately; they are reasonable in the sense that they are doing the doxastically responsible thing to do. This is not enough, however, for the perception to justify a belief in the sense that it grounds or bases the belief.
That, only veridical perceptual experiences can do, which is why grounding is lacking in the bad cases, including the CP cases. What underwrites all these externalist views is the belief that for perception to play its justificatory role, it must be sensitive to the environmental input, that is, it must reflect that input accurately. This is an insight that I will use in the next two chapters to discuss CP and the repercussions of indirect cognitive effects on early vision, in order to determine whether they entail the CP of early vision.


References

Audi, R. (2003). Epistemology: A Contemporary Introduction to the Theory of Knowledge (2nd ed.). London: Routledge.
Austin, J. (1962). Sense and Sensibilia. Oxford: Oxford University Press.
Bach, K. (1987). Thought and Reference. Oxford: Clarendon Press.
Block, N. (2007). Consciousness, accessibility, and the mesh between psychology and neuroscience. Behavioral and Brain Sciences, 30, 481–548.
Brogaard, B. (2013). Phenomenal seemings and sensible dogmatism. In C. Tucker (Ed.), Seemings and Justification (pp. 270–289). Oxford: Oxford University Press.
Brogaard, B., & Gatzia, D. (2017). The real epistemic significance of perceptual learning. Inquiry. http://dx.doi.org/10.1080/0020174X.2017.1368172.
Burge, T. (1977). Belief de re. Journal of Philosophy, 74, 338–362.
Burge, T. (2003). Perceptual entitlement. Philosophy and Phenomenological Research, 67(3), 503–548.
Burge, T. (2010). Origins of Objectivity. Oxford: Clarendon Press.
Burnston, D. (2017). Cognitive penetration and the cognition-perception interface. Synthese, 194, 3645–3668.
Byrne, A. (2014). Perception and evidence. Philosophical Studies, 170, 101–113.
Cavanagh, P. (2011). Visual cognition. Vision Research, 51, 1538–1551.
Churchland, P. M. (1988). Perceptual plasticity and theoretical neutrality: A reply to Jerry Fodor. Philosophy of Science, 55, 167–187.
Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36, 181–253.
Davidson, D. (1986). A coherence theory of truth and knowledge. In E. LePore (Ed.), Truth and Interpretation: Perspectives on the Philosophy of Donald Davidson. Oxford: Basil Blackwell.
Evans, G. (1982). The Varieties of Reference. Oxford: Clarendon Press.
Fantl, J., & McGrath, M. (2009). Knowledge in an Uncertain World. Oxford: Oxford University Press.
Feyerabend, P. (1981). Realism, Rationalism and Scientific Method: Philosophical Papers (Vol. 1). Cambridge: Cambridge University Press.
Firestone, C., & Scholl, B. J. (2016). Cognition does not affect perception: Evaluating the evidence for ‘top-down’ effects. Behavioral and Brain Sciences. http://dx.doi.org/10.1017/S0140525X15000965.
Ghijsen, H. (2016). The real epistemic problem of cognitive penetration. Philosophical Studies, 173, 1457–1475.
Hanson, N. R. (1958). Patterns of Discovery. Cambridge: Cambridge University Press.
Hatfield, G. (2002). Perception as unconscious inference. In D. Heyer & R. Mausfeld (Eds.), Perception and the Physical World: Psychological and Philosophical Issues in Perception. West Sussex: Wiley.
Heck, R. G., Jr. (2000). Nonconceptual content and the ‘space of reasons’. Philosophical Review, 109, 483–523.
Heck, R. G., Jr. (2007). Are there different kinds of content? In J. Cohen & B. McLaughlin (Eds.), Contemporary Debates in the Philosophy of Mind. Oxford: Blackwell.
Helmholtz, H. von (1878[1925]). Treatise on Physiological Optics. New York: Dover.
Huemer, M. (2007). Compassionate phenomenal conservatism. Philosophy and Phenomenological Research, 74, 30–55.
Huemer, M. (2013). Epistemological asymmetries between belief and experience. Philosophical Studies, 162(3), 741–748.
Jackendoff, R. (1989). Consciousness and the Computational Mind. Cambridge: MIT Press.
Kitcher, P. (2001). Real realism: The Galilean strategy. Philosophical Review, 110(2), 151–199.
Kosslyn, S. M. (1994). Image and Brain. Cambridge: MIT Press.
Kuhn, T. S. (1962). The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
Kvanvig, J. (2003). Propositionalism and the perspectival aspect of justification. American Philosophical Quarterly, 40(1), 3–18.
Lamme, V. A. F. (2005). Independent neural definitions of visual awareness and attention. In A. Raftopoulos (Ed.), The Cognitive Penetrability of Perception: An Interdisciplinary Approach. Hauppauge, NY: Nova Science Books.
Long, R. (2017). How wishful thinking is not like wishful thinking. Philosophical Studies. http://doi.org/10.1007/s11098-017-0917-2.
Lyons, J. (2005). Perceptual beliefs and nonexperiential looks. In J. Hawthorne (Ed.), Philosophical Perspectives, 19: Epistemology. Malden: Blackwell.
Lyons, J. (2009). Perception and Basic Beliefs. Oxford: Oxford University Press.
Lyons, J. (2011). Circularity, reliability, and the cognitive penetrability of perception. Philosophical Issues, 21, The Epistemology of Perception, 289–311.
Lyons, J. C. (2015). Inferentialism and cognitive penetrability of perception. Episteme, 13(1), 1–28.
Lyons, J. C. (2016). Experiential evidence. Philosophical Studies, 173, 1053–1079.
Macpherson, F. (2012). Cognitive penetration of colour experience: Rethinking the issue in light of an indirect mechanism. Philosophy and Phenomenological Research, 84(1), 24–62.
Markie, P. J. (2005). The mystery of direct perceptual justification. Philosophical Studies, 126, 347–373.
Markie, P. J. (2006). Epistemically appropriate perceptual belief. Nous, 40, 118–142.
Markie, P. J. (2013). Searching for true dogmatism. In C. Tucker (Ed.), Seemings and Justification (pp. 248–269). Oxford: Oxford University Press.
McAllister, B. (2018). Seemings as sui generis. Synthese, 195, 3079–3096.
McDowell, J. (1994). Mind and World. Cambridge, MA: Harvard University Press.
McDowell, J. (2011). Perception as a Capacity for Knowledge. Milwaukee, WI: Marquette University Press.
McGrath, M. (2013a). Siegel and the impact for epistemological internalism. Philosophical Studies, 162(3), 723–732.
McGrath, M. (2013b). Phenomenal conservatism and cognitive penetration. In C. Tucker (Ed.), Seemings and Justification (pp. 225–247). Oxford: Oxford University Press.
McGrath, M. (2016). Schellenberg on the epistemic force of experience. Philosophical Studies, 173, 897–905.
McGrath, M., & Fantl, J. (2002). Evidence, pragmatics, and justification. The Philosophical Review, 111, 67–94.
Millar, A. (2011). How visual perception yields reasons for belief. Philosophical Issues, 21, The Epistemology of Perception, 332–351.
Peacocke, C. (2004). The Realm of Reason. New York, NY: Oxford University Press.
Perry, J. (2001). Knowledge, Possibility, and Consciousness. Cambridge: MIT Press.
Plantinga, A. (1993). Warrant and Proper Function. New York, NY: Oxford University Press.
Pryor, J. (2000). The sceptic and the dogmatist. Nous, 34, 517–549.
Pryor, J. (2005). There is immediate justification. In M. Steup & E. Sosa (Eds.), Contemporary Debates in Epistemology (pp. 181–201). Malden, MA: Blackwell.
Pylyshyn, Z. (1999). Is vision continuous with cognition? Behavioral and Brain Sciences, 22, 341–365.
Pylyshyn, Z. (2003). Seeing and Visualizing: It’s Not What You Think. Cambridge: MIT Press.
Pylyshyn, Z. (2007). Things and Places: How the Mind Connects with the World. Cambridge: MIT Press.
Raftopoulos, A. (2001a). Is perception informationally encapsulated? The issue of the theory-ladenness of perception. Cognitive Science, 25, 423–451.
Raftopoulos, A. (2001b). Reentrant pathways and the theory-ladenness of observation. Philosophy of Science, 68, 187–200.
Raftopoulos, A. (2009). Cognition and Perception: How Do Psychology and Neural Science Inform Philosophy? Cambridge: MIT Press.
Raftopoulos, A. (2011, November 30). Late vision: Its processes and epistemic status. Frontiers in Psychology, 2, 382. https://doi.org/10.3389/fpsyg.2011.00382.
Raftopoulos, A. (2014). The cognitive impenetrability of the content of early vision is a necessary and sufficient condition for purely nonconceptual content. Philosophical Psychology, 27(5), 601–620.
Raftopoulos, A. (2015a). The cognitive impenetrability of perception and theory-ladenness. Journal of General Philosophy of Science, 46(1), 87–103.
Raftopoulos, A. (2015b). Cognitive penetrability and consciousness. In J. S. Zeimbekis & A. Raftopoulos (Eds.), Cognitive Effects on Perception: New Philosophical Perspectives (pp. 268–298). Oxford: Oxford University Press.
Raftopoulos, A., & Muller, V. (2006). Nonconceptual demonstrative reference. Philosophy and Phenomenological Research, 72(2), 251–285.
Recanati, F. (1997). Direct Reference: From Language to Thought. Oxford: Blackwell.
Rock, I. (1983). The Logic of Perception. Cambridge: MIT Press.
Schellenberg, S. (2011). Perceptual content defended. Nous, 45, 714–750.
Schellenberg, S. (2013). Experience and evidence. Mind, 122(487), 699–747.
Schellenberg, S. (2014). The epistemic force of perceptual experience. Philosophical Studies, 170, 87–100.
Schellenberg, S. (2016a). Phenomenal evidence and factive evidence. Philosophical Studies, 173, 875–896.
Schellenberg, S. (2016b). Phenomenal evidence and factive evidence defended: Replies to McGrath, Pautz, and Neta. Philosophical Studies, 173, 929–946.
Sellars, W. (1956). Empiricism and the philosophy of mind. In H. Feigl & M. Scriven (Eds.), Minnesota Studies in the Philosophy of Science (Vol. I, pp. 253–329). Minneapolis: University of Minnesota Press.
Siegel, S. (2011). Cognitive penetrability and perceptual justification. Nous, 46, 201–222.
Siegel, S. (2012). The Contents of Visual Experience. Oxford: Oxford University Press.
Siegel, S. (2013a). The epistemic impact of the etiology of experience. Philosophical Studies, 162, 697–722.
Siegel, S. (2013b). Can selection effects influence the rational role of experience? In T. Gendler (Ed.), Oxford Studies in Epistemology (Vol. 4, pp. 240–270). Oxford: Oxford University Press.
Siegel, S. (2015). Epistemic charge. Proceedings of the Aristotelian Society, CXV(3), 277–305.
Siegel, S. (2016). How is wishful seeing like wishful thinking? Philosophy and Phenomenological Research. https://doi.org/10.1111/phpr.12273.
Siegel, S., & Silins, N. (2014). Consciousness, attention, and justification. In D. Dodd & E. Zardini (Eds.), Scepticism and Perceptual Justification (pp. 149–169). Oxford: Oxford University Press.
Silins, N. (2005). Deception and evidence. Philosophical Perspectives, 19, 375–404.
Smith, A. D. (2002). The Problem of Perception. Cambridge, MA: Harvard University Press.
Smithies, D. (2014). The phenomenal basis of epistemic justification. In J. Kallestrup & M. Sprevak (Eds.), New Waves in Philosophy of Mind. New York: Palgrave Macmillan.
Smithies, D. (2016). Perception and the external world. Philosophical Studies, 173, 1119–1145.
Sosa, D. (2011). Some of the structure of experience and belief. Philosophical Issues, 21, The Epistemology of Perception, 474–484.
Speaks, J. (2005). The Phenomenal and the Representational. Oxford: Oxford University Press.
Spelke, E. S. (1988). Object perception. In A. I. Goldman (Ed.), Readings in Philosophy and Cognitive Science (pp. 447–461). Cambridge: MIT Press.
Steup, M. (2018). Destructive defeat and justificational force: The dialectic of dogmatism, conservatism, and metaevidentialism. Synthese, 195, 2907–2933.
Tucker, C. (2010). Why open-minded people should endorse dogmatism. Philosophical Perspectives, 24, Epistemology.
Tucker, C. (2014). If dogmatists have a problem with cognitive penetration, you do too. Dialectica, 68(1), 35–62.
Tye, M. (1995). Ten Problems of Consciousness. Cambridge: MIT Press.
Tye, M. (2000). Consciousness, Color, and Content. Cambridge: MIT Press.
Tye, M. (2002). Visual qualia and visual content revisited. In D. Chalmers (Ed.), Philosophy of Mind (pp. 447–457). Oxford: Oxford University Press.
Tye, M. (2006). Nonconceptual content, richness and fineness of grain. In T. Gendler & J. Hawthorne (Eds.), Perceptual Experience (pp. 504–530). Oxford: Oxford University Press.
Tye, M. (2009). Consciousness Revisited: Materialism Without Phenomenal Concepts. Cambridge: MIT Press.
Vahid, H. (2014). Cognitive penetration, the downgrade principle, and extended cognition. Philosophical Issues, 24(1), 439–459.
Williamson, T. (2000). Knowledge and Its Limits. Oxford: Oxford University Press.

2 Cognitive Penetrability

1 Introduction

Although discussions about cognitive penetrability (CP) and its role have become frequent in the last forty years, the notion of CP was not thoroughly analyzed until very recently. In fact, one of the most important problems, namely whether all cognitive effects on perception constitute cases of CP or whether only a sub-class of such effects should be deemed cases of CP, was left unaddressed. With the reinvigorated interest in the nature of perception, the notion of CP has come to the fore and started receiving thorough analyses. As a result, many definitions have appeared in the literature (Macpherson 2012; Pylyshyn 1999; Raftopoulos 2009; Siegel 2011, 2013a, b; Stokes 2012, 2015; Wu 2013). Most definitions share a common thread: they exclude from instances of CP cases in which the percept is determined through the focus of peripheral spatial attention (I discuss cases of cognitively driven attention), or, in general, cases in which concepts determine the percept indirectly by introducing an external link in the causal chain from the cognitive to the perceptual state, as when someone focuses their eyes on some location or on some feature/object.

© The Author(s) 2019 A. Raftopoulos, Cognitive Penetrability and the Epistemic Role of Perception, Palgrave Innovations in Philosophy, https://doi.org/10.1007/978-3-030-10445-0_2


It is not always adequately explained why the indirect effects are not cases of CP. It is also unclear what happens when spatial or feature/object-based attention acts covertly and attains focus without being accompanied by any bodily movements. Philosophers (Gross 2017; Gross et al. 2014; Macpherson 2012; Siegel 2011, 2013a, b, 2016), cognitive scientists (Firestone and Scholl 2016), and psychologists (Pylyshyn 1999, 2003, 2007) think that covert attentional shifts are indirect cognitive effects on perception, in that they affect either a pre-perceptual or a post-perceptual stage but not perceptual processing, and that, like overt shifts of attention, they are not genuine cases of CP. Other philosophers (Mole 2015; Raftopoulos 2001a, b, 2009, 2014; Raftopoulos and Muller 2006; Stokes 2012, 2017; Wu 2013, 2017) think that cognitively driven covert attention does directly affect perceptual processing or some stage of it and, thus, entails the CP of perception or of a perceptual stage. Another problem besetting discussions about the CP of perception is that they treat perception as if it were a unified stage. Perception, however, is not a homogeneous, undifferentiated process. It consists of two stages, early vision and late vision, that are differently affected by cognition. When a claim is made that perception is CP and empirical evidence is adduced to support it, it is not clear whether the claim concerns early vision or late vision, and this despite the fact that most proponents of cognitive impenetrability (CI) (Pylyshyn 1999; Raftopoulos 2009) argue that early vision is CI but late vision is CP. Among philosophers, Raftopoulos (2009), Siegel (2012), and Cecchi (2018) are sensitive to this distinction. Among scientists, only very recently have Newen and Vetter (2016) and O’Callaghan et al. (2016) explicitly discussed the impact of empirical studies on the various stages of perceptual processing. O’Callaghan et al.
(2016), for example, adduce evidence that perception is penetrated by predictions, but they hasten to add, first, that the evidence is much less robust when early vision is concerned, and second, that the evidence concerning early vision is drawn from pre-cueing studies, and it is open to debate whether the cognitive effects accompanying pre-cueing should be interpreted so as entailing CP, since in pre-cueing the cognitive effects may be seen as guiding the allocation of attention before stimulus presentation, in which case they

2  Cognitive Penetrability     87

are not usually construed as entailing CP. Therefore, discussions on CP/CI should specify the scope of the claim that perception is CP or CI. In addition, an account of CP/CI should be able to explain the differences in the CP/CI character of each stage owing to the differences between the ways cognition affects the two visual stages. In view of the above considerations, a definition of CP is constrained by the demand that it should offer a principled way to distinguish between cognitive effects that constitute cases of CP and cognitive effects that do not. In addition, in view of the discussion in Chapter 1 concerning the epistemic effects of CP, it should make possible an account of why some cases of CP downgrade perception, while others do not. It should also address the question of why indirect cognitive effects should not count as cases of CP. To address all these concerns, I propose in this chapter a reframing of the problem of CP, according to which a definition of CP should incorporate two factors that until recently have largely been ignored in philosophical accounts of CP, and combine them with the standard account according to which CP occurs if cognition directly affects perception. The first factor is that when someone discusses CP, they should distinguish between early vision and late vision and examine separately the cognitive effects on these stages because cognition may affect them differently. For example, attention may act directly on late vision but it may act on early vision only indirectly (Raftopoulos 2001a, b, 2009). Since the distinction between direct and indirect cognitive effects on perception runs through the entire book, let me define these terms at the outset; I will discuss them in considerable detail later in this chapter. A cognitive effect on perception is direct if it affects the perceptual processes themselves. Otherwise, it is indirect.
Suppose that the role of attention consists merely in selecting the input to be perceptually processed either by focusing on some location, or some object, or some feature. The role of attention ends with the selection of the input and attention does not affect the ensuing perceptual processes. In this case, the cognitive information that guides attention affects perception indirectly since it acts pre-perceptually. Or, consider the case in which cognition intervenes so that judgments concerning the output


of perception are formed. In this case, cognition acts post-perceptually and does not affect perceptual processing; this is also a case of indirect cognitive effects on perception. An equivalent way to express this is to say that when cognitive effects do not emerge as part of perceptual competition, they are not intrinsic to perceptual processing and affect it indirectly (Nobre et al. 2012, 161). Examples of the second kind are the various pre-cueing effects. These effects do not influence perceptual processing online and, thus, the conceptual information carried by the cognitive states is not used by perception. If, in contrast, attention participates in perceptual processing, as when attention determines which iconic information retrieved from the visual scene and stored in visual circuits will be revisited when the perceptual system tests hypotheses concerning the identity of the visual objects to deliver the percept, the cognitive information guiding attention affects perception directly, since cognitive contents guide attention to those parts of the iconic image in which relevant information could most likely be found. The attentional effects on late vision are a characteristic example of this kind of effects. These effects influence perceptual processing by altering the state transformations of which the processes consist; the perceptual processes use the conceptual information that allocates attention. Hence, perception uses the cognitive systems as informational resources and concepts enter perceptual contents. Most accounts of CP assume that not all cognitive influences on perception constitute cases of CP. When cognitive factors affect perception by introducing an external link to perceptual processing, as when they make the viewer refocus attention to some part of the environment and select some other input changing, thus, the percept, this is unanimously considered not to be a case of CP. 
This means that the effects of overt attention that determine focus and select an input do not entail the CP of perception. This class of effects is among the so-called “indirect cognitive effects on perception” because attention in those cases acts to select the input but does not affect the perceptual processes themselves. However, this role of cognition could also be considered to affect perception since it causally affects the percept in the sense that a causal explanation of the percept would have to include the concepts that guided attentional refocus. Why is it, then, that this class of cognitive


effects does not usually count as a case of CP? The answer is that CP is really about the possibility that two viewers form different percepts when they face the same distal stimulus. If peripheral attention chooses some stimulus over some other, the ensuing difference in the percept does not entail that perception is CP. Even though the reason why this should be the case is not clear, it is intuitive to think that cases in which different percepts are formed as a result of selecting different stimuli through attentional refocusing do not really threaten the epistemic role of perception, because any disputes about the percept can be mitigated by simply refocusing attention and selecting another stimulus, thus neutralizing the role of the cognitive effects (Siegel 2013a, 717).1 It seems, therefore, that there is a sort of consensus in the discussions about CP that the effects of cognition on perception involving bodily movements that result in selecting the stimulus to be processed should not be construed as cases of CP, because they do not seem to undermine the epistemic role of perception in justifying perceptual beliefs, since the cognitive effects could be neutralized in some way or other. Adding the condition that when the harmful cognitive effects could be mitigated by, say, asking viewers to refocus attention, these effects do not signify the CP of perception creates a problem, because this condition renders the class of cognitive effects that constitute CP nearly empty since, as I will argue in Chapter 4, the epistemically harmful cognitive effects on perception could be mitigated in most cases of CP, which allows one to escape the constructivist conclusions (Kitcher 2001) usually raised by the CP of late vision.
1 [W]hen prior mental states influence what you look at or attend to, without influencing how things look to you when you see them, the result might seem to be a mere selection effect. If you want the Necker cube or the duck-rabbit to shift, you can make it shift by adjusting your focus to the relevant part of the figure, thereby affecting the contents of your experience.

The difference between the two cases is that the effects on the epistemic status of perception due to the direct cognitive effects on late vision, although addressable, are much more difficult to counteract than a simple refocusing of attention that suffices to solve the epistemic problems posed by the indirect cognitive effects on perception. This is the reason that the former but not

the latter are philosophically interesting. One should distinguish, therefore, between philosophically interesting and philosophically uninteresting cognitive effects on perception. In an attempt to distinguish CP from the effects of attentional selection of the input to be processed, the demand is usually made that cases of CP are those cases in which the cognitive effects on perception operate through purely internal mental links. Introducing, however, the demand that for CP to occur attention should act covertly, that is, requiring that attention act through purely internal links, does not entirely solve the problem of defining CP. Covert attention can also act indirectly, that is, its sole influence on perception could be restricted to just selecting the input. In this case, for the same reasons as with overt attention, the indirect effects of covert attention should also be excluded from entailing CP. A characteristic example is the cognitive influences underlying pre-cueing, which could operate through purely internal mental links. The cognitive influences in this case do not affect early vision processing directly and, also, do not affect the epistemic role of early vision, and, thus, it is questionable whether they should be taken as indications that early vision is CP. Therefore, cognition driving attention that acts through entirely internal mental links may well be a necessary condition for CP, but it is not a sufficient condition. At the same time, covert shifts of attention acting through purely mental links can, and do, directly affect the processes of late vision, rendering late vision CP. To see how these considerations bear on the problem of the CP of perception and the role of attention in it, let us consider Siegel’s work discussed in the previous chapter.
Siegel thinks that attention is restricted to selecting the input and does not affect perception otherwise; consequently, Siegel excludes all attentional effects, be they the results of covert or overt attention, from being cases of CP since they all affect perception indirectly. To assess this claim, one should examine the ways in which attention operates in perception and distinguish between the ways attention affects early vision and the ways it affects late vision because it may turn out that attention affects some perceptual stage directly but affects another stage indirectly, in which case Siegel’s thesis is partly justified but at the same time fails


to explain why in CP cognition affects perceptual processing through attention. The remark made above as to why the indirect cognitive effects that affect perception through bodily movements or the indirect cognitive effects on perception that act internally through covert attention do not usually count as cases of CP and the brief explanation of why this is so bring to the fore the fact that discussions of CP should be linked with the role of perception in justifying perceptual beliefs. As we saw, CP from its inception was thought to be a significant problem for philosophy exactly because it was taken to entail that the epistemic role of perception is seriously flawed. This introduces a second factor that should be taken into account in discussions about CP. To discuss adequately CP and its epistemic effects on perception, one has to examine carefully the way the cognitive states affect perceptual processing. To this end, I distinguish between direct/intrinsic, and indirect/extrinsic cognitive effects on perception and discuss the ways cognition affects the epistemic role of early and late vision, and their impact on defining CP. This is not the first time that this second factor is mentioned with respect to the definition of CP. Raftopoulos (2001a, b, 2009, 2014) argued that the indirect cognitive effects on early vision should not be considered cases of CP because they do not undermine the role of early vision in grounding perceptual beliefs on account of the fact that they do not affect the perceptual processes of early vision themselves. More recently, Stokes (2015) argued that whether some cognitive influence should be deemed a case of CP depends on the consequences of this specific cognitive effect on the epistemic role of perception, a view that he calls “a consequentialist account of CP.” Let us call this criterion for CP, the epistemic criterion for CP. In the first section, I discuss the various definitions that have been proposed in the literature. 
In the second section, I propose a definition of CP that synthesizes the cons and pros of the other definitions. I offer at a first pass a definition along the traditional line that construes CP in terms of cognitive influences on perceptual processing. This definition entails that the indirect effects on perception are not cases of CP. It also entails that direct cognitive effects on perception or some stage of it are clear-cut cases of CP because in these cases cognition affects perceptual


processing itself. This definition does not explain why the indirect cognitive effects on perception should be excluded from being cases of CP. To answer that, I augment the definition to include the clause that cognitive effects entail the CP of perception if they affect the epistemic role of perception in grounding perceptual beliefs. Cognitive effects on a stage of perception that do not, and cannot in principle, affect the epistemic role of that stage do not count as cases of CP. The main reason why this is so is that CP was initially conceived in such a way as to cause epistemic problems for perception. Accordingly, cognitive effects that do not affect the epistemic role of perception do not constitute cases of CP. This definition explains, and significantly expands, Marchi’s (2016, 3) account of CP, according to which a cognitive effect on perception is an instance of CP if it meets two conditions, namely, (i) it involves a cognitive–perceptual relation that is causal, internal, or rational; and (ii) it challenges the epistemic role of perception. As it will turn out, whether cognition affects or does not affect the epistemic role of some perceptual stage hinges on whether cognition does or does not directly affect that stage. This remark does not render redundant the amendment of the definition so that it incorporates the epistemic condition, because the inclusion of the epistemic criterion explains why indirect cognitive effects on perception do not constitute cases of CP by appealing to the reasons that initially gave rise to the problem and the ensuing discussion. The epistemic condition provides a pragmatic justification for the epistemological decision to exclude some cognitive effects from being cases of CP. Whether a cognitive effect is direct or indirect is a property related to the way it epistemically affects perception or a stage of it, and this property has a value that can be pragmatically cashed out in the given dialectical context.
I claim that late vision is CP because it is directly affected by cognition. Independently of the side one takes concerning the problem of CP, it is widely assumed that if cognition affects perception, it does so through cognitively driven attention, imagination, or affective states. If someone holds that perceptual and cognitive states both have conceptual, symbolically structured contents, the answer to the problem as to how cognitive states could interact with perceptual states is easy


since one could at minimum appeal to some sort of inferential relations between the two sorts of states. If, however, someone holds that perceptual and cognitive states are cast in different representational formats, iconic and symbolic respectively, and also thinks that cognition affects perception, an account of this interaction in view of the difference in representational formats should be provided. I belong to this category, and for this reason in the third section of this chapter I attempt to sketch an account of how symbolic contents carried by cognitive states could directly modify iconic perceptual contents. I do this by addressing a powerful objection against the possibility of a direct interaction between cognition and perception raised by Burnston (2017).

2 Assessing the Definitions of Cognitive Penetrability

2.1 Pylyshyn

Pylyshyn (1999, 405) provides a definition of the CI of the content of early vision, relating it to the causal influences of cognitive states on perceptual states. The thesis of the CI of perception, or early vision, is that cognition only affects perception by determining where, to what, and the degree to which attention is focused. It does not in any more direct way alter the contents of perceptions so that they be logically or epistemically connected to the content of beliefs, expectations, etc. Cognition would be present in perception only when the output of a perceptual process could be altered in a way that bears some logical/epistemic relation to some cognitive contents. By not directly affecting the contents of early vision, cognition does not determine the percept. The demand that perceptual contents be logically connected to cognitive contents may be taken to mean that, should CP occur, there must be an inferential relation between cognitive and perceptual contents. This is allowed by the fact that, according to Pylyshyn, both perception and cognition have symbolic contents.


Elsewhere, Pylyshyn opts to discuss CP in terms of representational coherence. Cognitive penetration of perception occurs when “it is an influence that is coherent … when the meaning of the representation is taken into account” (Pylyshyn 1999, 365, fn 3). What Pylyshyn means by ‘coherence’ between contents is not clear, but the weakest and safest interpretation would be that Pylyshyn views CP as an influence of cognitive contents on perceptual contents that alters the content of the perceptual representation. That is, CP occurs when the contents of the states of early vision depend in some way on cognitive contents. The processes of early vision, therefore, should have access to, and operate on, the contents of the penetrating cognitive states; the cognitive contents of the cognitive states are computationally relevant to the perceptual processes of the perceptual states that the cognitive states affect. Pylyshyn (2003, 2007) flirts with the idea that perceptual contents are not representational because they are not conceptual and representations, Pylyshyn thinks, require concepts. At the same time, Pylyshyn (2007, 52) thinks that the contents involved in perceptual states may be “codes for proximal properties involved in perception, such as edges, gradients, or the sorts of labels that appear in early computational vision.” This, in addition to the fact that perceptual contents are not representational, paves the way to the view that perceptual states may not have any contents. This view finds a clear expression in Fodor and Pylyshyn (2015), where it is argued that perceptual states have no contents but, instead, refer directly to worldly states of affairs. It is also argued, against the two-factor theory of concepts, according to which concepts have both reference and sense or meaning, that the only semantic property that perceptual concepts may have is that they refer, directly, to the world. 
This thesis allows Pylyshyn to reintroduce concepts in perception, although this notion of concept is radically different from the notion of concept as used in discussions concerning both conceptual influences on perception and the nonconceptual content (NCC) of perception. Pylyshyn (1999, 343–365) relates the CI of contents to what he calls a semantic relation between the penetrating cognitive and the penetrated perceptual states. Should cognitive states causally affect directly perceptual states, there would be a semantic relation between them, and


the cognitive contents would bear logical/epistemic relations to the perceptual contents. Pylyshyn’s discussion suggests that he thinks the definition of the CP of the contents of early vision to be derived directly from the fact that in cases of CP cognitive states should causally influence directly the states of early vision. “It [cognition] does not in any more direct way alter the contents of perceptions so that they be epistemically connected to the content of beliefs, expectations, etc. (emphasis added).” Thus, the definition of CP in terms of logical or epistemic relations is an application of the causal definition of CP of states when the latter is translated into what it entails for contents. This is an intuitive view. If some cognitive states causally penetrate directly the processes of early vision, the processes that transform the states of early vision from one to another employ the cognitive information expressed in the contents of the penetrating cognitive states. This imbues the contents of the states of early vision with cognitive contents and paves the way for epistemic or logical relations between cognitive and perceptual contents. Pylyshyn examines the psychological and behavioral evidence for the CP of perception and concludes that early vision is CI. Any cognitive influences on perception are confined to those that are realized by modulations from focal attention prior to the operation of early vision, and selection/decision operations, applied after early vision (Pylyshyn 1999, 341), leaving early vision unaffected. I call the cognitive influences that operate pre-perceptually (by determining attentional focus, for example), or the effects of the rigging-up of the FFS, indirect or extrinsic causal influences on perceptual processing on account of the fact that they do not affect perceptual processing itself. They are distinguished from direct or intrinsic causal influences that affect perceptual processing itself. 
If as I will argue in Chapter 3 (see also Raftopoulos 2009, 2015) concepts do not affect early vision by being inherently present in the relevant perceptual circuits, CP obtains when concepts enter the contents of early vision, which occurs when cognitive states directly affect early vision. Thus, only in the case of direct causal influences would cognition influence early vision. Pylyshyn limited the loci of cognitive influence to attention-shifts prior to processing by early vision, and to a ‘decision stage’ which follows early visual processing and does not alter its outputs.


Finally, unlike Fodor’s modularity thesis, which holds that perceptual modules encompass the processes that lead all the way to the triggering of concepts, Pylyshyn’s impenetrability thesis drew the perception– cognition distinction at an earlier stage, namely early vision. Pylyshyn’s account is beset by many problems. I will briefly discuss two concerns here and mention a third. First, Pylyshyn treats logical and epistemic relations between contents as equivalent and also thinks that there is a neat mapping between epistemic/logical relations between contents and causal relations between vehicles. However, this is not the case. To give one example, let us assume that perception has NCC that is CI. As such, it is not causally affected by any cognitive states and, so, there is no causal relation between the two realms. Yet, the NCC of perception plays a role in justifying perceptual beliefs and, thus, justification being an epistemic relation, there exists an epistemic relation between NCC and cognitive contents in the absence of any causal relations. In Pylyshyn’s defense one could point out that Pylyshyn’s demand that there be appropriate epistemic or logical relations between the two sorts of contents if CP occurs is explicitly stated in terms of the existence of inferential relations between cognitive and the perceptual contents. Thus, cognition would influence early vision if there were a semantic or logical relation between the respective contents, by which Pylyshyn (1999, 365, fn 3) means We sometimes use the term “rational” in speaking of cognitive processes or cognitive influences. This term is meant to indicate that in characterizing such processes we need to refer to what the beliefs are about—to their semantics. The paradigm case of such a process is inference, where the semantic property truth is preserved. But we also count various heuristic reasoning and decision-making strategies (e.g. 
satisficing, approximating, or even guessing) as rational because, however sub-optimal they may be by some normative criterion, they do not transform representations in a semantically arbitrary way: they are in some sense at least quasi-logical. This is the essence of what we mean by cognitive penetration: it is an influence that is coherent or quasi-rational when the meaning of the representation is taken into account.


So, what Pylyshyn says is that if CP occurs, there must be some inferential or quasi-inferential relations between cognitive and perceptual contents. In the counterexample I used above to criticize Pylyshyn, such a relation clearly does not hold and, thus, that case does not succeed in refuting Pylyshyn’s thesis. Granting that, another problem emerges if one recasts the discussion of CP in terms of the existence of inferential or quasi-inferential relations between cognitive and perceptual contents, which Pylyshyn calls ‘quasi-logical’. These are the sort of inferences that are involved in drawing conclusions from some premises in the space of reasons. In the space of reasons, inferences provide reasons, the premises, for believing a proposition; the premises of the inference constitute an epistemic ground for the conclusion, and this presupposes: (a) that there is a semantic and a logical relation between the contents of the premises in the inference and the content of the conclusion; (b) that the cognizer draws the conclusion because of the semantic and logical relation, which means that the cognizer operates upon the information provided by the premises and uses the form of the inference to draw the conclusion; and (c) as a consequence of (b), that the cognizer represents implicitly or explicitly both the premises and the rule of inference. It is contestable that there are discursive inferences in the sense defined above either in early or late vision (Hatfield 2002; Raftopoulos 2011; Chapter 5). This has an important ramification for Pylyshyn’s definition of CP. Since (a)–(c) define discursive inference, the claim that there are no discursive inferences in early vision entails directly that early vision is CI, if one espouses Pylyshyn’s (1999) definition of CP, according to which early vision is CP if it operates on cognitive contents and if there is a semantic/logical relation between cognitive contents and the contents of the states of early vision.
If there are no discursive inferences in early vision, it follows that it is CI, and the discussion concerning the CP of vision should end here. The reason that more needs to be said to defend the view that early vision is CI, beyond the claim that there are no discursive inferences in perception, is two-fold.


First, notice that my claim is not only that there are no discursive inferences in early vision, but that there are no discursive inferences in late vision as well, which means that late vision should be deemed CI too; but both Pylyshyn and Raftopoulos deny this. This entails that the criterion of semantic coherence according to which there should be a semantic or logical relation between cognitive and perceptual contents for CP to hold does not do a thorough job in determining whether some perceptual stage is CP or not; late vision is CP and, yet it is possible that there be no quasi-inferences involved, which means that there are no logical relations between the respective cognitive and perceptual contents. It may be a sufficient condition for CP but it is not a necessary condition. The criterion, however, that the processes of a perceptual stage should have access to, and operate upon, the contents of the penetrating cognitive states in order for CP to occur can distinguish between the two cases, because one could argue that the processes in early vision do not use any cognitive information, while, as I argued elsewhere (Raftopoulos 2011) the processes of late vision do use such information. This consideration underlies the second reason for which more work needs to be done to defend the view that early vision is CI. Independent of whether early vision uses discursive inferences or not, in view of the evidence from pre-cueing studies that cognition may affect early vision directly, one must argue on empirical grounds that early vision does not operate upon cognitive contents, that is, that there are no direct cognitive influences on early vision. After all, the processes of early vision may use cognitive information even though they do not use discursive inferences, exactly as the processes of late vision do. 
Finally, if one endorses the epistemic criterion for CP, one should examine whether the various pre-cueing effects on early vision affect its epistemic role. If they do, they should be deemed cases of the CP of early vision, notwithstanding the fact that early vision does not use discursive inferences. Second, conceptual effects on perception entail the CP of perception if the cognitive modulation affects the contents of perception and not only the mechanisms and processes that constitute the vehicles of perceptual states. In other words, one is interested not only in whether the perceptual neural pathways receive signals from higher cognitive


circuits, but also in whether perceptual content is modulated by cognitive states. The relation just described between conceptual modulation and cognitive penetrability is weaker than Pylyshyn’s condition that a system is cognitively penetrable if the function it computes is sensitive, in a semantically coherent way, to the organism’s goals and beliefs, and, thus, can be altered in a way that bears some logical relation to what the person knows (1999, 343). For example, consider this potential case of CP: if judgments about the distance of objects are affected by how much we desire the objects (Balcetis and Dunning 2006), then perceptual content is affected by desire but not, at least in any evident sense in a way that bears some logical relation to what the person knows. So Balcetis and Dunning’s finding would not be a case of CP on Pylyshyn’s definition, but it would be a case of CP because a conceptual state affects visual processing with the result that one gets a different perceptual content.

2.2 Macpherson

Macpherson (2012) defines CI as follows:

[I]f it is impossible for two participants (or one participant at different times) to have two different experiences with distinct content or character on account of a difference in their cognitive system which makes this difference intelligible when certain facts about the case are held fixed, namely, the nature of the proximal stimulus on the sensory organ, and the location of attentional focus on the participant.

Three things in Macpherson’s definition are worth noting. First, the “which makes this difference intelligible” means that Macpherson thinks that in cases of genuine CP there is an explanatory relation between cognitive contents and perceptual contents, in the sense that an explanatory account of some perceptual content implicates some cognitive/conceptual contents. This demand purports to capture Pylyshyn’s view that when CP occurs there is a logical or epistemic relation between the cognitive and the perceptual contents. However, the mere fact that there exists a causal explanation of why a perceptual stage has a certain

100     A. Raftopoulos

content that involves some cognitive factors does not suffice to render the state CP, because the mere existence of a causal explanation does not entail that there is both the appropriate causal relation between cognition and perception and an epistemic relation between cognitive and perceptual contents. We have seen, for example, that most accounts of CP exclude from cases of CP the cases in which attention acting externally introduces an external causal factor in the explanation of the percept. Yet, in these cases, an explanatory account of the percept should include the concepts that drove attentional focus in the first place. The case of ambiguous figures is another characteristic example why the condition that if some cognitive states render intelligible in an explanatory way some perceptual content this entails CP is wrong. Most researchers agree that why an ambiguous figure is perceived in one of the two possible ways is not the result of CP and, yet, the explanation of why some percept was chosen over the other necessarily involves the concepts that drove attention to some part of the figure, which resulted in one organization of the figure being selected over the other (Raftopoulos 2014). Thus, any causal explanation of why a viewer sees, say, a duck as opposed to a rabbit, would have to involve the concepts that guided the viewer’s attention to the part of the ambiguous figure that favors the duck organization of the figure. Some concepts are needed, therefore, to render the percept intelligible, and, thus, according to Macpherson’s definition of CP, this should be a case of CP, a conclusion that is rejected by almost all other accounts of CP. Second, by referring to contents and in conjunction with the first point, Macpherson purports to capture Pylyshyn’s criterion that a perceptual content is CP if it bears an epistemic relation, say, an explanatory relation, to some cognitive content. 
However, if no reference is made to a causal relation between cognitive and perceptual states, some epistemological relation that may exist between the respective contents may be incidental. S1 may believe that P* and S2, totally unrelatedly to S1, may perceive P (where P* is the propositional content that matches the perceptual content P). The two contents stand in an epistemic relation (that of identity or matching) and, yet, one would not wish to conclude that this is the case of CP. For CP to occur concepts

2  Cognitive Penetrability     101

must enter the content of a perceptual state and this takes place only when cognition causally affects directly perception. Thus, a reference to a direct causal relation (such that cognition affects the perceptual processing itself, or, equivalently, a relation such that the cognitive effects are intrinsic to perception), is required in a definition of CP to ensure that any relation between contents is due to the causal role of cognitive states in affecting perceptual states and is not incidental. It should be noted, however, that for Macpherson (2012) in CP cognitive contents permeate perceptual contents, which entails that implicitly Macpherson has in mind a causal relationship between perceptual states and contents in order for CP to occur. It follows that, to the extent that CP is about whether cognitive contents affect perceptual contents, the relation sought might be specified as a relation of information exchange between the two systems. If the perceptual system uses conceptual information there is a flow of information from cognition to perception and, hence the former penetrates the latter. Then, the formation of the percept becomes intelligible because it is clear that the perceptual computations that lead to the formation of the percept use this semantic information to complete their task. Here Macpherson meets Wu’s (2013, 2017) and Raftopoulos’s (2009) main prerequisite for CP. Third, pre-cueing and perceptual learning could cause differences in experience even when the same object is attended to. These two factors indicate differences in the cognitive system and they cause differences in perceptual content and, hence, render the difference intelligible. Therefore, pre-cueing and perceptual learning should indicate the CP of perception. However, we have good reasons to think that they do not signify CP. Indeed, in Chapter 3, I explain why perceptual learning does not necessarily entail CP, and argue that neither does pre-cueing. 
I think that the main problem with Macpherson’s account of CP is that it fails to distinguish between direct cognitive effects on perception, which affect perceptual processing itself and thus allow concepts to enter into perceptual contents, and indirect cognitive effects, which do not affect perceptual processing itself, even though Macpherson thinks that when CP occurs cognitive contents enter perceptual contents.

2.3 Stokes

Stokes (2012) offers another definition of CP. A perceptual experience E is CP if and only if (1) E is causally dependent on some cognitive state C, and (2) the causal link between E and C is internal and mental.

The causal clause in (1) establishes that the relation between cognition and perception where CP occurs is a causal relation and not just an explanatory relation; if C did not occur, E would not occur. Stokes thus talks of cognitive and perceptual states and discusses CP as a (causal) relation between states and not as a relation between contents. The condition that the cognitive state C cognitively penetrates perceptual state E if there is a causal relation between them ensures that any relation between contents is not incidental. As we have seen, the qualification ‘mental’ in (2) is meant to rule out indirect effects of cognition on perception that one would not wish to count as instances of CP, such as when attention fixates on some external area in the visual field; this relation between cognition and perception is not internal/mental and is not a case of CP. Thus, the definition ensures that cases of some overt or covert action antecedent to the experience do not count as CP. Stokes attempts to capture the notion of ‘indirect effects on perception’ and exclude cases in which cognition indirectly affects perception from counting as cases of CP. Stokes’ reference to an internal link and the example he discusses purport to exclude the effects of focal attention from being instances of CP. The rationale for this is that by focusing the gaze on some external location, spatial attention breaks the mental internal link that is required by CP. As Stokes acknowledges, the qualification ‘mental’ in (2) cannot rule out some indirect effects of cognition on perception that one would not wish to count as instances of CP. The reference to mental internal relations as a condition for CP allows one to count as CP, or at best leaves it unclear whether one should count as CP, some cases of cognitive effects on perceptual processing that are internal, but which we
have reasons to doubt that they constitute genuine cases of CP. These are the cases of pre-cueing by spatial and object/feature-based attention that rig up the FFS, which affect by direct mental links what happens internally in perception (I discuss this problem in detail in Chapter 3). Perceptual processing, thus, can be CI even if there is a causal internal link between cognition and perception, as in the case of pre-cueing. As we shall see, what matters for CP is not whether there exists an internal mental link but whether the link is intrinsic (that is, a part of the perceptual competition), or extrinsic (that is, it takes place before or after the competition). Internal mental effects that are extrinsic to perceptual processing may or may not be instances of CP and this depends on whether such extrinsic effects meet the epistemic criterion for CP. Thus, the existence of an internal, mental link is not a sufficient condition, although it is a necessary condition, for CP. More recently, Stokes (2015) has argued that the CP of perception should be understood in terms of its consequences. This consequentialism about CP captures what is important in all discussions of CP, namely, the consequences of CP for the epistemic role of perception, theory-ladenness of perception, rationality in science, constructivism, etc. According to Stokes, any analysis of CP should be constrained by its consequences. Therefore, an adequate account of CP will describe a phenomenon (or class of phenomena) that has implications for the rationality of science, the epistemic role of perception, etc. Stokes (2015) calls this the consequentialist constraint on analyses of CP. Stokes proposes a disjunctive consequentialism, which says that ψ is CP if and only if ψ is a cognitive–perceptual relation that entails consequences for theory-ladenness or the epistemic role of perception.
It should be noted that even though the original considerations, to which Stokes points and which we discussed in the previous chapter (and will revisit shortly), concerning the epistemic impact of CP presupposed that CP undermines the epistemic role of perception and, thus, that it has harmful effects, Stokes (2015, 88) is careful to note that on certain occasions CP and the theory-ladenness it induces may be beneficial for perception rather than harmful. It follows that for Stokes, CP occurs when the cognition–perception relation that obtains affects the epistemic role of perception and not when this relation downgrades
perception, a view with which I fully agree. Thus, I concur with Stokes that a consideration that should be taken into account in determining whether some causal influence on perception should count as CP is the examination of the effects of the cognitive influences on the epistemic role of perception (Raftopoulos 2006, 2009, 2015; Raftopoulos and Zeimbekis 2015). Cognitive effects on perception that do not, and cannot in principle, affect its epistemic role in any way should not count as cases of CP. There is, however, a difference between Stokes’ account and mine. Stokes (2015, 88) writes:

Questions remain here, of course, about whether this theory-ladenness would be epistemically good. By many criteria for epistemic normativity, theory-ladenness remains an epistemic bad even if it sometimes produces good results. So, for example, just as unreliable belief-forming mechanisms can sometimes produce true beliefs, theory-laden observation may occasionally produce more efficient or accurate perceptual observation. But a true belief formed by unreliable mechanisms is in no way justified by the mere fact that it is true. Analogously … theory-ladenness is generally epistemically pernicious even if it sometimes improves observation and testing…. [T]here are open questions about the epistemic status of theory-laden observation. What is clear, nonetheless, is that if observation is theory-laden, this is of epistemic import.

The difference stems from Stokes’ belief that when CP occurs perceptual mechanisms are rendered unreliable despite the fact that on some occasions CP may benefit perception. This entails, for Stokes, that CP is generally epistemically pernicious for perception. I have argued, and will argue again in the next chapters, that late vision is CP and that this is a ubiquitous phenomenon and not something that happens occasionally. If the presence of CP entailed in general that the processes that are affected by it form an unreliable mechanism, that would mean that the output of perceptual processing would always be epistemically suspect, which is the main conclusion drawn by those philosophers who started the discussion concerning the philosophical consequences of CP.

Following Stokes’ thought, one inevitably falls back to the relativistic conclusions that led to epistemological and semantic constructivism (Kitcher 2001). One way to block these conclusions is to argue that perception as a whole is CI, but this would be unwise since the empirical evidence for the occurrence of cognitive influences on perceptual processing at some latency is overwhelming and, moreover, Stokes agrees with this view. Another way to block the unwelcome conclusion, a way that takes into account the fact that late vision is CP, is to argue that the cognitive effects in late vision, far from rendering it an unreliable mechanism, are an indispensable factor in its functioning (the function of late vision is to categorize and identify the objects in a visual scene) and as such they cannot be viewed as introducing any sort of unreliability. Reversing, thus, Stokes’ conclusion, CP does not render late vision in general epistemically pernicious; quite to the contrary, late vision could not perform adequately its function without CP. It is also true that on some occasions, the CP of late vision downgrades perception; this is the case in which the harmful effects of theory-ladenness surface. Despite this disagreement, I concur with Stokes’ conclusion that ‘if observation is theory-laden, this is of epistemic import’. Finally, with regard to Stokes’s (2017) views on the role of attention in CP, it is interesting to discuss his view that the effects of cognitively driven attention on experience may violate the directness condition for CP (Stokes 2017, 10–11). Stokes presents some forms of attentional effects in perception, similar to those discussed in Raftopoulos (2009) and Raftopoulos and Muller (2006), that entail that perception is CP and proposes the scheme for CP through attention (b) ‘Cognitive state → Nonagential selective attention → Perceptual Experience’ (Stokes 2017, 6). 
Stokes also mentions the directness criterion for CP, which he calls the vehicle criterion and attributes to Raftopoulos and Zeimbekis (2015), according to which genuine cases of CP occur when perception draws on cognition as an information resource. Then he goes on to discuss the possible objection that the directness criterion may seem to be violated by scheme (b). Three preliminary remarks are in order. First, this criterion was proposed independently by Raftopoulos (2009, 2015) and Wu (2013) but its origins lie in Pylyshyn’s work on CP and specifically on his demand
that CP occur when there is a semantic coherence between the contents of the affecting cognitive states and the contents of the affected perceptual states (the two accounts are not equivalent, of course, but the origin of the idea clearly lies in Pylyshyn’s work). Second, the term ‘vehicle criterion’ is unfortunate as the directness criterion specifically mentions the use of cognitive contents by the computations underlying perceptual processes, in a way that results in the cognitive contents entering perceptual contents and potentially becoming phenomenologically relevant. Third, Stokes (2017, ft 3) claims that his account differs from that offered by Raftopoulos (2001a) and Wu (2017) on account of the fact that the latter are concerned with the role of attention in affecting the computational processes of perception, while his account is about attention affecting perceptual experience. Although it is true that both Raftopoulos and Wu focus on the cognitive effects on perceptual computations as they are mediated by the attentional modulation of these computations, this does not mean that this account is orthogonal or irrelevant to the way cognition affects perceptual experience. Surely Stokes would agree that perceptual experience is subserved by some perceptual computations and, therefore, cognitive influences on perceptual computations are directly involved in discussions of the CP of experience. Let us return to Stokes’ discussion concerning whether scheme (b) violates the directness criterion. His reply is to accept the criticism that the scheme violates the directness criterion and to respond by claiming that the criterion should not be required for CP. Stokes’ account is problematic. First, he does not offer any reasons why scheme (b) violates the directness criterion. 
I argued above that the role of attention in late vision renders late vision CP on account of the same reasons that Stokes invokes, exactly because it makes clear that perceptual processes draw on cognitive information; so contrary to Stokes’ claim, it is because the cognitive factors guiding the attentional effects are such that they affect perception directly that the role of attention entails the CP of some perceptual process. Second, his reply to this possible objection is that direct cognitive effects are only one class of cognitive effects that entail CP, and that there are other classes as well that, despite not being direct, still affect the epistemic role of perception and, thus, should be deemed
cases of CP. As we have seen, and will see again at the end of this chapter, I agree, with some qualifications, with the consequentialist or epistemic criterion for CP. Contrary to Stokes, however, I also think that it is inextricably linked with the directness criterion. The reader will have to wait until the end of this chapter and until Chapter 3 for an explanation of this tight relationship.

2.4 Siegel

Siegel (2011, 5–6) defines the CP of visual perception as follows:

If visual experience is cognitively penetrable, then it is nomologically possible for two subjects (or for the same subject in different counterfactual circumstances, or at different times) to have visual experiences with different contents while seeing and attending to the same distal stimuli under the same external conditions, as a result of differences in other cognitive (including affective) states.

Siegel (2013b, 240–241) distinguishes between CP and attention effects that cause selection of some features as opposed to some others. When cognitive penetration occurs subjects see, say, some pliers (the distal object) but they look to them (the percept) like a gun when they are given an appropriate prime; cognition affects the way things look. A different way to explain the percept is that the prime generates a selection effect, which can influence either the content of experience, or the role of experience.

[W]hen prior mental states influence what you look at or attend to, without influencing how things look to you when you see them, the result might seem to be a mere selection effect. If you want the Necker cube or the duck-rabbit to shift, you can make it shift by adjusting your focus to the relevant part of the figure, thereby affecting the contents of your experience. (Siegel 2013a, 717)

Siegel claims that cognitive effects on perception signify CP only when they affect perceptual processing itself and not when they affect
perception otherwise, as when they determine attentional focus. I agree with Siegel’s view that cognitive effects on perception entail that perception is CP only if these effects affect the perceptual processing itself (I have called effects of this sort direct or intrinsic cognitive effects on perception). This is in line with almost all accounts of CP that exclude from being cases of CP all those cases in which a cognitive state causes an external action on the part of the viewer thereby changing what the viewer perceives. According to most definitions of CP, a cognitive state C1 cognitively penetrates a perceptual state P1 when C1 partly causes P1 in an internal way; if the causal chain partly takes place externally to the viewer, this is not CP. There are, however, problems with Siegel’s account when it comes to covert attention (Mole 2015; Raftopoulos 2009; Stokes 2017; Wu 2017). First, the ‘attending to the same distal stimuli’ should be amended to ‘attending to the same distal stimuli or to the same part of the stimulus’ to accommodate the role of cognition in the perception of ambiguous figures. Second, if Siegel thinks that all attentional effects are excluded from signifying cases of CP, as the second definition seems to entail, things are not as straightforward. Siegel seems to presuppose that all forms of attention act externally to mental processes and affect what one looks at or attends to, that is, attention determines the location or the objects/features that are selected. After that, attention does not affect perceptual processing and does not influence the way things look. Siegel (2011) calls selection effects of this sort ‘global selection effects’, and also describes them as the selective mode in which attention affects perception (Siegel 2016). Siegel thinks that by controlling attention one ensures that perceptual processing itself is not affected by attention and, thus, that attention does not entail the CP of perception.
Instead, one should look for other cognitive influences on perception, such as the effects of object recognition, or for affective effects in order to establish the CP of perception. The problem with this account is that it excludes all attentional effects from counting as cases of CP on account of the fact that attention always acts externally to perception. Attention, however, does not act only externally; covert attention can also affect perceptual processing itself.

First, pre-cueing, in which attention or expectations direct focus on some location, or enhance the baseline activations of the neurons encoding the relevant information be it a feature or an object before the presentation of the stimulus, seems to affect the internal ongoings of perceptual processing as well, making it necessary to examine the nature of these effects to determine whether they entail the CP of perception. Second, late vision processes are internally and directly affected by cognition through cognitively driven attention, and are characteristic cases of CP perceptual processes. This means that late vision is CP because of the role of cognitively driven attention. This is not a mere appeal to covert attention, because covert attention can act so as to select locations or objects/features and, thus, its action can be a mere selection effect. It invokes the fact that in visual processing, in order for the percept to be formed in late vision, hypotheses about the identity of the objects in the visual scene should be constructed and tested against the visual information contained in the iconic image. This whole process inherently involves attention because it is through attention that the information crucial for this testing and contained in the iconic image is highlighted, that is, has the activation of the neurons encoding it enhanced. Why does attention enhance the activity of some neurons in the visual cortical regions during late vision? Clark (2013) argues that to perceive the world is to use what you know to explain away the sensory signal across multiple spatial and temporal scales; the process of perception is inseparable from cognitive processes. The aim of this interplay is to enable perceivers to respond and eventually adapt their responses as they interact with the environment so that this interaction be successful. 
Success in such an endeavor relies on inferring correctly (or nearly so) the nature of the source of the incoming signal from the signal itself. The cognitively driven direct attentional effects in late vision help test hypotheses concerning the putative distal causes of the sensory data encoded in the lower neuronal assemblies in the visual processing hierarchy. As we shall see in Chapter 5, in late vision the visual system builds, on the basis of visual information extracted from the environment in synergy with prior knowledge about the world and its objects, hypotheses concerning the identity of an object in a visual scene.

This hypothesis is then tested against the rich iconic information that is extracted from the visual scene and is stored for a limited amount of time in the early visual areas of the brain, i.e., the iconic image. In this case, the formed hypothesis guides the visual system to search for clues that confirm it or disconfirm it, by directing attention so that the perceptual salience of diagnostic features relevant to the confirmation process be increased, which is accomplished because attention increases the activation of the neurons encoding these diagnostic features, or, alternatively as we shall see in Chapter 3, attention may act so as to suppress noisy neural activity rather than to increase the activity of the neurons that encode the information contained in the pre-cueing signal (Hegde and Kersten 2010; Murray et al. 2004). What I have described, which is typical of late vision, is a clear-cut case of CP since the cognitive effects influence perceptual processing. To use Siegel’s distinction between global and local effects of attention, attention does not act only by making a global selection among different stimuli or parts of the environment, thereby determining which visual scene will be perceived. It also performs a local selection, as it were, where it selects information from the iconic image or stimulus that confirms or disconfirms a hypothesis concerning the identity of an object in the visual scene. Both spatial attention and feature/object-centered attention may perform local selection, whereby some information from the iconic image is selected for further processing at the expense of other competing information in the iconic image. Third, the effects of expertise, familiarity, and cases of perceptual learning that facilitate object recognition and Siegel posits as possible causes of CP are mediated by a top-down control of perception by cognitively driven attention (see Chapter 3). 
Consider a case of expertise as a result of perceptual learning in which the belief that, say, a snake is close by leads viewers’ attention to the relevant clues in their iconic image, which in this way pop out and stand out in an attempt to confirm or disconfirm the hypothesis that the object is a snake; this is one way in which perceptual learning facilitates object recognition. A belief, the hypothesis concerning the identity of an object in the visual scene, directs attention to the iconic image and selects some information and
de-selects some other.2 Furthermore, the causal process is entirely internal to the viewer and also changes the way things look to the viewer since this process forms the percept. In these cases, attention acts internally and affects the way things look. It is, thus, a typical case in which attention makes perception CP, except that in this case CP does not epistemically downgrade perception but enhances its epistemic role. Thus, attention does under certain circumstances signify the CP of perception, because attention does not operate only externally to perceptual processing. Hence, it would be wrong to exclude attention from entailing CP. Why does Siegel wish to distinguish between attentional effects that perform a global selection and CP? The point she seems to be making in discussing the ambiguous figures is that when mere selection effects occur they determine what you look at or what you attend to but they do not affect perceptual processing and how things look, and CP is supposed to cover cognitive effects that change the way a scene looks. According to Siegel, the fact that attention acts externally to perceptual processing allows merely refocusing attention to change the percept. This means that two viewers who see the same ambiguous figure and form different percepts could resolve matters pertaining to the content or meaning of their experience just by refocusing because in this way one will get to see what the other previously saw. Underlying Siegel’s account is, probably, the assumption that if some cognitive influences on perception are best interpreted as involving a shift in attention, which introduces an external link, these cases do not raise the same concerns about the epistemic role of perception as CP does, and, thus, they should not be deemed instances of CP. So, it seems that Siegel’s account of CP relies implicitly on consequentialist consideration of the sort invoked by Raftopoulos and Stokes. 

2 Tucker (2014, 37) raises a similar concern to point out that attention may cause CP, except that he talks of ‘mental image’ instead of ‘iconic image’ as I do here. Although the iconic image is a mental image, in Chapter 6 I explain why I prefer not to talk about mental images in cases of perception.

As we have seen, however, Siegel acknowledges that attention, even if it acts externally, may downgrade perception. This means that she probably thinks that the downgrade
owing to the external attentional effects could be alleviated more easily than the downgrade due to genuine CP. In fact, if one compares the countermeasures she proposes to alleviate the epistemic downgrade of perception due to CP in Siegel (2011) and the countermeasures she proposes to alleviate the epistemic downgrade of perception due to global selection (Siegel 2016), one cannot but agree with her. (In Raftopoulos 2006, 2009 I made a similar argument and discussed an example to show that the bad consequences for perception of the cognitive influences that act externally to perception can be alleviated in a relatively straightforward way, but Siegel digs deeper than I.) Consider two viewers who attend to the same distal object. The first recognizes the object, while the second does not. Siegel rightly thinks that the experiential contents of the two viewers will be different on account of the role of object recognition on the part of the first viewer, although I disagree with her views about the specific phenomenological content of the experience since I do not think that ‘kind’ figures in the phenomenological content of the experience but is part of the representational content of the experiential state. Object recognition changes perceptual content because it directs attention in such a way that different aspects of the object are highlighted and noticed. The other viewer probably either misses these aspects or they do not stand out as strongly to her. This explanation is not available to Siegel because she thinks that the role of attention has been exhausted in fixing the location and the object of sight. Since the distal object is the same for both viewers, attention is being neutralized and plays no further role in perceptual processing. Hence, Siegel has no way to explain the way object recognition acts on perceptual processing. As I have argued, the role of attention far exceeds that of fixing the location and the object of sight. 
Attention modulates a significant part of perceptual processing rendering it CP and causing any differences in experiential content that may exist, even if location and object are held constant. It is interesting to analyze Siegel’s (2013b, 240–241) discussion of a pair of pliers that is mistaken for a gun when subjects are primed by pictures of black men, which creates the belief that a gun is likely present. Siegel proposes several ways in which the prior belief that a gun is present could affect the percept. The first way is through cognitive
penetration whereby subjects see the pair of pliers but they look to them like a gun when they are given the black primes. A different way is that the prime generates a selection effect, which can influence either the content of experience, or the role of experience. In the case of selection of features, subjects see the pliers, but due to the prime they attend to and select only those features that pliers share with guns. In this case, the subjects have neither a pliers experience nor a gun experience, but rather an experience with more impoverished content, and they form the conclusion that it is a gun. Siegel thinks that only the first case is a case of CP. Let us consider the first case, which for Siegel is a case of CP. The belief that there is a gun that viewers form owing to the black prime penetrates perception and makes them see a gun instead of the set of pliers. How does this occur? How does the prior belief make viewers see a gun even though they face a pair of pliers? As I argue in Chapter 5, objects are recognized in late vision where tentative hypotheses concerning their identity are formed and tested against the rich iconic information retrieved from the visual scene and stored in visual areas (the iconic image). The hypothesis that passes the tests determines the percept. In our case, therefore, the viewers form first in late vision the implicit hypothesis that the object is a gun. Then they search the iconic image for a confirmation of this hypothesis, which they find because the pair of pliers shares some features with guns and, therefore, these features are present in the iconic image, and because any recalcitrant information is rejected. This is accomplished through the role of spatial and feature-centered attention, driven by the prior belief, that guides a search for characteristic features of guns. 
What the visual system effectively does is to search for and select those features in the iconic image of an object that will allow it to confirm the gun hypothesis. This is, however, the second, ‘selective’ way that Siegel describes, the only difference being that the selection takes place not in the environment but in the iconic image. This means that the only way the prior belief can penetrate perception and cause the experience of an object is through cognitively driven attention guiding feature selection from the iconic image. The two different ways that Siegel describes, therefore, are not two different ways a belief may affect perception; they describe, rather, the

114     A. Raftopoulos

cognitive penetration and the way it takes place, to wit, through attention guiding feature selection. In other words, the phenomenon of CP that Siegel describes in the first way is explained by a selective mechanism that figures in the second way and is applied to the iconic image, as there is no other way the prior belief could affect perceptual processing. Siegel is also wrong to assume that in the second case the viewers jump to the conclusion that the object is a gun and that they have neither a pliers experience nor a gun experience but, instead, an experience with more impoverished content. The subjects do not have a neutral, as it were, experience between a gun and a set of pliers and conclude that they see a gun; they experience a gun, an experience correlated with a hybrid perceptual state formed in late vision consisting of the visual characteristics of a gun selected from the iconic image, Siegel’s impoverished content, as they are transformed and highlighted by attention (because attention may alter some of the visual properties of objects), and the conceptual information about guns. The impoverished image corresponds to an object with the configuration of a gun that a viewer could form even if she did not possess the concept GUN; it is a gun-like experience that is formed pre-conceptually and pre-attentionally in early vision. The impoverished image, thus, is the output of early vision but it is not the percept, as Siegel contends. The percept is formed in late vision and its content differs from the impoverished image retrieved by early vision. More importantly, in late vision objects are recognized as such and, therefore, the percept always concerns, under normal viewing conditions, an object categorized in a certain way. Thus, a viewer sees either a set of pliers or a gun but not something in between.
It is possible, though, that the evidence present in the iconic image does not suffice to determine unequivocally the identity of an object, in which case two conflicting hypotheses are equally confirmed and the perceptual system delivers both objects and the viewer judges which one to choose. Or, in some other cases, viewers may end up wondering whether they see a gun or a set of pliers. Concluding the discussion on Siegel’s views about CP and attention, one could say that Siegel’s thesis on the role of attention in perceptual processing belongs to the class of views that, according to Watzl (2017, 157), attempt to deflate the role of attention in perceptual processing by restricting it to a mere selection of input that will enter consciousness.

Maybe attention is like opening your eyes. It enables conscious experiences. Or brings a stimulus to consciousness … We can call a view that treats attention as a causal antecedent of conscious experience a deflationary (emphasis in the text) view of the phenomenal contribution of attention. According to a deflationary view you are conscious of everything you attend to. And you are not conscious of anything you do not attend to. And that is all there is to be to the phenomenal contribution of attention. (Watzl 2017, 157)

Watzl’s discussion puts emphasis on the role of attention for deflationists in bringing attended items to consciousness, but it also shows that for deflationists, attention acts merely to select the perceptual input. So, attention is just a causal antecedent not only of consciousness but also of the perceptual processes, since it acts only externally to them, in the sense that after attention has determined the input it does not affect perceptual processing in any other way.

2.5 Wu

Wu (2013) recasts the discussion about CP/CI in terms of the informational encapsulation of the perceptual system from the cognitive system. Failure of informational encapsulation is entailed by the following three conditions when conjoined: (1) Internal Causal Link: S’s visual experience V with content p is causally dependent on a non-visual system Y via an internal causal link; (2) Computational Condition: The influence of Y on V(p) makes the visual content intelligible owing to the computations that underwrite p using information in Y as a resource; (3) No Explanatory Defeaters: the resulting p is not explained by changes in (i) the proximal stimulus, (ii) the state of the eyes, (iii) the locus of attention.

Wu’s main contribution lies in recasting the discussion on CP in terms of informational encapsulation, which I think is a useful tactic. As I have explained, however, in discussions about CP what matters is that the effects on perception be cognitive and not from some other non-cognitive source, because in the latter case perception would be informationally non-encapsulated and, yet, CI. For this reason, I assume that Y is a cognitive system and that the information affecting perception is conceptual. Wu is right that to establish that the content p of V is penetrated by information originating in Y it is necessary to ensure that the computations that support V are able to exploit Y as a representational resource, that is, they are able to use the (conceptual) information in Y. Another advantage of Wu’s definition is that it entails that late vision is CP since its processes use conceptual information. Notice that the condition of intelligibility figuring in Wu’s account is different from Macpherson’s similar condition, because Wu demands that the information deriving from Y be used by the computations underwriting P, which means that some of the concepts contained in the cognitive information find their way into the perceptual content. Macpherson’s demand, on the other hand, simply states that a causal explanation of the content of the perceptual state includes the cognitive information, which is compatible, as we saw in discussing ambiguous figures, with the cognitive contents determining attentional focus without being used by the perceptual processes that analyze the ambiguous figure. A problem with Wu’s definition is that his conditions do not exclude from being a case of CP the effects of pre-cueing. 
Pre-cueing causally affects perceptual processing through an internal mental link and, in some sense, the perceptual processes do exploit cognitive information (this information sets the initial values of the parameters), which, thus, contributes to making perceptual content intelligible since the cognitive information figures in a causal explanation of the percept. Thus, Wu’s definition allows that at least some of the extrinsic mental effects on perception be instances of CP. As I will argue in the next sub-section, however, one of these effects, namely pre-cueing, does not entail CP.

3 A New Definition of CP

Before I offer this definition, let us be reminded of the nature of the various cognitive effects on perception that has been underlying our discussion on CP thus far. The top-down cognitive effects on perception can be broadly categorized into two classes. The first concerns the effects that emerge as a part of perceptual competition and, as such, are intrinsic or direct to the perceptual processing. The second class concerns the effects that do not emerge as part of perceptual competition and, thus, are external to perceptual processing even though they causally affect it. Let us call them ‘extrinsic’ or ‘indirect effects’. The intrinsic/direct cognitive effects on perception exemplify how perceptual processes are altered as a result of cognitive influences; the cognitive influences alter the computations performed by the affected perceptual processes. This sets a condition for a definition of CP.

Directness Condition for CP: Visual processes that are intrinsically or directly, in the sense explained above, affected in a top-down manner by cognitive states are CP.

It follows immediately that late vision is CP owing to the fact that it is intrinsically modulated by cognition. There is abundant evidence that the contents of the states of late vision, including the phenomenological, i.e., personal-level contents, that is, the contents that present the world to the viewer as being in a certain way, are affected by attention. Since attention is the means par excellence through which cognition affects perception, it follows that cognition affects the contents of the states of late vision. Furthermore, late vision involves the testing of hypotheses concerning the identity of the visual objects and both the formation of the relevant hypotheses as well as the testing processes are guided by cognition in the form of stored knowledge about the world.
One could object that this criterion is too narrow in that it restricts CP only to top-down cognitive effects on perception, whereas perception may be cognitively affected from within since it is possible that concepts figure inherently within perception, and conceptually modulate
in a direct way perceptual processing from its inception. To address this concern, one should expand the Directness condition as follows: Extended Directness Condition for CP: Visual processes that are intrinsically or directly, in the sense explained above, affected either in a top-down manner or from within by cognitive states are CP. Let us define CP with a view to keep from the definitions that have been proposed thus far those parts that have escaped criticism. CP revisited: A cognitive state C cognitively penetrates a perceptual state P when C partially causes P, and the causal chain from C to P is (a) mental and internal in the sense that it is contained entirely within the subject; (b) C does not act so as to merely select the input for P; (c) C affects the perceptual processes that lead to the formation of P in the sense that these processes use information contained in C. The information contained in C is used by the processes that issue P in an online manner, that is, it is used during the course of the processes underwriting P and it does not simply fix the values of some parameters that figure in the state transformations in which the processing in P consists. It follows that when C penetrates P, the conceptual contents of C (or a subset of them) enter the contents of P; (d) C may affect P in a top-down manner, or C may be imbedded in the processes that issue P. (e) The cognitive effects on perception should be such that if perception is CP, it is nomologically possible for two viewers (or for the same viewer at different times and circumstances), to have perceptual states with different contents while seeing the same distal stimuli under the same external conditions. 
This definition excludes from constituting CP cases in which attention makes one viewer focus on one part of the surrounding environment (or on a part of the input) and another viewer to focus on another part of the environment (or another part of the input), which results in selecting different inputs (or in organizing the input differently),
and forming, thus, two different percepts, independent of whether the attention at work is overt attention that involves eye or body movements (this sort of attentional effect introduces an external factor in the causal chain of influence that excludes it from being a case of CP) or covert attention that is not associated with such external movements but, still, could be viewed as introducing an external factor in the sense that attention acts so as to select the distal, external stimulus. These are the cases in which, to use Siegel’s (2011) term, attention performs a global selection from among many possible stimuli. Thus, the definition excludes from being CP some cognitive effects on perception that operate through the effects of covert spatial attention. In addition to being internal, the penetration must be such that the percepts may nomologically differ when the distal stimulus is the same, a condition that the demand by itself that the causal influence be internal to the viewer does not satisfy owing to the possibility of attention acting in a covert way and selecting different distal stimuli. CP is in general the influence of cognitive (including emotive) states on perception under certain conditions. This entails that cognitive states partially cause a perceptual state, where the causal chain is internal to the viewer. The condition that the causal chain be internal to the viewer is sometimes thought (Siegel 2011, 2013a, b) to exclude cognitive effects mediated by attention, whether it be spatial or object-centered, from being instances of CP. This, however, is wrong since the attentional effects on late vision are internal and ubiquitous and clearly affect perceptual processing, partially causing a perceptual state. Most of the attempts to define CP try to deal with the role of attention in perceptual processing and with the problem of the extent to which attentional influences entail the CP of perception.
Furthermore, all these definitions of CP assume that attentional effects on perceptual processing are homogeneous in the sense that they affect perceptual processing in the same way. Moreover, once the distinction between overt and covert attention has been made, all cases of covert attention are treated as if they share the same nature. As a result, the definitions that we examined cannot distinguish between cognitively driven pre-cueing effects and cognitively driven, top-down effects on perception and either treat all of them in a way that their operation does not entail CP because attention does not entail CP, as in Siegel’s
definition, or they are all treated so as to entail CP, as in other definitions. Both the assumption that attention acts the same way throughout perceptual processing and the assumption that all attentional effects have the same nature are wrong and this has vitiated the attempts to define CP adequately. The proposed definition handles well the case of pre-cueing. In pre-cueing, as I argue in the next chapter, attention acts internally not by selecting directly the distal stimulus, as does the covert attention discussed above, but by causing preparatory effects. In this case, the cognitive factors underlying the attention that drives pre-cueing do not affect the processing of early vision but only the values of some parameters; as I will argue, pre-cueing belongs to the indirect cognitive effects on early vision. Things, however, differ with respect to late vision, because pre-cueing affects late vision directly by altering the perceptual processes occurring in late vision and, thus, pre-cueing renders late vision CP. Why should pre-cueing, as it affects early vision, and the other indirect cognitive effects on perception, such as the effects of attention in global selection, be excluded from being treated as cases of CP, as stipulated in almost all the definitions of CP, including the definition I offered above? In the introduction to this chapter, I suggested that the reason why the indirect cognitive effects on any perceptual stage should not be considered as cases of CP is that by not affecting perceptual processes themselves, they do not affect the epistemic status of perception in a pernicious way, in the sense that their effects could be easily alleviated simply by asking viewers to refocus attention, which results in their seeing the same thing given the same stimulus and under the same viewing conditions.
If, as Stokes and I think, a cognitive effect on perception should count as CP only if this effect affects the epistemic role of perception, indirect cognitive effects are not cases of CP. This imposes a second condition that an adequate account of CP should fulfill. Epistemic Condition for CP: If perception (or a stage of it) is cognitively influenced in a way that either renders it unfit to play the role of a neutral epistemological basis by vitiating its justificatory role in grounding perceptual beliefs, or enhances its epistemic status, perception (or a stage
of it) is CP. If perception (or a stage of it) is cognitively influenced in a way that does not affect its epistemic role it is CI.

Notice, first, that this is not a necessary condition for CI, which means that a perceptual stage can be CI even if attention affects in some specific way its epistemic status. The reason is that some indirect cognitive effects may downgrade perception, but their effects are not pernicious, because they could be easily alleviated, and for this reason they are not considered to be cases of CP. Covert attention, for example, when it gives priority to some objects in a visual scene may affect the epistemic role of perception by marking them for preferential processing during late vision, but since its effects are easily countermanded, its role does not entail that perception is CP. Conversely, if some cognitive effects do not influence the epistemic role of a perceptual stage, this stage is CI. As we shall see, covert attention shifts do not affect early vision, making early vision CI. The motive underlying this criterion is that if some cognitive effects on perception fail to affect perception’s epistemic role, they are epistemologically uninteresting and, certainly, discussions on the CP of perception purport to make some epistemologically interesting claims. According to the epistemic criterion, to determine whether some cognitive effects constitute genuine cases of CP, one should examine the extent to which they affect the epistemic role of perception or of a stage of it. To understand better what is at stake with the idea that perception is CP, one should go back to when the discussions about the CP of perception started. As I wrote in Chapter 1, several philosophers interpreted findings in psychology and neuropsychology as showing that cognitive states involving propositional/conceptual contents affect perception and this was used as a springboard to mount an attack on the received view in the philosophy of science that there is a theory-neutral observational basis on which a rational choice for empirical adequacy between competing theories could be made.
The main motive, therefore, underlying discussions of CP was that CP was thought to undermine the epistemic role of perception in grounding perceptual beliefs, that is, to undermine the extent to which experience could justify some belief. It follows that a cognitive influence on perception is a case of CP if it undermines the
epistemic role of perception. As our discussion thus far shows, for a cognitive effect to undermine the epistemic role of perception it must do it in such a way that its effects are not alleviated simply by refocusing attention, whether it be overt or covert attention. The effects must be such that the epistemic role of perception is downgraded in a philosophically interesting way. This calls for a definition of what is a philosophically interesting problem. I will sketch an answer to this without purporting to offer a definition since what is philosophically interesting is highly contextual, which means that an attempt to define the term would take us far afield from the purposes of this book. One could say that a philosophically interesting problem is a problem whose solution requires philosophical analysis, which in the case of perception, at least, should be informed by empirical studies on perception. Thus, addressing the epistemic problems raised by the CP of perception and denying that they downgrade perception, leading to the various forms of constructivism, requires a lot of analysis and argumentation taking into account what the sciences tell us about what transpires in the brain during the various perceptual acts. This is so because cognition affects late vision directly and, thus, its epistemically harmful effects are difficult to neutralize. In contrast, when cognition affects a perceptual stage indirectly, that is, when it affects it by driving attention and selecting the input to perception, the harmful epistemic effects can be easily neutralized by a simple refocusing of attention.
This calls for a revised Epistemic Condition.

Revised Epistemic Condition for CP: If perception (or a stage of it) is cognitively influenced in a way that either renders it unfit to play the role of a neutral epistemological basis by vitiating its justificatory role in grounding perceptual beliefs in a philosophically interesting way, or enhances its epistemic status, perception (or a stage of it) is CP. If perception (or a stage of it) is cognitively influenced in a way that does not affect its epistemic role it is CI.

In view of the above considerations, it seems that the relationship between the directness condition, which relates the problem of whether
a cognitive effect on perception entails CP with whether it affects perception directly, and the epistemic condition, which relates CP with the repercussions of the cognitive effect with respect to the epistemic status of perception, is intricate. If cognition directly affects perception, the latter is CP. Let us put this as follows: CDAP (Cognition Directly Affects Perception) → CP. Thus, the directness condition constitutes a sufficient condition for CP. Does it hold that if a process is CP then it is directly affected by cognition, that is, CP → CDAP? In other words, could indirect cognitive effects render a perceptual process CP? If they did, the necessary part would not hold, which would mean that the directness condition is not both sufficient and necessary for CP. This is the juncture at which the epistemic criterion enters the discussion. According to it, if cognition either downgrades perception in a philosophically interesting way, or enhances its role, perception is CP. As a corollary, cognitive influences on perception that do not affect in any way the epistemic role of perception are not cases of CP. This excludes indirect cognitive effects on perception from entailing CP and allows us to hold that CP → CDAP (the necessary part of the extended directness condition). It follows that the extended directness condition conjoined with the revised epistemic condition yields a sufficient and necessary condition for CP. Things are intricate because, in the last analysis, the fact that the indirect cognitive effects are easily alleviated stems from their being indirect effects that as such do not affect perceptual processing itself. It turns out that the directness condition entails a pragmatic property, namely, that the epistemic consequences of the indirect cognitive effects could easily be alleviated, which when used in the context of the dialectic surrounding CP has an epistemological consequence, namely that they do not entail the CP of perception.
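The structure of this argument can be laid out schematically. What follows is only a sketch of one way of reading it, not a formalization given in the text: CDAP and CP are used as above, while E is a label introduced here purely for illustration, abbreviating "the cognitive influence either downgrades the epistemic role of perception in a philosophically interesting way or enhances it". Premise (3) is the assumption that indirect effects, being easily alleviated, never satisfy E.

```latex
\begin{align*}
&\text{(1)}\quad \mathrm{CDAP} \rightarrow \mathrm{CP}
  && \text{directness condition (sufficiency)}\\
&\text{(2)}\quad \mathrm{CP} \rightarrow E
  && \text{epistemic criterion: no epistemic effect, no CP}\\
&\text{(3)}\quad \neg\mathrm{CDAP} \rightarrow \neg E
  && \text{assumed: indirect effects are easily alleviated}\\
&\text{(4)}\quad \mathrm{CP} \rightarrow \mathrm{CDAP}
  && \text{from (2) and the contrapositive of (3)}\\
&\text{(5)}\quad \mathrm{CP} \leftrightarrow \mathrm{CDAP}
  && \text{from (1) and (4)}
\end{align*}
```

On this reading, the epistemic criterion does its work at step (2), and the pragmatic property of indirect effects does its work at step (3); without (3), only the sufficiency claim in (1) would be available.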
Returning to the problem of defining CP, since some cases of CP enhance the epistemic role of perception, one should extend the definition of CP so that any cognitive influence that affects the epistemic role of perception is deemed a case of CP, independently of whether it diminishes or enhances this role. From all this, it follows that cognitive influences on perception that do not affect its epistemic role are not cases of CP. In view of this, to determine whether the indirect effects on perception in general and early vision in particular entail
the CP of perception or of early vision, one should examine the indirect attentional effects and determine whether they affect the epistemic role of perception or of early vision. This requires that one explain first what is exactly the epistemic role of perception and since perception consists of early and late vision, one should focus on the epistemic role of these two stages of perception.

4 The Epistemic Role of Early and Late Vision

The epistemic criterion for CP entails that to determine whether a perceptual stage is CP one should examine whether there are cognitive influences on this stage that affect its epistemic role in grounding perceptual beliefs. To do that, one should delineate first the epistemic role of each of the perceptual stages. The epistemic role of perception centers on, but is not exhausted in, the percept or the seeming because it is the seeming that ultimately grounds the perceptual belief whose content matches the content of the percept. This is why the seeming figures predominantly in discussions of the epistemic role of perception. The percept that O is F is formed in late vision because it presupposes that the object and the features in a visual scene have been identified and this takes place in late vision. In addition, as we saw in Chapter 1, what justifies prima facie a perceptual belief is a seeming, which is a conceptually structured perceptual state (McGrath 2013a, b; Tucker 2010); as such it could only be formed in late vision if one thinks that early vision is not conceptually modulated. It follows that the onus of perceptual justification is on late vision; it is late vision that delivers the most important item in the justification process, i.e., the seeming. The details of the processes by which late vision forms the percept will be discussed in Chapter 5 (see also Raftopoulos 2011). For the purposes of my arguments here it suffices to say that the epistemic role of late vision is affected by cognitive influences and, thus, late vision is CP. The epistemic role of early vision is constrained by the fact that early vision retrieves from the visual scene information that is fed to late
vision and is used for the construction of the percept, in the formation of which the semantic information made available by cognition also plays a crucial role. Thus, the epistemic role of early vision consists in providing the input to late vision in which the seeming will be formed. This role of early vision makes it a plausible mechanism to provide Siegel’s pre-experiential state (to whose contents/evidence perception responds in the responsive mode), or McGrath’s receptive seemings on the basis of which the non-receptive seemings that eventually ground perceptual beliefs are quasi-inferred. Thus, the iconic information delivered by early vision (the iconic image) provides the ‘evidential’ or support basis (should one wish to deny that perception adduces evidence) on which the various hypotheses concerning the identity of objects in the visual scene are formed and tested in late vision. Thus, the role of early vision is to retrieve from the environment the information that will be used by late vision in order for the distal objects in the visual scene to be identified and this determines its epistemic status. It follows that if cognition affected the information retrieval from the environment during early vision, cognition would affect the epistemic role of early vision, which, in turn per the epistemic criterion, entails that early vision would be CP. The role of early vision is this. Early vision delivers a structural description of the visual scene that contains information about 3D shapes as viewed from the perceiver, spatio-temporal and surface properties, color, texture, orientation, motion, and affordances of objects, in addition to the representations of objects as bounded, solid entities that persist in space and time. In central vision, information other than texture is represented with as much accuracy, or maximal detail, as the limitations of our perceptual systems allow. 
Texture, however, is represented as an ensemble property, that is, as the joint statistics of responses of cells sensitive to texture. In peripheral vision, in contrast, the representational mode is that of ensemble summary statistics. That is, objects in the periphery are treated as ensembles and their statistical properties are represented. This means that in peripheral vision, mean orientations, motions, shapes, sizes, colors, etc. are represented.

5 How Do Cognition and Perception Interact?

Before I proceed to the next chapter, I wish to address the following problem because it bears directly on the definition of CP. Recall that if a cognitive state affects directly perceptual processing, the perceptual processes are CP. It is widely assumed that if cognitive penetration occurs it is through the effects of cognitively driven attention, imagination, or affective states. If someone holds that perceptual and cognitive states both have conceptual, symbolically structured contents, the answer to the problem of how cognition and perception interact is relatively easy since they could appeal to some sort of inferential relations between the two sorts of states. If, however, someone holds that perceptual and cognitive states are cast in different representational formats, iconic and symbolic respectively, and they also think that cognition affects perception, an account of this interaction in view of the difference in representational formats should be provided. I attempt here to sketch an account of how symbolic contents carried by cognitive states modify iconic perceptual contents, addressing an objection against the possibility of a direct interaction between cognition and perception raised by Burnston (2017). Burnston (2017) argues that discussions of the way cognition affects perception are not sufficiently clear about the kind of relationships that hold between the two. Burnston claims that this relationship could be construed in two ways. The strong construal is the Internal Effect View (IEV):

A perceptual process P is penetrated if, over a specific input, it would perform a certain computation C leading to content R1 in the absence of a cognitive state, S, but performs a different computation C2, yielding content R2, when S is present, where the causal, semantic coherence, and computation conditions are met.

In IEV, cognitive states penetrate perceptual processing because they affect perceptual processing itself, that is, when they modify the computations performed by perceptual processes; this is the computation
condition. In this case, the cognitive states causally affect the perceptual states in a direct manner, which is the causal condition. Finally, the content of the modified perceptual state should be intelligibly related to categorical facts concerning the content of the penetrating cognitive state; this is the semantic coherence condition. Consider the putative case of CP of color perception in the Delk and Fillenbaum (1965) experiment. Participants were shown paper-cut objects presented in an orange–red color and when asked to match the background color to that of the presented object they tended to make the background a more saturated red for stereotypically red objects (hearts) than for stereotypically non-red objects. Perception normally would receive as input the orange–red heart and the participants would match the background color, an orange–red color, R1, but owing to the CP of perception by the belief that hearts have a saturated red color, they choose a more saturated red R2. The weak construal of the cognition/perception relationship, the External Effect View (EEV), states that:

Tokening of a lexical/atomic concept as part of a cognitive state provides a bias toward any perceptual processes associated with the concept, raising the probability that those processes will be applied to a perceptual stimulus.

According to EEV, cognitive effects do not modify the perceptual computations, as in IEV, but, rather, change or bias the distribution of probabilities of all possible perceptual processes that could be applied to a stimulus so that the perceptual processes associated with the concept(s) figuring in the affecting cognitive contents have their probability of being applied to that stimulus increased. Burnston argues that owing to their different representational format, cognitive states cannot affect perceptual processing in the way described by IEV; only EEV can explain cognitive effects on perception. Thus, cognition cannot modify perceptual computations but it can only bias which perceptual process will occur when a stimulus appears. Burnston notes that he will remain neutral as to whether EEV should be construed as a genuine case of CP.
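The contrast between the two construals can be made concrete with a toy sketch. The Python fragment below is only an illustration under stated assumptions: the function names, the numeric saturation scale, the fixed shift of 0.05, and the bias factor of 2.0 are hypothetical and come neither from Burnston (2017) nor from the present text. It depicts IEV as a change in the computation itself and EEV as a change in the probabilities with which unchanged processes are applied.

```python
def iev_saturation(stimulus_saturation, believes_red):
    """IEV (toy version): the cognitive state alters the computation itself.

    Without the belief the process returns R1; with it, a different
    computation runs and returns R2. The 0.05 shift is a hypothetical value.
    """
    shift = 0.05 if believes_red else 0.0
    return min(1.0, stimulus_saturation + shift)


def eev_probabilities(processes, associated, bias=2.0):
    """EEV (toy version): the computations are untouched.

    Only the probability that each process is applied to the stimulus
    changes; processes associated with the tokened concept are favored.
    The bias factor is a hypothetical value.
    """
    weights = {p: (bias if p in associated else 1.0) for p in processes}
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}


# IEV: same input, different output depending on the cognitive state.
r1 = iev_saturation(0.60, believes_red=False)
r2 = iev_saturation(0.60, believes_red=True)

# EEV: the concept raises the probability of its associated process.
probs = eev_probabilities(
    ["match_hue", "match_shape", "match_texture"],
    associated={"match_hue"},
)
```

In the first function the output for a fixed input differs with the belief; in the second, no process is altered and only the selection probabilities shift.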

128     A. Raftopoulos

Since I think, first, that purely perceptual states, namely, those of early vision, have iconic contents that differ from the symbolic content of the cognitive states, and, second, that despite this difference cognition does affect those perceptual states to produce states with hybrid (iconic and symbolic) contents, I must provide an account of how this interaction could take place. Specifically, I have argued that cognition directly modulates perceptual processing in late vision and, thus, that late vision is CP. Cognition modifies the perceptual computations of late vision so that late vision states are imbued with concepts. I adopt, thus, Burnston’s IEV view of cognitive penetration. To contend with Burnston’s view that IEV cannot explain the way cognition affects perception, I propose an account of how cognitive effects modulate perceptual processes of late vision that meets Burnston’s objections while retaining the core theses of IEV. If I am on the right track, I will have offered a plausible account of how cognition affects purely perceptual processing, and this has far wider implications than a mere assessment of Burnston’s views on this matter. Thus, my purpose here is not merely to provide a commentary on Burnston’s work, which I do, but to present a way cognition could interact with pure perception. (When the context makes it clear that I discuss ‘pure perception’, I will omit the designation ‘pure’.)

First, I present the main points of Burnston’s arguments. I point out, second, two problems in his arguments. The first concerns the way Burnston thinks of the representational content of cognitive states as they operate in perceptual encounters. The second concerns the way cognitive contents modulate perceptual processes. Then, I offer an account of how cognitive states modify perceptual processing in late vision in the way described by IEV that is not affected by Burnston’s criticism. Burnston puts into IEV some unwarranted assumptions, which, once eliminated, allow IEV to describe adequately how cognition affects late vision. Finally, I argue that once the way the cognitive modulation of perception occurs is properly understood, IEV and EEV capture the two ways cognition affects perception: the direct way (IEV), and the indirect way (EEV), in which cognition sets the parameters or initial conditions of these computations but does not modify them.


5.1 The Argument

Burnston (2017, 3648–3649) argues that perceptual and cognitive states have different structures and different kinds of content that refer to their referents through different referential relationships. Cognitive representations are discrete or atomic in the sense that they have no referentially relevant internal structure, from which it ensues that their content does not describe or specify any properties of their referents. The term ‘CAT’, for example, consists of three parts, namely, the letters ‘C’, ‘A’, and ‘T’, but none of these parts specifies any property of the term’s referent, i.e., a cat. Perceptual representations have a representationally relevant internal structure since they have parts that carry distinct information about the various properties of their referents. Each part of the image of a cat corresponds to, and specifies, some particular property of the cat. Thus, perceptual representations have an internal structure that maps their parts to the structure represented by them.

If a perceptual state represents a particular shade of red, it represents the particular structural properties that place that shade at a particular location in color space—that is, as taking up specific values along the dimensions that define the color space. (Burnston 2017, 3649)

Burnston is after the digital/analog distinction that is much discussed in the literature and, indeed, Burnston refers to the analog/digital distinction to describe the distinction between perceptual and cognitive representations. Having argued that perception and cognition differ in the ways they carry their informational content, Burnston (2017, 3654) goes on to explain why even if one accepts that the Delk and Fillenbaum (1965) experiment, for example, concerns a real case of CP, IEV cannot explain it.

Recall that, per the form distinction, a cognitive state doesn’t contain any content that maps to the structure of perceptual stimuli. However, per the same distinction, perceptual representations must contain such structure, and the function that perceptual processes compute involves mapping inputs to structured outputs. Since the change from R1 to R2 is precisely a change in what perceptual structure (e.g., metric properties or location in color space) is represented, the content of cognitive representations cannot determine the change. That is, the cognitive state doesn’t have the right kind of content to provide the “informational resource” that tells perception to modify its function in a particular way—viz., to end up at a particular R2 instead of R1.

Cognitive representations could not account for specific changes in perceptual outputs because by not having perceptual structure they cannot determine a particular perceptual structure. Burnston (2017, 3657) examines the possible rejoinder that cognition might affect perception through some mediating mechanism “in the sense that some mediating process conveys the content of the cognitive state to the perceptual system for use in modifying its function, and it could be contended that at some point during this process the content of the cognitive state is translated into the kind of content that perception can use to modify its functioning.” Such a mechanism could be imagery (Fazekas and Nanay 2017; Macpherson 2012), or attention (Mole 2015; Raftopoulos 2009; Wu 2013). According to Burnston, this translation will not solve the problems for IEV because on IEV the cognitive content has to cause perception to modify its function in a particular way; that is, it must modulate perceptual processing. For IEV to occur, the translation process would need to convey the cognitive content to perception, but for the cognitive content to be able to cause a specific perceptual effect it should be converted to perceptually structured content. Even if such a mechanism existed, the argument against IEV would arise within the translation mechanism.

Suppose that some stage in the translation mechanism is proposed to implement the change in representational type. One way for it to do so would be to represent both kinds of content, such that it can map the discrete to the perceptual. To which perceptually structured content should it map the cognitive representation? Without some theory of why the mechanism should “choose” one specific perceptual content as opposed to another, no explanatory gain has been made by positing translation. (Burnston 2017, 3657)

Burnston acknowledges that IEV’s definition assumes that the influence of the cognitive state should represent in detail the structure of the stimulus, by which he probably means that the influence conveys content concerning perceptual structure. This, Burnston thinks, could be objected to as being a very strong demand; one could point out that the modulation of perception by the cognitive state could assume the form of a kind of cognitive command, “if x is a heart, modify saturation by 5% to produce a redder shade”. This would bar the objection that cognitive contents cannot be translated to a representation that represents the detailed structure of a perceptual state (Burnston 2017, 3655). Burnston, however, thinks that two aspects of perceptual modifications due to the causal influence of cognitive states are incompatible with the cognitive command view. “The first is the possibility that the result of feature interactions can be graded depending on the amount of evidence for the perceptual modification … Deroy (2013) cites similar graded effects for categorical perception—categorical effects on color perception are more pronounced when there are corroborating depth and texture cues for the category” (Burnston 2017, 3655). In the example above, the command view posits a specific command (a 5% adjustment) as the way in which cognition modulates the perceptual function. But the effects are not specific in this way; instead they are graded depending on the evidence; cognitive effects on color perception are graded on account of texture and depth cues present in the image. Burnston’s point is that if the cognitive state either determined directly the perceptual content by imposing its own content, or issued a specific command as to how the perceptual content should be modified, the cognitive effect would be specific and, thus, the ensuing content would be fixed independently of any other factor. Since, as the graded effects for categorical perception suggest, this does not happen, the cognitive content cannot determine the perceptual content in either of these two ways. The second aspect of the way cognition affects perception that Burnston thinks is incompatible with the command view is the diversity of potential categorical effects.

The fact that a perceived object belongs to a category possibly implies a large number of perceptual consequences, both within and across modalities … something’s being heart shaped not only increases the likelihood that it is red, but also that it will make lub-dub noises. Suppose the belief ‘That is a heart’ is tokened. Which perceptual process should be modified, in which way? Given that the cognitive representation, ex hypothesi, lacks perceptually structured content, there is no way for the cognitive state to determine which out of the possible potential perceptual effects should be implemented. (Burnston 2017, 3655–3656)

After having rejected IEV as a plausible explanation of the way cognition affects perceptual processing, Burnston argues that EEV is a viable candidate for describing some cognitive effects on perception. I agree that cognition sometimes does affect perception in the way described by EEV, if EEV is modified in an appropriate way, that is, by biasing but not by affecting perceptual processes. In the concluding discussion, I will claim that a modified version of EEV describes what I have called (Raftopoulos 2009) the indirect cognitive effects on perception. To recapitulate his main point, Burnston claims that cognition cannot exert causal influences on perception by modifying the computations performed by perceptual processes, owing to the format differences between cognition and perception, which do not allow any content relationship and interaction; instead, it affects perception by biasing which perceptual processes occur when a stimulus is present.

5.2 How Do Cognitive States Modulate Perceptual Processing in Late Vision?

I have argued that late vision is CP because it is directly affected by cognitive states, since the perceptual processes use parts of the cognitive content, i.e., concepts, as an informational resource. This means that I espouse some form of IEV to explain the cognitive effects on late vision.


At first glance, Burnston’s arguments against IEV that are based on the thesis that cognitive states do not have the right kind of content to modify perceptual processes seem not to affect the view that late vision is CP, because one could argue that, owing to the hybrid perceptual/conceptual character of late vision, the cognitive effects modify the processes of late vision involving the cognitive parts of the hybrid states. Since no one would deny that cognitive states can affect other cognitive states and modify the processes that govern intra-cognitive transformations, Burnston’s objection seems not to affect my thesis. The objection, however, does affect the thesis. I may hold that late vision has hybrid states and contents, but I also think that cognitive effects in late vision modulate the phenomenology of the visual scene, which means that cognition modifies perceptual processing itself. In addition, I have argued that the cognitive effects are mediated through cognitively driven attention, which means that there is a mechanism that mediates the cognitive effects on perception, and Burnston rejects this possibility. To defend my views, I must counteract Burnston’s powerful objections. Since the claim defended here is that cognition directly affects late vision, I present first a brief sketch of what I mean by late vision (in Chapter 5 I discuss late vision in much detail and, so, I omit references here), which, as I also argue in Chapter 5, is a genuine perceptual stage and not a cognitive stage belonging to the space of reasons. Starting at 150–200 ms, signals from higher executive centers including mnemonic circuits intervene and modulate perceptual processing in the visual cortex and this signals the onset of global recurrent processing (GRP). In 50 ms low spatial frequency (LSF) information reaches the IT and in 100 ms high spatial frequency (HSF) information reaches the same area.
LSF signals precede HSF signals because LSF information is transmitted through fast magnocellular pathways, while HSF information is transmitted through slower parvocellular pathways. Within 130 ms poststimulus, parietal areas in the dorsal system but also areas in the ventral pathway (IT cortex) semantically process the LSF information and determine the gist of the scene based on stored knowledge that generates predictions about the most likely interpretation of the input. This information reenters the extrastriate visual areas and modulates (at about 150 ms) perceptual processing, facilitating the analysis of HSF by specifying certain cues in the image that might facilitate target identification. Thus, at about 150 ms, specific hypotheses regarding the identity of the object(s) in the scene start to be formed using HSF information in the visual brain and information from visual working memory (WM). These hypotheses are tested against the detailed iconic information stored in early visual circuits. This testing requires that top-down signals reenter the early visual areas of the brain, and mainly V1. Evidence shows that V1 is affected by object/feature-centered attention at 235 ms poststimulus.

Let me also say that I agree that the analog representational format of perception should be distinguished from the digital or symbolic representational format of cognition, except that, in my view, this distinction concerns the purely perceptual content of early vision as opposed to the symbolic/cognitive content of cognition and does not extend to the content of the states of late vision, which is hybrid analog and symbolic content. Late vision is the interface between (pure) perception and cognition. Thus, I reject Burnston’s view that perception is cast in an analog representational format, while cognition is cast in a symbolic format, since there is a part of perception, namely late vision, that has a hybrid format. If this is so, how could I evade Burnston’s objection that owing to their representational symbolic/digital format, cognitive contents cannot modify perceptual processes? How could cognitive states affect the computations performed by perceptual processes so that the hybrid states of late vision emerge? To answer this question, recall that the brunt of Burnston’s argument is that given that a cognitive representation lacks perceptually structured content, the cognitive state cannot determine which out of the possible potential perceptual effects should be implemented.
Burnston also argues that for IEV to occur through a translation process (that is, some mediating process that conveys the content of the cognitive state to the perceptual system for use in modifying its function), this process would need to convey the cognitive content to perception; but for the cognitive content to be able to cause a specific perceptual effect, it should be converted to perceptually structured content. In other words, for cognition to be able to act directly on perception its digital/symbolic content should be transformed to analog content. I think, however, that cognitive contents can modify, through attention, perceptual processing without any need for an analog translation of their representational content. In addition, I think that there is no need for a cognitive state to determine all the representational details of the penetrated perceptual state; the perceptual process will take care of that. Cognitive states, however, can modify the perceptual process itself; there is a distinction to be made, which Burnston fails to notice, between direct cognitive influences on perception, in which perceptual processes employ cognitive information by incorporating concepts, and deterministic influences by cognition on perception. Burnston argues successfully that cognition cannot exert deterministic influences on perception, but his arguments fail against the thesis that cognition could modify perception directly by affecting perceptual processing itself and not just in a pre-perceptual or post-perceptual manner. In the latter case, cognition would affect perception indirectly (Raftopoulos 2009; Pylyshyn 1999), but almost everyone involved in the debate accepts that indirect cognitive effects on perception, which Burnston describes as EEV, do not constitute cases of cognitive penetration. In the case of direct cognitive effects on perception, what cognitive states do is not to exert a deterministic influence on perception, as Burnston assumes that they should do if IEV were correct, but to tip the scales in the biased competition between competing perceptual representations. Admittedly, tipping the scales is also how cognition acts in instances of biasing effects that are cases of EEV, that is, cases of indirect cognitive effects on perception, but when cognition affects perception directly this tipping of the scales occurs during perceptual processing and not before or after this processing.
Thus, in IEV too, perception carries the weight in determining the perceptual content, as Burnston thinks that it does only in EEV. Many researchers claim that cognitively driven attention is better construed as the biased competition between competing representations, where the bias comes from top-down cognitive information that is added to the bottom-up or lateral, data-driven information and together determine which of the competing representations will prevail.
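The biased-competition picture just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a model drawn from the attention literature: the function name, the candidate labels, and all numeric evidence values are hypothetical. It shows a top-down bias being added to bottom-up evidence, with the strongest combined candidate prevailing.

```python
def biased_competition(bottom_up, top_down_bias):
    """Add a top-down bias to bottom-up evidence and pick the winner.

    bottom_up: dict mapping candidate representation -> data-driven evidence
    top_down_bias: dict mapping candidate -> cognitively driven boost
    (candidates without a bias entry receive no boost)
    """
    combined = {
        name: evidence + top_down_bias.get(name, 0.0)
        for name, evidence in bottom_up.items()
    }
    winner = max(combined, key=combined.get)
    return winner, combined


# Close competition: a small bias tips the scales.
close = {"orange_red": 0.55, "saturated_red": 0.50}
winner_unbiased, _ = biased_competition(close, {})
winner_biased, _ = biased_competition(close, {"saturated_red": 0.10})

# Lopsided competition: the same bias does not override the stimulus.
lopsided = {"orange_red": 0.90, "saturated_red": 0.30}
winner_lopsided, _ = biased_competition(lopsided, {"saturated_red": 0.10})
```

The lopsided case is meant to illustrate the point that the bias does not fix the outcome by itself: the data-driven evidence still carries most of the weight.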


In late vision, the cognitive information transmitted top-down concerns the core characteristics of the object(s) that are hypothesized to exist in a perceived visual scene, or the relevant locations where most likely such information may exist. Since spatial attention acts through a gaining mechanism that multiplies activations, while object/feature-based attention operates through both a gaining mechanism and a tuning mechanism that sharpens the responses of the relevant neurons (Ling et al. 2009), the perceptual neuronal assemblies that encode the spatial or featural relevant information receive an extra activational boost or have their responses sharpened and this biases the competition against neuronal assemblies that encode different information. This is how cognitively driven attention affects the activation values of the neurons in the relevant neuronal assemblies. This boost or sharpening occurs in the course of perceptual processing and is not just an offline increase in the baseline activation, as is the case in pre-cueing that affects neuronal activations before stimulus onset. Attention, by biasing the competition affects directly the perceptual computations. If this is how cognitively driven attention affects perceptual processing, does it presuppose that the cognitive contents driving attention should be converted to an analog form? Differently put, if attention performs either of these two functions (boosting or sharpening) and this is how it modifies perceptual processing, the cognitive contents that drive attention should be able to issue commands that these modifications take place. How can they do it given their symbolic format? To answer this, let us revisit the distinction between analog (iconic) and symbolic contents. An iconic representation is dense and homogeneous without an internal logical or formal canonical structure and, thus, does not admit of a canonical decomposition. 
According to Fodor (2007), perceptual representations are iconic and cannot recombine, while symbolic or conceptual representations are discursive and can be recombined in the right sort of way. The reason is that iconic representations have no canonical decomposition, that is, although they have interpretable parts, they have no constituent parts. Discursive representations, on the other hand, have canonical decomposition because they consist of distinguishable parts. Simply put, a representation is compositional if its syntactic structure is determined by the syntactic structure of its parts and the syntactic features that are used in the composition. Having syntactic structure means, first, that the representation uses symbols that are discrete entities (which immediately entails that symbolic representations are not dense but discontinuous) and, second, that some parts of the representation are constituents and other parts are not. ‘Φ’, for instance, is a constituent of the representation ‘Φ (a)’, but ‘Φ (’ is not a constituent. In this sense, discursive structures are not homogeneous; since they are not continuous or dense, their decompositions may lead to elements that are not representational. Iconic representations, on the other hand, satisfy the Picture Principle, which states that if P is a picture of X, then parts of P are pictures of parts of X (Fodor 2007, 173). In that sense, iconic structures are homogeneous. But then, all the parts of a picture are among its constituents and, thus, an icon is compositional whichever way you carve it up, that is, no matter how you cut the picture you always get a picture of something. Think of the difference between iconic and discursive representations in the following way: any part of the picture of the ocean is a picture of a part of the ocean, whereas not every part of the discursive representation Φ (a) is a discursive representation of a part of Φ (a). So pictorial representations are structurally different from conceptual, discursive representations. Research in perception supports the distinction between dense iconic perceptual representations and the symbolic representations that are used in WM and Long Term Memory (LTM) and that support conceptual thought.
Since attention is involved in Visual Short Term Memory (VSTM) and visual LTM, it seems that the attentional modulation of the output of early vision results not only in restricting the number of objects that can be held in memory, but also in impoverishing the information about those objects that is stored in WM. In general, it is thought that iconic representations are high-density representations in the order of 100,000 bits of information (Itti and Baldi 2005). The representations in VSTM, on the other hand, have much lower density, about 30–40 bits of information (Norretrandes 1998; Vogel et al. 2001).


Coding of the content of iconic/analog representations in early vision is done through basis functions; iconic representations are modal and represent by means of dense basis functions that seem to work at the early perceptual levels. A color, for instance, is represented by a vector or a pattern of activation values (scalars that represent the relative activity of red, green, and blue) across columns of neurons that distributively represent colors. The basis functions in early vision are dense in the sense that the relevant activations can take continuous values. This explains why analog representations are homogeneous. VSTM coding is also done by means of basis functions but these basis functions are sparser since the relevant activations do not take continuous values; instead, they take discrete values. VSTM codes of colors, for example, concern categories like ‘red’, ‘light’, etc., but lacking a dense structure they do not encode the fine color information regarding hues, intensities, etc., that is available to low-level color channels. Thus, information stored in VSTM does not allow the fine discriminations made available via low-level color channels and the representations in visual areas differ from the representations stored in VSTM. It is debatable whether the representations in visual LTM function as descriptors that code in a symbolic all or nothing manner (for example something is red or not), or by means of sparse basis functions of the sort used in VSTM. Evidence suggests that VSTM acts as a gate of visual information for visual LTM (Nikolic and Singer 2007). Visual LTM cannot store information in a richer format than that of VSTM, although it can store more information than VSTM. More recent research (Sligte et al. 2010) suggests that viewers maintain 6.1 objects in iconic memory, 4.6 objects in fragile VSTM, and only 2.1 objects in visual WM. 
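The difference between dense and sparse coding described above can be illustrated with a small sketch. It rests on simplifying assumptions that are not in the text: a color is treated as a triple of activations in the range 0 to 1, the sparse code is modeled as snapping each activation to one of three evenly spaced levels, and the names `sparse_code`, `hue_a`, and `hue_b` are hypothetical. The point is only that two distinct dense codes can collapse onto one sparse code.

```python
def sparse_code(activations, levels=3):
    """Snap each continuous activation in [0, 1] to the nearest of
    `levels` evenly spaced values (a toy stand-in for sparse coding)."""
    top = levels - 1
    return tuple(round(a * top) / top for a in activations)


# Two slightly different hues as dense (continuous-valued) codes.
hue_a = (0.91, 0.12, 0.08)
hue_b = (0.86, 0.17, 0.11)

# Dense codes keep the fine difference; sparse codes lose it.
dense_differ = hue_a != hue_b
sparse_a = sparse_code(hue_a)
sparse_b = sparse_code(hue_b)
```

Here both hues end up with the same sparse code, which is one way of picturing why the stored representation supports categorization but not fine discrimination.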
If VSTM and LTM store information by means of sparse basis functions, this information may be described as symbolic because the basis functions used concern types of visual object-features that form a discrete set of values, as opposed to the continuum of values that the basis functions take when representing iconic information. These discrete values become symbols that are available to the cognitive mechanisms and therefore can be used for categorization purposes. Hence, the representations stored in memory, apparently owing to limitations in storage capacity, do not contain information about, say, the determinate hue, only information about the category of the color (say, bright red). The construal of the representations in memory as symbolic does not rely on their being categorical; it is based on the fact that information is stored in memory by means of sparse basis functions that, as such, are not continuous and homogeneous, as analog representations are. Being symbolic, such a representation enables categorization, since it abstracts away much of the detailed iconic information of the stimulus and allows different tokens that differ in various features to be subsumed under the same type. Some forms of categorization may be purely perceptual and, thus, the inference from categorization to the symbolic character of the representation is false (Raftopoulos 2010; Burnston and Cohen 2015). The above does not entail that perceptual representations could not themselves be less determinate, or less fine grained, than in typical cases; parafoveal vision, for example, typically yields such representations since it delivers summary statistics of the features in a visual scene rather than their detailed encodings. These summary statistics are not, however, the main product of visual processing, which is the percept, although they may contribute to, and facilitate, the formation of the percept, allowing the rapid representation of the gist of the scene. The percept, however, is necessarily more determinate than any representation in VSTM. Thus, perceptual contents can be determinate, while the contents of VSTM are constitutively less determinate, because the latter are symbolic representations. Recall that according to Burnston, cognitive representations consist of parts that do not specify any property of their referent. Perceptual representations, in contrast, have a representationally relevant internal structure since they have parts that carry distinct information about the various properties of their referents.
This means that a perceptual representation, but not the cognitive representation, has an internal structure that maps its parts to the structure represented by the perceptual representation. If a perceptual state represents a particular shade of red, it represents the particular structural properties that place that shade at a particular location in color space. In view of the way information is stored through basis functions in VSTM, the representations in VSTM are symbolic and not analog but, importantly, this does not mean that they lack structure that conveys information about the represented feature. They, too, place a feature at a certain location in the feature-dimension space, but this location is more abstract and cannot fix the value of the determinate feature, owing to the fact that representations in VSTM are coded through sparse basis functions. For example, the cognitive representation in VSTM of a particular hue conveys the information that it is in the intersection of the categories ‘red’ and ‘bright’. It is worth emphasizing that both the symbolic representations in VSTM and the analog representations in early vision are coded by means of basis functions, the difference being that the former use sparse basis functions, whereas the latter use dense basis functions.

Let us delve into some detail concerning the color encoding in the brain that will help us understand how cognitive states could interact with early visual states. Color is represented in the blobs of V1, the darker stains in V1 with oval three-dimensional shapes, which project to the thin stripes of V2 in which the hue maps are found (Livingstone and Hubel 1987). V1 color sensitive neurons do not represent hues but color opponency (color sensitive cells in V1 respond to the blue–yellow and red–green differences). In V2, where hues are represented, the various hue maps are found in the thin stripes (Xiao et al. 2003). There are three stripe types (thin, pale, and thick) that represent color, form, and depth respectively, by a collection of discontinuous domains; a color field is represented by coalescing all thin stripes in V2 (Roe et al. 2012, 14). Color preference domains that include hue maps are found in V4 (Tanigawa et al. 2010) and correspond to, and communicate with, the hue maps in V2 (Xiao et al. 2003). Colors are represented in V4 in a similar collective way as in V2. The color selective functional regions in V4 are called ‘globs’ and are narrowly tuned for hue (Conway et al. 2007).
It is tempting to draw the conclusion that the brain stores in memory the basic unique hues that correspond to the basic color categories used in language, except that in memory we store more saturated hues than those typically perceived in the environment, that is, the measured colors of objects (Hansen et al. 2006). The evidence, however, is ambivalent (Witzel and Gegenfurtner 2018). The correspondence between the color codes in V1 and V2 discussed above, for example, and the color


representations further up in the brain is not clear. The second step of color representation involves the second-stage mechanisms that operate on the information provided by the photoreceptors. These mechanisms are responsible for the color opponency and the trichromacy of color vision and create three different channels: a luminance channel that adds the excitation of long- and middle-wavelength cones (L + M), a chromatic channel that contrasts L and M excitations (L − M), and a second chromatic channel that contrasts the excitation of short-wavelength cones (S) with the combined L and M excitations (S − (M + L)). The problem is that it is not clear how the signals of the second-stage mechanisms are processed further at the cortical level to produce the colors one perceives. In other words, there is no direct relationship between the unique hues of our experience and the cone-opponent mechanisms, which means in turn that direct links cannot be established between prototypes of linguistic color categories and particular perceptual mechanisms (Witzel and Gegenfurtner 2018, 2). There is some evidence that the unique hues green, blue, and yellow and their binary hues (that is, the transitions between unique hues) are inbuilt in early stages of color processing (see Witzel and Gegenfurtner 2018 for a discussion), which means that they may constitute perceptual color categories. In contrast, there are no categorical patterns for red and dark hues across various levels of lightness, or for the red–yellow binary hue, probably owing (Witzel and Gegenfurtner 2018, 11) to the complex interactions between luminance and chromaticity. Categorical patterns for these colors emerge only under certain lighting conditions; unique red exists only at low lightness and yellow only at high lightness.
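As a rough illustration of the three second-stage channels just described, the following Python sketch computes them from cone excitations. The names `ConeExcitation` and `opponent_channels` are hypothetical, and the unweighted sums simply mirror the expressions L + M, L − M, and S − (M + L) given in the text; no claim is made that this matches the weighting used in any particular physiological model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConeExcitation:
    """Excitation of the long-, middle-, and short-wavelength cones (arbitrary units)."""
    l: float
    m: float
    s: float


def opponent_channels(cones: ConeExcitation) -> tuple:
    """Return (luminance, L-M chromatic, S-(L+M) chromatic) as described in the text.

    Assumption: plain unweighted sums and differences, used only for illustration.
    """
    luminance = cones.l + cones.m
    red_green = cones.l - cones.m
    blue_yellow = cones.s - (cones.l + cones.m)
    return luminance, red_green, blue_yellow


if __name__ == "__main__":
    print(opponent_channels(ConeExcitation(l=0.5, m=0.25, s=0.125)))
```

Note that nothing in these three numbers names a unique hue such as ‘red’ or ‘yellow’, which is one way of seeing why a direct link between cone-opponent mechanisms and linguistic color categories is hard to establish.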
All this bears on the problem addressed here because it concerns the way colors are perceptually categorized and, therefore, the way these categories modulate color perception. Before we see how it affects the discussion, a qualification should be made. Gegenfurtner and Rieger (2000) distinguish between sensory and cognitive contributions of color to the recognition of natural scenes. They argue that color contributes to fast scene recognition very early in visual processing in a purely sensory way, since it improves object recognition irrespective of the diagnostic role of color in object identification, by providing


an additional cue on which image segmentation could be based. Color also has a later, cognitive contribution, when it adds one more cue for information retrieval from memory. As Gegenfurtner and Rieger (2000, 805) claim, “color helps us to recognize things faster and to remember them better.” Color can act that fast because, as we saw in the preceding paragraphs, some binary and unique hues are inbuilt in early stages of visual processing, and under certain lighting conditions all hues and binary colors can be used to make object and scene recognition faster. Since we are discussing the way cognition affects perception, what interests us here is the cognitive role of colors that, when stored in memory, facilitate the retrieval from memory of objects and natural scenes. Thus, let us turn to examining the way cognitive information stored in the way described above could affect perceptual processing. Recall that the discussion is about cognitive states that are activated during a perceptual encounter; viewers perceive a certain determinate color hue and must perform some task, and while they perform this task certain cognitive states are activated. What happens when a cognitive state, say, a certain color-related belief, is activated in the course of a color-related task? Suppose that the perception of color is CP. In this case, the occurrent belief that, say, hearts have a bright red color is activated because the participants are presented with an unmistakably heart-shaped paper cut and are asked to turn a knob until a color match is made, that is, they are asked to perform a color matching task. It is the presented shape and the nature of the task that determine which beliefs involving hearts are activated when a stimulus appears. This being a color task, the belief that hearts have a bright red color is activated but the belief that the heart makes such and such a noise is not.
This answers one of Burnston’s objections to the thesis that cognitive states issue commands that affect perceptual processing. Since I will return to this point, I will not elaborate any further here. How is the cognitive information that hearts have a deep bright red color represented in VSTM? Based on the above considerations, it is represented by means of sparse basis functions; VSTM codes of colors, for example, concern categories like ‘red’, ‘light’, ‘dark’. Thus, let us suppose that the typical red color of a heart is represented in a neuronal assembly by a triplet of values coding for ‘deep’, ‘bright’, and ‘red’.


Attention and VSTM are tightly interlinked and involve activations in parietal, prefrontal, and temporal cortices. When the heart representation is activated in VSTM, signals from these brain areas flow top-down to visual areas (Retzeperis et al. 2014; Roe et al. 2012). V4 is directly connected to, and receives feedback from, temporal, prefrontal, and parietal areas, which means that it is well positioned to be modulated by cognitively driven, top-down directed attention. V1 and V2 are directly connected to the Frontal Eye Fields (FEF), which are also known to play a role in the control of cognitively driven attention. Since, as we have seen, V1 and V2 also project to, and receive feedback from, the central area of V4, they are directly connected to V4, through which they are indirectly affected by the top-down signals from parietal, prefrontal, and temporal areas. One can start seeing how the reply to Burnston shapes up. Burnston thinks that cognitive contents have no structure; they are like Fodor’s atomic concepts. If the picture about the nature of symbolic representations in VSTM that I have drawn is correct, this view is not accurate. The fact that cognitive representations are symbolic does not mean that they have no perceptually relevant structure. Conceptual/cognitive representations may be structurally different from pictorial representations but they do have some structure; they are not atomic symbols. Let us assume that in VSTM the typical red color of a heart is indeed represented by a triplet of values coding for ‘deep’, ‘bright’, and ‘red’, which, for the reasons explained above, is a symbolic representation. This symbolic/conceptual representation, however, maps onto the relevant perceptual space, not simply in the sense that the color is in a certain color-range, but also in the sense that this representation corresponds, at least in part, to a region in the phenomenal similarity space and, thus, maps to the structure of the phenomenal color space.
In other words, in order for a representation to map to a phenomenal feature space it is not necessary that it have analog structure. Provided that a symbolic representation is made through sparse basis functions, it, too, can correspond to a phenomenal space, although this mapping is only partial, whereas analog representations map fully to phenomenal spaces owing to their continuous, homogeneous nature. The symbolic-phenomenal space mapping is natural since it is the same kind of


basis functions that underlies both iconic representations in pure perception and symbolic representations of colors in VSTM, and since the concepts in the VSTM are formed by processes that are partly guided by the stimulus since it is the nature of the stimulus that elicits the relevant concept. It is, thus, not arbitrary which concept will be activated and applied to a given stimulus. Brossel (2017) explicates further this relation between perception, as he calls it, and cognition. Perceptual experiences (Brossel 2017, 9–10) are analyzed in terms of their position in phenomenal spaces. A color space represents the shades of colors viewers experience by placing them in a geometrical space where each shade occupies a point and the distances between points represent the dissimilarities between two shades of color along the three axes that define the color space, i.e., hue, brightness, and saturation. Each shade/point specifies through the projections to the three axes the hue, brightness, and saturation of the specific phenomenal content of the experience. The content of perceptual experiences is analog content since it represents shades of colors as points in a continuous space, in agreement with the thesis that iconic representations are made through dense, continuous, basis-functions. Brossel (2017, 11–12) examines next the conceptual structure space of perceptual beliefs or conceptual spaces. These are geometrical spaces whose structure captures semantical properties and relations. One main purpose of perceptual concepts is to allow for the categorization of objects in a few linguistic categories on the basis of our manifold Pes [perceptual experiences] of those objects. 
For example, the color concept VIOLET allows one to categorize objects according to their manifold shades of color as experienced by the agent … Such concepts also allow us to group similar shades of color together by subsuming them under one concept or category … the concept VIOLET subsumes various different shades of color under one label and delimits them from various other shades of color … it is only natural to understand the concept VIOLET as corresponding, at least in part, to a region in the phenomenal similarity space. This region then includes all those manifold points in the space that we want to group in one category.
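To make the geometry concrete, here is a minimal Python sketch of a phenomenal color space with the three axes named above and of a concept understood as a region of that space. Everything numerical is an assumption made for illustration: the axes are rescaled to the interval 0 to 1, hue is treated as a linear rather than circular axis, dissimilarity is taken to be Euclidean distance, and the bounds given for `VIOLET` are invented rather than measured. The names `Shade`, `ConceptRegion`, and `dissimilarity` are hypothetical.

```python
import math
from typing import NamedTuple


class Shade(NamedTuple):
    """A determinate shade: one point in a toy phenomenal color space."""
    hue: float
    brightness: float
    saturation: float


class ConceptRegion(NamedTuple):
    """A perceptual concept modeled as a box of (low, high) bounds on each axis."""
    hue: tuple
    brightness: tuple
    saturation: tuple

    def contains(self, shade: Shade) -> bool:
        # A shade falls under the concept if every coordinate lies within bounds.
        return all(lo <= value <= hi for value, (lo, hi) in zip(shade, self))


def dissimilarity(a: Shade, b: Shade) -> float:
    """Distance between two shades (assumed Euclidean for this sketch)."""
    return math.dist(a, b)


# Invented bounds, for illustration only.
VIOLET = ConceptRegion(hue=(0.72, 0.80), brightness=(0.3, 0.9), saturation=(0.4, 1.0))

if __name__ == "__main__":
    shade_a = Shade(0.74, 0.5, 0.8)
    shade_b = Shade(0.78, 0.6, 0.7)
    print(VIOLET.contains(shade_a), VIOLET.contains(shade_b))
    print(dissimilarity(shade_a, shade_b))
```

The two shades are distinct points, yet both fall under the one concept. This is the sense in which the concept fixes a region without fixing a determinate value.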


According to Brossel, conceptual spaces, where the relevant concepts are the so-called perceptual concepts, naturally correspond to regions in some phenomenal space. Through this correspondence, concepts allow grouping together under one heading various shades that belong to the same region in the phenomenal similarity space. One could object that the argument thus far relies on the assumption that the perceptually related representations stored in VSTM and LTM are symbolic and, thus, cognitive and not perceptual, an assumption that Burnston would deny. In footnote 11, Burnston, criticizing a view held by Raftopoulos (2009), argues: “Raftopoulos (2009, 70) takes the fact that categorical representations like these are representations in memory, and must be matched to incoming stimuli, to show that they are cognitive… But given that it is possible to store representations with perceptual form in long term memory (Barsalou 1999), these inferences don’t follow.”

According to Burnston, Barsalou’s theory of perceptual symbols allows the storage of perceptual, non-symbolic information in memory. This, however, is not what Barsalou holds. Barsalou (1999, 577) writes: “during perceptual experience, association areas in the brain capture bottom-up patterns of activation in sensory-motor areas. Later, in a top-down manner, association areas partially reactivate sensory-motor areas to implement perceptual symbols.” Association/cognitive areas in the brain store activation patterns from sensory-motor areas and later, by a top-down spread of activation, reactivate sensory-motor areas to implement perceptual symbols. The symbol systems are implemented by the reactivation of sensory-motor areas; they are not stored in memory. This is what drives Barsalou to affirm explicitly that all memory effects on perceptual content are cognitive top-down effects: “[i]n this spirit, the remainder of this target article assumes that top-down cognitive processing includes all memory effects on perceptual content, including memory effects that originate in local association areas” (Barsalou 1999, 588). Nor would an appeal to Kosslyn (1994), who argues that visual images are part of the architecture of the mind, help, because Kosslyn is explicit that in LTM only information in


propositional/symbolic format can be stored. This information activates the visual buffer top-down, producing the mental images. The main shortcoming of Burnston’s argument is that he treats cognitive representations according to the atomistic account of concepts, which precludes them from having any sort of structure that could allow them to map to some phenomenal space. If, however, cognitive/symbolic representations are construed in terms of sparse basis functions, a structure that can naturally map to a phenomenal space emerges. The problem is to determine whether the structure of the cognitive representations in VSTM could explain the putative cognitive effects on perceptual processing of colors. Feature-based attention, regardless of spatial geometry, is able to highlight all the neuronal ensembles that encode information that is potentially relevant to the current task (Roe et al. 2012, 21). It is unanimously accepted in the literature that the nature of the task and prior expectations drive most contextual influences in perception. In tasks involving color, the neurons in the visual areas that encode colors do receive the top-down modulatory signals generated in color-related areas involved in VSTM. Thus, color-related activation spreads top-down from the areas involved in VSTM that include V4 to V2 and V1 since the color maps in these regions are directly related through recurrent connections. When the top-down activation arrives at these sites, it is added to the activation caused by the bottom-up color signals from the proximal stimulus, and together they fix the activation of neurons at this lower level. The points made above are important because they suggest an answer to one of Burnston’s (2017, 3655–3656) objections to the possibility of cognitive states issuing an attentional command that affects perceptual processing.
Recall that Burnston objected that given that cognitive representations lack perceptually structured content and given that the fact that a perceived object belongs to a category possibly entails a large number of perceptual consequences, the cognitive states could not determine the percept because they do not determine which perceptual process should be modified. In view of the abovementioned considerations, the answer suggests itself. First, even though when the heart-belief is activated all sorts of semantic information pertaining to hearts may be activated as well and,


thus, many different perceptual consequences might ensue, the nature of the task determines which information will be given priority for further use and, thus, which belief will be formed or prioritized; the number of cognitive states, in other words, that may be activated upon perceiving and categorizing a stimulus is constrained by the nature of the task at hand. The task in our case being a color task, the belief that hearts have a bright red color is formed or given priority. Second, once the semantic information about the typical color of hearts has been activated, the relevant color maps in V4 that store color information by means of sparse basis functions are activated as well and transmit this information top-down through their direct connections to the relevant color maps in V2 and V1. In other words, even though cognitive representations do not have perceptually structured content but are symbolic, in the sense that they concern types of visual object-features that form a discrete set of values, this does not entail that they do not code color information in the form of sparse basis functions. This is why, given the nature of the task, only (or preeminently) color information about hearts is at play despite the variety of semantic information activated when the heart-belief is activated, and this, in turn, determines which perceptual process will be affected. Recall that the visual areas encode through dense basis functions, that is, they represent by a vector or a pattern of activation values across columns of neurons that distributively represent colors. Since the top-down activation carries sparse information about the typical color of a heart, it affects a number of neuronal assemblies whose fine-grained encoding is compatible with the top-down incoming information.
In other words, the color-related information stored in VSTM is symbolic and categorical since it concerns types of color hues, but different tokens coding different hues of the same color-type exist; for example, many different determinate hues of red belong to the determinable type ‘bright red’. Thus, many encodings of the red color in the image will be compatible with the sparse top-down information and will get activated. Let me explain this. Cells in higher brain areas, owing to their wider receptive fields, pool together bottom-up signals sent from neurons lower in the hierarchy whose receptive fields fall within the receptive field of


the higher neurons. This means that some cell in the color areas of V4, for example, receives bottom-up input from a number of different cells in the color areas of V1 and V2 that may have different preferred stimuli but are nevertheless activated to some degree or other in response to a certain input that may not be their preferred stimulus. This is so because color information is encoded distributively across neurons, which means that a neuron is excited even if its preferred stimulus (that is, the stimulus to which it responds maximally) is not in its receptive field but some other stimulus is, although in that case its activation is weaker. It follows that neurons in the red hue-type maps of V2 that code different red or similar-to-red (for example, orange) hues are activated to various degrees and send bottom-up signals to V4 and other cortical areas. When the V4 cell is activated as a part of VSTM, it sends top-down signals to the multiplicity of neurons in V1 and V2 with which it communicates. It follows that the neurons in the red (or similar-to-red) hue maps in V2 are activated to various degrees owing to both stimulus-driven bottom-up and cognitively driven top-down signals. Due to the top-down modulation, the perceptual process is biased in favor of neurons that code for bright red, but which one will prevail also depends on the amount of activation received from the bottom-up signals, as well as on the modulation from other assemblies that carry contextual perceptual information. This account suggests how semantic knowledge in late vision modulates perceptual processing directly, that is, by being used by the perceptual processes themselves.
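The contrast between dense and sparse coding drawn above can be illustrated with a toy model in Python. It is only a sketch under stated assumptions: the five preferred hues (in degrees, running from red toward orange), the Gaussian shape and width of the tuning curve, and the function names are all invented for illustration and are not taken from the physiological literature cited in the text.

```python
import math

# Hypothetical preferred hues (degrees) of five hue-selective assemblies.
PREFERRED_HUES = [0, 10, 20, 30, 40]


def tuning(stimulus_hue: float, preferred_hue: float, width: float = 15.0) -> float:
    """Assumed Gaussian tuning: the response is maximal at the preferred hue
    and falls off, without vanishing, for neighboring hues."""
    return math.exp(-((stimulus_hue - preferred_hue) ** 2) / (2 * width ** 2))


def dense_code(stimulus_hue: float) -> list:
    """Dense, distributed code: one graded activation value per assembly."""
    return [tuning(stimulus_hue, hue) for hue in PREFERRED_HUES]


def assemblies_in_category(low: float, high: float) -> list:
    """Sparse, categorical code: a hue range that several assemblies satisfy."""
    return [hue for hue in PREFERRED_HUES if low <= hue <= high]


if __name__ == "__main__":
    print(dense_code(20))                 # every assembly responds to some degree
    print(assemblies_in_category(0, 10))  # a category picks out more than one assembly
```

A single stimulus hue yields a graded pattern over all five assemblies, whereas a category such as the assumed range 0 to 10 is compatible with more than one of them. That is the sense in which many fine-grained encodings can be compatible with one sparse top-down signal.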
Recall that in discussing late vision, we saw that parietal areas in the dorsal system and areas in the ventral pathway semantically process the LSF information and determine the gist of the scene based on stored knowledge that generates predictions about the most likely interpretation of the input. This information reenters the extrastriate visual areas and modulates perceptual processing facilitating the analysis of HSF, by specifying certain cues in the image that might facilitate target identification. Thus, perceptual processes use the semantic information concerning the putative identity of the object(s) in the visual scene since this information specifies those parts of the iconic image that probably contain information relevant to the identification of these objects, and which, once highlighted streamline the subsequent


perceptual processes. This is also how the semantic/cognitive factors tip the scale in favor of some perceptual interpretation. It is important to stress that semantic knowledge (co)shapes the formation of the percept by affecting perceptual processing itself, i.e., directly; it is not used either to pre-specify the percept before the onset of perceptual processing, or to select one of several interpretations after perception has output them. I argued that cognitive states, despite their symbolic nature, could modify perceptual processing so that, when presented with a stimulus, perception outputs R2 instead of the R1 that it would have output were it not modified by the cognitive states. To see how this applies to a specific case of assumed CP of perception, let us revisit the Delk and Fillenbaum experiment. Participants are presented with an orange-red paper-cut heart and asked to turn a knob so that the color of the background matches the color of the stimulus. The input, thus, is an orange-red color. This entails that those neuronal assemblies in V2 and V4 that represent hues and code for the specific orange-red hue (R1) of the stimulus are maximally activated. Owing to the distributive way in which neurons represent the stimulus, other assemblies coding for various determinable hues of red (R2…k) are also activated to a lesser degree. Lacking any top-down biases, R1 would win the competition and the participant would choose R1 as the background color. When the object is recognized in late vision as a heart and stored in VSTM, beliefs about hearts stored in memory are activated, including the belief that hearts have a deep bright red color.
When this belief is activated owing to the fact that the task at hand is a color task, the neuronal assemblies in VSTM that code for deep bright red are activated too and this activation spreads top-down or laterally and is added to the initial activation of hue specific assemblies in V2 and V4 due to stimulus color information. The result is that the assemblies coding for R1 that were initially winning the competition end up having lesser activation than the assemblies coding for R2 and lose the biased competition; the viewer perceives R2 and turns the knob to match the background color accordingly. In this account, the cognitive states modify perceptual processing without a need for their contents to be transformed to analog contents; they modulate the activity of, and bias the competition


among, the neuronal assemblies that represent information from the visual scene. This account explains both the possibility for the cognitive effects to overcome the bottom-up influences and shift the percept toward the cognitive content (in the Delk and Fillenbaum experiment, for example, the viewing conditions are not normal (Zeimbekis 2013), which means that the bottom-up signals are weak) and the graded effects noticed by Deroy. Since the attentional modulation is the expression of the confluence of bottom-up and top-down influences on a neuronal ensemble, the magnitude of the attentional effect depends on the bottom-up biases pertaining to the specifics of the stimuli that affect the ensemble bottom-up, such as the existence and the number of distractors, depth, texture, and salience in general. The graded effects, far from indicating that attention cannot directly affect perceptual processing, are the results of the way attention directly modulates perceptual processing. The perceptual outcome, thus, also depends on the other perceptually relevant cues in the visual scene, such as texture and depth information. Since these cues play a role in the formation of the output, their role explains the graded effects that Burnston points out. Therefore, far from showing that cognitive states cannot directly affect perceptual processing, the graded effects follow immediately from the way cognitive states affect perception. This answers Burnston’s other objection against the possibility of cognitive states issuing commands that modify perceptual processing. The most significant part of the answer to Burnston’s objection, however, is that cognition need not issue commands of the type Burnston discusses in order to modify perceptual processing.
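The biased competition described for the Delk and Fillenbaum case can be sketched as a toy computation in Python. This is an illustration under explicit assumptions, not a model drawn from the experiment or from the cited physiology: the preferred hues, the Gaussian tuning, the `BRIGHT_RED` range, the purely additive combination of signals, and the winner-take-all readout are all hypothetical choices. The stimulus hue 20 stands in for the orange-red R1, and the assembly at hue 10 stands in for R2.

```python
import math

# Hypothetical preferred hues (degrees) of hue-selective assemblies in V2/V4.
PREFERRED_HUES = [0, 10, 20, 30, 40]
# Assumed hue range that the sparse top-down signal for 'bright red' supports.
BRIGHT_RED = (0, 10)


def tuning(stimulus_hue: float, preferred_hue: float, width: float = 15.0) -> float:
    """Assumed Gaussian tuning of an assembly around its preferred hue."""
    return math.exp(-((stimulus_hue - preferred_hue) ** 2) / (2 * width ** 2))


def winning_assembly(stimulus_hue: float, bottom_up_strength: float,
                     top_down_gain: float) -> int:
    """Add top-down bias to bottom-up drive and return the preferred hue
    of the most active assembly (winner-take-all readout)."""
    low, high = BRIGHT_RED
    activation = {}
    for hue in PREFERRED_HUES:
        bottom_up = bottom_up_strength * tuning(stimulus_hue, hue)
        top_down = top_down_gain if low <= hue <= high else 0.0
        activation[hue] = bottom_up + top_down
    return max(activation, key=activation.get)


if __name__ == "__main__":
    print(winning_assembly(20, bottom_up_strength=1.0, top_down_gain=0.0))   # no bias
    print(winning_assembly(20, bottom_up_strength=1.0, top_down_gain=0.15))  # strong input
    print(winning_assembly(20, bottom_up_strength=0.5, top_down_gain=0.15))  # weak input
```

With these assumed numbers the same top-down gain leaves the winner unchanged when the bottom-up signal is strong and shifts it toward the redder assembly when the bottom-up signal is weak. No instruction about how far to shift the percept appears anywhere in the computation; the outcome falls out of the summed activations.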
The belief that hearts typically have a deep bright red color need not issue a command that prescribes to what extent the perceptual processes should be modified, by commanding, for example, that they compute colors in such a way as to increase saturation by 5% to produce a redder shade. The cognitive state affects perceptual processing by transmitting the top-down information and biasing the perceptual competition among the neuronal assemblies that encode determinate hues. Saturation is increased to match the typical color of hearts because the assemblies that code this hue win the perceptual competition owing to the bias


they receive from the cognitive state; it is the assemblies thus affected that determine what determinate hue is perceived since they carry the required perceptual content. A final objection to my arguments is the following. Recent evidence (Retzeperis et al. 2014) questions the strict functional segregation of V1 and V2 and of most cortical areas. It seems that many color-selective V1 neurons (about 40%) also encode form, and many form-selective neurons in V1 (about 30%) also encode color, although the additional selectivities are weaker. Thus, the selectivities within both the thin and the pale (or interstripe) areas are mixed. One might be tempted to object that, owing to this mixed selectivity, the top-down cognitive signals that reenter V1 and V2 and enhance the activity of the neurons in these areas do not produce specific perceptual effects concerning color, but might do other things, say affect the perception of shape, precisely as EEV would predict. This objection overlooks the fact that the top-down cognitively driven modulation increases or sharpens the activations of the neurons in the reentered site encoding the relevant feature, in this case color, and thereby affects perceptual processing. Since the neurons thus affected also encode shape in a secondary way, shape processing may be affected too, albeit less than color processing. However, since the shape-related content of the cognitive states directing the top-down signals corresponds to the affected perceptual content of the heart-stimulus (both are heart-like), the perception of the shape does not change, in contradistinction to the perception of color, which is affected and shifts toward the color-related content of the cognitive state.
Furthermore, the fact that some neurons in, say, V1 encode both color and shape only means that they are activated when some color or shape is present in the stimulus; these neurons then participate in the distributed representation of both features. When a top-down modulation affects the activation of the neurons, it affects their role in the distributed representation of both features, which means that the top-down influences do produce specific effects for both representations, although they change the representation of color but not of shape for the reason explained above. It is not the case, as the objection states, that on some occasions the modulation may affect the representation of color and on other occasions the representation of shape.


5.3 IEV and EEV: Direct and Indirect Cognitive Effects on Perception

It is interesting to compare the account I have offered here with Burnston’s EEV. According to its definition, “Tokening of a lexical/atomic concept as part of a cognitive state provides a bias toward any perceptual processes associated with the concept, raising the probability that those processes will be applied to a perceptual stimulus.” Thus, the proposal of EEV is that tokening of a concept potentiates any of the perceptual processes associated with the category, including those that represent particular object features and the processes that integrate them, but does not determine any of their specific outcomes. How is this different from the direct CP that I defended above? The first difference is that for Burnston the concept biases any perceptual process associated with the concept, since the lack of structure in the concept entails that all associations of the concept will get activated and bias all perceptual processes associated with the concept. In my account, the cognitive representation associated with the belief that hearts have a bright red color represents its referent with a sparse basis function that puts the referent in the space of colors and, thus, activates only, or predominantly, color-related processes. As a matter of course, the belief that the object-stimulus is a heart activates other properties of typical hearts, but the nature of the task, which is a color task and in which the shape of the heart is the same in both the perceptual and the cognitive representations, determines which feature dominates and which is affected in a way that changes the perception of the feature. In other words, not anything goes, as EEV suggests, when cognition affects perception. The perceptual context determines both which cognitive states are activated and which features are affected in a top-down manner by these cognitive states.
The bigger problem, however, is that Burnston’s distinction between IEV and EEV is based on the erroneous belief that in IEV the cognitive state needs to have a deterministic influence on the percept. If, as I have argued, the direct cognitive influence directly influences perceptual


processing by biasing the competition between perceptual representations and the content of the winner state is determined by the synergy of the top-down and bottom-up information, this part of the distinction between IEV and EEV collapses. It may be true that a perceptual process P that performs a certain computation C1 leading to content R1 in the absence of a cognitive state S would perform a different computation C2 yielding content R2 if S penetrated P, but this does not entail that S determines R2 in the sense that it dictates which content R2 has. R2 is determined by the synergy of the affecting cognitive content and the affected perceptual content. To repeat a claim made several times before, the cognitive states do not influence perception deterministically and, thus, do not have to support the requisite content, but simply influence it directly. Revisiting EEV, recall that EEV stipulates that the tokening of a lexical/atomic concept as part of a cognitive state provides a bias toward any perceptual processes associated with the concept, raising the probability that those processes will be applied to a perceptual stimulus. This is not quite right. Even in cases where the baseline activity of some neuronal assemblies is biased by pre-cueing, not all perceptual processes associated with the cued concept are biased. Only those processes specific to the task at hand and the perceptual context are biased. Biasing red hearts by activating the concept RED HEART does not affect the baseline activity of all neurons that are in principle relevant to all the features associated with hearts (such as the pumping noise made by the heart) but only those that represent distributively the typical color of the heart.
Thus, once “any perceptual processes” is replaced by “the task-relevant perceptual processes” and it is specified that the bias occurs offline and, thus, does not affect the relevant perceptual computations but only the initial conditions or parameters figuring in these processes (the way pre-cueing affects perceptual processing), EEV describes the indirect cognitive effects on perception. EEV can account for the indirect cognitive effects on perception, but it is not adequate to describe the way cognition directly affects perception, since this is captured by IEV properly understood.


References

Balcetis, E., & Dunning, D. (2006). See what you want to see: Motivational influences on visual perception. Journal of Personality and Social Psychology, 91, 612–625.
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22(4), 577–609.
Brossel, P. (2017). Rational relations between perception and belief: The case of color. Review of Philosophy and Psychology. https://doi.org/10.1007/s13164-017-0359-y.
Burnston, D. (2017). Cognitive penetration and the cognition-perception interface. Synthese, 194, 3645–3668.
Burnston, D. C., & Cohen, J. (2015). Perceptual integration, modularity, and cognitive penetration. In J. Zeimbekis & A. Raftopoulos (Eds.), The Cognitive Penetrability of Perception: New Philosophical Perspectives (pp. 123–144). Oxford: Oxford University Press.
Cecchi, A. (2018). Cognitive penetration of early vision in face perception. Consciousness and Cognition. http://doi.org/10.1016/j.concog.2018.06.005.
Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36, 181–253.
Conway, B. R., Moeller, S., & Tsao, D. Y. (2007). Specialized color modules in macaque extrastriate cortex. Neuron, 56, 560–573.
Delk, J. L., & Fillenbaum, S. (1965). Differences in perceived colour as a function of characteristic color. The American Journal of Psychology, 78(2), 290–293.
Deroy, O. (2013). Object-sensitivity versus cognitive penetrability of perception. Philosophical Studies, 162, 87–107.
Fazekas, P., & Nanay, B. (2017). Pre-cueing effects: Attention or mental imagery? Frontiers in Cognitive Science. https://doi.org/10.3389/fpsyg.2017.00222.
Firestone, C., & Scholl, B. J. (2016). Cognition does not affect perception: Evaluating the evidence for ‘top-down’ effects. Behavioral and Brain Sciences. http://dx.doi.org/10.1017/S0140525X15000965.
Fodor, J. (2007). The revenge of the given. In B. P. McLaughlin & J. Cohen (Eds.), Contemporary Debates in the Philosophy of Mind. Malden, MA: Blackwell.
Fodor, J., & Pylyshyn, Z. (2015). Minds Without Meanings: An Essay on the Content of Concepts. Cambridge: MIT Press.

2  Cognitive Penetrability     155

Gegenfurtner, K. R., & Rieger, J. (2000). Sensory and cognitive contributions of color to the recognition of natural scenes. Current Biology, 10, 805–808.
Gross, S. (2017). Cognitive penetration and attention. Frontiers in Psychology. https://doi.org/10.3389/fpsyg.2017.00221.
Gross, S., Chaisilprungraung, T., Kaplan, E., Menendez, J., & Flombaum, J. (2014). Problems for the purported cognitive penetration of perceptual color experience and Macpherson’s proposed mechanism. In E. Machery & J. Prinz (Eds.), Thought and Perception (pp. 1–30). Lawrence, KS: New Prairie Press.
Hansen, T., Olkkonen, M., Walter, S., & Gegenfurtner, K. (2006). Memory modulates color appearance. Nature Neuroscience, 9(11), 1367–1368.
Hatfield, G. (2002). Perception as unconscious inference. In D. Heyer & R. Mausfeld (Eds.), Perception and the Physical World: Psychological and Philosophical Issues in Perception. West Sussex: Wiley.
Hegde, J., & Kersten, D. (2010). A link between visual disambiguation and visual memory. The Journal of Neuroscience, 30(45), 15124–15133.
Itti, L., & Baldi, P. (2005). A principled approach to detecting surprising events in video. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1, 631–637.
Kitcher, P. (2001). Real realism: The Galilean strategy. Philosophical Review, 110(2), 151–199.
Kosslyn, S. M. (1994). Image and Brain. Cambridge: MIT Press.
Ling, S., Liu, T., & Carrasco, M. (2009). How spatial and feature-based attention affect the gain and tuning of population responses. Vision Research, 49, 1194–1204.
Livingstone, M., & Hubel, D. H. (1987). Psychophysical evidence for separate channels for the perception of form, color, movement, and depth. Journal of Neuroscience, 7, 3416–3468.
Macpherson, F. (2012). Cognitive penetration of colour experience: Rethinking the issue in light of an indirect mechanism. Philosophy and Phenomenological Research, 84(1), 24–62.
Marchi, F. (2016). Attention and cognitive penetrability: The epistemic consequences of attention as a form of metacognitive regulation. Consciousness and Cognition. https://doi.org/10.2016/j.concog.2016.06.014.
McGrath, M. (2013a). Siegel and the impact for epistemological internalism. Philosophical Studies, 162(3), 723–732.
McGrath, M. (2013b). Phenomenal conservatism and cognitive penetration. In C. Tucker (Ed.), Seemings and Justification (pp. 225–247). Oxford: Oxford University Press.


Mole, C. (2015). Attention and cognitive penetration. In J. Zeimbekis & A. Raftopoulos (Eds.), The Cognitive Penetrability of Perception: New Philosophical Perspectives (pp. 218–238). Oxford: Oxford University Press.
Murray, S. O., Schrater, P., & Kersten, D. (2004). Perceptual grouping and the interactions between visual cortical areas. Neural Networks, 17, 695–705.
Newen, A., & Vetter, P. (2016). Why cognitive penetration of our perceptual experience is still the most plausible account. Consciousness and Cognition. http://dx.doi.org/10.1016/j.concog.2016.09.005.
Nikolic, D., & Singer, W. (2007). Creation of visual long-term memory. Perception and Psychophysics, 69(6), 904–912.
Nobre, A. C., Rohenkhol, G., & Stokes, M. G. (2012). Nervous anticipation: Top-down biasing across space and time. In M. Posner (Ed.), Cognitive Neuroscience of Attention (2nd ed.). New York, NY: Guilford Press.
Norretranders, T. (1998). The User Illusion: Cutting Consciousness Down to Size. New York: Penguin Books.
O’Callaghan, C., Kveraga, K., Shine, J. M., Adams Jr, R. B., & Bar, M. (2016). Predictions penetrate perception: Converging insights from brain, behavior and disorder. Consciousness and Cognition. https://doi.org/10.1016/j.concog.2016.05.003.
Pylyshyn, Z. (1999). Is vision continuous with cognition? Behavioral and Brain Sciences, 22, 341–365.
Pylyshyn, Z. (2003). Seeing and Visualizing: It’s Not What You Think. Cambridge: MIT Press.
Pylyshyn, Z. (2007). Things and Places: How the Mind Connects with the World. Cambridge: MIT Press.
Raftopoulos, A. (2001a). Is perception informationally encapsulated? The issue of the theory-ladenness of perception. Cognitive Science, 25, 423–451.
Raftopoulos, A. (2001b). Reentrant pathways and the theory-ladenness of observation. Philosophy of Science, 68, 187–200.
Raftopoulos, A. (2006). Defending realism on the proper ground. Philosophical Psychology, 19(1), 1–31.
Raftopoulos, A. (2009). Cognition and Perception: How Do Psychology and Neural Science Inform Philosophy? Cambridge: MIT Press.
Raftopoulos, A. (2010). Can nonconceptual content be stored in visual memory? Philosophical Psychology, 23(5), 639–668.
Raftopoulos, A. (2011, November 30). Late vision: Its processes and epistemic status. Frontiers in Psychology, 2, 382. https://doi.org/10.3389/fpsyg.2011.00382.


Raftopoulos, A. (2014). The cognitive impenetrability of the content of early vision is a necessary and sufficient condition for purely nonconceptual content. Philosophical Psychology, 27(5), 601–620.
Raftopoulos, A. (2015). The cognitive impenetrability of perception and theory-ladenness. Journal of General Philosophy of Science, 46(1), 87–103.
Raftopoulos, A., & Muller, V. (2006). The phenomenal content of experience. Mind and Language, 27(2), 187–219.
Raftopoulos, A., & Zeimbekis, J. (2015). The cognitive penetrability of perception: An overview. In J. Zeimbekis & A. Raftopoulos (Eds.), The Cognitive Penetrability of Perception: New Perspectives. Oxford: Oxford University Press.
Retzeperis, I., Nikolaev, A. E., Kiper, D., & van Leeuwen, C. (2014). Distributed processing of color and form in the visual cortex. Frontiers in Psychology. https://doi.org/10.3389/fpsyg.2014.00932.
Roe, A. W., Chelazzi, L., Connor, C. E., Conway, B. R., Fujita, I., Gallant, J. L., et al. (2012). Toward a unified theory of visual areas V4. Neuron, 74, 12–29.
Siegel, S. (2011). Cognitive penetrability and perceptual justification. Nous, 46, 201–222.
Siegel, S. (2012). The Contents of Visual Experience. Oxford: Oxford University Press.
Siegel, S. (2013a). The epistemic impact of the etiology of experience. Philosophical Studies, 162, 697–722.
Siegel, S. (2013b). Can selection effects influence the rational role of experience? In T. Gelder (Ed.), Oxford Studies in Epistemology (Vol. 4, pp. 240–270). Oxford: Oxford University Press.
Siegel, S. (2016). How is wishful seeing like wishful thinking? Philosophy and Phenomenological Research. https://doi.org/10.1111/phpr.12273.
Sligte, I. G., Vandenbroucke, A. R. E., Scholte, H. S., & Lamme, A. F. (2010, October). Detailed sensory memory, sloppy working memory. Frontiers in Psychology, 1, 175.
Stokes, D. (2012). Perceiving and desiring: A new look at the cognitive penetrability of experience. Philosophical Studies, 158(3), 479–492.
Stokes, D. (2015). Towards a consequentialist understanding of cognitive penetration. In J. Zeimbekis & A. Raftopoulos (Eds.), Cognitive Penetrability of Perception: New Philosophical Perspectives (pp. 75–100). Oxford: Oxford University Press.


Stokes, D. (2017). Attention and the cognitive penetrability of perception. Australasian Journal of Philosophy. https://doi.org/10.1080/00048402.2017.1332080.
Tanigawa, H., Lu, H. D., & Roe, A. W. (2010). Functional organization for color and orientation in macaque V4. Nature Neuroscience, 13, 1542–1548.
Tucker, C. (2010). Why open-minded people should endorse dogmatism. Philosophical Perspectives, 24, Epistemology.
Tucker, C. (2014). If dogmatists have a problem with cognitive penetration, you do too. Dialectica, 68(1), 35–62.
Vogel, E. K., Woodman, G. F., & Luck, S. J. (2001). Storage of features, conjunctions, and objects in visual working memory. Journal of Experimental Psychology: Human Perception and Performance, 27, 92–114.
Watzl, S. (2017). Structuring the Mind: The Nature of Attention & How It Shapes Consciousness. Oxford: Oxford University Press.
Witzel, C., & Gegenfurtner, K. R. (2018). Are red, yellows, green, and blue perceptual categories? Vision Research. https://doi.org/10.1016/j.visres.2018.04.002.
Wu, W. (2013). Visual spatial constancy and modularity: Does intention penetrate vision? Philosophical Studies, 165, 647–669.
Wu, W. (2017). Shaking up the mind’s ground floor: The cognitive penetration of visual attention. The Journal of Philosophy, 114(1), 5–32.
Xiao, Y., Wang, Y., & Felleman, D. J. (2003). A spatially organized representation of colour in macaque cortical area V2. Nature, 421, 535–539.
Zeimbekis, J. (2013). Color and cognitive penetrability. Philosophical Studies, 165(1), 167–175.

3 Early Vision and Cognitive Penetrability

© The Author(s) 2019. A. Raftopoulos, Cognitive Penetrability and the Epistemic Role of Perception, Palgrave Innovations in Philosophy, https://doi.org/10.1007/978-3-030-10445-0_3

1 Introduction

In the previous chapter, I offered a definition of being Cognitively Penetrated (CP), one of the entailments of which is that, to determine whether a perceptual stage is CP, one should first examine whether this particular stage is directly affected by cognition; if it is, this stage is CP. If this stage is only indirectly affected by cognition, one should turn to examining whether the indirect cognitive effects affect its epistemic role. If they do, this stage is CP. If they do not, it is Cognitively Impenetrable (CI). It follows that late vision, being directly affected by cognition, is CP, which leaves early vision as the sole candidate for being CI. I have argued (Raftopoulos 2001a, b, 2009, 2015) that early vision is not affected directly by cognition, which means that in order to determine whether early vision is CP or CI, one should examine the indirect cognitive influences on early vision and determine whether they affect its epistemic role. Recently, there has been a resurgence of arguments against the view that early vision is not affected directly by cognition. In this chapter, I discuss early vision and examine the new evidence



purporting to show that early vision is directly affected by cognition. Having determined that it is not, and that the newly adduced evidence in fact describes indirect cognitive effects on early vision, I proceed to examine whether these indirect cognitive effects influence in any way the epistemic role of early vision.

The thesis that there are no direct effects in early vision can be attacked from three fronts. The first criticism points out that there are two special sorts of effects on early vision that may be taken as evidence that early vision is intrinsically or directly penetrated by cognition and is conceptually modulated, but these effects are not transmitted in a top-down manner; rather, they affect early vision from within. These are two sorts of ‘bodies of knowledge’ that are embedded in the visual circuits and guide perceptual processing very early and, consequently, function within the time scale of early vision. Moreover, they seem to affect early vision intrinsically. The first is a set of general principles that perception employs to solve various underdetermination problems. The second is information that results from perceptual learning and is encoded in early visual circuits, enabling fast recognition of objects.

The second and third lines of attack are based on empirical findings that allegedly show that cognition directly affects early vision in a top-down manner. In this vein, it has been argued by philosophers (Cecchi 2014; Ogilvie and Carruthers 2015) and cognitive scientists (Goldstone et al. 2015; Newen and Vetter 2016; Lupyan 2015; Vetter and Newen 2014) that various cognitive effects directly modulate early visual processing itself, in that the signatures of these effects are found within early vision, and since these effects involve cognition, early vision is CP because the processes of early vision are directly affected by cognition and, thus, operate on cognitive information.
The studies referred to as evidence against the CI of early vision come from two experimental paradigms that constitute the two lines of attack respectively. The first includes the class of studies that examine the sort of processes that occur in visual perception and their timing with an emphasis on the existence and timing of recurrent flow of information that involves both bottom-up and top-down processing. The argument against the CI of early vision is that these studies show that recurrent processes involving cognitive information are found in early vision.


The second includes the class of studies that use the pre-cueing paradigm. If a viewer is told that a red T will appear in the visual scene, after some milliseconds the neurons that encode ‘red’ and the shape ‘T’ are activated, which results in the red T receiving priority when it appears in the visual scene with sufficient delay. At first glance, it seems that these effects constitute cases of CP; they are internal to the viewer, the influence is transmitted in a purely mental way, and they affect the percept. Against this criticism, I defend the CI of early vision, offering the second line of defense of the view that early vision is CI. I am going to argue that the empirical evidence adduced at best shows either that early vision is not affected by cognition at all (in the case of recurrent effects) or that it is affected indirectly by cognitive states (in the case of pre-cueing). There is no doubt, in any case, that there are many sorts of indirect cognitive effects on early vision. To determine whether they render early vision CP, one should take recourse to the epistemic criterion of CP and examine whether these indirect effects influence in any way the epistemic role of early vision; if they do, early vision is CP, notwithstanding the indirect nature of these effects. This presupposes that one has determined the epistemic role of early and late vision on independent grounds, which is the problem I took up in the last section of the preceding chapter.

Let me first say a few things about a frequent reaction whenever the cognition/perception interaction is discussed. In view of the pervasiveness of recurrent processes in the brain involving high- and low-level brain areas, does it even make sense to think of the brain as being divided into cognitive and non-cognitive areas?
I have dealt with this problem elsewhere, arguing that there is a clear-cut, well-defined distinction between cognition and perception (Raftopoulos and Zeimbekis 2015, see also Chapter 5), but fortunately I do not have to repeat any of this here, because all of my critics presuppose such a distinction in their work and make it central to their arguments even when they argue that cognition penetrates all of perception. Another thing that the reader should bear in mind is that whenever I talk of cognitive effects on perception, I include cognitively driven attention effects and the effects of mental imagery, since both classes of effects involve cognitive components.


2 The Operational Constraints in Perception and Fast Categorization Owing to Perceptual Learning Do Not Entail That Perception Is CP: Why Early Vision Is Not Affected Directly by Cognition Part 1

2.1 Operational Constraints

It is common knowledge that, owing to the poverty of the stimulus, a host of problems related to various sorts of underdetermination arise during perceptual processing (Raftopoulos 2009). In order for the visual system to be able to construct the percept, it uses a large number of ‘assumptions’ about the geometry of our world and its physical nature that guide perceptual processing from its onset. I have called these assumptions ‘operational constraints’. One could insist that these constraints entail that there is a deeply rooted conceptual ingredient in perception that renders perception akin to discursive, doxastic inferences (Cavanagh 2011; Rock 1983; Spelke 1988). To decide whether such constraints entail that perceptual processing is affected by, or depends on, concepts from its onset, one should examine these constraints and determine their epistemic status.

Visual processing does not function free of any internal restrictions; it is constrained and modulated at every level by certain principles, or operational constraints that embody certain principles. Because the retinal image underdetermines both the distal object and the percept, perception would not be feasible if the processing of information were not constrained by ‘assumptions’ that substantiated reliable generalities about the physical world and its geometry. Most computational theories (Biederman 1987; Marr 1982) endorse this view, and there is evidence that physiological visual mechanisms implement such constraints in their design, from cells for edge detection to mechanisms implementing the epipolar constraint.
The constraints are described by many theorists (see, for example, Marr 1982, 185) as ‘hardwired’ into the visual system and as reflecting ‘some kind of a statistical rule of the universe’. (Note that Fodor [1983] uses the term ‘hardwired’ to describe modular processing.


Fodor also thinks that constraints could be implemented within modules by sentence-like structures, a point with which I disagree.) The force of these principles is such that Gestalt principles are very often overridden by the principles underlying the perception of objects in motion. These mechanisms allow infants to infer that under proper movement a single object is displaced despite the fact that this object is center-occluded by another superimposed object, a fact which could have led the infants to perceive two separated objects. Whenever the object remains still, infants perceive two different objects separated by another one. In addition, infants fail to recover the boundary between objects that are adjacent and move together. Burge (2010) calls the constraints ‘formation principles’. I (Raftopoulos 2001a, 2009) call them ‘operational constraints’, which is the term I adopt here. Operational constraints reflect general or higher-order physical regularities that govern the behavior of objects in our world and the geometry of the space around viewers. Through causal interaction with the environment over the course of evolution they have gradually been incorporated into the perceptual system. They allow us to perceptually lock onto medium-sized lumps of matter in the world by providing the discriminatory capacities necessary for the individuation and tracking of objects in a bottom-up, nonconceptual way (Raftopoulos 2009), and they allow perception to generate perceptual states that present objects in the world as cohesive, bounded solids, and as spatio-temporally continuous entities (Spelke 1988). These constraints can be seen as the rules that guide the various grouping principles (that extend and occasionally override Gestalt grouping principles) that the perceptual system uses to segregate objects from ground. The constraints are not available for introspection, function outside the realm of consciousness, and cannot be attributed as acts to the viewer. 
Viewers do not believe implicitly or explicitly that an object moves in continuous paths or that it is rigid, even though they use this information to parse objects. These constraints are not perceptually salient but one must be ‘sensitive’ to them if they are to be described as perceiving the world. The constraints are the modus operandi of the perceptual system and not a set of rules used by the perceptual system either as premises in inferences or as rules in inferences;


this modus operandi consists of operations determined by laws describable in terms of computation principles. They are reflected in the functioning of perception and can be used only by it, whereas “theoretical” constraints are available for a wide range of cognitive tasks. These constraints cannot be overridden since they are not under the perceiver’s control; one cannot substitute them with another body of constraints even if they know that they lead to errors. According to Haugeland (1998, 261), non-concept possessing creatures and we share various innate “object-constancy” and “object-tracking” mechanisms that automatically ‘lock onto’ medium-sized lumps of matter. These mechanisms provide the discriminatory capacities necessary for the individuation and recognition of objects in a bottom-up way that does not involve any concepts. Haugeland claims that the objective character of perception, that is, the fact that perception is about objects qua objects, can be attributed to the role of some normative standards that constitute thinghood. The “constitutive standards for thinghood” are the principles of cohesiveness and compatibility that result from the operational constraints on perception. Haugeland (1998, 248–249) claims that perceivers do not have cognitive awareness of the standards in some explicit formulation, and that these standards are not expressed as rules. By being hardwired, the constraints are not even contentful states of the perceptual system, or, if they are, the contents are not conceptual, propositionally structured contents that could form some theory. A neural state is formed through the spreading of activation and its modification as it passes through the synapses connecting the neurons. The hardwired constraints determine the processing, that is, the transformation from one state to another, but they are not the result of this processing. 
They are computational principles that describe transitions between states; they constitute a computational processor. The states that are produced by means of these mathematical transformations have contents, but there is no reason to assume that the principles that specify the mathematical transformation operations are states of, and are represented in, the system. This is what the expression modus operandi used above purports to convey; even though perception by using the constraints operates in accord with


the principles reflected in them, the perceiver does not represent these constraints.

Brogaard and Gatzia (2017, 198, 200) argue that these principles may be best classified as types of implicit beliefs but are not rational principles, because they do not conform to standard tenets of rationality that include standard rules of logic, probability theory, and statistics and other norms of reasoning. Since for Brogaard and Gatzia CP concerns cases where visual experience can be modulated by belief and knowledge acquired after the maturity of the perceptual system, and since these principles are inherent or acquired very early in development, their function in visual processing does not entail that visual perception is CP. Even though I agree that these principles do not signify that perception is CP, I do not share the view that the principles are types of implicit beliefs, as I said above and will argue next.

What, then, about the claim that knowledge about worldly objects is needed for the filling-in that allows the construction of the percept? If the operations that effectuate the filling-in are not represented in the system but are performed by hardwired computational processors, is it legitimate to talk about these processors realizing some object knowledge in the form of a set of rules concerning the physical environment and its geometry? This depends on what one is willing to count as knowledge. I said above that if the operational constraints are not states of the visual system but computational processors, they are not representations or beliefs of any form, either implicit or explicit. (Explicit beliefs are representations that are activated, while implicit beliefs are representations stored in long-term memory but not currently activated.)
If the operational constraints are not states of the system, what is the epistemic status of the information they contain, that is, what is the epistemic status of the information included in the regularities about the environment and its geometry that the constraints realize? One could say, first, that by not being states of the system, the operational constraints do not have any contents; they are not semantic or mental entities of any kind. To think that they are is a mistake that cognitive scientists often commit (Searle 1995). When cognitive scientists


deal with an input and an output state that are both mental states with contents, they usually assume that the processes that connect them are also mental states with representational contents. There is no reason, however, not to think of the processes that connect the inputs with the outputs as non-meaningful, non-contentful, causal connections. Accordingly, the function of the operational constraints in perception does not entail that perception is guided by ‘object knowledge’. The operational constraints are combinatorial principles.

Other philosophers think that such operational constraints are states of the system. The term ‘tacit knowhow’ is employed to denote the information carried by states that are built into the system in a way that does not require that the states be represented in any form in the system (Dennett 1983). This tacit knowhow is not represented anywhere in the system and is not a kind of knowledge; if it were, we would have to hold the view that birds, in the muscular system of which the laws of aerodynamics are hardwired, know aerodynamics (Dennett 1983).

There are philosophers who part company and believe that hardwired computational processors realize in a system tacit knowledge of a particular set of rules or generalizations (Davis 1995, 329). “The rules would not have to be explicitly represented in any representational state of the system. Still less would knowledge of the rules be realized in a state of the same kind as an attitude state.” Davis claims that tacit knowledge is not realized by attitude states because states of tacit knowledge have two main characteristics that bar them from being attitude states. First, tacit knowledge is subdoxastic knowledge, since it is not inferentially integrated with other attitude states and it exists in special-purpose, separate sub-systems (Stich 1978).
Second, attitude states require that the concepts that are part of the semantic contents of the states must be concepts that are possessed by the person who is in these states; a believer, for example, must necessarily grasp the concepts of which the belief is constituted. This means that beliefs have their representational contents conceptualized by the believer. The contents of tacit states, however, are not conceptualized. In addition, when persons are in a tacit state, they do not have, simply by being in that state, access to the contents of this state, as the persons who are in some attitude state are. When a person is in a belief state, they entertain the content of the


belief simply by being in that state. It follows that, according to the proponents of tacit knowledge, the operational constraints realize tacit, representational knowledge of the regularities of the physical environment and of its geometry. However, these are not conceptual representations.

Thus, irrespective of how one conceives of the information realized by the operational constraints, that is, independent of whether the constraints are construed as merely causal connectors with no representational contents, or whether this information constitutes some sort of tacit, non-representational knowhow, or whether it is some sort of tacit, representational knowledge, the constraints are not rules of inference that the visual system looks up implicitly or explicitly to perform its interstate transformations, or premises used in such transformations. Moreover, that perception relies on some operational constraints to function successfully does not entail that perception is affected by concepts from within. As we saw, in any interpretation of the information realized by the operational constraints, this is not conceptual content. Hence, the existence of some operational constraints hardwired in perception does not entail that there is some sort of knowledge that determines or simply affects perceptual processing. So, in general, the grouping principles that underlie the operational constraints are not top-down influences on perception but, instead, are considered bottom-up biases affecting perceptual competitions (Beck and Kastner 2009, 1159–1160). Thus, Burge (2010) is right to argue that the formation principles do not entail the theory-ladenness of perception:

For many philosophers, the notion of computational states or explanations is theory-laden in a way that I do not intend.
When I call states or explanations ‘computational’, I do not mean that there are transformations on syntactical items, whose syntactical or formal natures are independent of representational content [of the computed states]. I also do not mean that the principles governing transformation are instantiated in the psychology, or ‘looked up’, even implicitly in the system … principles governing perceptual transformations … are not the representational content of any states in the system, however unconscious. (Burge 2010, 95)


Even if one accepts that the operational constraints do not amount to principles embedded in the visual system imbuing it with some sort of theory, one could still argue that even though early vision and its processes and states are CI, the content of these states could be conceptual if one assumes that some concepts are embedded in the contents of early vision not by reaching this stage via top-down flow of information but by being there either from the beginning or as a result of the development of the circuits of early vision. This is Fodor’s (2007) view, and also a possibility considered by Block (2007, 346). Concepts can be embedded within the visual system and perhaps can be used only by it, but this is enough to render early vision contents propositionally/ conceptually structured. For Pylyshyn (2007, 52), such concepts may be “codes for proximal properties involved in perception, such as edges, gradients, or the sorts of labels that appear in early computational vision”, or, in general, codes for those properties that are involved in the operational constraints (Raftopoulos 2009) hardwired in early vision and whose role is to ensure that the various underdetermination problems in vision are solved. Pylyshyn (2003, 2007) also thinks that perceptual contents are not representational because they are not conceptual. These two considerations pave the way to the view that perceptual states may not have contents after all. This view is brought to the fore in Fodor and Pylyshyn (2015), where it is argued that perceptual states have no contents but refer directly to worldly states of affairs. This view allows Pylyshyn to reintroduce concepts in perception, although this notion of concept is radically different from the notion of concept as used in discussions concerning both conceptual influences on perception and the NCC of perception. Or, finally, they may be sensory concepts for colors, shapes etc. (Block 2007; Fodor 2007). 
Be that as it may, these concepts do not play the role that concepts are usually thought to play in cognition. First, for embedded concepts of the sort posited by Pylyshyn, in view of the fact that the operational constraints in which these “concepts” enter are hardwired in the system and as such are not represented in the system, these “concepts” are not represented in the system either, which is why Pylyshyn calls them “codes” or “labels”. This means that they are dissimilar to ordinary concepts (recall that concepts are context-independent, freely repeatable elements that figure constitutively in propositional contents), which are representational elements. Second, and this point also concerns sensory concepts, they can be used only in visual processes and are not available for cognitive tasks, because they cannot combine logically and they do not participate in inferences. Perceptual contents, being iconic and not discursive, lack the formal structure that would allow them to enter into logical relations. The reason is that iconic representations have no canonical decomposition; that is, although they have interpretable parts, they have no constituent parts because they are homogeneous. Discursive representations, on the other hand, have a canonical decomposition because they consist of distinguishable parts (Fodor 2007; Heck 2007; Pylyshyn 2007; Raftopoulos 2009). Third, they do not allow re-identification across times and contexts of the objects formed during early vision (Campbell 2006; Kelly 2001; Pylyshyn 2007; Raftopoulos 2009). Fourth, they do not satisfy Evans’s (1982) generality constraint (Heck 2007; Raftopoulos 2009). Thus, if one uses ‘concept’ with its usual significance, contents in early vision do not include concepts.

2.2 Perceptual Learning
In the course of our interaction with the world some experiences are learned and form memories that are stored in visual memory, and these memories include certain sensory concepts. Thus, one could argue, concepts affect perceptual processing, rendering it CP from within. Let us examine this argument. Visual memories affect the way one perceives the world. Familiarity with objects or scenes, the result of repeated exposure to objects or scenes (sometimes one presentation is enough), and repetition memory facilitate the search for objects, affect figure-ground segmentation, and speed up object identification and image classification (Liu et al. 2009; Peterson and Enns 2005). Familiarity affects visual processing in different ways. It facilitates object identification and categorization, which are processes that take time since they occur between 300 and 600 ms after stimulus onset, and their earliest stage starts at about 150 ms after stimulus onset (Johnson and Olshausen 2005). Familiarity is thought to intervene during the latest stage of object identification and categorization (300–360 ms). These effects are considered to be post-sensory in that they involve the cognitive levels of the brain at which semantic information and processing, both being required for object identification and categorization, occur (Delorme et al. 2004). These effects of perceptual learning and familiarity therefore take place in late vision and do not entail that early vision employs concepts. Hence, they do not entail that early vision is directly affected by cognition. Familiarity and repetition memory also affect object classification (whether an image portrays an animal or a face, for example), a process that occurs at short latencies (95–100 ms and 85–95 ms after stimulus onset respectively) (Crouzet et al. 2010; Liu et al. 2009). These effects threaten the thesis that early vision is CI because they occur early and are clearly not post-sensory. The thesis that early vision is CI would be undermined if the classification processes required either that semantic information be used or that representations of objects in working memory be activated, since that would entail conceptual involvement. However, researchers agree that the early classification effects result from the feedforward sweep (FFS) and do not involve semantic information, and, furthermore, that they do not depend on the activation of any object memories. The main reason for this claim is that if they did require any of these, they could not be that fast. Instead, the brain areas involved are low-level visual areas (including the frontal eye fields, FEF) from V1 to V4 (Kirchner and Thorpe 2006), and, a little more upstream, posterior IT and the lateral occipital complex (LO) (Grill-Spector et al. 1998).
The early effects of familiarity could be explained by invoking contextual associations (context spatial relationships) stored in early sensory areas to form unconscious perceptual memories (Chaumon et al. 2008), which, when activated by incoming signals that bear the same or similar target-context spatial relationships, modify the FFS of neural activity, causing the facilitating effects. This is another case of rigging up the FFS; it is not a case of top-down effects on early visual processing. The brain areas involved are low-level visual areas (including FEF) from V1 to V4 (Kirchner and Thorpe 2006), and, a bit more upstream, posterior IT and the lateral occipital complex (LO) (Grill-Spector et al. 1998). Alternatively, the early effects could be explained by invoking configurations of properties of objects or scenes stored in visual circuits. Neurophysiological research (Grill-Spector et al. 2006), psychological research (Peterson and Enns 2005), and computational modeling (Ullman et al. 2002) suggest that early visual areas store implicit associations representing fragments of objects and shapes (“edge complexes”), as opposed to whole objects and shapes. One of the reasons that researchers hold that object and shape fragments, rather than whole objects and shapes, are used in rapid classifications is that if these associations affect figure-ground segmentation, then, in view of the fact that figure-ground segmentation occurs very early (80–100 ms) (Lamme and Roelfsema 2000), they must be stored in early visual areas (up to V4, LO and posterior IT). Early visual areas store object and shape fragments that speed up FFS and local recurrent processing (LRP) in early vision. Similarly, Deroy (2013) argues that the putative examples of CP of color experience may be explained by invoking sensory integration whereby high-level multimodal representations about a visual object’s color, shape, volume, texture, etc. are activated by sensory information and affect early visual processing. These representations that affect early perceptual processing top-down, however, are not cognitive representations like beliefs and knowledge and, thus, do not entail that color perception is CP. Since posterior IT and the lateral occipital complex (LO) are involved in the storage of configurations of properties and these are high-level vision areas, Deroy’s views could be taken to belong in this category. These associations likely reflect the statistical distribution of properties in natural scenes in the environment (Delorme et al. 2004).
The statistical differences in the physical properties of various subsets of images are detected by the visual system very early and before any top-down semantic involvement, as is evidenced by the elicitation of an early deflection in the differential between animal-target and non-target ERPs at about 98 ms in the occipital lobe. The low-level cues are retrieved by analyzing the energy distribution across orientation- and spatial frequency-tuned channels (Torralba and Oliva 2003).

Familiarity, thus, increases sensitivity to certain properties of the image (visual features such as contours, for example), and this speeds up the perceptual processes that lead to object recognition, reducing the time that recognition takes. This occurs without any top-down cognitive effects and without any conceptual influence from within; thus, these familiarity effects do not entail the CP of perception or any conceptual influences on perception. Potter et al. (2014) argue that early visual processing can be fast and mainly feedforward and may even lead to object categorization. They point out that a possible role for such rapid visual categorization, which leads to a rapid understanding of the visual scene, would be to provide almost immediate activation of the relevant concepts, or concept-like analogs. This, in turn, enables immediate action when necessary, without the organism having to wait for the time-consuming recurrent processing to recognize and categorize the objects in the visual scene and, even, to acquire conscious awareness of the scene. Even though FFS and LRP very quickly activate a concept that is type-identical to the concept that figures in cognitive processes, the activation of this concept does not entail that early vision is CP by concepts, for the simple reason that the concepts thus produced are simply the result of early vision, in the sense that the product of early vision matches a stored template in memory, and have not modulated in any way the visual processes that produced them. To conclude, the early latencies of the effects of perceptual learning preclude them from being top-down cognitive effects. Instead, they are bottom-up, stimulus-driven effects. Still, one could insist that the information stored in visual circuits as a result of perceptual learning involves sensory concepts, rendering early vision conceptual and, thus, CP from within.
I presented evidence suggesting that the early classification is due to associations of shape and object fragments stored in early visual areas and as I explained above these associations could not be viewed as concepts in a philosophically interesting way. I agree, therefore, with Gatzia and Brogaard (2017) when they argue that perceptual learning does not result in the acquisition of new facts, beliefs, pieces of knowledge etc., but in the acquisition of new perceptual capacities that allow fast recognition and categorization.

3 Early Vision: Why Early Vision Is Not Affected Directly by Cognition Part 2
Early vision includes a FFS in which signals are transmitted bottom-up. In visual areas (from the Lateral Geniculate Nucleus (LGN) to FEF) FFS lasts for about 100 ms. Early vision also includes a stage at which lateral and recurrent processes occur that are restricted within the visual areas and do not involve signals from cognitive centers. Recurrent processing starts at about 80–100 ms. Lamme (2003) calls it LRP. The unconscious FFS extracts high-level information and results in some initial feature detection that could lead to categorization. Indeed, a confluence of electrophysiological studies (Keysers et al. 2014) and psychophysical studies (Potter et al. 2014) suggests that early visual processing can be fast and mainly feedforward and may even lead to object categorization. In view of the overwhelming evidence that the activity of early visual cells far exceeds that predicted by their selectivity as determined by their classical receptive fields (RF), and that this activity depends on the global context that can modulate it, the need arises to posit the existence of a lateral and feedback flow of information; this is called LRP (Lamme 2005; Lamme and Roelfsema 2000). LRP produces further binding and segregation. LRP is needed because, owing to the small RF of the neurons in V1 and V2, only local information can be coded at this level. The segmentation and recognition of the objects in a visual scene, however, require a more global analysis of the scene, of the sort that can be achieved in higher areas, such as V4 or MT/V5, where the RF of the neurons are larger and can integrate information across longer distances in the visual field, in addition to lateral connections that enhance the activity of neurons in the light of the computations performed by other neurons at the same level.
The activity of orientation-selective neurons in V1, for instance, is enhanced both when stimuli to whose orientation the neurons are sensitive extend to form a contour, which is coded in V2, via top-down connections from V2 to V1 (Wilson and Wilkinson 2015), and when other V1 neurons code lines with the same orientation, through long-range lateral excitatory connections among V1 neurons (Gilbert et al. 2000; Gilbert and Li 2013). This processing is restricted within the visual areas only and belongs to LRP. LRP seems to be needed because, owing to the small RF of the neurons in V1 and V2, only local information can be coded there; to the extent that more global information is required for object segmentation, recognition, and categorization, recurrent processes of the sort described above are needed. It seems, thus, that contextual modulation of the activity of early visual neurons is ubiquitous. Even the computation of perceptual saliency, which is carried out preattentionally (Itti and Koch 2001) and guides bottom-up, data-driven attention (exogenous attention), requires contextual modulation of the activities of early visual neurons:

Perceptually, whether a given stimulus is salient or not cannot be decided without knowledge of the context in which the stimulus is presented. So, computationally, one must also account for nonlinear interactions across distant spatial locations, which mediate contextual modulation of neuronal responses. (Itti and Koch 2001, 198)

The feedback projections provide the global analysis that allows object segmentation, figure/ground separation, and object recognition. In the case of the MT/V5 feedback to V1, there is evidence (Plomp et al. 2015) that this feedback increases the responsiveness of the neurons in V1 especially for low salience, small signals, which means that the recurrent signals from MT/V5 may serve to disambiguate sub-optimal visual input with respect both to the spatial location and motion of the sub-optimal signals and to their content. In addition, the feedback signals may be used to inform V1 where a change has happened in the visual scene. By not involving signals from the cognitive areas of the brain, FFS and LRP are CI/conceptually encapsulated, since the transmitting of signals within the visual system is not affected by top-down signals produced in cognitive areas. Early vision processing is not affected directly by top-down signals from cognitive states through attention, that is, attention does not affect the early visual processes although it may affect pre- and post-early vision stages of perception.

The processes of early vision retrieve from the environment the information that will eventually allow the perceiver to perceive a visual scene with as much accuracy as possible. In order to do so, early vision gradually constructs representations of increasing complexity (from variations in light intensities it extracts edges, from edges blobs, from blobs two-dimensional surfaces, and from these it infers the 2½D sketch). It also extracts the so-called ensemble properties of the objects in the scene (more about this when I discuss pre-cueing) by retrieving the statistical properties of the objects in the scene. The representations formed in early vision comprise information about spatio-temporal and surface properties, the shape of the object as viewed by the perceiver, color, texture, orientation, motion, and affordances of objects, in addition to the representations of objects as bounded, solid entities that persist in space and time. In general, the role of early vision is to segment objects from the background and bind them to their properties. Identifying objects and categorizing them is the role of late vision (Raftopoulos 2009). One proposal concerning the nature of these processes, albeit a speculative one that is not widely shared even by researchers working within a Bayesian framework of perception, has been put forth by Clark (2013). Applying it to early vision, one gets the following picture. The top-down and lateral effects within early vision aim to test hypotheses concerning the putative distal causes of the sensory data encoded in the lower neuronal assemblies in the visual hierarchy. In this testing, the predictions made on the basis of these hypotheses about the sensory information that the lower levels should encode, assuming that the hypotheses are correct, are compared with the current, actual sensory information encoded at the lower levels. The hypothesis that best matches the sensory data is selected.
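The selection step just described can be made concrete with a toy sketch. It is offered only as an illustration and not as a model of actual neural computation or as Clark's own formalism. It assumes, purely for the sake of the example, that each hypothesis yields a predicted activity pattern over a handful of lower-level units, that mismatch is measured as a root-mean-square prediction error, and that the hypothesis with the smallest error wins. The function names, the hypothesis labels, and the numbers are all hypothetical.

```python
from math import sqrt


def prediction_error(predicted, actual):
    """Root-mean-square mismatch between a predicted and an actual signal."""
    if len(predicted) != len(actual):
        raise ValueError("signals must have the same length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return sqrt(squared / len(actual))


def select_hypothesis(hypotheses, actual):
    """Return (name, error) of the hypothesis whose prediction best matches `actual`."""
    errors = {name: prediction_error(pred, actual)
              for name, pred in hypotheses.items()}
    best = min(errors, key=errors.get)
    return best, errors[best]


# Hypothetical toy data: activity of four lower-level units (assumed values).
actual_signal = [0.9, 0.8, 0.1, 0.0]

# Each hypothesis about the distal cause predicts what the lower level
# should encode if that hypothesis were correct (assumed patterns).
hypotheses = {
    "vertical edge": [1.0, 1.0, 0.0, 0.0],
    "horizontal edge": [1.0, 0.0, 1.0, 0.0],
    "uniform surface": [0.5, 0.5, 0.5, 0.5],
}

best, err = select_hypothesis(hypotheses, actual_signal)
print(best, round(err, 3))
```

In this sketch the comparison is a single pass; the framework described in the text envisages the residual error being sent back up the hierarchy so that the hypotheses can be revised, a step the example deliberately leaves out.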
To form hypotheses concerning the probable cause of the sensory data at a certain level, at a specific spatial and temporal scale, the neuronal assembly at the next level, say level l, uses information not only about the sensory data at the previous level (or, to be precise, information regarding its prediction error) that is transmitted bottom-up, but also higher-level information that is transmitted to l either laterally, that is, from neuronal assemblies at the same level (neurons in V1 processing wavelengths inform other neurons in V1 processing shape information, for example), or top-down from levels higher in the hierarchy (neurons in V4, for instance, are informed about the color of incoming information from neurons in IT as a result of pre-cueing, that is, when a viewer has been informed about the color of an object that will appear on a screen). All this lateral and top-down flow of information provides the context in which each neuronal assembly constructs the most probable hypothesis that would explain the sensory data at the lower level. One caveat is in order. Proponents of the predictive coding framework described above usually assume that the bottom-up signals are mere error signals. Although it is intuitive to hold that after a hypothesis is tested, an error signal should be propagated bottom-up to inform the higher areas of what was wrong in the prediction, this does not entail that all bottom-up signals must be merely error signals. After all, for the higher areas to construct hypotheses about the sources of information at the lower areas, they must receive the information encoded in the lower areas, which can be transmitted there only in a bottom-up manner. For example, in order for a V2 neuron to encode a curved contour, it must receive information from adjacent lower V1 neurons that, owing to their small RF, can encode only oriented lines, lines whose combination gives rise to the curved contour in a V2 neuron that has a three-fold wider receptive field and can, thus, incorporate and synthesize the information from the V1 neurons (Wilson and Wilkinson 2015, 2). It follows that the account presented here does not conflict with the predictive coding framework, provided that some of the standard assumptions of that model are dropped, as they should be because, as I just argued, they cannot adequately explain the visual processes.
More importantly, although Newen and Vetter (2016, 10) argue that the predictive coding framework as outlined by Clark is a very plausible account for explaining brain processes, as we shall see, the literature they adduce to argue for early visual recurrent processing presupposes the standard constructivist model according to which information encoded in lower areas is transmitted bottom-up to higher areas (independent of the nature of the top-down flow of information). Therefore, nothing in their arguments relies on the thesis that the bottom-up signals carry only error information, which, thus, is immaterial to the discussion in this paper.

3.1 The MT/V5 to V1, V2 Interaction in Early Vision (First Pass)
Since 90% of the information transmitted by neurons is transmitted within the first 100 ms of the neurons’ activation in response to a stimulus, information transmitted to neurons from other neuronal assemblies can affect their activity only if it arrives within 100 ms after stimulus presentation (Bullier 2001, 98). Thus, in order for the recurrent signals to have an impact and modulate the activity at the reentered sites, they should reenter them during these crucial 100 ms after the initial activation. Thus, for some signals from V4 or MT/V5, which receive feedforward signals from V1, to reenter V1 in time to influence the activation of V1’s neurons, the loop consisting of feedforward and recurrent signals from V1 to V4 or MT/V5 and back must be completed within 100 ms. To put things into perspective, let us examine Bullier’s (2001) ‘retroinjection’ view as it pertains to MT/V5 and its interaction with V1 and V2. Low spatial frequency (LSF) signals precede high spatial frequency (HSF) signals; LSF information is transmitted through fast magnocellular pathways, while HSF information is transmitted through slower parvocellular pathways. As a result, the information transmitted through the M-channels reaches V1 from LGN 20 ms earlier than the information transmitted from LGN to V1 through the P-channels. The mean activation latency of the neurons in MT/V5 is 75 ms poststimulus. Signals arrive at these areas at about the same time as, or a bit later than, they arrive in V1 (50–80 ms) and V2 (85 ms) and much earlier than they arrive in V4, despite the fact that MT/V5 is anatomically higher than V4 and further away from V1 (Bullier 2001, 98). MT/V5 (and FEF) are parts of the ‘fast brain’ and belong to the dorsal system (MT/V5 is situated in the parietal cortex and the FEF is in the pre-frontal lobe in an area where the dorsal pathway projects).
Signals from V1 can reach MT/V5 at about the same time they reach V2, that is, within 1–2 ms. Thus, in less than 20 ms the recurrent signals from MT/V5 affect the activation of neurons in V1 and V2. When HSF information transmitted through the P-channels reaches V1, 20 ms after LSF information transmitted through the M-channels has reached V1, the responses of V1 neurons have already been modified as a result of the top-down signals from MT/V5, which had received the LSF information earlier. In addition to the fast transmission of signals from V1 to MT/V5 and back through the M-channels, MT/V5 also receives fast signals directly from LGN, bypassing V1, through the koniocellular pathway. Thus, MT/V5 could be activated earlier than V1 (Foxe and Simpson 2002). To recapitulate, MT/V5 is activated very early and the activation in MT/V5 reaches V1 and V2 through top-down connections early enough to modulate the early activations in V1 and V2.¹ Bullier (2001, 100) concludes that the first wave of activity that invades the visual cortex following the appearance of a visual stimulus is carried by M channels and consists in recurrent processes between V1 and MT/V5. This entails that the earliest ERP component, namely C1, which is elicited between 40 and 60 ms, is not an index of the activity of V1 alone but also reflects top-down influences to V1 from areas as high as MT/V5. This view is reinforced by a study by Plomp et al. (2015), whose work is cited by Newen and Vetter as evidence against weak CI, that is, the thesis that early vision is CI. The researchers aimed to “investigate whether recurrent and top-down interactions between visual and attentional brain areas can be identified and distinguished at short latencies in humans.” Their results confirm the fast interaction between V1 and MT/V5 reported by Bullier. They also show that the C1 ERP waveform reflects both V1 activity and activity in highly distributed areas situated at the occipital, parietal, and dorsal frontal lobes (where FEF is located). As Plomp et al. (2015, 1) write, “[s]timulus-evoked activity at latencies before 100 ms is traditionally considered a bottom-up process. Even at these short latencies, however, there is mounting evidence of fast recurrent interactions between visual areas, obtained from direct recordings of neural activity in animal models.” In contradistinction to this early recurrent activity, the parietal cortex and the later cycle of activity in FEF (as we shall see, there is an early and a late cycle of activity of FEF neurons during perception), which are known to modulate perceptual processing so as to help adapt behavior to the demands of a task and context, affect occipital activity around the latency of N1 (170–200 ms after stimulus onset). Thus, top-down interactions that reflect task-specific processing of the stimuli, and that obviously involve cognitive signals, arise at longer latencies. Plomp et al. (2015, 4–5) synopsize their results as follows: “at the N1 latency, driving from MT no longer showed a stimulus effect, indicating that stimulus-specific driving from MT is confined to earlier latencies, in line with its fast response properties.” For Plomp et al., thus, the earliest recurrent interactions, those before N1, are only stimulus-specific and do not involve cognitive signals. This agrees with Bullier’s conclusion that during the early interaction between MT/V5 and V1 (earlier than 100 ms) the signals are stimulus-driven (since the signals entering MT and processed there originate from the stimulus), while only later do interactions that involve cognitively driven attention, whose commands are issued according to the task demands, modulate the activation of neurons in V1. It follows that for Plomp et al. the early recurrent interactions between MT/V5, FEF and V1 are stimulus effects and do not involve any cognitive influences, and that the latter, reflecting the task demands, appear when N1 is elicited at about 170 ms poststimulus, which is outside the time frame of early vision.

¹ Shorter latencies of signal transmission have been reported by Inui and Kakigi (2006), who applied flash stimuli to the right eye, examined activations in eight cortical areas, and found that the cortico-cortical connection time of visual processing at the early stage was 4–6 ms. Even with this temporal profile of activation, there is still time for recurrent signals from hMT to affect V1 because they reenter V1 in about 20 ms, which is well within the time window of the first 100 ms after the initial activation of the V1 neurons.

N1, the second waveform subcomponent of spatial attention to register, arises from multiple generators in posterior parietal areas (an early phase at 140–160 ms) and in ventral occipital-temporal areas (a late phase at 160–200 ms). Unlike P1, which was found to be suppressed at the unattended locations and considerably larger at the attended locations, yet did not show an enhancement at the attended locations, N1 was enhanced at the attended locations but was not suppressed at the unattended locations. It seems that P1 inhibits information from unattended locations, whereas N1 facilitates information from attended locations (Hopfinger et al. 2004). Luck (1995) has proposed that P1 reflects a gain control mechanism that suppresses signals from ignored locations, whereas N1 indexes the addition of a limited-capacity discriminative process at the attended location. N1 is considered to be an index of the orientation of spatial attention to task-relevant objects, that is, objects that are related to the task at hand and are found at the attended locations (Evans et al. 2000). Task-relevant objects are distinguished from task-irrelevant objects at about 140–200 ms after stimulus onset. This agrees with the findings of Chelazzi et al. (1993) and Schall and Bichot (1998) that target (task-relevant)/nontarget (task-irrelevant) discrimination occurs at about 150–200 ms and 120–200 ms, respectively, after stimulus onset. Unlike P1, which is enhanced in both exogenous and endogenous attention, N1 is enhanced only in endogenous or voluntary attention; that is, it is elicited only when subjects view a scene and decide where to attend, and not when they just passively view the scene. This reflects the role of N1 in target/nontarget discrimination. However, N1 is insensitive to the type of the target and occurs long before the identification of the target. Phillips (2017, 7), after having said that purely perceptual (or narrowly construed) states are necessarily under the causal control of the stimulus, points out that this does not mean that there are no top-down influences involved.

In clarifying the thesis that perceptual processes are stimulus-controlled, it is important to realize that it does not rule out top-down influences. The characterization given above merely says that perceptual processes function so as to be causally controlled by proximal stimuli: it does not say that proximal stimuli are always the only causal factors in play. (Phillips 2017, 7)

He then adds that

In fact, even if a paradigmatically perceptual process is itself top-down, this does not entail that it is not stimulus-controlled. Feedback mechanisms in early visual areas (e.g. from V2 to V1) function so as to be causally controlled by proximal stimuli.

So, for Phillips, the existence of recurrent, top-down processes in a visual stage does not entail that this stage is not paradigmatically, purely perceptual (recall that Phillips discusses narrowly defined, pure perception). In agreement with the abovementioned commentary on the empirical evidence, Phillips thinks of these top-down influences as being controlled by the stimulus, and, thus, as stimulus-driven, which means that at this stage there are no cognitive influences. One could argue that even though the evidence adduced thus far shows that the recurrent signals from MT/V5 to V1 carry no cognitive information, it is still possible that cognitive centers send cognitive information directly to V1/V2, bypassing MT/V5 (or V4 or LOC, for that matter, since, as I will argue in the next section, no cognitive influences can be found in the recurrent processes involving those areas either). One such possibility is FEF, which is in the dorsal pre-frontal cortex and is known to mediate both shifts of overt attention and shifts of covert attention that precede saccades; its function is, therefore, affected by the cognitive factors that drive attention. We shall see, however, that to the extent that FEF is involved in early visual processing and not in the planning of attentional shifts, there are no cognitive influences involved in this recurrent processing. In general, as the discussion on the role of N1 shows, all cognitive influences, whatever their source might be, are delayed in time and do not affect early vision.

3.2 Assessing the Evidence Thus Far
Up to this point, there is no evidence supporting the existence of cognitive influences at these early latencies. The picture presented posits early latencies of the signals arriving at MT/V5 from LGN either directly or through V1; that is, it dictates a bottom-up early activation of MT/V5, which processes these signals and sends the result of this processing as feedback to V1, affecting the activations of the neurons there. At these early latencies, no evidence for top-down cognitive signals to MT/V5 and, thus, for cognitive signals affecting V1 has been adduced. Newen and Vetter are right to claim that “it is more likely that both feedforward and feedback processes happen in parallel rather than in a strictly serial manner”, but these early parallel loops concern strictly the processing and transmission back and forth of purely stimulus-driven information. This is what the evidence adduced by Plomp et al. (2015) also suggests, since, as they argue, the first cognitive effects are registered around 170 ms poststimulus, as N1 shows. Recall that LRP is needed to provide a more global analysis of the sensory information than that performed by the narrow fields of V1 and V2 neurons, in order to facilitate object segmentation and recognition. V4 and MT/V5 provide a global analysis of the sensory signals, taking advantage of their wide RF, and send it back to V1 and V2. This analysis, however, is based solely on the usage of the wider RF in MT/V5, which provide the context that is needed by V1 and V2 to compute their targeted information. The studies discussed make no mention of MT/V5 using cognitive signals to perform this specific task. As Plomp et al. claim, in early latencies the signals driving from MT/V5 are only stimulus-specific. As the discussion of N1 shows, cognitive influences, whatever their source might be, are delayed in time and do not affect early vision. One could argue that even though the evidence adduced shows that the recurrent signals from MT/V5 to V1 carry no cognitive information, it is still possible that cognitive centers send cognitive information directly to V1 or V2, bypassing MT/V5 (or V4 or LOC, for that matter, since, as I will argue in the next section, no cognitive influences can be found in the recurrent processes involving V4 or LOC either). One such possibility concerns FEF, which is in the dorsal pre-frontal cortex and is known to mediate both shifts of overt attention, by guiding saccades, and shifts of covert attention that precede saccades; its function is, therefore, affected by cognitive factors. We shall see, however, that to the extent that FEF is involved in early visual processing, there are no cognitive influences involved either.

3.3 What About Modules?

Newen and Vetter (2016, 2) construe Raftopoulos’ (2014) thesis that early vision is CI to entail the existence of an early visual processing module constrained to those areas which are involved in the first 100 ms after visual stimulation.

If one supposes some basic and very early visual processes as being impenetrable, then this asks for a module that becomes smaller and earlier every time research methodologies achieve a higher resolution. The principle problem lies in the postulation of a clear-cut and impenetrable module per se. As we argued above, while some brain areas are certainly functionally specialized to some predominant functions, this does not mean they have clear-cut functional boundaries that cannot be penetrated by contents from other brain areas. Postulating an impenetrable module, whether implausibly large or very small, will always entail a boundary that is not tenable. (Newen and Vetter 2016, 3)

The wording used here may cause some misunderstandings, so allow me to clear the ground. ‘Module’ is usually taken to mean a well-defined and relatively restricted area of the brain. Early vision, which I claim is CI, involves a very widespread nexus of areas in the brain, from LGN to V4, the parietal cortex and the FEF. As such, it encompasses almost all (if not all) of the visual areas of the brain, for which reason the use of the term ‘module’ rather strains its meaning. What characterizes early vision is not the neuroanatomical sites that it involves but rather the functions that it computes and the time frame within which it does so; it is a functional definition. Newen and Vetter fall into the trap of taking early vision to be a module in the sense of a set of specialized anatomical sites when they write “while some brain areas are certainly functionally specialized to some predominant functions, this does not mean they have clear-cut functional boundaries that cannot be penetrated by contents from other brain areas.” Read in the context of the presentation of the thesis of weak CI, one might be tempted to think that the claim for the CI of early vision entails that there are some brain areas that cannot be penetrated. From the account of early vision presented here, the reader can readily see that this is false. I have extensively argued that during late vision all areas of the visual brain are revisited and the neuronal activities in these areas are modulated by top-down cognitive signals. The difference between the behavior of the areas that compute information during early vision and the behavior of these same areas that compute information during late vision lies in the fact that the role of, say, V1 during early
vision differs from its role during late vision. Thus, there are no areas of the visual brain that are eventually unaffected by top-down, cognitive modulation. There are, however, areas whose activity for a certain time period (up to 120–140 ms) is not modulated by any cognitive signals.

3.4 Is the Content of Early Vision Philosophically Speaking Significant?

Suppose that early vision is CI. Still, Newen and Vetter could be right that the perceptual processes within this narrow time frame produce states with such poor contents that they are not properly speaking perceptual states; they could be at most sensory states. As we have seen, FFS and local RP allow, in about 120–150 ms after stimulus onset, the construction of fairly complex representations of stimuli. There is some form of perceptual organization, which certainly includes information regarding the presence of discrete objects in a scene (that is, the segregation of objects), their orientations, sizes, shapes or forms, and motions; these features determine the structural description of objects. As we saw in Chapter 2, early vision delivers a structural description of the visual scene that contains information about 3D shapes as viewed from the perceiver, spatio-temporal and surface properties, color, texture, orientation, motion, and affordances of objects, in addition to the representations of objects as bounded, solid entities that persist in space and time. In central vision, information other than texture is represented with as much accuracy, or maximal detail, as the limitations of our perceptual systems allow. Texture, however, is represented as an ensemble property, that is, as the joint statistics of responses of cells sensitive to texture. In peripheral vision, in contrast, the representational mode is that of ensemble summary statistics. That is, objects in the periphery are treated as ensembles and their statistical properties are represented. This means that in peripheral vision, mean orientations, motions, shapes, sizes, colors, etc. are represented. Toward the end of their paper, Newen and Vetter (2016, 10) argue for the existence of a purely perceptual stage (that is, a stage that is unaffected by cognitive states) whose contents are better described as nonconceptual. Thus, they relate the contents of the states of a CI stage of
visual processing with the nonconceptual contents of philosophers. I argued that the CI of certain perceptual states is a sufficient and necessary condition for these states having nonconceptual content (author). I disagree, however, with Newen and Vetter in that this nonconceptual content is the content of perceptual states related to the perception of impoverished black and white pictures. The CI early vision retrieves from the visual scene a quite extensive range of information that includes information about spatio-temporal and surface properties, color, texture, orientation, motion, and affordances, in addition to the representations of objects as bounded, solid entities that persist in space and time. Burge (2010), Crane (2009), Johnston (2006), and Peacocke (1998, 2001) think that the nonconceptual content has a rich structure. Crane (2009, 465) claims that even though the nonconceptual content of perception does not have the structure of judgeable content, it still represents a manifold of objects, properties, and events. Johnston (2006, 282–283) thinks that the nonconceptual content of perception is not propositional but a host of interconnected exemplifications of properties and relations and kinds. Peacocke (2001, 241) argues that the nonconceptual content of experience represents things, events, or places and times in a certain way, as having certain properties or standing in certain relations, “also given in a certain way.” Peacocke (1998, 61–62) discusses nonconceptual content in elaborating his notion of scenarios and argues that this content is provided by the spatial types “the type being that under which fall precisely those ways of filling the space around the subject that are consistent with the correctness of the content.”

4 Recurrent Processes of Early Vision Do Not Involve Cognitive Information: Why Early Vision Is Not Affected Directly by Cognition Part 3

Newen and Vetter (2016) offer the most comprehensive criticism of my thesis that early vision is CI adducing evidence that purports to show that there are recurrent processes involving cognitive signals within the
time scale of early vision. Let us, thus, examine in detail the specific evidence that Newen and Vetter employ to support their views, because by answering their objection one answers most of the objections coming from alleged early cognitive effects on perception. They start their arguments about temporal processing as follows:

Time-resolving electrophysiological evidence showed that visual cortex is activated within 50 ms and pre-frontal areas within 80 ms after visual stimulus onset. This leaves plenty of time for iterative top-down processing between “cognitive”, e.g. frontal and parietal, areas and sensory, e.g. occipital, areas, within the first 100–200 ms after visual stimulation (Foxe and Simpson 2002). Thus, complex high level and reiterative processing can happen very fast and can influence visual processing very early on. (. . . Plomp et al. 2015; Newen and Vetter 2016, 4–5)

Newen and Vetter talk about recurrent signals that involve cognitive activity affecting visual areas at latencies of 100–200 ms. Thus, Newen and Vetter accept that the available evidence suggests that cognitive effects on visual areas are registered after 100 ms poststimulus. However, one might rejoin that Newen and Vetter allow for cognitively driven recurrent processing after 100 ms, that is, before 120–150 ms, latencies that are within early vision; there is still plenty of time for cognition to affect early vision between 100 and 120 ms. To settle this, let us examine the evidence more closely.

4.1 Recurrent Processing Between MT/V5 and V1, V2 (Second Pass)

The reference to the interaction between V5 and V1 during motion perception in these studies brings to mind the foregoing discussion of V1/V2–MT/V5 recurrent loops, where I argued that the recurrent signals are purely visual and stimulus-driven, not cognitive. We discussed Plomp et al.’s (2015) view that the recurrent early activity owing to MT/V5 interaction with V1 and V2 (before N1 around 170 ms) is restricted within visual areas and is only stimulus-dependent. Foxe and Simpson (2002, 139) state:

There is clearly sufficient time for multiple iterations of interactive processing between sensory, parietal, and frontal areas during brief (e.g., 200 ms) periods of information processing preceding motor output…. These data strongly suggest that activity represented in the “early” ERP components such as P1 and N1 (and possibly even C1) is likely to reflect relatively late processing, after the initial volley of sensory afference through the visual system and involving top-down influences from parietal and frontal regions.

The reference is to a time frame up to 200 ms, which is outside early vision. Furthermore, the top-down signals that are generated in the higher visual areas and reenter the early visual areas within early latencies result from the processing of sensory signals that arrive very quickly, through M-channels or the koniocellular pathway, to the higher areas; “the rapid activation of prefrontal cortex following initial visual activation (within 30 ms) suggests that this input is mediated through the faster dorsal visual stream” (Foxe and Simpson 2002, 147–148). Since the early recurrent activity involves the dorsal system, and since there is to date no evidence to support the existence of any cognitive effects on the dorsal system when this system functions online to support fast action and does not rely on information from the ventral system, it follows that the higher visual areas from which the top-down signals emanate have not yet received any signals from cognitive areas, and the feedback loops consist of bottom-up sensory signals and top-down sensory signals reprocessed and modified in higher visual areas without any cognitive involvement. Furthermore, and in reference to Foxe and Simpson’s mention of parietal and frontal regions involved in the early recurrent processing, which may be taken as evidence for the existence of cognitive influences since these regions are implicated in cognitive processes, one should note that MT/V5 is in the parietal cortex and FEF is in the prefrontal cortex. Our discussion of the role of MT/V5 shows that there are no cognitive effects on MT/V5 activation in early latencies, since all MT/V5 does is to process and integrate the sensory signals using its wide RF, and, as our examination of FEF will show, neither are any cognitive effects found in the early activation of FEF. Foxe and Simpson (2002, 146) confirm this analysis by stating that “multiple visual areas begin to contribute substantially to the
surface potential and C1 begins to reflect contributions from a number of visual areas other than, but is likely also to include V1 (emphasis added).” Foxe and Simpson (2002, 147), after their claim “that sustained activation patterns within cortical areas are consistent with feedback modulation of ‘lower’ visual areas by ‘higher’ areas, as well as local intrinsic processing”, add that their findings conform with the findings of Lamme and colleagues about the time frame of feedback modulation in figure-ground segregation studies with monkeys. As we saw earlier in the previous section, Lamme insists that the early recurrent processes are parts of the LRP that do not involve cognition. It follows that nothing in the Foxe and Simpson (2002) study suggests the existence of cognitive effects in the MT/V5 and V1, V2 recurrent interactions. It seems that when Newen and Vetter conclude “complex high level and reiterative processing … can influence visual processing very early on” this ‘very early on’ is not early enough to be within early vision. Specifically, Lamme (2003) and Lamme and his colleagues in various studies argue that functional MRI studies combined with electrophysiological recordings, such as EEG/MEG, suggest that one can distinguish three stages in visual processing: FFS, LRP that is restricted within visual areas, and “full” or “global” recurrent processing (GRP) involving higher cognitive centers. According to Lamme, activation reaches V1 at a latency of about 40 ms; multiple stimuli are all represented at this stage. Then this information is fed forward to the extrastriate, parietal, and temporal areas. By 80 ms after stimulus onset most visual areas are activated. At this stage there are no attentional effects. The highest levels of the visual cortical processing hierarchy in the ventral stream are reached within 100 ms, and at about 120 ms activation is found in the motor cortex as well (Lamme and Roelfsema 2000).
100–140 ms after stimulus onset recurrent processing occurs, which, however, is restricted within early vision and excludes top-down signals from either memory or other cognitive centers in the brain. At 200 ms after stimulus onset, recurrent interactions with areas outside the visual stream begin and global recurrent processing starts, making possible the interaction between cognition and perception. Since the FFS terminates at about 100 ms after stimulus onset, visual tasks (such as visual search for a target) that result in longer delays must
involve recurrent processing. Moreover, delays around that time interval indicate the result of lateral effects. This implies that the neurons in early visual pathways involved in such tasks should receive lateral and top-down signals from other areas in the visual system and other cognitive centers higher in the brain. Various pieces of evidence substantiate this assumption. Recordings (Roelfsema et al. 1998) from the visual cortex of monkeys who perform a texture segregation task show that V1 cells select for orientation of textures at about 55 ms after stimulus onset. The same cells are selective for the boundary between figure and ground at about 80 ms, and show an enhanced response when their RF cover the figure surface compared to the background surface at about 100 ms. This is clearly an effect of top-down processing that originates in area V4 of the brain. Neurophysiological studies (Lamme and Roelfsema 2000; Roelfsema et al. 1998) with monkeys that were trained to perform a curve-tracing task, in which the target curve started with a marker of a given color for which the monkey had been cued before the presentation of the stimulus, show two things regarding the modulation of early visual processes by attention. First, V1 cells representing the cued color enhance their response 159 ms after stimulus onset, an enhancement caused by a color-selective feedback signal from higher visual areas and reflecting the effect of feature-based attention. Second, attention enhances the responses of V1 cells that respond to the target curve as opposed to the distractor curve 235 ms after stimulus onset. Roelfsema et al. (1998) trained macaques to select one of two equally salient curves and they implanted 40–50 multiunit electrodes at various sites in V1 cortex.
However, the finding that the activity of cells in the primary visual cortex is enhanced by object-centered attention, and thus that attentional modulation occurs in the primary visual cortex as well, does not undermine the thesis that attentional effects occur after the FFS and local RP, since the attentional effects enhance the responses of V1 cells 235 ms after stimulus onset. This short excursion into Lamme’s work confirms that the recurrent processes that occur at early latencies do not involve cognitive signals. One might argue that there is a discrepancy between the work of Lamme et al. and the findings reported by Bullier (2001), Foxe and
Simpson (2002), and Plomp et al. (2015) in that Lamme’s work suggests a delayed onset for recurrent processes (after 80 ms), while the other studies suggest a much earlier onset starting almost immediately after stimulus presentation. The discrepancy disappears if one bears in mind that Lamme examines the ventral system, while the other studies are concerned with the dorsal system, whose latencies are shorter than those in the ventral pathway. Finally, Silvanto et al. (2005) studied the visual awareness of motion and the role of V1 in it. Their experiments show that back-projections from extrastriate cortex influence the activations of neurons in V1 and that it is the activation in V1 that determines which information reaches awareness. Since our interest is in the latencies at which the back projections affect V1 and the sites of origin of the top-down signals, I will ignore the findings concerning the role of V1 in motion awareness. Silvanto et al. (2005) applied TMS on V1 and V5 at different times to examine the perception of phosphenes. When subthreshold TMS (that is, TMS producing no phosphene on its own) was applied over V5 followed by a subthreshold pulse to V1, subjects did not report any phosphene. When a subthreshold pulse was applied over V5 followed 10–40 ms later by a suprathreshold pulse over V1, subjects reported a phosphene, which was not merely the suprathreshold V1 phosphene. Instead, it acquired features of a suprathreshold V5 phosphene since subjects reported the perception of movement, and the shape and size of their percept was a mixture of V1 and V5 phosphenes. This shows that activity in V5, which on its own is insufficient to induce a moving percept, can produce such a percept if the level of induced activity in V1 is high enough. Silvanto et al. 
(2005, 143) conclude that their finding “that moving phosphenes are perceived only when suprathreshold V1 stimulation follows, but not precedes, subthreshold V5 stimulation, together with the gradual increase in motion perception from the 10–50 ms period, precludes a simple feedforward summation account and points instead to a critical time of backprojection arrival in V1.” They also note that the narrow time window for V5–V1 interaction that they report (10–50 ms) is consistent with previous reports of extrastriate-striate feedback interactions in motion during this time interval. Indeed, this is in
accordance with Bullier’s (2001) and Plomp et al.’s (2015) finding that there is an early (up to 100 ms) phase of recurrent activity between V1 and MT/V5, but, as in these studies, so in Silvanto et al.’s (2005) report there is no evidence to suggest top-down cognitive effects at these early latencies, because, here too, the recurrent signals from MT/V5 are stimulus-driven, or, to use Plomp et al.’s (2015) term, they are a stimulus-evoked activity. Indeed, Silvanto et al. (2005) make no reference to cognitive effects on early perceptual processing. Newen and Vetter’s appeal to early recurrent interactions between V1, V2 and MT/V5 does not establish that V1 and V2 are affected by cognitive signals at these latencies. The studies they cite suggest, instead, that only sensory signals are processed and transmitted back and forth: the analysis of the signals at the higher visual levels, made possible by the larger RF at these levels, reenters the lower levels and enables them to participate in the computation of more complex properties than they could compute on their own, given their narrow RF.

4.2 Other Types of Early Recurrent Interactions in Early Vision and the Role of FEF

Let us examine some of the further studies cited by Newen and Vetter to see if these interactions involve cognitive signals. Wokke et al. (2012) confirm that recurrent processing engages early visual areas (V1/V2) to participate in more complex visual tasks. They applied Transcranial Magnetic Stimulation (TMS) at different latencies to disrupt activity in V1/V2 in order to study the causal links between the neuronal activities in early visual areas during the different stages of figure-ground segregation. Their results show that disruption of V1/V2 during an early time window (96–119 ms) affected detection of figure stimuli and the neural correlates of figure border detection and border ownership. TMS applied during a later time window (236–259 ms) impaired performance associated with surface segregation. Thus, the areas V1/V2 not only participate in an early stage of figure-ground segregation where figure borders are detected, but also causally contribute at a later time to more complex stages of
figure-ground segregation such as surface segregation. This is further confirmed by Heinen et al. (2015), who show that the figure-ground segregation that allows the discrimination between two visual stimuli requires two distinct periods of information processing in V1 and related early visual areas, an early one around 130–160 ms, and a later one around 250–280 ms after stimulus onset. These results are in line with recent findings showing that early visual cortex is involved in a broad range of higher-level visual processes, such as perceptual grouping, working memory, and perceptual completion (Wokke et al. 2012, 774). Newen and Vetter (2016, 5) turn next to the interaction between FEF and V1:

[T]he frontal eye fields (FEF), a higher-level area in frontal cortex involved in motor planning of eye movements, exerts its influence to V5 within 30 ms (Silvanto et al. 2006). Therefore, a feedback loop from a frontal region to an early occipital region can take as little as 80 ms or less. Importantly, this feedback happens in a task-specific manner, telling us something about the information conveyed in this feedback: when the task requires face recognition, FEF signals are sent to face-sensitive regions and when the task requires motion discrimination, FEF signals are sent to motion area V5, both within a time frame of 20–40 ms after FEF activity. (Morishima et al. 2008)

Before examining these studies to see whether they entail any cognitive effects on early vision, let us discuss some other studies concerning the role of FEF in visual processing that Silvanto et al. (2005) and Morishima et al. (2008) frequently cite, because they will help us understand the role of FEF. FEF is situated in the prefrontal cortex at a site heavily interconnected with the parietal cortex and is a part of the dorsal visual system. The mean activation latency of FEF neurons is 70 ms poststimulus. Signals arrive at FEF with a slight (if any) time delay with respect to the signals arriving at V1 (50–80 ms) and V2 (85 ms) and much earlier than they arrive at V4, despite the fact that FEF is anatomically higher than V4. FEF TMS affects the detection of targets in arrays of
distractors and these effects are apparent when pulses are applied early (40 and 80 ms) after presentation of the visual array (O’Shea et al. 2004). HSF signals from V1 can reach FEF in about 100 ms. FEF contains visual and movement neurons (O’Shea et al. 2004). Studies (see O’Shea et al. (2004, 1060) for a discussion) show that there are two dissociated processing operations in FEF: target selection by FEF’s visual neurons, and saccade programming by movement neurons. O’Shea et al. (2004), Silvanto et al. (2006), and Taylor and Nobre (2007) show that early FEF responses are independent of saccades to targets and respond to the visual stimuli. Some of the FEF early feedback signals play a role in the perception of a visual scene by affecting in a top-down manner the earlier visual areas. In particular, it seems that FEF plays a crucial role in visual target discrimination that is independent of saccade programming, as TMS applied to FEF impairs performance in target discrimination tasks if applied between 40 and 80 ms after stimulus onset (O’Shea et al. 2004). In addition, these visual neurons of FEF are thought to be associated with top-down or endogenous attention, whether it be spatial or object/feature-based (Taylor and Nobre 2007). Things, of course, are more fine-grained. Lawrence et al. (2005) have shown that the FEF neurons, on the basis of their responses in discriminatory and cognitively driven spatial-attention tasks, exhibit a visual-movement continuum between two extremes. Zhou and Thompson (2009, 1209) argued that there are four types of FEF neurons:
(i) Phasic visual neurons exhibited a brief visual response after the stimulus was flashed in their RF and were inactive for the remainder of the trial;
(ii) Visual-delay neurons exhibited a visual response followed by elevated activity during the delay period, and no increase in activity around the time of the saccade;
(iii) Visuomovement neurons exhibited a visual response and an increase in activity before the monkey made a saccade into the RF. Most visuomovement neurons also exhibited delay activity;
(iv) Movement neurons exhibited no visual response following the stimulus flash and exhibited increased activity immediately before the saccade into the RF.

With respect to the role of FEF in target discrimination, Schall and Bichot (1998) recorded the neural activity in the FEF of monkeys that make saccades to a target in a popout visual search task or scan complex images. Their study shows that there is an initial phase that lasts for about 100 ms after stimulus onset in which most neurons in FEF do not discriminate between target and distractors in the visual field. Discrimination occurs gradually, starting at a latency of about 120 ms and culminating at 200 ms after stimulus onset, before the onset of saccades to the target, which are initiated about 180–220 ms after stimulus onset. The observed latency is an effect of the attentional mechanisms that result from the cueing of the target object and which exert their top-down effect on visual processing. The discrimination process maps the activity of visually responsive cells in the FEF as they start to signal the location of the target in the visual field. Similarly, Thompson and Schall (2000), using backward masking to affect the detection of a target, recorded signals from macaque frontal eye field (FEF) neurons. They found that these neurons initially (before 100 ms after stimulus onset) respond to the actual presence or absence of the target. Between 40 and 60 ms, FEF activity was greater following the presentation of a target, independent of whether the subject reported the target or not. Later on (100–300 ms), the activation of the same neurons corresponds to the monkey’s behavioral response, that is, the discrimination between targets and non-targets and the selection of the targets for eye movements toward them.
The study by O’Shea et al. (2004) found earlier latencies for the target/distractor discrimination, as in their study the discrimination by FEF neurons was effective 100–120 ms after stimulus onset, as opposed to the study by Schall and Bichot (1998), where discrimination occurs gradually, starting at a latency of about 120 ms and culminating at 200 ms after stimulus onset. O’Shea et al. (2004, 1063) note that this
discrepancy may be explained by the fact that the search paradigms used in the work on monkeys and in their TMS study were different. Their search displays were foveal, whereas the monkey displays were peripheral, a factor that might contribute to the early latencies they report. Moreover, the repetition of the same target/distractor combination likely resulted in feature priming across the 10 blocks of 80 trials in their experiment, and such priming has been shown to produce earlier target discrimination peaks in monkey FEF. Thus, in the O’Shea et al. (2004) study, the early onset of target vs. non-target discrimination was likely the result of some sort of feature pre-cueing. I will return in a while to the repercussions of the phenomenon of pre-cueing for the discussion concerning the CP of early vision. For the time being, notice the following. Suppose one accepts O’Shea et al.’s (2004) early latencies of FEF neurons in discriminating targets from non-targets (100–120 ms). As Newen and Vetter (2016, 5) also accept, FEF exerts its influence on V5 within 30 ms (Silvanto et al. 2006), and, therefore, a feedback loop from FEF to an early occipital region can take as little as 80 ms or less. Adding this feedback time of up to 80 ms to the discrimination latency of 100–120 ms, the total time it takes for the FEF neurons that have distinguished the targets from non-targets to affect the early visual areas via top-down feedback projections is about 180–200 ms. This means that the FEF affects the activation of the neurons in early visual areas with a latency that places these effects outside early vision. Concerning now the finding that the FEF neurons can effectively discriminate targets from non-targets as early as 100–120 ms after stimulus onset, one could argue that since this discrimination is task-relevant and, thus, involves cognitive factors, cognition affects a visual area, namely FEF, within the timing of early vision.
Recall that O’Shea et al. (2004) think it very likely that the early latency they report is the result of feature pre-cueing, which means that the early discriminatory activity in FEF occurs as the result of a cognitive demand issued before the appearance of the stimulus. In the next section, I argue that the cognitive effects on perception that occur through pre-cueing are not cases of CP because they do not directly affect early vision and also do not affect its epistemic role in grounding empirical beliefs.

Let us turn now to examining the two studies cited by Newen and Vetter (2016) as providing evidence that early vision is cognitively affected owing to the involvement of FEF. Silvanto et al. (2006) found that stimulation applied to FEF 20–40 ms prior to the stimulation of MT/V5 decreases the intensity of the MT/V5 stimulation required to elicit phosphenes, which entails that the activity of MT/V5 is modulated by the activity in FEF. FEF also exerts top-down modulation on V4. Silvanto et al. (2006, 944) claim that the content of top-down control may be either spatial or feature related, which means that they think, as the previous studies on FEF that we have examined suggest, that FEF affects the control of top-down attention; “an area involved in control would be expected to be active early and by responding to target features, FEF could increase the sensitivity of extrastriate neurons to task-relevant parameters.” With regard to the manner in which FEF exerts top-down control, it is possible that FEF activity occurs prior to sensory stimulation, as opposed to being a rapid response to visual stimuli, since FEF neurons may also play a role in visual priming (Silvanto et al. 2006, 944). Thus, like Taylor and Nobre (2007), Silvanto et al. (2006) think that FEF controls the allocation of top-down attention prior to stimulus presentation in situations involving pre-cueing.

When is FEF modulation of the neuronal activity in the striate and extrastriate cortex felt? The discrimination between targets and non-targets depends on the task at hand; the top-down effects that result from this discrimination are, therefore, cognitively driven, and the visual processes they affect are CP. However, as we have seen, if one accepts the latencies reported by O’Shea et al. (2004) for FEF neurons discriminating targets from non-targets (100–120 ms), then, since FEF exerts its influence on V5 within 30 ms (Silvanto et al. 2006) and a feedback loop from V5 to early occipital regions can take 80 ms or less, the total time it takes for FEF neurons that have distinguished targets from non-targets to affect MT/V5 via top-down feedback projections is 130–150 ms, and the effects on the early visual areas occur at about 180–230 ms. Since all of these latencies fall outside the time frame of early vision, they do not entail its CP.

Let us examine, finally, the study by Morishima et al. (2009), which suggests that microstimulation of FEF in monkeys modulates neural responses in posterior visual areas within 40 ms of the stimulation, which is consistent with the view that there is direct transmission of signals from FEF
to posterior visual areas. Stimulation of the human FEF modulates the excitability of neurons in the human visual motion sensitive area (HMT) at a latency of 20–40 ms, a result that confirms the findings of Silvanto et al. (2006). The top-down control signal is task-specific in nature, which means that stimulation of FEF induces activation in posterior visual areas in a task-specific manner. Like O’Shea et al. (2004) and Silvanto et al. (2006), Morishima et al. (2009) found that a population of neurons in FEF whose preferred stimulus coincides with the attended visual feature is active before the presentation of a target stimulus; these neurons are thought to send top-down signals to neurons in the posterior visual areas involved in the processing of that feature, that is, they participate in the control of top-down attention in pre-cueing. This is based on the finding that FEF neurons show differential responses to target and distractor at about 100 ms after the presentation of the image in a visual selective-attention task in monkeys. It has also been shown that the N1 ERP component that is observed at 170 ms after presentation of a face image is enhanced when subjects attended to the face, suggesting that the top-down signal acts on posterior visual areas at this time. “With a cortico-cortical transmission time of 20–40 ms, we decided that the attentional modulation of the TMS effect would be maximized when the TMS was given over FEF at about 130 ms after stimulus presentation” (Morishima et al. 2009, 90). The attentional modulation would thus be maximized if TMS were given to FEF 130 ms after stimulus onset, a fact which, adding the 20–40 ms it takes for the feedback to propagate from FEF to the posterior visual areas, means that the top-down modulation of these areas occurs at about 150–170 ms, well after early vision. That Morishima et al. (2009) correlate these effects with the elicitation of N1 and claim that “the top-down signal acts on posterior visual areas at this time period” also suggests that, for these researchers, the cognitive effects on the visual areas of the brain have a latency of at least 170 ms, which places them outside early vision.

Assessing the evidence presented in this section, one readily concludes that the studies on the involvement of FEF in visual processing, instead of establishing any cognitive effects on early vision, suggest that the cognitive, top-down signals from FEF are delayed in time and occur outside the time frame of early vision.
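
The latency bookkeeping behind this conclusion can be made explicit with a small sketch. It only adds up the figures quoted in this section; the helper name `add_ranges` and the decision to treat "within 30 ms" as exactly 30 ms are assumptions made for illustration. Only an upper bound (80 ms) is given for the V5-to-occipital feedback loop, so only the latest arrival time is computed; the lower bound of 180 ms stated in the text would correspond to a loop of roughly 50 ms, which is an inference and not a figure reported in the studies discussed.

```python
# Latency bookkeeping for the FEF studies discussed in this section (values in ms).
# Ranges are (min, max) tuples; the helper name is illustrative only.

def add_ranges(a, b):
    """Add two (min, max) latency ranges end to end."""
    return (a[0] + b[0], a[1] + b[1])

FEF_DISCRIMINATION = (100, 120)  # O'Shea et al. (2004): target/non-target discrimination
FEF_TO_V5 = (30, 30)             # Silvanto et al. (2006): "within 30 ms", taken here as 30
V5_TO_EARLY_MAX = 80             # feedback loop of "80 ms or less": upper bound only

# Time at which target-discriminating FEF activity can reach MT/V5.
at_v5 = add_ranges(FEF_DISCRIMINATION, FEF_TO_V5)

# Latest time at which that signal reaches the early visual areas.
latest_at_early_areas = at_v5[1] + V5_TO_EARLY_MAX

# Morishima et al.: TMS over FEF at about 130 ms plus 20-40 ms transmission time.
TMS_OVER_FEF = (130, 130)
CORTICO_CORTICAL = (20, 40)
morishima_arrival = add_ranges(TMS_OVER_FEF, CORTICO_CORTICAL)

print(at_v5)                  # (130, 150)
print(latest_at_early_areas)  # 230
print(morishima_arrival)      # (150, 170)
```

Whether these arrival times fall outside early vision depends on where one places its upper temporal boundary, which the sketch deliberately leaves to the surrounding argument.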

4.3 Does Early Object Recognition Entail the CP of Early Vision (Again)?

Newen and Vetter (2016, 5) appeal to the finding of Drewes et al. (2016) that object recognition involves recurrent processing with a time constant of 60 ms. Drewes et al. take aim at the view that, since the visual system extracts the shape of objects very fast, the underlying cortical processing should be feedforward. Their study suggests that in shape perception there is a recurrent circuit, which is not an attentional cueing effect but reflects “the time course of feedback processing underlying the rapid organization of shape” (Drewes et al. 2016, 185). I discussed the repercussions of early object recognition for the CP of early vision in the preceding section and elsewhere. Here, I restrict myself to discussing the study cited by Newen and Vetter.

In their introduction, where they situate their study in the context of other studies, Drewes et al. mention work by Heinen et al. (2015), which suggests that figure-ground segregation requires two distinct periods of information processing in the early visual areas, an early one around 130–160 ms and a later one around 250–280 ms after stimulus onset, and by Wokke et al. (2012), showing that recurrent processing engages V1/V2 to participate in more complex visual tasks. In an early time window (96–119 ms), detection of figure stimuli and neural correlates of figure border detection and border ownership take place. In a later time window (236–259 ms), V1 and V2 participate in surface segregation. Drewes et al. (2016), therefore, accept these latencies as a general framework. Drewes et al. (2016, 190) claim that their experiments suggest that “the extent of facilitation between two shape stimuli depends non-monotonically on the delay between their presentations, peaking at a delay of 60 ms.” This is strong evidence for a recurrent circuit underlying shape processing in the human cortical object pathway. Drewes et al. (2016) remark that in the study by Wokke et al. (2012) in some trials TMS was applied to the occipital pole to disrupt processing in V1/V2 or to the lateral occipital lobe to disrupt processing in
LOC. TMS disrupted performance at both locations, but at different latencies. In LOC, TMS disrupted processing when the pulse occurred at 100–122 ms, whereas in V1/V2, processing was disrupted when the pulse was applied at 160–182 ms. This suggests a feedback process in the grouping of contour fragments to form shape percepts with a one-way feedback time constant (LOC to V1/V2) of 40–80 ms. “Given that this delay should reflect only the feedback stage of processing, this time constant of 40–80 ms is a little long relative to the 60 ms time constant estimated from our reinforcement paradigm or the time constants ranging from 45 to 107 ms estimated from previous backward masking paradigms, both of which should reflect ‘round-trip’ (feedforward-feedback) processing” (Drewes et al. 2016, 190). If one accepts the 60 ms time constant of Drewes et al. (2016), the top-down signals reenter V1 and V2 at latencies outside early vision.
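
One plausible way to see where a one-way estimate of roughly 40–80 ms could come from is to compare the two disruption windows directly. The sketch below does only that; it is an assumption that the estimate was obtained this way, and the function names are illustrative, not taken from the cited studies.

```python
# TMS disruption windows reported for Wokke et al. (2012), in ms after stimulus onset.
LOC_WINDOW = (100, 122)
V1_V2_WINDOW = (160, 182)

def delay_bounds(earlier, later):
    """Smallest and largest delay between any point in two time windows."""
    return (later[0] - earlier[1], later[1] - earlier[0])

def onset_delay(earlier, later):
    """Delay between the starting points of two time windows."""
    return later[0] - earlier[0]

bounds = delay_bounds(LOC_WINDOW, V1_V2_WINDOW)
onset = onset_delay(LOC_WINDOW, V1_V2_WINDOW)

print(bounds)  # (38, 82), close to the quoted 40-80 ms
print(onset)   # 60
```

On this reading, the 60 ms separation between window onsets coincides numerically with the time constant of Drewes et al. (2016), although the latter is said to reflect round-trip processing.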

5 Pre-cueing Effects in Perception: Why Early Vision Is Not Affected Directly by Cognition Part 4

Many studies show that when subjects are instructed to attend to a certain location, or to wait for a certain object or feature to appear, the neuronal assemblies in the visual brain whose RFs are within the attended location, or the neuronal assemblies that encode the feature indicated by these instructions, receive a boost in their activation as a result of these instructions and, most importantly, this boost occurs before the appearance of the stimulus. This seems to mean that cognitive effects affect perceptual processing from its inception and, hence, that they also affect early vision, rendering it CP. Cognitive effects are involved in this process because the instructions determine attentional commands (wait for a red letter A to appear, or attend to the upper left part of the screen) to be carried out by a subject, and these commands require that the subject understand them. When subjects are instructed that a red object will appear on a screen, they use their cognitive resources to understand the instruction, and activate
their knowledge concerning the color red by activating the neuronal assemblies in the cognitive centers of the brain that store this knowledge. The activation is spread top-down and increases the baseline activation of the neuronal assemblies in the visual areas of the brain that encode the color red. This is a typical example of a cognitively driven attentional effect. Such instructions function as cues directing attention and, since they are given before stimulus presentation, the experimental setting is called pre-cueing. Notice that pre-cueing can occur by cues presented on a screen without any accompanied verbal instructions, as when an arrow ‘up’ appears on a screen. As with instructions, these cues can generate attentional commands because the subject understands them and this makes the ensuing attentional effects cognitively driven. The problem at hand is to decide whether pre-cueing effects on perception entail that early vision is CP. To answer that, one should examine these effects and determine whether they satisfy the epistemic criterion for CP, that is, whether pre-cueing effects influence in any way the epistemic role of early vision. Since the epistemic role of early vision consists in providing late vision with iconic information concerning the visual scene that late vision will use to construct the percept, and since this information is retrieved by early vision from the environment, the epistemic role of early vision would be affected by pre-cueing if pre-cueing effects could influence the processes of information retrieval during early vision. If they could, they would affect, either by diminishing or enhancing, the sensitivity of early vision in particular and of perception in general to the environmental data. This would mean that pre-cueing effects entail that early vision is CP. There is also a closely related problem that pre-cueing effects seem to create for the thesis that early vision is CI. 
They seem to entail that the processes of early vision are directly affected by cognition in the sense that they operate over some cognitive information. This is a problem because, as we saw, many definitions of CP hinge on whether some perceptual process is directly affected by cognition; should any direct effects on early vision be found, this would entail that early vision is CP. Whenever viewers are instructed to attend to a certain location, or for a certain feature or object to appear, attention affects perception by modulating the internal ongoings biasing the baseline activation of the

neurons that encode the expected stimulus or location. Being internal, this sort of attentional effect is a candidate for CP of early vision. A word of caution is needed here. I have talked about instructions to attend to some location or object/feature, and about expectations that some space will be occupied or that some specific object/feature will appear on the screen, and I have subsumed both attention and expectation effects under the general heading of attentional effects. But surely expectation and attention seem to be different. When someone expects something, they operate on, or express, information concerning the statistical distributions of objects and spaces in their environment; when expecting object O to appear, one attributes an elevated probability to O’s presence in one’s environment. Attention, on the other hand, is thought of as a mechanism that allows one to focus on, or zoom in on, or put in the spotlight, what is relevant for one’s purposes. Put this way, there is a host of empirical evidence, from studies in which the probability of stimulus occurrence and task-relevance are independently manipulated, suggesting that expectations are dissociated from attention (Kok et al. 2013, 2014). If this is the case, one should treat the effects of attention and expectation differently and not subsume them under the same heading. One should note, however, two things. First, even if they are different in nature, their effect on the early visual circuits is the same, as we shall shortly see. Second, this dissociation presupposes a conception of attention as some sort of mechanism that acts on information. As I have argued (Raftopoulos 2009), however, attention is best viewed as the result of the biased competition among pieces of information along the visual circuits. The biases may involve top-down cognitive information, in which case both prior expectations and attentional commands are such biases.
If true, this would also explain why they act the same way on early visual neurons but it would also allow one to treat them as one sort of cognitive effect, which is what I have chosen to do here. Before I proceed, let me mention that some of the material that follows is from Raftopoulos (2017b). Studies of the effects of spatial attention cues presented to a viewer before stimulus presentation show early modulation of perceptual processing (Carrasco 2011; Freiwald and Kanwisher 2004; Reynolds and Chelazzi 2004). Attending to a location

may enhance the baseline activation of the neuronal assemblies tuned to the attended location in specialized extrastriate areas V2, V3, V3a, V4, and in parietal regions (Freiwald and Kanwisher 2004; Heeger and Ress 2004; Hopfinger et al. 2004) and in striate cortex V1 (Kastner et al. 1999) by an average of 30–40%, although the increase is more pronounced in V4 and less evident in V1 (Kastner et al. 1999; Ling et al. 2009). Other studies with single stimuli show that spatial attentional enhancement is low in V1 and V2 (less than 5%) and more robust in V4 and IT (15–20%) (Reynolds and Chelazzi 2004; Kastner and Ungerleider 2000). Furthermore, following target onset, spatial attention increased the amplitude of early visual responses to cued targets at about 100 ms after target onset (Wyart et al. 2012). This phenomenon refers to the enhancement of the baseline activity of neurons at all levels in the visual cortex that are tuned to a location that is cued and thus this location is attended before the onset of any stimuli. It is called attentional modulation of spontaneous activity. The spontaneous firing rates of neurons are increased when attention is shifted toward the location of an upcoming stimulus before its presentation. This cueing reflects the effects of the neural processes that occur in response to cues to orient attention to a location before the stimulus appears. Spatial attention enhances the sensitivity of the neurons tuned to the attended spatial location by improving the signal-to-noise ratio of the neurons tuned to the attended location over the neurons with RF outside the attended location that contribute only noise. Spatial pre-cueing operates by boosting the gain of the neuronal responses, that is, it increases by a multiplicative factor the overall neuronal response of all the relevant neurons, and, thus, increases the response of all feature detectors independent of which features are the targets and the nontargets (Ling et al. 2009). 
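
The claim that a multiplicative gain raises all responses without changing how they stand relative to one another can be illustrated with a toy calculation. The firing rates, the feature labels, and the gain factor below are hypothetical values chosen for illustration; they are not data from Ling et al. (2009) or any other study cited here.

```python
# Toy illustration: a uniform multiplicative gain scales every response
# but leaves the relative response profile untouched.

def apply_gain(responses, gain):
    """Multiply every response by the same gain factor."""
    return {feature: rate * gain for feature, rate in responses.items()}

def relative_profile(responses):
    """Express each response as a fraction of the summed response."""
    total = sum(responses.values())
    return {feature: rate / total for feature, rate in responses.items()}

# Hypothetical firing rates of detectors whose RFs cover the cued location.
unattended = {"red": 20.0, "green": 10.0, "blue": 5.0}
attended = apply_gain(unattended, 1.35)  # hypothetical gain factor

profile_before = relative_profile(unattended)
profile_after = relative_profile(attended)

# Every detector fires more strongly, yet the most active detector is the same.
strongest_before = max(unattended, key=unattended.get)
strongest_after = max(attended, key=attended.get)
```

In this toy case the absolute rates rise while `profile_before` and `profile_after` agree up to rounding, which is the sense in which a pure gain change leaves differential responses unaltered.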
The boost of the neuronal activity of the affected neurons can be construed as an increased sensitivity to the stimuli in the attended area, by affecting the contrast levels (Carrasco et al. 2004) for example, which boosts the visibility of the stimuli. What is perceived depends on the relative activity of appropriate assemblies of neurons that selectively code the features of the stimulus compared to the activity of assemblies that do not code the features of

the stimulus and contribute noise. Since the percept depends on such differential responses, the effects of spatial attention, by not evoking differential responses, leave the percept unchanged. Spatial attention makes detection of the objects/features in the scene easier but does not determine the percept. Differently put, spatial attention enhances activity of all functional domains falling topographically within the attended location (Roe et al. 2012, 21). Thus, spatial pre-cueing does not determine what is perceived at that location because, by enhancing the responses of all neurons tuned to the attended location independent of the neurons’ preferred stimuli, it keeps their differential responses unaltered and does not affect what is perceived. “The increase in baseline activity might be due, for example, to an activation of large populations of neurons containing the attended spatial location within their RF and responding relatively nonspecifically to the various features of the expected stimuli” (Kastner et al. 1999, 758).

Evidence (Carrasco 2011; Hayden and Gallant 2009; Kok et al. 2013, 2014; Liu et al. 2007; Shibata et al. 2008; Wyart et al. 2012) also suggests that through pre-cueing of object features (instructing a subject to look at a screen for a red object, for example, or when a subject expects a particular grating to appear) feature-based attention modulates prestimulus activity in the visual cortex. In fMRI experiments designed to examine the effects of feature attention to color and motion on the visual, frontal, and parietal areas, a cue appeared 1 s before the stimulus. The activity within the color-sensitive visual areas and the motion-sensitive visual areas was increased by attention to color and motion, respectively. This resulted in the visual areas that encode color showing enhanced activation as early as 80 ms poststimulus.
The effects of prestimulus feature pre-cueing may act either as a preparatory activity to enhance the stimulus-evoked potentials and, thus, the sensitivity to the cued feature, within feature sensitive areas, or they may act to modulate stimulus-locked transients suppressing neural noise. It has been suggested that feature-based attention, unlike spatial attention that acts through a gaining mechanism, operates through both a gaining mechanism and a tuning mechanism that sharpens the responses of the relevant neurons (Ling et al. 2009). With regard to its function as a tuning mechanism, feature-based attention can alter

spatial tuning properties at least in area V4, since the orientation and spatial frequency tuning of many V4 neurons tends to shift to match the orientation and spatial frequency content of the target. These data appear to be consistent with a “matched filter mechanism” that shifts neurons’ tuning to increase the neural representation of relevant features, at the cost of the representation of irrelevant features; feature highlighting can occur not only by boosting the activations of the ensembles representing the relevant features, but also by biasing the sensitivity of the neuronal population toward attended features (Roe et al. 2012, 21). This view is supported by recent evidence (Kerkoerle et al. 2014) suggesting that selective attention enhances the γ high-frequency rhythm activity in the striate cortex and also increases α low-frequency rhythm activity in this area. Importantly, V1 neurons have their α low-frequency rhythm activity increased only if their RF falls on the background and not the target. Since the γ high-frequency rhythm is related to the feedforward propagation of information, while the α low-frequency rhythm is associated with feedback information, these results suggest that (i) selective attention, by increasing γ activity, either boosts the feedforward flow of information from the affected neurons to higher areas, or is an expression of the fact that there is a more efficient feedforward flow for attended stimuli; in either case, the findings can be described as the result of a gaining mechanism; (ii) feedback signals to V1 act so as to suppress the activity of the neurons that encode the background, thus sharpening the activity of the neuronal population so that the target is preferred, a result supported by the finding that strong α waves are associated with decreased firing rates of V1 neurons. In either case, feature pre-cueing makes the detection of the target easier, less expensive, and faster.
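
The difference between a gaining mechanism and a tuning mechanism can be sketched with an idealized tuning curve. The Gaussian form, the orientations, and the widths below are textbook-style assumptions introduced only for illustration; they are not parameters reported by Ling et al. (2009) or Roe et al. (2012).

```python
import math

def tuning_response(stimulus_deg, preferred_deg, width_deg, peak=1.0):
    """Idealized Gaussian orientation tuning curve (an assumed form)."""
    diff = stimulus_deg - preferred_deg
    return peak * math.exp(-(diff * diff) / (2.0 * width_deg ** 2))

def selectivity(peak, width_deg, target_deg=0.0, distractor_deg=30.0):
    """Response to the target divided by the response to a distractor,
    for a neuron whose preferred orientation is the target."""
    to_target = tuning_response(target_deg, target_deg, width_deg, peak)
    to_distractor = tuning_response(distractor_deg, target_deg, width_deg, peak)
    return to_target / to_distractor

baseline = selectivity(peak=1.0, width_deg=30.0)
gain_only = selectivity(peak=1.5, width_deg=30.0)  # stronger response, same width
sharpened = selectivity(peak=1.0, width_deg=20.0)  # same peak, narrower tuning

print(round(baseline, 2), round(gain_only, 2), round(sharpened, 2))
```

Under these assumptions, raising the peak leaves the target-to-distractor ratio where it was, whereas narrowing the tuning width increases it, which is one way to picture how sharpening favors the cued feature at the cost of others.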
Thus, the preparatory activity that occurs through pre-cues that rely on feature/object-based attention increases the baseline firing rate of the neurons preferring the attended stimulus that the participant is instructed to attend to or for which a cue is presented before the presentation of the stimulus. These effects are widespread from V1, V2 to upper levels of perceptual processing.

Research (Itti and Koch 2001; Montemayor and Haladjian 2015, 41–42; Raftopoulos 2009; Chapter 2) suggests that the visual objects are individuated by the early visual stage irrespective of whether they are targets or non-targets, which means that early vision retrieves the required information and individuates all objects in a visual scene, despite the modulation of the prestimulus activity due to object/feature-based pre-cueing. Further research suggests that there is rich information from a visual scene stored in early visual circuits independent of storage limitations of visual working memory and outside of the focus of attention; non-cued or suppressed items are perceived and processed despite the fact that they are not attended to. In addition, it is very likely that subjects have phenomenal awareness of some of these items (Block 2014; Bronfman et al. 2014; Frassle et al. 2014; Vandenbroucke et al. 2014). Therefore, despite feature- or object-based pre-cueing, both non-cued and cued items are all perceptually represented in early vision. This conclusion is reinforced by studies on the interplay between central and peripheral vision in viewing visual scenes and on the kinds of processes underlying early visual perception. It is known that 50% of early visual processing concerns information from peripheral vision and 50% concerns information from central (foveal and parafoveal) vision (Wang and Cottrell 2017). The processes of central vision aim to retrieve texture and shape information found in the visual scene, which are most likely represented in different brain areas, with shapes represented in LOC and textures in PPA, an area overlapping parahippocampal cortex (Wang and Cottrell 2017; Cant et al. 2015). The processes of information from peripheral vision serve to extract general information or the gist of the scene.
To extract texture from, and the gist of, the scene, the visual processes in peripheral vision perform summary statistics and extract statistical properties of the input, that is, the visual system represents the input by the joint statistics of responses of cells sensitive to position, orientation, phase, spatial frequency, color, shape, texture, etc.; these representations are also called ‘ensemble properties’ or ‘ensemble summary statistics’ (Arratha and Moore 2014, 2015a, b; Balas et al. 2009; Cant et al. 2015; Chong and Treisman 2003; Eisenberg and Zacks 2016; Im and Halberda 2013; Rosenholtz et al. 2012; Utochkin 2015). With regard to the perception of texture
pre-attentively, specifically, Julesz’s (1981) pioneer research suggests that texture is not perceived in its maximal detail, that is, early vision does not retrieve the exact texture of the objects in the visual scene. Instead, through statistical processes resulting in the retrieval of ensemble properties, early vision represents textures in terms of four or five prototypical texture features, called textons, that is, it represents textures as a combination of these textons. Differently put, textures are represented in early vision as ensemble, statistical properties. These ensemble statistics are thought to be pre-attentive, to occur in parallel and to be automatic. This entails that they are not subject to bottleneck capacity limitations brought about by attention and VWM (Chong and Treisman 2003). These traits of ensemble statistics are currently under debate (Arratha and Moore 2014, 2015a, b) but even those who think that there are capacity limitations argue, first, that these concern the extraction of ensemble statistics within certain feature dimensions and not between feature dimensions, that is, the limitations concern the extraction of the same ensemble features across various collections of objects and not the extraction of ensemble statistics of different features across collections (Arratha and Moore 2015a). Second, the evidence suggests (Arratha and Moore 2015b, 1129) that some processes of extracting ensemble statistics do engage unlimited capacity processes (and, thus, are pre-attentional and occur in parallel). Such processes include contrast discrimination, size discrimination of individual objects, image shape discrimination, symmetry detection, modal, and amodal completion; these are the processes that are implicated in segmentation aspects of visual processing. 
In contrast, processes involved in object categorization and semantic processing (summary statistics of mean size, object categorization, object shape identification, mean orientation, and word categorization) are capacity-limited, which means that they involve attention and VWM. This is important because as the reader recalls, the former set of processes characterize early vision, in which objects are segmented from background, while the latter characterizes late vision where objects are recognized and categorized. The evidence, thus, suggests that early visual processes retrieve in parallel from the environment information (whether it be information about specific object or information about the statistical properties of the

objects in the visual scene) irrespective of task demands and, equivalently, cognitive influences. Both effects of pre-cueing reflect a change in background neural activity and, thus, rig-up perceptual processing. These effects are called anticipatory effects and are established prior to viewing the stimulus. In this sense, they do not modulate processing during stimulus viewing but they bias the process before it starts; they rig-up, as it were, perceptual processing without affecting it online. There are various interpretations of the effects of pre-cueing on the neural activity in the occipital areas of the brain. They may act so as to increase the baseline firing rates of the neurons that encode the pre-cued stimuli; these are cases of gain modulation. Alternatively, they may act so as to suppress noisy neural activity rather than to increase the activity of the neurons that encode the information contained in pre-cueing signal (Hegde and Kersten 2010; Murray et al. 2004). It may also be that a variety of mechanisms are available and which one is chosen depends on the task at hand, which means that attention is flexible to solicit different ways to modulate the activity of neurons so as to change visual representations at a cellular level and affect the functional properties of neurons (Gilbert and Li 2013). In all these cases the net result is the same: the anticipatory activity sharpens and optimizes the response properties of the affected neurons according to anticipated stimulus (and this happens independent of whether a stimulus is expected as more probable to appear, or attended to as more relevant to the viewer’s purposes). As such, the anticipatory effects do not emerge as part of perceptual competition and in this sense they are not intrinsic to the perceptual processing (Nobre et al. 2012, 161), which is otherwise unaffected by top-down effects. 
During the feedforward processing (FFS) and local recurrent processing (LRP; local in the sense that it consists of purely visual signals and does not involve any cognitive signals) there is no top-down cognitive activity owing to pre-cueing that modulates the perceptual processing, which is data driven. What pre-cueing does is to set the values of some parameters of the transformation rules in FFS processing. When they set the parameters of the transformation rules, pre-cueing effects highlight some information in the visual scene, by enhancing the activation of the neurons that
encode this information, but they do not create the proximal image or stimulus. What they essentially do is to modulate early perceptual filters; in this sense, they act “as a ‘filter’ that ‘selects’ the information for downstream processing, which may itself be impervious to cognitive influence” (Firestone and Scholl 2016, 23–24). These parameters can be construed as the attentional parameters that weight the effect of sensory signals, as they are postulated in computational models of perceptual attention, such as the model of divisive normalization proposed by Lee et al. (2009). Pre-cueing may increase the value of some parameter and decrease that of another, and this results in some input being given priority in terms of subsequent processing, but this does not mean that early vision fails to retrieve all the information in the visual scene. Pre-cueing effects, therefore, do not select which information is retrieved from the visual scene once the visual scene has been determined; all information from the visual scene is retrieved in parallel in early vision. In the case of spatial pre-cueing, the anticipatory effects do not determine the percept, since pre-cueing enhances the responses of all neurons tuned to the attended location independent of the neurons’ preferred stimuli and keeps the differential responses of the neurons unaltered. In the case of object/feature pre-cueing, although the anticipatory effects enhance the activity of the neurons responding preferentially to the pre-cued object or feature, increasing the likelihood that they will be selected eventually for further processing, early vision still retrieves in parallel information concerning all the objects and features present in the visual scene, so that these objects are individuated independently of whether they are targets or non-targets.
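
The idea that attentional parameters weight sensory signals without excluding any of them can be pictured with a generic divisive normalization computation. The sketch below uses a common schematic form of such models; it is not the specific set of equations proposed by Lee et al. (2009), and the drives, weights, and the constant `sigma` are hypothetical.

```python
def normalized_responses(drives, attention, sigma=1.0):
    """Generic divisive normalization: each attention-weighted drive is divided
    by a constant plus the pooled attention-weighted drive of all units."""
    weighted = [weight * drive for weight, drive in zip(attention, drives)]
    pool = sigma + sum(weighted)
    return [value / pool for value in weighted]

# Hypothetical sensory drives for three items in the visual scene.
drives = [10.0, 10.0, 10.0]

neutral = normalized_responses(drives, attention=[1.0, 1.0, 1.0])
cued = normalized_responses(drives, attention=[2.0, 1.0, 1.0])  # first item pre-cued

print([round(r, 3) for r in neutral])  # [0.323, 0.323, 0.323]
print([round(r, 3) for r in cued])     # [0.488, 0.244, 0.244]
```

In this toy case the pre-cued item gains priority and the others lose some, yet every item still evokes a non-zero response, which matches the claim that weighting changes what is prioritized and not what is retrieved.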
If pre-cueing does not affect the information retrieved from the visual scene, the relevant cognitive states involved do not affect the selection of the ‘evidence’ or the information against which hypotheses concerning object identity will be tested in late vision. It follows that pre-cueing and the various cognitive effects underlying it do not affect the epistemic role of early vision; pre-cueing does not entail the CP of perception. Rigging-up perceptual processing by streamlining FFS is not an instance of a cognitively driven selective attentional control in which attention is used online, that is, during visual processing, to select for further processing
a specific feature or object in a given visual scene by increasing the firing rates of neurons that have a stimulus-evoked response to a particular stimulus; in this case, top-down signals modulate perceptual processing during stimulus viewing. In pre-cueing effects, in contradistinction, the visual processing during stimulus viewing in early vision relies solely on bottom-up processing or top-down and lateral processing restricted within visual areas. This is different from the role of attentional control during visual processing that involves a top-down attentional control of the perceptual input, in addition to the bottom-up processing that carries information from the proximal scene.

The effects of TMS on FEF in relation to pre-cueing were studied by Taylor and Nobre (2007), who applied TMS to the right FEF during the spatial cueing period of a covert attentional task. They found that inducing activity in the right FEF with TMS during the cueing period of a rule-guided covert endogenous attentional orienting task modulated ERPs recorded over visual cortex, which suggests that the TMS applied to FEF altered functional processes related to perception and attention in the visual cortex. FEF TMS had a causal impact on visual activity measured with ERPs (Taylor and Nobre 2007). The earliest effect of TMS was a sustained negative deflection, which became significant after the third TMS pulse, during the interval between the cue and the visual stimulus. This negativity remained until 200 ms after stimulus onset. The data were normalized to a pre-TMS baseline period to emphasize ERP shifts occurring after warning cue onset but before visual stimulus presentation. The normalization shows that this negativity remained present in the ERP until 200 ms after stimulus presentation, which means that this negativity can be interpreted as an effect on visual processing at the time of the attentional modulation of the ERP.
Since the attentional modulation of the occipital visual areas is delayed in time and occurs after 170 ms poststimulus, one would expect that the TMS applied to FEF would affect the neuronal activity in early visual areas with a similar time delay, if the TMS effects on FEF affected online visual processing by controlling top-down attention. Indeed, when Taylor and Nobre (2007) isolated the stimulus-evoked activity by using the peristimulus period as the baseline, ERPs differed significantly as a result of TMS applied to FEF at 200 ms.

The study by Taylor and Nobre (2007) makes it clear that TMS is affecting ongoing visual cortical activity even prior to visual stimulation, and it is not just affecting the visual cortex’s generation of an ERP. These results mean that (a) the TMS applied to FEF affects neuronal activity in the posterior visual areas prior to the presentation of the stimulus, in accordance with the view that the FEF causally affects and modulates the visual activity in posterior visual areas when spatial attention is being allocated before stimulus presentation; this spatial pre-cueing may act either as a preparatory activity to enhance the stimulus-evoked potentials and, thus, the sensitivity to the information coming from the cued areas, or it may act so as to modulate stimulus-locked transients suppressing neural noise. Note that the fact that the enhancement of FEF activity after the spatial cue but before stimulus presentation does not increase after stimulus presentation (unlike in areas V2, V4 and IT), that is, the fact that no additional activity is evoked in FEF by the onset of visual stimuli (Kastner et al. 1999), possibly means that FEF activity is related more to the attentional demands and operations of the task than to perceptual processing. Zhou and Thompson (2009, 1214) confirm this interpretation:

“The results of our study provide physiological evidence that FEF neurons represent the locus of endogenous covert spatial attention in the absence of visual input. The FEF neurons with anticipatory activity are ideally suited to convey a top-down spatial attention signal to visual cortex that enhances the processing of behaviorally important visual stimuli.”

(b) TMS on FEF continues to affect the visual cortical activity generated by the visual stimulus for about 200 ms after stimulus presentation, which refutes the view that visual stimulation causes the immediate cessation of the cortical processes that were started by the TMS; the prestimulus stimulation of FEF keeps playing a role in the control of top-down spatial attention even after stimulus onset.

(c) The effects on visually evoked activity of the FEF-controlled top-down attention are felt on the posterior visual areas at about 200 ms after stimulus onset, which means that their latencies fall within late vision but outside early vision. This last result is very important because it shows that the cognitive states that drive cognitively driven attention do not affect early vision but only late vision.

There is an additional question that needs to be answered. As I have said, in the literature, CP goes hand in hand with the thesis that cognition directly affects early vision in the sense that the processes of early vision use the cognitive contents of the penetrating cognitive states as an informational resource. The question, thus, is the following. Do pre-cueing effects suggest that cognition directly affects early vision? One could start by claiming that the fact that the cognitive states do not influence the retrieval of information from a visual scene suggests that the cognitive states do not affect the perceptual processing itself and, therefore, that their influence is not direct. This needs arguing for, however. In view of the fact that the electrophysiological signatures of pre-cueing effects are found within the time frame of early vision, one must examine these electrophysiological signatures. One response could be that they are carry-over effects of the initial enhanced activation of the relevant color-sensitive areas owing to pre-cueing, that is, the result of the anticipatory effect of pre-cueing. This would mean that the fact that they are found during early vision processing does not entail that the contents of the early vision states that participate in these processes are affected by cognitive information, or equivalently, that the processes of early vision operate over such cognitive contents.
A way to express this is to say that even though pre-cueing effects set the attentional parameters that we discussed in the previous paragraphs and these parameters in turn affect the perceptual processing, pre-cueing effects act so as to set some initial values but they do not alter the equations that govern the state transformation in which the processing consists. It follows that pre-cueing does not affect the perceptual processes, which means that pre-cueing effects do not affect early vision directly; they are indirect, extrinsic effects on both early and late vision.
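
The contrast between setting initial values and altering the governing equations can be put schematically. The following is only an illustrative sketch; the symbols are assumptions introduced here for exposition and are not notation taken from the text or from Lee and Maunsell's model. Let $x_t$ stand for the state of early visual processing at step $t$, $s_t$ for the bottom-up input, $\theta$ for the attentional parameters, $c$ for the pre-cueing cognitive state, and $f$ for the fixed rule governing the state transformations:

\[
x_{t+1} = f(x_t, s_t; \theta), \qquad x_0 = x_0(c), \qquad \theta = \theta(c).
\]

On this reading, $c$ fixes $x_0$ and $\theta$ before stimulus onset, but $c$ is not itself an argument of $f$ during stimulus viewing, and $f$ is the same whatever the value of $c$. Two viewers with different cues would thus start from different values while their early vision runs the same transformation over the same input.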

Another way to put this is that the attentional parameters in visual computations provide an example of how cognitive contents can be accessed and operated over without their role in the computation being appropriately inference-like, that is, without there being a semantic or logical, reason-giving, relation between the cognitive contents that issue the attentional commands that set the values of the attentional parameters and the contents of the perceptual states that participate in the affected perceptual process. This is important because one of the reasons Pylyshyn (1999) adduces to support his claim that early vision is CI is that CP requires that the cognitive and the perceptual contents stand in a semantic, quasi-logical relation of the sort found in the way the premises of some argument provide reasons for its conclusion. Even though a computational transition might itself be deemed an inference, or inference-like, not all elements of the computation, the attentional parameters, for example, need be quasi-reason-giving. The attentional weights that in Lee and Maunsell’s model are computationally relevant affect computations in a way that does not presuppose that the cognitive contents that set them actually stand in the appropriate reason-giving relation that CP requires.

To repeat the main conclusion drawn from the discussion in this section, neither are pre-cueing effects direct cognitive influences on early vision, nor do they affect the epistemic role of early vision, which consists in retrieving from the environment and storing in the proximal image the information contained in the visual scene. This means that early vision satisfies neither the directness nor the epistemic demand for CP and, thus, is CI.

References

Attarha, M., & Moore, C. M. (2014). Orientation summary statistics are limited in processing capacity. Visual Cognition, 22(8), 1018–1022.
Attarha, M., & Moore, C. M. (2015a). The perceptual processing capacity of summary statistics between and within feature dimensions. Journal of Vision, 15(4), 1–17.
Attarha, M., & Moore, C. M. (2015b). The capacity limitations of orientation summary statistics. Attention, Perception, & Psychophysics, 77, 1116–1131.
Balas, B., Nakano, L., & Rosenholtz, R. (2009). A summary-statistic representation in peripheral vision explains visual crowding. Journal of Vision, 9(12), 13.
Beck, D. M., & Kastner, S. (2009). Top-down and bottom-up mechanisms in biasing competition in the human brain. Vision Research, 49, 1154–1165.
Biederman, I. (1987). Recognition by components: A theory of human image understanding. Psychological Review, 94, 115–147.
Block, N. (2007). Two neural correlates of consciousness. In N. Block (Ed.), Collected Papers, vol. 1 (pp. 342–362). Cambridge: MIT Press.
Block, N. (2014). Seeing-as in the light of vision science. Philosophy and Phenomenological Research, 89(3), 560–572.
Brogaard, B., & Gatzia, D. (2017). Color and cognitive penetrability. Topics in Cognitive Science, 9(1), 193–214.
Bronfman, Z. Z., Brezis, N., Jacobson, H., & Usher, M. (2014). We see more than we can report: ‘Cost free’ color phenomenality outside focal attention. Psychological Science, 25(7), 1394–1403.
Bullier, J. (2001). Integrated model of visual processing. Brain Research Reviews, 36, 96–107.
Burge, T. (2010). Origins of Objectivity. Oxford: Clarendon Press.
Campbell, J. (2006). Does visual attention depend on sortal classification? Reply to Clark. Philosophical Studies, 127, 221–237.
Cant, J. S., Sun, S. Z., & Xu, Y. (2015). Distinct cognitive mechanisms involved in the processing of single objects and object ensembles. Journal of Vision, 15(4), 1–21.
Carrasco, M. (2011). Visual attention: The past 25 years. Vision Research, 51, 1484–1525.
Carrasco, M., Ling, S., & Read, S. (2004). Attention alters appearance. Nature Neuroscience, 7, 308–313.
Cavanagh, P. (2011). Visual cognition. Vision Research, 51, 1538–1551.
Cecchi, A. (2014). Cognitive penetration, perceptual learning, and neural plasticity. Dialectica, 68(1), 63–95.
Chaumon, M., Drouet, V., & Tallon-Baudry, C. (2008). Unconscious associative memory affects visual processing before 100 ms. Journal of Vision, 8(3), 1–10.
Chelazzi, L., Miller, E., Duncan, J., & Desimone, R. (1993). A neural basis for visual search in inferior temporal cortex. Nature, 363, 345–347.
Chong, S. C., & Treisman, A. (2003). Representation of statistical properties. Vision Research, 43(4), 393–404.
Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36, 181–253.
Crane, T. (2009). Is perception a propositional attitude? The Philosophical Quarterly, 59(236), 453–470.
Crouzet, S. M., Kirchner, H., & Thorpe, S. J. (2010). Fast saccades toward faces: Face detection in just 100 ms. Journal of Vision, 10(4), 1–17.
Davies, M. (1995). Tacit knowledge and subdoxastic states. In C. Macdonald & G. Macdonald (Eds.), Philosophy of Psychology: Debates on Psychological Explanation. Oxford: Blackwell.
Delorme, A., Rousselet, G. A., Mace, M. J.-M., & Fabre-Thorpe, M. (2004). Interaction of top-down and bottom-up processing in the fast visual analysis of natural scenes. Brain Research, 19, 103–113.
Dennett, D. C. (1983). Styles of mental representation. Proceedings of the Aristotelian Society, 83, 213–226.
Deroy, O. (2013). Object-sensitivity versus cognitive penetrability of perception. Philosophical Studies, 162, 87–107.
Drewes, J., Goren, G., Zhu, W., & Elder, J. H. (2016). Recurrent processing in the formation of percept shapes. The Journal of Neuroscience, 36(1), 185–192.
Eisenberg, M. L., & Zacks, J. M. (2016). Ambient and focal visual processing of naturalistic activity. Journal of Vision, 16(2), 1–12.
Evans, G. (1982). The Varieties of Reference. Oxford: Clarendon Press.
Evans, M. A., Shedden, J. M., Hevenor, S. J., & Hahn, M. C. (2000). The effect of variability of unattended information on global and local processing: Evidence from lateralization at early stages of processing. Neuropsychologia, 38, 225–239.
Firestone, C., & Scholl, B. J. (2016). Cognition does not affect perception: Evaluating the evidence for ‘top-down’ effects. Behavioral and Brain Sciences. http://dx.doi.org/10.1017/S0140525X15000965.
Fodor, J. (1983). The Modularity of Mind. Cambridge: MIT Press.
Fodor, J. (2007). The revenge of the given. In B. P. McLaughlin & J. Cohen (Eds.), Contemporary Debates in the Philosophy of Mind. Malden, MA: Blackwell.
Fodor, J., & Pylyshyn, Z. (2015). Minds Without Meanings: An Essay on the Content of Concepts. Cambridge: MIT Press.
Foxe, J. J., & Simpson, G. V. (2002). Flow of activation from V1 to frontal cortex in humans. Experimental Brain Research, 142(1), 139–150.
Frässle, S., Sommer, J., Jansen, A., Naber, M., & Einhäuser, W. (2014). Binocular rivalry: Frontal activity relates to introspection and action but not to perception. The Journal of Neuroscience, 34(1), 1738–1747.
Freiwald, W. A., & Kanwisher, N. G. (2004). Visual selective attention: Evidence from brain imaging and neurophysiology. In M. Gazzaniga (Ed.), The Cognitive Neurosciences III (3rd ed.). Cambridge: MIT Press.
Gatzia, D., & Brogaard, B. (2017). The real epistemic significance of perceptual learning. Inquiry, 543–558. https://doi.org/10.1080/0020174x.2017.1368172.
Gilbert, C. D., & Li, W. (2013). Top-down influences on visual processing. Nature Reviews Neuroscience, 14(5), 350–363.
Gilbert, C. D., Ito, M., Kapadia, M., & Westheimer, G. (2000). Interactions between attention, context and learning in primary visual cortex. Vision Research, 40, 1217–1226.
Goldstone, R. L., de Leeuw, J. R., & Landy, D. H. (2015). Fitting perception in and to cognition. Cognition, 135, 24–29.
Grill-Spector, K., Henson, R., & Martin, A. (2006). Repetition and the brain: Neural models of stimulus-specific effects. Trends in Cognitive Sciences, 10, 14–23.
Grill-Spector, K., Kushnir, T., Hendler, T., Edelman, S., Itzchak, Y., & Malach, R. (1998). A sequence of object-processing stages revealed by fMRI in the human occipital lobe. Human Brain Mapping, 6, 316–328.
Haugeland, J. (1998). Having Thought. Cambridge: Harvard University Press.
Hayden, B. Y., & Gallant, J. L. (2009). Combined effects of spatial and feature-based attention on responses to V4 neurons. Vision Research, 49, 1182–1187.
Heck, R. G., Jr. (2007). Are there different kinds of content? In J. Cohen & B. McLaughlin (Eds.), Contemporary Debates in the Philosophy of Mind. Oxford: Blackwell.
Heeger, D. J., & Ress, D. (2004). Neuronal correlates of visual attention and perception. In M. Gazzaniga (Ed.), The Cognitive Neurosciences (3rd ed.). Cambridge: MIT Press.
Hegde, J., & Kersten, D. (2010). A link between visual disambiguation and visual memory. The Journal of Neuroscience, 30(45), 15124–15133.
Heinen, K., Jolij, J., & Lamme, V. A. (2015). Figure-ground segregation requires two distinct periods of activity in V1: A transcranial magnetic study. Neuroreport, 16(13), 1483–1487.
Hopfinger, J. B., Luck, S. J., & Hillyard, S. A. (2004). Selective attention. In M. S. Gazzaniga (Ed.), The Cognitive Neurosciences (3rd ed.). Cambridge: MIT Press.
Im, H. Y., & Halberda, J. (2013). The effects of sampling and internal noise of the representation of ensemble average size. Attention, Perception, & Psychophysics, 75, 278–286.
Inui, K., & Kakigi, R. (2006). Temporal analysis of the flow from V1 to extrastriate cortex. Journal of Neurophysiology, 96, 775–784.
Itti, L., & Koch, C. (2001). Computational modelling of visual attention. Nature Reviews Neuroscience, 2, 194–204.
Johnson, J. S., & Olshausen, B. A. (2005). The earliest EEG signatures of object recognition in a cued target task are postsensory. Journal of Vision, 5, 299–312.
Johnston, M. (2006). Better than mere knowledge: The function of sensory awareness. In T. S. Gendler & J. Hawthorne (Eds.), Perceptual Experience. Oxford: Clarendon Press.
Julesz, B. (1981). Textons, the elements of texture perception, and their interactions. Nature, 90(12), 91–97.
Kastner, S., & Ungerleider, L. G. (2000). Mechanisms of visual attention in the human cortex. Annual Review of Neuroscience, 23, 315–341.
Kastner, S., Pinsk, M. A., De Weerd, P., Desimone, R., & Ungerleider, L. (1999). Increased activity in human visual cortex during directed attention in the absence of visual stimulation. Neuron, 22, 751–761.
Kelly, S. D. (2001). Demonstrative concepts and experience. The Philosophical Review, 110(3), 397–420.
Kerkoerle, T. van, Self, M. W., Dagnino, B., Gariel-Mathis, M.-A., Poort, J., Togt, C. van der, & Roelfsema, P. R. (2014). Alpha and gamma oscillations characterize feedback and feedforward processing in monkey visual cortex. Proceedings of the National Academy of Sciences, USA (PNAS), 111(40), 14332–14341.
Keysers, C., Xiao, D. K., Földiák, P., & Perrett, D. (2014). The speed of sight. Journal of Cognitive Neuroscience, 13, 90–101.
Kirchner, H., & Thorpe, S. J. (2006). Ultra-rapid object detection with saccadic movements: Visual processing speed revisited. Vision Research, 46, 1762–1776.
Kok, P., Brouwer, G., van Gerven, M., & de Lange, F. (2013). Prior expectations bias sensory representations in visual cortex. Journal of Neuroscience, 33, 16275–16284.
Kok, P., Failing, M., & de Lange, F. (2014). Prior expectations evoke stimulus templates in the primary visual cortex. Journal of Cognitive Neuroscience, 26, 1546–1554.
Lamme, V. A. F. (2003). Why visual attention and awareness are different. Trends in Cognitive Sciences, 7(1), 12–18.
Lamme, V. A. F. (2005). Independent neural definitions of visual awareness and attention. In A. Raftopoulos (Ed.), The Cognitive Penetrability of Perception: An Interdisciplinary Approach. Hauppauge, NJ: NovaScience Books.
Lamme, V. A. F., & Roelfsema, P. R. (2000). The distinct modes of vision offered by feedforward and recurrent processing. Trends in Neurosciences, 23, 571–579.
Lawrence, B. M., White, R. L., & Snyder, L. H. (2005). Delay-period activity in visual, visuomovement, and movement neurons in the frontal eye field. Journal of Neurophysiology, 94(2), 1498–1508.
Lee, J., & Maunsell, J. H. R. (2009). A normalization model of attentional modulation of single unit responses. PLoS One, 4(2), e4651.
Ling, S., Liu, T., & Carrasco, M. (2009). How spatial and feature-based attention affect the gain and tuning of population responses. Vision Research, 49, 1194–1204.
Liu, T., Stevens, S. T., & Carrasco, M. (2007). Comparing the time course and efficacy of spatial and feature-based attention. Vision Research, 47, 108–113.
Liu, H., Agam, Y., Madsen, J., & Kreiman, G. (2009). Timing, timing, timing: Fast decoding of object information from intracranial field potentials in human visual cortex. Neuron, 62, 281–290.
Luck, S. J. (1995). Multiple mechanisms of visual-spatial attention: Recent evidence from human electrophysiology. Behavioral and Brain Research, 71, 113–123.
Lupyan, G. (2015). Object knowledge changes visual appearance: Semantic effects on color afterimages. Acta Psychologica, 161, 117–130.
Marr, D. (1982). Vision: A Computational Investigation into Human Representation and Processing of Visual Information. San Francisco, CA: Freeman.
Montemayor, C., & Haladjian, H. (2015). Consciousness, Attention, and Conscious Attention. Cambridge: MIT Press.
Morishima, Y., Akaishi, R., Yamada, Y., Okuda, J., Toma, K., & Sakai, K. (2008). Task-specific signal transmission from prefrontal cortex in visual selective attention. Nature Neuroscience, 12(1), 85–90.
Murray, S. O., Schrater, P., & Kersten, D. (2004). Perceptual grouping and the interactions between visual cortical areas. Neural Networks, 17, 695–705.
Newen, A., & Vetter, P. (2016). Why cognitive penetration of our perceptual experience is still the most plausible account. Consciousness and Cognition. http://dx.doi.org/10.1016/j.concog.2016.09.005.
Nobre, A. C., Rohenkohl, G., & Stokes, M. G. (2012). Nervous anticipation: Top-down biasing across space and time. In M. Posner (Ed.), Cognitive Neuroscience of Attention (2nd ed.). New York, NY: The Guilford Press.
Ogilvie, R., & Carruthers, P. (2015). Opening up vision: The case against encapsulation. Review of Philosophy and Psychology. https://doi.org/10.1007/s13164-015-0294.
O’Shea, J., Muggleton, N. G., Cowey, A., & Walsh, V. (2004). Timing of target discrimination in human frontal eye fields. Journal of Cognitive Neuroscience, 16(6), 1060–1067.
Peacocke, C. (1998). Nonconceptual content defended. Philosophy and Phenomenological Research, 58(2), 381–388.
Peacocke, C. (2001). Does perception have a nonconceptual content? The Journal of Philosophy, XCVIII(5), 239–269.
Peterson, M. A., & Enns, J. (2005). The edge complex: Implicit memory for figure assignment in shape perception. Perception and Psychophysics, 67(4), 727–740.
Phillips, B. (2017). The shifting border between perception and cognition. Nous, 1–31. https://doi.org/10.1111/nous.12218.
Plomp, G., Hervais-Adelman, A., Astolfi, L., & Michel, C. M. (2015). Early recurrence and ongoing parietal driving during elementary visual processing. Scientific Reports, 5, 18733. https://doi.org/10.1038/srep18733.
Potter, M. C., Wyble, B., Hagmann, C. E., & McCourt, E. S. (2014). Detecting meaning in RSVP at 13 ms per picture. Attention, Perception, & Psychophysics, 76, 270–279.
Pylyshyn, Z. (1999). Is vision continuous with cognition? Behavioral and Brain Sciences, 22, 341–365.
Pylyshyn, Z. (2003). Seeing and Visualizing: It’s Not What You Think. Cambridge: MIT Press.
Pylyshyn, Z. (2007). Things and Places: How the Mind Connects with the World. Cambridge: MIT Press.
Raftopoulos, A. (2001a). Is perception informationally encapsulated? The issue of the theory-ladenness of perception. Cognitive Science, 25, 423–451.
Raftopoulos, A. (2001b). Reentrant pathways and the theory-ladenness of observation. Philosophy of Science, 68, 187–200.
Raftopoulos, A. (2009). Cognition and Perception: How Do Psychology and Neural Science Inform Philosophy? Cambridge: MIT Press.
Raftopoulos, A. (2014). The cognitive impenetrability of the content of early vision is a necessary and sufficient condition for purely nonconceptual content. Philosophical Psychology, 27(5), 601–620.
Raftopoulos, A. (2015). Cognitive penetrability and consciousness. In J. S. Zeimbekis & A. Raftopoulos (Eds.), Cognitive Effects on Perception: New Philosophical Perspectives (pp. 268–298). Oxford: Oxford University Press.
Raftopoulos, A., & Zeimbekis, J. (2015). The cognitive penetrability of perception: An overview. In J. Zeimbekis & A. Raftopoulos (Eds.), The Cognitive Penetrability of Perception: New Perspectives. Oxford: Oxford University Press.
Raftopoulos, A. (2017). Pre-cueing, the epistemic role of early vision and the cognitive impenetrability of early vision. Frontiers in Psychology, 8, 1156. https://doi.org/10.3389/fpsyg.2017.01156.
Reynolds, J. H., & Chelazzi, L. (2004). Attentional modulation of visual processing. Annual Review of Neuroscience, 27, 611–647.
Rock, I. (1983). The Logic of Perception. Cambridge: MIT Press.
Roe, A. W., Chelazzi, L., Connor, C. E., Conway, B. R., Fujita, I., Gallant, J. L., et al. (2012). Toward a unified theory of visual area V4. Neuron, 74, 12–29.
Roelfsema, P. R., Lamme, V. A. F., & Spekreijse, H. (1998). Object-based attention in the primary visual cortex of the macaque monkey. Nature, 395, 376–381.
Rosenholtz, R., Huang, J., Raj, A., Balas, B. J., & Ilie, L. (2012). A summary statistic representation in peripheral vision explains visual search. Journal of Vision, 12(4), 14.
Schall, J. D., & Bichot, N. P. (1998). Neural correlates of visual and motor decision processes. Current Opinions in Neurobiology, 76, 2841–2852.
Searle, J. R. (1995). Consciousness, explanatory inversion and cognitive science. In C. Macdonald & G. Macdonald (Eds.), Philosophy of Psychology: Debates on Psychological Explanation. Oxford: Blackwell.
Shibata, K., Yamagishi, N., Naokazu, G., Yoshioka, T., Yamashita, O., Sato, M., et al. (2008). The effects of feature attention on prestimulus cortical activity in the human visual system. Cerebral Cortex, 18, 1644–1675.
Silvanto, J., Lavie, N., & Walsh, V. (2006). Stimulation of the human frontal eye fields modulates sensitivity of extrastriate visual cortex. Journal of Neurophysiology, 96(2), 941–945.
Silvanto, J., Cowey, A., Lavie, N., & Walsh, V. (2005). Striate cortex (V1) activity gates awareness of motion. Nature Neuroscience, 8(2), 143–144.
Spelke, E. S. (1988). Object perception. In A. I. Goldman (Ed.), Readings in Philosophy and Cognitive Science (pp. 447–461). Cambridge: MIT Press.
Stich, S. (1978). Beliefs and subdoxastic states. Philosophy of Science, 45, 499–518.
Taylor, P. C. J., & Nobre, A. (2007). FEF TMS affects visual cortical activity. Cerebral Cortex, 17, 391–399.
Thompson, K. G., & Schall, J. D. (2000). Antecedents and correlates of visual detection and awareness in macaque prefrontal cortex. Vision Research, 40, 1523–1538.
Torralba, A., & Oliva, A. (2003). Statistics of natural image categories. Network, 14, 391–412.
Ullman, S., Vidal-Naquet, M., & Sali, E. (2002). Visual features of intermediate complexity and their use in classification. Nature Neuroscience, 5(7), 682–687.
Utochkin, I. S. (2015). Ensemble summary statistics as a basis for rapid visual categorization. Journal of Vision, 15(4), 1–14.
Vandenbroucke, A. R. F., Fahrenfort, J. J., Sligte, I. G., & Lamme, V. A. F. (2014). Seeing without knowing: Neural signatures of perceptual inference in the absence of report. Journal of Cognitive Neuroscience, 26(5), 955–969.
Vetter, P., & Newen, A. (2014). Varieties of cognitive penetration in visual perception. Consciousness and Cognition, 27, 62–75.
Wang, P., & Cottrell, G. W. (2017). Central and peripheral vision for scene recognition: A neurocomputational modeling exploration. Journal of Vision, 17(4), 1–22.
Wilson, H. R., & Wilkinson, F. (2015). From orientations to objects: Configural processing in the ventral system. Journal of Vision, 15(7:4), 1–10.
Wokke, M., Sligte, I. G., Scholte, H. S., & Lamme, V. A. F. (2012). Two critical periods in early visual cortex during figure-ground segregation. Brain and Behavior, 2(6), 763–777.
Wyart, V., Nobre, A., & Summerfield, C. (2012). Dissociable prior influences of signal probability and relevance on visual contrast sensitivity. Proceedings of the National Academy of Sciences, 109, 3593–3598.
Zhou, H. H., & Thompson, K. G. (2009). Cognitively directed spatial attention in the frontal eye field in anticipation of visual stimuli to be discriminated. Vision Research, 49, 1205–1215.

4 The Cognitive Effects on Early and Late Vision and Their Epistemological Impact

© The Author(s) 2019 A. Raftopoulos, Cognitive Penetrability and the Epistemic Role of Perception, Palgrave Innovations in Philosophy, https://doi.org/10.1007/978-3-030-10445-0_4

1 Introduction

In Chapter 1, I examined some of the epistemic problems related to the Cognitive Penetrability (CP) of perception. I agreed with Siegel’s view that only the cognitive effects that affect perceptual processing and, in this sense, act internally to perception should count as cases of CP. I explained in Chapter 2 why this should be so. In the same chapter, I raised a problem for Siegel’s view that attention does not entail CP because, according to Siegel, attention just selects the input and does not affect perception itself. I argued that not all attentional effects act externally to perception and that there are attentional effects that affect perceptual processing itself and, thus, may entail CP of perception. It is wrong, thus, to exclude attention from signifying CP, as Siegel does. The attentional effects that occur in late vision, for example, which are internal since they affect perceptual processing itself, do signify that late vision is CP. At the same time, as seen in Chapter 3, other internal attentional effects, such as pre-cueing, do not affect early vision and do not entail the CP of early vision, notwithstanding their internal character.

We saw that we must account for the fact that some forms of CP epistemically downgrade perception, while others do not. I proposed in Chapter 1 that the confluence of emotive and cognitive influences on late vision could create the irrational etiology that leads to the epistemic downgrade of perception by rendering it less sensitive to the data. I also claimed that any adequate account of the epistemic effects of CP should incorporate a discussion of the extent to which the CP diminishes the sensitivity to the data, because this is what undermines, in the last analysis, the evidential role of perception. In the previous chapters, I argued that a necessary, but not sufficient, condition for this to occur is that the processes by means of which information is retrieved from a visual scene (which constitute early vision) should not be affected directly by cognitive states. In other words, that some cognitive effects on perception do not epistemically downgrade perception is made possible by the fact that there is a perceptual stage, namely early vision, which is not affected directly by any cognitive states. This is not a sufficient condition because the Cognitive Impenetrability (CI) of early vision does not prevent CP from downgrading perception during late vision, which means that the causes of the epistemic downgrade of perception are related to the way cognition affects late vision and, specifically, the way late vision exploits the iconic image formed during early vision; any irrational etiology in perception comes from late vision.
In the first section of this chapter, I develop further the repercussions for the role of perception of the fact that early vision is CI since neither is it affected directly by cognition, nor is its epistemic role affected by the indirect cognitive effects on it. In the second section, I turn to late vision and argue that owing to the CI of early vision, the fact that late vision is CP does not entail any of the consequences that relativism thought it did.

2 Indirect Cognitive Effects on Early Vision and Their Epistemic Impact

In the preceding chapters, I argued that there do not exist any direct cognitive influences on early vision, but only indirect effects. The fact that early vision is only indirectly affected by cognitive states and, thus, the cognitive states do not have a say as to which information is retrieved from the visual scene since they do not affect perceptual processing itself, entails that cognition does not affect the epistemic role of early vision, because all data from the visual scene are in the iconic image and are available to late vision. Early vision, therefore, is not responsible for the epistemic downgrade of perception that the CP of perception may cause. The culprit must be late vision, where the percept is formed. If early vision were directly affected by cognition, the cognitive states would determine which information is retrieved from the visual scene, and, consequently, they would shape the iconic image and, thus, the visual data used by late vision. This would make late vision necessarily insensitive to the distal data because the iconic image would contain only the distal data that favor the penetrating cognitive states and would disregard nonconforming information. Should this be the case, cognitive effects would always make perception insensitive to the data and, thus, would necessarily downgrade perception and, moreover, there would be no way to reverse the harmful epistemic effects. The reader should recall that Siegel (2016) argues that the rationality of the perceptual processes that issue the percept depends crucially on whether these processes, and consequently the percept, are resistant to the evidence, which in the case of perception is contained in what Siegel calls pre-experiential perceptual states.
When desires control the perceptual processes and control the evidence, the percept is resistant to the information contained in these pre-experiential perceptual states, in the sense that it resists change in the face of incongruent information. In this case, the CP of perception by desire downgrades perception because it undermines the epistemic status of the percept in justifying perceptual beliefs; perception becomes irrational.

The rational state of affairs obtains when the evidence controls the perceptual processes, which means that the processes that form the percept should be affected in the face of incongruent evidence, that is, when the information contained in the pre-experiential perceptual states does not support the percept favored by the penetrating cognitive state(s). The whole discussion presupposes that the information contained in the pre-experiential perceptual states, which functions as the evidence or the information used by the perceptual processes, is itself immune to influences by desires or other cognitive states because, otherwise, the fact that the evidence controls the percept would not be enough to ensure the rationality of the process since the evidence itself could be contaminated by cognitive states. In other words, Siegel’s account is premised on the assumption that the pre-experiential perceptual states are CI. Differently put, there is a perceptual stage which is CI and in which these pre-experiential perceptual states are formed. If there were no such stage, perception would lose in principle its capability to be sensitive to the evidence (in the sense that it is the evidence that controls the percept and not the other way around) because the evidence would be already controlled by desires or other cognitive states. In my account, these pre-experiential perceptual states (although in my account the ‘experiential’ signifies cognitive access consciousness and not phenomenal consciousness) are the states of early vision, and the fact that early vision is not directly affected by cognition entails that early vision does indeed provide the cognitive-free evidence needed for perception to have the capability to be sensitive to the evidence. 
This capacity of early vision ensures both that there are cases of CP that do not downgrade perception (which means that our contact with the world is not severed) and that on most occasions in which the cognitive effects do downgrade perception through their influence on late vision the results are reversible and the situation could be alleviated, which bars the main thesis of epistemological constructivism. Suppose a prior mental state, say the belief B1, makes viewer S1 select a distal stimulus and form a perceptual state P1, and another belief B2 makes viewer S2 select a different distal stimulus and form a perceptual state P2. Now, a harmful epistemic effect would ensue if the difference between the two percepts led the two viewers to doubt the

4  The Cognitive Effects on Early and Late Vision …     227

reliability, or justificatory role of their perceptions, that is, if S1 came to doubt whether her belief P1* (where P1* is the belief whose content matches the perceptual content of P1) that is based on the content of her perceptual state P1 is well founded; the fact that S2 perceives a different percept may mean that there is something wrong with S1’s own perceptual processes. The case at hand, however, is not an epistemically harmful situation that downgrades perception because one could ask S1 to attend to the other distal stimulus, the one to which S2 attended, in which case S1 would form the perceptual state P2 just as S2 did. Cognitive effects that act externally do not lead to any epistemic downgrade of perception because any differences in the percept could be easily resolved by neutralizing the cognitive impact just by refocusing attention on a different location or feature/object. This consideration can be extended to cover cases in which attention refocusing concerns some location, point, or part of a single object, which may reorganize the perceived schema and lead to the perception of a different object, as is the case with ambiguous or bi-stable stimuli. In Raftopoulos (2009, 2011, 2014), I argue that focusing on some parts or crucial points of such a figure induces a different organization of the figure leading to different percepts. Thus, if S1 attends to a certain crucial point of a bi-stable figure and S2 focuses on a different crucial point of the same stimulus, they will perceive two different figures. This does not pose any epistemic threat, though, because as I have argued, and Siegel (2013a, 717) eloquently points out, if S1 and S2 are asked to focus on each other’s crucial point they will see the other’s figure and this rules out any relativistic considerations. 
We have seen that most of the existing definitions of CP exclude from being cases of CP those cases in which attention acts externally to the viewer, by directing eye or body movements, or, generally, cases in which attention performs what Siegel (2011) calls a global selection of the stimulus, that is, cases in which attention determines which part of the environment will be visually processed. Cases of genuine CP are those in which the same stimulus is perceived differently on account of the role of some cognitive states in affecting perceptual processing and not those in which different cognitive states lead to the selection of different stimuli. Why is this so? After all, even in the latter cases a causal


account of why the two viewers see different things would have to include the role of the cognitive states in directing attention to different stimuli or different points/locations of the same stimulus, which means that the cognitive states play some role in shaping the percept. The existing definitions of CP are not clear about why this sort of cognitive influence should not be deemed a case of genuine CP, but I think that the reason is not hard to find and the discussion in the preceding paragraph shows why. The latter are not genuine cases of CP because if the difference in the percept is due to the selection of different stimuli, this difference does not pose any serious problems for the epistemic role of perception because it can be remedied and the effects of the differing cognitive states can be neutralized, as Siegel’s (2013a, 717) discussion of ambiguous figures makes clear. In other words, if the consequences of the cognitive effects on perceptual processing could be mitigated in a relatively straightforward way, such as by just changing the input, these cognitive effects are not serious enough to be considered cases of CP because none of the harmful consequences associated with CP occurs. As we shall see in Chapter 5, when we view a visual scene, most often we search the iconic image for characteristic clues or data confirming the various hypotheses concerning the identity of the distal objects that late vision forms. These hypotheses are in essence predictions about which kind of object is most likely the object in the visual scene. The hypothesis that survives this testing is selected and becomes part of the content of the perceptual state that constitutes the percept and will be used as evidence for the relevant perceptual belief. For example, if the selected hypothesis that passes the testing process is “that O is F”, this leads to the experience of O as F. 
This experience, in turn, supports the belief that there is an O that is F in the visual scene. The search for clues is directed by cognitively driven attention that operates in late vision, guiding the search to the locations in the iconic image that most likely contain the information/evidence for the tested hypothesis. This information is part of the rich iconic information that is retrieved from the visual scene during early vision and is used to test the various hypotheses or predictions concerning the identities of objects. If cognitive states such as beliefs could affect the contents of the early vision states, which include the iconic image, through direct attentional modulation, that is, if


some beliefs could shape the contents of the iconic image that are used for testing the aforementioned hypotheses, we would have a situation in which some beliefs shape the evidence that will be used to form the percept and eventually confirm or disconfirm a perceptual belief. To return to our example, suppose that upon being presented with a visual scene a viewer, on the basis of the visual information retrieved from the scene, forms the hypothesis that the object O in the scene is F, where F is a set of visual features. To test this hypothesis, the viewer’s visual system searches the iconic image for the relevant visual information which if found would confirm the hypothesis that O is F. This search is guided by the knowledge, usually the result of experience and knowledge about the world, of which locations in the iconic image most likely contain the relevant information concerning F; in other words, the search is guided by the belief that “O is F”. Suppose now that the hypothesis that O is F, which guides the search, was held as a belief by the viewer before the perceptual encounter and had affected the processes of early vision by means of which information was retrieved from the visual scene in a way that the information retrieved favors the hypothesis that O is F. This biasing could be done, for example, by selecting from the visual scene only confirming and deselecting disconfirming information and letting in the iconic image only information that is congruent with the hypothesis that O is F. Since any incongruent information will have been excluded nothing could disconfirm the hypothesis “O is F”. Since the hypothesis “O is F” will be selected on the basis of this evidence, the viewer experiences an O that is F and, thereby, uses this experience as evidence for the perceptual belief “O is F”. This is a case of confirmation bias that is epistemically embarrassing and undermines the epistemic role of perception. 
If, however, cognitive states could not affect either through attention or otherwise the early visual processes that retrieve information, that is, if the information contained in the states of early vision is information retrieved directly from a visual scene independent of any cognitive/conceptual influences, then the hypotheses tested could not have shaped the evidence on the basis of which hypotheses will be tested. The information retrieved from the visual scene and stored in the iconic image reflects only the environment and the perceptual makeup


of the viewer and not any of the viewer’s cognitive states. This means that the information stored in the iconic image will contain information that is incongruent with the favored hypothesis if such information exists in the environment. Whether this information will be used during late vision to reject the favored hypothesis or whether the CP of late vision will lead to a testing of the hypothesis that is biased in favor of this hypothesis so that any incongruent information be ignored is immaterial to the epistemic role of early vision; the epistemic duty or responsibility of early vision was to deliver all available information and this it did. This makes it possible in principle for late vision to reject the favored hypothesis since the disconfirming information is there to be used. If early vision were CP, the recalcitrant information would not even be there to be used, in which case late vision would have no other choice but to confirm the favored hypothesis; viewers would be doomed to seeing only what their cognitive states dictate. This, in turn, allows early vision to play the role of a neutral arbiter for perceptual beliefs. This is why, even though cognitively driven visual search is the norm in perception, the fact that the search is guided by some beliefs/ hypotheses does not entail that early vision is epistemically downgraded, since cognition does not affect the information retrieved from a visual scene. To put it differently, the fact that we do not usually see what is not there even though some prior beliefs may dispose us to do so and even though late vision is undoubtedly affected directly by these beliefs is evidence that early vision is not CP. One could object that various pre-cueing effects that indirectly affect early vision may highlight some information at the expense of other and this, arguably, may affect the epistemic role of early vision. 
The answer is that in cases of spatial pre-cueing, as I have argued, no information from the attended visual scene is privileged; both targets and nontargets in the visual scene selected by spatial attention are taken in by early vision. All information present in it is equally processed; what spatial attention does is to select one visual scene from the environment among others. In cases of feature/object pre-cueing, the information that matches the cue is indeed highlighted and receives a prior boost in its attempt to win the attentional competition at the expense of other information and be further processed. This means that the hypothesis concerning the


identity of the feature/object that matches the cue likely will be the first hypothesis to be formed and tested during late vision; pre-cueing facilitates the formation of a hypothesis concerning feature/object identity. Despite the initial boost of some neuronal activity in the early visual circuits owing to pre-cueing, however, early vision still retrieves in parallel all the information in the visual scene. All this information, therefore, is there in the iconic image because the cue does not affect perceptual processing but only changes the values of some parameters before the onset of perceptual processing so that some of the incoming information is highlighted. When a hypothesis is tested, the evidence in the iconic image can either confirm or disconfirm the hypothesis. Thus, by itself, pre-cueing does not, in the last analysis, introduce any confirmation bias and, thus, does not entail any harmful epistemic effects for perception. If the facilitated hypothesis passes the test, which means that the facilitated feature/object is indeed present in the visual scene, then the pre-cueing has increased the efficiency of perception, which means that it increased its reliability. What is important is that information incongruent with the favored hypothesis be included in the evidential basis provided by early vision so that late vision has the possibility to reject the hypothesis, independently of whether it finally does so. Notice that this property of early vision undercuts not only arguments concerning confirmation bias but, also, the more general constructivist attack on the ability of perception to relate us directly, meaning without conceptual intervention, to the world that, as we have seen, underlies the traditional criticisms emanating from the claim that perception is cognitively penetrable and theory-laden. I turn to discuss this issue in the next section.

3 The CP of Late Vision Does Not Justify Constructivism

Even if I am right that cognitive states do not affect early vision so as to shape the content of the states of early vision and thereby create the evidence that would eventually confirm perceptual beliefs, the cognitively guided search in late vision for information that confirms or


disconfirms a hypothesis about the identity of an object in the visual scene certainly has an impact on the epistemic character of the evidence provided by perception. This process may not create the iconic image because it does not affect early vision, but it selects information in the iconic image that confirms the hypothesis and may ignore other information in that image that disconfirms the hypothesis. The reader will have noticed that the focus of discussion has shifted from early vision to late vision and to the way late vision exploits the information retrieved from the visual scene by early vision. This scenario presupposes that there is in the iconic image information that could lead to conflicting beliefs concerning the identity of the object in the visual scene. This is possible, for example, if by selecting a set of data someone organizes the image in a different way from that in which they would have organized it had they selected another set of data in the image; as a result they view a different percept than they would have viewed had they organized the image differently. This property of allowing different organizations of an image in a way that results in different percepts is a characteristic trait of the so-called ambiguous figures. In the duck–rabbit ambiguous figure, for instance, there are some crucial points fixation on which leads to a certain organization of the image and the perception of either a duck or a rabbit. (Strictly speaking, I should have said duck-like or rabbit-like rather than duck or rabbit, because early vision does not convey information about kind-membership and, therefore, it does not output a duck or a rabbit image. It outputs a figure that is shaped like a duck or like a rabbit and this is called a duck-like or a rabbit-like figure.) Thus, one person sees a rabbit and another person, by concentrating attention on a different location in the image, sees a duck. 
Since the fixation on a certain location can certainly be guided by some cognitive state, this cognitive state co-determines the percept. Moreover, different cognitive states may lead to conflicting percepts by guiding attention to different locations in the image. It is thus possible that the belief that there is a duck in the picture leads one to search for the characteristic beak of the bird and, thereby, concentrate on the part of the image that induces the perception of the duck. Had that person concentrated on another crucial location in the image, she would have perceived a rabbit.


The discussion need not be restricted to ambiguous figures. In Siegel’s (2013a, b) example, someone is looking at a pair of pliers in a visual scene and the pliers share some features with guns. Suppose that the viewer holds the belief, or the desire, that there exists a gun somewhere in the visual scene, and this leads the viewer’s late vision to form the hypothesis that there is a gun in the visual scene. When this hypothesis is tested against the information contained in the iconic image, this belief draws attention to those features that guns share with pliers and disregards the other features of pliers that guns do not have. By concentrating on certain parts of the iconic image retrieved from the visual scene, viewers may be led to experience a gun and form the corresponding belief that there is a gun before them. As a result, they see a gun even though there is no gun in the visual scene. By concentrating on these properties and ignoring other discrepant evidence, the viewer is led to see the object that confirms the belief the viewer had held before the perceptual encounter; the belief that there is a gun nearby biased the viewer to look for guns, selecting thus from the iconic image those properties that guns have in common with pliers and ignoring other properties that guns lack. This is a case in which the CP of perception by a prior belief makes the viewer form a percept that justifies a false belief, a sign that perception is epistemically downgraded. Siegel (2011, 2013a, b) and Lyons (2011) have argued that familiarity and expertise are epistemically benign cases of CP, since they increase rather than hinder the justificatory role of perception by making viewers more sensitive to the visual information and, for Siegel, rendering the perceptual inference involved more rational, and, thus, increase the probability that they will detect the salient features of the familiar object. 
These are not cases in which CP downgrades perception and, thus, do not threaten its epistemic role in justifying perceptual beliefs. This is not, however, as epistemically innocent as it seems. Suppose that constructivists are right and different conceptual frameworks in different paradigms entail that viewers immersed in these different paradigms perceive the world differently owing to the CP of perception. Suppose further that scientist A endorses paradigm Π1 and scientist B endorses paradigm Π2. A working with a class of problems in a domain in Physics within Π1 has acquired expertise E1. B working with the


same class of problems within Π2 acquires expertise E2. As a result of the differences in their training, when facing the same physical configuration, A sees pattern P1 while B sees another pattern, P2 (let us suppose that P1 and P2 are mutually exclusive). A’s expertise facilitates the detection of P1, and B’s expertise facilitates the detection of P2 and so, within their respective paradigms, there is no problem with the CP of their perception, since we have accepted that familiarity and expertise do not epistemically downgrade perception. This benevolent result, however, is limited to each paradigm. Across paradigms, the epistemic problem remains: how are we to decide which of the two patterns is really present in the visual scene? This is a well-known problem in the Philosophy of Science raised by Kuhn’s and Hanson’s views that perception is CP through and through. To disentangle this issue it does not suffice to say, as Siegel and Lyons do, that familiarity and expertise are innocuous cases of CP; for constructivism not only are they not, but they are a characteristic case of the epistemic effects of the CP of perception that may be used to argue for all the traditional consequences of constructivism for realism, the rationality of scientific testing, etc. To counteract the threat to realism and the rationality of scientific practice posed by constructivism, one should examine the way familiarity and expertise work, that is, the perceptual processes that subserve them, in order to assess the extent and the way in which these processes are CP and the epistemic impact of this CP. 
Based on this assessment, one should determine whether the cognitive effects that set apart the viewers in the two different paradigms by making them see different patterns when they face the same distal stimulus (recall that in extreme cases of constructivism the existence itself of the distal stimulus is threatened) could be mitigated allowing contact and communication between the two viewers. I will argue that this can be done because the effects of CP could in principle be mitigated and the reason they can be mitigated is that, as we have seen in Chapter 3, familiarity either operates in a way that does not entail CP, or in a way that involves CP but only of late vision; in any case, early vision remains unaffected. It is this trait of early vision that explains both why the cases of benign CP that Siegel and Lyons discuss are benign, and also why in cases in which CP downgrades perception, the downgrading effects


of CP cannot be extended to undercut our contact with the world and vindicate constructivism. The CP of late vision of the sort described above, which expresses itself in the formation of hypotheses-implicit beliefs concerning the identity of the objects in a visual scene through a synergy of the output of early vision and cognitive/semantic information, and in guiding the hypothesis testing and selection of confirming or disconfirming information, is a ubiquitous feature of perception, which, however, does not epistemically downgrade perception, while wishful thinking of the sort involved in the Jill and Jack case or in the pliers/gun case is conducive to epistemic downgrade. Why is this so and why is perceptual learning non-conducive to epistemic downgrade? Note that an adequate answer to this question would also be a reply to constructivism’s claim that the CP of perception undermines a viewer’s access to the world by undercutting the possibility of a ‘given’. If what matters in the last analysis is the sensitivity to the facts and the reliability of perception, one has to examine more closely the perceptual processes involved in these cases in order to determine whether and under which conditions these processes decrease the sensitivity to the facts. But first, one has to determine which are the facts to which perception must be sensitive. The intuitive answer is that the facts consist in the distal stimulus, i.e., the visual scene and the objects and their properties that figure in it. The question is, then, which cases of CP reduce sensitivity to these facts and why? Our perceptual systems, however, do not transduce objects and their properties; our access to them is through the representational contents of our perceptual states. These contents consist in information retrieved from the environment by our perceptual system. The question concerning the sensitivity to the facts should be, therefore, how does the CP of perception affect the retrieval process? 
If it affects perceptual processing in such a way that the information retrieved does not reflect the environmental input but conforms more to the contents of the cognitive states that penetrate perception, then obviously the sensitivity to the facts is severely diminished and the reliability of perception is decreased, which means that CP downgrades perception. As I have said, the information retrieved from a visual scene constitutes the iconic image, which I have defined as the rich iconic


information retrieved from the visual scene in early vision and stored in visual circuits. The hypotheses concerning object identity formed in late vision are tested against this information and the hypothesis that passes the test determines the percept. To decide whether CP diminishes the sensitivity to the world and, thus, the extent to which it epistemically downgrades perception, one has to examine the locus at which cognition affects perception and the way in which it does so. I argued in the previous section that the processes of early vision that retrieve visual information from the environment are not influenced by cognition in a way that affects their epistemic role, and that early vision ‘faithfully’ retrieves the environmental information. The percept, however, is formed in late vision and late vision is CP. How are the CI of early vision and the CP of late vision related, and what does this have to do with the epistemic downgrade of perception? Recall that the hypotheses formed in late vision are tested against the information retrieved in early vision and this information has not been affected by whatever cognitive processes occur in late vision. Therefore, the evidential basis of the hypothesis testing is neutral with respect to the tested hypotheses; differently put, the cognitive information that penetrates late vision does not create the information that will be used in the testing. Certainly, the testing assumes the form of highlighting some of the information retrieved by early vision and selecting this information while deselecting some other, but it does not create or shape this information; it does not affect and therefore cannot change the incoming data so that they conform to the penetrating cognitive state or states. This means that the data are there, for a short time interval of course, or they can be recreated in a subsequent encounter with the same visual scene, and can be accessed by the viewer. 
The CI of early vision, which results in the independence of the information retrieved from the environment from any cognitive states, imposes a severe restriction on the extent and impact of the epistemic effects of the CP of late vision. The penetrating cognitive states may affect the selection of the stored information but they cannot create this information, which thus remains available to, and can be revisited by, the viewer. So even though late vision involves hypothesis testing that is usually guided by certain cognitive states, late vision cannot create the


facts but it can only select or deselect some among them. In most cases, the affecting cognitive states guide attention to the parts of the iconic image that most likely contain the information needed for hypothesis testing. Usually, if the confirming information is not found another hypothesis is formed and so on. This is the case of a good variation of the Jill and Jack scenario in which Jill’s prior belief that Jack is angry makes her search for the relevant clues. If the clues are there, which means that Jack is really angry, her prior belief made her more sensitive to them and she either found them more quickly, or she found them while otherwise she would have missed them. If the clues are absent or are weak, Jill will not see an angry Jack and will not form the belief that Jack is angry, revising thus her former belief. The same argument applies to cases of familiarity and expertise that facilitate object recognition through the attentional effects that operate in late vision. In all these cases the CP of late vision does not epistemically downgrade perception. To repeat a point made earlier, what the CP of late vision affects is the choice of the information from the iconic image and, thus, the cognitive states that penetrate late vision determine the percept by guiding attention that selects the information from the iconic image that will confirm some hypothesis concerning object identity. Owing to the CI of early vision, however, the cognitive states do not affect the processes of early vision that retrieve information from a visual scene. This information is stored in the iconic image and is available to the viewer should she decide to revisit it. Let us go back to Jill’s case and suppose that Jill, guided by her prior belief that Jack is angry, chooses those clues in the iconic image that confirm the perceptual hypothesis that Jack is angry; as a result she sees Jack to be angry and forms the corresponding belief. 
Suppose that for some reason Jill’s attention is drawn to some other visual features of Jack’s expression stored in the iconic image that are incompatible with his being angry. Since this information has been retrieved from the visual scene and is stored, Jill’s prior belief notwithstanding, Jill can access it and select it. If she does that, she will form the percept of a non-angry Jack and she will revise her previous belief that Jack is angry. This way, her previous belief that Jack is angry either is overridden, perhaps he was angry before but he is not angry


now, or is revised, perhaps he was not angry after all. Suppose now that early vision was CP by the belief that Jack is angry. It would follow that the processes of early vision, being affected by the belief, would change the incoming environmental information to conform to the content of the penetrating belief and this would change the stored iconic image, which now would not match the information present in the visual scene. If this were the case, any connection to the visual scene would be lost, which is epistemological constructivism’s main thesis. As a result, no revisiting of the iconic image could change Jill’s belief that Jack is angry because the rebutting information simply would be absent from the iconic image. Lyons (2011) claims that bad cases of CP are distinguished from cases of innocuous CP on account of the fact that the former decrease the reliability of perception, while the latter either increase or leave unaffected the reliability of perception. This seems to be on the right track. In cases of bad CP, like the one described above, CP leads to object misclassification; the pair of pliers is mistaken for a gun and this undermines the reliability of perception. This is different from cases in which familiarity or expertise facilitate object recognition, which all parties involved agree is an innocent case of CP in that it does not threaten the justificatory role of perception; the objects are classified correctly, albeit faster and more reliably. If Lyons is right, to distinguish between bad CP and innocent CP, or equivalently, between cases of good selection and cases of bad selection, one has to examine the extent to which CP limits the sensitivity of perception to the data. I explained above why the CI of early vision ensures that perception is at a last analysis sensitive to the distal data. 
In the cases discussed above, the potentially disconfirming evidence is in the iconic image; it is just bypassed or ignored on account of the beliefs held by the viewer. If early vision were directly affected by cognition, there would be no disconfirming information in the iconic image in the first place. Since early vision is not so affected, the counterevidence is there and could in principle be retrieved should the viewer, for some reason or other, decide to redirect her attention. This has an important impact on whether a viewer could be convinced that her seeming or percept may not reflect what is in the environment. If early vision were CP and the cognitive states could
mold the retrieval processes and affect what information is stored, the possibility of convincing would be lost. This sort of convincing is important because it paves the way to addressing one of the problems that, constructivism thought, emerges from the CP of perception, namely the impossibility of communication between viewers who belong to different paradigms (and, thus, have different conceptual frameworks) concerning what they see, which in the last analysis undermines the rationality associated with theory testing. If viewers belonging to one paradigm could be convinced as to what viewers in the other paradigm see, because they could be made to see the same thing by revisiting the iconic image and selecting different information from it, communication between the two would be restored and matters of meaning could be addressed. Lyons (2011, 302) argues that this sort of convincing comes in grades of difficulty; if, for example, the viewers’ current occurrent beliefs influence their perception and make them see that P, it will be easier to convince them that P may not be the case and that Q is the case than it would be if it were the viewers’ longstanding beliefs that P that penetrated their perception and made them see that P. This is correct; if it is a recently acquired belief that penetrates perception and makes the viewer select some information at the expense of some other, it will be relatively easy to guide the viewer to concentrate on another part either of the distal stimulus (in cases of pre-cueing) or of the iconic image (when the viewer seeks to confirm a hypothesis concerning the identity of an object in a visual scene by revisiting the information stored in the iconic image).
If, for example, viewers concentrate on a part of an image and see a duck because their belief that there is a duck in the image made them search for the relevant cues at that part of the image, it is easy to redirect their attention to another part of the image and make them see a rabbit, if they possess the concept RABBIT, or a rabbit-like figure in case they do not. If, however, it is the viewers’ training and acquired expertise that makes them detect certain patterns and see that P, whereas other viewers with a different training (say, because they endorse another paradigm) do not detect these patterns and see something different, then one would have to retrain the latter viewers on the patterns with which the former viewers have
been trained in order to allow the latter viewers to see what the former viewers see. I argued that the effects of the CP of late vision could be mitigated, where one of the criteria of mitigation is whether a viewer could be convinced that things are not as they initially seemed. If one could train a scientist to detect patterns that she could not previously see, the cognitive effects due to perceptual learning or expertise can be mitigated. If one could make Jill see that despite her first visual seeming Jack does not look angry, the bad influence of the CP would be removed and the results of CP mitigated. Let us see what this means for the constructivist view that the CP of perception undermines its epistemic role undercutting our contact with the world. Constructivism holds that the CP of perception entails, among other things, that two viewers in different conceptual frameworks see the world differently and, thus, matters concerning the meaning of their perceptual states cannot be resolved on the basis of a neutral perceptual basis that would allow one viewer to understand what the other sees. Let us examine whether the indirect conceptual influences on early vision through the determination of the spatial focus, which occurs in spatial pre-cueing, vindicate such constructivist claims. The following discussion is meant to provide only a sketch of the springboard on which an in-depth discussion of the epistemological effects of cognitive penetration of perception should be based and is not meant in any way as an answer to the problem. Suppose that X and Y view the duck–rabbit ambiguous figure. As I said in Chapter 3, there are three ways attention could determine the percept when viewing ambiguous figures. 
In the account that follows I consider the option in which early vision outputs a neutral figure and, during late vision, attention selects information from some part of the iconic image (that is, the neutral figure), organizes the image accordingly, and forms the percept favored by this organization. X, by activating, for some reason or other, the concept ‘rabbit’, selects information from one part of the iconic image, and decomposes and organizes the iconic image retrieved from the scene in such a way that X sees a rabbit. Y, on the other hand, by activating the concept ‘duck’, focuses on and selects information from a different part of the iconic image and organizes the
figure in a different way, and, as a result, sees a duck. If the role of spatial attention at work were to constitute a case of CP of early vision, it should preclude early vision from playing the epistemological role of providing a theory-neutral basis on which to resolve matters pertaining to seeing. Let us, thus, examine whether early vision allows the resolution of differences related to seeing despite the cognitive influences on late vision through spatial attention. Suppose that cognitive factors have given rise to a context in which X is biased toward rabbits and, hence, expects a rabbit, because the concept ‘rabbit’ is activated and used. When X is presented with a rabbit–duck ambiguous shape, she focuses her attention onto the location on the iconic image at which she expects the characteristic ears, which in the standard drawings of the ambiguous figure is the upper part of the image, and sees a rabbit. Y, who expects a duck because of the activation of the concept ‘duck’ owing to a different set of cognitive factors, focuses onto the lower part of the picture and sees a duck. This holds true because, as we saw in Chapters 1 and 3, it is well documented by a host of empirical studies with bi-stable stimuli (Attneave 1971; Britz and Pitts 2011; Driver and Baylis 1996; Hochberg and Peterson 1987; Kornmeier et al. 2009; Peterson and Hochberg 1983; Pitts et al. 2007), which shed light on the mechanisms that underlie the way perceptual set biases object segmentation, that the cognitive states of the observer do not by themselves affect the organization of the stimulus. Rather, some crucial points of fixation influence the organization of the stimulus through the role of spatial attention; that is, there are locations in the image fixation on which favors one or the other percept.
In other words, the way a bi-stable stimulus can be visually interpreted depends on where the observer fixes her attention, because there are crucial points fixation on which determines the perceptual interpretation. This means that the mechanism underlying the effect of perceptual set in ambiguous figures involves the voluntary control of spatial attention; the perceptual set induces observers to allocate their attention to specific regions in the stimulus (Peterson and Gibson 1994). This causes the figure to be experienced in the way favored by perceptual set. Let us assume now that X for some reason locks her attention onto the lower part of the image instead of onto the upper part at which her
cognitive stances had led her to focus in the first place. Under these circumstances she would see a duck, as does Y. Note that X may see the duck-like figure, that is a configuration that a person who possesses the concept ‘duck’ would recognize as a duck, even when X does not possess the concept ‘duck’ and the focus to the salient part of the image was the result of some other cognitive influence. The important thing is that perceivers can shift their attention to the iconic image when they test hypotheses concerning the identity of the object in the visual scene, despite their differing theoretical commitments since spatial attention can be controlled. The point is that although in the case of ambiguous figures cognition mediates the processes determining the percept through the allocation of spatial attention, and, thus the contents of the relevant cognitive states will enter in an explanation of why this specific percept was formed, spatial attention can be controlled since people can refocus their attention on such and such a location. This way, the cognitive influences could be mitigated. The same conclusion can be drawn from other pre-cueing effects. The resulting biasing can be controlled for the same reasons that spatial attention focusing can be controlled and, thus, that cognitive effects of this sort could be mitigated too. Two perceivers who might perceive different percepts given the same stimulus on account of some sort of pre-cueing may ‘interchange’ percepts by receiving one of the cues or the other. Since the perceptual processing is otherwise not affected, controlling for the cues controls the percept. This negates the harmful effects of the role of concepts and, hence, even though attention affects the internal perceptual ongoings, these effects do not count as a case of CP. 
If my analysis of the situation is correct, the CP of late vision does not systematically threaten either the reliability of perception or our contact with the world; the CI of early vision, which retrieves from the environment information unaffected by cognitive effects, ensures that perception is reliable because, the cognitive effects on late vision notwithstanding, the CP of late vision cannot systematically and irrevocably make us miscategorize things, and cannot undercut our access to the world. This means that despite the fact that the CP of late vision may, and occasionally does, undermine the justificatory role of experience
and, thus, epistemically downgrade it, measures can be taken to mitigate the epistemic consequences of the cognitive effects on perception. Viewers could be asked to focus attention on other parts of a visual scene, or viewers could be pre-cued differently, or they could undergo some perceptual training and acquire an expertise that they lacked before; all these would result in these viewers forming a different percept from before and this would allow them to understand either that things were not as they seemed before, or why other viewers saw one thing, whereas they saw something else. A systematic, irreducible CP that undercuts our contact with the world and confines us inescapably within our conceptual frameworks does not exist. To recapitulate, CP sometimes does epistemically downgrade perception. If one examines, however, the cases in which CP does not downgrade perception, such as the cases of the formation of hypotheses in late vision that are ubiquitous in perception and almost indispensable for object identification and the formation of the percept, or the cases in which CP enhances the justificatory role of perception, such as the cases of expertise and familiarity, one would discover that in all these cases, it is the CI of early vision that safeguards the epistemic role of perception by keeping it in contact with the world and reliable. If, as Siegel (2011, 2013a) and Lyons (2011) concur, what is important is (a) that perception should be sensitive to the facts if it is to be reliable and capable of playing its justificatory role, and (b) that as long as CP does not diminish, or even increases, the sensitivity to the facts, CP is, epistemically speaking, a virtuous phenomenon, then the sensitivity to the facts should be singled out because it is this factor that, in the last analysis, ensures the reliability and safeguards the justificatory role of perception.
I have claimed that perception is sensitive to the facts because early vision, the set of processes that retrieve information from the environment, is CI. This means that the causal role of cognitive states cannot create the information that gets in since it does not affect the retrieval process itself; it cannot put in the environment what is not there. It can only, and sometimes as in cases of pre-cueing does, select what information gets in, of course, and, thus, could shape the iconic image, but as I have argued the effects of this selection can be altered by various means, mitigating thus the cognitive effects.

This shows, against Lyons’ (2011, 297, 303–304) claim that the locus of CP does not have any significant epistemic importance and that CP is equally bad when it occurs in early vision as when it occurs in late vision, that the locus of CP is epistemically speaking very important. The CP of late vision can be mitigated exactly because early vision is CI. Were early vision CP, then, for the reasons brought to the fore by constructivism, the justificatory role of perception would be irrevocably undermined.

4 Concluding Discussion

The main conclusion of this chapter is a twofold thesis. First, since not all cognitive effects on perception affect its epistemic role, cognitive influences on perception should count as instances of CP only if they affect the justificatory role of perception in grounding perceptual beliefs; they could enhance this role, as in the cases of familiarity or perceptual expertise, undermine it, as when they create a confirmation bias, or just support it, as in the cases of regular visual encounters where viewers recognize objects even though they do not have any expertise. I claim that when cognitive states affect perceptual processing directly or online, as in late vision, they affect its epistemic role and should be deemed cases of CP. When the cognitive states affect perceptual processing indirectly, as in early vision, they do not affect the epistemic role of perception and should not be considered cases of CP because CP was originally, and is still widely, thought to affect perception in such a way as to affect negatively its epistemic role. Second, even when perception is CP and the CP downgrades its epistemic role, the epistemic effects of CP are not intractable and nonreversible; they can be mitigated and this bars epistemological constructivism’s skepticism concerning perception. The reason why they can be mitigated is related to the way the visual perceptual system operates, and in particular to the fact that the processes of late vision that are CP include a phase in which various hypotheses concerning the identity of an object in the perceived visual scene are tested against information,
which, because it was retrieved in early vision, is not affected by any cognitive influences; what is in the environment will get in independent of any cognitive states. The indirect cognitive effects on early vision do not impact on perceptual justification because the indirect cognitive effects on early vision do not affect perceptual processing itself. Specifically, the fact that the processes of early vision that retrieve information from the environment are not cognitively affected in a direct way entails that early vision retrieves all information contained in the visual scene. This information, which I call the iconic image, is available to late vision so that it can both form hypotheses concerning the identity of objects in the visual scene, in conjunction with various beliefs that are brought to bear to allow object recognition, and test these hypotheses against the information contained in the iconic image. If early vision retrieves all relevant information from the visual scene, and this includes information that may either confirm or disconfirm a hypothesis about the identity of an object in the visual scene, the epistemic role of early vision is neutral. The epistemic role of perception is mainly determined by late vision, because it is in late vision that the percept is formed and the percept is the crucial factor in the epistemic role of perception. Late vision is undoubtedly CP since it is directly affected by cognitive states, which, thus, affect perceptual processing during late vision. If all hypotheses concerning object identity formed in late vision are tested against the iconic image formed in early vision, which retrieves all visual information contained in the visual scene unaffected by any direct cognitive influences, the following question emerges. 
Why is it that usually this testing is not characterized by any confirmation bias, since the perceptual system searches the iconic image for either confirmatory or disconfirming clues for the tested hypothesis, while in other cases the perceptual system searches the image for confirmatory clues and disregards any disconfirming evidence? In the former case, the CP of late vision does not epistemically downgrade perception, while in the latter case it does. The fact that the evidential basis, that is, the iconic image against which the hypotheses of late vision are tested, contains all information present in the visual scene entails that when this basis is revisited
for whatever reason, any evidence that was initially disregarded in a case of a confirmation bias or of wishful thinking may be selected and processed just by refocusing cognitively driven attention (whether it be spatial-centered or feature/object attention) to another, initially neglected, part of the iconic image. This entails that upon such a revisiting the percept may change, which means that the viewer may come to realize that things are not as they seemed to be (she may, e.g., come to see a pair of pliers instead of a gun). It follows that the initial epistemic downgrade of perception owing to the CP of late vision, which allowed a prior belief or desire to affect perceptual processing in late vision, can be alleviated; the harmful epistemic effects of the CP of late vision can be mitigated owing to the fact that the iconic image delivered by early vision is not cognitively affected in a way that changes the processing in early vision. Thus, the epistemic downgrade of perception by CP, when it occurs, is neither systematic nor intractable, and this undercuts constructivism. One could draw a parallel between the role of early vision in forming the iconic image by retrieving information directly from the environment, an iconic image which, by being unaffected by cognitive influences, is ‘theory-neutral’, and the role of the distal stimulus when cognitive effects involve external causal links. Recall that all definitions of CP exclude cognitive effects that operate through an external causal link from being cases of CP because in these cases cognition selects the stimulus that serves as input to perception, and CP is supposed to be about the possibility of having two different percepts while looking at the same stimulus. The reason for this is that if a difference in the percept comes from a difference in the stimulus, refocusing attention to the same stimulus would neutralize the cognitive effects.
Something analogous occurs with the indirect effects on early vision. By not affecting perceptual processing online they do not shape the iconic image, which thus, contains all visual information in the distal stimulus. So, as the distal stimulus is available to attentional external refocusing and this mitigates the repercussions of the cognitive influences, so the iconic image is available to internal attentional refocusing and this mitigates the repercussions of CP.

An important and interesting problem that emerges from our attempt to explicate the notion of CP concerns the relation between the two clauses of the definition of CP proposed in Chapter 2, that is, the relation between the demand that CP occurs when cognition affects perception directly and the demand that CP occurs when cognition affects the epistemic role of perception. I think that there is a sort of bootstrapping relation between the two clauses. It is true that an indirect cognitive effect on early vision does not threaten the epistemic role of early vision because cognition does not intervene in the process of retrieval of information from the environment and, thus, does not diminish the sensitivity of early vision to the environment because early vision does not use any cognitive information while it retrieves information from a visual scene; our discussion of the various pre-cueing effects suggests as much. It follows that the epistemic role of early vision is unaffected by cognition because early vision is not directly affected by cognition since the pre-cueing effects influence early vision indirectly. But one might wonder why the indirect cognitive effects do not entail that early vision is CP, and the answer to this is that by being indirect they do not affect the epistemic role of early vision and the discussion concerning CP is philosophically interesting, as many philosophers have argued, only if the cognitive effects on perception undermine its epistemic role in grounding or justifying perceptual beliefs. These considerations, finally, bear directly on two claims concerning CP and its role in Epistemology and Philosophy of Science that were recently made by Lyons and Siegel. They show, against Lyons’ (2011, 297, 303–304) claim that the locus of CP does not have any significant epistemic importance and that CP is equally bad when it occurs in early vision as when it occurs in late vision, that the locus of CP is epistemically speaking very important.
If the arguments put forth in this paper are sound, the locus of CP is crucial in determining the epistemic impact of CP; if early vision were CP, there would be no way to mitigate the harmful epistemic effects of CP, that is, the insensitivity to the data that CP may inflict on perception. Siegel (2016, 7) claims that even though selection effects and CP differ psychologically, her defense of the Simple Argument entails that
epistemically they are similar. CP per se has no special epistemological significance and, therefore, an important category in the psychology of perception is less important in the epistemology of perception. Selection effects, according to Siegel, refer to those cases in which some mental state of a subject makes the subject’s perception select objects and properties for the subject to experience without necessarily influencing the phenomenal content of the subject’s experience. When CP occurs, on the other hand, the mental state influences the phenomenal content of the experience. The Simple Argument defended by Siegel concludes that perceptual experiences can be ill-founded (that is, downgraded) by mental states such as wishful thinking because (i) beliefs can be ill-founded by wishful thinking; (ii) wishful thinking is possible; and (iii) if wishful thinking can ill-found beliefs, wishful seeing can ill-found perceptual experiences. Siegel’s motive for claiming that CP has no special significance for the epistemology of perception is that both CP and selection effects can ill-found perceptual experiences. I think that Siegel is wrong even if one puts aside the objection that the distinction between selection effects and CP is not clear-cut because the selection of information in the iconic image during late vision is also found in typical cases of CP. The source of Siegel’s problem can be found in the discussion concerning the putative CP of early vision and its epistemic repercussions. In a nutshell, the insensitivity to the data that results from selecting which objects and properties are perceived is different from the insensitivity to the data that would have resulted were early vision CP. This is so because, first, in the former case shifting attention can be controlled, while in the latter case it cannot (Raftopoulos 2006, 2009, 2015).
As Siegel remarks:

[W]hen prior mental states influence what you look at or attend to, without influencing how things look to you when you see them, the result might seem to be a mere selection effect. If you want the Necker cube or the duck-rabbit to shift, you can make it shift by adjusting your focus to the relevant part of the figure, thereby affecting the contents of your experience. (Siegel 2013a, 717)

This focus adjustment guided by the viewer’s will cannot happen when attention is used online during late vision to test tentative hypotheses concerning the identity of the objects in the visual scene. This means, in turn, that while any harmful epistemic effects due to selection effects could be mitigated by such focus readjustments, the same cannot happen when CP is the source of the harmful epistemic effects. There is a second reason, related to the first one, why I think Siegel is wrong in assimilating the epistemic effects of object/feature selection to those of CP. As I mentioned above in discussing Lyons’ claim that the site of CP does not matter, if early vision were CP, there would be no way to mitigate the harmful epistemic effects of CP, that is, the insensitivity to the data that CP may inflict on perception. If this is correct, were early vision CP, this would severely undercut the possibility of meeting the skeptic’s objection of the sort raised by constructivists in a way that the harmful epistemic effects of object/feature selection do not. This entails that CP is more fundamental for the epistemic role of perception than any selection effects.

References

Attneave, F. (1971). Multistability in perception. Scientific American, 225, 63–71.
Britz, J., & Pitts, M. (2011). Perceptual reversals during binocular rivalry: ERP components and their concomitant source differences. Psychophysiology, 48, 1489–1498.
Driver, J., & Baylis, G. S. (1996). Eye-assignment and figure-ground segregation in short-term visual matching. Cognitive Psychology, 31, 248–306.
Hochberg, J., & Peterson, M. A. (1987). Piecemeal organization and cognitive components in object perception. Journal of Experimental Psychology: General, 116, 370–380.
Kornmeier, J., & Bach, M. (2009). Object perception: When our brain is impressed but we do not notice it. Journal of Vision, 9(1), 1–10.
Lyons, J. (2011). Circularity, reliability, and the cognitive penetrability of perception. Philosophical Issues, 21, The Epistemology of Perception, 289–311.
Peterson, M. A., & Gibson, B. S. (1994). Must figure-ground organization precede object recognition? An assumption in peril. Psychological Science, 5, 253–259.
Peterson, M. A., & Hochberg, J. (1983). Opposed set-measurements procedure: A quantitative analysis of the role of local cues and intention in form perception. Journal of Experimental Psychology: Human Perception and Performance, 9, 183–193.
Pitts, M., Nerger, J., & Davis, T. J. R. (2007). Electrophysiological correlates of perceptual reversals for three different types of multistable images. Journal of Vision, 7(1), 1–14.
Raftopoulos, A. (2006). Defending realism on the proper ground. Philosophical Psychology, 19(1), 1–31.
Raftopoulos, A. (2009). Cognition and Perception: How Do Psychology and Neural Science Inform Philosophy? Cambridge: MIT Press.
Raftopoulos, A. (2011). Ambiguous figures and representationalism. Synthese, 181, 489–514.
Raftopoulos, A. (2014). The cognitive impenetrability of the content of early vision is a necessary and sufficient condition for purely nonconceptual content. Philosophical Psychology, 27(5), 601–620.
Raftopoulos, A. (2015). The cognitive impenetrability of perception and theory-ladenness. Journal of General Philosophy of Science, 46(1), 87–103.
Siegel, S. (2011). Cognitive penetrability and perceptual justification. Nous, 46, 201–222.
Siegel, S. (2013a). The epistemic impact of the etiology of experience. Philosophical Studies, 162, 697–722.
Siegel, S. (2013b). Can selection effects influence the rational role of experience? In T. Gelder (Ed.), Oxford Studies in Epistemology (Vol. 4, pp. 240–270). Oxford: Oxford University Press.
Siegel, S. (2016). How is wishful seeing like wishful thinking? Philosophy and Phenomenological Research. https://doi.org/10.1111/phpr.12273.

5 Early and Late Vision: Their Processes and Epistemic Status

© The Author(s) 2019 A. Raftopoulos, Cognitive Penetrability and the Epistemic Role of Perception, Palgrave Innovations in Philosophy, https://doi.org/10.1007/978-3-030-10445-0_5

1 Introduction

Helmholtz (1878 [1925]) thought that perception is a sort of inference in that the brain uses probabilistic knowledge-driven inferences to induce the causes of the sensory input from this input. Perception extracts from the effects of the light emanating from the objects in a visual scene as it impinges on transducers (the retina) the various aspects of the world that cause the input. The brain computationally integrates the retinal properties of the image of an object projected onto the retina with other relevant sources of information to determine the properties of the object. In a similar vein, Rock (1983) claims that perceptual systems combine inferentially information to form the percept; using visual angle and distance information, for example, the perceptual system infers and, thus, perceives size. This inference may be automatic and outside the authority and control of viewers, but is an inference nevertheless. Similarly, Spelke (1988) suggests “perceiving objects may be more akin to thinking about the physical world than to sensing the immediate environment.” This is so because the perceptual system, to solve the
underdetermination problem of both the distal object from the retinal image and of the percept from the retinal image, employs a set of object principles that reflect the physics of our environment. Since the principles could be construed as some form of knowledge about the world, perception engages in inferential processes from some pieces of worldly knowledge and visual information to the percept. Recently Clark (2013) argued that

To perceive the world just is to use what you know to explain away the sensory signal across multiple spatial and temporal scales. The process of perception is thus inseparable from rational (broadly Bayesian) processes of belief fixation … As thought, sensing, and movement here unfold, we discover no stable or well-specified interface or interfaces between cognition and perception. Believing and perceiving, although conceptually distinct, emerge as deeply mechanically intertwined.

The aim of this amalgam of faculties that together constitute perception is to enable perceivers to respond to environmental signals, modify their responses when needed, and eventually adapt their responses as they interact with the environment, so as to tune themselves to the environment in a way that increases the likelihood of success in their dealings with the world. Success relies on inferring correctly (or nearly so) the nature of the source of the incoming signal from the signal itself. In all these views, the visual system constructs the percept in a way similar to the way in which thinking constructs new thoughts on the basis of thoughts that are already entertained. Hence, like thought, vision is a cognitive, thought-involving process. If perception is some sort of thinking, its processes must necessarily include transformations of states that are expressed in symbolic or propositional form, and these transformations must be inferences from some states that function as premises to a state that functions as the conclusion; visual processes are inferences or arguments as are the processes of rational belief formation. These two conditions follow directly from the claim that perception is some form of thinking, since the characteristic trait of thinking is drawing deductive, abductive, or inductive inferences operating on symbolic forms by means of inference rules that are represented in the system,
although thinking is not reduced to drawing inferences this way. It follows that the principles guiding the transformations of perceptual states, that is, the principles acting as the inference rules in perceptual inferences, should be expressed in the system and must be represented in a symbolic form. Whenever the system needs some principle to draw an inference it activates and uses this principle, a process which most of the time is outside the viewer’s awareness and control. In addition, the premises and the conclusion of a visual argument are represented in a propositional-like, symbolic form. If these conditions are met, perception involves discursive inferences, and draws propositions/conclusions from other propositions acting as premises. Clark’s view quoted above echoes this thesis since Clark conceives the processes of visual perception as a rational process of belief fixation. It follows that the inferences used in perception are not very different from the inferences used in thought. A word of caution is needed here. The previous analysis assumes the standard view of the brain as a physical machine that processes symbols in a purely formal/syntactic way on the basis of the physical properties of the symbols; the brain performs digital computations. These symbols have meaning and so do the transformations of these symbols, but the processes in the brain are independent of any meaning. Otherwise put, the brain is a syntactic machine that processes symbols that have meaning. The standard view can be modified by adding the thesis that digital computations are not merely formal syntactic manipulations but also involve semantics, that is, the contents of the states that participate in computations are causally relevant in the production of the computations’ outputs (Rescorla 2014). Although this is the standard, algorithmic view of cognition, it is by no means unequivocally endorsed.
There is another, competing, view of cognition, according to which the brain is not a syntactic machine that processes symbols through algorithms. Instead, the brain represents information in a nonsymbolic, analogue-like form, as activation patterns across a number of units. Furthermore, the processes in the brain do not assume the form of algorithmic but of algebraic transformations; this is the connectionist view of cognition, of which Clark is a firm proponent. In this view of cognition, the brain does not use discursive

254     A. Raftopoulos

inferences, although some of its behavior simulates the usage of discursive inferences. Therefore, Clark’s thesis that perception is inseparable from the rational processes of belief-fixation does not commit him to the view that perception employs discursive inferences because thinking itself does not implicate such inferences. In what follows, I assume the standard view of the brain since it is the view espoused by most if not all the empirical studies examined in this book. Given the propositional/symbolic form of the format in which the states of the visual system must be represented if vision is akin to thinking, the contents of these states, the information carried by the states, consists of concepts that roughly correspond to the symbols implicated. It is, thus, conceptual content, which means that if vision is some sort of thinking its contents must be conceptual contents. This entails that either the visual circuits store conceptual information that they use to process the incoming information, or they receive from the inception of their function such information from the cognitive areas of the brain. Spelke thinks that the principles that guide visual processing to form the percept are examples of conceptual content. Against these views, Sellars (1954, 29–30) claimed that “Having sensations is having causes of judgments not reasons for judgments … having sensations is not knowing premises from which one draws inferences.” I agree with Sellars and examine the processes that occur in early and late vision and address the problem of the inferences involved in these two stages of visual processing. 
I argue first that the processes of early vision are not discursive inferences, for two reasons: discursive inferences involve concepts, whereas early visual contents have Non-Conceptual Content (NCC); and discursive inferences presuppose that the inference rules are represented in the system and used by it whenever necessary, whereas early vision does not represent such rules anywhere. I also examine whether late vision should be construed as a properly speaking perceptual stage, or as a thought-like discursive stage. I argue that late vision, its (partly) conceptual nature notwithstanding, neither is constituted by, nor does it implicate, pure thoughts, that is, propositional structures that are formed in the cognitive areas of the brain through, and participate in, discursive reasoning and inferences. It follows that the processes of late vision are not discursive inferences.


They are abductive, probably Bayesian, inferences that exemplify pattern matching. The early visual input to late vision, which is constituted by the NCC of perception, elicits or prompts the application of concepts in a way typical of pattern matching processes. In this sense, the concepts are applied directly and without any inferential mediation to the phenomenal content of perception. In so far as the application of concepts leads to the formation of perceptual beliefs, my claim is that these beliefs are directly caused by the NCC of perception and the relevant semantic content. As Sellars has repeatedly argued, however, these beliefs are not caused by the contents of perception in the absence of any background knowledge, or, as Sellars calls them, ‘presuppositions’. In late vision, tentative hypotheses concerning the identity of an object in the visual scene are formed and tested, one hypothesis is selected, and the corresponding perceptual belief, which usually has the form ‘That O is F’, is formed. The construction of the tentative hypotheses and their testing, and, in fact, perceptual processes at all levels, are guided by many assumptions about the environmental objects, their properties, and their behavior; these form the body of presuppositions that guides the formation of perceptual thoughts. These are none other than the operational constraints discussed in the previous chapter. As the reader recalls, however, most likely these are not representational states and, therefore, cannot act as premises in inferences. The output of late vision, usually an explicit belief about the identity and category membership of an object (a recognitional belief) or its features, eventually enters into discursive reasoning.
Using Jackendoff’s (1989) distinction between visual awareness, which characterizes perception, and visual understanding, which characterizes pure thought, I claim that the contents of late vision belong to visual awareness and not to visual understanding and that although late vision implicates beliefs, either implicit or explicit, these beliefs are hybrid visual/conceptual constructs and not pure thoughts because they are accompanied constitutively by some sort of phenomenal content/character. Distinguishing between these hybrid representations and pure thoughts and delineating the nature of the representations of late vision lays the ground for examining, among other things, the process of conceptualization that occurs in visual processing and the way concepts modulate


perceptual content affecting its representational content. I hasten to note that I do not discuss the epistemological relations between the representations of late vision and the perceptual judgments they “support” or “evidence” or “entitle.” However, the specification of the nature of late vision lays the ground for attacking that problem as well. Before proceeding let me remind the reader that in previous work (Raftopoulos 2009, 2014) I argued that a state with NCC does not have a propositional content and that two states cannot have the same content and one has NCC and the other conceptual content. In this chapter I assume both theses. I also assume that part of the NCC is content at the personal level and that one has phenomenal awareness of that content.

2 Early Vision

Since I discussed early vision in previous chapters, I only restate some basic facts. Early vision includes a feedforward sweep (FFS) in which signals are transmitted bottom-up. In visual areas (from Lateral Geniculate Nucleus [LGN] to IT and Frontal Eye Fields [FEF]) FFS lasts for about 100 ms. It also includes a stage at which lateral recurrent processes that are restricted within the visual areas and do not involve signals from cognitive centers occur. Local recurrent processing (LRP) starts at about 80–100 ms and culminates at about 120–150 ms. The unconscious FFS extracts high-level information that could lead to categorization, and results in some initial feature detection. LRP produces further binding and segregation. By not involving signals from the cognitive areas of the brain, FFS and LRP are cognitively impenetrable/conceptually encapsulated, since the transmission of signals within the visual system is not affected by top-down signals produced in cognitive areas. Early vision processing is not affected directly by top-down signals from cognitive states through attention; that is, attention does not affect the early visual processes although it may affect pre-perceptual and post-perceptual stages of vision. As we saw in Chapter 2, this leads to the thesis that early vision has NCC,


provided that concepts do not figure inherently in the perceptual system, a possibility that I also rejected in Chapter 3 (see also Raftopoulos 2009, 2015a). The processes of early vision retrieve from the environment the information that allows perception to perceive a visual scene with as much accuracy as possible. In order to do so, early vision gradually constructs representations of increasing complexity. All these constructions of representations are in essence inferences from one set of information to another. If, as I have argued, early vision is Cognitively Impenetrable (CI) and conceptually encapsulated, one might wonder about the sort of inferences performed in early vision. Since early vision does not involve any concepts, the inferences that take place in early vision cannot be discursive inferences. Current research (see Clark [2013] for a discussion) sheds light on the nature of inferences implicated in visual perception in general and in early vision in particular. Specifically, the top-down and lateral effects within early vision aim to test hypotheses concerning the putative distal causes of the sensory data encoded in the lower neuronal assemblies in the visual processing hierarchy. This testing takes the form of matching the predictions that a hypothesis makes about the sensory information the lower levels should encode, assuming the hypothesis is correct, against the actual sensory information currently encoded at those levels; the hypothesis that best matches the sensory data is selected. The whole process of hypothesis selection can be construed as an abductive inference or inference to the best explanation, which could very well be carried out by Bayesian nets. One should note that this account of early vision shows that the standard constructivist theories of visual processing can be reconciled with, and benefit from, the recent conceptions of the brain as a generative, predictive machine.
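
The selection of the hypothesis that best matches the sensory data can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the two candidate hypotheses, their prior probabilities, the predicted lower-level values, the Gaussian noise model, and the function name `select_hypothesis` are hypothetical and are not drawn from the studies discussed here. The sketch shows only the general shape of an inference to the best explanation carried out in a Bayesian manner: each hypothesis predicts what the lower level should encode, the prediction is compared with what is actually encoded, and the hypothesis with the highest posterior probability is selected.

```python
import math

def select_hypothesis(priors, predictions, observed, noise_sd=1.0):
    """Return (best_hypothesis, posteriors) over candidate distal causes.

    priors:      dict mapping hypothesis -> prior probability (illustrative)
    predictions: dict mapping hypothesis -> predicted lower-level values
    observed:    the values actually encoded at the lower level
    """
    unnormalized = {}
    for h, prior in priors.items():
        # Squared mismatch between what h predicts and what is encoded.
        err = sum((p - o) ** 2 for p, o in zip(predictions[h], observed))
        # Gaussian likelihood up to a constant: small mismatch, high likelihood.
        likelihood = math.exp(-err / (2 * noise_sd ** 2))
        unnormalized[h] = prior * likelihood
    total = sum(unnormalized.values())
    posteriors = {h: v / total for h, v in unnormalized.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors

# Hypothetical example: two candidate causes of a shading pattern.
priors = {"convex_surface": 0.6, "concave_surface": 0.4}
predictions = {
    "convex_surface": [0.9, 0.5, 0.1],
    "concave_surface": [0.1, 0.5, 0.9],
}
observed = [0.8, 0.5, 0.2]

best, posteriors = select_hypothesis(priors, predictions, observed)
print(best, posteriors)
```

Note that nothing in this computation requires the hypotheses to be conceptually structured or the selection rule to be explicitly represented as a premise; the labels are there only for the reader.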
Recent empirical findings and modeling shed light on the way the brain may realize these processes. This proposal entails some deviations from the traditional constructivist image of the visual brain, which concern (a) the sort of information transmitted bottom-up; in the new framework only prediction errors are transmitted to the next level,


(b) the nature of the representations constructed; they are distributions of probabilities rather than having a unique value (note that this new approach emphasizes the indispensable role of representations in visual processing), and (c) the interaction between perception and cognition. This last trait is very important and has important repercussions for our discussion on the relation between perception and cognition. According to this view of visual perception, brains are predictive machines. They are bundles of cells that support perception and action by constantly attempting to match incoming sensory inputs with top-down expectations or predictions. This is achieved using a hierarchical generative model that aims to minimize prediction error within a bidirectional cascade of cortical processing. (Clark 2013)

According to the hierarchical generative model of visual processing, the brain uses the top-down flow of information (enabled by top-down neural connections) in an attempt to generate a visual representation of the visual scene that causes the light pattern impinging on the transducers and the low-level visual responses to this light pattern. The brain attempts to recover gradually the various aspects of a visual scene that cause, and thus are responsible for, the retinal image seen as a data structure (i.e., the sensory data). The brain achieves this by capturing the statistical structure of the sensory data, that is, by discovering the deep regularities underlying the retinal structure, on the very plausible assumption that the deep structure underneath the sensory data reflects the causal structure of the visual scene. To do this, hierarchical generative models construct at each level hypotheses about the probable cause of the information represented in the immediately preceding level, and test these hypotheses by matching their predictions with the actual sensory data at the preceding processing level. To form hypotheses concerning the probable cause of the sensory data at a certain level, at a specific spatial and temporal scale, the neuronal assembly at level l uses information not only about the sensory data at the previous level that is transmitted bottom-up,


but also other information that is transmitted to l laterally, that is, from neuronal assemblies at the same level (neurons in V1 processing wave-lengths inform other neurons in V1 processing shape information), or top-down from levels higher in the hierarchy (neurons in V4 are informed about color from neurons in IT as a result of pre-cueing). In addition, this higher-level information is about general aspects of the world (such as “solid objects do not penetrate each other”, or “solid objects do not occupy exactly the same space at the same time”, etc.), and may also reflect knowledge about conglomerations of properties of specific objects learned through experience (see Chapter 3). This lateral and top-down flow of information provides the context in which each neuronal assembly constructs the most probable hypothesis that would explain the sensory data at the lower level. Context-sensitivity is a characteristic trait of the processing of hierarchical predictive coding; the contextualized information significantly affects, and on occasions (as in hallucinations) overrides, the information carried by the input. There seems to be a crucial discrepancy between the account of early vision presented here and Clark’s account of generative hierarchical predictive models that has implications for the nature of the contents of early vision. It concerns the role of context, or previously acquired knowledge, in the formation of the working hypotheses and its direct consequence that because of the role of this context visual perception and thinking are inseparable, which means, among other things, that early vision involves concepts, which are the hallmark of thinking. 
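
The way contextual information can shape, and occasionally override, the information carried by the input can be made concrete with a toy sketch. It is not Clark's model nor the account defended here; it assumes, purely for illustration, a single hidden cause, a linear generative mapping (`weights`), Gaussian noise, and a Gaussian contextual expectation (`prior_mean`, `prior_var`). The function name `infer_cause` and all numbers are hypothetical. The estimate is adjusted step by step so as to reduce the prediction error at the lower level while being pulled toward what the context leads the assembly to expect.

```python
def infer_cause(data, weights, prior_mean, prior_var,
                noise_var=1.0, step=0.005, n_iter=2000):
    """Estimate one hidden cause by iteratively reducing prediction error."""
    mu = prior_mean  # start from what the context predicts
    for _ in range(n_iter):
        # Prediction errors: actual lower-level data minus predicted data.
        errors = [d - w * mu for d, w in zip(data, weights)]
        bottom_up = sum(w * e for w, e in zip(weights, errors)) / noise_var
        # Pull exerted by the context: deviation from the expected value,
        # weighted by how precise (low-variance) that expectation is.
        context_pull = (mu - prior_mean) / prior_var
        mu += step * (bottom_up - context_pull)
    return mu

# Hypothetical lower-level data generated by a cause of magnitude 2.0.
weights = [1.0, 0.5, 2.0]
data = [2.0, 1.0, 4.0]

# Imprecise contextual expectation: the estimate follows the input.
weak = infer_cause(data, weights, prior_mean=0.0, prior_var=100.0)
# Very precise contextual expectation: the context largely overrides the input.
strong = infer_cause(data, weights, prior_mean=0.0, prior_var=0.01)
print(weak, strong)
```

With an imprecise expectation the estimate settles close to the value that generated the data, whereas with a very precise expectation it stays close to the expected value, which is one simple way to picture context overriding the input.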
If early vision is restricted to processes occurring within the visual cortex and excludes any cognitive influences, then, first, previous knowledge seems to play no role in the formation of the working hypotheses, and, second, early vision does not involve any thinking since the latter requires the participation of the cognitive centers of the brain. Moreover, the representations in early vision are analogue-like, iconic and not symbolic and this entails that early vision cannot be some sort of discursive thinking since the latter operates on symbolic forms. Even if one objects that the sort of thinking that forms an inextricable link with visual perception is not discursive, a view that Clark would probably endorse, any sort of thinking involves concepts or


concept-like analogues that afford the viewer certain capabilities (such as recognitional capacities). Early vision, however, affords no such capacities and this indicates that it does not involve concepts or any other concept-like entities, making early vision distinctly different from thinking. With respect to the first point, there is actually no real discrepancy. Recall that lateral and local recurrent processes play a fundamental role in the formation of the representations constructed in early vision. Moreover, as we have seen (see also Raftopoulos 2009, 2015a), all visual processes, including those of early vision, are constrained by certain principles that reflect general regularities about the world and its geometry. Now, one could say that these constraints constitute a body of knowledge that informs early vision processing and affects early vision from within and not in a top-down manner, since as we saw there are no cognitive top-down effects in early vision. This is misleading because, as we have seen, these constraints do not constitute some form of knowledge that by affecting early vision renders it theory-laden, as Clark claims. Finally, as we saw in Chapter 3 (see also Raftopoulos 2015a), early vision is also affected by associations of object properties that reflect statistical regularities in the environment and are stored in the early visual circuits through perceptual learning. These associations do not constitute a body of knowledge, which as such contains concepts, that affects early vision rendering it theory-laden and conceptually structured. The lateral and local recurrent processes, the constraints, and the associations built into the early visual circuits constitute a rich context that contributes to the formation of the hypotheses that early vision constructs to explain the sensory data at the lower processing levels. This context, however, does not involve any body of knowledge that renders perception theory-laden.
As far as the second point is concerned, there is indeed a discrepancy between my account of early vision and Clark’s views. Early vision, by being CI and conceptually encapsulated, does not involve thinking or any other sort of inference that operates on conceptually structured entities and, thus, is radically different from thinking. In fact, not even late vision that involves concepts and is affected by


the viewers’ knowledge about the world is like thinking. Another discrepancy between Clark’s views and mine concerns the nature of the bottom-up signals used in perception. According to Clark, these carry only prediction-error information, whereas I hold that bottom-up signals transmit to the higher levels information about the results of their processing as well. As we shall see next, in late vision the synergy of bottom-up visual and top-down cognitive processing entails that late vision inextricably involves cognition, and this may be taken to mean that as far as late vision is concerned Clark’s views are vindicated. We shall see, however, that this is not the case in late vision either. As I said before, testing hypotheses and altering them as a result of any prediction errors until the prediction error is minimized and, thus, until the most probable cause of the sensory data has been discovered, is an inference. Being a probabilistic inference that aims to discover the most probable hypothesis that explains away a set of data, it is very plausible that the computational framework of hierarchical predictive processing realizes a Bayesian inferential strategy. Late vision, like early vision, most likely employs Bayesian inferential strategies, in which, however, unlike in early vision, concepts participate. I will argue next that these Bayesian inferences are radically different from discursive thinking.

3 Late Vision

The conceptually¹ modulated stage of visual processing is ‘late vision’. Starting at 150–200 ms, signals from higher executive centers including mnemonic circuits intervene and modulate perceptual processing in the visual cortex and this signals the beginning of global recurrent processing (GRP). In 50 ms low spatial frequency (LSF) information reaches the IT and in 100 ms high spatial frequency (HSF)

¹ Concepts are constant, context independent, and freely repeatable elements that figure constitutively in propositional contents; they correspond to lexical items.


information reaches the same area (Kihara and Takeda 2010). (LSF signals precede HSF signals. LSF information is transmitted through fast magnocellular pathways, while HSF information is transmitted through slower parvocellular pathways.) Within 130 ms poststimulus, parietal areas in the dorsal system but also areas in the ventral pathway (IT cortex) semantically process the LSF information and determine the gist of the scene based on stored knowledge that generates predictions about the most likely interpretation of the input, even in the absence of focal attention. This information reenters the extrastriate visual areas and modulates (at about 150 ms) perceptual processing, facilitating the analysis of HSF by specifying certain cues in the image that might facilitate target identification (Barr 2009; Kihara and Takeda 2010; Peyrin et al. 2010). Determining the gist may speed up the FFS of HSF by allowing faster processing of the pertinent cues, using top-down connections to preset neurons coding these cues at various levels of the visual pathway (Delorme et al. 2004). Thus, at about 150 ms, specific hypotheses regarding the identity of the object(s) in the scene start to be formed using HSF information in the visual brain and information from visual working memory (WM) (Barr 2009; Kosslyn 1994). These hypotheses are tested against the detailed iconic information stored in early visual circuits including V1. Event-Related Potential (ERP) waveforms that distinguish scenes and objects in object recognition tasks are registered at about 150 ms in extrastriate areas and are thought to be early indices of P3² (Fabre-Thorpe et al. 2001; Johnson and Olshausen 2005). This testing requires that top-down signals reenter the early visual areas of the brain, and mainly V1. Evidence shows that V1 is reentered by signals from higher cognitive centers, mediated by the effects of object/feature-centered attention, at 235 ms poststimulus (Chelazzi et al. 1993; Roelfsema et al.
1998). This leads to the recognition of the object(s) in the visual scene. This occurs, as signaled by the P3 ERP waveform, at

² The P3 waveform is elicited at about 250–600 ms, is generated in many areas in the brain, and is associated with cognitive processing and the subjects’ reports. P3 may signify the consolidation of the representation of the object(s) in working memory.


about 300 ms in the IT cortex whose neurons contribute to the integration of LSF and HSF information. Kosslyn (1994) offers a detailed analysis of the process of hypothesis testing. Even though one may not subscribe to some of the assumptions presupposed by Kosslyn’s account (see Raftopoulos 2010 for criticism), any disagreements do not undermine the framework. Upon viewing an object, a retinotopic image is formed in the visual buffer, which is a set of visual areas in the occipital lobe that is organized retinotopically. An attentional window selects the input from a contiguous set of points for detailed processing. This is allowed by the spatial organization of the visual buffer. The information in the attention window is sent to the dorsal and ventral system where different features of the image are being processed. The ventral system processes the features of the object, whereas the dorsal system processes information about the location, orientation, and size of the object. Eventually, the shape, the color, and the texture of the object are registered in anterior portions of the ventral pathway. This information is transmitted to the pattern activation subsystems in IT where the image is matched against representations stored there, and the compressed image representation of the object gets activated. This representation (in effect, an hypothesis about the identity of an object) provides imagery feedback to the visual buffer where it is matched against the input image to test the hypothesis against the fine pictorial details registered in the retinotopical areas of the visual buffer. If the match is satisfactory, the category pattern activation subsystem sends the relevant pattern code to associative memory or WM, where the object is tentatively identified with the help of information about size, location, and orientation arriving at the WM through the dorsal system. 
On certain occasions, the match in the pattern activation subsystems is enough to select the appropriate representation in WM. On other occasions, the input to the ventral system does not match well with a visual memory in the pattern activation subsystems. Should this happen, a hypothesis is formed in WM. This hypothesis is tested with the help of other subsystems (including cognitive ones) that access representations of such objects and highlight their most distinctive feature.


The information gathered shifts attention to a location in the image where an informative characteristic can be found. The attention window zooms in on the object’s distinctive feature, and the pattern code for it is sent to the pattern activation subsystem and to the visual buffer where the second cycle of matching commences. ERP experiments registering the time onset of various waveforms related to specific processes in the brain largely confirm this analysis. The N1 ERP component that signifies cognitively driven spatial attentional effects on the extrastriate cortex is registered at about 170–200 ms; by 170 ms spatial attention directly modulates visual processing. However, cognitive top-down modulation of the extrastriate cortex, mainly V4, from the IT and parietal cortex is found as early as 150 ms, which is the first sign of the process of object identification. Eventually there is considerable competition since only a few items can enter into interactions with the hierarchically higher processing levels. Further selection becomes necessary when several stimuli reach the brain but only one response is possible. Attentional selection intervenes to resolve this competition. The selection results from the combination of bottom-up information processing with WM and long-term memory (LTM) that recover the meaning of input and relate it to the subject’s current goals. In the biased competition account of attention (Desimone and Duncan 1995), attention is the competition between neuronal populations that encode environmental stimuli. All the stimuli in a visual scene are initially processed in parallel and activate neuronal assemblies that represent them. These assemblies eventually engage in competitive interactions for several reasons (when, for example, some behaviorally relevant feature or object must be selected among all present stimuli).
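
The idea of biased competition can be pictured with a toy simulation. It is only a sketch under assumptions that go beyond the qualitative account summarized above: the rate equation with mutual inhibition, the parameter values, and the function name `compete` are hypothetical choices made for illustration. Each population receives a bottom-up drive from its stimulus, is inhibited by the activity of the other populations, and may receive an additional top-down bias reflecting the subject's current goals.

```python
def compete(bottom_up, top_down_bias, inhibition=0.8, dt=0.1, n_steps=500):
    """Simulate populations that inhibit one another; return final activations."""
    acts = [0.0] * len(bottom_up)
    for _ in range(n_steps):
        total = sum(acts)
        new_acts = []
        for a, drive, bias in zip(acts, bottom_up, top_down_bias):
            others = total - a  # summed activity of the competing populations
            delta = -a + drive + bias - inhibition * others
            # Activations are firing rates, so they cannot fall below zero.
            new_acts.append(max(0.0, a + dt * delta))
        acts = new_acts
    return acts

# Three stimuli; the middle one is the weakest in purely bottom-up terms.
stimuli = [1.0, 0.9, 1.0]

unbiased = compete(stimuli, top_down_bias=[0.0, 0.0, 0.0])
# A goal-related bias in favor of the middle stimulus.
biased = compete(stimuli, top_down_bias=[0.0, 0.3, 0.0])
print(unbiased, biased)
```

Without the bias the weakest stimulus loses the competition; with the bias it comes to dominate the other two, which is the sense in which selection results from combining bottom-up processing with goal-related top-down signals.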
Recurrent interactions with areas outside the visual stream make storage in visual WM possible and give rise to Global Recurrent Processing (GRP). In GRP, standing knowledge, that is, information stored in the synaptic weights of the neurons, is activated (becoming part of WM) and modulates visual processing, which up to that point was


conceptually encapsulated. Consequently, during GRP the conceptualization of perceptual content starts and the states formed during this stage have (perhaps partly) conceptual and eventually propositional contents. This is the stage where the 3D sketch is formed, since the recovery of the 3D sketch, that is, the representation of an object independently of the viewer’s perspective, cannot be the output of early vision. This recovery cannot be purely data-driven, since what is regarded as an object depends on the subsequent usage of the information, and thus depends on the knowledge about objects. It follows that the formation of the 3D sketch requires constitutively the application of concepts.³ Seeing 3D sketches of objects is an instance of amodal perception, i.e., the representation of object parts or features that are not visible from the viewer’s standpoint. Thus, late vision involves a synergy of perceptual bottom-up and top-down processing where knowledge from past experiences guides the formation of hypotheses about the identity of the visual objects. I should stress at this point that the previous account of how objects in a visual scene are recognized and categorized, although predominant in most cases, is by no means the only way objects could be recognized. As we saw in Chapter 3, on certain occasions object recognition can occur very fast without any conceptual involvement through purely perceptual processes, that is, through either solely feedforward or LRP processes (recall that in our discussion of early vision, I mentioned that FFS may lead to early categorizations). Stored associations of low-level properties of objects in the environment or the statistics of natural scenes extracted in early vision enable purely bottom-up processes to recognize objects in a visual scene. When scenes or faces, for example, are successively presented without any pre-cueing, participants can perceive them better than chance at a rate of 12 per second (Potter et al.
2014), which

³ The view that the formation of the viewer independent representations of objects relies on object knowledge is common in theories of the formation of the 3D viewer independent representation. Biederman (1987) thinks that object recognition is based on part decomposition, the first stage in forming a structural description of an object. This decomposition cannot be determined by general principles reflecting the structure of the world alone, since the decomposition appears to depend upon knowledge of specific objects.


means that each object is recognized in about 80 ms, a time interval that does not allow for any substantial LRP, let alone global recurrent processing, but could be achieved only through bottom-up processes. This, however, does not detract from the main theses developed in this book in general and in this chapter in particular, namely that early vision is not conceptually modulated, whereas late vision is essentially so modulated. This sort of rapid categorization only shows that on certain occasions, probably for evolutionary reasons, concepts or concept-like entities could be activated very fast. There are two kinds of completion: modal and amodal completion. In modal completion one has a distinct visual impression of a hidden contour or other hidden features even though these features are not occurrent sensory features since they are not present in the visual scene. The perceptual system fills in the missing features, which become as phenomenally occurrent as the occurrent sensory features of the object. In amodal completion, in general, one does not have a perceptual impression of the hidden features of the object since the perceptual system does not fill in the missing features as in modal completion; the hidden features are not perceptually occurrent. There are cases of amodal perception that are purely perceptual, that is, bottom-up. In these cases, although no direct signals from the hidden features impinge on the retina (there is no local information available), the perceptual system can extract information about them from the global information in the visual scene without any cognitive involvement, as the resistance of the percepts to beliefs indicates. In such cases, the hidden features are not perceived. One has the visual impression of a single concrete object that is partially occluded and not the impression of various disparate image regions.
Therefore, in these perceptually driven amodal completions there is no mental imagery involved, since no top-down signals from cognitive areas are required for the completion, and the hidden features are not phenomenologically present. There are also cases of amodal completion that are cognitively driven—C-completions (Briscoe 2011), such as the formation of the 3D sketch of an object, in which the hidden features of the object are represented through the top-down activation of the visual cortex. In some

5  Early and Late Vision: Their Processes and Epistemic Status     267

of these cases, however, top-down processes activate the early visual areas and fill in the missing features that become, thus, phenomenologically present. If one calls the top-down effects that originate from the activation of some concepts and affect the visual areas giving rise to some phenomenology ‘imagination’, then as Nanay (2010, 252) argues, amodal completion in certain cases is accompanied by some sort of phenomenology subserved by the activation of the early visual areas, whereby the hidden parts and features of an object are not merely believed in but are present in the object of perception as actualities by being imagined. In other cases of C-completion, the viewer forms a pure thought concerning the hidden structure in the absence of any activation of the visual areas and, thus, in the absence of mental imagery. Since the construction of the representations of the putative causes of the perceptual inputs in late vision takes place through the synergy of bottom-up processing transmitting information registered at the lower levels or prediction errors, and top-down processing transmitting information relevant to the testing of hypotheses concerning the probable causes of the input, and in so far as the processes constructing these hypotheses are informed by high-level knowledge about worldly objects, visual perception unifies cognition and late vision; these two become intertwined. This means that late vision inextricably involves cognition. 
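The interplay just described, between top-down predictions and bottom-up prediction errors, can be illustrated with a toy sketch. It is not a model taken from the literature discussed here; it assumes a single level, a single scalar hypothesis, Gaussian noise, and hand-picked precisions and learning rate, and all function and parameter names are hypothetical. Under those assumptions, repeatedly revising the hypothesis so as to reduce the precision-weighted prediction error settles on the same value as the Bayesian posterior mean.

```python
def posterior_mean(sensory_input, prior_mean,
                   sensory_precision=1.0, prior_precision=1.0):
    """Closed-form Bayesian posterior mean for a Gaussian prior and likelihood."""
    total_precision = sensory_precision + prior_precision
    return (sensory_precision * sensory_input
            + prior_precision * prior_mean) / total_precision


def minimize_prediction_error(sensory_input, prior_mean,
                              sensory_precision=1.0, prior_precision=1.0,
                              learning_rate=0.1, steps=200):
    """Toy single-level loop: revise a hypothesis until prediction error settles.

    Assumption: the 'hypothesis' is one number standing in for the putative
    cause of the input; a real hierarchy would stack many such levels.
    """
    hypothesis = prior_mean  # start from what prior knowledge predicts
    for _ in range(steps):
        # Bottom-up signal: mismatch between the input and the current prediction.
        sensory_error = sensory_input - hypothesis
        # Top-down constraint: mismatch between the hypothesis and the prior.
        prior_error = hypothesis - prior_mean
        # Revise the hypothesis in the direction that reduces both errors,
        # each weighted by its precision (inverse variance).
        hypothesis += learning_rate * (sensory_precision * sensory_error
                                       - prior_precision * prior_error)
    return hypothesis
```

With equal precisions, an input of 2.0 and a prior expectation of 0.0, the loop settles midway between the two; raising `sensory_precision` pulls the final hypothesis toward the input.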
Notice that this account of visual perception necessarily involves representations; it requires that each level retain a representation of the data represented at this level so that the top-down transmitted predictions of the hypotheses formed at subsequent higher levels be matched against the information represented at the lower level in order for the hypothesis to be tested, and also requires the representation of the putative causes of the sensory data at the preceding level; these are called the representation-units, which operate alongside the error units (the units that compute the error signal, that is, the discrepancy between prediction and actual data) in a hierarchical generative system. As I said before, testing hypotheses and altering them as a result of any prediction errors until the prediction error is minimized and, thus, until the most probable cause of the sensory data has been discovered,

is an inference. Being a probabilistic inference that aims to discover the most probable hypothesis that explains away a set of data, it is very plausible that the computational framework of hierarchical predictive processing realizes a Bayesian inferential strategy. Late vision, like early vision, most likely employs Bayesian inferential strategies, in which, however, unlike in early vision, concepts participate. I will argue next that these Bayesian inferences are radically different from discursive thinking. Before I proceed, allow me to dwell on “mental imagery,” since the way it is used may cause some confusion concerning the top-down processes in late vision. Imagery is central in Kosslyn’s (1994) account of object recognition. As we saw, Kosslyn thinks that visual imagery is involved in all cases of perception and covers all the top-down flow of information either from the associative areas of the brain or the pattern activation subsystems in the IT cortex. Strawson (1974) also holds that object recognition involves visual imagery. Discussions on amodal completion emphasize the role of imagery in completing the hidden features by representing them and occasionally making them phenomenologically present even though they are perceptually absent (Nanay 2010).4 In discussing late vision, I emphasized the role of top-down processes that are necessary for object recognition. Now, it is well known that many of the neural systems engaged in mental imagery are also actively involved in the formation of the percept, most notably the early visual areas. Since mental imagery is usually related to top-down processes, imagery could be assimilated to late vision, which involves top-down processes too.
As mental imagery involves top-down activation of the visual areas, it is tempting to claim that the top-down processes in late vision are instances of visual imagery, especially so in the case of C-completion in which the object or feature that is represented through mental imagery is absent from the visual scene.

4 The phenomenal/non-phenomenal distinction is orthogonal to the discussion on mental imagery since mental imagery, exactly like perception, can either be accompanied by consciousness, or it can be implicit (as in implicit perception).

To decide the issue one should define mental imagery. Usually mental imagery is related to the mental construction of the image of an object or feature in its absence. The image formed from actual (perceptual) experience is called a percept to distinguish this image from an imagined or mental image. When a subject is asked to recall a visual object, the image formed in memory is called a mental image. The mental image is constructed via top-down processes (when, for example, subjects are presented with a lowercase letter and are asked to form a mental image of the uppercase letter, a task that is cognitively driven since it requires knowledge of the uppercase letter), while the percept is constructed through a synergy of top-down and bottom-up processes. Thus, mental imagery is usually construed as (i) involving only top-down cognitively driven processes, and (ii) taking place in the absence of the imagined object or feature from the agent’s objective visual field. This is how I use the term. (On a historical note, the distinction I have in mind corresponds to Hume’s (2003 [1739–1740]) distinction between ‘impressions’ and ‘ideas’, the former including perceptual experiences and the latter mental images.) Kosslyn (1994) and Strawson (1974), in contrast, use the term to designate the top-down processes in object recognition. Kosslyn talks about imagery feedback to the visual buffer both from the associative, concept-involving areas of the brain, and from the pattern activation subsystems that Kosslyn thinks store nonconceptualized information. Therefore, on Kosslyn’s usage, mental imagery can be either cognitively driven or data-driven, which goes against the usual construal of mental imagery. Moreover, mental imagery is engaged in perceptual tasks of object recognition, which means that Kosslyn forgoes the second trait of mental imagery as well.
Nanay (2010, 244–246, 250) uses visual imagery to account for cognitively driven amodal completion, and specifically, to designate the top-down knowledge-driven effects on visual processing. Mental imagery is perceptually and not propositionally coded, even though it may start with the activation of concepts in an associative

memory (Kosslyn 1994). However, the activation of the visual areas in a top-down manner in mental imagery is not the same as the activation of these same areas by the sensory signal. For example, the top-down induced activation in the absence of retinal input is weaker and, thus, the modal “mode” associated with mental imagery is not as strong or lively as in perception. Although it is true that when an object is imagined as opposed to merely thought about a number of properties must be added to the description, these properties fall far short of all those that would be present in perception. Not only may some features be omitted, but precise iconic and metric information is also lost in mental imagery. Since the concepts that activate the visual cortex represent abstract categorical information, such as bright red, and not the determinate color, say, red₂₁ (which is why one cannot recall the determinate color of an object but only its category membership), not all visual details of the actual visual scene can be the contents of a state of visual imagery (Raftopoulos 2010). To put it differently, the contents of perception and imagination, although of the same kind (in that they are hybrid states involving iconic and conceptual contents), differ in degree in the sense that mental imagery is less determinate than perception, and may involve fewer abstractions than perception. This latter point needs some explication. Perceptual contents, owing to their analog character, represent not only determinate properties (say, scarlet) but also more determinable properties of the same kind (that is, being red or being colored). Thoughts do not have this property; the thought that “this is scarlet” does not represent the object demonstrated by ‘this’ as being red or colored too.
Kulvicki (2007, 2015) refers to this property of analog representations as ‘vertical articulation’; it is a direct consequence of the main characteristic of analog representations, according to Dretske (1981), namely that an analog representation of “a is F”, unlike a digital/symbolic representation of it, always carries more information than simply the fact that a is F. In late vision, on the other hand, the presence of the visual object allows perceptual demonstratives to rely on the presence of the sample and overcome any conceptual limitations. Thus, in perception but not in mental imagery, the processes are causally driven by the

stimulus, the trait par excellence of perception (Beck 2017; Burge 2010; Nanay 2015; Phillips 2017; Raftopoulos 2009, 2017). Another reason to be skeptical as to whether the cognitively driven top-down processes involved in late vision are those of mental imagery is that although areas involved in visual perception are also involved in mental imagery, the correspondence is not as neat as it is usually thought. As Brogaard and Gatzia (2017, 197) remark:

[I]t was found that visual perception and visual imagery engage frontal and parietal regions in ways more akin to each other than the ways that they engage temporal and occipital regions. Since it is the occipital regions that process visual information and send it to the parietal and temporal regions, if visual experience were cognitively penetrable in the way described in Macpherson’s model, we would expect to find greater similarity in the occipital regions. These findings, however, suggest that at least some sensory processes are engaged differently by visual perception and visual imagery.

Macpherson’s (2012) model includes the view that cognition affects visual perception through mental imagery, and, thus, Brogaard and Gatzia argue against this thesis, in agreement with my view that the cognitive effects during late vision, which is a perceptual stage, are different from those engaged in mental imagery. Since late vision constitutively involves a synergy of bottom-up and top-down processing, whereas mental imagery, as I construe it, involves only top-down flow of information to early visual areas in the absence of sensory stimulation, I prefer (pace Kosslyn and Nanay) not to use “imagery” to designate the top-down activation of the visual cortex in late vision, even in those cases in which top-down processing completes hidden features of objects. Mental imagery differs from seeing in that it uses only the late processing components of the perceptual system when the early processing sensory-driven processes are unavailable (as when there is no sensory stimulation). Visual imagery activates the (inactive) visual processing areas to recreate to a certain extent a visual scene. As such, mental imagery, unlike late vision, involves only top-down

processes. Although in both cases the early visual areas are reentered by signals emanating from cognitive centers, in late vision the cognitive centers are activated through bottom-up signals from the visual cortex, while in visual imagery the cognitive centers are activated in the absence of any sensory stimulation on the retina. Thus, I think that the top-down processes in late vision should be distinguished from mental imagery in that the former are essentially engaged by the existence of sensory stimuli on the retina, whereas in the latter there are no sensory stimuli.

Let me close this section by saying a few things about the representational contents employed in early and late vision. An iconic representation is a dense homogeneous representation that does not have an internal logical or formal canonical structure and, thus, it does not admit of a canonical decomposition. Perceptual representations are iconic and cannot recombine, whereas conceptual representations are discursive and can be recombined in the right sort of way. The reason is that iconic representations have no canonical decomposition, that is, although they have interpretable parts, they have no constituent parts. Discursive representations, on the other hand, have canonical decomposition because they consist of distinguishable parts. Simply put, a representation is compositional if its syntactic structure is determined by the syntactic structure of its parts and the syntactic features that are used in the composition. Having syntactic structure means that some parts of the representation are constituents and other parts are not. ‘Φ’, for instance, is a constituent of the representation ‘Φ (a)’ but ‘Φ (’ is not a constituent. In that sense, discursive structures are not homogeneous. Iconic representations, on the other hand, satisfy the Picture Principle, which states that if P is a picture of X, then parts of P are pictures of parts of X (Fodor 2007, 173).
In that sense, iconic structures are homogeneous. But then, all the parts of a picture are among its constituents and, thus, an icon is compositional whichever way you carve it up, that is, no matter how you cut the picture you always get a picture of something. To appreciate the difference between iconic and discursive representations think of it in the following way: any part of the picture of the ocean is a picture of a part of the ocean, whereas not any part of the discursive representation Φ (a) is a discursive representation of a

part of Φ (a). So pictorial representations are structurally different from conceptual, discursive representations. One might be tempted to think that the discussion about the differences in format is situated at the syntactic level of analysis and, thus, is not directly applicable to the semantic level, the level of content. However, this would be misleading because it is very doubtful whether discussions about format or vehicles could be separated from discussions about contents since claims about differences in format usually go hand in hand with differences in content (Fodor 2007; Heck 2007; Phillips 2017; Raftopoulos 2014). Fodor’s discussion, for instance, relies heavily on the type of semantics that iconic and cognitive states have, that is, on the types of the contents they have. Complex iconic states have contents that are composed of parts but their meaning is not determined exhaustively by the meanings of the parts plus any rules of syntax, which is exactly what happens with cognitive/discursive states that have a compositional semantics. The reason is that the latter, but not the former, have a canonical decomposition. Rescorla (2009) discusses the view that perceptual states have a map-like iconic format, whereas cognitive states have a discursive format. Research in perception supports the distinction between dense iconic perceptual representations and the categorical/symbolic representations that are used in WM and LTM and that support conceptual thought. Here I repeat the discussion in Chapter 2 omitting the references. Since attention is involved in Visual Short Term Memory (VSTM) and visual LTM, it seems that the attentional modulation of the output of early vision results not only in restricting the number of objects that can be held in memory (up to four objects), but also in impoverishing the information about those objects that are stored in WM.
In general, it is thought that iconic representations are high-density representations in the order of 100,000 bits of information. In view of the fact that iconic representations concern proto-objects, it is thought that upon viewing a scene one perceives about 1000 proto-objects with the simple properties of proto-objects accounting for about 100–1000 bits of information making the total up to 100,000 bits of information. The representations in VSTM, on the other hand, have a much lower density, about 30–40

bits of information, with perhaps 10 bits per item and 3–4 bits for each feature represented. Coding of the content of iconic representations is done through basis functions. Iconic representations are modal (visual in this case) and represent by means of basis functions. A color, for instance, is represented by a vector or a pattern of activation values (scalars that represent the relative activity of red, green, and blue) across columns of neurons that distributively represent colors. Basis functions seem to work at the early perceptual levels. VSTM coding is also done by means of basis functions but these basis functions are sparser. VSTM codes of colors, for example, concern categories like ‘blue’, ‘light’ etc., but they do not encode the fine color information regarding hues, intensities, etc., that is available to low-level color channels. Thus, information stored in VSTM does not allow the fine discriminations made available via low-level color channels and the representations in visual areas differ from the representations stored in VSTM. It is debatable whether the representations in visual LTM function as descriptors that code in a categorical all or nothing manner (for example something is red or not), or by means of sparse basis functions of the sort used in VSTM. Evidence suggests that VSTM acts as a gate of visual information for visual LTM and that limitations in VSTM affect visual LTM. Visual LTM, therefore, cannot store information in a richer format than that of VSTM, although, of course, it can store more information than VSTM. Specifically, although visual LTM can store thousands of images and has a massive storage capacity for image details, it is suggested that only 20 bits appear to be stored for each image.
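The orders of magnitude quoted above can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope illustration: it uses the lower per-proto-object figure (100 bits), since that is the one that yields the quoted total of 100,000 bits, and it assumes that the VSTM figure arises from four items at roughly 10 bits each. The constant and function names are hypothetical.

```python
# Figures quoted in the text; all are rough orders of magnitude.
PROTO_OBJECTS_PER_SCENE = 1000
BITS_PER_PROTO_OBJECT = 100      # lower end of the quoted 100-1000 range
VSTM_ITEMS = 4                   # "up to four objects"
VSTM_BITS_PER_ITEM = 10          # "perhaps 10 bits per item"


def iconic_bits(proto_objects=PROTO_OBJECTS_PER_SCENE,
                bits_each=BITS_PER_PROTO_OBJECT):
    """Estimated information content of the iconic representation of a scene."""
    return proto_objects * bits_each


def vstm_bits(items=VSTM_ITEMS, bits_each=VSTM_BITS_PER_ITEM):
    """Estimated information content of what VSTM retains from that scene."""
    return items * bits_each


def reduction_factor():
    """How many times sparser the VSTM code is than the iconic one."""
    return iconic_bits() / vstm_bits()
```

On these assumptions the iconic representation carries 100,000 bits and VSTM about 40, a reduction by a factor of roughly 2,500, which conveys how much fine-grained detail is lost once attention selects what enters memory.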

4 Is Late Vision a Visual Stage or a Discursive Thought-Like Stage?

4.1 The Problem

Jackendoff (1989) draws a distinction between visual awareness and visual understanding. He bases this on, among other things, the fact that phenomenologically speaking there is a qualitative difference

between the experience of a 3D sketch and the experience of a 2½D sketch. One is aware of the 3D sketch or of category-based representations, but this is not visual awareness but some other kind of awareness. Visual awareness is awareness of Marr’s 2½D sketch, which is the viewer-centered representation of the visible surfaces of objects and in which the properties of objects retrieved from the environment figure in the phenomenal content of the experience, that is, the way the world is presented as being in perception. The awareness of the 3D sketch, in contradistinction, is visual understanding. The unseen surfaces that are not represented in the 2½D sketch are represented in the 3D sketch of the object but they are not present in the phenomenal content of the experience; they are not strictly speaking seen. The 3D sketch is the result of an inference and the features of the object that belong only to the 3D sketch are visually understood but not perceived. It follows that amodal completion, in general, is an inference, which places Jackendoff’s views into the so-called belief-based account of amodal completion: the 3D sketch is the result of beliefs inferred from the object’s visible features and other background information from past experiences. Jackendoff’s distinction between visual awareness and visual understanding gives rise to the problem of whether object identification and C-completion, which occur in late vision and are both dependent on concepts, should be thought of as cases of vision or as cases of discursive understanding involving inferences.
If late vision necessarily involves concepts and if the role of concepts consists, among other things, in providing some initial interpretation of the visual scene and in forming hypotheses about the identity of objects that are tested against perceptual information, one is tempted to say that this stage and the hypothesis testing that occurs in it rely on inferences and, thus, late vision differs in essence from the purely perceptual processes of early vision. Perhaps it would be better to construe late vision as a discursive stage involving thoughts, in the way of Jackson’s (1977) epistemic seeing, where “seeing” is used in a metaphorical nonperceptual sense, as when one says of a friend whom one has visited “I see he has left,” based on perceptual evidence. Dretske (1993, 1995) also likely thinks that seeing in the doxastic sense is not a visual but, rather, a discursive stage.

One might object that abandoning this usage of ‘to see’ violates ordinary usage. A fundamental ingredient of visual experience consists of meaningful 3D solid objects. If one adopts this proposal, then one should resist talking of seeing tigers and start talking about seeing viewer-centered visible surfaces. “By this criterion, much of the information we normally take to be visually conscious would not be, including the 3D shape of objects as well as their categorical identity” (Palmer 1999, 649). The appeal to ordinary language notwithstanding, I think that one should not assume that late vision is an inferential discursive stage that constitutively involves thoughts in the capacity of premises in inferences whose conclusion is the content of the states of late vision or that late vision consists in discursively entertaining thoughts. The reason is twofold. First, I think that seeing an object is not the result of an inference, that is, a movement in thought from some premises to a conclusion and, thus, a discursive process, even though it involves concepts. Second, late vision is a stage in which conceptual modulation and perceptual processes form an inextricable link that differentiates late vision from discursive stages and renders it a different sort of set of processes than understanding, even though implicit beliefs concerning objects have a role to play in late vision by guiding the formation of hypotheses about object identity, and an explicit belief of the form “that O is F” eventually arises in the final stages of late vision. Late vision has an irreducible visual ingredient, which makes it different from discursive understanding. To put this in a familiar setting: late vision does not belong to the realm of reasons. Before I discuss all these, let me clarify some terminological issues.

4.2 Beliefs

Traditionally judgments are construed as occurrent states, whereas beliefs are dispositional states. To judge that O is F is to predicate F-ness of O, while endorsing the predication (McDowell 1994). To believe that O is F is to be disposed to judge, under the right circumstances, that O is F. This is the first sense in which beliefs are dispositional items. In addition, bearing in mind the distinction between standing

knowledge (information stored in LTM) and information that is activated in WM, the belief that O is F may be a piece of standing information in LTM, a memory, because, say, one has seen O to be F in the past, even though presently one does not have an occurrent thought about O. Beliefs need not be consciously or unconsciously recalled or apprehended in order to be possessed by a subject, which means that beliefs are dispositional rather than occurrent items; this is a second sense in which beliefs are dispositional. Being dispositional means that they can be activated when their content is activated, owing, for example, to a certain perceptual input. In this sense, one could say that the beliefs qua standing knowledge are implicitly grasped. When this information is activated, the thought that O is F emerges in WM. In the literature one finds the distinction between ‘thought’ and ‘standing knowledge’ (Prinz 2002, 148). Accordingly, all thoughts are occurrent states by being activated in WM. Thus, I use ‘occurrent thought’ and ‘thought’ as synonymous. It follows that a belief qua dispositional state may be either a piece of standing knowledge, in which case it is dispositional in the sense that when activated it becomes a thought, or a thought that awaits endorsement to become a judgment, in which case the belief is dispositional in the sense that it has the capacity to become a judgment. In the first case, if beliefs are stored in LTM as standing knowledge and if thoughts are occurrent states, beliefs are not the same as thoughts although a belief when activated becomes a thought. (This distinction between ‘thought’ and ‘belief’ is different from the distinction that holds on account of the fact that a belief is a typical way to entertain a thought, but there may be cases in which one holds a thought without believing it.
In this case the belief is conceived as a mode in which a thought can be held rather than as a piece of standing knowledge; a thought could be the content of a desire etc.) In the second case, a belief is a thought held in WM, albeit one that has not yet been endorsed. There are interesting epistemological implications but they are irrelevant here. In what follows, I assume depending on the context that beliefs are either pieces of standing information, or thoughts that have not been endorsed and, thus, are not judgments. One might wonder how is it possible to

understand a belief as an occurrent thought that is not endorsed? An explanation has to wait until I have explained why late vision does not involve discursive inferences.

4.3 Inference

My claim is that the processes in late vision are not inferential processes where “inference” is understood as discursive, that is, as a process that involves drawing propositions (conclusions) from other propositions acting as premises by applying (explicitly or implicitly) inferential rules that are also represented. Inferences in thought are characteristic cases of discursive inferences. The inferences characterizing discursive reasoning are also called belief-like inferences, or cognitive inferences in that they require the presence of conceptual representations that encode the processing rules and the premises used in the inference (Hatfield 2002). Boghossian (2014) thinks that discursive inferences must satisfy the taking condition, namely, that inferring necessarily involves the thinker taking his premises to support the conclusion and drawing this conclusion because of this fact. All these put together entail that in the space of reasons, inferences provide reasons, the premises, for believing a proposition; the premises of the inference constitute an epistemic ground for the conclusion, and this presupposes: (a) that there is a semantic and a logical relation between the contents of the premises in the inference and the content of the conclusion; (b) that the cognizer draws the conclusion because of the semantic and logical relation, which means that the cognizer operates upon the information provided by the premises and uses the form of the inference to draw the conclusion; and (c) as a consequence of the second, that the cognizer represents implicitly or explicitly both the premises and the rule of inference. Discursive inferences are distinguished from “inferences” as understood by vision scientists according to whom any transformation of signals carrying information according to some rule is a form of inference.

“Every system that makes an estimate about unobserved variables based on observed variables performs inference… We refer to such inference problems that involve choosing between distinct and mutually exclusive causal structures as causal inference” (Shams and Beierholm 2010). Burge (2014, 574) also claims that propositional inferences, that is, discursive inferences, differ from other types of transformations, such as those occurring within the visual system, and significantly (for reasons that will become clear in a while), such as those characterizing the transformations from perception to perceptual belief. In other words, Burge claims that the process that leads from the percept to the formation of a perceptual belief is not a discursive inference. It is also important to stress that Burge (2003, 2014) carefully distinguishes between perceptual beliefs and judgments that lead to the endorsement of these beliefs. There is another notion of inference that is more restrictive than the abovementioned, which however falls short of being a discursive inference. Cavanagh (2011) argues that the processes that lead to the formation of a conscious percept constitute “visual cognition” in virtue of using inferences. The construction of a percept is “the task of visual cognition and, in almost all cases, each construct is a choice among an infinity of possibilities, chosen based on likelihood, bias, or a whim, but chosen by rejecting other valid competitors” (Cavanagh 2011, 1538). This process is an inference in that “it is not a guess. It is a rule-based extension from partial data to the most appropriate solution” (ibid., 1539); the selection process is abductive. 
For Cavanagh (2011, 1545), for an inference to take place in the visual system, the system should not rely only on purely bottom-up analyses of the image that use retinal information, such as sequences of filters that underlie facial recognition, or the cooperative networks that converge on the best descriptions of surfaces and contours. Rather, the visual system uses some object knowledge, which is non-retinal, context-dependent information. By ‘object knowledge’ Cavanagh means any kind of non-retinal information that may be needed for the filling in that leads to the construction of the percept. This knowledge consists in rules that guide or constrain visual processing in order to solve the underdetermination problem mentioned above; they provide the rule-based extension from partial data that constitutes an inference.

These rules do not influence visual processing in a top-down way, since they reside within the visual system; they are “from the side”. As we had the chance to see in Chapter 3, evidence shows that an important ‘body of information’ affects perceptual processing almost at every level not in a top-down manner but from within and this might be construed as evidence for the CP of visual perception from its inception. This body of information constitutes Raftopoulos’ (2009) ‘operational constraints’ or Burge’s (2010) ‘formation principles’. I discussed these constraints in detail in Chapter 3, and the conclusion drawn there was that even though the perceptual system uses the operational constraints to represent some entity in the world and, thus, operates in accord with the principles reflected in the constraints (since the constraints are hardwired in the perceptual system, physiological conditions instantiate the constraints), the perceiver does not represent these constraints in any form. Moreover, inferences presuppose that the subject applies explicitly or implicitly inferential rules that are represented in the subject. But the operations by means of which signals are transformed from one into the other in the visual system are not represented at all; they are just hardwired in the perceptual system. For this reason, perceptual operations should not be construed as inference rules, although they are describable in terms of inference rules, and they do not constitute either a body of knowledge or some theory about the world.

4.4 Late Vision, Hypothesis Testing, and Inference

I hold the view that the states of late vision are not discursive or propositional inferences from premises that include the contents of early vision states, even though it is usual to find claims that one infers that a tiger, for example, is present from the perceptual information retrieved from a visual scene. A discursive inference relates some propositions in the form of premises with some other proposition, the conclusion.

5  Early and Late Vision: Their Processes and Epistemic Status     281

However, the objects and properties as they are represented in early vision do not constitute contents in the form of propositions, since they are part of the nonpropositional NCC of perception. In late vision, the perceptual content is conceptualized but the conceptualization is not a kind of discursive inference but the application of stored concepts to some input that enters the cognitive centers of the brain and activates concepts by matching their content, a view shared by Brossel (2017, 9–12), Gauker (2012, 44–45), Heck (2000, 511), and Millar (2011, 338), all of whom argue that in perception at some point concepts are immediately, that is, non-inferentially, applied in response to seeing something; a concept, for example, is applied this way to the visual object of an early vision representation as a response to this representation when global recurrent processing occurs in late vision and concepts are brought to bear on perceptual contents. I have argued elsewhere (Raftopoulos 2015b) that the processes by which concepts apply to, or are associated with, perceptual contents, may assume the form of pattern matching. Thus, even though the states in late vision are formed through the synergy of bottom-up visual information and top-down conceptual influences, they are not inferences from perceptual content. Late vision involves hypotheses about the identity of objects that are tested against the sensory information stored in the iconic image. One might conclude that this testing process involves inferences, since testing hypotheses is an inferential process even though it is not an inference from perceptual content to a recognitional thought. 
It is, more likely, an argument of the form if A and B then (conclusion) C, where A and B are background assumptions and the hypothesis about the identity of an object, respectively (‘A’ consists of implicit beliefs about the features of the hypothesized visual object), and C is the set of visual features that the object is likely to have. If C obtains in the visual areas, that is, if the predicted visual features match those that are stored in the iconic image, the hypothesis concerning the identity of the object is likely correct. However, the test-basis or evidence (whether the term ‘evidence’ can be used in this context has been discussed in Chapter 1) against which these hypotheses are tested for a match, that

is, the iconic information stored in the sensory visual areas, is not a set of propositions but patterns of neuronal activations whose content is nonpropositional. This matching process is not an inference or an inference-like process. It is a comparison between the activations of neuronal assemblies that encode the visual features in the scene and the activations of the neuronal assemblies that are activated top-down from the hypotheses. If the same assemblies are activated there is a match. If they are not, the hypothesis fails and should be rejected. The matching can be done through purely associational processes of the sort employed in connectionist networks that process information according to rules and, thus, can be thought of as instantiating processing rules, without either representing these rules or operating on language-like symbolic representations. Since discursive inferences are realized through the application of rules that are represented in the system and operate on symbolic structures, the processing in a connectionist network does not involve inferences, although it can be described in terms of inference making. Similarly, even though seeing an object in late vision involves the application of concepts that unify the appearances of the object and its features under some category, it is not an inferential process. The recognitional abilities manifested in late vision are not inferential, and the categorization of an object under a certain category is not the result of an inference; “Exercising an ability consists in an application of a concept in immediate response to seeing something” (Millar 2011, 338). The processes in late vision, despite their reliance on background beliefs, do not entail (in a logical sense) a recognitional belief. There is not a logical, inferential relation between the inputs to late vision, i.e., the contents of early vision including the phenomenal (NCC) content of perception and the ensuing perceptual belief. 
Thus, when a perceptual belief is formed, the relevant act is not an inference from some premises to a conclusion but the immediate or direct application of some concepts that are caused by the inputs to late vision, that is, the iconic contents of states of early vision, which as I have argued retrieve information directly from the visual scene. In this sense, one could agree with Coates (2007, 12) that “the concepts exercised by the perceiver in perceptual experience refer directly to the external physical

objects perceived,” provided that one construes the ‘direct reference’ as the direct application of concepts without any inferential intermediary. To the extent that concepts are applied directly, in the sense explained above, the perceptual beliefs formed as a result of the application of concepts are ‘judgmentally direct’ and causally mediated by the phenomenal states, as Mackie (1976, 45) rightly observes, although he uses the term ‘judgment’ differently from the way I employ it here. Let me say a little more about the way conceptualization takes place revisiting Brossel’s (2017) account of the interaction between perception and cognition in the form of belief states, which we discussed in Chapter 2 when I presented an account of how cognition could interact with perception. Brossel aims at explaining how perceptual experiences could justify perceptual beliefs, that is, why viewers can reason rationally from perceptual experiences to perceptual beliefs, despite the fact that the former have NCC and, thus, nonpropositional structure, while the latter inherently involve concepts and, thus, have a propositional structure. To show how this is possible, Brossel examines the structure of both perceptual experiences and perceptual beliefs. Perceptual experiences (Brossel 2017, 9–10) are analyzed in terms of their position in phenomenal spaces. The color space, for example, represents the shades of colors viewers can experience by placing them in a geometrical space where each shade occupies a point and the distances between points represent the dissimilarities between two shades of color along the three axes that define the color space, namely, hue, brightness, and saturation. Each shade/point specifies through the projections to the three axes the hue, brightness, and saturation of the phenomenal content of the experience. 
Brossel points out that the content of perceptual experiences construed in this way is analog content since it represents shades of colors as points in a continuous space, in agreement with the same remark I made above. Brossel (2017, 11–12) examines next the conceptual structure space of perceptual beliefs or conceptual spaces. These are geometrical spaces whose structure captures semantical properties and relations. One main purpose of perceptual concepts is to allow for the categorization of objects in a few linguistic categories on the basis of our manifold

Pes [perceptual experiences] of those objects. For example, the color concept VIOLET allows one to categorize objects according to their manifold shades of color as experienced by the agent. With the help of such concepts, we achieve two goals. First, these perceptual concepts can be understood as coming with what has been called ‘language entry rules’… They allow us to introduce a concept or a word for a given object on the basis of a non-conceptual fundament… Second, such concepts also allow us to group similar shades of color together by subsuming them under one concept or category. For example, the concept VIOLET subsumes various different shades of color under one label and delimits them from various other shades of color. Given this job description, it is only natural to understand the concept VIOLET as corresponding, at least in part, to a region in the phenomenal similarity space. This region then includes all those manifold points in the space that we want to group in one category under one label and it thereby delimits these points in the space from other points in the space that lie outside of the relevant region in the phenomenal similarity space.
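The idea that a perceptual concept corresponds to a region of a phenomenal similarity space can be illustrated with a small sketch. It is only an illustration under stated assumptions: the names `Shade`, `ColorConcept`, and `categorize`, and the numerical boundaries chosen for VIOLET and BLUE, are hypothetical and are not taken from Brossel's account.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Shade:
    """A point in a three-axis color space (hypothetical units)."""
    hue: float         # degrees on the hue circle, 0 <= hue < 360
    brightness: float  # 0.0 (dark) to 1.0 (bright)
    saturation: float  # 0.0 (grey) to 1.0 (vivid)


@dataclass(frozen=True)
class ColorConcept:
    """A concept modelled as a region of the color space."""
    name: str
    hue_range: Tuple[float, float]  # half-open interval [low, high)
    min_saturation: float

    def subsumes(self, shade: Shade) -> bool:
        # A shade falls under the concept when its point lies in the region.
        low, high = self.hue_range
        return low <= shade.hue < high and shade.saturation >= self.min_saturation


# Region boundaries are assumptions made for the example only.
VIOLET = ColorConcept("VIOLET", (260.0, 290.0), 0.2)
BLUE = ColorConcept("BLUE", (200.0, 260.0), 0.2)


def categorize(shade: Shade, concepts: List[ColorConcept]) -> List[str]:
    """Return the names of all concepts whose region contains the shade."""
    return [c.name for c in concepts if c.subsumes(shade)]
```

Two different shades, for instance `Shade(270.0, 0.5, 0.8)` and `Shade(275.0, 0.6, 0.9)`, are grouped under the one label VIOLET and delimited from shades that fall in the BLUE region, which is the grouping role described in the quoted passage.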

According to Brossel, conceptual spaces, where the relevant concepts are the so-called perceptual concepts, naturally correspond to regions in some phenomenal space. Through this correspondence, concepts mainly allow grouping together under one heading various shades that belong to the same region in the phenomenal similarity space. This is the conceptual structure Brossel alludes to and this is also the structure of the conceptual space that I discussed in Chapter 2, when I claimed against Burnston that concepts are not atomic symbols devoid of any content; they have a semantic content that reflects their position and, hence, their interrelations with the other concepts in the conceptual space. Note also that the fact that these regions are discrete spaces in the landscape and that no space between two consecutive conceptual spaces is a conceptual space entails that the set of symbols/concepts is not continuous, justifying their characterization as ‘symbols’ (the result of their representation through sparse functions), in contrast to the points in a basin of attraction, where between two points (phenomenal shades) there is always a third point, phenomenal shade, that also belongs to the same basin of attraction (the result of their representation through

dense functions), justifying the claim that the distribution in a phenomenal space is continuous and analog. Let us return to the process of conceptualization of the purely perceptual, nonconceptual contents of the states of early vision, which, as I have argued, takes place in late vision where global recurrent processing allows semantic, conceptual information to be used as an information resource by perceptual processes. This conceptualization can be described as the application in a non-inferential way of concepts to some perceptual representation. Let us suppose that early vision outputs the structural, i.e., something akin to the 2½D, representation of an object in a visual scene. The representation is a vector or, equivalently, a point, the tip of the vector, in the activational space of the system. This point falls within one of the basins of attraction (although in certain cases with unfamiliar objects, or in complex scenes, or when the viewing conditions are not nominal, the point may fall in the borderline between different basins of attraction) that have been sculpted in the system as a result of its initial dispositions and learning, and is eventually drawn to the attractor that lies on the energy bottom of the basin of attraction. The concept/symbol that corresponds in the sense explained above to the attractor is then applied to the object, which, if the symbol stands for, say, a kind-concept, is categorized and recognized as being such and such; that is, the object is subsumed under this concept. It is evident that there is nothing inferential, in the discursive sense, involved in this process and this renders this sort of conceptualization markedly different from conceptualizations that take place in the space of reasons and are based on discursive inferences.5 That there are no discursive inferences in perception is not a novel claim. 
Sellars (1956) has extensively argued that there is not a logical or quasi-logical relation between phenomenal and conceptual states, and,

5 Gauker (2012, 45) reaches the same conclusion starting from a similar structural description of the perceptual system but without any mention of neural networks, although his construal of representations as marks and his discussion of perceptual similarity space in which these marks are located is close to the connectionist account I have sketched here.

consequently, that there cannot be a relation of entailment or discursive inference between them. It follows that there is no inferential relation between the phenomenal contents of perception and the perceptual occurrent thoughts formed in late vision through the application of concepts. Supposing that such a relation exists is to succumb to “the myth of the Given” that seeks to provide a foundation of knowledge on the immediate appreciation of the contents of phenomenal states. Sellars (1956), importantly, also notes that the phenomenal states and the perceptual beliefs qua conceptual states are not inferentially related despite the fact that the perceptual beliefs are formed in the light of background knowledge. As we have seen when discussing the processes of late vision that lead to the formation of perceptual beliefs, an important and essential role is played by the formation of tentative hypotheses concerning the identity of the object(s) in the visual scene. These hypotheses are formed both as a causal response to the iconic input from early vision, and by considering the existing knowledge about the environment and the objects and their properties in it, which acts, thus, as a set of background beliefs. To recapitulate the discussion in this section, since discursive inferences are carried out through rules that are represented in the system and operate on symbolic or symbolic-like structures, the processing in a connectionist network does not involve discursive inferences, although it can be described in terms of discursive inference making. It operates through a pattern matching process that is probably realized by Bayesian inferences. This pattern matching very likely concerns matching of visually constructed templates with templates stored in IT, in which perceptual visual features are associated with semantic information retrieved from memory (Gonzales-Cassilas et al. 2018). 
Kosslyn’s (1994) account of object recognition that we encountered in Sect. 3 provides a detailed description of such a matching process. I have argued (Raftopoulos 2015a, 2017) that the processes of late vision could be modeled in this way. This means that even though seeing an object in late vision involves the application of concepts that unify the appearances of the object and of its features under some category, it is not a discursive inferential process and in this sense is very different from thinking despite the fact that it employs concepts.
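The hypothesis-testing step described in this section can be sketched as a comparison of two sets of activated assemblies rather than as a derivation from premises. The sketch below rests on assumptions introduced only for illustration: assemblies are represented by feature labels, the function names `match_score` and `best_matching_hypothesis` are hypothetical, and the acceptance threshold of 0.8 is arbitrary (the text itself speaks simply of the same assemblies being activated).

```python
from typing import Dict, Optional, Set


def match_score(predicted: Set[str], iconic: Set[str]) -> float:
    """Fraction of the top-down predicted assemblies that are also active bottom-up."""
    if not predicted:
        return 0.0
    return len(predicted & iconic) / len(predicted)


def best_matching_hypothesis(
    hypotheses: Dict[str, Set[str]],
    iconic: Set[str],
    threshold: float = 0.8,  # assumed cut-off, not given in the text
) -> Optional[str]:
    """Return the hypothesis whose predicted features best match the iconic store.

    A hypothesis is rejected when its score falls below the threshold;
    None is returned when every hypothesis is rejected.
    """
    best_name: Optional[str] = None
    best_score = 0.0
    for name, predicted in hypotheses.items():
        score = match_score(predicted, iconic)
        if score >= threshold and score > best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical example: two candidate identities for one visual object.
hypotheses = {
    "tiger": {"stripes", "four_legs", "tail", "orange"},
    "zebra": {"stripes", "four_legs", "tail", "black_white"},
}
iconic_store = {"stripes", "four_legs", "tail", "orange", "grass"}
winner = best_matching_hypothesis(hypotheses, iconic_store)
```

Nothing in the procedure operates on propositions or applies a represented rule of inference; it only measures overlap between activation patterns, which is the sense in which the matching is said not to be a discursive inference.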

4.4.1 Perceptual Beliefs and Pattern Matching in Dynamic Neural Networks

I have argued that the conceptualization process in late vision may assume the form of pattern matching. Let me explain this idea by starting with a few things about the class of connectionist systems that perform the pattern matching that could explain the inferences involved in late vision, and their ability to represent propositional structures (i.e., the propositional descriptions of a visual scene) despite the fact that they do not rely on inferences and need not use as input propositional structures. Cognitive states are represented, transformed, and processed by means of operations performed on data structures purely on the basis of either formal properties (for algorithmic models of cognition) or algebraic properties (for dynamical models). These transformations can be either algorithmic (determined by means of a set of rules that apply to discrete static symbols that are the representations of the system) or dynamical (determined by means of mathematical relations that apply to continuous variables and specify their interrelations and evolution in time). These processes are mathematical-state transitions and describe the way the system moves between points in its state-space. I am going to assume that a cognitive system is associated with a dynamical system physically realized by a neural network. Studies in the cognitive neurosciences suggest that the brain is a complex system of interconnected neurons that interact by conducting electricity and releasing neurotransmitters. Neurons are organized in modules, large-scale units consisting of tens or hundreds of thousands of neurons. Neurons within modules may be connected through feedback loops. Modules also interact through feedback loops that allow signals to be transmitted among modules back and forth. These are called ‘reentrant connections’. Recurrent neural networks (Elman et al. 
1996) with distributed representations and continuous activation levels can naturally be construed in a dynamical way. They can be described by means of the evolution of the activation values of their units over time. To be able to model growth and avoid problems of lifelong learning (mainly catastrophic interference), one needs to consider a special class of networks, namely adaptive

or generative networks. These networks can modify their structure during learning by adding or deleting nodes and can change their learning rates. The number of units of the network determines the number of dimensions of the state-space associated with the system. Their activation values constitute the actual position in the state-space of the system. Adding a time-dependent parameter yields the phase-space of the system. Both in state- and phase-space, one can represent all the possible states that a system can take in time. Hence, in the connectionist account, the states of a cognitive system are depicted by the sets of activation values of the units that distributively encode these states. These activation values are the variables of the dynamical system and their temporal variation constitutes the internal dynamics of the system. In addition to the state-space of a system, an external control space is also defined. The external space contains the real-valued control parameters that control the behavior of the system, i.e., the connection weights, biases, thresholds, and, in networks whose structural properties are implemented as real-valued parameters, the structure of the system. In dynamical systems the fast internal dynamics are often accompanied by a slow external dynamics. The external dynamics consist of the temporal paths in the external control space. The external dynamics consist of the network’s learning dynamics (the various learning rules) and the dynamics that determine structural changes, such as the rules for inserting nodes in cascade correlation and growing radial basis function networks. When the network receives input, activation spreads from the input units to the rest of the network. Each pattern of activation values defines a vector, or a point, within the activational space of the system whose coordinates are the activation values of the pattern. 
The activation rules determine the state transitions that specify the internal dynamics of the system, i.e., the functions of the evolution of the system in time. Thus, the behavior of such a system is depicted as a trajectory between points in the activational state-space. Parallel neural networks represent information through the activation patterns of a set of units that simulate the neurons in the brain. Each activation pattern determines a vector in the state-space of the system, a space whose dimensions are determined by the number of

units in the network, and which comprises all the possible activation vectors of the system. Neural networks perform vector completion in order to produce the best output given the input into the system and the task at hand. In the case of the visual system, the information that the visual system receives through the retina consists in light intensities that result from the projection of light information emanating from three-dimensional objects onto the two-dimensional surface of the retina. It is well known that for various reasons the impinging information underdetermines the percept that visual perception delivers as its final product. This is a typical case in which from impoverished or partial data the visual system must construct its representations that culminate with the percept. This capacity for passing from partial data to a complete visual representation, which is an interpretation of the input, is a case of vector completion since the system must compute the activation pattern of the neurons (and the vector that this pattern defines) that realize the percept by completing the information missing from the partial input. Since the percept is constructed by adding information to the input, vector completion is an ampliative inference, an abduction, which means that the percept may not correctly represent the distal object from which the input data emanate even though the perceptual systems function adequately. Note that the algebraic and, thus, continuous, nature of state transformations in neural networks, as opposed to the algorithmic discrete-like operations of classical AI (which assumes that the brain is a syntactic machine that processes discrete symbols according to rules that are also represented in the system) suits best the analogue nature of iconic representations. In connectionism, some of the neurons or modules whose signals reenter the network constitute the “context units”, or clean-up units (Elman et al. 1996). 
They receive input from the hidden units of a network and output to these same hidden units. Their role is to feed back to the hidden units the results of the processing of the input to the system by the same hidden units. Consequently, the hidden units simultaneously receive and process the external input along with their own previous activation state. When a part of the input has been fed to the network, the input units are activated and activate the hidden units. The activation of the hidden units is fed to the context units, which are

activated in their turn. This activation is fed back to the hidden units, which in this way, receive not only the activations coming from the input units that receive the next part of the input, but those of the context units as well. As a result, the hidden units simultaneously receive and process the external input along with their own previous activation state. Thus, in each cycle, the hidden units are informed of their activations in the previous cycle. The context units constitute the working memory of the network, in the sense that they “store” the results of each processing cycle. These results are “retrieved” for reprocessing and interaction with other processing cycles, by means of the context units’ output to the hidden units. Recurrent or feedback loops, thus, introduce iterative processes in the brain and allow context and history to affect signal processing. This is so because the signal that reaches a neuron or a module is added to the signal that reenters the neuron or the module through the reentrant connections from other neurons or modules, the “context neurons.” Context neurons or modules may process other aspects of the incoming signal and inform the neuron or module of the results of their processing, rendering thus the processing at that site sensitive to context. Alternatively, they may process part of the input signal that has first passed from our neuron or module (first pass). Thus, upon receiving the signal from the context neurons, the neuron or module processes the external input along with its own previous activation state, which may have eventually been modified by the processing in the context units themselves. Thus, the context in which the input occurs, as well as the history of the previous activations of the neuron or module that does the processing, affects each step in the processing. 
Iteration, sensitivity to context and history, and interdependence of the components of a system are signs of complex dynamical systems, where the dynamics of a system can result in more expressively powerful structures by means of self-organization, and of the nonlinear dynamic governing the activation functions of their processing units. Such transitions from a lower to a higher level of complexity abound in dynamical theories of cognitive and motor processes (Elman et al. 1996; Kelso 1995; Thelen and Smith 1994).
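The role of the context units can be made concrete with a minimal sketch of a simple recurrent network in the style described above. It is an illustration under assumptions: the weights are fixed by hand rather than learned, `tanh` is chosen as the nonlinear activation function, and the names `step` and `run` are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def step(x: Vector, context: Vector, w_in: Matrix, w_ctx: Matrix) -> Vector:
    """One processing cycle: each hidden unit sums the external input and the
    context units' copy of the previous hidden activations, then applies tanh."""
    hidden = []
    for j in range(len(w_in)):
        net = sum(w_in[j][i] * x[i] for i in range(len(x)))
        net += sum(w_ctx[j][k] * context[k] for k in range(len(context)))
        hidden.append(math.tanh(net))
    return hidden


def run(sequence: List[Vector], w_in: Matrix, w_ctx: Matrix) -> List[Vector]:
    """Feed a sequence part by part; the context units start at rest (zeros)."""
    context = [0.0] * len(w_in)
    states = []
    for x in sequence:
        hidden = step(x, context, w_in, w_ctx)
        states.append(hidden)
        context = list(hidden)  # context units "store" the result of this cycle
    return states


# Hand-picked weights, assumed for the example only.
W_IN = [[1.0, 0.0], [0.0, 1.0]]
W_CTX = [[0.0, 0.5], [0.5, 0.0]]
A, B = [1.0, 0.0], [0.0, 1.0]

after_a_then_a = run([A, A], W_IN, W_CTX)[-1]
after_b_then_a = run([B, A], W_IN, W_CTX)[-1]
```

The final input is `A` in both runs, yet `after_a_then_a` and `after_b_then_a` differ, because the hidden units process the input together with their own previous activation state. This is the sensitivity to history that the recurrent loop introduces.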

It is reasonable to assume that the brain/mind plus environment system forms a complex dynamical whole structure, and that recurrent neural networks with nonlinear activation functions of their units capture this dynamic nonlinear character. Recurrent neural networks (Elman et al. 1995) can naturally be construed in a dynamical way. That is, they can be described by means of the evolution of the activation values of their units over time. The number of units making up the network determines the number of dimensions of the state-space associated with the system. Their activation values constitute the actual position in the state-space of the system. The states of a cognitive system are depicted by the sets of activation values of the units that distributively encode these states. The set of activation values is a vector whose tip defines a specific point in the state-space. Hence, a point in the state-space realizes a specific cognitive state of the system with some content; this point is called a content-realizing point (Horgan and Tienson 1996). The activational state-space of a network is a high-dimensional mathematical entity, a landscape. The state transitions in such a system are trajectories from one point on the landscape to another. To get a better understanding of this landscape and of its role, think of a recurrent network that is given a certain input and goes through various processing cycles before it settles down into a certain attractor state, i.e., before it stabilizes at a certain output. During the phase of activation changes the system passes through various outputs. All these outputs can be viewed as lying on an energy surface. When the system passes through a certain output-state whose energy is not lower than the energies of the neighboring states, it goes through another phase of activation-value changes in order to reduce the energy of the output state. 
When it reaches a point at which all the neighboring states have higher energies, it settles. These states of minimum local energy are the attractors and can be construed as valley bottoms on energy surfaces. Thus, attractors should be distinguished from the networks’ outputs in general. Not all outputs are settling points. Attractors form a subset of the set of outputs of a network, in that they are those outputs at which the system can settle.
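Settling into an attractor on an energy surface can be illustrated with a small Hopfield-style network. This is a swapped-in textbook model used only as a sketch, not the specific architecture the text has in mind: units take the values +1 or -1, weights come from a Hebbian rule, and the names `train_hebbian`, `energy`, and `settle` are hypothetical.

```python
from typing import List

State = List[int]
Weights = List[List[float]]


def train_hebbian(patterns: List[State]) -> Weights:
    """Symmetric weights with zero diagonal; each stored pattern carves a valley."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def energy(w: Weights, s: State) -> float:
    """Height of state s on the energy surface: E = -1/2 * sum_ij w_ij s_i s_j."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))


def settle(w: Weights, state: State, max_sweeps: int = 100) -> State:
    """Update units one at a time until no unit changes, i.e. until every
    single-unit neighbor of the state has energy that is not lower."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            h = sum(w[i][j] * s[j] for j in range(len(s)))
            new = s[i] if h == 0 else (1 if h > 0 else -1)
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s


# Two stored patterns (attractors) and one degraded input, chosen for the example.
P1 = [1, 1, 1, 1, -1, -1, -1, -1]
P2 = [1, -1, 1, -1, 1, -1, 1, -1]
W = train_hebbian([P1, P2])
degraded = [-1, 1, 1, 1, -1, -1, -1, -1]  # P1 with its first unit flipped
settled = settle(W, degraded)
```

The degraded input is an output state that is not a settling point; its energy is higher than that of `settled`, which coincides with the stored pattern `P1`. The stored patterns are states at which no further single-unit change lowers the energy, which is the sense in which attractors form a subset of the network's outputs.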

When the input of the system is such that the ensuing activation state of the system lies within the walls of the valley, the system will settle at the attractor that corresponds to its bottom; the valley is the basin of attraction that leads to the specific attractor state of minimum energy. Since the network has many attractors and basins of attraction, the relative positions of a valley with respect to others will shape the relief of the activational landscape of the system, which determines the possible trajectories in the state-space. A landscape is a multidimensional space of overlapping attractors and basins of attraction. The topological distribution of the systems’ attractors and basins of attraction (which constitute the dynamics of the system) determines the possible trajectories of state changes within the system/environment whole structure. Thus, when the network receives input, activation spreads from the input units to the rest of the network. Each pattern of activation values defines a vector, or a point, within the activational space of the system whose coordinates are the activation values of the pattern. The activation rules determine the state transitions that specify the internal dynamics of the system. They are, in other words, the functions that determine how the system evolves in time by specifying how the state of the system at time t+dt is a function of the state of the system at time t. The behavior of the system is depicted as a trajectory between points in the activational state-space. The activation states into which a network may settle after it is provided with an input signal are the attractors of the system. These are the regions in state-space toward which the system evolves in time. The points in state-space from which the system evolves toward a certain attractor lie within the basin of attraction of this particular attractor. 
Thus, the inputs that land within the basin of attraction of an attractor will be transformed by the connectivity patterns of the network so that they end up at this attractor, where the system will settle. Not all possible outputs of the system are settling points and, thus, attractors. Networks in which the outputs change over time until the pattern of activation of the system settles into one of several states, depending upon the input, are called attractor networks. The sets of possible states into which the system can settle are the attractors. If the network is used to model cognitive behavior, then the attractors

can be construed as realizing cognitive (or mental) states to which the system moves from other cognitive states that lie within the attractor’s basin of attraction. The process by which the input patterns are transformed into attractor patterns is the following: a given input moves the system into an initial state realized by an initial point. This input feeds the system with an activation that spreads throughout the network causing the units of the system to change their states. The processing may take several steps, as the signal is recycled through the recurrent connections in the network. Since any pattern of activity of the units corresponds to a point in activation space, these changes correspond to a movement of the initial point in this state-space. When the network settles, this point arrives at the attractor that lies at the bottom of the basin in which the initial point had landed; the inputs fed into the system are the initial conditions of the dynamic system. As a dynamical system settles into a mode depending on its initial conditions, so the neural network settles into the attractor state in whose basin of attraction the input falls. In this case, the input pattern matches the pattern of the attractor state. The concepts “attractor” and “basin of attraction” suggest a way of simulating the classical notion of symbol. The attractor basins that emerge as the network interacts with specific inputs might be construed to have symbolic-like properties, in that inputs with small variations that fall within the same attractor basin are pulled toward the same attractor (or cognitive state) of the system. 
Thus, various inputs (tokens) give rise to the same stable point of attraction, the attractor (type), which in this sense offers a dynamical analog of the classical symbol and of the notion of concept construed as a constant, context independent, and freely repeatable element that figures constitutively in propositional contents and corresponds to a lexical item. The dynamical “symbols”, unlike the symbols of classical cognitivism, are dynamic and fluid rather than static and context independent. The dynamic properties result from the dynamical nature of the activations of associative patterns of units. As the network learns and develops, the connection strengths continuously change. The same happens when new units emerge and old units “die” and the system reconfigures to maintain its knowledge and skills. All these cause changes in the original pattern in which an


attractor/symbol was created in the first place, and as a result, subsequent activations differ. This construal of ‘concept’ paves the way for discussing the possibility of nonpropositional conceptual representations in late vision. Even though the state transformations in a dynamical system are algebraic and not syntactic, and even though the states themselves have analog, continuous contents, the attractors, by pulling together the inputs that fall within their basin of attraction, perform a basic function of ‘concepts’, that of subsuming different tokens under the same heading (type), provided that these tokens fall within the same basin of attraction, which means that they share a number of features that suffice to land them within the same basin of attraction. Thus, various input states of (different tokens of), for example, cats are categorized by the system under the same attractor, which, thus, corresponds to the concept ‘cat’ despite the fact that no propositional states are involved and no discursive inferences have taken place. If concepts are applied in late vision in a way similar to pattern matching, especially if this occurs in the manner in which pattern matching takes place in dynamic neural networks, and not by means of discursive inferences, this raises questions concerning the nature of the hybrid representations in late vision. In particular, it leads one to wonder whether conceptual contents in late vision need be propositionally structured, with, perhaps, the exception of the recognitional belief that late vision outputs, which seems to be in the form of an occurrent thought. Since knowledge about the world stored in memory is used to form hypotheses concerning the identities of the visual objects, cognition plays a role in late vision. In addition, to the extent that for the percept to be formed these objects must have been recognized and object recognition presupposes the application of concepts, late vision involves concepts. 
It is also likely that some states in late vision have both NCC and conceptual content. Until the percept is constructed and a recognitional belief is formed, however, it is possible that no propositional structures exist in late vision despite the cognitive/conceptual involvement in perceptual processing. Following Burge (2014), one could call the concepts involved in such nonpropositional conceptual contents ‘pre-concepts’ and the relevant cognitive representations


‘pre-conceptual’. This, of course, presupposes a notion of ‘concept’ that is dissociable from propositional structures. Note that if some of the states of late vision could have conceptual contents that are not propositionally structured, the thesis that late vision does not involve discursive inferences is strengthened because such inferences relate propositional structures. Burge (2014, 574–575) argues for the existence of ‘pre-conceptual cognitive representations’, where ‘pre-conceptual’ is intended to convey that these representations are not propositionally structured.

Another type of (cognitive) pre-conceptual representation is formed through learning or other processing in long-term memory (modal or amodal). Even certain operations on working visual memory count as cognitive. Like perceptual-level representation, the foregoing cognitive representations are pre-conceptual in lacking propositional representational format. (Burge 2014, 574)

These pre-conceptual cognitive representations are perceptual because

Like perception, these types of pre-conceptual cognitive representation have the same structure as noun phrases constituted of contextual-determiner-dominated attributives— the structure of that F or those Fs. When representation occurs, the representational types are applied in a demonstrative-like manner. In all perceptual-level and most pre-conceptual cognitive level representations, such determiner-governed attributions are part of a complex iconic array. Visual perception consists in a rich, topographical array of demonstratively applied attributives, at various levels of specificity.

I will argue later on, in order to defend the thesis that late vision states are perceptual states the role of cognition and concepts notwithstanding, that there is an inextricable link between thought and perception in late vision that establishes the essentially contextual, in Perry’s (2001) and Stalnaker’s (2008, 78–82) sense, character of the states of late vision. Let me say here only that the indexical character of the states of early vision is interrelated with the fact that the components of the perceptual contents function as perceptual demonstratives


(Raftopoulos and Muller 2006; Raftopoulos 2009), in the way Burge describes in the citation above. The contents of the hybrid states of late vision comprise both iconic information and semantic, categorical information. If one were to take a snapshot picture of such a state, one would find a widely distributed activation in the occipital lobe, in the visual areas along the ventral stream, in the areas constituting the dorsal stream (such as the FEF in parietal cortex), as well as in areas involved in WM and VSTM. The former are responsible for the iconic content, while the latter are responsible for the semantic content. This hybrid content need not contain propositional structures, because the role of the semantic elements consists in sending top-down information concerning the identity of the visual objects. This information affects the iconic information in the iconic image by selecting the iconic information that is pertinent to the identity of the hypothesized visual objects, and if a match is found the visual object is recognized as such and such. A set of pattern matching mechanisms of the sort described by Kosslyn (1994) could accomplish this without a need to invoke any propositional structures. Since the top-down activation is guided by the concepts/categories activated in VSTM, which are demonstratively applied to the objects individuated in a visual scene, Burge (2014, 574–575) is certainly right to claim that “[v]isual perception consists in a rich, topographical array of demonstratively applied attributives, at various levels of specificity.” The reader should notice that this last trait of late vision, which is a direct consequence of the application of concepts, sets late vision apart from early vision. 
Even though in early vision object attributes are used to individuate the visual objects, the nonconceptual representations in early vision do not encode these attributes, that is, the attributes thus used are not attached to the objects so that they could be stored in VSTM and used to recognize the objects at a later time (Raftopoulos 2009; Chapter 2). In late vision, in contradistinction, owing to the role of concepts such encodings occur and, thereby, allow storage in memory. Thus, the states of early vision have iconic representational contents that are nonpropositional, while the states of late vision are perceptual states with hybrid iconic nonconceptual/conceptual representations,


which, perhaps, do not have propositional structure. Cognitive states that involve concepts, in contradistinction, are propositionally structured (Block 2014; Burge 2014). One should be careful not to take the inherent pre-conceptual structure of the states of late vision and the fully propositional structure of the cognitive states to entail that they have some sort of priority over the purely perceptual contents. Soames (2010, 107) has argued that “propositions, properly conceived, are not an independent source of that which is representational in mind and language; rather, propositions are representational because of their intrinsic connection to the inherently representational cognitive events in which agents predicate some things of other things.” To that, we could add that in their cognitive lives, agents predicate some features of objects because perception delivers some features in the environment as attributes of objects in perceptual contents.

4.4.2 Ambiguous Figures: An Exemplification of the Matching Process

I have claimed that the brain can be viewed as a dynamic neural network and that its processes can be described in terms of trajectories in the state- or phase-space of the system. These processes are vector completions and the algebraic transformations that the system performs are essentially activation pattern matching processes. I assume now that perceptual processes can be described thus, and proceed to give an example of how the perception of ambiguous figures could be described in terms of the function of a dynamical system that settles into attractors. Dynamic nonlinear systems exhibit a property that is very important in explaining the perception of ambiguous figures, namely, “intermittency”. If the brain is seen as a dynamical system, the fact that the brain consists of neurons that spike periodically entails that the neuronal behavior can be described in terms of the oscillations of the neuronal spiking and their frequencies. Empirical studies show that the gradual formation of the successive representations of the retinal input that eventually result in the perception of a pattern structure begins with the sensory activity of individual neurons. Organization occurs


through the linking of activity in local populations of neurons. The brain is a system of coupled oscillators (coupled in the sense that owing to neuronal connections/synapses the neurons interact and one entrains the other). It is well known, for example, that when the brain parses objects in the visual field and binds properties that are processed across different neuronal channels, this is done by means of the synchronous (in phase) oscillations of the neurons that code the respective features. Moreover, neurons in early visual areas with separate receptive fields oscillate in synchrony in response to two moving bars only if the two bars have the same orientation. In dynamic systems, models of synchronizing oscillatory activity typically use phase locking of periodic signals as a way to acquire synchrony (Van Leeuwen et al. 1997). The synchronicity of oscillatory activity could be a collective variable in a system that shows both stable and unstable behavior, depending on whether the perceptual pattern is unambiguous or ambiguous. As we have seen, a system moves from one attractor to another, a case of phase transition. This is a movement from one stable state to another, as the system tends to seek to rest in places of static equilibrium. In phase transitions, the mechanism that is responsible for the transitions uses an active process (a parameter change or fluctuation) to switch the system from one stable state to another. In each state there is an absolute coordination of the system’s components in which they have perfectly synchronized spiking frequencies, as where the neurons in the perceptual system are synchronized when objects are parsed and affect the binding of features in the visual scene to that specific object. This is a case of pure phase locking. 
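The contrast between pure phase locking and its loss can be illustrated with two coupled oscillators. This is a minimal sketch, assuming a Kuramoto-type coupling (a standard textbook model of synchronization, swapped in here for illustration and not taken from the models cited in the text). The function name `phase_drift` and all parameter values are hypothetical.

```python
import math

def phase_drift(omega1, omega2, coupling, dt=0.01, steps=10000):
    """Integrate two Kuramoto-coupled oscillators with the Euler method and
    return how much their phase difference changes over the second half
    of the run. A value near zero means the pair is phase locked."""
    th1, th2 = 0.0, 0.0
    half = steps // 2
    diff_at_half = 0.0
    for step in range(steps):
        if step == half:
            diff_at_half = th2 - th1
        # Each oscillator follows its own frequency plus a pull toward the other.
        d1 = omega1 + 0.5 * coupling * math.sin(th2 - th1)
        d2 = omega2 + 0.5 * coupling * math.sin(th1 - th2)
        th1 += dt * d1
        th2 += dt * d2
    return (th2 - th1) - diff_at_half

# Assumed values: the oscillators differ in natural frequency by 1 rad per time unit.
locked = phase_drift(1.0, 2.0, coupling=2.0)    # coupling exceeds the frequency gap
drifting = phase_drift(1.0, 2.0, coupling=0.5)  # coupling too weak to lock
```

In this model the phase difference obeys d(phi)/dt = (omega2 - omega1) - coupling * sin(phi), so a stable locked state exists only when the coupling is at least as large as the frequency difference. With the assumed values, `locked` stays close to zero while `drifting` grows steadily, which is the sense in which the intrinsic properties of the components can prevent absolute coordination.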
There are cases of relative coordination, however, in which the system’s components are relatively rather than absolutely coordinated, because the individual components have intrinsic properties that persist even when the components are coordinating with each other. Biological systems have this property. There is a tendency toward phase and frequency synchronization but sometimes the system slips from pure phase locking. Then intermittency arises as a property of the system. Instead of using an active process to switch from one state to another, “the system is poised near critical points where it can spontaneously switch in and out. Strictly speaking, in the intermittent regime it no


longer possesses any stable states at all” (Kelso 1995, 99). In these cases, the system is in a state of constant dynamic equilibrium in which its behavior can change spontaneously from one attractor to another. That the transitions from one state to another can occur spontaneously does not mean that changes in controlling parameters or their fluctuations do not cause state transitions. The property of switching spontaneously between alternative settling states is called “instability”, and the two alternative states are called meta-stable states. Ambiguous or bistable figures are usually two-dimensional figures that admit of two different organizations; depending on the organization, the viewer experiences one of two alternative, mutually exclusive percepts. One cannot see both figures at the same time, nor does one see an averaged figure, that is, a figure that results from an overlapping or weighted summation of the two alternative percepts. These figures are called bistable in that they give rise to two stable states in the perceptual system, namely the two alternative percepts. Research (Britz and Pitts 2011; Hochberg and Peterson 1987; Kawabata 1986; Kornmeier and Bach 2009; Peterson and Gibson 1991; Pitts et al. 2007) has shown that for each ambiguous figure there are some critical points such that focusing spatial attention on them determines the percept; that is, there are locations in the image fixations on which favor one or the other perceptual response, since the information contained there favors one or the other representation. Although these critical points are sufficient to cause the perception of a percept, they are not necessary, since one can see one or the other percept even if one focuses on some neutral region in the image. 
Furthermore, the simple introduction of a neutral fixation point, the middle of the duck/rabbit figure, for instance, does not stop figure reversals. The role of critical points as sufficient factors in determining the percept suggests that spatial attention can influence the way an ambiguous figure is perceived, a hypothesis that has received considerable experimental support (Long and Toppino 2004; Meng and Tong 2004; Toppino 2003). At this juncture, however, an elucidation concerning the precise role of spatial attention is needed. Fixation points automatically draw spatial attention to a specific location in the figure, which in


turn determines a certain organization of the image and, thus, the percept. Spatial attention may be drawn to a specific location in the image by a biasing cue. When this happens, there is an increase in the baseline firing rate of the neurons with receptive fields at the retinotopic position of the focus of spatial attention that prepares the neurons to process signals stemming from the focused areas. These shifts are independent of the stimulus; in fact, they are independent of whether a stimulus exists at the specific location (Murray 2008). These effects are called anticipatory effects and are established prior to viewing the stimulus. In this sense, they do not modulate processing during stimulus viewing but they bias the process before it starts via the increase in the baseline firing rates; they rig up, as it were, perceptual processing without affecting it online. Other research suggests that object/feature-based attention can also influence the perception of ambiguous figures (Leopold and Logothetis 1999; Long and Toppino 2004; Meng and Tong 2004; Toppino 2003) and, indeed, the latter, as well as selective spatial-attentional control, can override the bottom-up effects of fixation points when fixation conditions and the demands of intentional instructions are incompatible (for example, when participants are instructed to intentionally maintain a designated orientation of the Necker cube), or when the figure is small enough to allow for selective processing of different sets of focal features. However, this is done by causing covert attention shifts to other places in the image, where the features of the figure at that location that are selected induce a different organization of the image (Britz and Pitts 2011; Pitts et al. 2007). 
The phenomena of the reversals when one focuses on a neutral point in the image, or of choosing one interpretation of the image over the other even when one fixates on a neutral point may be explained by the role of cognitively driven selective attention, whether it be spatial or feature/object based that acts during late vision where the percept is formed. The same attention effects can explain why one can shift from one interpretation to the other in the first place. It suffices to shift covertly, either automatically or intentionally, from one critical location to another for the figure to reverse. Thus, the perception of ambiguous figures can be caused either by purely bottom-up factors, or by a


combination of both bottom-up, stimulus-driven factors, and top-down, cognitively driven factors. Reversals can also be explained in the absence of attention through the spontaneous activity of the visual system; perceptual, visual adaptation can explain such shifts. Perceptual adaptation is the phenomenon of the neural fatigue of the neurons encoding a given stimulus. When, for instance, one looks at something moving down, owing to the adaptation of the neurons that encode this movement, the threshold for detecting downward motion is raised (it becomes more difficult to see a downward motion), which biases the viewer toward seeing upward motion; this means that if a stationary object is presented afterward, it will appear to move upwards. Similarly, after a certain time during which a bistable figure looks a certain way, perceptual adaptation leads to the selection of the alternative interpretation. To simulate the confluence of the top-down and bottom-up factors, Long and Toppino (2004, 761) proposed a model of interacting neural networks. Top-down effects from higher-order, global processes affect representations in visual areas in IT and in the extrastriate cortex (V4 and MT) and may influence either the anticipatory activity before stimulus presentation, or the recurrent processing during stimulus viewing. These intermediate levels receive input from the feature-extraction level, presumably the early visual cortex. The locus of the resolution of the competing alternative interpretations of the ambiguous figure is at the intermediate levels (IT, V4, and MT), as evidence suggests that IT neurons almost completely conform to the subjective experience, while neurons in V4 and MT are associated with the transition from one interpretation to the other (Andrews et al. 2002; Leopold and Logothetis 1999). 
We have now in place the apparatus to discuss the way the perception of ambiguous figures can be described in terms of the pattern matching process taking place in neural networks and, thus, to offer an alternative to the view that the interpretation of ambiguous figures is a sort of a discursive inference that takes place in late vision and in which the perceptual system draws an inference from the propositions expressing the two alternative percepts and the propositions expressing the background information that is brought to bear by being activated by the concepts


that guide attentional focus to the relevant part of the image to the conclusion that the figure is a duck or a rabbit. This is usually expressed by the claim that the perceptual system outputs both interpretations and then cognition chooses one of them on the basis of all relevant information available, or by the claim that early vision outputs both nonconceptually formed figures, one of which will be selected in late vision, and the corresponding percept will be formed through an inference-like process in late vision. The bistable figures can be modeled on the basis of the idea of dynamic bistability, according to which when an ambiguous figure is being presented two stable states are formed in the phase space of the dynamical perceptual system. These two states coexist and each one of them has a basin of attraction. If the initial condition makes the input fall within the basin of attraction of the first state, then the first stable state is perceived; otherwise the second stable state is perceived. The two stable states can be thought of as two states of minimum energy toward which the system tends to settle. As the examination of some networks that model this process will show, the initial conditions that determine within which of the two basins of attraction the input will fall, thereby fixing the attractor into which the input will settle and, thus, the percept, express the effects of attention, which, as we have seen, determines the interpretation of the ambiguous figure. This account of the perception of ambiguous figures centers on the coexistence in the phase space of the system of two stable states, i.e., the two percepts. This does not entail, however, that the perceptual system outputs both interpretations, a fact that would seem to corroborate the view that perception outputs both interpretations and then cognition chooses one of them. 
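The idea of two coexisting minimum-energy states, each with its own basin of attraction, can be sketched with a one-dimensional system. This is a minimal illustration, assuming a double-well energy function V(x) = x^4/4 - x^2/2, which is a generic textbook example of bistability and not a model drawn from the literature cited here. The names `energy` and `settle_percept`, and the reading of x = +1 and x = -1 as the two percepts, are hypothetical.

```python
def energy(x):
    """Double-well energy with minima at x = -1 and x = +1 (assumed form)."""
    return x**4 / 4 - x**2 / 2

def settle_percept(x0, dt=0.01, steps=5000):
    """Follow the downhill direction of the energy from the initial condition x0.
    The sign of x0 decides which basin of attraction the input falls into."""
    x = x0
    for _ in range(steps):
        x += dt * (x - x**3)   # x - x**3 is minus the slope of the energy
    return x

# Assumed reading: x near +1 stands for one percept, x near -1 for the other.
# A small attentional bias in the initial condition is enough to select the outcome.
first_percept = settle_percept(0.2)
second_percept = settle_percept(-0.2)
```

Both minima exist in the state space throughout, yet each run ends in exactly one of them. This mirrors the point made above: the availability of two stable states does not mean that the system outputs both.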
The coexistence of two stable states in the phase space of the system means that both options are available to the visual system and the initial conditions just determine which one of these two states the perceptual system will settle in and, thus, output; it does not mean that the visual system outputs both. There are several models of the mechanism that may be responsible for the switch from the one state to the other. The one I consider here is the mechanism of phase transition and metastability that is entailed by the property of intermittency. Such models can have metastable states and exhibit state transitions (phase transitions if one factors in the


parameter time) that can be used to explain perceptual bistability and switching between alternative interpretations of the same stimulus. Such a neural network model has been developed by Van Leeuwen et al. (1997). The model is based on the knowledge that the perceptual system analyzes external input for low and high spatial frequency features and gradually organizes them in a globally coherent pattern, i.e., the percept. The salience of a feature determines the control parameter A (the collective variable used is the synchronicity of oscillatory activity of the nodes of the system) for each neuron/node in the system. The more salient the input is to a receptive field (when, for example, the input coincides with the preferred stimulus of the node/neuron the salience is maximal), the lower the value of A (A_min). Reducing A across a population of neurons results in the strengthening of the tendency for these neurons to couple and, thus, provide a stable percept. A_min, then, represents the figure in a visual scene, while A_max represents the background. Hence, synchronization of the system near A_min means that the representation of the figure becomes stable, while synchronization near A_max, which represents the background, is only temporary. In ambiguous figures, this means that the system synchronizes first in one stable interpretation, then it synchronizes temporarily in a non-meta-stable state, and then spontaneously switches to the other meta-stable state, that is, the other alternative interpretation of the ambiguous figure. The equations governing the behavior of the dynamic system when A = A_max and the system destabilizes, switching between synchronized and unsynchronized states (from a stable percept to no percept and then back to another percept), show that the interval between two successive switches varies stochastically, in accordance with experimental evidence concerning switches with ambiguous figures (Meng and Tong 2004; Long and Toppino 2004). 
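The stochastic variation of the intervals between switches can be illustrated with a much simpler system than the one just described. The sketch below does not implement the model of Van Leeuwen et al. (1997). It swaps in a generic bistable system driven by noise, chosen only because it shows spontaneous switching with irregular dwell times. The function name `dwell_times`, the noise level, and the switching thresholds are all assumptions.

```python
import random

def dwell_times(noise=0.7, dt=0.01, steps=100_000, seed=0):
    """Simulate dx = (x - x**3) dt + noise dW and return the durations
    spent near each attractor (x = -1 or x = +1) between switches."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    x, state = 1.0, 1              # start in the attractor at x = +1
    last_switch, dwells = 0, []
    for step in range(steps):
        x += (x - x**3) * dt + noise * dt**0.5 * rng.gauss(0.0, 1.0)
        # Hysteresis: count a switch only once x is well inside the other basin.
        if state == 1 and x < -0.5:
            state = -1
        elif state == -1 and x > 0.5:
            state = 1
        else:
            continue
        dwells.append((step - last_switch) * dt)
        last_switch = step
    return dwells

dwells = dwell_times()
```

No external control triggers the switches in this sketch; they arise from the internal dynamics together with noise, and the resulting list `dwells` contains intervals of quite different lengths.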
It is important to note that the system may go into the cycle of synchronized and unsynchronized activity driven by its internal dynamics and not by some external control. In other words, the self-organizing activity of the system can explain its behavior. When the coupling among the nodes is weak there is no well-defined percept, as the binding problem is not solved. When the coupling is strong there is a


uniform synchronization of all the nodes which means that the background stands out and the objects are absorbed into the background and not perceived. There is a range of optimal values for coupling and synchronization that allow the perception of a well-formed pattern structure. In neural networks, “such structures are the result of automatic generalization, due to the pattern recognition capacities of the adaptive weights” (Van Leeuwen et al. 1997, 339–340). As we have seen, in order for the system to be able to construct a percept, the synchronization must be within a range of parameter values. The automatic processes of the system can construct both local and global structures without strategic, top-down cognitive control, that is, without the need for attentional processes to integrate local features into a coherent perceptual structure. This means that the perceptual system upon receiving some input can, driven by its own internal dynamics, construct the percept without strategic attentional control. However, the need to constrain synchronization entails that the system must operate within the range of an attentional parameter that controls the number of units that can become synchronized. This is a sort of nonspecific attentional control whose role is not to reduce computational complexity and solve competition problems in terms of selecting neurons that encode one or the other feature, but to constrain the number of units that can become synchronized, so that no distorted patterns or no patterns at all are constructed. 
Thus, the percept that prevails, that is, the way an ambiguous figure is perceived, is determined by the focus of spatial attention, because spatial attention is the factor that constrains the number of neurons that are activated and can become synchronized by increasing the activation of the neurons whose receptive fields fall within the area of focus and by decreasing the activations of neurons with receptive fields outside the focused area. This entails that the features of the image at the focused location receive enhanced processing and, thus, it is they that determine the organization of the image and the ensuing percept. Another simulation (Borisyuk et al. 2009) that also uses phase transitions and meta-stability to model the perception of ambiguous figures posits a central unit surrounded by peripheral units. Each time an ambiguous figure is presented to the system, depending on the initial


viewing conditions, the central unit co-opts, by partial synchronization, a group of peripheral oscillators and one interpretation is seen. Owing to its internal dynamics and the presence of noise in the sensory channels that renders the bursting activity and synchronization of the neurons irregular, this regime spontaneously switches so that the central unit becomes partially synchronized with another group of peripheral units and the alternative interpretation is seen. The presence of the central unit that selects subgroups of peripheral modules allows the model to reflect the hierarchical organization of information processing in the brain. The central module can be seen as simulating the parietal and frontal areas that are known to be involved in the awareness and switching, respectively, accompanying the perception of ambiguous figures, while the peripheral nodes model the neuronal assemblies in the visual areas of the brain. This way one can also model the effects of attention on the perception of ambiguous figures and, specifically, its effects on the temporal characteristics of the perceptual alternation process. The attentional modulation of perceptual switching is done through the modification of connection strengths between the central and peripheral units. The work by Nakatani and van Leeuwen (2006) further elucidates the role of attention in perceiving ambiguous figures. As we saw, attention modulates perceptual switching through the modulatory effects of the central unit, which can be seen as the complex system of parietal and frontal areas involved in perceiving ambiguous figures and the concomitant perceptual switching. Since these are areas associated with cognitive processes in the brain, the central unit models the cognitive influences on perceptual visual processing. 
Nakatani and van Leeuwen show that when these areas participate in ambiguous figure perception, there is a synchronized activity in right parietal areas that are responsible for perceptual awareness and in the right frontal areas that are related to perceptual flexibility and, hence, to perceptual switching. Their research shows two cycles of synchrony in the gamma band; the first occurs 800–600 ms, and the second 400–200 ms before button pressing. The same areas are also involved in top-down selective attention (Corbetta and Shulman 2002). The first period of synchronicity coincides with a drastic suppression of eye blinks, which is thought to be


related to attentional demands (Ito et al. 2003). The second period of synchronicity in the observed activity patterns in the fronto-parietal complex coincides with the maximum saccade frequency that reaches its peak at about 250 ms before the switch response. Since saccade frequency is associated with shifts of (overt) attention (Leopold and Logothetis 1999), the second period of synchronicity may reflect the final focus of spatial attention after a series of attentional shifts, which, by determining the critical points on the image that will be processed, also determines which interpretation of the ambiguous figure will be perceived. Nakatani and van Leeuwen (2006) also explored the role of the activity of frontal and occipital cortex during switching episodes. They found that the theta activity in the frontal cortex is a general characteristic of the processing activity of viewers that perform frequent switches when viewing an ambiguous figure but is not specifically related to perceptual switching. In contradistinction, the alpha band activity that is observed in the occipital cortex is specifically related to frequent switches. Increased theta band activity in the frontal cortex is related to the concentration of attention on a task and, hence, to the inhibition of eye blink (Yamada 1998). At the same time, increased alpha activity in the occipital cortex is known to be related to attention to the visual stimulus by enhancing the efficiency of information processing (Yamagishi et al. 2003). Specifically, recent evidence (van Kerkoerle et al. 2014) suggests that selective attention enhances the γ high-frequency rhythm activity in the striate cortex and also increases α low-frequency rhythm activity in this area. Importantly, V1 neurons have their α low-frequency rhythm activity increased only if their receptive fields fall on the background and not the target. 
Since γ high-frequency rhythm is related to the feedforward propagation of information, while α low-frequency rhythm is associated with feedback information, these results suggest (i)  that selective attention, by increasing γ activity, either boosts the feedforward flow of information from the affected neurons


to higher areas, or is an expression of the fact that there is a more efficient feedforward flow for attended stimuli; in either case, the findings can be described as the result of a gaining mechanism; (ii) that feedback signals to V1 act so as to suppress the activity of the neurons that encode the background, sharpening thus the activity of the neuronal population so that the target be preferred, a result supported by the finding that strong α waves are associated with decreased firing rates of V1 neurons. This is consistent with the view that spatial attention operates as or acts through a gaining mechanism, whereas feature/object-based attention acts as or operates through both a gaining mechanism that increases firing rates and a tuning mechanism that sharpens neuronal responses (Ling et al. 2009). Thus, the concentrating frontal and occipital cortex activity during perceptual switches signifies the crucial role of attentional modulation of the perception of ambiguous figures and its effects on the rate of perceptual switches. The pattern matching process can be modeled as a Bayesian inference, a model of how the perceptual system constructs the percept by resolving ambiguities in the input. The probability P(S/I) of a scene S, given the retinal image I, is proportional to the product of the probability of the scene P(S) prior to the retinal image, based on previous knowledge that manifests itself through the effects of cognitively driven attention, and the probability P(I/S) of the retinal image I, given the scene S. That is, P(S/I) ∝ P(S) × P(I/S); given that P(I) = 1, the proportionality is an equality. The main idea is that the information that is unavailable to the eyes but necessary to construct a stable percept can be estimated by combining the available retinal information with prior perceptual experiences stored in memory. P(I/S) may be seen as a reliability indicator expressing how much an interpretation of the image is impressed by the visual input. 
The more ambiguous the image, the less certain the inferences drawn and the smaller the P(I/S) and, thus, the less stable the percept, and the more probable a spontaneous perceptual change (Kornmeier and Bach 2009).
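
The Bayesian scheme just described can be illustrated with a minimal numerical sketch. The Python snippet below is an illustration only, not a model taken from Kornmeier and Bach (2009): the function name `posterior`, the two candidate readings of an ambiguous figure, and all probability values are hypothetical. It normalizes by P(I) explicitly rather than assuming P(I) = 1, and it treats an ambiguous image simply as one whose likelihoods P(I/S) do not discriminate between the readings.

```python
def posterior(priors, likelihoods):
    """Return P(S/I) for each candidate scene S using Bayes' rule.

    priors:      dict mapping scene -> P(S)
    likelihoods: dict mapping scene -> P(I/S) for the one observed image I
    """
    # Unnormalized posterior: P(S) * P(I/S) for each scene
    unnormalized = {s: priors[s] * likelihoods[s] for s in priors}
    # P(I), obtained by summing over all candidate scenes
    evidence = sum(unnormalized.values())
    return {s: value / evidence for s, value in unnormalized.items()}


# Hypothetical values: two readings of a bistable figure.
flat_prior = {"duck": 0.5, "rabbit": 0.5}

# An image that clearly favors one reading.
clear = posterior(flat_prior, {"duck": 0.9, "rabbit": 0.1})

# An ambiguous image: the input does not discriminate between readings.
ambiguous = posterior(flat_prior, {"duck": 0.5, "rabbit": 0.5})

# Same ambiguous image, but prior knowledge favors one reading.
biased = posterior({"duck": 0.8, "rabbit": 0.2},
                   {"duck": 0.5, "rabbit": 0.5})

print(clear, ambiguous, biased)
```

With a flat prior, the ambiguous image leaves the two readings tied, which is the situation in which a spontaneous perceptual switch is most probable; a prior that favors one reading, standing in here for cognitively driven attention, breaks the tie.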

308     A. Raftopoulos

5 Late Vision and Discursive Understanding

Even if late vision does not comprise discursive inference and involves a pattern matching process that ensures the best fit with the available data, it is still arguable that late vision should be better construed as a stage of discursive understanding rather than as a visual stage. If object recognition involves forming a belief about class-membership, even if the belief is not the result of an inference, one could maintain that recognizing an object is an experience-based belief, which is a case of understanding rather than vision and constitutes a pure thought that belongs to the space of reasons. There are several reasons for which this thesis should be rejected.

5.1 Late Vision Is More Than Object Recognition

First, late vision involves more than a recognitional belief, which is the final stage immediately following the formation of the percept. Suppose that S sees an animal and recognizes it as a lion. In the parallel preattentive early vision, the proto-object6 that corresponds to the lion is being represented among the other objects in the scene. The relevant activations enter the parietal and temporal lobes, and the prefrontal cortex, where the neuronal assemblies encoding the information about lions are activated and this activation spreads through top-down signals to the visual areas of the brain where the iconic image stores the proto-objects extracted from the visual scene. The activations of the cells encoding the proto-object corresponding to the lion and its properties are strengthened and win the competition against the assemblies encoding the proto-objects corresponding to the other objects in the scene. After a proto-object has been selected, the object recognition system forms hypotheses concerning the identity of the object. For the viewers’ confidence to reach the threshold that will allow them to form beliefs about

6 For an elaborate treatment of proto-objects, see Raftopoulos (2009).

the identity of the object and report it, these hypotheses must be tested (Treisman 2006). To test these hypotheses the visual system explores the presence in the iconic image of features and regions that would confirm or disconfirm the hypotheses. Conceptual information about lions affects visual processing and after some hypothesis testing the animal is recognized as a lion through the synergy of visual circuits and WM. At this point the explicit belief “O is F” is formed. This occurs after 300 ms, when the viewer consolidates the object in WM and identifies it with enough confidence to report it, which means that beliefs are formed at the final phases of late vision. The conceptual modulation of visual processing and the process of conceptualization that eventually leads to object recognition, however, begins at 130–200 ms. There is, consequently, a time gap, between the onset of conceptualization and the recognition of an object, which is a prerequisite for the formation of an explicit recognitional belief. Although the formation of hypotheses concerning the categorization of objects may occur within 130–200 ms poststimulus onset (the timing depends on the saliency of the object), it takes another 100 ms for subsequent processes to bring this information into awareness so that the perceiver could be aware of the presence of an object and be able to report it (Treisman and Kanwisher 1998). To form the recognitional belief that “O is F,” one should be first aware of the presence of an object token and construct a coherent representation of that object. This requires the enhancement through attentional modulation of the visual responses in early visual circuits that encode rich sensory information in order to integrate them into a coherent representation, which is why beliefs are delayed in time compared with the onset of conceptualization. It follows that not all of the late vision involves explicit beliefs.

5.2 Late Vision as a Synergy of Bottom-Up and Top-Down Information Processing

The second reason why the beliefs, or occurrent thoughts, formed in late vision are partly visual constructs and not pure thoughts is that late

vision constitutively involves visual circuits. Pure thought involves primarily an amodal form of representation formed in higher centers of the brain, even though these amodal representations can trigger in a top-down manner the formation of mental images, and could be triggered by sensory stimulation. Amodal representations can be activated without a concomitant activation of the visual cortex (see Prinz’s [2002] notion of default concepts that are amodal representations). The representations in late vision, in contrast, are modal since they constitutively involve visual areas and, thus, should be deemed as perceptual. What distinguishes beliefs in late vision from pure thoughts, however, is not so much their modal or amodal character (pure thoughts can also be accompanied by some sort of phenomenology, whether it be cognitive phenomenology or visual phenomenology since there is evidence that thoughts can activate in a top-down manner the visual areas of the brain and, elicit, thus, visual phenomenology), as the fact that the beliefs in late vision are necessarily formed through a synergy of bottom-up and top-down activation and their maintenance requires the active participation of the visual circuits, which is another way to say that the processes are controlled by the stimulus. Pure thoughts, in contrast, could be activated and maintained in the absence of any activation in visual circuits. As we have seen, in late vision concepts are applied to the phenomenal content delivered by early vision directly and spontaneously without any inferences involved, and they refer directly to the objects in the visual scene and their properties. The application of concepts allows tentative categorizations of the perceptible objects and the subsequent testing of the categorizations made on the basis of predictions based on these categorizations. 
Late vision is the visual stage in which bottom-up and top-down processes collaborate to form the percept and give rise to perceptual beliefs. There is perhaps no better way to portray this synergy than in the formation of the 3D representation of an object, that is, the representation of an object that includes the part that is hidden from the viewer owing to the fact that perception is only perspectival. As we have seen, Jackendoff (1989), Spelke (1988), and Rock (1983) think that the representation of the whole object (the 3D representation) in perception is the result of a discursive inference. In contrast to this view, I argued

that even though the 3D representation is the result of the application of, mainly, recognitional concepts pertaining to objects and properties, there is nothing inferential in the application of the concepts, since they are applied directly and spontaneously. Let me elaborate on this. Upon perceiving, say, a horse behind a fence, a perceiver does not see the whole horse because some parts of it are blocked from view by the slats of the fence and, thus, no information impinges on the retina emanating from these hidden parts of the horse. It follows that the phenomenal content of the perceiver’s experience represents the seen parts of the horse in a nonconceptual way and in this sense the hidden parts are not actually present in the phenomenal content of the experience. Let us suppose that after the animal has been identified as a horse, the background beliefs concerning the way the hidden parts look do not activate the visual cortex through imagination and, thus, the hidden parts of the horse do not figure in the phenomenology of the scene. Even so, there is a sense in which the whole horse is a constituent of the perceiver’s experience despite the fact that there is no visual sensation corresponding to the whole horse. As Noë (2004, 60) puts it, “one experiences the presence of that which one perceives to be out of view.” The presence of the representation of the whole horse is the result of the application of the concepts of the horse and its properties. It is as if the representation of the whole horse is superimposed on the phenomenal content of perception to complete the full representation of the horse in late vision. This being the case, there are two sorts of consciousness involved in the representation of the horse in late vision, namely, the phenomenal consciousness, the ‘thing awareness’ of the horse, and the conceptual consciousness in the form of content that is accessed by cognition, the ‘fact awareness’ of the horse.
The constitutive reliance of late vision on visual processing entails that late vision relies on the presence of the object of perception; it cannot cease to function as a perceptual demonstrative that refers to the object of perception, as this has been individuated through the processes of early vision (Raftopoulos and Muller 2006; Burge 2010, 542). Thus, late vision is constitutively context dependent since the demonstration of the perceptual particular is always context dependent. Thought, on the other hand, by its use of context-independent symbols, is free of the

particular perceptual context. Even though both recognitional beliefs in late vision and pure perceptual beliefs involve concepts (Burge’s [2010] pure attributive elements), the concepts function differently in the two contexts. As Burge (2010, 545) claims:

perceptual belief makes use of the singular and attributive elements in perception. In perceptual belief, pure attribution is separated from, and supplements, attributive guidance of contextually purported reference to particulars … Correct conceptualization of a perceptual attributive involves taking over the perceptual attributive’s range of applicability and making use of its (perceptual) mode of presentation.

Note that the attributive and singular elements in perception correspond to the perceived objects and their properties and not to concepts concerning these objects and properties. The attributive elements (properties in perception) guide the contextual reference to particulars (the objects of perception) since the referent in a demonstrative perceptual reference is fixed through the properties of the referent as these properties are presented in perception—what I have called the nonconceptual mode of presentation of the object in perception (Raftopoulos and Muller 2006). Hence, the attributive elements belong to the nonconceptual content of perception (Burge 2010, 538). Concepts enter in their capacity as pure attributions that make use of the perceptual mode of presentation. Burge’s view that in perceptual beliefs pure attributions supplement attributions that are used for contextual reference to particulars entails that perceptual beliefs are hybrid states involving both visual elements (the contextual attributions used for determining reference to objects and their properties) and conceptualizations of these perceptual attributives in the form of pure attributions. In this case, the role of perceptual attributives is ineliminable and, thus, Burge’s perceptual beliefs map onto my recognitional beliefs of late vision. In late vision, unlike in pure beliefs, there can be no case of pure attribution, that is, of attribution of features in the absence of perceptually relevant particulars since the attributions are used to single out these particulars. There seems to be a difference between the account developed in this chapter and Burge’s view, to wit, that I talk about the involvement of

concepts and cognition in the processes of late vision (Burge and I agree that cognition enters the picture because of the role of working memory in late vision), whereas Burge talks about cognitive representations that are, nevertheless, pre-conceptual. However, if one bears in mind, on the one hand, that Burge uses the term ‘pre-conceptual’ to convey the sense that these representations are not propositionally structured and not that they do not involve concepts or attributions, and that, on the other hand, concepts figure in my account with the proviso that they may not be involved in, or presuppose the existence of, propositional structures, one readily sees that there is no real disagreement. Burge and I agree that in perception no propositional structures exist and that even though cognitive resources are employed in perception, these are only in the service of the formation of states whose function is to individuate and identify particulars in a visual scene. In pure cognition, on the other hand, the states operate on already individuated and identified particulars. The only difference is that I think that when the percept is constructed, a recognitional belief, a seeming, that, as such, is propositionally structured is formed at the end of late vision, whereas Burge denies the existence of propositional structures in perception, apparently placing such recognitional beliefs in the realm of pure cognition, which employs such structures. The concepts that figure in perceptual recognitional beliefs in late vision need not correspond to perceptual attributives, that is, they need not be restricted to concepts that late vision employs when it takes over the mode of presentation of the perceptual content.
Visual systems have perceptual attributives for features such as shape, size, spatial relations, color, motion, orientation, texture, and affordances (Pylyshyn 2003; Raftopoulos 2009; Burge 2010, 546), which are matched (partly, because one does not have concepts for all perceptual attributives) by the salient concepts. However, they do not have perceptual attributives for tigers, yet one does have perceptually based beliefs about tigers. They are perceptual in that even though they do not conceptualize perceptual content and do not take over the mode of presentation of perceptions (category membership does not have a perceptual mode of presentation), they depend for their empirical applications on perceptual attributives (the concept “tiger” depends for its application on perceptual attributives such as size, shape, and color).

I said that visual systems do not have perceptual attributives for category membership, which means that these higher-order properties cannot be visually represented; one does not perceive, say, tigerness, contrary to what Bayne (2009) and Siegel (2006) argue. Let me explain this. The fact that late vision outputs recognitional beliefs that are not pure beliefs does not entail that one has visual awareness of the high-level properties that figure in the recognitional beliefs. The IT cortex (which is the highest visual area) may represent objects in 3D, their 2D projections, viewer-centered representations, viewer-independent representations, whole objects, and parts of objects, but not category membership. One has cognitive access awareness (CAA) of higher-level properties. (CAA is about perceptual content that is accessed by cognition becoming available to introspection and refers to episodes of thinking about the contents of one’s perceptual experience.) These beliefs are inextricably linked to a perceptual context but this does not entail that there is a visual phenomenology of category membership. It means, however, that the belief modulates top-down the processing in the visual areas of the brain and enhances the activation of the visible features that knowledge of the category membership highlights. Thus, having recognized an object affects the perception of some of its visible features by changing their representation and phenomenology, but one does not have visual awareness of high-level features of objects. The essential link between perceptual beliefs or thoughts and a perceptual context, which explains the phenomenological character associated with the perceptual belief or thought that late vision outputs (a belief which, as I have explained, should be distinguished from the corresponding pure thought that belongs to the space of reasons), is bequeathed to late vision by early vision in which a direct link with a visual scene is established (Raftopoulos 2009).
This link creates the line through which information is retrieved from the visual scene directly and not through any description associated with the visual scene. This allows one to claim that the relation between early vision and a visual scene is a liberalized form of Russell’s notion of ‘acquaintance’. The retrieval of information from the scene in this direct way and, therefore, this direct link to the visual scene, is passed on to late vision despite the fact that the tentative hypotheses formed in late vision certainly are

formed on the basis, and provide a sort, of descriptions pertaining to the putative objects in the visual scene and their properties. This is so because although these descriptions play an important and irreducible role in the formation of the hypotheses, the testing of these hypotheses is made on the basis of information retrieved directly, i.e., in a nondescriptive way, from the visual scene. It is for this reason that one could affirm of late vision Evans’s (1982, 64) view

Of a more ‘intimate’, more ‘direct’ relation in which a subject may stand to an object (a situation in which the subject would be ‘en rapport’ with the object), and the idea that when a subject and his audience are both situated vis-à-vis an object in this way, there exists the possibility of using singular terms to refer to, and to talk about, that object in quite a different way, expressing thoughts which would not have been available to be thought and expressed if the object had not existed.

Evans’s view is justified because, as I have argued (Raftopoulos 2009, ch. 6), early vision is characterized by the use of nonconceptual perceptual demonstratives that function as indexicals. The inextricable link between thought and perception in late vision establishes the essentially contextual, in Perry’s (2001) and Stalnaker’s (2008, 78–82) sense, character of the beliefs formed in late vision. The proposition expressed by the belief is not detachable from the perceptual context in which it is believed, and cannot be reduced to another belief in which some objective content from the perspective of a third person is substituted for the indexicals that figure in the thought (in the way one can substitute via Kaplan’s characters the indexical terms with their referents and get the “objective” truth-evaluable content of the belief). This is because the belief is tied to an idiosyncratic viewpoint of the viewer by making use of the viewer’s physical presence and occupation of a certain location in space and time; the context in which an essentially indexical thought is believed is essential to the information conveyed. There are not, to follow Stalnaker (2008, 86–87), some relevant objective facts that the person (S1) who entertains the objective thought that purports to express the essentially indexical content has to learn in order to entertain the same content as S2 who uses the essentially contextual thought.

This means that the way the world is thought by S2 is different from the way the world is thought by S1 not because there are some different facts the two thoughts are about, but because S1’s and S2’s perspectives on the same facts are different. It follows that the singular element in the perceptual content “is an occurrent context-bound application of “that” referring to a non-repeatable property-instance such as an object or event or a trope” (Block 2014, 560, ft.1). Perception has the demonstrative reference force of “that object” and, thus, perceptual objects are determined relationally (Burge 2010). For an object to be an object of a perceptual state it must stand in a certain kind of relation to that state. Being acquainted perceptually with an object means that one is in direct contact with the object itself and retrieves information from it and not through a description (Burge 1977). Perception puts one in a de re relationship with the object (as opposed to a descriptivist relationship). Since recognitional beliefs rely on the presence of the object (reference to the object is fixed through a demonstrative as in “That x is F”), they are de re beliefs. Pure perceptual beliefs, on the other hand, have their referents fixed through a description of the object in memory. The de re relationship to a visual object eventually results in the formation of a de re belief about it. The outcome of late vision is a de re belief tied into a perceptual context. In contradistinction, pure thoughts and the pure attributions they render possible can be used outside any perceptual context and they are descriptivist beliefs.7 It is sometimes argued that the main difference between thoughts and perceptions is that perceptual experiences, unlike pure thoughts, have an essential sensory or phenomenal quality to them (Coates 2007; Dretske 1993, 436; Sellars 1977, 1981). As Phillips (2017, 27, ft.20)

7 In a de re belief, one retrieves information from the object itself and not through a description. In late vision where information in WM guides the formation of hypotheses about object identity, these hypotheses are based on descriptions in addition to visual information, since the knowledge stored in memory is a description of the object. Thus, the ensuing recognitional belief is based on a combination of information deriving from the object and from a description of it in memory. It is not a pure de re belief.

puts it, possessing a certain type of phenomenology unifies the class of broadly construed perceptual states and distinguishes them from the class of narrowly defined cognitive states, where the class of ‘broad perceptual states’ encompasses both pure perceptual states (what I call the states of early vision) and quasi-perceptual states, and the class of narrowly defined cognitive states includes what I have called ‘pure thoughts’. One could put this in the following way: there is a fundamental difference between the way the properties of objects in a visual scene are presented to a perceiver in the phenomenal content of the perceiver’s experience and the way these same properties are represented in pure thought. Although the amodal character of cognitive states as opposed to the modality-specific character of perceptions is a good place to start to explain this difference, one could object that thoughts are not, in a sense, necessarily purely amodal, since they may be accompanied by experiences that have a phenomenal character. The thought “the orange is round and yellow” may have a modality-specific content, in that when one holds this thought, visual areas of one’s brain encoding color and shape may also be activated (Prinz 2002). However, things are complicated. First, this activation does not entail that there is necessarily a visual awareness of these features. Second, there is a large literature on this issue with conflicting results. Third, even if some phenomenal content could accompany some thoughts through the function of mental imagery that activates in a top-down manner the visual areas of the brain, this phenomenal content is not essential to the pure thought; one could have had the exact same thought without any accompanying phenomenology. In perceptual experience, in contrast, the phenomenal content/character is an essential part of it.
Sellars (1981) attempts to capture the essential role of phenomenal content in perceptual experience as opposed to its absence in pure thoughts by claiming that the phenomenal properties involved in seeing have an actual existence (in the sense that a visual scene is made manifest to a perceiver through this phenomenal content, which is thus actualized and present in her experience), while, in thought, what is believed has intentional inexistence. Presumably, on account of the fact that the content of a perceptual thought merely represents a visual

scene and has no presentational character in that the represented properties are not presented to the perceiver as being in some way (which is another way to say that there is no phenomenology essentially associated with the thought), the objects and properties involved are the intentional objects and properties of the thought and in this sense they exist only intentionally and not actually; hence, they are intentionally inexistent. Coates (2007, 41) attempts to explicate Sellars’s distinction in the following way: when a perceiver has a visual experience the world is presented to her in some way and in this sense something is actually present to her from the perceiver’s perspective. This means that the perceiver may be in some state that presents the world to her as being some way and, thus, that there exists something that is actually present to her, even though there may be nothing in the environment that corresponds to the content of her experience, as in the case of hallucinations. In contradistinction, when a thinker entertains a perceptual thought, the content of the thought that she is aware of “is sometimes referred to as an ‘intentional in-existent object’. But such intentional objects lack any real existence; they are nothing but ways of describing the content of the belief. Nothing exists … that has the phenomenal quality …” There is, however, a rather obvious problem with Coates’s analysis. In hallucinations, where the object of which the perceiver hallucinates does not exist in the world, the object of the hallucinatory state exists only as the content of the perceiver’s state and, thus, has only an intentional existence, exactly like the content of a thought; it is intentionally inexistent. Coates attempts to evade the problem by arguing that in hallucinations there is something, perhaps a state that has some phenomenal qualities, while in thinking such a state does not exist.
Coates, however, cannot object that hallucinatory states qua states with phenomenal character have a presentational character which renders the content actually existent, while thoughts qua states do not have such a character and, therefore, their contents are intentionally inexistent, because the distinction between states with phenomenal character and states without an essential phenomenal character is what the distinction between actual existence and intentional inexistence is supposed to explain and, therefore, it cannot presuppose the former distinction.

The distinction between actual existence and intentional inexistence stems from Sellars’s view that the phenomenal content of a perceptual experience is an intrinsic component of perceptual experience that differs from the experience’s representational or intentional component. Against Sellars, I think that phenomenal contents are representational or, at least, have a representational component, but I will not pursue this issue any further. I will only note here that if phenomenal contents are representational, and in this respect they are not different from the contents of thoughts, the essential difference between their irreducible phenomenal content or character and the non-modal character of thoughts remains and must be accounted for. In fact, the essentiality of phenomenal character in perceptual experience signifies a firm distinction between perceptual beliefs or thoughts and pure beliefs or thoughts as well, as I explain in the concluding discussion in this chapter.

6 Beliefs: Take Two

If the recognitional beliefs formed in late vision are not endorsed to become judgments, they are in some sense hypotheses. Suppose that upon viewing a scene containing an object O, S comes to believe that O is F. Since things may not be as they seem, S refrains from judging that O is F; S does not endorse the content of her perceptual belief. How is this recognitional belief different from the hypotheses or implicit beliefs that are constructed during the earlier stages of late vision in order to establish the identity of the object beyond the fact that the one is explicit, while the other is implicit? In my view, the main difference consists in that the early hypotheses are tested against the iconic information stored in visual areas. This is an unconscious process that is outside the control of the viewer who is usually aware only of the content of the winner, that is, the content of the explicit recognitional belief. However, the recognitional belief of late vision must be tested against a different sort of evidence in order to become a judgment. It must be tested against other sorts of propositional structures, that is, pure beliefs in which the predicate terms function as pure attributions. The aim of the testing is to put aside various

possible defeaters of the belief. For example, viewers have to decide whether they are the victim of some hallucination, etc. The processes involved in this testing may be available to the viewer’s consciousness, they are usually under her control, and they have the form of inferences from propositional contents to propositional contents, unlike the processes in late vision. The viewer tries to determine whether she should take the content of her late vision at face value. This is why testing the recognitional belief against other pure beliefs is a discursive process that is within the space of reasons, whereas testing the implicit hypotheses to come up with a recognitional belief belongs to late vision. In this sense, the recognitional beliefs formed in late vision are at the interface between the space of reasons and the perceptual space and, thus, have a pivotal role to play in accounts of justification of perceptual judgments. I can explain now my claim that a belief is a dispositional state as opposed to a judgment that is an occurrent state. I tried to express the thought that perception, through its seeming that is formed in late vision, gives us a prima facie inclination to believe that O is F but other evidence may override this and preclude us from forming the judgment that O is F. For example, some illusions give us a prima facie reason to believe that O is F but we do not endorse this because we do not believe that O is F. Undoubtedly, when O appears F in one’s experience, one is inclined to form (this is what I mean by “prima facie”) the recognitional belief that O is F. However, one need not endorse that thought. That O appears F in one’s experience should not be equated with one endorsing that O is F. To do that, one has to consider other relevant beliefs. Thus, to transform the belief to a judgment, one has to integrate it in the nexus of other beliefs, putting it, thus, within the space of reasons. 
This is possible because the recognitional belief already has a propositional structure. There are two notions of belief here. The one is related to the expression of the content of a conceptual perceptual state, the recognitional belief, and the other is constitutively related to the notion of judgment. The relation of the belief in the first sense to late vision contents is not inferential. The relation of the same recognitional belief with the nexus of other beliefs is an inferential relation; if endorsed, the belief becomes a judgment. The belief is, thus, a disposition to make judgments

5  Early and Late Vision: Their Processes and Epistemic Status     321

(McDowell 1994, 60) that do not introduce some new content but simply endorse the content of the recognitional belief. One might object that in the first case it would be wrong to use the term ‘belief’ to denote the thoughts that are formed as hypotheses concerning the identity of visual objects, because a mental item can properly be characterized as a belief only if it is sensitive to revision from non-perceptual sources of information, whereas these hypotheses are tested and eventually revised in the light of the nonpropositional visual information contained in early vision. Although this is certainly true, as is clearly shown by the fact that our belief that the Müller-Lyer configuration creates an illusion does not alter the fact that the content of our relevant perceptual state supports prima facie the thought/belief that the two lines are unequal in length, still, being prima facie, this belief can be revised after it has been formed in the light of non-perceptual information (a measurement, for example, of the length of the two lines). Thus, the fact that the hybrid mental state that late vision elicits enters the space of reasons, having the requisite structure owing to the conceptualization process described above, allows the usage of the term ‘belief’ to denote it.8

Johnston (2006, 282) argues that the judgments that perception outputs are not inferentially based on perceptual content. “My judgment does not go beyond its truthmaker, which sensory experience has made manifest. Its truth is thus guaranteed by its origins. This is how immediate perceptual judgments often have the status of knowledge. There is no evidence from which they are inferred; instead they are reliably formed out of awareness of their truth maker, often in the absence of any evidence to the contrary.” Johnston talks about immediate perceptual judgments, whereas I talk about recognitional beliefs that may or may not become judgments. Johnston’s view that perceptual judgments

8 The view that the result of the conceptualization process occurring in perception can be described as a prima facie belief is explicitly endorsed by Gauker (2012, 45), and implicitly assumed by most of those who discuss the epistemic role of perception in rationally supporting beliefs.

322     A. Raftopoulos

are not inferred from perceptual evidence is correct. Our difference stems from considerations pertaining to the phrase “often in the absence of any evidence to the contrary.” I have claimed that to examine possible evidence against a recognitional belief, the belief must be inferentially tested against other pure beliefs (perceptual or otherwise). Only when it passes the test does it become a judgment. Thus, I qualify Johnston’s view that perceptual judgments are not inferred from any evidence by distinguishing between perceptual beliefs and perceptual judgments and by adding that the former, as outputs of late vision, are not inferred from any evidence, but that to become judgments they have to enter into inferential relations with possible defeaters.

My notion of perceptual, recognitional beliefs or occurrent thoughts that are formed in late vision and are not endorsed to become judgments corresponds to Tucker’s (2010) notion of ‘seeming’ and McGrath’s (2013a, b) notion of ‘nonreceptive seeming’ (except, of course, that unlike McGrath and Tucker I do not hold that the occurrent thoughts in late vision are the results of inferences or quasi-inferences from some perceptual data, which in McGrath’s case would be the receptive seemings) in that they are both seemings, that is, they essentially have a phenomenal character and are, thus, perceptual contents and not thought-contents, and are also conceptually/propositionally structured. My conception of recognitional beliefs as dispositional states with propositional contents and phenomenal characters also conforms to McAllister’s view (2018) that seemings are mental states that have propositional contents and phenomenal characters (the Experience View of seemings).
McAllister attacks the view that seemings are beliefs, more precisely that a seeming that p is a belief that p, but his criticisms (McAllister 2018, 3082–3086) target a conception of belief according to which a belief is a mental state that is endorsed, and as such do not apply to my dispositional view of ‘belief’. My view of recognitional beliefs as dispositional might make one think that I endorse the Inclinational View of seemings, according to which a seeming that p is a conscious inclination to believe that p. This, however, is far from my view. A recognitional belief, in my view, is the cause of such inclinations and, thus, it cannot be that inclination since causes differ from effects. Moreover, being dispositional in the way explained above does


not entail that one has a recognitional belief whose content one is inclined to believe. To return to our main theme, I do not claim that recognitional beliefs are always tested in this way to become judgments. Under normal conditions they are not tested at all. One might argue, however, that the absence of testing means that the viewer thinks that there is no reason to doubt the recognitional belief, which in itself is a sort of implicit inference. Or, one might think that in these normal cases, the recognitional belief becomes automatically a judgment without any inferential involvement. Still, the distinction holds because on certain occasions the recognitional belief is inferentially tested against other beliefs in order to become a judgment and, thus, recognitional beliefs and perceptual judgments belong to different categories, the first being a state that has the potential to become a judgment, even if the potentiality is actualized on certain occasions automatically.

7 Late Vision, Amodal Completion, and Inference

Nanay (2010) thinks that mental imagery is necessary to account for amodal completion. Nanay (2010, 252) also thinks that amodal completion in some cases is accompanied by some sort of phenomenology subserved by the activation of the early visual areas. In this case, the hidden parts and features of an object are not merely believed in but are present in the object of perception as actualities by being imagined.9 Moreover, even in cases of amodal completion that is not accompanied by some sort of phenomenology, the hidden parts or features are perceptually represented. This provides us with the opportunity to delineate further the distinction between visual awareness and visual understanding and to explain why late vision is a case of visual awareness. Briscoe (2011, 165–167) argues that although imagery is sufficient for amodal completion, it is not necessary, since one could either C-complete a visual scene by forming beliefs about the hidden parts of an object based on its visible features without projecting a mental image (the belief-based account of C-completion), or amodally complete a scene in bottom-up perceptual ways, in the way explained in the third section.10 Briscoe (2011) remarks that there are cases of C-completion, for example, the 3D sketch of an object whose backsides are hidden from view, which are cognitively driven in that to complete the hidden parts the viewer must draw from object knowledge. This may produce activation of the visual cortex, such that one has a mental image of the hidden parts, or it may produce simply a thought that there are some parts hidden from view without any mental images, or it may produce both (Briscoe 2011, 158). If the visual cortex is involved in C-completing the picture, there is a synergy of bottom-up and top-down processes. 3D completion occurs in late vision, where certain visual processing areas are activated.

9 Nanay’s wording is very similar to Sellars’s (1977) explanation of the way the unseen parts of objects participate in the sense-image-model of the object that unifies contributions from the senses and the imagination. According to Sellars (1977), “But while these features are not seen, they are not merely believed in. These features are present in the object of perception as actualities. They are present in virtue of being imagined.” Sellars, unlike Nanay, however, thinks that the abovementioned remark applies to all cases of perception and not just to those in which imagination activates the visual cortex and gives rise to some sort of phenomenology, which is probably why Sellars says that these features are not seen. This immediately creates the problem (discussed by Coates 2007, 175–176) of how one should understand the claim that the unseen features of objects are present in the object of perception as actualities even in those cases in which imagination does not activate the visual cortex. When it does, there is a clear sense in which these properties are present in perception as actualities, as Nanay remarks, but when it does not, their presence as actualities needs explanation. Coates (2007, 177) proposes, convincingly, that one way to understand Sellars is to think of the ‘presence as actuality of these features’ in terms of dispositional presence in the form of anticipations of the ways our phenomenal experience could be transformed should we, or the objects, move around and change the perspective from which the perceiver views the object. Such a change in perspective could render the hitherto unseen hidden parts of the object perceptible properties of the objects, which, thus, may be phenomenally experienced.

10 Note that Nanay (2010, 244) seems to talk about a perceptually driven amodal completion that is insensitive to other beliefs.


If C-completion involves a pure perceptual thought about the hidden parts that results from an inference based on past experience and the current visual evidence, this is a case of visual understanding and not of visual awareness. I do not think that this possibility undermines my thesis that seeing the 3D sketch takes place in late vision. First, it is not clear whether there is empirical evidence for C-completion through pure thought and in the absence of any activation in visual areas. Second, if there are such cases, this only shows that sometimes C-completion does not occur in late vision but in discursive reasoning. Third, Briscoe’s example from which he argues that C-completion may involve a pure thought involves a picture of the backside of what looks like a horse. In this case C-completion takes the form of a pure thought that this is a horse without any visual awareness. This is clearly a case of an inference involving visual understanding that occurs in the space of reasons and not in late vision. My claim is, on the other hand, that seeing the 3D sketch is a case of C-completion that takes place in late vision and involves visual awareness. Thus, even if there are cases of C-completions through pure thoughts, there are sorts of C-completions, such as seeing the 3D sketch, that take place in late vision and are cases of visual awareness. Consider the white surface of a wall seen in a shadow and perceived as gray. Even though the viewer knows that the gray shade is caused by the shadow cast on a white wall, the phenomenal character of her experience is that of gray. The phenomenal character of her experience of the situation-dependent color property (Schellenberg 2008), or of the phenomenal property (Shoemaker 2006), or of the centered property (Brogaard 2010) is gray not white. 
Of course, being aware of the shadow she could infer the intrinsic (Schellenberg 2008) or objective (Shoemaker 2006) color of the wall but this is an inference based on the visible grayness, knowledge of the effects of shadows on surfaces, etc. In this case, one does not perceive the whiteness in any sense of “seeing” and the output of late vision is not the belief that the color of the wall is white. That the wall is experienced in late vision as gray is a case of visual awareness, where the concomitant belief takes over the mode of presentation of the object of experience. One may abstain from endorsing the recognitional belief attributed by perception and form, instead,


the judgment that the wall is white even though it looks gray, but this representation is in the realm of pure thought. It is a case of visual understanding, a process in which one draws a conclusion based on the evidence of the senses and other relevant information. Suppose now that one sees one’s hand moving back and forth. One sees the hand having the same size, a case of size constancy. If the constancy is due to cues that are available in the retinal image, the viewer is phenomenally aware of the same size despite differences in the viewing conditions. If size constancy is not effectuated through visual information and cognitive sources are needed, it is achieved in late vision; the viewer believes that the size is constant and has the phenomenal experience of a constant size. Should visual information be insufficient for perceptual constancy and should the nonvisual information that ensures constancy be not available (as where attention is diverted elsewhere), the viewer would be aware of changes in size. This is what Epstein and Broota (1986) show by demonstrating that when attention is directed elsewhere, the size constancy operations fail. Thus, the experience of a stable size is the product of late vision, created by the knowledge of the size and stability of our hand in synergy with visual information coming from the hand. There is a large amount of literature supporting the view that many a perceptual constancy relies on object knowledge (Granrud 2004; Cohen 2008; Hatfield 2009). Such experiments show, in addition, that the perceiver can become aware of the phenomenological difference between the perception of a visual scene that involves no conceptual influences and the perception of the same visual scene when concepts, through the role of cognitively driven attention, intervene. Nanay (2010, 252) argues that amodal completion in some cases is accompanied by some sort of phenomenology subserved by the activation of the early visual areas. 
In this sense, the hidden parts and features of an object are not merely believed in but are present in the object of perception as actualities by being imagined. Even though in the case of the white wall discussed in the preceding paragraph it is not the case that through imagination the wall is seen as white, let us suppose that on some occasions the knowledge that the wall is really white may, through the exercise of imagination, activate the visual areas of the brain and create the image of a white wall and, as a result, the perceiver


perceives a white wall. How is this combined with her perception of a gray wall, which precedes the perception of the wall as white? The perception of the gray wall precedes the imagined white wall because in this context the perceptual act of perceiving gray precedes the activity of imagination, in so far as the latter requires that the object and the color are recognized as such, that the illumination conditions are factored in, that background knowledge about the color of that particular wall is accessed, and that the conclusion is reached that the wall is white, which, by activating the concept ‘white’, causes the imagination of the color white. Is, for example, the perceiver aware of the change of colors? The problem is more severe because it extends beyond cases that involve imagery to everyday perception. As I argued in previous chapters, attention changes the phenomenology of the visual scene. Given that in late vision attention affects perceptual processing, the conclusion is drawn that when a perceiver encounters a visual scene, she forms first in early vision a phenomenal awareness of some purely NCC and, a few ms later in late vision, another phenomenology of the visual scene is formed on account of conceptual/attentional modulation of the perceptual processes, a phenomenology that may present to the perceiver the visual scene differently than the presentation of the visual scene in phenomenal awareness. Now there are cases, which involve some ingenious experimentation such as that of Epstein and Broota (1986), in which the perceiver can become aware of the change in the phenomenology of a scene when knowledge and, thus, concepts intervene and modulate perceptual processing.
Since early vision is the visual stage that is not affected by concepts directly, and late vision is the visual stage that is conceptually modulated, the above means that a perceiver can experience the change in the phenomenological content of her experience owing to the role of concepts and attention. In ordinary perception, however, the perceivers are not conscious of such a shift in the phenomenology of the visual scene they encounter. If the attentional effects on perception during late vision are a ubiquitous fact of perception, why are not the perceivers usually aware of the change in the way the visual scene appears to them? In other words, how could the unity of the phenomenology of experience be explained in view of the phenomenological effects of attention? If one thinks that without cognitively driven


attention there is no awareness whatsoever, then the problem is solved because what transpires in the non-attentionally modulated early vision is at a purely subpersonal level that is beyond the purview of consciousness. If, however, one thinks, as I do, that one could have a phenomenal awareness of the NCC in early vision, then the unity of the phenomenology or the phenomenological unity of perception has to be explained. I will deal with this problem in the this chapter.

Returning to the problem at hand, despite the role of thoughts in late vision, these cases are better construed as visual awareness and not as visual understanding because, first, the states of late vision do not consist in pure thoughts but in hybrid states and, second, the processes that lead to perceptual constancy are not discursive inferences. To recapitulate, in pure thought the beliefs formed result from discursive processes (which may include perceptual information cast in a propositional form) and their attributives are context-free, while in late vision there are no discursive processes but only conceptually modulated visual processing and the relevant attributives are context-bound. These differences result from the constitutive involvement in late vision of visual circuits, an involvement that is absent in pure thought. This view entails that in amodal completion, which is one of the processes that take place in late vision, the missing or occluded features are not represented by pure perceptual beliefs, a view supported by (partially) independent considerations offered by Nanay (2010, 243–246).

8 Concluding Discussion

I have said that the non-inferential process that results in the formation of a recognitional thought/belief can be recast in the form of an argument from some premise to a conclusion. However, this does not entail that the formation of the perceptual thought is a piece of reasoning, that is, a transition from a set of premises that act as a reason for holding the thought to the thought itself. Admittedly, perceivers can be asked on what grounds they hold the thought that O is F, in which case they may reply “because I saw it” or “I saw that O is F”. However, this does not mean that the reason they cite as a justification of their


thought is a premise from which they inferred the thought. They do not argue from their thought “I saw it to be thus and so” to the thought “It is thus and so”. They just form the thought on the basis of the evidence included in their relevant perceptual state in a non-inferential way. What warrants the recognitional thought “O is F” is not the thought held by the perceivers that they see O to be F but the perceptual state that presents to them the world as being such and such. “When one knows something to be so by virtue of seeing it to be so, one’s warrant for believing it to be so is that one sees it to be so, not one’s believing that one sees it to be so” (McDowell 2011, 33).

Some philosophers consider that there is a sharp distinction between vision and thought and attempt to explain various phenomena (such as modal and amodal completion, or cognitive effects on perception) either as perceptual or as thought-based (in the exclusive sense of “either… or”). Macpherson (2012) considers evidence for the effects of knowledge of the typical colors of objects on the perception of these colors and, after having rejected a thought-based explanation of these effects, argues that knowledge affects perception itself through the processes of mental imagery and that, consequently, perception is cognitively penetrable. The main reason that drives Macpherson to conclude that color perception is cognitively penetrable is that cognition affects the phenomenology of the way colors look and this cannot be explained by a belief-based account but only by admitting that it is the perceptual stage itself that is cognitively affected.
However, if one allows for the possibility of a stage of visual processing in which visual processing and cognitive effects coexist and, consequently, allows for a stage of visual processing that is cognitively penetrated and has its own phenomenology, one can explain the cognitive effects on visual phenomenology without drawing the conclusion that all visual processes are cognitively penetrable, since early vision may still be cognitively impenetrable. There is a hybrid stage of vision/thought in which perception and cognition are intermingled. This is the cognitively penetrated stage of late vision. Since late vision does not involve pure thoughts, the belief-based accounts are wrong, but that does not entail that early vision is cognitively penetrable. In Raftopoulos (2014a), I stressed that the essential relation between CI and NCC allows us to understand better Burge’s (2010) claim that


(purely) perceptual states are under the causal control of the stimulus, and Beck’s (2017) and Phillips’s (2017) view that (purely) perceptual states, but not cognitive states, are essentially sustained by present proximal stimulation, and that perceptual states, in contradistinction to cognitive states, have the function of representing a visual scene in a stimulus-dependent fashion. In so far as the states of early vision, or pure perception, are formed by processes that are data-driven (as they are affected by the operational constraints that we discussed in Chapter 3, which do not implicate any cognitive influences in perceptual processing), which is the essence of the thesis that they are CI, the perceptual states are under the causal control of the present stimulus only. In view of our discussion on the nature of late vision, however, this assessment needs some qualifications. According to Phillips, the quasi-perceptual states are those associated with mental imagery that, as a matter of course, have a top-down cognitive ingredient. These states are not stimulus-dependent and are not causally controlled by a stimulus because paradigmatically mental imagery occurs in the absence of a stimulus. We saw that many authors (Macpherson is one example) refer to mental imagery to signify any top-down cognitive influence on perceptual processing. In this sense, late vision involves mental imagery and, as such, belongs to quasi-perception. If this is the case, however, Phillips’s claim that quasi-perceptual states do not require the presence of the stimulus does not apply to late vision because, as we have seen, late vision processes depend necessarily on the presence of the stimulus. In addition, they do depend causally on the stimulus but they are not entirely under its sole causal control because they are also affected by top-down cognitive influences.
Phillips (2017, 7) posits that a process or a state is perceptual just in case it functions so as to produce representations of a visual scene by being causally controlled by these proximal stimuli that the entities in the scene produce. It follows that for Phillips late vision is not a perceptual state but belongs, rather, to quasi-perception. Since Phillips (2017, 27) emphasizes that he is interested in the distinction between narrow perception/broad cognition borders, his


discussion concerns the borders between pure perception that does not involve any cognitive influences (in my term, early vision), on the one hand, and quasi-perceptual (in my term, late vision) and purely cognitive states, on the other hand. Phillips, therefore, may be taken to suggest that late vision, being broadly cognitive, does not have the function to represent a visual scene in a stimulus-dependent fashion and that late vision does not depend on the presence of the stimulus. I write ‘may be taken to suggest’ because Phillips by quasi-perceptual states refers to the states of mental imagery and not to what I call late vision and, thus, his discussion cannot be transferred automatically to late vision. It is, however, plausible to assume that for the reasons explained in the preceding paragraph he would classify late vision as quasi-perceptual. Be that as it may, I have argued that late vision does depend on the presence of the stimulus but is partly under the causal control of the stimulus-driven information. In late vision, therefore, we have a set of processes that necessarily depends on the presence of a stimulus but is not entirely under its causal control. Let me add one more layer of complication. Beck, Burge, Phillips, and I have been talking about perceptual processes, unlike cognitive ones, being essentially dependent on the presence, and under the causal control, of the stimulus. Strictly speaking this is not correct. We know that both the iconic sensory memory and the fragile visual short-term memory can retain information retrieved from a visual scene for several seconds in the absence of the visual scene. During this time, perceptual processing functions on the basis of this information independent of whether the stimulus is still present, and still depends on this information and is exhaustively under the causal control of this information. 
Therefore, it would be better to say that perception is under the causal control, and depends on the presence, of the information retrieved from the visual scene and stored in either iconic sensory memory or fragile visual short-term memory, rather than that it is under the causal control of the stimulus itself, except in the sense that both types of memory require the presence of a stimulus and directly retrieve information from the stimulus.


References

Andrews, T. J., Schluppeck, D., Homfray, D., Matthews, P., & Blakemore, C. (2002). Activity in the fusiform gyrus predicts conscious perception of Rubin’s vase-face illusion. Neuroimage, 17, 890–901.
Barr, M. (2009). The proactive brain: Memory for predictions. Philosophical Transactions of the Royal Society London, B, Biological Sciences, 364, 1235–1243.
Bayne, T. (2009). Perception and the reach of phenomenal content. Philosophical Quarterly, 39, 385–405.
Beck, J. (2017). Marking the perception-cognition boundary: The criterion of stimulus-dependence. Australasian Journal of Philosophy. https://doi.org/10.1080/00048402.2017.1329329.
Biederman, I. (1987). Recognition by components: A theory of human image understanding. Psychological Review, 94, 115–147.
Block, N. (2014). Seeing-as in the light of vision science. Philosophy and Phenomenological Research, 89(3), 560–572.
Boghossian, P. (2014). What is inference? Philosophical Studies, 169(1), 1–18.
Borisyuk, R., Chik, D., & Kazanovich, Y. (2009). Visual perception of ambiguous figures: Synchronization based models. Biological Cybernetics, 100, 491–504.
Briscoe, R. E. (2011). Mental imagery and the varieties of amodal perception. Pacific Philosophical Quarterly, 92, 153–173.
Britz, J., & Pitts, M. (2011). Perceptual reversals during binocular rivalry: ERP components and their concomitant source differences. Psychophysiology, 48, 1489–1498.
Brogaard, B. (2010). Strong representationalism and centered content. Philosophical Studies, 151, 373–392.
Brogaard, B., & Gatzia, D. (2017). Color and cognitive penetrability. Topics in Cognitive Science, 9(1), 193–214.
Brossel, P. (2017). Rational relations between perception and belief: The case of color. Review of Philosophy and Psychology. https://doi.org/10.1007/s13164-017-0359-y.
Burge, T. (1977). Belief de re. Journal of Philosophy, 74, 338–362.
Burge, T. (2003). Perceptual entitlement. Philosophy and Phenomenological Research, 67(3), 503–548.
Burge, T. (2010). Origins of Objectivity. Oxford: Clarendon Press.
Burge, T. (2014). Reply to block: Adaptation and the upper border of perception. Philosophy and Phenomenological Research, 89(1), 573–583.


Cavanagh, P. (2011). Visual cognition. Vision Research, 51, 1538–1551.
Chelazzi, L., Miller, E., Duncan, J., & Desimone, R. (1993). A neural basis for visual search in inferior temporal cortex. Nature, 363, 345–347.
Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36, 181–253.
Coates, P. (2007). The Metaphysics of Perception. New York, NY: Routledge.
Cohen, J. (2008). Colour constancy as counterfactual. Australasian Journal of Philosophy, 86, 61–92.
Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3(3), 201–215.
Delorme, A., Rousselet, G. A., Mace, M. J.-M., & Fabre-Thorpe, M. (2004). Interaction of top-down and bottom up processing in the fast visual analysis of natural scenes. Brain Research, 19, 103–113.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neurosciences, 18, 193–222.
Dretske, F. (1981). Knowledge and the Flow of Information. Oxford: Blackwell.
Dretske, F. (1993). Conscious experience. Mind, 102(406), 263–283. Reprinted in Noë and Thompson (2002), Vision and Mind. Cambridge: MIT Press.
Dretske, F. (1995). Naturalizing the Mind. Cambridge: MIT Press.
Elman, J. L., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1996). Rethinking Innateness: A Connectionist Perspective on Development. Cambridge: MIT Press.
Epstein, W., & Broota, K. D. (1986). Automatic and attentional components in perception of size-at-a-distance. Perception & Psychophysics, 40, 256–262.
Evans, G. (1982). The Varieties of Reference. Oxford: Clarendon Press.
Fabre-Thorpe, M., Delorme, A., Marlot, C., & Thorpe, S. (2001). A limit to the speed of processing in ultrarapid visual categorization of novel natural scenes. Journal of Cognitive Neuroscience, 13, 171–180.
Fodor, J. (2007). The revenge of the given. In B. P. McLaughlin & J. Cohen (Eds.), Contemporary Debates in the Philosophy of Mind. Malden, MA: Blackwell.
Gauker, C. (2012). Perception without propositions. Philosophical Perspectives, 26, Philosophy of Mind, 19–50.
Gonzales-Cassilas, A., Parra, L., Avila-Contreras, C., Ramirez-Pedraza, R., Vargas, N., Luis del Valle-Padilla, J., & Ramos, F. (2018). Towards a model of visual recognition based on neuroscience. Biologically Inspired Cognitive Architectures. https://doi.org/10.1016/j.bica.2018.07.018.


Granrud, C. E. (2004). Visual metacognition and the development of size constancy. In D. T. Levin (Ed.), Thinking and Seeing (pp. 75–95). Cambridge: MIT Press. Hatfield, G. (2002). Perception as unconscious inference. In D. Heyer & R. Mausfeld (Eds.), Perception and the Physical World: Psychological and Philosophical Issues in Perception. West Sussex: Wiley. Hatfield, G. (2009). Perception and Cognition: Essays in the Philosophy of Psychology. Oxford: Clarendon Press. Heck, R. G., Jr. (2000). Nonconceptual content and the ‘space of reasons’. Philosophical Review, 109, 483–523. Heck, R. G., Jr. (2007). Are there different kinds of content? In J. Cohen & B. McLaughlin (Eds.), Contemporary Debates in the Philosophy of Mind. Oxford: Blackwell. Helmholtz, von H. (1878[1925]). Treatise on Physiological Optics. New York: Dover. Hochberg, J., & Peterson, M. A. (1987). Piecemeal organization and cognitive components in object perception. Journal of Experimental Psychology: General, 116, 370–380. Horgan, T., & Tienson, J. (1996). Connectionism and the Philosophy of Psychology. Cambridge: MIT Press. Hume, D. (2003 [1739–1740]). A Treatise of Human Nature. Mineola, NY: Dover. Ito, J., Nikolaev, A. R., Luman, M., Aukes, M. F., Nakatani, C., & Van Leeuwen, C. (2003). Perceptual switching, eye movements, and the bus paradox. Perception, 32(6), 681–698. Jackendoff, R. (1989). Consciousness and the Computational Mind. Cambridge: MIT Press. Jackson, F. (1977). Perception: A Representative Theory. Cambridge: Cambridge University Press. Johnson, J. S., & Olshausen, B. A. (2005). The earliest EEG signatures of object recognition in a cued target task are postsensory. Journal of Vision, 5, 299–312. Johnston, M. (2006). Better than mere knowledge: The function of sensory awareness. In T. S. Gendler & J. Hawthorne (Eds.), Perceptual Experience. Oxford: Clarendon Press. Kawabata, N. (1986). Attention and depth perception. Perception, 15, 563–572. Kelso, S. (1995).
Dynamic Patterns: The Self Organization of Brain and Behavior. Cambridge: MIT Press.

Kihara, K., & Takeda, Y. (2010). Time course of the integration of spatial frequency-based information in natural scenes. Vision Research, 50, 2158–2162. Kornmeier, J., & Bach, M. (2009). Object perception: When our brain is impressed but we do not notice it. Journal of Vision, 9(1), 1–10. Kosslyn, S. M. (1994). Image and Brain. Cambridge: MIT Press. Kulvicki, J. (2007). Perceptual content is vertically articulate. American Philosophical Quarterly, 44(4), 357–369. Kulvicki, J. (2015). Analog representation and the parts principle. Review of Philosophy and Psychology, 6, 165–180. Leopold, D. A., & Logothetis, N. K. (1999). Multistable phenomena: Changing views in perception. Trends in Cognitive Science, 3(7), 254–264. Ling, S., Liu, T., & Carrasco, M. (2009). How spatial and feature-based attention affect the gain and tuning of population responses. Vision Research, 49, 1194–1204. Long, G. M., & Toppino, C. T. (2004). Enduring interest in perceptual ambiguity: Alternating views of reversible figures. Psychological Bulletin, 130(5), 748–768. Macpherson, F. (2012). Cognitive penetration of colour experience: Rethinking the issue in light of an indirect mechanism. Philosophy and Phenomenological Research, 84(1), 24–62. Mackie, J. (1976). Problems from Locke. Oxford: Oxford University Press. McAllister, B. (2018). Seemings as sui generis. Synthese, 195, 3079–3096. McDowell, J. (1994). Mind and World. Cambridge, MA: Harvard University Press. McDowell, J. (2011). Reception as a Capacity for Knowledge. Milwaukee, WI: Marquette University Press. McGrath, M. (2013a). Siegel and the impact for epistemological internalism. Philosophical Studies, 162(3), 723–732. McGrath, M. (2013b). Phenomenal conservatism and cognitive penetration. In C. Tucker (Ed.), Seemings and Justification (pp. 225–247). Oxford: Oxford University Press. Meng, M., & Tong, F. (2004). Can attention selectively bias bistable perception? Journal of Vision, 4, 539–551. Millar, A. (2011). 
How visual perception yields reasons for belief. Philosophical Issues, 21, The Epistemology of Perception, 332–351. Murray, S. O. (2008). The effects of spatial attention in early human visual cortex are stimulus independent. Journal of Vision, 8(10), 1–11.

Nakatani, H., & van Leeuwen, C. (2006). Transient synchrony of distant brain areas and perceptual switching in ambiguous figures. Biological Cybernetics, 94, 445–457. Nanay, B. (2010). Perception and imagination: Amodal perception as mental imagery. Philosophical Studies, 150, 239–254. Nanay, B. (2015). Perceptual content and the content of mental imagery. Philosophical Studies, 172, 1723–1736. Noë, A. (2004). Action in Perception. Cambridge: MIT Press. Palmer, S. (1999). Vision Science. Cambridge: MIT Press. Perry, J. (2001). Knowledge, Possibility, and Consciousness. Cambridge: MIT Press. Peterson, M. A., & Gibson, B. S. (1991). Directing spatial attention within an object: Altering the functional equivalence of shape descriptions. Journal of Experimental Psychology: Human Perception and Performance, 17, 170–182. Peyrin, C., Michel, C. M., Schwartz, S., Thut, G., Seghier, M., Landis, T., et al. (2010). The neural processes and timing of top-down processes during coarse-to-fine categorization of visual scenes: A combined fMRI and ERP study. Journal of Cognitive Neuroscience, 22, 2768–2780. Phillips, B. (2017). The shifting border between perception and cognition. Noûs, 1–31. https://doi.org/10.1111/nous.12218. Pitts, M., Nerger, J., & Davis, T. J. R. (2007). Electrophysiological correlates of perceptual reversals for three different types of multistable images. Journal of Vision, 7(1), 1–14. Potter, M. C., Wyble, B., Hagmann, C. E., & McCourt, E. S. (2014). Detecting meaning in RSVP at 13 ms per picture. Attention, Perception, & Psychophysics, 76, 270–279. Prinz, J. J. (2002). Furnishing the Mind. Cambridge: MIT Press. Pylyshyn, Z. (2003). Seeing and Visualizing: It’s Not What You Think. Cambridge: MIT Press. Raftopoulos, A. (2009). Cognition and Perception: How Do Psychology and Neural Science Inform Philosophy? Cambridge: MIT Press. Raftopoulos, A. (2010). Can nonconceptual content be stored in visual memory? Philosophical Psychology, 23(5), 639–668.
Raftopoulos, A. (2014). The cognitive impenetrability of the content of early vision is a necessary and sufficient condition for purely nonconceptual content. Philosophical Psychology, 27(5), 601–620. Raftopoulos, A. (2015a). The cognitive impenetrability of perception and theory-ladenness. Journal for General Philosophy of Science, 46(1), 87–103. Raftopoulos, A. (2015b). Abductive inferences in late vision. In L. Magnani, W. Park, & Li Ping (Eds.), Philosophy and Cognitive Science II, Studies in Applied Philosophy, Epistemology and Rational Ethics 20 (pp. 155–177). Basel, Switzerland: Springer. Raftopoulos, A. (2017). Cognitive penetration lite and nonconceptual content. Erkenntnis, 82(5), 1097–1122. https://doi.org/10.1007/s10670-016-9861-3. Raftopoulos, A., & Muller, V. (2006). Nonconceptual demonstrative reference. Philosophy and Phenomenological Research, 72(2), 251–285. Rescorla, M. (2009). Cognitive maps and the language of thought. British Journal for the Philosophy of Science, 60(2), 377–407. Rescorla, M. (2014). The causal relevance of content to computation. Philosophy and Phenomenological Research, 88(1), 173–208. Rock, I. (1983). The Logic of Perception. Cambridge: MIT Press. Roelfsema, P. R., Lamme, V. A. F., & Spekreijse, H. (1998). Object-based attention in the primary visual cortex of the macaque monkey. Nature, 395, 376–381. Schellenberg, S. (2008). The situation dependency of perception. Journal of Philosophy, 105, 55–84. Sellars, W. (1954). Physical realism. Philosophy and Phenomenological Research, 15(1), 13–32. Sellars, W. (1956). Empiricism and the philosophy of mind. In H. Feigl & M. Scriven (Eds.), Minnesota Studies in the Philosophy of Science (Vol. I, pp. 253–329). Minneapolis: University of Minnesota Press. Sellars, W. (1977). Some reflections on perceptual consciousness. In R. Bruzina & B. Wilshire (Eds.), Selected Studies in Phenomenology and Existential Philosophy (pp. 169–185). The Hague: Nijhoff. Sellars, W. (1981). Foundations for the metaphysics of pure process (The Carus lectures). The Monist, 64, 3–90. Shams, L., & Beierholm, U. R. (2010). Causal inference in perception. Trends in Cognitive Sciences, 14, 425–432. Shoemaker, S. (2006). On the way things appear. In T. S. Gendler & J. Hawthorne (Eds.), Perceptual Experience (pp. 461–481). Oxford: Clarendon Press. Siegel, S. (2006). Which properties are represented in perception? In T. S. Gendler & J. Hawthorne (Eds.), Perceptual Experience (pp. 481–504).
Oxford: Clarendon Press. Soames, S. (2010). What Is Meaning? Princeton, NJ: Princeton University Press. Spelke, E. S. (1988). Object perception. In A. I. Goldman (Ed.), Readings in Philosophy and Cognitive Science (pp. 447–461). Cambridge: MIT Press.

Stalnaker, R. C. (2008). Our Knowledge of the Internal World. Oxford: Clarendon Press. Strawson, P. (1974). Imagination and perception. In P. Strawson (Ed.), Freedom and Resentment (pp. 45–65). London: Methuen. Thelen, E., & Smith, L. (1994). A Dynamic Systems Approach to the Development of Cognition and Action. Cambridge: MIT Press. Toppino, T. (2003). Reversible-figure perception. Perception and Psychophysics, 65, 1285–1295. Treisman, A. (2006). How the deployment of attention determines what we see. Visual Cognition, 14, 411–443. Treisman, A., & Kanwisher, N. G. (1998). Perceiving visually presented objects: Recognition, awareness, and modularity. Current Opinion in Neurobiology, 8, 218–226. Tucker, C. (2010). Why open-minded people should endorse dogmatism. Philosophical Perspectives, 24, Epistemology. van Kerkoerle, T., Self, M. W., Dagnino, B., Gariel-Mathis, M.-A., Poort, J., van der Togt, C., & Roelfsema, P. R. (2014). Alpha and gamma oscillations characterize feedback and feedforward processing in monkey visual cortex. Proceedings of the National Academy of Sciences, USA (PNAS), 111(40), 14332–14341. Van Leeuwen, C., Steyvers, M., & Nooter, M. (1997). Stability and intermittency in large-scale coupled oscillator models for perceptual segmentation. Journal of Mathematical Psychology, 41, 319–344. Yamada, F. (1998). Frontal midline theta rhythm and eyeblinking activity during a VDT task and a video game. Ergonomics, 41, 678–688. Yamagishi, N., Callan, D. E., Goda, N., Anderson, S. J., Yoshida, Y., & Kawato, M. (2003). Attentional modulation of oscillatory activity in human visual cortex. NeuroImage, 20, 98–113.

References

Andersen, S., Muller, M., & Hillyard, S. (2012). Tracking the allocation of attention in visual scenes with SSEVP. In M. I. Posner (Ed.), Cognitive Neuroscience of Attention. New York, NY: Guilford Press. Andrews, T. J., Schluppeck, D., Homfray, D., Matthews, P., & Blakemore, C. (2002). Activity in the fusiform gyrus predicts conscious perception of Rubin’s vase-face illusion. Neuroimage, 17, 890–901. Arratha, M., & Moore, C. M. (2014). Orientation summary statistics are limited in processing capacity. Visual Cognition, 22(8), 1018–1022. Arratha, M., & Moore, C. M. (2015a). The perceptual processing capacity of summary statistics between and within feature dimensions. Journal of Vision, 15(4), 1–17. Arratha, M., & Moore, C. M. (2015b). The capacity limitations of orientation summary statistics. Attention, Perception, & Psychophysics, 77, 1116–1131. Attneave, F. (1971). Multistability in perception. Scientific American, 225, 63–71. Audi, R. (2003). Epistemology: A Contemporary Introduction to the Theory of Knowledge (2nd ed.). London: Routledge. Austin, J. (1962). Sense and Sensibilia. Oxford: Oxford University Press. Bach, K. (1987). Thought and Reference. Oxford: Clarendon Press. Balas, B., Nakano, L., & Rosenholtz, R. (2009). A summary-statistic representation in peripheral vision explains visual crowding. Journal of Vision, 9(12), 13.
© The Editor(s) (if applicable) and The Author(s) 2019 A. Raftopoulos, Cognitive Penetrability and the Epistemic Role of Perception, Palgrave Innovations in Philosophy, https://doi.org/10.1007/978-3-030-10445-0

Balcetis, E., & Dunning, D. (2006). See what you want to see: Motivational influences on visual perception. Journal of Personality and Social Psychology, 91, 612–625. Barr, M. (2009). The proactive brain: Memory for predictions. Philosophical Transactions of the Royal Society London, B, Biological Sciences, 364, 1235–1243. Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22(4), 577–609. Bayne, T. (2007). Conscious states and conscious creatures: Explanation in the scientific study of consciousness. Philosophical Perspectives, 21, Philosophy of Mind, 1–22. Bayne, T. (2009). Perception and the reach of phenomenal content. Philosophical Quarterly, 59, 385–405. Beck, J. (2017). Marking the perception-cognition boundary: The criterion of stimulus-dependence. Australasian Journal of Philosophy. https://doi.org/10.1080/00048402.2017.1329329. Beck, D. M., & Kastner, S. (2009). Top-down and bottom-up mechanisms in biasing competition in the human brain. Vision Research, 49, 1154–1165. Bergeron, V. (2007). Anatomical and functional modularity in cognitive science: Shifting the focus. Philosophical Psychology, 20(2), 175–195. Bermúdez, J. L. (1995). Nonconceptual content: From perceptual experience to subpersonal computational states. Mind and Language, 10(4), 333–369. Biederman, I. (1987). Recognition by components: A theory of human image understanding. Psychological Review, 94, 115–147. Bishop, S. J., Jenkins, R., & Lawrence, A. D. (2007). Neural processing of fearful faces: Effects of anxiety are gated by perceptual capacity limitations. Cerebral Cortex, 17, 1595–1603. Block, N. (2007a). Two neural correlates of consciousness. In N. Block (Ed.), Collected Papers, Vol. 1 (pp. 342–362). Cambridge: MIT Press. Block, N. (2007b). Consciousness, accessibility, and the mesh between psychology and neuroscience. Behavioral and Brain Sciences, 30, 481–548. Block, N. (2014). Seeing-as in the light of vision science.
Philosophy and Phenomenological Research, 89(3), 560–572. Boghossian, P. (2014). What is inference? Philosophical Studies, 169(1), 1–18. Borisyuk, R., Chik, D., & Kazanovich, Y. (2009). Visual perception of ambiguous figures: Synchronization based models. Biological Cybernetics, 100, 491–504.

Boxtel, J. J. A., Tsuchiya, N., & Koch, C. (2010). Consciousness and attention: On sufficiency and necessity. Frontiers in Psychology, 1(217). https://doi.org/10.3389/fpsyg.2010.00217. Brady, T. F., Konkle, T., Alvarez, G. A., & Oliva, A. (2008). Visual long-term memory has a massive storage capacity for object details. Proceedings of the National Academy of Sciences, USA, 105, 14325–14329. Brewer, B. (2011). Perception and Its Objects. Oxford: Oxford University Press. Briscoe, R. E. (2011). Mental imagery and the varieties of amodal perception. Pacific Philosophical Quarterly, 92, 153–173. Britz, J., & Pitts, M. (2011). Perceptual reversals during binocular rivalry: ERP components and their concomitant source differences. Psychophysiology, 48, 1489–1498. Brogaard, B. (2010). Strong representationalism and centered content. Philosophical Studies, 151, 373–392. Brogaard, B. (2013). Phenomenal seemings and sensible dogmatism. In C. Tucker (Ed.), Seemings and Justification (pp. 270–289). Oxford: Oxford University Press. Brogaard, B., & Gatzia, D. (2017a). Color and cognitive penetrability. Topics in Cognitive Science, 9(1), 193–214. Brogaard, B., & Gatzia, D. (2017b). The real epistemic significance of perceptual learning. Inquiry. http://dx.doi.org/10.1080/0020174X.2017.1368172. Bronfman, Z. Z., Brezis, N., Jacobson, H., & Usher, M. (2014). We see more than we can report: ‘Cost free’ color phenomenality outside focal attention. Psychological Science, 25(7), 1394–1403. Brossel, P. (2017). Rational relations between perception and belief: The case of color. Review of Philosophy and Psychology. https://doi.org/10.1007/s13164-017-0359-y. Bullier, J. (2001). Integrated model of visual processing. Brain Research Reviews, 36, 96–107. Burge, T. (1977). Belief de re. Journal of Philosophy, 74, 338–362. Burge, T. (2003). Perceptual entitlement. Philosophy and Phenomenological Research, 67(3), 503–548. Burge, T. (2007). Reflections on two kinds of consciousness. In T.
Burge (Ed.), Foundations of Mind (Vol. 2, pp. 392–419). Oxford: Clarendon Press. Burge, T. (2010). Origins of Objectivity. Oxford: Clarendon Press. Burge, T. (2014). Reply to Block: Adaptation and the upper border of perception. Philosophy and Phenomenological Research, 89(1), 573–583.

Burnston, D. (2017). Cognitive penetration and the cognition-perception interface. Synthese, 194, 3645–3668. Burnston, D. C., & Cohen, J. (2015). Perceptual integration, modularity, and cognitive penetration. In J. Zeimbekis & A. Raftopoulos (Eds.), The Cognitive Penetrability of Perception: New Philosophical Perspectives (pp. 123–144). Oxford: Oxford University Press. Byrne, A. (2014). Perception and evidence. Philosophical Studies, 170, 101–113. Byrne, A. (2016). The epistemic significance of experience. Philosophical Studies, 173, 947–967. Campbell, J. (2002). Reference and Consciousness. Oxford: Oxford University Press. Campbell, J. (2006). Does visual attention depend on sortal classification? Reply to Clark. Philosophical Studies, 127, 221–237. Campbell, J. (2009). Consciousness and reference. In B. McLaughlin, A. Beckermann, & S. Walter (Eds.), Oxford Handbook to the Philosophy of Mind. Oxford: Oxford University Press. Cant, J. S., Sun, S. Z., & Xu, Y. (2015). Distinct cognitive mechanisms involved in the processing of single objects and object ensembles. Journal of Vision, 15(4), 1–21. Carrasco, M. (2011). Visual attention: The past 25 years. Vision Research, 51, 1484–1525. Carrasco, M., Ling, S., & Read, S. (2004). Attention alters appearance. Nature Neuroscience, 7, 308–313. Cavanagh, P. (2011). Visual cognition. Vision Research, 51, 1538–1551. Cavedon-Taylor, D. (2018). Naive realism and the cognitive penetrability of perception. Analytic Philosophy, 59(3), 391–412. Cecchi, A. (2014). Cognitive penetration, perceptual learning, and neural plasticity. Dialectica, 68(1), 63–95. Cecchi, A. (2018). Cognitive penetration of early vision in face perception. Consciousness and Cognition. http://doi.org/10.1016/j.concog.2018.06.005. Chaumon, M., Drouet, V., & Tallon-Baudry, C. (2008). Unconscious associative memory affects visual processing before 100 ms. Journal of Vision, 8(3), 1–10. Chelazzi, L., Miller, E., Duncan, J., & Desimone, R. (1993).
A neural basis for visual search in inferior temporal cortex. Nature, 363, 345–347. Chong, S. C., & Treisman, A. (2003). Representation of statistical properties. Vision Research, 43(4), 393–404. Churchland, P. M. (1988). Perceptual plasticity and theoretical neutrality: A reply to Jerry Fodor. Philosophy of Science, 55, 167–187. Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36, 181–253.

Cohen, J. (2008). Colour constancy as counterfactual. Australasian Journal of Philosophy, 86, 61–92. Conway, B. R., Moeller, S., & Tsao, D. Y. (2007). Specialized color modules in macaque extrastriate cortex. Neuron, 56, 560–573. Conway, B. R., & Tsao, D. Y. (2006). Color architecture in alert macaque cortex revealed by FMRI. Cerebral Cortex, 16(11), 1604–1613. Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3(3), 201–215. Crane, T. (1992). The nonconceptual content of experience. In T. Crane (Ed.), The Contents of Experience: Essays on Perception (pp. 136–157). Cambridge: Cambridge University Press. Crane, T. (2009). Is perception a propositional attitude? The Philosophical Quarterly, 59(236), 453–470. Crouzet, S. M., Kirchner, H., & Thorpe, S. J. (2010). Fast saccades toward faces: Face detection in just 100 ms. Journal of Vision, 10(4), 1–17. Cussins, A. (1990). The connectionist construction of concepts. In M. Boden (Ed.), The Philosophy of Artificial Intelligence. Oxford: Oxford University Press. Dartnall, T. (2007). Internalism, active externalism, and nonconceptual content: The ins and outs of cognition. Cognitive Science, 31(2), 257–285. Davidson, D. (1986). A coherence theory of truth and knowledge. In E. LePore (Ed.), Truth and Interpretation: Perspectives on the Philosophy of Donald Davidson. Oxford: Basil Blackwell. Davis, M. (1995). Tacit knowledge and subdoxastic states. In C. MacDonald & G. MacDonald (Eds.), Philosophy of Psychology: Debates on Psychological Explanation. Oxford: Blackwell. Dehaene, S., Changeux, J.-P., Naccache, L., Sackur, J., & Sergent, C. (2006). Conscious, preconscious, and subliminal processing: A testable taxonomy. Trends in Cognitive Sciences, 10(5), 204–211. Delk, J. L., & Fillenbaum, S. (1965). Differences in perceived color as a function of characteristic color. The American Journal of Psychology, 78(2), 290–293.
Delorme, A., Rousselet, G. A., Mace, M. J.-M., & Fabre-Thorpe, M. (2004). Interaction of top-down and bottom-up processing in the fast visual analysis of natural scenes. Cognitive Brain Research, 19, 103–113. Dennett, D. C. (1983). Styles of mental representation. Proceedings of the Aristotelian Society, 83, 213–226.

Deroy, O. (2013). Object-sensitivity versus cognitive penetrability of perception. Philosophical Studies, 162, 87–107. Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222. Dokic, J. (2010). Perceptual recognition and the feeling of presence. In B. Nanay (Ed.), Perceiving the World. New York, NY: Oxford University Press. Dokic, J., & Martin, J.-R. (2012). Disjunctivism, hallucinations, and metacognition. WIREs Cognitive Science, 3, 533–543. Dretske, F. (1981). Knowledge and the Flow of Information. Oxford: Blackwell. Dretske, F. (1993). Conscious experience. Mind, 102(406), 263–283. Reprinted in Noë and Thompson. (2002). Vision and Mind. Cambridge: MIT Press. Dretske, F. (1995). Naturalizing the Mind. Cambridge: MIT Press. Dretske, F. (2006). Perception without awareness. In T. Gendler & J. Hawthorne (Eds.), Perceptual Experience (pp. 147–180). Oxford: Oxford University Press. Drewes, J., Goren, G., Zhu, W., & Elder, J. H. (2016). Recurrent processing in the formation of percept shapes. The Journal of Neuroscience, 36(1), 185–192. Driver, J., & Baylis, G. S. (1996). Edge-assignment and figure-ground segregation in short-term visual matching. Cognitive Psychology, 31, 248–306. Driver, J., David, G., Russell, C., Turatto, M., & Freeman, E. (2001). Segmentation, attention and phenomenal visual objects. Cognition, 80, 61–95. Eisenberg, M. L., & Zacks, J. M. (2016). Ambient and focal visual processing of naturalistic activity. Journal of Vision, 16(2), 1–12. Elman, J. L., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1996). Rethinking Innateness: A Connectionist Perspective on Development. Cambridge: MIT Press. Epstein, W., & Broota, K. D. (1986). Automatic and attentional components in perception of size-at-a-distance. Perception & Psychophysics, 40, 256–262. Evans, G. (1982). The Varieties of Reference. Oxford: Clarendon Press. Evans, M. A., Shedden, J.
M., Hevenor, S. J., & Hahn, M. C. (2000). The effect of variability of unattended information on global and local processing: Evidence from lateralization at early stages of processing. Neuropsychologia, 38, 225–239. Fabre-Thorpe, M., Delorme, A., Marlot, C., & Thorpe, S. (2001). A limit to the speed of processing in ultrarapid visual categorization of novel natural scenes. Journal of Cognitive Neuroscience, 13, 171–180.

Fantl, J., & McGrath, M. (2009). Knowledge in an Uncertain World. Oxford: Oxford University Press. Fazekas, P., & Nanay, B. (2017). Pre-cueing effects: Attention or mental imagery? Frontiers in Cognitive Science. https://doi.org/10.3389/fpsyg.2017.00222. Fazekas, P., & Overgaard, M. S. (Eds.). (2018). Perceptual consciousness and cognitive access. Philosophical Transactions of the Royal Society B, 373(1755). Ferrante, D., Gerbino, D., & Rock, I. (1997). The right angle. In I. Rock (Ed.), Indirect Perception. Cambridge: MIT Press. Feyerabend, P. (1981). Realism, Rationalism and Scientific Method: Philosophical Papers (Vol. 1). Cambridge: Cambridge University Press. Firestone, C., & Scholl, B. J. (2016). Cognition does not affect perception: Evaluating the evidence for ‘top-down’ effects. Behavioral and Brain Sciences. http://doi.org/10.1017/S0140525X15000965. Fodor, J. (1983). The Modularity of Mind. Cambridge: MIT Press. Fodor, J. (2007). The revenge of the given. In B. P. McLaughlin & J. Cohen (Eds.), Contemporary Debates in the Philosophy of Mind. Malden, MA: Blackwell. Fodor, J., & Pylyshyn, Z. (2015). Minds Without Meanings: An Essay on the Content of Concepts. Cambridge: MIT Press. Foxe, J. J., & Simpson, G. V. (2002). Flow of activation from V1 to frontal cortex in humans. Experimental Brain Research, 142(1), 139–150. Frassle, S., Sommer, J., Jansen, A., Naber, M., & Einhauser, W. (2014). Binocular rivalry: Frontal activity relates to introspection and action but not to perception. The Journal of Neuroscience, 34(1), 1738–1747. Freiwald, W. A., & Kanwisher, N. G. (2004). Visual selective attention: Evidence from brain imaging and neurophysiology. In M. Gazzaniga (Ed.), The Cognitive Neurosciences III (3rd ed.). Cambridge: MIT Press. Gatzia, D., & Brogaard, B. (2017). The real epistemic significance of perceptual learning. Inquiry, 543–558. https://doi.org/10.1080/0020174X.2017.1368172. Gauker, C. (2012). Perception without propositions.
Philosophical Perspectives, 26, Philosophy of Mind, 19–50. Gegenfurtner, K. R. (2003). Cortical mechanisms of colour vision. Nature Reviews Neuroscience, 4, 563–572. Gegenfurtner, K. R., & Rieger, J. (2000). Sensory and cognitive contributions of color to the recognition of natural scenes. Current Biology, 10, 805–808.

Ghijsen, H. (2016). The real epistemic problem of cognitive penetration. Philosophical Studies, 173, 1457–1475. Ghose, G. M., & Bearl, D. W. (2010). Attention directed by expectations enhances receptive fields in cortical area MT. Vision Research, 50, 441–451. Gilbert, C. D., & Wiesel, T. N. (1989). Columnar specificity of intrinsic horizontal and corticocortical connections in cat visual cortex. Journal of Neuroscience, 9, 2432–2442. Gilbert, C. D., & Li, W. (2013). Top-down influences on visual processing. Nature Reviews Neuroscience, 14(5), 350–363. Gilbert, C. D., Ito, M., Kapadia, M., & Westheimer, G. (2000). Interactions between attention, context and learning in primary visual cortex. Vision Research, 40, 1217–1226. Gobell, J., & Carrasco, M. (2005). Attention alters the appearance of spatial frequency and gap size. Psychological Science, 16, 644–651. Goldstone, R. L., de Leeuw, J. R., & Landy, D. H. (2015). Fitting perception in and to cognition. Cognition, 135, 24–29. Gonzales-Cassilas, A., Parra, L., Avila-Contreras, C., Ramirez-Pedraza, R., Vargas, N., Luis del Valle-Padilla, J., & Ramos, F. (2018). Towards a model of visual recognition based on neuroscience. Biologically Inspired Cognitive Architectures. https://doi.org/10.1016/j.bica.2018.07.018. Granrud, C. E. (2004). Visual metacognition and the development of size constancy. In D. T. Levin (Ed.), Thinking and Seeing (pp. 75–95). Cambridge: MIT Press. Granrud, C. E. (2012). Judging the size of a distant object: Strategy use by children and adults. In G. Hatfield & S. Allred (Eds.), Visual Experience: Sensation, Cognition, and Constancy. Oxford: Oxford University Press. Grill-Spector, K., Kushnir, T., Hendler, T., Edelman, S., Itzchak, Y., & Malach, R. (1998). A sequence of object-processing stages revealed by FMRI in the Human occipital lobe. Human Brain Mapping, 6, 316–328. Grill-Spector, K., Henson, R., & Martin, A. (2006). Repetition and the brain: Neural models of stimulus-specific effects. 
Trends in Cognitive Sciences, 10, 14–23. Gross, S. (2017). Cognitive penetration and attention. Frontiers in Psychology. https://doi.org/10.3389/fpsyg.2017.00221. Gross, S., Chaisilprungraung, T., Kaplan, E., Menendez, J., & Flombaum, J. (2014). Problems for the purported cognitive penetration of perceptual color experience and Macpherson’s proposed mechanism. In E. Machery & J. Prinz (Eds.), Thought and Perception (pp. 1–30). Lawrence, KS: New Prairie Press. Gupta, A. (2006). Empiricism and Experience. Oxford: Oxford University Press. Hansen, T., Olkkonen, M., Walter, S., & Gegenfurtner, K. (2006). Memory modulates color appearance. Nature Neuroscience, 9(11), 1367–1368. Hanson, N. R. (1958). Patterns of Discovery. Cambridge: Cambridge University Press. Harada, T., Goda, N., Ogawa, T., Ito, M., Toyoda, H., Sadato, N., & Komatsu, H. (2009). Distribution of colour-selective activity in the monkey inferior temporal cortex revealed by functional magnetic resonance imaging. European Journal of Neuroscience, 30, 1960–1970. Hatfield, G. (2002). Perception as unconscious inference. In D. Heyer & R. Mausfeld (Eds.), Perception and the Physical World: Psychological and Philosophical Issues in Perception. West Sussex: Wiley. Hatfield, G. (2009). Perception and Cognition: Essays in the Philosophy of Psychology. Oxford: Clarendon Press. Haugeland, J. (1998). Having Thought. Cambridge: Harvard University Press. Hayden, B. Y., & Gallant, J. L. (2009). Combined effects of spatial and feature-based attention on responses of V4 neurons. Vision Research, 49, 1182–1187. Helmholtz, von H. (1878[1925]). Treatise on Physiological Optics. New York: Dover. Heck, R. G., Jr. (2000). Nonconceptual content and the ‘space of reasons’. Philosophical Review, 109, 483–523. Heck, R. G., Jr. (2007). Are there different kinds of content? In J. Cohen & B. McLaughlin (Eds.), Contemporary Debates in the Philosophy of Mind. Oxford: Blackwell. Hegde, J., & Kersten, D. (2010). A link between visual disambiguation and visual memory. The Journal of Neuroscience, 30(45), 15124–15133. Heeger, D. J., & Ress, D. (2004). Neuronal correlates of visual attention and perception. In M. Gazzaniga (Ed.), The Cognitive Neurosciences (3rd ed.). Cambridge: MIT Press. Heinen, K., Jolij, J., & Lamme, V. A. (2015).
Figure-ground segregation requires two distinct periods of activity in V1: A transcranial magnetic stimulation study. Neuroreport, 16(13), 1483–1487. Heywood, C. A., & Kentridge, R. W. (2003). Achromatopsia, colour vision & cortex. Neurological Clinics of North America, 21, 483–500.

Hochberg, J., & Peterson, M. A. (1987). Piecemeal organization and cognitive components in object perception. Journal of Experimental Psychology: General, 116, 370–380. Hollingworth, A. (2006). Visual memory for natural scenes: Evidence from change detection and visual search. Visual Cognition, 14(4–8), 781–807. Hopfinger, J. B., Luck, S. J., & Hillyard, S. A. (2004). Selective attention. In M. S. Gazzaniga (Ed.), The Cognitive Neurosciences (3rd ed.). Cambridge: MIT Press. Horgan, T., & Tienson, J. (1996). Connectionism and the Philosophy of Psychology. Cambridge: MIT Press. Huemer, M. (2007). Compassionate phenomenal conservatism. Philosophy and Phenomenological Research, 74, 30–55. Huemer, M. (2013). Epistemological asymmetries between belief and experience. Philosophical Studies, 162(3), 741–748. Hume, D. (1739–1740 [2003]). A Treatise of Human Nature. Mineola, NY: Dover. Innui, K., & Kakigi, R. (2006). Temporal analysis of the flow from V1 to extrastriate cortex. Journal of Neurophysiology, 96, 775–784. Irwin, D. E., & Andrews, R. (1996). Integration and accumulation of information across saccadic eye movements. In T. Inui & J. L. McLelland (Eds.), Attention and Performance XVI: Information Integration in Perception and Communication. Cambridge: MIT Press. Im, H. Y., & Halberda, J. (2013). The effects of sampling and internal noise of the representation of ensemble average size. Attention, Perception, & Psychophysics, 75, 278–286. Ito, J., Nikolaev, A. R., Luman, M., Aukes, M. F., Nakatani, C., & Leeuwen, Van C. (2003). Perceptual switching, eye movements, and the bus paradox. Perception, 32(6), 681–698. Itti, L., & Baldi, P. (2005). A principled approach to detecting surprising events in video. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1, 631–637. Itti, L., & Koch, C. (2001). Computational modelling of visual attention. Nature Reviews Neuroscience, 2, 194–204. Jackendoff, R. (1989).
Consciousness and the Computational Mind. Cambridge: MIT Press. Jackson, F. (1977). Perception: A Representative Theory. Cambridge: Cambridge University Press.

Johnson, J. S., & Olshausen, B. A. (2005). The earliest EEG signatures of object recognition in a cued target task are postsensory. Journal of Vision, 5, 299–312.
Johnston, M. (2006). Better than mere knowledge: The function of sensory awareness. In T. S. Gendler & J. Hawthorne (Eds.), Perceptual Experience. Oxford: Clarendon Press.
Julesz, B. (1981). Textons, the elements of texture perception, and their interactions. Nature, 90(12), 91–97.
Kaplan, D. (1978). Dthat. Syntax and Semantics, 9, 221–243.
Kastner, S., & Ungerleider, L. G. (2000). Mechanisms of visual attention in the human cortex. Annual Review of Neuroscience, 23, 315–341.
Kastner, S., Pinsk, M. A., De Weerd, P., Desimone, R., & Ungerleider, L. (1999). Increased activity in human visual cortex during directed attention in the absence of visual stimulation. Neuron, 22, 751–761.
Kawabata, N. (1986). Attention and depth perception. Perception, 15, 563–572.
Kelly, S. D. (2001). Demonstrative concepts and experience. The Philosophical Review, 110(3), 397–420.
Kelso, S. (1995). Dynamic Patterns: The Self Organization of Brain and Behavior. Cambridge: MIT Press.
Kerkoerle, T. van, Self, M. W., Dagnino, B., Gariel-Mathis, M.-A., Poort, J., Togt, C. van der, & Roelfsema, P. R. (2014). Alpha and gamma oscillations characterize feedback and feedforward processing in monkey visual cortex. Proceedings of the National Academy of Sciences, USA (PNAS), 114(40), 14332–14341.
Keysers, C., Xiao, D. K., Földiák, P., & Perrett, D. (2014). The speed of sight. Journal of Cognitive Neuroscience, 13, 90–101.
Kihara, K., & Takeda, Y. (2010). Time course of the integration of spatial frequency-based information in natural scenes. Vision Research, 50, 2158–2162.
Kirchner, H., & Thorpe, S. J. (2006). Ultra-rapid object detection with saccadic movements: Visual processing speed revisited. Vision Research, 46, 1762–1776.
Kitcher, P. (2001). Real realism: The Galilean strategy. Philosophical Review, 110(2), 151–199.
Kok, P., Jehee, J. M. F., & de Lange, F. P. (2012). Less is more: Expectation sharpens representations in the primary visual cortex. Neuron, 75(2), 265–270.

Kok, P., Failing, M., & de Lange, F. (2014). Prior expectations evoke stimulus templates in the primary visual cortex. Journal of Cognitive Neuroscience, 26, 1546–1554.
Kok, P., Brouwer, G., van Gerven, M., & de Lange, F. (2013). Prior expectations bias sensory representations in visual cortex. Journal of Neuroscience, 33, 16275–16284.
Komatsu, H., Ideura, Y., Kaji, S., & Yamane, S. (1992). Color selectivity of neurons in the inferior temporal cortex of the awake macaque monkey. Journal of Neuroscience, 12, 408–424.
Kornmeier, J., & Bach, M. (2009). Object perception: When our brain is impressed but we do not notice it. Journal of Vision, 9(1), 1–10.
Kosslyn, S. M. (1994). Image and Brain. Cambridge: MIT Press.
Kuhn, T. S. (1962). The Structure of Scientific Revolutions. Chicago: Chicago University Press.
Kulvicki, J. (2007). Perceptual content is vertically articulate. American Philosophical Quarterly, 44(4), 357–369.
Kulvicki, J. (2015). Analog representation and the parts principle. Review of Philosophy and Psychology, 6, 165–180.
Kvanvig, J. (2003). Propositionalism and the perspectival aspect of justification. American Philosophical Quarterly, 40(1), 3–18.
Lamme, V. A. F. (2003). Why visual attention and awareness are different. Trends in Cognitive Sciences, 7(1), 12–18.
Lamme, V. A. F. (2005). Independent neural definitions of visual awareness and attention. In A. Raftopoulos (Ed.), The Cognitive Penetrability of Perception: An Interdisciplinary Approach. Hauppauge, NY: NovaScience Books.
Lamme, V. A. F., & Roelfsema, P. R. (2000). The distinct modes of vision offered by feedforward and recurrent processing. Trends in Neurosciences, 23, 571–579.
Lamme, V. A., Zipser, K., & Spekreijse, H. (1998). Figure-ground activity in primary visual cortex is suppressed by anesthesia. Proceedings of the National Academy of Sciences USA, 95, 3263–3268.
Lamme, V. A. F., Supér, H., Landman, R., Roelfsema, P. R., & Spekreijse, H. (2000). The role of primary visual cortex (V1) in visual awareness. Vision Research, 40, 1507–1521.
Lavie, N. (2005). Distracted and confused? Selective attention under load. Trends in Cognitive Sciences, 9, 75–82.

Lawrence, B. M., White, R. L., & Snyder, L. H. (2005). Delay-period activity in visual, visuomovement, and movement neurons in the frontal eye field. Journal of Neurophysiology, 94(2), 1498–1508.
Leopold, D. A., & Logothetis, N. K. (1999). Multistable phenomena: Changing views in perception. Trends in Cognitive Sciences, 3(7), 254–264.
Lee, J., & Maunsell, J. H. R. (2009). A normalization model of attentional modulation of single unit responses. PLoS One, IV(2), e4651.
Ling, S., Liu, T., & Carrasco, M. (2009). How spatial and feature-based attention affect the gain and tuning of population responses. Vision Research, 49, 1194–1204.
Liu, H., Agam, Y., Madsen, J., & Kreiman, G. (2009). Timing, timing, timing: Fast decoding of object information from intracranial field potentials in human visual cortex. Neuron, 62, 281–290.
Liu, T., Stevens, S. T., & Carrasco, M. (2007). Comparing the time course and efficacy of spatial and feature-based attention. Vision Research, 47, 108–113.
Livingstone, M., & Hubel, D. H. (1987). Psychophysical evidence for separate channels for the perception of form, color, movement, and depth. Journal of Neuroscience, 7, 3416–3468.
Long, R. (2017). How wishful seeing is not like wishful thinking. Philosophical Studies. http://doi.org/10.1007/s11098-017-0917-2.
Long, G. M., & Toppino, C. T. (2004). Enduring interest in perceptual ambiguity: Alternating views of reversible figures. Psychological Bulletin, 130(5), 748–768.
Luck, S. J. (1995). Multiple mechanisms of visual-spatial attention: Recent evidence from human electrophysiology. Behavioral and Brain Research, 71, 113–123.
Lowe, J. (1995). Locke on Human Understanding. London: Routledge.
Lupyan, G. (2015). Object knowledge changes visual appearance: Semantic effects on color afterimages. Acta Psychologica, 161, 117–130.
Lyons, J. (2005). Perceptual beliefs and nonexperiential looks. In J. Hawthorne (Ed.), Philosophical Perspectives, 19: Epistemology. Malden: Blackwell.
Lyons, J. (2009). Perception and Basic Beliefs. Oxford: Oxford University Press.
Lyons, J. (2011). Circularity, reliability, and the cognitive penetrability of perception. Philosophical Issues, 21, The Epistemology of Perception, 289–311.
Lyons, J. C. (2015). Inferentialism and cognitive penetrability of perception. Episteme, 13(1), 1–28.
Lyons, J. C. (2016). Experiential evidence. Philosophical Studies, 173, 1053–1079.
Macpherson, F. (2012). Cognitive penetration of colour experience: Rethinking the issue in light of an indirect mechanism. Philosophy and Phenomenological Research, 84(1), 24–62.

Mackie, J. (1976). Problems from Locke. Oxford: Oxford University Press.
Marchi, F. (2016). Attention and cognitive penetrability: The epistemic consequences of attention as a form of metacognitive regulation. Consciousness and Cognition. https://doi.org/10.1016/j.concog.2016.06.014.
Markie, P. J. (2005). The mystery of direct perceptual justification. Philosophical Studies, 126, 347–373.
Markie, P. J. (2006). Epistemically appropriate perceptual belief. Nous, 40, 118–142.
Markie, P. J. (2013). Searching for true dogmatism. In C. Tucker (Ed.), Seemings and Justification (pp. 248–269). Oxford: Oxford University Press.
Marr, D. (1982). Vision: A Computational Investigation into Human Representation and Processing of Visual Information. San Francisco, CA: Freeman.
Matthen, M. (2005). Seeing, Doing, and Knowing. Oxford: Oxford University Press.
McAllister, B. (2018). Seemings as sui generis. Synthese, 195, 3079–3096.
McDowell, J. (1994). Mind and World. Cambridge: Harvard University Press.
McDowell, J. (2011). Perception as a Capacity for Knowledge. Milwaukee, WI: Marquette University Press.
McGrath, M. (2013a). Siegel and the impact for epistemological internalism. Philosophical Studies, 162(3), 723–732.
McGrath, M. (2013b). Phenomenal conservatism and cognitive penetration. In C. Tucker (Ed.), Seemings and Justification (pp. 225–247). Oxford: Oxford University Press.
McGrath, M. (2016). Schellenberg on the epistemic force of experience. Philosophical Studies, 173, 897–905.
McGrath, M., & Fantl, J. (2002). Evidence, pragmatics, and justification. The Philosophical Review, 111, 67–94.
Meng, M., & Tong, F. (2004). Can attention selectively bias bistable perception? Journal of Vision, 4, 539–551.
Millar, A. (2011). How visual perception yields reasons for belief. Philosophical Issues, 21, The Epistemology of Perception, 332–351.
Mole, C. (2015). Attention and cognitive penetration. In J. Zeimbekis & A. Raftopoulos (Eds.), The Cognitive Penetrability of Perception: New Philosophical Perspectives (pp. 218–238). Oxford: Oxford University Press.
Montemayor, C., & Haladjian, H. (2015). Consciousness, Attention, and Conscious Attention. Cambridge: MIT Press.

Morishima, Y., Akaishi, R., Yamada, Y., Okuda, J., Toma, K., & Sakai, K. (2008). Task-specific signal transmission from prefrontal cortex in visual selective attention. Nature Neuroscience, 12(1), 85–90.
Muller, M. M., Andersen, S. K., Trujillo, N. J., Valdes-Sosa, P., Malinowski, P., & Hillyard, S. A. (2006). Feature-selective attention enhances color signals in early visual areas of the human brain. Proceedings of the National Academy of Science USA, 103, 14250–14254.
Murray, S. O. (2008). The effects of spatial attention in early human visual cortex are stimulus independent. Journal of Vision, 8(10), 1–11.
Murray, S. O., Schrater, P., & Kersten, D. (2004). Perceptual grouping and the interactions between visual cortical areas. Neural Networks, 17, 695–705.
Nakatani, H., & van Leeuwen, C. (2006). Transient synchrony of distant brain areas and perceptual switching in ambiguous figures. Biological Cybernetics, 94, 445–457.
Nanay, B. (2010). Perception and imagination: Amodal perception as mental imagery. Philosophical Studies, 150, 239–254.
Nanay, B. (2015). Perceptual content and the content of mental imagery. Philosophical Studies, 172, 1723–1736.
Newen, A., & Vetter, P. (2016). Why cognitive penetration of our perceptual experience is still the most plausible account. Consciousness and Cognition. http://dx.doi.org/10.1016/j.concog.2016.09.005.
Niedeggen, M., Wichmann, P., & Stoerig, P. (2001). Change blindness and time to consciousness. European Journal of Neuroscience, 14, 1719–1726.
Nikolic, D., & Singer, W. (2007). Creation of visual long-term memory. Perception and Psychophysics, 69(6), 904–912.
Nobre, A. C., Rohenkohl, G., & Stokes, M. G. (2012). Nervous anticipation: Top-down biasing across space and time. In M. Posner (Ed.), Cognitive Neuroscience of Attention (2nd ed.). New York, NY: Guilford Press.
Noe, A. (2004). Action in Perception. Cambridge: MIT Press.
Norretranders, T. (1998). The User Illusion: Cutting Consciousness Down to Size. New York: Penguin Books.
O’Callaghan, C., Kveraga, K., Shine, J. M., Adams, R. B. Jr., & Bar, M. (2016). Predictions penetrate perception: Converging insights from brain, behavior and disorder. Consciousness and Cognition. http://doi.org/10.1016/j.concog.2016.05.003.

Ogilvie, R., & Carruthers, P. (2015). Opening up vision: The case against encapsulation. Review of Philosophy and Psychology. http://doi.org/10.1007/s13164-015-0294.
O’Shea, J., Muggleton, N. G., Cowey, A., & Walsh, V. (2004). Timing of target discrimination in human frontal eye fields. Journal of Cognitive Neuroscience, 16(6), 1060–1067.
Palmer, S. (1999). Vision Science. Cambridge: MIT Press.
Pautz, A. (2015). What is my evidence that here is a cup. Philosophical Studies, 173, 915–927.
Peacocke, C. (1998). Nonconceptual content defended. Philosophy and Phenomenological Research, 58(2), 381–388.
Peacocke, C. (2001). Does perception have a nonconceptual content? The Journal of Philosophy, XCVIII(5), 239–269.
Peacocke, C. (2004). The Realm of Reason. New York, NY: Oxford University Press.
Perry, J. (2001). Knowledge, Possibility, and Consciousness. Cambridge: MIT Press.
Peterson, M. (2003). Overlapping partial configurations in object memory. In M. Peterson & G. Rhodes (Eds.), Perception of Faces, Objects, and Scenes: Analytic and Holistic Processes. New York, NY: Oxford University Press.
Peterson, M. A., & Enns, J. (2005). The edge complex: Implicit memory for figure assignment in shape perception. Perception and Psychophysics, 67(4), 727–740.
Peterson, M. A., & Gibson, B. S. (1991). Directing spatial attention within an object: Altering the functional equivalence of shape descriptions. Journal of Experimental Psychology: Human Perception and Performance, 17, 170–182.
Peterson, M. A., & Gibson, B. S. (1994). Must figure-ground organization precede object recognition? An assumption in peril. Psychological Science, 5, 253–259.
Peterson, M. A., & Hochberg, J. (1983). Opposed set-measurements procedure: A quantitative analysis of the role of local cues and intention in form perception. Journal of Experimental Psychology: Human Perception and Performance, 9, 183–193.
Peyrin, C., Michel, C. M., Schwartz, S., Thut, G., Seghier, M., Landis, T., et al. (2010). The neural processes and timing of top-down processes during coarse-to-fine categorization of visual scenes: A combined fMRI and ERP study. Journal of Cognitive Neuroscience, 22, 2678–2780.

Phillips, B. (2017). The shifting border between perception and cognition. Nous, 1–31. http://doi.org/10.1111/nous.12218.
Pitts, M., Martinez, A., Brewer, J. B., & Hillyard, S. (2010). Early stages of figure-ground segregation during perception of the face-vase. Journal of Cognitive Neuroscience, 23(4), 880–895.
Pitts, M., Nerger, J., & Davis, T. J. R. (2007). Electrophysiological correlates of perceptual reversals for three different types of multistable images. Journal of Vision, 7(1), 1–14.
Plantinga, A. (1993). Warrant and Proper Function. New York, NY: Oxford University Press.
Plomp, G., Hervais-Adelman, A., Astolfi, L., & Michel, C. M. (2015). Early recurrence and ongoing parietal driving during elementary visual processing. Nature, Scientific Reports, 5, 18733. https://doi.org/10.1038/srep18733.
Potter, M. C., Wyble, B., Hagmann, C. E., & McCourt, E. S. (2014). Detecting meaning in RSVP at 13 ms per picture. Attention, Perception, & Psychophysics, 76, 270–279.
Prinz, J. J. (2002). Furnishing the Mind. Cambridge: MIT Press.
Pryor, J. (2000). The sceptic and the dogmatist. Nous, 34, 517–549.
Pryor, J. (2005). There is immediate justification. In M. Steup & E. Sosa (Eds.), Contemporary Debates in Epistemology (pp. 181–201). Malden, MA: Blackwell.
Pylyshyn, Z. (1984). Computation and Cognition: Toward a Foundation for Cognitive Science. Cambridge: MIT Press.
Pylyshyn, Z. (1999). Is vision continuous with cognition? Behavioral and Brain Sciences, 22, 341–365.
Pylyshyn, Z. (2003). Seeing and Visualizing: It’s Not What You Think. Cambridge: MIT Press.
Pylyshyn, Z. (2007). Things and Places: How the Mind Connects with the World. Cambridge: MIT Press.
Raftopoulos, A. (2001a). Is perception informationally encapsulated? The issue of the theory-ladenness of perception. Cognitive Science, 25, 423–451.
Raftopoulos, A. (2001b). Reentrant pathways and the theory-ladenness of observation. Philosophy of Science, 68, 187–200.
Raftopoulos, A. (2006). Defending realism on the proper ground. Philosophical Psychology, 19(1), 1–31.
Raftopoulos, A. (2008). Perceptual systems and realism. Synthese, 164(1), 61–91.

Raftopoulos, A. (2009). Cognition and Perception: How Do Psychology and Neural Science Inform Philosophy? Cambridge: MIT Press.
Raftopoulos, A. (2010). Can nonconceptual content be stored in visual memory? Philosophical Psychology, 23(5), 639–668.
Raftopoulos, A. (2011a). Ambiguous figures and representationalism. Synthese, 181, 489–514.
Raftopoulos, A. (2011b, November 30). Late vision: Its processes and epistemic status. Frontiers in Psychology, 2, 382. http://doi.org/10.3389/fpsyg.2011.00382.
Raftopoulos, A. (2014a). The cognitive impenetrability of the content of early vision is a necessary and sufficient condition for purely nonconceptual content. Philosophical Psychology, 27(5), 601–620.
Raftopoulos, A. (2014b). Nonconceptual content: A reply to Toribio’s “nonconceptualism and the cognitive penetration of early vision.” Philosophical Psychology, 27(5), 643–651.
Raftopoulos, A. (2014c). Does the emotional modulation of visual experience entail the cognitive penetrability or emotional penetrability of early vision? In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Conference of the Cognitive Science Society (pp. 1178–1184). Mahwah, NJ: Lawrence Erlbaum.
Raftopoulos, A. (2015a). The cognitive impenetrability of perception and theory-ladenness. Journal of General Philosophy of Science, 46(1), 87–103.
Raftopoulos, A. (2015b). Cognitive penetrability and consciousness. In J. S. Zeimbekis & A. Raftopoulos (Eds.), Cognitive Effects on Perception: New Philosophical Perspectives (pp. 268–298). Oxford: Oxford University Press.
Raftopoulos, A. (2015c). Abductive inferences in late vision. In L. Magnani, W. Park, & Li Ping (Eds.), Philosophy and Cognitive Science II, Studies in Applied Philosophy, Epistemology and Rational Ethics 20 (pp. 155–177). Switzerland: Springer.
Raftopoulos, A. (2017a). Cognitive penetration lite and nonconceptual content. Erkenntnis, 82(5), 1097–1122. http://doi.org/10.1007/s10670-016-9861-3.
Raftopoulos, A. (2017b). Vision, thinking, and model-based inferences. In L. Magnani & T. Bertolotti (Eds.), Springer Handbook of Model-Based Science (pp. 573–605). Dordrecht: Springer.
Raftopoulos, A. (2017c). Pre-cueing, the epistemic role of early vision and the cognitive impenetrability of early vision. Frontiers in Psychology, 8, 1156. https://doi.org/10.3389/fpsyg.2017.01156.

Raftopoulos, A., & Muller, V. (2006a). The phenomenal content of experience. Mind and Language, 27(2), 187–219.
Raftopoulos, A., & Muller, V. (2006b). Nonconceptual demonstrative reference. Philosophy and Phenomenological Research, 72(2), 251–285.
Raftopoulos, A., & Zeimbekis, J. (2015). The cognitive penetrability of perception: An overview. In J. Zeimbekis & A. Raftopoulos (Eds.), The Cognitive Penetrability of Perception: New Perspectives. Oxford: Oxford University Press.
Ramachandran, V. S. (1988). Perception of shape from shading. Nature, 331, 163–166.
Recanati, F. (1997). Direct Reference: From Language to Thought. Oxford: Blackwell.
Rensink, R. A. (2000a). Seeing, sensing, and scrutinizing. Vision Research, 40, 1469–1487.
Rensink, R. A. (2000b). Visual search for change: A probe into the nature of attentional processing. Visual Cognition, 7, 345–376.
Rescorla, M. (2009). Cognitive maps and the language of thought. British Journal for the Philosophy of Science, 60(2), 377–407.
Rescorla, M. (2014). The causal relevance of content to computation. Philosophy and Phenomenological Research, 88(1), 173–208.
Rentzeperis, I., Nikolaev, A. R., Kiper, D., & van Leeuwen, C. (2014). Distributed processing of color and form in the visual cortex. Frontiers in Psychology. https://doi.org/10.3389/fpsyg.2014.00932.
Reynolds, J. H., & Chelazzi, L. (2004). Attentional modulation of visual processing. Annual Review of Neuroscience, 27, 611–647.
Reynolds, J. H., Chelazzi, L., & Desimone, R. (2000). Attention increases sensitivity in V4 neurons. Neuron, 26, 703–714.
Rock, I. (1983). The Logic of Perception. Cambridge: MIT Press.
Roe, A. W., Chelazzi, L., Connor, C. E., Conway, B. R., Fujita, I., Gallant, J. L., et al. (2012). Toward a unified theory of visual area V4. Neuron, 74, 12–29.
Roelfsema, P. R., Lamme, V. A. F., & Spekreijse, H. (1998). Object-based attention in the primary visual cortex of the macaque monkey. Nature, 395, 376–381.
Rosenholtz, R., Huang, J., Raj, A., Balas, B. J., & Ilie, L. (2012). A summary statistic representation in peripheral vision explains visual search. Journal of Vision, 12(4), 14.

Santos, I. M., Iglesias, J., Olivares, E. I., & Young, A. W. (2008). Differential effects of object-based attention on evoked potentials to fearful and disgusted faces. Neuropsychologia, 46, 1468–1479.
Schall, J. D., & Bichot, N. P. (1998). Neural correlates of visual and motor decision processes. Current Opinion in Neurobiology, 76, 2841–2852.
Schellenberg, S. (2008). The situation dependency of perception. Journal of Philosophy, 105, 55–84.
Schellenberg, S. (2011). Perceptual content defended. Nous, 45, 714–750.
Schellenberg, S. (2013). Experience and evidence. Mind, 122(487), 699–747.
Schellenberg, S. (2014). The epistemic force of perceptual experience. Philosophical Studies, 170, 87–100.
Schellenberg, S. (2016a). Phenomenal evidence and factive evidence. Philosophical Studies, 173, 875–896.
Schellenberg, S. (2016b). Phenomenal evidence and factive evidence defended: Replies to McGrath, Pautz, and Neta. Philosophical Studies, 173, 929–946.
Searle, J. R. (1995). Consciousness, explanatory inversion and cognitive science. In C. Macdonald & G. Macdonald (Eds.), Philosophy of Psychology: Debates on Psychological Explanation. Oxford: Blackwell.
Searle, J. (2015). Seeing Things as They Are. Oxford: Oxford University Press.
Sellars, W. (1954). Physical realism. Philosophy and Phenomenological Research, 15(1), 13–32.
Sellars, W. (1956). Empiricism and the philosophy of mind. In H. Feigl & M. Scriven (Eds.), Minnesota Studies in the Philosophy of Science (Vol. I, pp. 253–329). Minneapolis: University of Minnesota Press.
Sellars, W. (1968). Science and Metaphysics. London: Routledge & Kegan Paul.
Sellars, W. (1977). Some reflections on perceptual consciousness. In R. Bruzina & B. Wilshire (Eds.), Selected Studies in Phenomenology and Existential Philosophy (pp. 169–185). The Hague: Nijhoff.
Sellars, W. (1981). Foundations for the metaphysics of pure process (The Carus lectures). The Monist, 64, 3–90.
Sellars, W. (1982). Sensa or sensings: Reflections on the ontology of perception. Philosophical Studies, 41, 83–111.
Sellars, W. (2002). The role of imagination in Kant’s theory of experience. In J. F. Sicha (Ed.), Kant’s Transcendental Metaphysics: Sellars’s Cassirer Lectures and Other Essays. Atascadero, CA: Ridgeview Publishing.
Sergent, C., Baillet, S., & Dehaene, S. (2005). Timing of the brain events underlying access to consciousness during the attentional blink. Nature Neuroscience, 8, 1391–1400.

Shams, L., & Beierholm, U. R. (2010). Causal inference in perception. Trends in Cognitive Sciences (Regular Edition), 14, 425–432.
Shibata, K., Yamagishi, N., Naokazu, G., Yoshioka, T., Yamashita, O., Sato, M., & Kawato, M. (2008). The effects of feature attention on prestimulus cortical activity in the human visual system. Cerebral Cortex, 18, 1644–1675.
Shoemaker, S. (2002). Introspection and phenomenal character. In D. Chalmers (Ed.), Philosophy of Mind. Oxford: Oxford University Press.
Shoemaker, S. (2006). On the way things appear. In T. S. Gendler & J. Hawthorne (Eds.), Perceptual Experience (pp. 461–481). Oxford: Clarendon Press.
Siegel, S. (2006). Which properties are represented in perception? In T. S. Gendler & J. Hawthorne (Eds.), Perceptual Experience (pp. 481–504). Oxford: Clarendon Press.
Siegel, S. (2011). Cognitive penetrability and perceptual justification. Nous, 46, 201–222.
Siegel, S. (2012). The Contents of Visual Experience. Oxford: Oxford University Press.
Siegel, S. (2013a). The epistemic impact of the etiology of experience. Philosophical Studies, 162, 697–722.
Siegel, S. (2013b). Can selection effects influence the rational role of experience? In T. Gendler (Ed.), Oxford Studies in Epistemology (Vol. 4, pp. 240–270). Oxford: Oxford University Press.
Siegel, S. (2015). Epistemic charge. Proceedings of the Aristotelian Society, CXV(3), 277–305.
Siegel, S. (2016). How is wishful seeing like wishful thinking? Philosophy and Phenomenological Research. https://doi.org/10.1111/phpr.12273.
Siegel, S., & Silins, N. (2014). Consciousness, attention, and justification. In D. Dodd & E. Zardini (Eds.), Scepticism and Perceptual Justification (pp. 149–169). Oxford: Oxford University Press.
Silins, N. (2005). Deception and evidence. Philosophical Perspectives, 19, 375–404.
Silvanto, J., Cowey, A., Lavie, N., & Walsh, V. (2005). Striate cortex (V1) activity gates awareness of motion. Nature Neuroscience, 8(2), 143–144.
Silvanto, J., Lavie, N., & Walsh, V. (2006). Stimulation of the human frontal eye fields modulates sensitivity of extrastriate visual cortex. Journal of Neurophysiology, 96(2), 941–945.

Sligte, I. G., Vandenbroucke, A. R. E., Scholte, H. S., & Lamme, V. A. F. (2010, October). Detailed sensory memory, sloppy working memory. Frontiers in Psychology, 1, 175.
Sligte, I. G., Wokke, M. E., Tesselaar, J. P., Scholte, H. S., & Lamme, V. A. F. (2011). Magnetic stimulation of the dorsolateral prefrontal cortex dissociates fragile visual short-term memory from visual working memory. Neuropsychologia, 49, 1578–1588.
Smith, A. D. (2002). The Problem of Perception. Cambridge: The Harvard University Press.
Smithies, D. (2014). The phenomenal basis of epistemic justification. In J. Kallestrup & M. Sprevak (Eds.), New Waves in Philosophy of Mind. New York: Palgrave Macmillan.
Smithies, D. (2016). Perception and the external world. Philosophical Studies, 173, 1119–1145.
Soames, S. (2010). What Is Meaning? Princeton, NJ: Princeton University Press.
Sosa, E. (2003). Beyond internal foundations to external virtues. In L. Bonjour & E. Sosa (Eds.), Epistemic Justification: Internalism vs. Externalism, Foundations vs. Virtues. Oxford: Blackwell.
Sosa, D. (2011). Some of the structure of experience and belief. Philosophical Issues, 21, The Epistemology of Perception, 474–484.
Speaks, J. (2005). The Phenomenal and the Representational. Oxford: Oxford University Press.
Spelke, E. S. (1988). Object perception. In A. I. Goldman (Ed.), Readings in Philosophy and Cognitive Science (pp. 447–461). Cambridge: MIT Press.
Stalnaker, R. C. (2008). Our Knowledge of the Internal World. Oxford: Clarendon Press.
Stazicker, J. (2011). Attention, visual consciousness and indeterminacy. Mind and Language, 26(2), 156–184.
Sterzer, P., Kleinschmidt, A., & Rees, G. (2009). The neural bases of multistable perception. Trends in Cognitive Sciences, 13(7), 310–319.
Steup, M. (2018). Destructive defeat and justificational force: The dialectic of dogmatism, conservatism, and metaevidentialism. Synthese, 195, 2907–2933.
Stich, S. (1978). Beliefs and subdoxastic states. Philosophy of Science, 45, 499–518.
Stokes, D. (2012). Perceiving and desiring: A new look at the cognitive penetrability of experience. Philosophical Studies, 158(3), 479–492.

Stokes, D. (2015). Towards a consequentialist understanding of cognitive penetration. In J. Zeimbekis & A. Raftopoulos (Eds.), Cognitive Penetrability of Perception: New Philosophical Perspectives (pp. 75–100). Oxford: Oxford University Press.
Stokes, D. (2017). Attention and the cognitive penetrability of perception. Australasian Journal of Philosophy. https://doi.org/10.1080/00048402.2017.1332080.
Strawson, P. (1974). Imagination and perception. In P. Strawson (Ed.), Freedom and Resentment (pp. 45–65). London: Methuen.
Sugase, Y., Yamane, S., Ueno, S., & Kawano, K. (1999). Global and fine information coded by single neurons in the temporal visual cortex. Nature, 400(6747), 869–873.
Tanigawa, H., Lu, H. D., & Roe, A. W. (2010). Functional organization for color and orientation in macaque V4. Nature Neuroscience, 13, 1542–1548.
Tatler, B. W. (2002). What information survives saccades in the real world? In J. Hyona, D. P. Munoz, W. Heide, & R. Radach (Eds.), Progress in Brain Research, 140, 149–163.
Taylor, P. C. J., & Nobre, A. (2007). FEF TMS affects visual cortical activity. Cerebral Cortex, 17, 391–399.
Thelen, E., & Smith, L. (1994). A Dynamic System Approach to the Development of Cognition and Action. Cambridge: MIT Press.
Thompson, K. G., & Schall, J. D. (2000). Antecedents and correlates of visual detection and awareness in macaque prefrontal cortex. Vision Research, 40, 1523–1538.
Toppino, T. (2003). Reversible-figure perception. Perception and Psychophysics, 65, 1285–1295.
Toribio, J. (2014). Nonconceptualism and the cognitive impenetrability of early vision. Philosophical Psychology. https://doi.org/10.1080/09515089.2014.893386.
Torralba, A., & Oliva, A. (2003). Statistics of natural image categories. Network, 14, 391–412.
Travis, C. (2004). The silence of the senses. Mind, 113, 57–94.
Treisman, A. (2006). How the deployment of attention determines what we see. Visual Cognition, 14, 411–443.
Treisman, A., & Kanwisher, N. G. (1998). Perceiving visually presented objects: Recognition, awareness, and modularity. Current Opinion in Neurobiology, 8, 218–226.

Tse, P. U. (2005). Voluntary attention modulates the brightness of overlapping transparent surfaces. Vision Research, 45, 1095–1098.
Tucker, C. (2010). Why open-minded people should endorse dogmatism. Philosophical Perspectives, 24, Epistemology, 529–545.
Tucker, C. (2014). If dogmatists have a problem with cognitive penetration, you do too. Dialectica, 68(1), 35–62.
Tye, M. (1995). Ten Problems of Consciousness. Cambridge: MIT Press.
Tye, M. (2000). Consciousness, Color, and Content. Cambridge: MIT Press.
Tye, M. (2002). Visual qualia and visual content revisited. In D. Chalmers (Ed.), Philosophy of Mind (pp. 447–457). Oxford: Oxford University Press.
Tye, M. (2006). Nonconceptual content, richness and fineness of grain. In T. Gendler & J. Hawthorne (Eds.), Perceptual Experience (pp. 504–530). Oxford: Oxford University Press.
Tye, M. (2009). Consciousness Revisited: Materialism Without Phenomenal Concepts. Cambridge: MIT Press.
Ullman, S., Vidal-Naquet, M., & Sali, E. (2002). Visual features of intermediate complexity and their use in classification. Nature Neuroscience, 5(7), 682–687.
Usher, M., & Niebur, E. (1996). Modeling the temporal dynamics of IT neurons in visual search: A mechanism for top-down selective attention. Journal of Cognitive Neuroscience, 8(4), 311–327.
Utochkin, I. S. (2015). Ensemble summary statistics as a basis for rapid visual categorization. Journal of Vision, 15(4), 1–14.
Vahid, H. (2014). Cognitive penetration, the downgrade principle, and extended cognition. Philosophical Issues, 24(1), 439–459.
Vandenbroucke, A. R. E., Sligte, I. S., & Lamme, V. A. F. (2011). Manipulations of attention dissociate fragile visual short-term memory from visual working memory. Neuropsychologia, 49, 1559–1568.
Vandenbroucke, A. R. F., Fahrenfort, J. J., Sligte, I. G., & Lamme, V. A. F. (2014). Seeing without knowing: Neural signatures of perceptual inference in the absence of report. Journal of Cognitive Neuroscience, 26(5), 955–969.
Van Leeuwen, C., Steyvers, M., & Nooter, M. (1997). Stability and intermittency in large-scale coupled oscillator models for perceptual segmentation. Journal of Mathematical Psychology, 41, 319–344.
Van Rullen, R., & Thorpe, S. J. (2001). The time course of visual processing: From early perception to decision making. Journal of Cognitive Neuroscience, 13, 454–461.

Vetter, P., & Newen, A. (2014). Varieties of cognitive penetration in visual perception. Consciousness and Cognition, 27, 62–75.
Vogel, E. K., Woodman, G. F., & Luck, S. J. (2001). Storage of features, conjunctions, and objects in visual working memory. Journal of Experimental Psychology: Human Perception and Performance, 27, 92–114.
Wang, P., & Cottrell, G. W. (2017). Central and peripheral vision for scene recognition: A neurocomputational modeling exploration. Journal of Vision, 17(4), 1–22.
Watzl, S. (2017). Structuring the Mind: The Nature of Attention & How It Shapes Consciousness. Oxford: Oxford University Press.
Williamson, T. (2000). Knowledge and Its Limits. Oxford: Oxford University Press.
Wilson, H. R., & Wilkinson, F. (2015). From orientations to objects: Configural processing in the ventral system. Journal of Vision, 15(7:4), 1–10.
Witzel, C., & Gegenfurtner, K. R. (2018). Are red, yellow, green, and blue perceptual categories? Vision Research. https://doi.org/10.1016/j.visres.2018.04.002.
Wokke, M., Sligte, I. G., Scholte, H. S., & Lamme, V. A. F. (2012). Two critical periods in early visual cortex during figure-ground segregation. Brain and Behavior, 2(6), 763–777.
Wolfe, J. M., Oliva, C., Horowitz, T. S., Butcher, S. J., & Bompas, A. (2002). Segmentation of objects from background in visual search tasks. Vision Research, 42, 2985–3004.
Wu, W. (2013). Visual spatial constancy and modularity: Does intention penetrate vision? Philosophical Studies, 165, 647–669.
Wu, W. (2017). Shaking up the mind’s ground floor: The cognitive penetration of visual attention. The Journal of Philosophy, 114(1), 5–32.
Wyart, V., Dehaene, S., & Tallon-Baudry, C. (2012). Early dissociation between neural signatures of endogenous spatial attention and perceptual awareness during visual masking. Frontiers in Human Neuroscience, 10. https://doi.org/10.3389/fnhum.2012.00016.
Wyart, V., Nobre, A., & Summerfield, C. (2012). Dissociable prior influences of signal probability and relevance on visual contrast sensitivity. Proceedings of the National Academy of Sciences, 109, 3593–3598.
Xiao, Y., Wang, Y., & Felleman, D. J. (2003). A spatially organized representation of colour in macaque cortical area V2. Nature, 421, 535–539.


Yamada, F. (1998). Frontal midline theta rhythm and eye blinking activity during a VDT task and a video game. Ergonomics, 41, 678–688.
Yamagishi, N., Callan, D. E., Goda, N., Anderson, S. J., Yoshida, Y., & Kawato, M. (2003). Attentional modulation of oscillatory activity in human visual cortex. NeuroImage, 20, 98–113.
Zeimbekis, J. (2013). Color and cognitive penetrability. Philosophical Studies, 165(1), 167–175.
Zeimbekis, J., & Raftopoulos, A. (2015). Cognitive Penetrability: New Philosophical Perspectives. Oxford: Oxford University Press.
Zhou, H. H., & Thompson, K. G. (2009). Cognitively directed spatial attention in the frontal eye field in anticipation of visual stimuli to be discriminated. Vision Research, 49, 1205–1215.

Index

A

Ambiguous figures 100, 108, 111, 116, 228, 232, 233, 240–242, 297, 300–305, 307
Amodal completion 206, 266–269, 275, 323, 324, 326, 328, 329
Analogy thesis 28, 44, 50, 51, 53, 56
Attention 3, 10, 11, 16, 21, 24, 26, 30–33, 35, 36, 38, 39, 45, 46, 49, 65, 85–93, 95, 100, 102, 103, 105–115, 117–122, 126, 130, 133–137, 143, 146, 150, 161, 174, 179–182, 189, 193, 196, 197, 200–211, 223, 227–230, 232, 233, 237–243, 246, 248, 249, 256, 262–264, 273, 299–302, 304–307, 326–328

B

Beliefs 1–5, 7–22, 24–26, 28, 29, 31, 34, 36, 37, 39–41, 43, 47, 50, 52, 57, 58, 60–67, 72, 74, 75, 89, 91–93, 95, 96, 99, 104, 120–122, 124, 125, 142, 144, 149, 165, 166, 171, 172, 195, 225, 228–233, 235, 238, 239, 244, 245, 247, 248, 255, 266, 275–277, 279, 281–283, 286, 308–316, 319–324, 328

C

Cognitive impenetrability 96
Cognitive penetration, penetrability 36, 94, 96, 107, 112, 113, 126, 128, 135, 240

© The Editor(s) (if applicable) and The Author(s) 2019 A. Raftopoulos, Cognitive Penetrability and the Epistemic Role of Perception, Palgrave Innovations in Philosophy, https://doi.org/10.1007/978-3-030-10445-0


Conceptualization 21, 22, 255, 265, 281, 283, 285, 287, 309, 312, 321
Consequentialism 103
Constructivism 3, 103, 105, 122, 226, 234, 235, 238–240, 244, 246, 257

D

Deflationary 115
Dense basis functions 138, 140, 147
Dense representations 136, 272, 284
Direct cognitive effects 4, 89, 91, 101, 106, 117, 135
Directness condition 105, 117, 118, 122, 123
Discursive inferences 19, 27, 35, 44, 48, 49, 55, 56, 97, 98, 253, 254, 257, 278, 279, 282, 285, 286, 294, 295, 328
Dispositional 276, 277, 320, 322, 324

E

Early vision 4, 7, 11, 17, 18, 21, 30, 37, 38, 40, 45, 46, 54, 55, 65, 68, 73, 74, 77, 86, 87, 90, 91, 93–98, 114, 120, 121, 123–125, 128, 134, 137, 138, 140, 159–161, 168–175, 179, 181–188, 192, 195–201, 205, 206, 208, 209, 211, 212, 224–226, 228–232, 234, 236–238, 240–249, 254, 256, 257, 259–261, 265, 266, 268, 273, 275, 280–282, 285, 286, 295, 296, 302, 308, 310, 311, 313–315, 317, 321, 327–331
Epistemic condition 92, 120, 122, 123
Epistemic role of early vision 4, 17, 18, 90, 124, 125, 160, 161, 200, 208, 212, 225, 230, 245, 247
Epistemic role of late vision 124
Epistemic role of perception 1–5, 12, 13, 25, 37, 40, 56, 60, 89, 91, 92, 103, 104, 106, 111, 120–124, 228, 229, 243–245, 247, 249, 321
Evidence 3, 8–11, 14, 15, 17–20, 31, 33, 34, 36–42, 45, 46, 49–53, 56–59, 65–69, 71, 72, 86, 95, 98, 105, 114, 117, 125, 131, 134, 138, 140, 141, 151, 159–162, 172–174, 178, 181, 182, 185–187, 189, 191, 196–198, 201, 203, 204, 206, 208, 210, 225, 226, 228–233, 238, 245, 246, 256, 262, 274, 275, 280, 281, 301, 303, 306, 310, 319–322, 325, 326, 329
Externalism, externalist 5–9, 12, 13, 17, 19, 20, 22, 27, 36, 46, 52, 59, 66, 70, 77

F

Factive evidence 59, 72, 73, 75

H

Hierarchical generative model 258

I

Iconic image 11, 33–38, 40, 45, 46, 65, 66, 88, 109–111, 113, 114, 125, 148, 224, 225, 228–233, 235, 237–243, 245, 246, 248, 281, 296, 308, 309
Indirect cognitive effects 4, 39, 77, 86–89, 91, 92, 101, 120, 121, 123, 132, 135, 153, 159–161, 224, 245, 247
Inferences 9, 19, 27, 37, 41, 43, 44, 47, 52–56, 60, 97, 98, 145, 162, 163, 169, 251–255, 257, 261, 268, 275, 276, 278–282, 286, 287, 295, 307, 310, 320, 322
Inferentialism 9, 13, 25, 26, 37, 56
Internalism 5, 12, 14, 17, 22, 26, 52, 66

J

Judgments 8, 73, 74, 87, 99, 254, 256, 276, 277, 279, 319–323
Justification 1, 5, 8, 9, 12, 14, 16, 17, 19–23, 25, 30, 40, 41, 63, 66, 67, 69, 76, 77, 92, 96, 124, 320, 328

L

Late vision 3, 4, 7, 10, 11, 17–19, 21, 30, 33, 34, 37, 38, 40, 42, 45, 46, 49, 56, 65, 66, 86–92, 97, 98, 104–106, 109, 110, 113, 114, 116, 117, 119–122, 124, 125, 128, 132–134, 136, 148, 149, 159, 161, 170, 175, 183, 184, 200, 206, 208, 211, 223–226, 228, 230–237, 240–249, 254–256, 260, 261, 265–268, 270–272, 275, 276, 278, 280–282, 285–287, 294–297, 300–302, 308–316, 319–322, 324–331

M

Mental imagery, imagination 161, 266–272, 317, 323, 329–331
Modal completion 266

O

Operational constraints 162–168, 255, 280, 330

P

Pattern matching 49, 255, 281, 286, 287, 294, 296, 297, 301, 307, 308
Percept 1, 4, 5, 8, 9, 11, 12, 21, 25, 31, 32, 34, 36, 37, 39, 40, 43–47, 49–51, 54, 65, 85, 88, 89, 93, 100, 101, 107, 109, 111–114, 116, 124, 125, 139, 146, 149, 150, 152, 161, 162, 165, 190, 200, 203, 208, 225–229, 232, 233, 236–238, 240–243, 245, 246, 251, 252, 254, 268, 269, 279, 289, 294, 299–304, 307, 308, 310, 313
Perceptual capacities 67–73, 172
Perceptual conservatism 14
Perceptual dogmatism 14


Perceptual grounding 56
Perceptual justification 5, 14, 16, 17, 19–23, 25, 76, 77, 124, 245
Perceptual learning 27, 30, 34, 47, 49, 60, 101, 110, 160, 170, 172, 235, 240, 260
Perceptual mechanisms 35, 104, 141
Perceptual processing 2–4, 8, 16, 24, 31–33, 35, 37, 38, 45, 46, 48, 49, 86, 88, 91, 95, 101–105, 107–112, 114–120, 123, 126–128, 130, 132–136, 142, 146, 148–153, 160, 162, 167, 169, 171, 179, 191, 199, 201, 204, 207–211, 223–225, 227, 228, 231, 235, 242, 244–246, 261, 262, 280, 294, 300, 327, 330, 331
Phenomenal conservatism 12, 56, 58, 60, 63, 64, 67
Phenomenal dogmatism 14, 16, 20, 25, 46, 65, 66, 76, 77
Phenomenal evidence 57–59, 67–69, 71–73, 75
Precueing 176, 204, 209, 259, 265

R

Recurrent processing 133, 172, 173, 176, 181, 186–189, 191, 198, 207, 256, 261, 266, 281, 285, 301
Responsive mode 11, 31, 33, 34, 38–40, 45, 46, 125

S

Seemings 10, 12, 14, 17, 18, 20, 41–44, 56, 64, 125, 322
Selective mode 4, 11, 27, 31–33, 38, 39, 45, 46, 108
Sensitivity 9, 10, 13, 27, 33, 36, 37, 40, 46, 48, 50, 60, 65, 66, 77, 172, 196, 200, 202–204, 210, 224, 235, 236, 238, 243, 247, 259, 290
Sparse basis functions 138, 140, 142, 143, 147, 274