Empiricism and Philosophy of Physics (ISBN 3030649520, 9783030649524)

This book presents a thoroughly empiricist account of physics. By providing an overview of the development of empiricism


Table of contents :
Preface
Acknowledgements
Contents
Part I Background
1 Problems in Philosophy of Physics
1.1 Introduction
1.2 Philosophical Questions in Physics
1.3 Summary and General Methodology
2 Some Important Episodes in the History of Physics
2.1 Introduction
2.2 Aristotle's Physics
2.3 From Aristotle's Physics to Classical Mechanics: Galilei and Newton
2.4 Relativity Theory
2.5 Quantum Theory
Part II General Philosophy of Science
3 Empiricism from Ockham to van Fraassen
3.1 Introduction
3.2 Medieval Nominalism: An Empiricist Position
3.3 Classical Empiricism
3.4 Empiricism During the Nineteenth Century
3.4.1 Mach
3.4.2 Poincaré
3.5 The Vienna Circle
3.6 Quine
3.7 Van Fraassen's Constructive Empiricism
3.7.1 Van Fraassen's Empiricist Stance
3.8 Evidence
3.8.1 Evidence and Reasons
3.8.2 Empirical Evidence
3.8.3 Is Inconsistency Counter-Evidence Against a Theory?
3.9 Classification: Natural Kinds
3.10 My Empiricist Stance
4 Mathematical Knowledge and Mathematical Objects
4.1 Introduction
4.2 Kant and Quine on Objects
4.3 Truth Value Gaps in Mathematics
4.4 Are Numbers Universals?
4.5 From Natural Numbers to Reals
4.5.1 Against Reduction of Mathematics to Set Theory
4.5.2 Platonism Versus Constructivism and Reals
4.6 Constructions of Numbers
4.6.1 Constructions of Integers and Rationals
4.6.2 Reals and Infinity
4.6.3 Constructive Analysis
4.7 Gödel's First Incompleteness Theorem and the Law of Excluded Middle
4.8 Summary
5 Induction and Concept Formation
5.1 Induction in the Naturalistic Perspective
5.2 Justification in the Naturalistic Perspective
5.3 Evidence and Justification
5.4 Induction and Concept Formation
5.5 Induction as a Heuristic Device
5.6 Summary
6 Explanation, Unification and Reduction
6.1 Introduction
6.2 Friedman on Unification
6.3 Nagel on Theory Reduction
6.4 Explanation and Understanding
6.5 Summary
7 Realism, Theory-Equivalence and Underdetermination of Theories
7.1 The Physical Content of Theories
7.2 Arguments About Scientific Realism
7.2.1 Defusing Underdetermination
7.2.2 Structural Realism
7.3 Existence
7.4 Are Physical Quantities Real?
7.4.1 Universals
7.4.2 Physical Quantities
7.5 The Use of `Model' in Physics
7.6 Theories of Principle vs Constructive Theories
7.7 Summary
Part III Philosophy of Physics
8 Causation in Physics
8.1 Introduction
8.2 Causes and Laws
8.2.1 Causation and Relativity Theory
8.3 Are Forces Causes?
8.4 Cause Is Agent-Related
8.5 Summary
9 Space, Time and Body; Three Fundamental Concepts
9.1 Observations
9.2 How Does a Theory Connect to the World?
9.3 The Interdependence Between the Predicates place, time and body
9.3.1 Bodies and Particles
9.4 Fundamental Quantities
9.5 Summary
10 Laws
10.1 Introduction
10.2 The Extension of the Predicate ``Law of Nature''
10.3 The Logical Form of Laws
10.4 Semantics and Ontology
10.5 Induction, Concept Formation and Discovery of Fundamental Laws
10.5.1 Laws, Physical Theories and Observations: Top-Down Or Bottom-Up?
10.6 Laws and Fundamental Quantities in Classical Mechanics
10.6.1 The Discovery of Momentum Conservation and the Introduction of mass and force
10.6.2 Types of Laws in Classical Mechanics
10.7 Laws in Special Theory of Relativity
10.8 Laws of Electromagnetism
10.9 Fundamental Laws that Do Not Introduce New Quantities
10.10 Lawhood and Necessity
10.11 Summary
11 Electromagnetism: Fields or Particles?
11.1 Introduction: What Is Real: Fields, Particles or Both?
11.2 Ontological Commitment
11.2.1 Alternating the Ontology of a Theory
11.3 Semantics of Classical Electromagnetism
11.4 Inconsistency of Classical Electromagnetism?
11.5 Why Not a Double Ontology?
11.6 What Do We Observe?
11.7 Relativistic Quantum Electrodynamics
11.8 Summary
12 Propensities
12.1 Introduction
12.2 Objectivity and Chanciness
12.3 Indeterminism and Objective Chance
12.4 Conditional Propensities
12.5 Conditionals vs Conditional Probabilities
12.6 The Scope of Genuine Randomness
12.7 Summary
13 Direction of Time
13.1 Introduction
13.2 Time Reversal and Dynamics of Motion
13.2.1 Time Reversal in Classical Mechanics
13.2.2 CPT Symmetry
13.2.3 Time Asymmetry in Weak Interactions
13.2.4 Time Reversal in Quantum Mechanics
13.3 Time Symmetry and Electromagnetic Radiation
13.4 Conditions for Time and Space Co-ordination
13.5 Definition of dynamical reversibility
13.6 When Is a Quantum System Dynamically Reversible?
13.7 Time and Entropy
13.7.1 Time Reversal and the Second Law of Thermodynamics
13.7.2 Entropy Function Defined on Hilbert Spaces?
13.8 The Arrow of Time and Clocks
13.8.1 Entropy of Clocks
13.8.2 Direction of Time Without a Universal Clock
13.9 Time and Big Bang
13.10 Summary
14 Identity, Individuation, Indistinguishability and Entanglement
14.1 Introduction
14.2 Maxwell-Boltzmann Statistics
14.3 Fermi-Dirac Statistics
14.4 Bose-Einstein Statistics
14.4.1 Elementary Bosons
14.4.2 Composite Bosons
14.5 Individuation and Identity of Quantum States
14.6 Individuation of Quantum Systems
14.7 Entanglement
14.8 Summary
15 Quantum Waves and Indeterminacy
15.1 Introduction
15.2 Quantum Systems Propagate As Waves
15.2.1 Probability Amplitudes
15.3 Indeterminacy, Not Uncertainty!
15.4 Summary
16 The Measurement Problem
16.1 Introduction
16.2 Von Neumann's Account of Measurements
16.3 The Copenhagen View on Measurements
16.4 My Own View: A Collapse Interpretation
16.5 Three Steps of a Measurement
16.6 Discreteness of Interactions
16.7 From Classical to Quantum Mechanics
16.7.1 Replacing Operators for Variables
16.7.2 Interaction and Individuation of Quantum States
16.7.3 Unobserved Interactions
16.7.4 Measurements of Continuous Observables
16.8 State Evolution and Time Dependent Hamiltonians
16.9 A Semi-formal Derivation of Collapse
16.10 Summary
17 What Is Spacetime?
17.1 Introduction
17.2 The Role of Rods and Clocks in Relativity Theory
17.3 GTR: The Relation Between Spacetime Structure and Matter Distribution
17.4 Spacetime Functionalism
17.5 String Theory and Spacetime
17.5.1 The Dimensionality of Space
17.5.2 String Theory and GTR
17.6 Summary
18 Summary and Conclusions
Bibliography
Index


Synthese Library 434 Studies in Epistemology, Logic, Methodology, and Philosophy of Science

Lars-Göran Johansson

Empiricism and Philosophy of Physics

Synthese Library Studies in Epistemology, Logic, Methodology, and Philosophy of Science Volume 434

Editor-in-Chief Otávio Bueno, Department of Philosophy, University of Miami, USA

Editors Berit Brogaard, University of Miami, USA Anjan Chakravartty, University of Notre Dame, USA Steven French, University of Leeds, UK Catarina Dutilh Novaes, VU Amsterdam, The Netherlands Darrell P. Rowbottom, Lingnan University, Hong Kong Emma Ruttkamp, University of South Africa, South Africa Kristie Miller, University of Sydney, Australia

The aim of Synthese Library is to provide a forum for the best current work in the methodology and philosophy of science and in epistemology. A wide variety of different approaches have traditionally been represented in the Library, and every effort is made to maintain this variety, not for its own sake, but because we believe that there are many fruitful and illuminating approaches to the philosophy of science and related disciplines. Special attention is paid to methodological studies which illustrate the interplay of empirical and philosophical viewpoints and to contributions to the formal (logical, set-theoretical, mathematical, information-theoretical, decision-theoretical, etc.) methodology of empirical sciences. Likewise, the applications of logical methods to epistemology as well as philosophically and methodologically relevant studies in logic are strongly encouraged. The emphasis on logic will be tempered by interest in the psychological, historical, and sociological aspects of science. Besides monographs Synthese Library publishes thematically unified anthologies and edited volumes with a well-defined topical focus inside the aim and scope of the book series. The contributions in the volumes are expected to be focused and structurally organized in accordance with the central theme(s), and should be tied together by an extensive editorial introduction or set of introductions if the volume is divided into parts. An extensive bibliography and index are mandatory.

More information about this series at http://www.springer.com/series/6607

Lars-Göran Johansson

Empiricism and Philosophy of Physics

Lars-Göran Johansson, Department of Philosophy, Uppsala University, Uppsala, Sweden

ISSN 0166-6991 ISSN 2542-8292 (electronic) Synthese Library ISBN 978-3-030-64952-4 ISBN 978-3-030-64953-1 (eBook) https://doi.org/10.1007/978-3-030-64953-1 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

Praeterea censeo metaphysicam esse delendam. (Furthermore, I consider that metaphysics must be destroyed.)
Axel Hägerström

Entia non sunt multiplicanda praeter necessitatem. (Entities should not unnecessarily be multiplied.)
William of Ockham

The purpose of this book is twofold: (i) to present a thoroughly empiricist position in epistemology and (ii) to apply it to philosophy of physics. My motive is the same as that of the logical empiricists, viz., (i) to tell what to count as empirical evidence for a theory and (ii) to present a philosophical view on science, in particular physics, which does not engage in any superfluous metaphysics. Empiricists are united in holding that empirical evidence is the only evidence there could be for an empirical theory; the crucial question is to say more precisely what to count as empirical evidence. Regarding metaphysics, the crucial question is how to distinguish between superfluous and non-superfluous metaphysics.

Adherents of logical positivism were hostile to theoretical entities. A common view was to distinguish between observable and non-observable, that is, theoretical, entities, where the latter were viewed with scepticism. Modern empiricists, for example van Fraassen, hold in a somewhat similar vein that we have no good reason to believe that theoretical sentences are true. (But he accepts that theoretical sentences are truth-apt.) Scientific realists, by contrast, believe in the existence of some theoretical entities and that we have good reason to think at least some theories are approximately true. In my view, neither the scientific realists' defence of their views nor the empiricists' criticism is well founded. In this book, I will develop what I hope is a more tenable version of empiricism, not based on the distinction between observable and non-observable, but instead based on semantic considerations.


A fundamental distinction in semantics is between singular and general terms, the latter alternatively called predicates.1 The referent of a singular term, if it exists, is an individual, whereas the referent of a general term, if it exists, is a universal. (This will be thoroughly discussed in Chap. 3.) When we accept a theory as (approximately) true, we tentatively accept all its sentences as being true. It follows that its singular terms must refer to things that exist and the domains of its first-order variables must be non-empty. But it does not follow that the general terms deployed must have referents, nor that second-order quantification is needed. My ontological position is that we never have good reason for assuming properties and relations, that is, universals, as referents of general terms; they are superfluous. I thus adhere to nominalism in the vein of Ockham and Goodman, to be discussed in Chap. 3. The question of ontology is intimately connected to semantics. The question ‘What exists?’ may be viewed as short for ‘What exists according to our best theory of the world?’ and the answer is: When we hold a theory to be true, or approximately so, we must accept as existing those things which the theory says exist. But an empirical theory is rarely explicit about the ontological commitments made when accepting it. We need to paraphrase physical theories into predicate logic in order to see their existence assumptions. The fundamental issue in the semantics of scientific theories is the connection between theory and reality. When contemplating this issue, I have found Löwenheim-Skolem’s theorem of utmost importance. This theorem says, roughly, that no theory by itself can determine what it is about. From Löwenheim-Skolem’s theorem it follows that the connection between theory and reality must be established by non-theoretical means.2 These consist of indexical words, such as ‘this’, ‘that’, ‘ here’ and ‘now’, used in connection with pointing gestures; such combinations of indexical expressions and gestures determine the references of certain terms. For example, the reference of the singular term ‘this detector’ used in the sentence ‘This detector now shows 50 counts’ is determined as the very object pointed at in a particular situation where a token of this sentence is used. Löwenheim-Skolem’s theorem tells us that such concrete uses are needed for connecting a theory to reality. This fits nicely into an empiricist outlook. This book is thus based on three foundations: empiricism in epistemology, nominalism in ontology and Löwenheim-Skolem’s theorem in semantics. This will be elaborated in some detail in Chap. 3. A great part of philosophy of science is philosophy of physics. This is easy to understand: since physics is the most basic and general of all empirical sciences,

1 One may hold that general terms and predicates are strictly speaking not the same kinds of things, but for this discussion the difference doesn't matter.

2 If we use an auxiliary theory for relating a particular theory to reality, we can view this auxiliary theory as an extension of the original theory, and Löwenheim-Skolem's theorem applies also to this extended theory.
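For reference, the theorem invoked here can be given in its standard textbook form; this formulation is supplied as background and is not the author's own wording: if a first-order theory $T$ formulated in a countable language has an infinite model, then $T$ also has a countably infinite model, and more generally (the Löwenheim-Skolem-Tarski theorem) $T$ then has models of every infinite cardinality $\kappa \geq \aleph_0$. Since these non-isomorphic models all satisfy $T$, the theory's sentences by themselves cannot single out their intended interpretation, which is the point relied on in the text above.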

philosophical discussions concerning fundamental issues in philosophy of physics implicitly cover also some fundamental issues in all sciences. A large proportion of philosophy of science, in particular philosophy of physics, is devoted to logical and formal aspects of theories, such as their formal semantics and axiomatisation. For certain purposes this is interesting, but I will not discuss such matters. My focus of interest in this book is epistemological and ontological questions, both in relation to physics. Therefore, I see no need to say anything about formal semantics, axiomatisation, consistency or set theoretical foundations of science. With only little exaggeration, one may say that these topics are about relations between words only; they rarely have much bearing on how words relate to the world. Nor do I need modal logic, which may astonish most readers, for how could one give an account of natural laws without talking about their necessity? Well, I certainly accept that laws, properly so called, are necessary in some sense, but I need no modal logic for my account of their necessity, as will be explained in Chap. 10. The first two chapters give a background for the rest of the book. Chapter 1 lists a number of philosophical problems in physics, which are interesting from an empiricist point of view, and Chap. 2 presents very briefly the development from Aristotle’s physics to quantum theory. It is not really anything that can be called a history of physics; it contains only some aspects which are relevant for the ensuing chapters on issues in philosophy of physics which I have found interesting. Chapter 3 is an overview of the empiricist tradition and it ends with what I believe is a tenable version of empiricism. Chapter 4, about mathematics, may seem a bit odd in this context, but I have come to the conclusion that it should be included: physics and mathematics are so intimately related that it is almost impossible to avoid taking a stance on questions about mathematical knowledge and mathematical objects when discussing ontology and semantics of physical theories. In particular, I have arrived at the conclusion that even for an empiricist, there is good reason to accept mathematical objects, for example, vectors, in the ontology. But we have no good reason to assume that there are physical entities, such as forces, corresponding to vectors in the mathematical sense. And similar reflections apply to spacetime: Just because a Lorenzian manifold with a metric, a mathematical entity, is very useful in relativity theory, it does not follow that that this mathematical object has a counterpart in the physical world. We can do without assuming a physical entity spacetime. Chapters 5 to 8 discuss topics in general philosophy of science and Chaps. 9 to 17 narrow down to some specific issues in philosophy of physics. Great parts of Chap. 5 is taken from my ‘Induction and Epistemological Naturalism’, Philosophies, 3, 31, 2018. Chapters 10, 11 and 12 are previously published papers, which here are reprinted with minor changes (such as replacing ‘this paper’ with ‘this chapter’) as adaptions to the context of a book. Chapter 10 was published as ‘The Ontology of Electromagnetism’, Studia Philosophica Estonica, 10(1) 2017, 25–44, Chap. 11 was published as ‘An Empiricist View on Laws, Quantities and Physical Necessity’, Theoria 85, 2, 69–101, April 2019, and Chap. 12 was published as ‘Propensities’, pp. 161–175 in Logic, Ethics and all that Jazz. Essays in honour of

Jordan Howard Sobel, edited by Lars-Göran Johansson, Jan Österberg and Rysiek Sliwinski. Uppsala: Department of Philosophy, Uppsala University. 2009, Uppsala Philosophical Studies 57. For the papers on electromagnetism, epistemology and propensities, I have the copyright, whereas Theoria foundation has the copyright for the paper on laws. I hereby thank Theoria for permission to reprint it here. A note on the terms ‘concept’ and ‘predicate’ The word ‘concept’ is one of the most common words in philosophy, but, alas, it is often not very clear what it is supposed to refer to. In this book, I will use it as meaning ‘general term adjoined with rules for its application’. In physics, these rules are sometimes explicitly stated, as is the case with quantities and units defined in the SI system. Thus, a concept, in my use of that word, is basically a linguistic item, not something in our minds. Use-mention confusion is still common, in spite of Quine’s long fight against it. In order to be absolutely clear about the distinction between talking about a term and using it, I will use the text style SMALL CAPITAL when I talk about a general term (=concept). Thus, observe the contrast between ‘TIME is the fourth coordinate in four-dimensional spacetime’ and ‘The emission time is four seconds’. Similarly, ‘The charge of an electron is −1.6 ∗ 10−19 Coulomb’ whereas ‘CHARGE is an electromagnetic quantity’. However, when I quote someone else, I keep the original text style; so, for example, Weyl uses boldface when he talks about concepts, see the quote in Chap. 9. Concepts, types and tokens A token of a term is a physical object, whereas a term in the sense of type is an abstract object. It is tempting to say that a type is the set of all its tokens. But that is problematic, for we would not accept that a new occurrence of a token of a term, which obviously changes the set, entails that the type has changed; hence, types of terms cannot be given an extensional reading. So, we must accept that terms in the sense of types must be taken as primitive abstract objects, not reducible to sets of concrete objects. The introduction of abstract objects into discourse is a manifestation of our general capacity for generalising. We begin teaching our children words for observable things such as dogs, balls, apples, dolls, etc. When a child masters the word ‘dog’, she/he usually recognises a new object as being a dog, if it is a dog, and may say ‘dog’. The inductive generalisation works on both sides, so to say: a new object is cognised as sufficiently similar to a set of earlier observed objects, and the tokens of the word ‘dog’ heard in association with observations of dogs are cognised as sufficiently similar to each other, and something approximately similar is produced when seeing the dog. So, both cognition and language use is based on our capacity for generalisation. Thus, even at the most basic level of thinking and language use, we deploy abstract objects, that is, word types. Since I will argue for nominalism in the vein of Ockham and Goodman, it might seem inconsistent to accept abstract objects in the universe of discourse. But it is not. The distinction concrete-abstract is not

the same as the distinction individual-universal. I adopt Aristotle’s notion of an individual object, which Aristotle called a primary substance, as that which only can be the subject of a proposition, never being a predicate, albeit rephrased in modern semantic terms: an individual object is the referent of a singular term, never of a general term. And there are lots of abstract objects that are referred to by singular terms, most obviously numbers, which we refer to by, for example, numerals. That a natural number may be viewed as constructed as a set of sets is no objection. The European Commission consists of 27 persons. Still it is an individual, since ‘The European Commission’ is a singular term. So my nominalism is Goodman’s; I resolutely reject universals as referents of general terms. So, for example, the twoplace predicate ‘∈’ has extension but no referent, as all general terms. Upplands Väsby

Lars-Göran Johansson

Acknowledgements

I have over the years presented drafts of most of the chapters to different audiences and greatly benefitted from comments and discussions with numerous colleagues and students. I do not remember them all. But those I remember as having asked profound questions and giving valuable comments are Richard David, Jan Faye, Maarten Franssen, Rasmus Jaksland, Jonathan Knowles, Henrik Lagerlund, Sten Lindström, Sebastian Lutz, Per Martin-Löf, George Masterton, Hans Mathlein, Keizo Matsubara, Paul Needham, Dag Prawitz and Henning Strandin. I thank them all. Anonymous referees of earlier versions of previously published chapters have given much appreciated comments. I gratefully acknowledge financial support from Åke Wiberg Foundation for travels to conferences where I have presented chapters of this book. Upon finishing this book, I realised the profound truth in Churchill’s statement: Writing a book is an adventure. To begin with, it is a toy and an amusement; then it becomes a mistress, and then it becomes a master, and then a tyrant. The last phase is that just as you are about to be reconciled to your servitude, you kill the monster, and fling him out to the public.

Now is the time to fling this book out. I dedicate it to my grandchildren Ville, Leo, Kerstin, Frida, Klara and Filippa. Grandchildren are the dessert of life! Upplands Väsby, June 2020

LGJ



Part I

Background

Chapter 1

Problems in Philosophy of Physics

Abstract This chapter lists the issues in philosophy of science and in particular philosophy of physics, which will be discussed in this book. Some are of a more general character, such as the relation between mathematics and physics, some are more specific, such as the ontology of quantum mechanics.

1.1 Introduction

Physics is universally recognized as the most fundamental of the natural sciences. Quine, with his usual wit, expressed this point succinctly: 'Physics investigates the essential nature of the world, and biology describes a local bump. Psychology, human psychology, describes a bump on the bump.' (Quine 1981b, 93). And some pages later: 'Full coverage in this sense is the business of physics and only of physics.' (op. cit., p. 98). Thus, physics aims to describe the world in its most general traits: what it consists of, its fundamental mechanisms and how it changes with time. No wonder, then, that quite a few philosophers nowadays claim that ontological questions are answered by physics, not by any a priori philosophical inquiry. This is also my view. But ontology cannot be read off directly from physics books; some interpretative effort is needed. And, further, epistemological and semantic questions arise: how do we obtain knowledge about theoretical objects such as Higgs particles or branes? And how do highly theoretical concepts, such as differentiable manifolds, get physical meaning? Perhaps they have no physical meaning at all?

It is obvious that physical theories trigger ontological, epistemological and semantic questions. How should we proceed in answering such questions? On the one hand, one needs to start a philosophical inquiry into any field of discourse using ordinary non-technical language; I don't see any alternative. Hence, explanations of things in higher realms of theoretical physics must be based on everyday experiences. On the other hand, we take for granted that our immediate experiences of physical events, which we express in ordinary language, ultimately are to be explained by fundamental physics expressed in highly theoretical vocabulary; explanations in physics are generally conceived as starting
from the ultimate building blocks of the universe, often thought to be elementary particles, to systems of such things, and further to systems of systems of these things in an ever increasing complexity. Physical explanation is usually top-down. So what explains what? My take on this tension between two views on explanation is, briefly, that in explanation of physical concepts we should begin with our repeatable experiences, which we describe by universally generalised sentences, some of which contain physical concepts introduced for this very purpose; one example is the concept of MASS, the introduction of which will be discussed in Chap. 10. When we proceed inquiring into still more hidden features in the world we need new concepts when describing these features; new phenomena require new concepts. Such new concepts must, explicitly or implicitly, be defined in terms of earlier established vocabulary. Thus discoveries in physics are often at the same time introduction of new theoretical concepts. I will discuss some examples of this concept forming process in Sect. 5.5. Explanation of phenomena goes the other way. Once a certain level of theoretical sophistication has been attained, we may derive predictions about repeatable and observable events from fundamental laws, postulates and principles. If these predictions regularly are fulfilled we have confidence in our theory and we describe our cognitive situation as that we can explain particular events by deriving them from laws. The explanatory force hinges on our acceptance and internalisation of theoretical concepts occurring in explanans; this must be achieved before one can say that a derivation explains. This view on the matter is not new, it is basically Aristotle’s view in Posterior Analytics 2.19.100a6-8, where he in a naturalistic way describes how we first acquire concepts and formulate generalities, which then are used as basis for deductions. But what is a scientific explanation? And can one explain one theory using another theory? Can one explain a scientific law? These questions have been much debated and I will discuss these topics in Chap. 6. Before jumping into problems in philosophy of physics it is necessary to discuss some general epistemological topics, viz., empiricism, induction and explanation. I have an empiricist outlook, being in general sympathetic to the views of philosophers such as Ockham, Hume, Mach, Quine and van Fraassen. The basic tenet of empiricism is that all evidence there can be for a scientific theory is empirical evidence. This might seem trivial; the real issue is what to count as empirical evidence. On this issue philosophers disagree and a more detailed discussion is called for, which will be the topic of Sect. 3.8. The relation between mathematics and physics is strong; how strong? Quine famously argued (Quine 1951) that the analytic-synthetic distinction could not be upheld, thereby implicitly merging mathematics and natural science into one single domain of inquiry. Most empiricists, including myself, have been unwilling to follow Quine in this. In Chap. 4 I will give my views on mathematical knowledge and mathematical objects.
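The picture of explanation described above, deriving descriptions of particular observable events from fundamental laws together with particular conditions, is essentially the familiar deductive-nomological schema. As a purely illustrative formulation (the standard Hempelian one, not the author's own account, which is developed in Chap. 6):

\[ \{L_1, \ldots, L_m;\; C_1, \ldots, C_n\} \;\vdash\; E \]

where $L_1, \ldots, L_m$ are laws, $C_1, \ldots, C_n$ are statements of particular conditions, and $E$ is a description of the event to be explained; $E$ must follow deductively from the premisses, with at least one law essential to the derivation.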


Ever since Hume the induction problem has been a disturbing problem for empiricists and no empiricist can with good conscience avoid discussing it. I have definite views about this problem, which will be presented in Chap. 5. After discussing the induction problem it is time to delve into the particular problems in philosophy of physics.

1.2 Philosophical Questions in Physics

Even diehard empiricists are reluctant to dismiss some non-empirical questions in physics as uninteresting metaphysics, although nothing in the application of physics to concrete problem solving depends on our answers to such questions. A physicist may, but seldom does, neglect all non-empirical questions, but philosophers, and some theoretical physicists, do not. Below is a list of questions I find interesting and which will be addressed in this book. It is by no means complete; others may think there are additional topics worth discussing.

Do quantities exist?

Physical quantities are usually conceived to be properties of physical systems, bodies and systems of bodies. A quantitative term, such as 'acceleration', is often assumed to refer to a property that 'inheres' in the accelerating body. This raises the question of the individuation of properties: what counts as different properties, and what is required for identity between properties? However, this seems to me to be a wholly pragmatic affair. One illustration is that in ordinary discourse we distinguish between times and distances. But since we know that the velocity of light is a universal constant, one can express all distances in terms of time, as we do when we give distances between astrophysical objects in light-years. So we can reduce distances to times using the fact that the velocity of light is a universal constant. And we can continue, using other universal constants, such as Planck's constant, and reduce further, if we like. So how many quantitative properties are there in reality? Just like the medieval nominalists, I hold that we may consistently use general terms without assuming that they refer to properties and relations; quantitative predicates have extensions but do not refer. This is the nominalists' view and I will argue for this position in Chap. 3 and in Sect. 7.4. However, even after leaving aside this ontological issue by treating quantities as nothing but predicates with extensions but lacking reference, one may still wonder how many such quantitative predicates we need as fundamental, that is, not defined in terms of other quantities. There is a strong argument, given in Sect. 9.4, to the effect that we only need one fundamental quantity, which may be chosen to be TIME, because time measurements require no other physical quantity; they require only the use of natural numbers.
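As a concrete numerical illustration of this reduction (supplied here for clarity; the figures are standard values, not the author's): with the speed of light $c$ fixed as a universal constant, any distance $d$ can be re-expressed as the light travel time $t = d/c$, for example

\[ d = 9.46 \times 10^{15}\,\mathrm{m} \;\Rightarrow\; t = \frac{d}{c} = \frac{9.46 \times 10^{15}\,\mathrm{m}}{2.998 \times 10^{8}\,\mathrm{m/s}} \approx 3.16 \times 10^{7}\,\mathrm{s} \approx 1\ \text{year}, \]

i.e. the distance is one light-year. The present SI system in fact works this way: since 1983 the metre has been defined as the distance light travels in vacuum in 1/299 792 458 of a second, so length measurement is officially reduced to time measurement.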


Mathematics and physics

A fundamental question concerns the relation between mathematics and physics. Wigner once wrote a much discussed paper (The Unreasonable Effectiveness of Mathematics in the Natural Sciences (Wigner 1960)), in which he expressed his astonishment at the usefulness of mathematics and suggested explanations. Others, e.g. Schwarz (2006(1966)), took a more critical view of the unreflected confidence in the use of mathematics in science. A coherent empiricist and nominalist position, developed in Chap. 3, requires an analysis of the relation between empirical and formal sciences, i.e., between physics and mathematics.

One crucial difference between mathematics and physics is that mathematical statements are certain, whereas physical theories are not; it is always in principle possible that a physical proposition might be wrong. A second difference is that physics is about objects and states of affairs in the natural world, whereas mathematics is about an abstract world, things not being in space and time. Well, some hold that there are no mathematical objects whatsoever. Some physicists and philosophers are full-blown platonists. One recent physicist holding this view is Tegmark (see, e.g., Tegmark 2008). They identify the physical world with mathematics. This is out of the question for any empiricist, and I don't see how they can avoid the conclusion that physics is a priori true, just as mathematics is. I believe it is of the utmost importance to keep mathematical and physical propositions clearly apart. (And although I'm very sympathetic to Quine's philosophy, I don't follow him in his merger of mathematics, logic and empirical science argued for in Two Dogmas.) We need to determine which parts of a physical theory really represent physical objects, phenomena or states of affairs, and which are merely mathematical auxiliaries. Answering this question is the core of the realism-antirealism debate in philosophy of science. This task is urgent when interpreting quantum mechanics, but not limited to that theory.

Mathematics and physics have been deeply integrated since Galilei and Newton. A textbook on physics is full of equations. These equations can be treated purely mathematically, as steps in derivations. But physicists do not think of them merely as mathematical relations; they are conceived as expressing relations between physical quantities, i.e. quantitative predicates. There are firmly established conventions for representing physical quantities by letters: 'f' stands for force, 'm' stands for mass, 't' stands for time, 'B' stands for magnetic field, etc. Therefore physicists usually read equations as relations between physical magnitudes, attributes of physical entities, not merely as mathematical expressions. This has great heuristic and pedagogic value, but it also has drawbacks. Sometimes it happens that physicists are unable to interpret a certain equation as expressing a relation between physical quantities. One very clear example was Heisenberg's matrix mechanics: it was quite clear that he could use matrices to calculate predictions of experiments, but what did these matrices represent? Could they represent physical quantities? It seemed impossible. Another problem is Schrödinger's wave mechanics. The waves that are solutions to Schrödinger's equation cannot be real physical waves, since these solutions mostly are complex


The relation between mathematics and physics may be stated as a semantic issue: mathematics is a language, and when using this language for describing the physical world we need to state clearly (i) which terms in this language have referential function and which do not, and (ii) what kinds of objects mathematical terms refer to. Generally speaking, mathematical objects are abstract things, physical objects are concrete things, and this distinction is essential for any empiricist. A critique of unreflected identifications of mathematics and physics can be found in Schwarz (2006(1966)). Many empiricists are hostile to all abstract objects. I am not. I will take for granted Quine’s two well-known slogans about existence: ‘To be is to be the value of a variable’ and ‘No entity without identity’.1 Mathematical objects such as numbers clearly satisfy these conditions, hence I have no objection to quantifying over numbers and mathematical objects defined in terms of numbers.

Ontology of physics: What exists?

When interpreting a physical theory (or any theory) we need to express it in first order predicate logic in order to discern our ontological commitments. (Being a nominalist I need no second order logic.) In short, we are committed to the existence of those things comprising the domain of objects we assume as values of the bound variables when we hold the theory to be true. The next step is to consider identity criteria in this domain; for if we have no clear criterion for what to count as the same object, in other words, when to say that the referents of two singular terms ‘a’ and ‘b’ are identical, the entire exercise loses its meaning. There is no point in assuming an object a if our only way of saying anything about this object is to use its particular name ‘a’. If that were the case we would have a one-one correspondence between names and objects, and then, if our only access to the object a is via the singular term ‘a’, we have no reason to think that the name and its object are different things. If so, we have no reason to think that we are talking about anything other than our own words. A fundamental notion in semantics is therefore reference, the relation between a term and the object it stands for. The very application of the concept of reference presupposes that we can distinguish between the referring term and its referent. If this is impossible, semantic discourse involving reference collapses. This will be further discussed in Chap. 3 and in Sect. 9.2.
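As a minimal illustration of how commitments are read off (the sentence and the predicates are my own toy example, not drawn from any particular theory): regimenting ‘there is a force acting on body b’ as

∃x (Force(x) ∧ ActsOn(x, b))

commits the speaker to forces as values of the bound variable x, whereas rendering the same claim with a one-place predicate of b, say SubjectToForce(b), quantifies only over bodies and carries no commitment to forces as objects.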

1 Quine was not the first to see the connection between identity and existence; Aristotle, the scholastics, Leibniz and Russell all seem to have the same idea, albeit expressed in different manners. For example, “Ens et unum convertuntur” (Being and unity are interchangeable) is ascribed to St Thomas and other scholastics.


Realism versus anti-realism?

The most debated topic in philosophy of science for more than forty years has been realism versus anti-realism. The great majority of philosophers of science have a realistic attitude to physical theories. More precisely, they accept two fundamental principles expressed by Boyd (1984, 1): (i) scientific theories in the mature sciences are approximately true; (ii) central terms in these theories refer to real things. Empiricists are skeptical about one or both of these claims. The word ‘term’, as here used, is ambiguous: does Boyd (and his followers in this debate) mean general or singular terms? If ‘term’ is understood as ‘general term’ I disagree, while if it is interpreted as ‘singular term’ I agree. I will discuss this issue in Chap. 7.

Causes and forces

The role of causation in physics is controversial. It seems that Newton, and most physicists and laymen, believe that the physical quantity FORCE is the physical counterpart to the notion of cause in our vernacular. Forces cause changes of motion, it is generally believed. But how do we reconcile the idea that causes precede their effects, a basic trait of the notion of cause, with Newton’s third law, which says that to every force there is an equal counterforce with no time delay, if forces are conceived as causes? And Newton’s second law and the law of gravitation both entail that force and change of the state of motion are simultaneous. In short, all laws involving forces, such as Newton’s laws, the law of gravitation, the Lorentz force law etc., relate forces to motions of bodies without any reference to timing. We cannot identify forces with causes. But we use physics for identifying events we call ‘causes’ in certain contexts in order to be able to determine what to do to promote desirable events and prevent undesirable ones. We base our actions on knowledge about causes. So how do we reconcile our causal thinking with a physics without causes? I will discuss these matters in Chaps. 8 and 10.

Fields and/or particles in electromagnetism?

The common view among physicists is to conceive of electromagnetism as describing how electromagnetic fields and charged particles interact. But this view is troublesome. As we will see, one hurdle is that a relativistic quantum field theory doesn’t allow a particle interpretation, given some very reasonable conditions on what a particle is. Classical electromagnetism, on the other hand, describes how observable macroscopic bodies having charge, represented in the theory as particles, move in fields. So in classical electromagnetism, the empirical evidence consists of observations of things called ‘particles’. But classical electromagnetism is the classical limit of relativistic quantum field theory. Sorting out and answering these questions is the topic of Chap. 11.

In addition, the fundamental equations of electromagnetism have two kinds of solutions, the retarded and the advanced ones, where the latter describe the motion of a charged particle under the influence of distant fields at future times. These advanced solutions are in practice neglected, often on the basis of causality arguments. But we should not, in my view, use causal arguments for explaining aspects of a physical theory; rather we should explain causation partly in terms of physics.


But, then, what to do with advanced solutions? This issue is discussed in Chap. 13.

Space, time and spacetime; what are they?

Physical events occur in space and time, or so we tend to think at least. What, then, are space and time? Do they exist independently of matter, a view called substantivalism and held by Newton, or are they only abstractions of relations between material things, relationalism, as Leibniz claimed? And what is spacetime? Should we think of spacetime merely as a coordinate system, a mathematical construct necessary for describing spatiotemporal relations between physical events, or is it a physical object? General relativity provides a new perspective on the substantivalism versus relationalism debate. The fundamental equation of general relativity, Einstein’s equation, is a quantitative relation between a function of the metric tensor and the stress-energy tensor. The question then arises: is it the case that what exists is a distribution of mass-energy, and this distribution can be attributed a spacetime with a metric, or should we think of the universe as a spacetime with a metric in which the observed masses just are manifestations of the curvature of spacetime? These questions will be discussed in Chaps. 9 and 17.

When thinking about space, time and matter it is profitable to reflect on the fact that ordinary language is shaped mostly by our everyday experiences of interactions with the immediate environment. This is the reason, I believe, why we express our beliefs in sentences of noun-predicate structure; ordinarily we observe and talk about distinct things and we perceive them as having lots of features. (Sometimes we use mass terms, such as ‘water’ or ‘wood’, which, when need be, can be construed as singular terms referring to localised portions of matter, such as ‘the water in my car engine’s cooling system’.) This linguistic structure doesn’t fit well with the content of relativity theory and other theories in advanced physics. Thus a Kant-inspired inquiry is profitable, i.e. considering the very conditions for expressing beliefs in natural language. In contrast to Kant, however, I do not accept the distinction between an empirical and a transcendental inquiry into the structure of mind. I steadfastly hold on to empiricism and naturalism, which in this context means rejecting the idea that our own mind states are observable things. Hence we should phrase our inquiry in linguistic terms, not in terms of judgements and categories, as Kant did. This will be discussed in Chaps. 3 and 5.

Direction of time: real or not?

Time differs from the spatial dimensions by being directed. There are pairs of events about which we can say that one event precedes the other, no matter what coordinate system you choose. This fact is often described metaphorically by saying that time flows. Does that mean that time after all is an independently existing object which in some sense ‘moves’? Or is it merely a manner of speech? If we think of time as relations between events, the direction of time must have a basis in irreversible state changes. But fundamental laws are time symmetric. So how could time have a direction if there is no such direction in fundamental physics? This is a much discussed problem in philosophy of physics and I will discuss it in Chap. 13.
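To see in what sense a fundamental law is time symmetric, consider Newton’s second law as a simple illustration (a standard textbook point, added here for concreteness). Written for a position-dependent force,

m · d²x/dt² = F(x),

the equation is unchanged by the substitution t → −t, since a second derivative is insensitive to the sign of t; hence if x(t) is a possible motion, so is the reversed motion x(−t). The law itself singles out no direction of time.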


In the metaphysical discussion about time a core issue is the nature of the distinctions between past, present and future. This topic does not belong to the philosophy of physics; for me, and I guess for most philosophers of physics, the terms ‘past’, ‘present’ and ‘future’ are indexicals, lacking any objective and observer-independent meaning. I won’t discuss this topic. An objective direction of time is based on the distinction earlier-later, not on the indexicals past-present-future.

Are there any laws of nature?

Natural laws have a certain feature, often labelled nomological necessity, which distinguishes them from mere accidental generalisations. But how to draw the distinction? The debate about the nature of laws has been intense for the last 50 years and no consensus is in sight. This is a core issue in philosophy of physics, since all fundamental laws are thought to be physical laws. In Chap. 10 I will present my stance in this debate.

Indeterminism versus determinism

Is the world fundamentally deterministic or not? This is not an epistemological question, because determinism is not the same as predictability. Classical mechanics is a deterministic theory, but there are classical systems whose long-time evolution is completely unpredictable, even if they are completely isolated from the rest of the world. This has been known since the eighteenth century, when d’Alembert and others discussed the three-body problem. Later Poincaré proved that there are no general analytical solutions of the three-body problem in classical mechanics. So even if the evolution is completely deterministic, it may be impossible to make long-time predictions of a system’s state.

Individual state changes in quantum theory are indeterministic. The question is then: is this so because quantum mechanics is an incomplete theory, or is nature really indeterministic? If nature is deterministic it should be possible to add ‘hidden’ variables to quantum mechanics, variables which we can use to predict the outcome of otherwise unpredictable events. So far, all efforts to improve quantum mechanics in this way have been unsuccessful. The pilot-wave theory is perhaps the most well-known hidden-variables theory, but it is not Lorentz-invariant and can therefore not be a candidate for a true theory. Does that prove such efforts are doomed to fail? This is one topic of discussion in Chap. 12.

Ontology of quantum mechanics; waves, particles or both?

The basic principles of quantum mechanics were established in 1926 and there is so far no empirical evidence that any of them is false; the theory has been used for an enormous number of successful predictions for now more than 90 years. But, curiously, it has been one of the most debated theories among philosophers of science. Richard Feynman once wrote:

There was a time when the newspapers said that only twelve men understood the theory of relativity. I do not believe there ever was such a time. There might have been a time when only one man did, because he was the only guy who caught on, before he wrote his paper. But after people read the paper a lot of people understood the theory of relativity in some
way or other, certainly more than twelve. On the other hand, I think I can safely say that nobody understands quantum mechanics. (Feynman 1967, 129).

But I beg to disagree. Whether we understand a particular theory depends, among other things, on one’s criteria for understanding, and people may differ on that matter. I think it is possible to understand quantum mechanics, since I do understand it. I will give my views on individuation and identity in Chap. 14, on wave-particle duality in Chap. 15, on entanglement in Chap. 14 and on the measurement problem in Chap. 16.

Entanglement and non-locality: is it possible to transmit effects instantaneously?

Non-local correlations between quantum particles are a consequence of quantum mechanics. They have now been tested empirically and must be accepted as an established fact. But how is it possible that two physical objects under certain conditions can influence each other’s states within a time interval shorter than the time it takes for a light signal to go from one particle to the other? Well, for one thing, no signal travels between the correlated particles, so we have no conflict with relativity theory. But, then, how can their states, under certain conditions, be correlated? This is a first-class conundrum. Non-local correlations are often described using the term ‘entanglement’, and it is a fact about quantum mechanics that the usual way of representing a quantum system composed of subsystems entails entanglement. In Chap. 14 I will discuss these things and relate non-locality and entanglement to the individuation of quantum systems. The basic mistaken belief in discussions of entanglement is that particles are individual things.

The measurement problem; are collapses real or mere artefacts?

The measurement problem in quantum mechanics has been with us since 1932 and still no agreed solution is in sight. The problem can be described as follows: during one type of measurement on quantum objects the change of the state function is discrete, indeterministic and irreversible. One might think that this must be due to some influence from the observer doing the measurement, since the undisturbed evolution of any quantum system is continuous, deterministic and reversible. However, saying that it is the observer that brings about the discrete, indeterministic and irreversible state change leads to the idealistic conclusion that our cognitive activity changes an external object’s state, a conclusion nearly all philosophers and physicists reject. So either the collapse of the wave function is not connected to measurements in particular, or the collapse is an artifact of the theory. This will be discussed in Chap. 16.
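The contrast can be stated schematically (in standard quantum-mechanical notation, used here only as an illustration). Undisturbed, a state |ψ⟩ evolves unitarily,

|ψ(t)⟩ = exp(−iHt/ħ) |ψ(0)⟩,

which is continuous, deterministic and reversible; whereas in a measurement with outcome k the state is projected,

|ψ⟩ → P_k|ψ⟩ / ‖P_k|ψ⟩‖,

where P_k is the projector associated with that outcome, a discrete, indeterministic and irreversible change. The interpretational question is what, if anything, in the physical world corresponds to the second kind of change.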

1.3 Summary and General Methodology

The list above covers some of the debates in philosophy of physics, but also some general issues in epistemology, ontology and semantics. Some of these problems arise when we are unable to integrate what we learn from physics into a world
picture which we express mostly in ordinary language, using principles and concepts that we more or less unreflectively adopt. This conflict is sometimes referred to as the contrast between the manifest world picture and the scientific world picture.

A useful perspective may be to think of the interpretation of physics as a kind of hermeneutics in the vein of Gadamer, see Gadamer (1989). We start the hermeneutic process when we find a text incomprehensible; we don’t understand the overall message, although we believe we understand isolated parts, some of the words and expressions. Gadamer discusses interpretations of ancient philosophers, but quantum mechanics may serve just as well as an example. We know how to use the physical theory for making predictions about outcomes of experiments, and we know how to handle operators and variables, but we are unable to integrate the theory into something resembling a coherent picture of the entire world. The problem resembles that of someone reading a text written very long ago and belonging to a culture very different from our modern western one; it doesn’t fit into what we otherwise think of the world.

Gadamer rightly remarks that when we read a contemporary text from our own culture, such as a newspaper article, we hardly say that we are doing hermeneutics or interpretation: we just read what it says, period. But even such a text sometimes elicits wonder: what does the author mean? Then we start the hermeneutic process, a part of which is to reflect alternately on my own background beliefs, what Gadamer calls ‘prejudices’ (not in the derogatory sense, but literally ‘pre-judgements’), and on the text’s background assumptions. Gadamer points out that we have a lot of background beliefs which we unconsciously take for granted; these background beliefs he calls the horizon of understanding. Since the text is authored by another person, the same goes of course for the text; it is written within the author’s horizon of understanding. Hence I face difficulties in understanding the text if the background beliefs of the text differ sharply from mine.

The interpretative process is referred to as the hermeneutic circle, which in fact consists of two ‘circles’: one is going back and forth between my own background assumptions and the overall world view of the text, the other is alternating focus between parts of the text and the entire text, assuming that it has a coherent meaning. When considering my own perspective I bring more and more of my own prejudices under critical scrutiny. Perhaps there is a conflict between some of the things the author takes for granted and what I take for granted, and being aware of such a conflict enables better understanding of what the text means. The process comes to an end when I no longer wonder; I just understand what it says. Gadamer describes this result as a horizon merger; the reader and the text, as interpreted by the reader, have come to the same horizon of understanding.

So according to Gadamer, the hermeneutical process of reading and finally understanding a foreign text changes the beliefs of the reader. However, I don’t think this is always the case. I think a successful result does not always require that one changes one’s beliefs. I do believe that I can, for example, understand most parts of Aristotle’s writings on matter and motion and I still think he is wrong. But I can understand his arguments and how he has conceptualised matter and motion.


So, in contrast to Gadamer, I think understanding doesn’t entail acceptance. That is however a side issue in the present context. I see a rather profound similarity between Gadamer’s hermeneutics and the process of understanding modern physics, starting from our more or less unreflected beliefs about nature ingrained in natural language. The conceptual structure of modern physics is quite different from what we use when we talk about physical events in our common life. The difficulties of understanding depend to some degree on different background assumptions, and the process of understanding requires us to reflect on our own unreflected assumptions. Some physicists obviously achieve that; they think about nature in terms of identical particles, wave functions and curved spacetime and find these concepts natural. And I do think I have achieved the same level of understanding.

Every teacher of classical mechanics will find a nice example of this hermeneutic process in trying to convey to students the fundamental lesson that there is no net force on a body which moves uniformly. All people more or less unconsciously adopt the Aristotelian view that a force is needed for keeping a body in motion. In Gadamer’s terminology it belongs to most people’s ‘horizon of understanding’. My own and other physics teachers’ experience (I was a physics teacher for 19 years!) tells us that it takes much effort to give up this mistaken view. One has to confront students with experiences of bodies moving with constant speed without anything acting on them (not so easy!), then discuss these events thoroughly so that the students realise that such an experience contradicts the notion that a force is needed to keep a body moving. The next step is to discuss why we all are inclined to think wrongly about forces and motion, which is done by introducing friction forces. The process of transforming the meaning of the word ‘force’ from the Aristotelian to the Newtonian concept is long and arduous, and it is finished only when the student unreflectively uses the Newtonian force concept. Many people never arrive at that stage, despite much tutoring. This final stage, integrating the Newtonian force concept into one’s outlook on the world, may reasonably be called a ‘merger’ of one’s own horizon of understanding with what is required in classical mechanics.

There is one important difference, though, between understanding an ancient philosopher’s text and understanding a physics book. In the latter case we are forced to change our own ‘pre-judgements’, for as philosophers, or laymen, trying to understand physics, the option of saying that physics is wrong because it conflicts with this or that fundamental belief integrated into our ordinary language is not open to us. Common sense is no argument against a physical theory. One way to become more aware of the common sense notions we use when talking about the physical world is to consider more thoroughly the conceptual development in physics, starting with Aristotle and proceeding to Galilei, Newton, Einstein, Schrödinger, Heisenberg, Bohr, Feynman, Weinberg, etc. This is the reason for writing the next chapter of this book. In particular, I believe that a thorough study of how new measurable quantities are introduced into the language of physics is profitable for understanding modern physics and also for seeing more clearly how common sense notions stand in the way of understanding physics. And, further, conceptual development is in crucial cases driven by observations.


What’s the point of an inquiry into philosophical problems of physics? As I see it, the goal is to understand in a stronger sense than merely being able to apply physical theories to concrete problems. And this goal is not a means to an end; it belongs to human nature that we want to understand, as Aristotle once pointed out. That’s why I’m doing philosophy.

Chapter 2

Some Important Episodes in the History of Physics

Abstract In this chapter a selection of episodes from the history of physics, beginning with Aristotle’s physics, is described. The aim is to give a historical background to the philosophical discussions about space, time, matter, force, etc.

2.1 Introduction

The aim of this chapter is to highlight some important steps in the development of physics, steps which are useful as background for the discussion about empiricism, semantics and ontology related to physics. A lot of the history of physics is left out; for example, I won’t say anything about the history of thermodynamics, statistical mechanics or electromagnetism, although the histories of these subdisciplines are interesting in their own right. One may say that there have been three conceptual revolutions in physics: the first was the transition from Aristotelian to classical mechanics, the second was the transition from classical mechanics to relativity theory and the third the quantum revolution. These are the foci of this chapter.

2.2 Aristotle’s Physics

In its beginning physics was not a discipline of its own, but part of philosophy, viz., natural philosophy. Aristotle may reasonably be said to have started the systematic inquiry into natural phenomena in his book Physics (φύσις; this Greek word means ‘that whose nature is to change’). Aristotle’s disciples organised his writings (or perhaps their notes from his lectures) about change and nature in this book, and the more general discussion of what exists was left to the next book, Metaphysics, literally ‘after physics’.

Metaphysics is generally described as a study of ‘being qua being’, i.e., an inquiry into the most general aspects of being. The contrast intended by ‘qua being’ is, one may assume, to particular aspects of existing things; Aristotle’s Metaphysics is a
study of the general traits of everything that exists, concrete changeable objects as well as the eternal things, whereas Physics is an inquiry into changeable things. Aristotle thus made a contrast between things that change and things that do not, the latter being what Plato called ‘ideas’ (εἶδος) and Aristotle himself called ‘forms’ (μορφή).

The starting point for Aristotle’s Physics is the conceptual problem of change. Parmenides and Zeno had both argued that change in general, and motion in particular, is impossible. Zeno’s paradox ‘The Arrow’ is one famous argument for this conclusion. After leaving the bow the arrow moves from place to place. At each point of time it is at a particular place along its path. That is, it is at rest at that point at that time. Hence, at each point of time it is at rest. So it does not move. Hence, motion is mere appearance. Aristotle disagreed; his starting point is that we clearly observe that things change and move, hence there must be some mistake in Zeno’s and Parmenides’ arguments to the effect that change and motion are impossible. Thus, the content of the first three books of Physics is a rebuttal of Parmenides’ and Zeno’s doctrine, starting with clearly observable events. It is fair to say that Aristotle is an empiricist in so far as he accepts the verdicts of his senses.

Having rebutted the conclusion that change and motion are impossible, and after having discussed the notions of place and time and their relations to motion, he proceeds to present a theory about motion, which is one species of change. The main question is: why do things move? Apparently Aristotle took for granted that rest is a natural state and that no explanation is needed for why things are at rest. Aristotle distinguished two kinds of motion among things on earth, forced and natural motion, when answering this question. This is a natural starting point; we clearly ‘see’ that bodies without support fall to the ground without being forced by anything else, but move horizontally only if being pulled or pushed. Free fall is natural motion and requires no external cause. Aristotle claimed that everything has in itself a natural tendency to find its proper place in the universe; the heavier, the closer to the center of the universe, whereas lighter things, such as fire, move upwards. All things have an inner nature, and it belongs to the nature of movable things to find their proper place in the universe.

But heavenly bodies are different: they orbit the earth without any observable cause. So earthly and heavenly bodies are quite different things, being subject to different principles. The constant eternal motions of the heavenly bodies must ultimately be explained by there being an unmoved mover. This is not an individual being, not a god, in Aristotle’s theory, but some sort of ether, something that acts and fills the heavens. (But Aquinas and the Catholic Church interpreted the unmoved mover as God.)

To Aristotle, as to almost everyone in antiquity, it appeared an obvious observable fact that the Earth is at rest and that the Sun, the Moon, the planets and the stars all rotate around the earth. This entails that directions are objective, not mere subjective points of view, as we now think. Up and down, left and right, are objective properties, and Aristotle explicitly says so in Physics. Everything that moves does so relative to the unmovable earth. Hence, the distinction between rest
and motion is a real distinction, not a mere relation between the moving object and the observer, as it is from our modern point of view. In retrospect we see that this was a great obstacle to the further development of physics.

Another great hindrance to the future development of physics was Aristotle’s conception of substance, matter and form. A primary substance is defined as that which can only be the subject of predication; it cannot be a predicate. So ‘red’ is not a substance, since ‘red’ can occur both as subject and as predicate. From a modern point of view Aristotle commits the sin of confusing use and mention; subject and predicate are now understood as linguistic categories, not the things talked about. This confusion is easy to understand since Aristotle had no theory of reference. Interpreted in modern terms, we may say that Aristotle held that individual things, which he called primary substances, are things referred to by singular terms. His statement that a primary substance cannot be a predicate should then be understood as saying that a term referring to a primary substance, i.e. a singular term, can never occur as predicate, which is a syntactical truth. This is, I think, a possible reading of Aristotle. The distinction between primary and secondary substances is thus made in terms of observable features of language, a praiseworthy expression of empiricism.

Any individual thing, i.e., primary substance, is constituted by its matter and form. These are not properties of substances; perhaps one could say that matter and form are two aspects of any existing individual thing. Primary substances have a number of properties, coming in different categories. In Physics Aristotle discerns seven such categories inhering in substances: quality, quantity, place, time, relation, activity and being affected. (In Categories he lists two more.) The conceptual structure of SUBSTANCE and PROPERTY does not admit talking about quantity of matter; quantity is attributed to substances, not to matter. The reason is that matter, or rather ‘matter’, avoiding Aristotle’s confusion of use and mention, can occur as predicate; hence matter is no substance according to Aristotle’s definition. Therefore it cannot have any properties inhering in it. So the concept QUANTITY OF MATTER has no place in Aristotle’s thinking. This conceptual fact is, I think, part of the explanation why physics didn’t make any substantial progress for 2000 years, since QUANTITY OF MATTER, i.e., MASS, is a necessary component in a theory about interactions between bodies, as will be discussed in detail in Sect. 10.6.

Aristotle clearly denied the legitimacy of any concept of motion of motion, and since change of motion is motion of motion, he had no concept of acceleration. Hence the very question of what causes accelerations cannot be formulated within the conceptual structure of Aristotle’s physics. Aristotle explained motion and change in terms of the distinction between actuality and potentiality. When a body moves from one place to another, the second place is a potential property of the body, and the motion is the actualisation of this potential property. Talk about potentialities invites, but does not necessitate, teleological thinking. The teleological character becomes evident when Aristotle uses potentialities as explanations of changes.


None of the concepts discussed are constructed as quantities. For example, TIME is discussed in terms of ‘now’, ‘before’ and ‘after’; PLACE is explained as a kind of surface surrounding the thing that is at that place, and so motion as change of place cannot be quantified. The idea of defining place as a particular point in a coordinate system was invented by Descartes, almost two millennia after Aristotle. One may say that Aristotle was primarily interested in explanation, not in predictions and calculations. I don’t know whether this reflects a lack of interest in predictions or the fact that there were no accurate clocks available in ancient Greece; their clepsydras were not useful for making any detailed studies of bodies in motion.

Summarising, Aristotle was the first to construct a theory of motion. He clearly based his reasoning on observations and in that respect he was an empiricist. He made several mistakes, but his Physics is, in my view, a step forward, away from the rationalistic outlook of Parmenides and Zeno towards an inquiry based on observations. But in order to make further progress, philosophers of nature had to dismiss great parts of Aristotle’s metaphysics of substance, matter, form, essential and accidental properties and the teleological character of physical explanations.

2.3 From Aristotle’s Physics to Classical Mechanics: Galilei and Newton

A crucial step in the development of the first successful physical theory, classical mechanics, was that Galilei, Newton and the others who contributed to it focused on describing the motion of bodies in more detail, instead of discussing the ontological status of motion, its explanation in terms of goals and whether motion of motion is conceivable. Answers to metaphysical questions about the relation between substance and quantity have no empirical consequences. What is relevant for making predictions are the relations between the different physical quantities attributable to a body in motion, and to answer such questions one must perform measurements.

Galilei started to measure the kinematic quantities, i.e., times, distances and velocities, of moving bodies. (Even if Aristotle had got the idea of doing so, there were no good clocks available in his time; so one important part of the explanation of the lack of progress in physics before Galilei was the lack of useful clocks.) Together with his insight that rest and uniform motion are relative to the observer, he was able to establish some well confirmed kinematical relations, for example that during free fall the distance travelled is proportional to the squared time of motion, and that projectile motion along a parabola can be analysed into two components, a uniform horizontal one and a free fall obeying the aforementioned relation that distance is proportional to the squared time.
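In modern notation (added here for illustration; Galilei himself expressed the relation in terms of ratios), the free-fall law says that

s ∝ t²,  or with the constant written out,  s = (1/2) g t²,

where s is the distance fallen, t the elapsed time and g the constant acceleration near the Earth’s surface; the horizontal component of projectile motion is simply x = v · t with constant v.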


By realising that uniform motion and rest are not distinct states, but the same state described from different coordinate systems, Galilei concluded that no cause is needed for uniform motion; it is change of motion that requires an external cause. This is a consequence of the principle of Galilean relativity.

In the period between Galilei and Newton three researchers, viz., John Wallis (1616–1703), Christopher Wren (1632–1723) and Christian Huygens (1629–1695), independently of each other studied the motion of colliding balls. They found a regularity, what we now call conservation of momentum. Momentum is the product of mass and velocity, and by discovering this regularity these researchers also implicitly introduced a new concept, quantity of matter, later called mass by Newton.1 This discovery was the starting point for the development of dynamics, the theory describing how bodies interact. Newton refers to the results of Wallis, Wren and Huygens in the first Scholium in Principia, and he in fact builds upon their results in this book.

Newton’s Principia established physics as an independent discipline. Although the full name of his book, The Mathematical Principles of Natural Philosophy, suggests that he still conceived it as a kind of philosophy, in fact it is not. (And Philosophical Transactions of the Royal Society was and still is a scientific journal, not a philosophical one!) Due to Newton’s success in solving a lot of dynamical problems using what we now call differential equations, his Principia became a paradigm, in this word’s original sense of ‘pattern’, for problem solving in physics. Using this book as a paradigm one could proceed and make great progress without discussing philosophical issues about change, existence, inner principles, etc. One could disagree sharply about such things; that doesn’t matter for the progress of physics. Everyone agreed on the criteria for success, viz., quantitative predictions confirmed by experiment.

There were in classical mechanics a number of questions that never got any answer. Let me mention three. The first is the remarkable feature that gravitational forces act instantaneously at any distance. Since forces were conceived as causes, it is really astonishing that a cause can act without delay at any distance, which is what the law of gravitation implies. Newton clearly acknowledged that, but rejected any effort to explain this fact by reference to hidden mechanisms or powers. He is famous for his statement ‘Hypotheses non fingo’. The second enigma is the concept of the infinitesimal. It was a crucial element in the calculus, extensively used by Newton, but what is it? An infinitesimal was characterised as a quantity that is less than any other quantity but still not zero. How is that possible? Isn’t this very concept a contradiction in terms? The third question is the ontological status of space and time. Newton famously claimed that space and time have independent existence, that even in a completely empty world there is a three-dimensional space and time goes on, second after second, without any relation to any motion of bodies. Leibniz argued against this view, holding that space and time merely are systems of relations between objects and events.

1 I will discuss this episode in the history of physics in more detail in Chap. 10.


This debate continues today, albeit in the framework of relativity theory. I’ll discuss some arguments in Chap. 17.

Further development of physics has brought to the fore several other questions, some of which were listed in chapter one. None of these older and newer questions can be decided by observations alone, so they are not purely empirical or purely physical problems. One is then led to the epistemological question of how one may proceed to answer such questions, how one may acquire knowledge. This brings us to the debate between empiricists and metaphysical realists. I will give my views on this debate in Chaps. 3 and 7.

2.4 Relativity Theory

Relativity theory is based on two postulates, the generalised relativity principle and the constancy of the velocity of light. The label ‘postulate’ suggests that these starting points have a status similar to that of axioms in mathematics. However, the constancy of the speed of light is in fact derivable from the laws of electromagnetism, as has been shown by Feynman et al. (1964) and Dunstan (2008). The fundamental laws of electromagnetism, i.e. Maxwell’s equations, were universally accepted and recognised as true some 30 years before Einstein published his STR paper (Einstein 1905b). It is also well known that Einstein early on, when thinking about the consequences of Maxwell’s equations, had realised that the velocity of light is not only a constant but also an upper limit for the motion of all bodies. Hence the only new idea in STR is the generalisation of Galilei’s relativity principle: all laws, not only the mechanical ones, have the same form under transformations between coordinate systems in uniform motion. The rest consists of derivations of consequences of electromagnetism in conjunction with this generalisation. This is indicated by the title of Einstein’s special relativity paper: ‘Zur Elektrodynamik bewegter Körper’ (On the electrodynamics of moving bodies).

The consequence of the two postulates is that we need to reconceptualise space and time; one must join them into a four-dimensional manifold, spacetime. The spatial distance between two events depends on the choice of coordinate system and there is no privileged one. Likewise for times: an observer in a spaceship passing the earth at high speed could observe an entire human lifespan of, say, 80 years during a few hours. But there is a transformation-invariant measure, the spacetime interval. Thus spatial distances and time intervals are not objective measures, but the spacetime interval is.
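In standard notation (one common sign convention, added here for illustration), the interval between two events with coordinate separations Δt, Δx, Δy, Δz is

Δs² = c²Δt² − Δx² − Δy² − Δz²,

and Δs² has the same value in every inertial coordinate system, although Δt and the spatial separations separately do not.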


Einstein immediately realised that a further generalisation is called for: why restrict the relativity principle to transformations between inertial systems only? Inertial systems have no privileged status, and, moreover, real systems, such as the earth, are not truly inertial systems. The generalisation to a theory which treats all coordinate systems, whether accelerated or not, as equal proved very hard to work out in mathematical detail. It requires the use of tensors, a kind of mathematical object whose theory was developed by Ricci-Curbastro and Levi-Civita around 1890. As we will see in Chap. 17, this has profound consequences for our concepts of matter and spacetime.

2.5 Quantum Theory

Classical theories, i.e. pre-quantum theories of physics, all take for granted that all physical variables are continuous. Between any two points in time there is another point of time, between any two places in space along a particular direction there is a further place, and between any two values of a quantity definable in terms of time and distance there is another one. Since all dynamical quantities indeed are definable in terms of time, distance and mass, as we will see in Sect. 9.3, all classical theories are based on the assumption that all dynamical quantities except mass are continuous. This is not merely the mathematical assumption that when we measure a number of values of a quantity the results are points on the real line, but also the physical assumption that physical objects can change state in infinitely small steps without any general restrictions.

But, alas, this assumption is false, which is the profound consequence of Planck’s discovery of the law for black-body radiation (Planck 1900). As will be discussed in Chap. 16, it is clear that Planck’s radiation law, which is confirmed beyond doubt by experiments, contradicts the assumption that matter emits and absorbs energy continuously. So energy is discretized. It follows that all quantities that are related to energy, such as momentum, also are discretized. It does not follow that spacetime is a discrete structure. However, all physicists seem to assume that it is possible to quantize gravitation, which, if successful, would likely show that spacetime too is discrete. But so far no success is in sight.

The fact that all changes of energy states are discrete is a strong restriction on the applicability of the general method for solving physical problems. Newton, one of the inventors of the calculus, solved dynamical problems by the now standard method of formulating a differential equation, solving it, and, lastly, putting in an initial value obtained from observations for the variable one is interested in. Since the solutions to differential equations are continuous functions, we can, using initial values as input, calculate all other values of the variable of interest. This method fails if there are discrete steps in the time evolution of the variable of interest, because a function with jumps is not everywhere differentiable. So one needs another technique. This technique is called ‘quantisation’; the idea is to replace continuous variables by differential operators with (mostly) discrete spectra. This step raises difficult interpretational problems, which will be discussed in Chap. 16.
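Two standard textbook formulas may illustrate the general idea (they are cited here only as illustrations, not as this book’s own derivations). Planck’s law requires that an oscillator of frequency ν exchanges energy only in discrete portions,

E = n · hν,  n = 0, 1, 2, …,

where h is Planck’s constant. And in the quantisation procedure the momentum variable is replaced by a differential operator,

p → −iħ d/dx,

so that, for example, the energy of a bound system is represented by an operator whose spectrum is discrete.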

Part II

General Philosophy of Science

Chapter 3

Empiricism from Ockham to van Fraassen

Abstract This chapter gives a brief overview of the empiricist tradition in philosophy. Although medieval nominalism primarily was motivated by logical and semantic considerations, in retrospect it is fair to view it as an empiricist position and a natural starting point for an overview of empiricism. The chapter continues by discussing classical empiricism, Mach’s and Duhem’s empiricism, logical positivism, Quine’s and finally van Fraassen’s versions of empiricism. The core of the chapter is Sect. 3.8, containing a discussion of what to accept as empirical evidence for a theory. In Sect. 3.9 an empiricist and nominalist conception of natural kinds is presented. The chapter ends with a presentation of six doctrines, jointly making up my empiricist position.

3.1 Introduction

Although philosophers disagree about the proper formulation of empiricism, it is easy to discern an empiricist tradition in the history of philosophy, a tradition in opposition to a metaphysical tradition. Empiricists in general are skeptical about assumptions about hidden structures, such as natural kinds, properties, relations and irreducible dispositions, assumptions which metaphysicians typically make in order to explain phenomena. Empiricists claim that evidence for such things is lacking and are skeptical towards such explanations. But it is, alas, a historical fact that many empiricist attempts at stating a basic empiricist principle have not survived criticism. Logical positivism is perhaps the most well-known empiricist effort to overcome metaphysics, but the entire programme finally collapsed due to internal incoherence. Later empiricists have tried new versions of empiricism, and the debate continues. During the last decades the major opposition to empiricism has been a set of doctrines called ‘scientific realism’, and empiricists of different kinds may thus be labelled ‘anti-realists’. But I have some misgivings about the focus of the debate, as I will spell out in due course. As a background to my misgivings, and to my own proposal for a basic empiricist principle, a very brief historical overview of empiricist positions is here presented.


3.2 Medieval Nominalism: An Empiricist Position

Medieval realists followed Aristotle in holding that there are universals which ‘inhere’ in things. They held that the significance of, for example, the sentence ‘Bucephalos is a horse’ is based upon there being two entities in the world, the primary substance Bucephalos and the universal HORSENESS, and that the truth of the sentence consists in the fact that HORSENESS ‘inheres’ in the primary substance Bucephalos. Thus, in modern vocabulary we may say that medieval realists held that tokens of (some) general terms refer to universals. This, they held, explains why e.g. horses appear similar to each other: they all share a common universal.

Aristotle defined an individual thing, i.e., a primary substance, as that which cannot occur as predicate, while universals can occur both as predicate and as subject.1 To a modern reader this is an example of conflating talk about terms and what these terms refer to: it is terms for substances, for example proper names, that cannot occur in predicate position, while terms for universals can occur both as nouns and as predicates. The conflation of talk about terms and what they stand for is common and understandable if one has no theory of reference.

Medieval nominalists had such a theory of reference. They rejected the Aristotelian view that predicates (and their mental counterparts, concepts) refer to universals. They held that the reason why we say of a number of things, seemingly similar to Bucephalos, that they all are horses, is simply that we have given a common name, ‘horse’, to all of them because they appear similar to us. That horses appear similar to us doesn’t entail that there exists any universal over and above the individuals. When a term is used as predicate in a true sentence, the term is, according to medieval nominalists, a name for the same thing as is referred to by the singular term in that sentence. This relation between the predicate and the individual thing it is true of (as we would say) they called personal supposition. So the words ‘Bucephalos’ and ‘horse’ in the sentence ‘Bucephalos is a horse’ stand for the same thing. Tokens of predicates are names of individual objects, hence the label ‘nominalism’.

Nominalists thus saw no need for assuming universals as items in the world. (But they accepted that words and ‘words of the mind’, i.e. mental counterparts to words, were universals.) ‘Need’ is the core word here; their methodological rule was the famous Ockham’s razor: ‘Entia praeter necessitatem non esse multiplicanda’ (‘Entities must not be multiplied beyond necessity’). It is usually ascribed to Ockham, although it doesn’t occur in his writings, but it seems to have been a common conviction among medieval nominalists. One is immediately prone to ask ‘necessary for what?’, and one might guess that Ockham and other nominalists meant that universals are not needed for explaining what our words mean, what we in modern terms would call ‘giving a semantics for natural language’.

1 “A substance – that which is called a substance most strictly, primarily, and most of all – is that which is neither said of a subject nor in a subject.” (Aristotle and Ackrill 1963, 2a11)


However, according to Spade and Panaccio (2019), this was not Ockham’s reason for endorsing nominalism; he held that assuming universals ante res, i.e., the platonic view of independently existing universals, as well as universals in rebus, i.e., universals instantiated in individual things, simply was incoherent.

3.3 Classical Empiricism

During the early modern period the focal topic of philosophy shifted from ontology to epistemology. The main battle line was expressed in terms of the epistemological notions rationalism and empiricism. The basic issue was whether one could give any a priori justification for substantial claims outside logic and mathematics. Rationalists held that some very general principles concerning the real world could be known a priori; one often rehearsed example was the principle that everything has a cause. Rationalists tried to establish a priori principles by which empirical knowledge could be justified, and they viewed epistemology as a first philosophy, a basis for all knowledge.

Empiricists were skeptical towards these efforts; they denied that we could know a priori anything at all about the real world. They held that knowledge about the external world is without exception based on inputs into our sense organs. Hence the slogan that the soul is a tabula rasa, a blank slate; all our ideas ultimately were the effects of impressions on our minds. The conclusion to be drawn is that all evidence is empirical evidence, where empirical evidence is interpreted as impressions on the mind and the ensuing ideas we get from these impressions.

Both classical empiricists and their opponents talked about ideas and impressions as if these things somehow were directly accessible and our knowledge of them certain. Hume, for example, inspired by the success of using experiments and observations in physics, suggested that we should do the same in the ‘moral sciences’, to which he counted philosophy, and in so doing he took for granted that the counterpart to experiment and observation in physics was introspection into our mind, where we would find our ideas and impressions. Thus he implicitly thought of these things as kinds of objects and, moreover, things we could have certain knowledge about; introspection was believed to be infallible.

Classical empiricists before Hume implicitly assumed that our impressions and ideas are representations of external objects, and they conceived of these representations as established by causal processes: external objects cause our impressions. But, then, how do we know that there really is a causal link between an external object and its representation in our mind? Hume famously recognised that this question cannot be answered using the empiricist principle that all evidence is empirical evidence. Our only access to external things is via the impressions they cause in us. What we directly observe are the effects in us, and then we infer that there is an external object causing our impressions of it. This inference cannot be defended by the empiricist principle. Before we can establish the causal relation we must be able to identify its
relata independently of each other. So classical empiricists faced the embarrassing question: what is the empirical evidence for your own empiricist principle? This version of empiricism invites skepticism. Hume’s response to this problem was to reject the very demand for ultimate justification. He clearly recognised that the skeptic has an upper hand in the theoretical debate. But he also noted that in practice all of us do believe most of the things we observe in our daily life and that total skepticism is not possible in practice. Even a staunch skeptic behaves as if he firmly believes most the reports he gets from his sense organs. His skepticism has no effect on most of his actions and we may reasonably reject his self report of being skeptical as a kind of selfdeception. As Hume put it, nature has not given us any choice on this matter (Hume 1986, book 1, part iv, section 2). Thus, Hume is the first to proclaim a methodological stance which may be called epistemological naturalism. The core idea is that we should not start our considerations about knowledge by doubting everything that can be doubted, as Descartes had done; instead we should take as a point of departure our everyday beliefs. This doesn’t mean that these beliefs about what we observe in ordinary circumstances are certain; but they provide the starting point for our inquiries. A consequence of naturalism is to reject the notion that epistemology is an independent foundation for empirical knowledge, that epistemology could be a first philosophy. Epistemological naturalism is the view that epistemology is an integrated part of empirical science, a view which nowadays has evolved almost into the default option for many epistemologists. Hume’s observation that we have no choice but to take for granted that there are external objects around us indicates, one might think, that there is something wrong with the notion that we observe our own ideas and impressions, not external things. Perhaps we should stick to the common sense idea that we directly observe external physical things, not our mental representations of them? One reason why classical empiricists held that we observe our own impressions, not external things, was the urge for complete certainty as the basis for empirical knowledge and they took for granted that impressions and ideas, i.e., recollections of earlier impressions, are the only things we could be certain about. If we say that we observe external things we must accept that such observations are not always veridical. One may believe one’s own observation, reported as e.g., ‘Lo, there’s horse in the field’, whereas closer scrutiny may reveal that it was in fact an elk. Hence, the careful epistemologist instead reports ‘I have an impression of a horse in the field’. Both rationalists and empiricists took for granted that such introspection reports are absolutely certain and that from certain knowledge of these impressions we could, under certain conditions, infer their sources. The demand for certainty is thus the culprit for the mistake of inserting mental representations, impressions and ideas, as the direct objects of cognition and mediators between external objects and the mind. Once we give up the demand for a completely certain basis for knowledge, we have no good reason to postulate ideas and impressions or any other mind states as objects mediating between our minds

and the external world. Moreover, there are also strong reasons to be skeptical about the infallibility of introspection.

3.4 Empiricism During the Nineteenth Century The dominant response to Hume’s philosophy was Kant’s transcendentalism. Kant joined the empiricist critique of rationalism, but he was not satisfied with Hume’s skeptical conclusion that ultimate justification of our knowledge is impossible. Kant distinguished between two levels of discourse, the transcendental level and the empirical level. At the transcendental level he claimed to have ‘deduced’ some fundamental principles for both our sensory and rational faculties. The forms of intuition, i.e., space and time, and the categories applied in our judgements can be known a priori after a transcendental analysis of our mind, he claimed. But the content of our judgments about the external world is determined empirically. In this way Kant upheld the notion that epistemology could be, in fact was, a first philosophy. The aim of his Critique of Pure Reason was to provide the philosophical basis for our empirical knowledge. The philosophical landscape during the nineteenth century was to a great extent determined by Kant’s transcendental philosophy. One group, the full-blown idealists, simply dismissed Kant’s distinction between two levels of discourse, the empirical and the transcendental level, and the ensuing distinction between two perspectives on objects, viz., things as they are in themselves in contrast to things as they appear to us, phenomena. Idealists held that the notion of things as they are in themselves is superfluous; we only need to bother about things as phenomena, period. Others, the neo-Kantians, kept Kant’s distinction between things as phenomena and things as they are in themselves and held that the goal of science is to find out how things are in themselves, to tell the final truth about things. They held that the goal of science is to arrive at completely objective descriptions of things and states of affairs. Such completely objective descriptions are freed from all influences of our minds. This position is in my view a contradiction of Kant’s transcendentalism; his very notion of a thing as it is in itself is the notion of an object about which we as cognising creatures cannot say anything at all. The notion of a thing as it is in itself functions in Kant’s philosophy merely as a contrast to the notion of a thing as phenomenon, a thing as we perceive it, a contrast he needs in order to be able to drive home his point that we cannot completely free ourselves from the contribution our own mind provides in the cognition of objects. A third group, the empiricists, resolutely rejected Kantian thinking. Comte, Mill, Mach and Poincaré, to name the most well-known empiricists during the nineteenth century, steadfastly held that observations are the basis for knowledge, that induction is justified and that scientific laws are generalisations from observations. Skeptical questions about the certainty of our knowledge either didn’t bother them,

or else were thought to be possible to rebut from a scientific perspective. I will here briefly discuss Mach and Poincaré, since these two focused on physics.

3.4.1 Mach In his The Science of Mechanics (Mach 1960), Mach gave a strongly empiricist exposition of classical mechanics. He found Newton’s version of classical mechanics too metaphysical in basically two respects. The first is Newton’s claim that space and time are independently existing entities; Mach joined Leibniz in holding that space and time are mere relational attributes of material bodies. The second is Newton’s assumption of forces; assuming such things is, in Mach’s view, superfluous metaphysics. Mach clearly and correctly points out that the basis for dynamics is the introduction of the concept of MASS, and that it is introduced in connection with observations of the accelerations of interacting bodies: ‘The ratio of the masses is the negative inverse ratio of the counter-accelerations.’ (op. cit. p. 218). Mach further claims that MASS is a thoroughly empirical concept: All uneasiness will vanish when once we have made clear to ourselves that in the concept of mass no theory of any kind whatever is contained, but simply a fact of experience. The concept has hitherto held good. It is very improbable, but not impossible, that it will be shaken in the future. . . . (op. cit. p. 221)
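Rendered in modern notation (the symbols here are mine, not Mach’s), the quoted definition of mass ratios for two interacting bodies reads

$$\frac{m_1}{m_2} = -\frac{a_2}{a_1},$$

where $a_1$ and $a_2$ are the mutually induced accelerations. Choosing one body as the unit of mass then fixes every other mass value by observed acceleration ratios alone.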

Mach here refers to the observation that linearly colliding bodies change velocities in constant proportions, independently of their velocities before the collisions. This we can observe without any theory; we need only to measure velocity changes of colliding bodies. As mentioned in the previous chapter, such measurements were first done by Wallis, Wren and Huygens, independently of each other, 20 years before Newton published Principia, and Newton refers to these experiments in his book. These experiments, and Newton’s use of them, will be further discussed in Chap. 10. Regarding the concept of FORCE Mach writes ‘denoting the moving force mφ [i.e. mass times acceleration] by the letter p,. . . ’ (op. cit. p. 270), thereby identifying force with mass times acceleration. In some places he writes that the measure of force is mass times acceleration. One could perhaps claim this is not equivalent to saying that force is mass times acceleration, since it is possible to hold that FORCE and MASS TIMES ACCELERATION are distinct universals that are always coinstantiated. Mach doesn’t discuss this possibility, but I’m convinced that he would reject such a position as superfluous metaphysics. For Mach there are no forces. Mach’s outlook is strongly empiricist. Mechanics is based on directly observable motions of bodies. Measurements of motions are made in terms of the fundamental quantities of DISTANCE, TIME and MASS, none of which requires more theory than simple inductive generalisation from experiments for their application, and the entire theory can be built up from this basis. Momentum conservation is the fundamental principle arrived at as a generalisation of observation reports. This is

the first step in theory construction, hence it is the most fundamental law in classical mechanics, as will be shown in Chap. 10. Regarding Newton’s laws Mach writes: We readily perceive that Laws I and II [i.e. Newton’s first and second law] are contained in the definitions of force that precede. According to the latter, without force there is no acceleration, consequently only rest or uniform motion in a straight line. Furthermore, it is wholly unnecessary tautology, after having established acceleration as the measure of force, to say again that change of motion is proportional to the force. (op. cit. p. 242)

Thus, Mach in effect says that Newton’s second law is a definition, since he writes that it is a tautology. I accept everything Mach says about Newton’s laws. Two things are missing in his book: (i) a general discussion of the ontology (did he dismiss all quantities?), and (ii) a discussion of the necessity of laws. One may also observe that Mach’s book only concerns classical mechanics. Mach is also well-known for his positivism, i.e., his rejection of unobservable entities such as atoms. This is not my cup of tea. One might very well be an empiricist, which is a position concerning what to count as scientific evidence, i.e. a methodological position, while holding that we have reason to accept hypotheses about unobserved objects, which is an ontological conclusion. The profound problems with positivism, in any of its forms, came to the fore with the criticism of logical positivism.

3.4.2 Poincaré Poincaré argues for a strong empiricist view in his Science and Hypothesis (Poincaré 1952), in particular in Chap. 6. My view, to be spelled out in Sect. 3.10, has several affinities with his, although his arguments differ from mine. What we basically agree about is that we can observe motions of bodies, but not forces and masses. Concerning MASS he concludes that ‘Masses are co-efficients which it is found convenient to introduce into calculations.’ (op.cit. p. 103). Thus, Mach and Poincaré agree on the status of the concept of mass. Using modern semantics we may say that they agree on holding that the quantities FORCE and MASS are predicates with extensions but lacking reference. Poincaré did not distinguish between mass coefficients that may be calculated when bodies interact at a distance and when they collide, which means that he did not consider the distinction between gravitational and inertial mass. This might be said to be congruent with his steadfast empiricism; to a critic pointing out this distinction he could have replied by saying that the distinction between gravitational and inertial mass has no observational consequences, so it is superfluous metaphysics. Well, we need, nevertheless, an explanation of why mass coefficients are exactly the same when bodies interact by colliding and when they interact at a distance,

via gravitation. Prima facie these seem to be quite distinct phenomena. (This strict proportionality between gravitational and inertial mass is explained by the general theory of relativity, which entails that they are identical.) About Newton’s second law Poincaré concludes (op.cit. p. 104) that by definition force equals mass times acceleration and that no future experiment can disprove it. He also observes that Newton’s third law immediately follows. This is correct, but then he writes: The principles of dynamics appeared to us first as experimental truths, but we have been compelled to use them as definitions. It is by definition that force equals mass times acceleration; this is a principle which is henceforth beyond the reach of any future experiment. (op.cit. p. 104).

This formulation suggests that at first one observed that forces were equal to mass times acceleration and later elevated it to a definition. If so, I beg to disagree; the identity of force and the product of mass and acceleration is a stipulative definition of the term ‘force’, as I will show in Sect. 10.6. In another confusing argument Poincaré concludes that Newton’s second and third laws are not strict. One may reasonably ask how a definition of a quantitative variable in terms of two others could be an approximation. His argument goes as follows: There is not in Nature any system that is perfectly isolated, perfectly abstracted from all external action; but there are systems that are nearly isolated. If we observe such a system, we can study not only the relative motion of its different parts with respect to each other, but the motion of its centre of gravity with respect to the other parts of the universe. We then find that the motion of its centre of gravity is nearly uniform and rectilinear in conformity with Newton’s Third Law. This is an experimental fact which cannot be invalidated by a more accurate experiment. What, in fact, would a more accurate experiment teach us? It would teach us that the law is only approximately true, and we know that already. Thus is explained how experiment may serve as a basis for the principles of mechanics, and yet will never invalidate them. (p. 105)

I find this paragraph confusing and the conclusion in conflict with the view that Newton’s second law is a definition and the third law is a logical consequence of the second one. (Moreover, the third law is a consequence of the second law and momentum conservation, not of the second law alone, as will be shown in Chap. 10.) My general view on concept formation as based on experiments is given in Chaps. 5 and 10.
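A minimal two-body sketch of the parenthetical claim (my rendering, anticipating the fuller treatment in Chap. 10): for an isolated pair of bodies with constant masses, momentum conservation together with the definition of force as mass times acceleration yields the third law,

$$\frac{d}{dt}\,(m_1 v_1 + m_2 v_2) = 0 \;\;\Rightarrow\;\; m_1 a_1 = -\,m_2 a_2 \;\;\Rightarrow\;\; F_1 = -F_2.$$

The second law alone, read as a definition, gives no such constraint; the conservation principle does the work.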

3.5 The Vienna Circle After the First World War a group of researchers who called themselves ‘Der Wiener Kreis’ (The Vienna Circle) began regular meetings. The common interest among its members was the foundations of science and a resolute dismissal of metaphysics, such as neo-Kantianism, neo-Hegelianism and other speculative theories. Their basic intuition was that metaphysical statements were not even false; they were

meaningless. (One is reminded of the colourful comment Wolfgang Pauli much later made about a paper which he was asked to comment upon: ‘Das ist nicht nur nicht richtig; es ist nicht einmal falsch!’ (‘That is not only not right; it is not even wrong!’)) The Vienna Circle took an explicit semantic perspective in the debate by formulating their famous verifiability criterion for meaningful sentences: If a sentence (German ‘Satz’) cannot be conclusively verified (or falsified), it has no (cognitive) meaning because it is neither true nor false. This enabled them to dismiss Hegelian and Kantian metaphysics as devoid of meaning. Verification of empirical sentences is based on observations. Singular observation sentences can be verified directly, so to speak, but how to verify a law, a general sentence? At the beginning they held that one could verify laws using inductive inferences from individual observed cases, but after some time they realised that this cannot be done in the strict sense of the word ‘verify’. No amount of observations of black ravens can be a verification of the sentence ‘All ravens are black.’ They concluded that law statements lack truth values, and Carnap and others in the Vienna Circle retreated to the position that scientific laws are instruments for predictions only, not significant sentences. Using laws and observations of initial conditions we can derive predictions about future events by applying logical rules and mathematics. These predictions are true or false since they are determined by observations, but the rest of the theory is a mere collection of mathematical instruments. This stance is simply incoherent. In derivations we use logical rules, each of which is justified by its property of preserving truth. The very point of using only logical rules (or at least trying to do that as far as possible) is precisely that if the premises are true, we know for certain that the conclusion is true. But why should we bother about logic at all, if we assume that law premises lack truth value? Or indeed, why call a string of linguistic signs a ‘premise’ if we don’t conceive of it as a declarative sentence, something that can be true or false? I’m a bit astonished that this inconsistency was not recognised by Carnap and others in the Vienna Circle.2 Presumably this has to do with the impact of Hilbert’s programme in mathematics on the Vienna Circle. Logic was conceived as a set of syntactical rules only. Model theory and philosophical theories of truth came later. A second problem with logical positivism is that it presupposes a very strict distinction between observation sentences and theoretical sentences. But the logical positivists never gave any strict criterion for being an observation sentence. Do we observe when looking in the microscope? In ordinary language we certainly say that we see objects in the microscope, but one might reasonably say that reports about observed bacteria, for example, are highly dependent on theory, in particular optics.

2 The problem is analogous to the Frege-Geach problem for non-cognitivists in ethics. A sentence expressing a moral judgement is, according to non-cognitivists, neither true nor false; it is an expression of a non-cognitive attitude, such as an emotion. But then, how could it be embedded in conditional sentences that are attributed truth values and used in logical inferences?

A third problematic aspect of the Vienna Circle’s view on the relation between observation and theory was the ineliminability of theoretical terms. Their goal was to define theoretical predicates in purely observational terms, but that proved impossible. Richard Creath writes apropos the aim of finding the unity of science in a common observation language: [B]y the time of this essay [Carnap (1938)] Carnap had already decided . . . that theoretical terms could not in general be given explicit definitions in the observation language even though the observation reports were already in a physicalist vocabulary. The partially defined theoretical terms could not be eliminated. This seems to have caused Carnap no consternation at all, and it never seems to have occurred to him that there was any conflict whatever between this result and the unity of science. This is because by this point the elimination of concepts was not the point of the exercise; their inferential and evidential integration was. (Creath 2014)

Carnap was surely right in pointing out that inferential and evidential integration is a prime goal of any philosophy of science. But that cannot be reached if some of the sentences in a theory are held not to be significant sentences, i.e., as lacking truth values; for such objects cannot have any inferential roles at all. The fourth problem is the logical positivists’ account of the relation between structure and content. In their manifesto (Wienerkreis 1929, sec. 3.2) we read: Inspired by ideas of Mach, Poincaré and Duhem, the problems of mastering reality through scientific systems, especially through systems of hypotheses and axioms, were discussed. A system of axioms, cut loose from all empirical application, can at first be regarded as a system of implicit definitions; that is to say, the concepts that appear in the axioms are fixed, or as it were defined, not from their content but only from their mutual relations through the axioms.

Thus we read that concepts are defined by their mutual relations, not by their content. In other words, the extensions of the predicates expressing these concepts are determined relative to each other, but not absolutely. There is no connection to something outside the system of sentences expressing these axioms. How, then, is this connection brought about? By some sort of mapping onto observable phenomena? This won’t do. There is a result in mathematical logic which completely undermines the idea that a formal theory can be uniquely mapped onto a description of empirical phenomena, viz., Löwenheim-Skolem’s theorem. This theorem says, roughly, that if a first-order theory is satisfiable in some domain of infinite cardinality, it is satisfiable also in the domain of natural numbers. In other words, any first-order theory to whose closed formulas we can consistently assign truth values can be interpreted as being about natural numbers. This entails that no consistent theory expressed in first-order logic can determine its own interpretation. And if we add interpretation rules to the theory ruling out unintended interpretations, we get a new theory to which Löwenheim-Skolem’s theorem also applies.3
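For reference, a standard modern formulation of the relevant (downward) part of the theorem, stated for countable languages (this wording is a gloss, not the author’s):

$$T \text{ a first-order theory in a countable language with an infinite model} \;\Longrightarrow\; T \text{ has a model whose domain is countable, e.g., the natural numbers.}$$

This is what licenses the claim that any such theory, however intended, can be reinterpreted as being about natural numbers.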

3 Putnam (1980) was the first, to my knowledge, to point out the relevance of Löwenheim-Skolem’s theorem in the discussion about the relations between theories and the world.

The conclusion to be drawn is that the connection between a theory and reality must be established by something non-theoretical, and that is, I submit, indexicals in sentences used together with pointing (for example: ‘This detector read 5.’). That fixes the truth values of some sentences and the references for some definite descriptions. Such sentences make up the connection between theory and reality; they are necessary for the semantics of any physical theory and in fact for any theory about parts of the real world. This conclusion is what Luntley (1999) has called ‘Russell’s insight’: ‘The semantic power of language to represent derives from the semantic power of context-sensitive expressions’ (p. 285)

This is a crucial insight, strongly relevant for the discussion about realism in Chap. 9 and about the direction of time in Chap. 13. As will be discussed in some detail in Chap. 5, concept formation and theory development go hand in hand in physics. Theoretical predicates in physics, except those defined in pure mathematics, are already from their very introduction given a physical interpretation in the sense of being at least indirectly connected to descriptions of observable objects and events, which can be identified in concrete situations by pointing. The concept of MASS is a prime example; it was introduced, under the expression ‘quantity of matter’, in the reports from Wallis, Wren and Huygens about collisions between bodies, as will be discussed in detail in Chap. 10. We can reconstruct a well-developed theory as an axiomatic system, disregarding how we arrived at theoretical predicates in the first place, and also suppressing their semantics. But in doing that, one is prone to miss how natural science differs from mathematics in the process of arriving at implicit definitions of new predicates. As will be discussed in Chap. 10, I agree with Mach and Poincaré that physical concepts used in fundamental laws are implicitly defined by being employed in those laws. It took quite some time for the philosophical community to fully realise the shortcomings of logical positivism, the common name for the positions taken by the Vienna Circle. Logical positivism was dominant for several decades. None of the problems listed above were seen as really serious by its adherents, if recognised at all. The demise of logical positivism began with Quine’s critique of the analytic-synthetic distinction.

3.6 Quine A basic tenet of logical positivism was the analytic-synthetic distinction. Analytic sentences were said to be true in virtue only of the meanings of the words used, whereas synthetic sentences depend for their truth also on non-linguistic facts. Quine attacked this distinction in a much-read paper (Quine 1951), and his critique started a development in the philosophy of science which ultimately destroyed the entire logical positivist program.

Quine, however, retained a strong empiricist inclination his entire life and is sometimes described as the last in the logical positivist tradition. Others describe Quine as the main figure in putting the nails in the coffin of logical positivism. John Passmore wrote ‘Logical positivism, then, is dead, or as dead as a philosophical movement ever becomes.’ (Passmore 1967, 57). If we understand ‘logical positivism’ as referring to a set of doctrines, one must agree with Passmore. But the empiricist movement, of which logical positivism is a part, is surely not dead. New versions have emerged and Quine must be counted as a strong empiricist; but certainly not a logical positivist. Empiricists are united in holding that empirical evidence is the only form of evidence there is, and can be, for scientific beliefs outside mathematics and logic. (But since Quine rejected the analytic-synthetic distinction, he found no valid reason to separate out logic and mathematics; all knowledge is ultimately based on empirical evidence.) It means, among other things, that any a priori argument for a statement having empirical content should be dismissed as illegitimate metaphysics, or simply as sheer nonsense. But the devil is in the details, as the twentieth-century debate in philosophy of science has shown. I’m prone to accept Quine’s general arguments against the analytic-synthetic distinction to the effect that analyticity, meaning, synonymy and definition are on a par in the sense that any of these notions can be used as primitive and the others can be defined in terms of it. But all are equally unclear and so is the distinction analytic/synthetic. However, one may recognise that in his Two dogmas Quine made one explicit exception from his criticism of the idea that synonymy is based on definitions. He rightly observed that most definitions are lexical definitions, reports of language use, so they cannot provide an explanation of synonymy; rather, they presuppose synonymy. But a stipulative definition is another matter. Quine didn’t use the word ‘stipulative’, but it is quite clear that this is what he refers to when talking about ‘explicitly conventional introduction of novel notations for purposes of sheer abbreviation.’ (op. cit. p. 26). That the truth of such a definition is justified independently of observations is unproblematic also for Quine. One could say that such a definition proclaims synonymy of two expressions as used in particular contexts. Or at least we could say that a stipulative definition is proclaimed to be true, never mind synonymy, i.e., sameness of meaning. A core principle in Quine’s philosophy is holism, the doctrine that no single sentence can be tested in isolation. Only bigger chunks of theory are able to entail a sentence that can be compared with observation conditionals. And Quine famously extended his holism to include even logic and mathematics; he held that it is in principle possible to revise a logical principle when confronting a conflict between theory and observation. Quine didn’t explicitly mention mathematical axioms in Two dogmas, but it seems clear that he included them in his radical holism. If so, I beg to disagree. In line with a modern view that mathematical axioms are to be viewed as implicit definitions of mathematical concepts one could say basically the same thing about axioms as about stipulative definitions; they are proclaimed to be true. The

difference between a stipulative definition and an axiom is that axioms have the form of implicit definitions of introduced concepts. (This is, however, no difference of relevance for epistemology; Quine himself has shown how to transform an implicit definition into an explicit one, i.e., a stipulative definition, see Quine 1976b.) The important point is how we know the truth of a sentence. And it is pretty clear that we know the truth of a sentence expressing a stipulative definition in the trivial sense that we hold it true by asserting it; and similarly for axioms, so long as they are accepted as such. This is no problem for the empiricist; he easily accepts that axioms and explicit definitions are known a priori, thus interpreted. (Empiricists, however, reject the idea that a priori knowledge is knowledge acquired by some sort of intuition into a platonic world of concepts.) The price is, as Wittgenstein observed in Tractatus, that sentences known a priori don’t inform us about any matters of fact in the real world. Mathematics and logic are mere structures, viz., linguistic structures. Upholding the distinction between empirical and formal sentences conflicts with the extreme form of holism propounded by Quine in Two dogmas. And, in fact, Quine later retreated from that position, as can be seen for example in the following discussion with Roger Gibson reported in Gibson (2004, 32): In a video discussion concerning Quine’s naturalized epistemology, I had the opportunity to suggest to Quine that the strong version of revisability is rather hard to take, especially when applied to laws of logic. Quine responded as follows: “Well I think I rather agree. I think nowadays it seems to me at best an uninteresting legalism.” The expression “uninteresting legalism” is Quine’s marker for earlier views that he has come to view as - if not altogether wrong, and perhaps even in some Pickwickian sense correct - needlessly extreme.

No matter what Quine himself thought, I believe it possible to separate logical and mathematical sentences from other sentences in a theory; the former are held true, not because they are supported by any kind of evidence, but because they jointly and implicitly define a set of concepts useful for scientific reasoning or are consequences of such definitions. As we will see in Chaps. 5 and 9, the same is true of some definitions of physical concepts.

3.7 Van Fraassen’s Constructive Empiricism Van Fraassen’s constructive empiricism (van Fraassen 1980) is a way to stick to the central tenets of empiricism while avoiding one fundamental problem of logical positivism, the stance that theoretical sentences lack truth value. Van Fraassen held that every sentence in a scientific theory is true or false. On this point Quine and van Fraassen are in agreement. But, van Fraassen claimed, as empiricists we need not bother about the truth of theoretical sentences. When we assert a theory we only assert that it is empirically adequate, i.e., that its empirical substructure is isomorphic with all observable phenomena. (This is van Fraassen’s phrasing in his (1980).) Undoubtedly, this is congenial to the empiricist mind; never mind about

questions of ultimate truth, we are only interested in whether a theory tells the truth about what we can observe, period. Van Fraassen made a distinction between accepting a theory and believing it to be true. Acceptance means accepting it to be empirically adequate, which means that those sentences expressing the empirical substructure are true. We may be justified in accepting a particular theory to be empirically adequate, but we are never justified, he claimed, in going further and claiming that all the sentences of that theory are true. I’m sympathetic to van Fraassen’s empiricist outlook, but I see a problem in his theory. Empirical adequacy means, according to him, that the empirical substructure of a model of a theory is isomorphic to all observable phenomena. Surely, we have never observed all observable phenomena in the domain of a certain theory, so claiming that a theory is empirically adequate is an inductive inference from observed to unobserved but observable phenomena of the same kind. This inference is justified according to van Fraassen. But we are never justified in making the further inductive inference from descriptions of observable phenomena to higher theoretical layers in the theory. But why this very sharp demarcation between two steps in inductive reasoning? The absence of any good reason for this demarcation undermines van Fraassen’s distinction between accepting and believing a theory. Let us consider an example, Maxwell’s first equation, which is a fundamental law of electromagnetism. This equation says that the total charge inside a closed surface is proportional to the total electric flux through that closed surface (the equation is written out at the end of this section). Surely, we can never directly observe the electric charge, nor the electric flux, so this is certainly a theoretical law. What we can observe are motions of bodies big enough to be observable and then, using Maxwell’s equations (usually we need more than just the first one), we can calculate the motion of an observable body attributed with charge. Suppose we perform a series of experiments on this body, observe its motions and find that the predictions made using Maxwell’s first equation and other laws are always correct. Van Fraassen claims that we are now justified in believing all propositions about future motions of the observable body derived from Maxwell’s equation (and initial conditions); but we are not justified in believing the truth of Maxwell’s equation. This seems to be arbitrary, at least so long as we have not analysed the conceptual relations between descriptions of motions of observable bodies and the electromagnetic concepts charge, field, etc. (This will be done in Sect. 10.8). Now, van Fraassen may use the same argument as he used in another debate (Musgrave 1985), viz., that I have drawn a principled distinction between observable and unobservable phenomena and that this is not justified. That distinction varies with the development of science and is drawn by the scientists themselves; it depends on the actual state of the discipline, technological resources in particular. For example, molecular structure is now observable using electron microscopes, while 100 years ago it was not. So perhaps van Fraassen could say that physicists nowadays hold electric fields and charges to be observable, which would mean that believing Maxwell’s equations to be true doesn’t conflict with his constructive empiricism,

since they describe observable things, i.e., things contemporary physicists call ‘observable’. I grant van Fraassen’s point, but if anything it makes the case for his distinction worse: why should we say that on the one side of the distinction inductive inferences are justified and on the other not, if this distinction is not based on any fundamental epistemological principle, but is an ever-changing limit, a distinction determined by the current state of technology? Van Fraassen’s position apparently comes down to the recommendation to follow scientists in their practice of believing firmly established parts of science while taking a more cautious stance towards higher spheres. This is indeed reasonable, but it does not amount to any epistemology in the sense of justification for inference principles. Constructive empiricism was proposed as a better view on science than scientific realism, the majority position among present-day philosophers of science. Scientific realists typically hold that our best theories in at least the mature sciences (by which they mean physics, chemistry, and their subdisciplines) are approximately true and that this is the best explanation for the success of science. In other words, the argument is of the form ‘inference to the best explanation’. I am prone to join van Fraassen in pointing out that (i) realists have not provided any impartial evaluation of explanations as bad, good and better, (ii) reasons for belief in the truth of a theory are the same as the reasons for the belief that the theory is empirically adequate; the additional belief in the truth of the theory is supererogatory, see van Fraassen and Churchland (1985, 254), (iii) explanatory value basically is a pragmatic property which has no epistemological bearing; there simply is no valid inference from explanatory value to probability of truth. All three arguments are in my view fair criticism of the realist position and I have not seen any convincing rebuttal to any of them. (Presently many scientific realists seem to have realised these problems and moved to structural realism, a more attenuated position.) But van Fraassen’s constructive empiricism is no viable alternative. And he seems to have realised that himself.
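For reference, the law invoked in the Maxwell example above (Maxwell’s first equation, i.e., Gauss’s law) can be written in standard integral form as

$$\oint_{S} \mathbf{E}\cdot d\mathbf{A} \;=\; \frac{Q_{\text{enc}}}{\varepsilon_{0}},$$

where $\mathbf{E}$ is the electric field, $S$ a closed surface, $Q_{\text{enc}}$ the total charge enclosed by $S$, and $\varepsilon_{0}$ the vacuum permittivity: the total electric flux through the closed surface is proportional to the enclosed charge, just as described above.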

3.7.1 Van Fraassen’s Empiricist Stance In his (2002) van Fraassen has made a slight position change; there he argues for what he calls ‘the empiricist stance’, which is not primarily or only a set of beliefs, but a kind of general perspective. The empirical stance does not include the belief, justified or not, that well supported theories are empirically adequate. He writes that rationality does not forbid inductive inferences, but neither can induction be rationally justified. The reason for this partial retraction from constructive empiricism is, I guess, his realisation that there is no logic of induction, as he later admitted, see Monton (2007, 343). If there is no logic of induction it is impossible to justify the conclusion, based on actual observations, that a particular scientific

theory is empirically adequate; remember that empirical adequacy was defined as being true about all observable phenomena. But still he is critical towards scientific realism. He gave the following characterisation of the empirical stance: What exactly are the targets of the empiricist critique? As I see it, the targets are forms of metaphysics that (a) give absolute primacy to demands for explanations, and (b) are satisfied by explanations-by-postulate, that is explanations that postulate the reality of certain entities or aspects of the world not already evident in experience. The empiricist critiques I see as correspondingly involving (a) a rejection of demands for explanation at certain crucial points and (b) a strong dissatisfaction with explanations (even if called for) that proceed by postulation. (van Fraassen 2002, 37)

I see here a strong affinity between van Fraassen’s empirical stance and medieval and modern nominalism; both are skeptical towards the explanatory value of postulating properties, relations, dispositions, necessities, etc. Another clear expression of these attitudes is found in Newton’s Principia: In experimental philosophy we are to look upon propositions collected by general induction from phenomena as accurately or very nearly true, notwithstanding any contrary hypotheses that may be imagined, until such time as other phenomena occur by which they can be made either more accurate, or liable to exception. (Principia III, Rule IV, p. 385)

Van Fraassen comments: This is Newton’s battle cry against the Cartesian “method of hypotheses” in his century’s great methodological dispute. We can easily see how the Jesuit’s argument would raise parallel questions for it. How do we identify the phenomena? What do they mean (that is, how do we distinguish an accurate minimal description from one that amounts to a hypothesis by effectively adding an interpretative element)? Finally what is this general induction, i.e., what are the implications of the phenomena? (op. cit. p. 127)

These questions go to the heart of scientific methodology: what is to be counted as evidence for a theory and how should we conceptualise evidence?

3.8 Evidence 3.8.1 Evidence and Reasons A common view is that ‘reason for’ relates things that are truth-apt, i.e., sentences, propositions and/or contents of beliefs. Logical relations provide the template: true premises in the valid proof of a formula make up complete justification for it. In a similar vein many philosophers of science (not Popper) accept that a number of true singular sentences partially justify, i.e., give reason for, belief in their corresponding generalisation, given some restrictions on the predicates used. (This will be thoroughly discussed in Chap. 5). But ‘evidence’ is a broader term than ‘reason for’. It is common to say things such as ‘your observation is evidence for the hypothesis’ or ‘these data give us evidence

for the theory in question’, hence evidence may be things that are not truth-apt. An observation report, which is true or false, may also be called evidence for a hypothesis. So evidence comes in two forms: non-propositional, i.e., objects, data or observation acts, and propositional, i.e., complete sentences. Only the latter form of evidence may be called ‘reasons’, because only it is truth-apt. This analysis of the relation between ‘evidence’ and ‘reason for’ was made by Haack (2009, ch. 4). I will return to this topic in Sect. 5.3.

3.8.2 Empirical Evidence The central tenet of empiricism is that all evidence we may have for a theory is empirical evidence; the interface between the external world and the space of reasons is the evidence relation. All agree that observations may be empirical evidence for a certain theory by being evidence for a sentence derived from the theory and describing an experiment. Such a sentence is what Quine called an ‘observation conditional’. It has the general form ‘Under conditions C, O is observed.’ It is crucial that it is possible to agree on the truth or falsity of observation conditionals independently of one’s other convictions. There is no point in asking for evidence for an opponent’s theory if you are not willing to agree with your opponent about what to count as evidence for a theory. This is the bottom line in all empirical science. An influential train of thought, started by Thomas Kuhn’s The Structure of Scientific Revolutions, denies the possibility of theory-independent observations. This is a consequence of accepting Kuhn’s incommensurability thesis, i.e., the claim that a paradigm succeeding its predecessor is incommensurable with it. But the incommensurability thesis is false, which may be inferred from the conclusion of a paper by Davidson (1973a). Davidson discussed the most extreme case of conceptual distance, people from two completely different cultures lacking any common language and trying to communicate. His conclusion, strongly supported in my view, was that the very idea that people living in completely foreign cultures live in completely different conceptual worlds is incoherent. The reason is that the claim that other people live in another conceptual world presupposes that we know that those people use language, and that in turn requires that we can translate at least some of their expressions to our language. Davidson’s conclusion was that the very idea of fundamentally different conceptual schemes is incoherent. This applies as well to scientists working in different paradigms. If people share a common language it is not difficult to find numerous occasion sentences, i.e., sentences describing a feature of a here-and-now situation, that people on the spot easily agree on, no matter the differences in world-views. This is directly applicable to a situation where two researchers, one from an old paradigm and another from the new paradigm, try to communicate. In the end they will agree on some observation reports. They may disagree about the meaning of such a report, although they agree on its truth value, but this agreement is the crucial thing.

Ultimately, successive paradigms in a discipline can be compared, and the area of comparison consists of testable predictions. Hence, the starting point in epistemology consists of observation reports that observers on the spot agree upon. No individual observation report is beyond doubt, but any such doubt is based on other observation reports conflicting with the first. So observation reports, taken collectively, are the basis for empirical knowledge. Since no individual report is beyond doubt, I’m not endorsing epistemological foundationalism. We need no unrevisable truths or ultimate justification for knowledge. Knowledge claims are provisional. The crucial question is what not to count as empirical evidence, and hence no evidence at all. Two factors which any empiricist rightly should exclude as lacking evidentiary value are explanatory force and simplicity. Explanatory force. Popper once remarked that he observed a definite difference between relativity theory, on the one hand, and Marxism and psychoanalysis, on the other. These three theories were, in his youth in Vienna in the twenties, all new and exciting, eagerly discussed in Vienna and elsewhere. Popper claimed that whereas Marxism and psychoanalysis were able to come up with fine explanations of events ex post facto, they were completely useless for making definite predictions in their respective domains. Relativity theory, by contrast, very clearly made predictions. All interested had heard that relativity theory predicted that heavy masses bend light rays when passing nearby, and that this prediction was beautifully confirmed to be correct, as observed by Eddington’s solar eclipse expedition in 1919. On this point I completely agree with Popper. A theory must be able to make testable predictions; lacking that, we have no evidence whatsoever for it. Why, then, completely dismiss explanatory force as possible evidence for a theory? The simple answer is that the concept of explanation is hopelessly perspective-dependent. The debate between scientific realists and van Fraassen illustrates that. Scientific realists typically hold that at least some central terms in mature sciences refer (this is Boyd’s formulation in Boyd 1984). Their argument is that it is the best explanation of scientific progress; they claim that the enormous scientific progress of the last 200 years or so would be a miracle if there were no such things as atoms, charges, masses, electric fields, etc.; progress consists in discoveries of these things reported in (approximately) true sentences. It is clear that this argument’s force depends on how we classify explanations as bad, good, better and best, and here we empiricists think that realists beg the question. One could, for example, join van Fraassen by saying that the best explanation of the success of science is that scientific theories are empirically adequate, that their observable parts fit observations (in van Fraassen’s terminology, ‘has a model whose empirical substructure is isomorphic to phenomena’). The fact that a particular theory is empirically adequate doesn’t entail anything about its truth. His realist opponents then argued that being a realist is a braver stance. Thus Musgrave (1985, 199) wrote:

Suppose the realist tentatively accepts a theory as true, while the constructive empiricist tentatively accepts it as empirically adequate. The realist takes a greater risk. But he takes no greater risk of being detected in error on empiricist grounds. So given strict empiricism (the principle that only evidence should determine theory choice), it seems that we ought as well be hung for the realist sheep as for the constructive empiricist lamb.

Van Fraassen replied (van Fraassen 1985, 255): ‘But since the extra opinion [i.e. believing a theory to be true] is not additionally vulnerable, the risk is – in human terms – illusory, and therefore so is the wealth.’ Van Fraassen is here obviously correct. One may argue that we have reason to believe also theoretical statements in a theory to be true, if we believe the empirical parts; not because it explains empirical success, but because otherwise we face problems of coherence, see the arguments against logical positivism above. But the realist argument, which is a case of inference to the best explanation, is not convincing. Van Fraassen and Achinstein (and maybe also some other philosophers of science) have, in my view convincingly, argued that the concept of scientific explanation is a strongly contextual and pragmatic one (van Fraassen 1980; Achinstein 1981, 1984). What one person or group accepts as a correct or good explanation of a phenomenon may be found unsatisfactory as explanation by another person or group. Here is a quote from the current debate illustrating the point: It is fair to say, however, that most philosophical discussions of explanation in the natural sciences eschew places where the mathematics (via the development of singularities) would say that the world isn’t law-governed. That is, explanations almost always involve some kind of subsumption of the explanandum phenomenon under some kind of regularity. (Even sophisticated non-covering law accounts, such as Woodward (2003) recent causal account, look to invariances - kinds of regularities - of some kind or another.) But if one’s interest is in understanding the robustness of the patterns of behaviour that we see, a focus on regularities and lawlike equations very often turns out to be the wrong place to look! We need to understand why we have these regularities and invariances. We need, that is, to ask for an explanation of those very regularities and invariances. This is the fundamental explanatory question. The other accounts don’t ask that question, in that they typically treat those regularities and invariances as given. The answer to this fundamental question necessarily will involve a demonstration of the stability of the phenomenon or pattern under changes in various details. (Batterman 2010)

Batterman thus claims that the fundamental explanatory question is why there are regularities and invariances, whereas Woodward takes these as explanans. Obviously, they have different views on what needs an explanation. I don’t see why one explanatory question should be deemed more fundamental than the other. I don’t think there is any fact of the matter as to what is the correct choice of fundamental explanation; what you ask about, and what you don’t ask about, depends on your perspective, your previous beliefs. Explanation requests are strongly perspective-dependent. There are two ways to disagree about an explanation: either disagreeing about the truth of the explanans, or about the form of explanation. If such disagreements could be settled by reference to agreed empirical evidence, all would be fine; but then explanatory value would be no additional aspect in epistemological evaluation

of theories. On the other hand, if disagreements concern forms of explanation or metaphysical background assumptions, there is no hope of agreement on the plausibility of a theory; and, obviously, the disagreement is not about the predictive force of the theory. The disagreement is whether explanatory force has any evidentiary value at all. I think it good advice to keep epistemological and metaphysical questions apart, as far as possible. If we disagree about the reasons to believe a particular theory based on different metaphysical convictions, we should say that and not describe the dispute as concerning epistemology and empirical evidence. Likewise we should distinguish epistemological and pragmatic questions, where explanations, in my view, clearly belong to pragmatics. If epistemology aims at intersubjective and perspective-independent norms for what to accept as valid knowledge, explanatory value cannot be used. For the empiricist the sole epistemic value of theories is predictive success. Claiming that explanatory force has no evidentiary value does not mean rejecting or downgrading the urge for explanations. We certainly want an explanation of, e.g., such remarkable phenomena as dark matter. Suppose someone comes up with a nice theory explaining to us what dark matter is. How would we evaluate this purported explanation? Surely, physicists would not be willing to accept this explanation as trustworthy unless one can derive new testable predictions which come out true. Lacking such tests, the purported explanation would be regarded as, at best, an interesting guess to be tested. Explanation is connected to unification. There has been a constant urge to unify physics since at least Newton’s time, and a theory unifying seemingly disparate phenomena has always been viewed as a great achievement and as having great explanatory value. I will discuss unification as explanation further in Chap. 6. In so far as an attempted unification allows new testable predictions, it has evidentiary value; if not, it has no evidentiary value, and, as we will see in Chap. 6, unification without new testable predictions is easy to achieve; too easy to be worth any consideration. One such example, to be discussed in Sect. 17.5.2, is the purported evidentiary value of string theory entailing gravitation. Simplicity. Physicists and philosophers have a strong preference for simpler theories over less simple ones. If two theories are empirically equivalent, but one is simpler than the other, we all prefer the simpler one. This is easy to understand since it is easier to use and understand a simpler theory than a more complex one. But this is of course a pragmatic value, not a matter of trustworthiness. So this is no argument for simplicity as an epistemic criterion in theory choice. Another argument against simplicity as evidence for a theory is its dependence on predicates. A theory which is complex in the sense of containing a great number of interrelated sentences, which utilise a great number of theoretical predicates, can be vastly simplified by the introduction, by explicit definition, of new predicates. As an illustration, Scorzato (2014) has shown that the standard model of particle physics, which usually is stated using 30 complicated equations, combined with Einstein’s equation, the fundamental relation in GTR (actually it is 10 independent equations), can be neatly written as one very simple equation: a single suitably defined expression set equal to zero! This equation, one may

say, is the simplest conceivable one encapsulating the standard model + GTR. But this formulation doesn’t add to our reasons for belief in these theories and it is easy to recognise that any alternative theory to the standard model + GTR can be expressed in the same simple way. So simplicity in formulations is a function of our effort and success in inventing new predicates. It should be obvious that simplicity cannot be a feature independent of the chosen language; a complex expression can always be simplified by the introduction of new variables and new predicates. Simplicity is a heuristic and pragmatic value.

3.8.3 Is Inconsistency Counter-Evidence Against a Theory? It might go without saying that no believable scientific theory can contain contradictory claims. It might seem obvious because from a contradiction one can derive any proposition, hence such a theory doesn’t exclude any state of affairs, which means that it is empirically empty. In practice, however, physicists have sometimes been willing to work with theories that at least seemed to contain contradictions. One example is the use of the concept of an infinitesimal, a quantity defined as ‘less than any other quantity but not zero’, in classical mechanics. The characterisation of this concept sounds self-contradictory; between zero and any positive real number there is another real number greater than zero, so the word ‘infinitesimal’ cannot refer to any real number. Still, infinitesimals were successfully used in physics for two hundred years. It was not until Cauchy and Weierstrass gave us a consistent definition of derivatives in terms of limiting procedures that this inconsistency was removed from mathematics and physics. Derivatives are often expressed in the form df/dx, and when making calculations physicists often treat ‘df’ and ‘dx’ without problem as denoting small numbers, in spite of the fact that neither df/dx nor its components are terms. So inconsistency in the definitions doesn’t necessarily jeopardise successful use of concepts. But, of course, some care is needed! (And a consistent non-standard analysis, containing the concept of infinitesimal, was invented in the sixties.) A somewhat similar example is Dirac’s delta function, introduced by Dirac in his exposition of quantum mechanics. It is defined as

$$\delta(x - x_0) = \begin{cases} 1, & \text{if } x = x_0 \\ 0, & \text{if } x \neq x_0 \end{cases}$$

But the integral $\int_{-\infty}^{\infty} \delta(x - x_0)\,dx = 1$! It seems plainly inconsistent! As is well known, the inconsistency is avoided by changing the order of the limiting procedures, viewing the delta function as the zero-width limit of a Gaussian curve whose integral equals unity (written out explicitly below). One may say that mathematical consistency in both cases ultimately is restored. But the interesting thing is that in neither case the empirical adequacy of the theory

was affected by the mathematical inconsistency. And, furthermore, physicists didn’t count these mathematical inconsistencies as evidence against the theory in question. This was a perfectly rational judgement, and we should learn to follow suit. A reasonable conclusion is not to be totally dismissive about what appear to be inconsistencies in an otherwise promising theory. They might be resolved by some theoretical ingenuity. The point is that mathematics is a tool: we simplify our descriptions of real situations so that we get a well-defined problem which can be solved by our known mathematical methods. But in many cases the utilised mathematical objects do not correspond to anything in the real world. Schwarz (2006(1966)) has made this point clearly. One of his examples is precisely Dirac’s delta function: . . . mathematics has often succeeded in proving, for instance, that the fundamental objects of the scientist’s calculations do not exist. The sorry history of the Dirac Delta function should teach us the pitfalls of rigor. Used repeatedly by Heaviside in the last century, used constantly and systematically by physicists since the 1920’s, this function remained for mathematicians a monstrosity and an amusing example of the physicists’ naiveté until it was realized that the Dirac Delta function was not literally a function but a generalized function. It is not hard to surmise that this history will be repeated for many of the notions of mathematical physics which are currently regarded as mathematically questionable. The physicist rightly dreads precise argument, since an argument which is only convincing if precise loses all its force if the assumptions upon which it is based are slightly changed, while an argument which is convincing though imprecise may well be stable under small perturbations of its underlying axioms. (Schwarz 2006(1966), 232)
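The limiting-procedure reading of the delta function mentioned above can be made explicit (a standard textbook formulation, not specific to this book):

$$\delta(x - x_0) \;=\; \lim_{\sigma \to 0^{+}} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - x_0)^{2}/(2\sigma^{2})}, \qquad \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - x_0)^{2}/(2\sigma^{2})}\, dx = 1 \;\text{ for every } \sigma > 0,$$

where the limit is to be taken after integration against a test function; taken pointwise, it reproduces the apparent inconsistency.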

The lesson to draw is that we should very clearly distinguish between mathematical and physical objects. The claim by Tegmark (2008) and other platonists that the physical universe is a mathematical entity is a big mistake. I don’t defend, of course, a general acceptance of inconsistent theories in science, not at all. My simple point is that there have been cases in the history of physics where a theory contained what seemed to be an inconsistency but which nevertheless was not counted as evidence against the theory, and rightly so. Some inconsistencies have no impact on descriptions of observable events. Some logicians, such as Graham Priest, have taken up this train of thought and developed what is called ‘paraconsistent logic’; see Norman et al. (1989). What must be rejected is any theory from which one can derive both a testable prediction and its negation. But if such were the case, we would not say that the inconsistency is evidence against the theory; we would dismiss it before even considering possible evidence.

3.9 Classification: Natural Kinds We all have a natural tendency to classify things, to use universal concepts in Ockham’s formulation. This natural habit is reflected in language use: the very act of making a statement, uttering a declarative sentence with assertoric force, is to make a classification. The assertion that a is F is the claim that the object a is in


the extension of F. This is not meaningful when all objects satisfy F. Neither is it meaningful if the identity of a is given in the form 'a is the only object that satisfies F', for if so we could reformulate 'Fa' as 'That object which is F is F'. So useful predicates sort things into two classes, those which satisfy the predicate and those which don't. When we justifiably assert that a number of objects each satisfy a certain predicate F, we have found that each of these objects satisfies the criteria of application for F; hence they are similar in this respect. Ockham and all later nominalists agree on that. The question is whether this similarity reflects a 'deeper' ontological level, whether there exist universals in the real world. Nominalists, including myself, deny that. Medieval nominalists, however, accepted two kinds of universals, the mental ones, i.e., concepts, and their linguistic counterparts, predicates. It is understandable that they held this view; it is in the nature of a universal to 'inhere' in several things, which appears analogous to the linguistic fact that a predicate may be true of many things. One may perhaps not want to say that concepts 'inhere' in the things they are true of, but the relations 'inhere', as obtaining between a universal and an individual thing, and 'true of', as relating a predicate and an object, appear structurally analogous. What medieval nominalists strongly rejected was the notion that there are any universals in the external world. Medieval nominalists held that a predicate, and its mental counterpart the concept, could signify, i.e., make us think of, several individual things. But it doesn't follow that linguistic and mental universals refer to universals in the external world, 'universalia ante res', as medieval philosophers called them. The central metaphysical question is whether our way of classifying things by using general terms expressing concepts corresponds to, i.e., refers to, real structural features in nature, and whether we can justify assumptions about such structural features. Medieval nominalists, Quine and Goodman don't think we can do that. The assumption that a general term refers to an existing entity, a universal, does not, when added to a theory, entail any new testable predictions of that theory. Quine's starting point in his Natural Kinds (chapter 5 in Quine (1969)) was the same as the medieval nominalists': we have an innate tendency to sort things as more or less similar, thus thinking in terms of natural kinds. But this is no argument for assuming natural kinds as referents of general terms. Quine's next point is that as a science matures we arrive at more theoretical classifications, making the notions of kind and similarity superfluous. He writes, apropos the use of the notion of similarity when talking about intelligence: Sometime, whether in terms of proteins or colloids or nerve nets or overt behavior, the relevant branch of science may reach a stage where a similarity notion can be constructed capable of making even the notion of intelligence respectable. And superfluous. In general we can take it as a very special mark of the maturity of a branch of science that it no longer needs an irreducible notion of similarity and kind. It is that final stage where the animal vestige is wholly absorbed into theory.
In this career of the similarity notion, starting in its innate phase, developing over the years in the light of accumulated experience, passing then from the intuitive phase into theoretical similarity, and finally disappearing altogether, we have a paradigm of the evolution of unreason into science. (Quine 1969, 137-138)


Biological classification is a case in point. Unaided by theoretical knowledge we are prone to classify dolphins as fishes (ask a little child!); their appearance is similar to that of fishes. But we know better and classify them as mammals. We have found that criteria other than how they live and behave in water are more useful for scientific purposes. We have decided that humans and dolphins are similar in the sense of both satisfying a list of explicitly stated criteria for the predicate 'mammal'. This is what Quine calls 'theoretical similarity'. The notion of (unreflected) similarity is not sufficiently useful and instead we use explicit criteria for satisfaction of a predicate. This does not force us to postulate universals. Moreover, medieval nominalists and Goodman held that postulating such things simply is incoherent.

3.10 My Empiricist Stance My empiricist position consists of the following six components. 1. The core idea in empiricism is that empirical evidence is the only evidence there is for theories/sentences/propositions/beliefs about the external world. Empiricists are convinced that knowledge about the external world without exception is based on sensory experience. This is no leap of faith, but based on what we know about the human brain and about concept formation and cognition; hence it is an application of empirical knowledge. And it is part and parcel of epistemological naturalism, the rejection of the notion that epistemology is, or could be, an inquiry independent of empirical knowledge. This stance has two important consequences: skepticism towards a priori reasoning outside logic and mathematics, and parsimony regarding ontology. 2. Anti-foundationalism and revisability. An empirical theory is based on observations and our knowledge about our cognitive apparatus tells us that observations are not completely certain. This might appear as a vicious circle, inviting general skepticism. But I think not. Any statement about the external world can be doubted based on something else we consider more trustworthy. We all sail in Neurath's boat, repairing it plank by plank while still being afloat. 3. Epistemology is a social enterprise. Epistemology has almost exclusively been discussed in terms of reasons for beliefs and therefore implicitly been framed as an inquiry into the conditions for an individual person's rational judgements about what they know. A common view has been that empirical knowledge is based on what is immediately given, i.e., given to us in our minds. Hence, what is given are mind-contents. This is a big mistake in my opinion, and Sellars et al. (1997) has forcefully argued against this idea. Epistemology is a social enterprise; it cannot be based on subjective experiences. The question is what we humans as a collective have good reasons to accept. It is when we agree on what we observe that we can begin to talk about evidence and reason for belief. What counts as evidence and justification is a collective decision.


This conclusion is immediately obvious when we discuss scientific knowledge. A single individual's observation report is hardly ever regarded as sufficient evidence for any theory. Scientific knowledge consists of those theories the great majority of researchers in the relevant discipline can agree upon based on repeated observations of phenomena. We require wide agreement about observation reports and repeated experiments in order to conclude that such reports make up good evidence for any theory. We apply this demand for intersubjective agreement about evidence when we dismiss reports from religious persons claiming that their spiritual experiences are evidence for this or that. Such experiences are strictly personal and are therefore not regarded as evidence for anything. The social character of knowledge entails that discussions about evidence should be done in terms of intersubjectively available things, i.e., sentences, not contents of beliefs. Many philosophers hold that the contents of our thoughts, propositions, are intersubjective and so we can share them. Maybe we can do that, perhaps we sometimes do know the contents of others' thoughts. But if that is the case, it is an unconscious inference from hearing what others have said. What we in fact share are sentences expressing those propositions, and a discussion about evidence for theories should not be based upon a semantic theory which entails that there are such things as propositions, i.e., intentional entities being the purported meanings of sentences. It should be based on agreements on truth value about token sentences expressed in concrete situations, what Quine calls 'occasion sentences'. 4. Truth is primary. Of the three semantic concepts truth, reference and meaning, truth is basic, as convincingly argued by e.g. Davidson (2001). This is a consequence of the context principle, first clearly expressed by Frege; it is only in the context of a complete sentence that a term has a definite meaning and reference. So the truth of a sentence is primary in relation to the reference of the singular term in that sentence. An opponent might claim that in some clear cases of observation we easily agree on the reference of the singular term; we hear the word and see the speaker pointing at something, thus establishing the reference of the term. Then we observe its traits and can determine whether it satisfies the criteria for the predicate in that sentence, hence we determine that the sentence is true. This suggests that reference is primary and truth is the derived notion. However, this association of an object with a word does not settle the extension of the predicate 'refer'. In most cases it is the other way around: first we determine the truth of a sentence based on inferences from observations and non-conflict with other sentences we hold true. And if the sentence is true, its variables must have non-empty domains and its individual constants must be non-empty, although we do not directly observe those items. In other words, we infer the existence of referents in such cases, assuming our sentences are true. The meanings of sentences and predicates are highly debated issues, mostly because people have different conceptions of meaning. My conclusion is that truth is the least theoretical concept. Many have even argued that it is not


definable, it is primitive. But no matter what theory of truth we accept, we all agree on the T-schema: p is true iff p. Its instances can be agreed upon regardless of other philosophical differences, and that is sufficient for useful linguistic communication. Many philosophers think that we need meanings in explanations of linguistic phenomena. Well, it all depends on what one requires of explanations. My stance is that, in so far as semantics is to be viewed as an empirical inquiry, what counts are testable predictions, and meanings are not needed for making predictions. 5. Inductive reasoning is a habit. It is a fact about us humans that to some extent we base our actions on past experiences; inductive reasoning belongs to our nature. That fact can be explained by reference to evolution. Moreover, everyone teaching small children reading, simple mathematics or a new game can observe this inductive habit in its primitive form; the child learns to follow a rule, a linguistic one or a rule in a game, by being rewarded for correct moves. Inductive moves in scientific thinking are no different; there the reward is success in predictions. We keep those predicates by which we have succeeded in our predictions and let others go. This is our way of interacting with other people and our environment and something similar is true of many animals. The difference between us and other animals is, as Popper put it, that we can let our hypotheses die instead of dying ourselves. So inductive reasoning is a natural habit, just like producing offspring and sleeping when tired. We don't ask for justification of natural habits, so why should we ask for ultimate justification of general induction? I will discuss this topic at length in Chap. 5. The demand for ultimate justification of induction is a core idea in rationalism, and accepting that demand is to give in before the battle with rationalists really has begun. We should not accept the need for a first philosophy. Philosophy, in particular epistemology, is an integrated part of our scientific endeavour, not any a priori foundation for scientific knowledge. This last remark points to the holism of Duhem, albeit not to Quine's wider form of holism. (But, as we saw in Sect. 3.6, late in his life Quine retracted from the most extreme form of holism.) We make inductive inferences every day in our ordinary life as well as when doing empirical science. Sometimes we get things right, sometimes not. The question is not whether induction can be justified as a general principle, but if it is possible to decrease the risk of mistakes in individual cases by improving our inductive habits. One such improvement is the use of the 'golden rule' in statistics, viz., double-blind testing. The reasons for using this rule as much as possible are partly mathematical (random sampling usually generates a normal distribution of the target parameter, and knowing that, we can use the normal distribution for calculating confidence intervals), partly an empirical result from psychology (the influence of expectations on the observation process). There is an inductive element in both arguments, combined with mathematics and evaluation of past efforts, but it is not a vicious circle. The most reasonable view of the matter is, I think, that the method has been developed as a result of a kind of feed-back process internal to the scientific enterprise. We have in the course


of time learnt to apply inductive reasoning more carefully as a result of reflection on the mistakes made in some earlier inductions. 6. Nominalism. We empiricists are usually skeptical about universals, the reason being that accepting the truth of a sentence, and the provisional acceptance of an entire theory as (approximately) true, does not entail that predicates refer to anything. Universals are not needed as objects in any first order theory. Accepting a theory as (provisionally) true only forces us to accept that its singular terms, i.e., its names, singular definite descriptions, pronouns and first order variables, refer to things that exist. A second order theory requires universals. But physics, and natural science in general, has no need for second order theories, i.e., quantification over properties and relations not reducible to sets of individuals. Physics, and natural science in general, can be expressed purely extensionally in first order theories, referring to individual objects, sets of individual objects, sets of sets, etc. As already remarked, scientific laws are held by philosophers to be necessarily true, and it is almost universally agreed that any account of necessity requires quantified modal logic. I disagree about that, and my argument is given in Sect. 10.10. I have been recommended to give a name to this version of empiricism, the joint acceptance of the six points above, and I certainly see the need for a name for further reference. The best I presently can come up with is Nominalistic empiricism.
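To illustrate the contrast drawn in point 6, here is a minimal example in standard logical notation (my illustration, not a formalization taken from the book). A first order sentence quantifies only over individuals, while a second order sentence quantifies over properties, and it is only the latter that would commit us to universals:

$$\exists x\,(\mathrm{Electron}(x) \wedge \mathrm{Charged}(x)) \qquad \text{(first order: some individual is an electron and is charged)}$$

$$\exists X\,(X(a) \wedge X(b)) \qquad \text{(second order: there is a property that } a \text{ and } b \text{ share)}$$

Holding the first sentence true commits us only to the existence of an individual in the domain; holding the second true commits us to a value of the property variable X, i.e., to a universal. The nominalist claim is that physics never needs sentences of the second form.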

Chapter 4

Mathematical Knowledge and Mathematical Objects

Abstract This chapter is about the ontology and epistemology of mathematical objects. The core problem for an empiricist is that conceiving mathematical objects as existing independently of human thinking makes it impossible to understand how we can have mathematical knowledge, while the alternative, a constructivist conception, resolves the epistemological problem, but entails the identification of truth with provability. That entails that the law of excluded middle must be dismissed as a generally valid logical principle. The identification of truth with provability is furthermore problematic when taking into account Gödel's first incompleteness theorem. The chapter ends by suggesting a modified constructivism, which keeps the distinction between truth and provability, thus avoiding counterarguments based on Gödel's theorem.

4.1 Introduction Although the focus of this book is philosophy of physics, it is hard to avoid some fundamental issues in the philosophy of mathematics. Physics and mathematics are very closely related; it is inconceivable to do physics without using a lot of mathematics. The relation between mathematics and natural science, in particular physics, was the topic of Wigner's much discussed paper 'The Unreasonable Effectiveness of Mathematics in the Natural Sciences' (1960). That there are deep connections between mathematics and physics is indisputable; some, like Tegmark (2008), have even taken the radical step of identifying these two subject matters, thereby becoming full-blown Pythagoreans. Others have more critically discussed how mathematical arguments are used in physics, see e.g. Schwarz (2006(1966)), pointing out how many adjustments and approximations are needed in the practical use of mathematical models in physics. The radical idea of identifying mathematics and physics is not my cup of tea; it is a basic stance for all empiricists that physical objects in space and time radically differ from abstract objects; some even dismiss abstract objects altogether. I follow


Aristotle in distinguishing changeable from unchangeable things, the latter being the abstract ones. The relation between mathematics and physics will be discussed in Chaps. 7 and 17. In this chapter I will instead focus on two fundamental philosophical questions about mathematics: how do we obtain mathematical knowledge and do mathematical objects exist? It is obvious that these two questions are closely linked. Some empiricists are skeptical about abstract objects in general and numbers are certainly abstract things. The problem is how we can have any knowledge about abstract things. Certainly not by some kind of a priori intuition into Plato’s heaven, nor from sensory experience. But it is hardly possible to deny that we have mathematical knowledge, hence there must exist mathematical objects and we know many things about them. How is that possible? Hellman (1989), Field (1980) and Azzouni (2004) among others reject the conclusion that mathematical objects exist, they hold that these things are mere fictions. Their problem is then to explain how one may legitimately hold mathematical sentences true while denying that singular terms and variables in those sentences refer to existing objects. One way out is to reject the common view that the existential quantifier carries any ontological commitment. Instead, ontological commitments are represented by an existence predicate. A discussion of these views can be found in Bueno (2014). I find this road highly unattractive. Not only do I think that Kant’s argument against viewing EXISTENCE as a predicate convincing, it is also a highly unnatural rendering of ordinary discourse. When I say things like ‘there is a robin in my garden’ I use ‘there is’ precisely in order to make an existential claim. Who could deny that? And ‘there is’ is the existential quantifier in ordinary language. Why say that it has another meaning in mathematics? The alternative approach is to hold that mathematical statements are not truthapt. But that undermines the idea that use of mathematical identities in logical inferences is justified by being truth-preserving. For if we conceive an equation as a mere string of linguistic signs lacking truth-value, it cannot do its job of preserving truth in a deductive argument, hence this approach undermines the entire idea of valid deduction. This is too high a cost for me. (The logical positivists held that all theoretical sentences, mathematical or not, lack truth-value, and, as I argued in Chap. 2, this undermines the entire logical positivist programme.) Thus, I do not find fictionalism a viable stance and one had better accept that mathematical objects exist. But how should we understand that? There are two competing views on mathematical existence, platonism and constructivism. Platonism or mathematical realism is the doctrine that a mathematical sentence is true if and only if it corresponds to a mathematical fact in an independently existing realm of abstract objects, Plato’s heaven, independently of our knowledge of which is the case. This is the common view among philosophers of mathematics. But it invites skepticism: how could we obtain knowledge about such facts? The common answer is that we have a kind of intuition into some elementary facts in the mathematical realm, for example that we immediately ‘see’ that every natural number has an immediate successor. From an empiricist point of view this answer


is deeply unsatisfying. Postulating a mental faculty for knowledge about abstract objects, a faculty which has nothing to do with sensory experience, nor with logical thinking, is on a par with postulating extra-sensory perception. The other view, mathematical constructivism, is a species of anti-realism. The basic idea is that the truth of a sentence is identified with its having been constructively proved and its falsity with its negation having been thus proved. A constructive proof does not utilise the law of excluded middle, LEM, nor double negation elimination;1 it is a direct construction of the sentence proved. It is then possible that neither a sentence p, nor its negation ¬p, is directly provable. The reason given by constructivists for not accepting LEM is that the meaning of each logical constant is jointly given by its introduction and elimination rule, and the law of excluded middle is not a consequence of the rules for the logical constants ∨ and ¬ occurring in it. The law of excluded middle must be justified in some other way. But since constructivists hold that justification of logical rules comes from the meanings of the logical constants, they hold that excluded middle cannot be justified and should not be accepted as a logical rule. It follows that the principle of bivalence is not accepted as generally valid,2 thus truth-value gaps are accepted by constructivists. For an empiricist the constructivist point of view is natural, not to say mandatory; the constructive proof of a sentence is the most obvious and direct demonstration of how we in fact know it is true. But as we have just seen, the constructivist position comes with the price of not accepting LEM and the principle of bivalence. This is a huge cost; giving up the principle of bivalence and the law of excluded middle is, particularly when it concerns mathematics, really hard to digest for me. Consider for example Goldbach's conjecture that every even natural number greater than two is the sum of two primes. There is no known proof, nor any disproof, of Goldbach's conjecture, hence according to constructivists it is neither true nor false. I cannot believe that; it must be either true or false, even though I don't believe that natural numbers have an independent existence in Plato's heaven. So I have found myself in the same camp as Feferman: . . . my own point of view philosophically . . . has both negative and positive aspects. On the negative side, I am a confirmed anti-platonist. On the positive side, I am a realist insofar as the natural numbers are concerned, i.e., I believe that statements about the natural number structure have a determinate truth value independent of human proofs and constructions. (Feferman 2005, 619)

1 Double negation elimination and excluded middle are equivalent, given the introduction and elimination rules for ¬ and ∨.

2 LEM and the principle of bivalence are under reasonable assumptions extensionally equivalent. But one may hold that they are different things: LEM is a logical rule, whereas bivalence is a semantic principle.

Some philosophers of mathematics think this combination of views is impossible. I certainly see the difficulties. Nevertheless I will try an account of mathematical knowledge that entails this combination of anti-platonism and acceptance of
bivalence. I will propose a kind of holistic constructivism regarding mathematical objects, based on certain semantic and ontological considerations. The basic idea is that when you have introduced into discourse a predicate with its associated principles of application, you have implicitly said that all objects which satisfy the predicate, and for which identity criteria can be given, exist independently of having proved sentences where those objects are values of the variables. So my position is not constructivism in the sense of only accepting constructive proofs, which is the core idea of constructivist mathematics. In this way we may say that we construct en masse, so to speak, all objects satisfying a certain predicate. In so far as the satisfaction criterion for a certain predicate is clear, we may say, for example, that all sentences of the form '. . . . is a natural number' are either true or false when completed with a singular term. Either that term refers to an object satisfying the criterion for being a natural number or not. Bivalence is upheld. But the reader immediately asks: what about mathematical paradoxes? Adopting this constructivist perspective on objects doesn't say anything about which predicates are acceptable. As the history of twentieth century mathematics shows, unrestricted concept formation leads to paradoxes. Many, perhaps all, paradoxes arise when using impredicative constructions, so these must be dismissed. This topic will be discussed in Sect. 4.3.
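As an illustration of this bivalence claim, Goldbach's conjecture, mentioned above, can be written as a single first order sentence of arithmetic; the formalization below is mine, in standard notation, not the author's:

$$\forall n\, \bigl( (\mathrm{Even}(n) \wedge n > 2) \rightarrow \exists p\, \exists q\, (\mathrm{Prime}(p) \wedge \mathrm{Prime}(q) \wedge n = p + q) \bigr)$$

Once the predicates Even and Prime have been given their satisfaction criteria over the natural numbers, the position sketched here says that this sentence is either true or false, whether or not anyone ever produces a proof or a counterexample.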

4.2 Kant and Quine on Objects My conception of constructivism is inspired by Kant's view on objects. Kant's argument against metaphysical realism concerns primarily physical objects, bodies, but the argument is completely general and applies to all kinds of objects of discourse, including abstract things. According to Kant all objects we think and talk about are objects of cognitive acts, phenomena. By this he means that objects, by being the results of cognitive acts, necessarily have some general features. This view enables Kant to answer the skeptical question of how knowledge about the external world is possible. Both Berkeley and Hume had observed that the classical empiricist view that objects cause our impressions invites skepticism; how do we know that there really are any external objects and how do we know that our sense impressions are veridical representations of them? We have direct access only to our own impressions and ideas, according to classical empiricism. Reading Hume's critique of classical empiricism, Kant commented that Hume 'woke him up from his dogmatic slumber'. But Kant did not endorse Hume's way out, viz., simply to reject the demand for ultimate justification of knowledge, which is the core idea in Hume's naturalism. Instead Kant developed transcendentalism, i.e. an inquiry into the very conditions for knowledge and cognition. Kant's starting point in the first Critique is to point out that both empiricists such as Locke, and metaphysical realists such as Leibniz, had made a methodological


mistake in trying to say things about the relations between objects in the world and our mind. For when talking about such relations we treat our own mind as an object, a relatum for a relation, while at the same time we are using our mind for making judgements about that relation; it is this very mind that is doing the thinking. Our own mind is at the same time an object for a judgement about a relation and the subject doing the judging. Kant spent many pages in the first Critique to show that this stance leads to antinomies and one may join Kant by concluding that it is an incoherent stance. The conclusion is that we can only coherently think of objects from our human perspective, as how they appear to us humans; objects are objects for us, phenomena. We cannot step outside ourselves. Kant introduced the notion of things as they are in themselves merely in order to make a clear contrast in perspective to things as phenomena. In my interpretation, the distinction between things as they are in themselves and things as they appear to us is not a distinction between two kinds of things, it is a distinction between two perspectives, of which the first is not available to us humans.3 If we interpret Kant as talking about two different kinds of things instead of two perspectives on the same kind of things, he would be back in the same problem he tried to solve with his transcendentalism, for we could immediately ask him: how are these two kinds of objects related? I can't imagine that Kant should have been unaware of this, hence the reading of him as postulating a realm of 'real' objects distinct from those we cognize should be dismissed.

3 Kant's phrase was 'Das Ding an sich selbst betrachtet', i.e., the thing as it is in itself. But this phrase is often abbreviated as 'Das Ding an sich', which is easily misunderstood as being about another thing than the thing perceived. The translation of Kant's 'Das Ding an sich selbst betrachtet' as 'The thing as it is in itself' is a bit misleading, I think. The German phrase 'an sich selbst betrachtet' is more correctly translated as 'as seen in itself'. Translating it as 'as it is in itself' seems to me misleading because to talk about an object as it is in itself indicates a realistic view according to which these objects, noumena, are distinct and exist independently of our cognitive acts. Kant's view is precisely the opposite: distinct objects with properties are the result of the act of cognition. To conceive of a distinct object as independent of the cognitive act is incoherent, according to Kant. If we in our imagination tried to reduce away the effects of our cognitive actions we would have nothing left, not even an individual object, a 'bare particular'.

It is a short step to reinterpret objectivity as intersubjective agreement. The notion that objectivity means correspondence with facts is to presuppose that it is meaningful to compare an object as it appears to us with how it actually is; and that is, again, to assume that we coherently can think and talk about an object as it is in itself, which is impossible. Thus, mathematical platonism is inconsistent with Kant's philosophy. Kant held that the constitution of our mind determines certain features of objects; this is what it means to say that objects are phenomena. When we use our senses for observing things, we necessarily observe them as situated in space and time; these are our forms of intuition (German 'Anschauungsformen'). Then, when making judgements about an object we use general concepts and our mind is, according to Kant, so structured that all general concepts are based on 12 fundamental ones, the
categories, which in a profound sense are in us; that is how our mind is construed. These categories are the necessary conditions for all judgements. The forms of intuition and the categories are structures of our mind, and these structural features determine the general form of our knowledge of things. It is clear that Kant primarily thought of physical objects in space and time, but his conclusion that it is incoherent to simultaneously conceive our mind as related to an external mind-independent object and use this very mind in doing this conceptualisation applies also to thinking about abstract objects. They are likewise the results of the operations of the human mind, although not existing in space and time, hence not objects of empirical intuition. One may doubt some aspects of Kant's philosophy; for example, his 'deduction' of the 12 categories is not convincing. But his basic point that it is impossible to say anything about the relation between independently existing external things, be they physical or abstract ones, and our mind, is hard to resist. Objects are objects for us and metaphysical realism is an incoherent stance. And since mathematical platonism is metaphysical realism about mathematical objects, this position is refuted by the same argument. Quine's views about objects are close to Kant's, although Quine never talks about judgements, concepts, categories or objects as phenomena. Quine is hostile to the use of intentional notions. Instead he formulates doctrines about our use of language. Instead of talking about judgements he talks about sentences held true and instead of concepts he talks about general terms and their rules of application. And he repeatedly pointed out that these are our only tools. We have to stay with our means and stay afloat on Neurath's boat. This is Kant's view stated in terms of language use. Since a phenomenon, an object as it appears to us, is in part shaped by the operations of our mind, Kant's position is that the act of judgement is primary in relation to the object being judged. Quine arrives at a similar view, using arguments from our use of language. The basis for ontology and semantics consists of occasion sentences taken holophrastically, according to Quine. The starting point is assent or dissent to an uttered sentence. If we assent to a sentence we hold it to be true. In the simplest case we analyse the sentence as consisting of a singular term a and a general term F. If the sentence Fa is held true we implicitly accept that there exists an object a satisfying the predicate F. Furthermore, if F is a general term being true of several things there must be criteria for identity and a principle of individuation among the things this predicate is true of. (Quine: 'No entity without identity' and 'To be is to be the value of a variable.') These criteria are part of the rules of application of the predicate F. One may say that by using sortal predicates we divide up reality into distinct objects. Quine formulates the point thus: 'The very notion of an object at all, concrete or abstract, is a human contribution, a feature of our inherited apparatus for organizing the amorphous welter of neural input' (Quine 1992, 7). This view seems to me to be essentially the same as Kant's and I join the party. Predicates are not given to us from the gods; they are human inventions. It is our use of sortal predicates, such as NATURAL NUMBER, in making assertions, that splits


up the universe of discourse into distinct objects. The point is forcefully argued in Quine (1960, ch. 3). Metaphysical realists reject this view, holding that the world consists of objects with properties and relations quite independently of our cognitive and linguistic actions. In a successful theory we have been lucky to mirror the real structure of the world. But this view is, again, to conceive a relation between the real world and our representations of it from a vantage point of view outside our own thinking. This is, Kant argued, incoherent. I agree.

4.3 Truth Value Gaps in Mathematics Since number theoretic predicates can be constructed using Peano’s axioms, one might think that the extensions of all predicates taking natural numbers, pairs of numbers, etc., as objects are thereby determined. One might think that by accepting Peano’s axioms we implicitly accept the existence of all objects satisfying these axioms in one fell swoop, that all natural numbers are in a sense constructed at once. Hence the extensions of predicates such as ‘sum’, ‘prime number’ and ‘even number’ are fully determined when the rules for these predicates are stated. It follows that Goldbach’s conjecture is either true or false, no matter whether we have been able to find a proof or disproof of it. However, without restrictions of concept formation we end up with truth value gaps filled with antinomies and paradoxes. Some well known examples in mathematics are Burali-Forti paradox of the largest ordinal number, König’s paradox of the least non-definable ordinal number and Richard’s paradox, the definition, by diagonalisation, of a real number different from all definable real numbers. Such paradoxes provide reasons to reject the general validity of bivalence and that is the position taken by constructivists in philosophy of mathematics. Dummett, for example, elaborates on this reason for rejecting bivalence: A sufficiently definite grasp of a language, for example, is for this purpose one yielding an intuitive conception of the notion of truth as applying to the assertoric statements of that language. Given this, we may always frame a richer language in which we can talk about the first language, in the sense of formulating semantic properties of it, and also say anything that we could say in that language. Hence there can be for us no all-inclusive language, any more than we can talk simultaneously about all ordinal numbers in the sense of all objects that we could ever recognise as falling under the intuitive concept ordinal number. Dummett (1993, 454)

Dummett concludes: By the nature of the case, we can form no clear conception of the extension of an indefinitely extensible concept; any attempt to do so is liable to lead us into contradiction. Is it intelligible to suppose that a superhuman intelligence could form such a conception? The concept could not be given to that intelligence as indefinitely extensible; but might it not have a concept whose extension covered all and only those objects we are capable of coming to recognise as ordinal numbers? The question seems unanswerable; but we should be cautious in formulating the proposition that we cannot talk simultaneously about all objects

falling under an intuitive concept given to us as indefinitely extensible. We can obviously frame some incontestably true statements about all such objects, for example, "Every ordinal number has a successor". What we cannot do is to suppose that a language admitting such statements obeys a two-valued semantics; but there is no difficulty in envisaging such a language as obeying intuitionistic logic. That will not, of course, satisfy the externalist, because the statements of such a language will not all be determinately true or false, and for the most elementary reason, namely that the quantification they involve is not over a determinate domain (or at least over one of which we can attain a definite conception). We here come upon a link between externalism, as I have been discussing it, and realism, in the sense in which I have frequently discussed it and in which it crucially involves the principle of bivalence, a link that justifies Hilary Putnam's use of the phrase 'external realism' (ibid. 454–455)

Dummett is clearly right that extending a concept indefinitely may lead to contradiction; Burali-Forti's and König's paradoxes are clear examples, and this he takes as an argument for giving up bivalence. But there is another way out: instead of giving up the principle of bivalence one might impose restrictions on acceptable predicate construction, at least in domains of inquiry such as mathematics, where we, explicitly or implicitly, introduce predicates not belonging to ordinary language, or restrict vernacular expressions by stating explicit conditions for their use in scientific contexts.4

4 Dummett has not, to my knowledge, considered this option, which for a constructivist is rather natural. Perhaps he takes for granted that objects do not come into existence, or are shaped, in cognitive and linguistic acts, but exist with their properties independently of our use of concepts.

Poincaré (1906) identified two sources of mathematical paradoxes: (i) there is a vicious circle in the definition of a certain mathematical object and (ii) the use of actual completed infinite sets of objects. One may think that these are not really two different sources but one and the same, viewed from different angles. In any case it demonstrates that some restrictions need to be imposed on our use of mathematical predicates. For example, one might think that one has a firm grasp of the concept of the set of all ordinal numbers before one is confronted with Burali-Forti's paradox, thus being forced to reconsider the concept and impose some restrictions on being an ordinal number. There is no obvious recipe for doing so. Feferman (2005) gives a detailed historical overview of the discussion of different reactions to the paradoxes. The paper's title, 'Predicativity', indicates that for Feferman violations of predicativity are the common source of the paradoxes. Predicativity is the contrast to impredicativity; a definition is impredicative if it quantifies over a totality that includes the object to be defined. The paradoxes mentioned above are all cases of impredicative definitions and one may easily agree with Poincaré and Russell (1906) that impredicative definitions are instances of vicious circle reasoning. Feferman rehearses a number of different versions of predicativity of different strength. There is no need to discuss these attempts in detail. For my purpose it suffices to observe that several ways of closing truth value gaps in mathematics
by dismissing impredicative constructions are possible; whether there is a best one doesn’t matter in this context. Thus, one can impose restrictions on acceptable concept formation in mathematics, thereby avoiding paradoxes and saving the principle of bivalence in this domain. Sticking to bivalence is the pragmatic decision that it is preferable to keep bivalence and restrict concept formation rather than the other way round. This is Feferman’s view: [Predicativity given the natural numbers] should be looked upon as the philosophy of how we get off the ground and sustain flight mathematically without assuming more than the basic structure of natural numbers to begin with. There are less clear-cut conceptions which can lead us higher into the mathematical stratosphere, for example that of various kinds of sets generated by infinitary closure conditions. That such conceptions are less clear-cut than the natural number system is no reason not to use them, but one should look to see where it is necessary to use them and what we can say about what it is we know when we do use them. (Feferman 1998, Preface, ix)

This is perfectly in accord with the stance that numbers and other mathematical objects are constructions. In mathematical constructions we formulate predicates and in order to avoid antinomies we impose necessary restrictions on these formulations. Then we may decide that there are abstract objects being the references of such descriptions. To some, like Russell, this may seem too frivolous. Russell is famous for his criticism of postulation: The method of ‘postulating’ what we want has many advantages; they are the same as the advantages of theft over honest toil. Let us leave them to others and proceed with our honest toil. (Russell 1919, 71)

Russell here objects to the use of implicit definitions of objects. On this point I disagree with Russell; Quine has shown that implicit definitions can be converted to axioms, see Quine (1976b). Now, consider Goldbach's conjecture as an example of an arithmetical sentence that is neither proved nor disproved. By accepting the axiom that each natural number has a successor, we have, so to say, constructed all natural numbers in one fell swoop. Then we impose the restriction that impredicative formulations, such as 'the largest ordinal number' and 'the least non-definable ordinal number', are not allowed. Burali-Forti's and König's paradoxes are thus dismissed as illicit use of predicates and bivalence is saved within arithmetic. I have here assumed that all paradoxes in arithmetic are due to the use of impredicative constructions. There might be other paradoxes, already known or discovered in the future, which are not excluded by forbidding impredicative constructions. So more restrictions might be necessary, and I see no fundamental problem with adding further conditions on acceptable predicate constructions. This is in harmony with Gödel's first incompleteness theorem, whose content could briefly be described thus: a formal language in which we can express elementary arithmetic has greater expressive power than proof power. Kreisel is of a similar opinion:

I do not make the assumption that, if mathematical objects are our own constructions, we must be expected to be able to decide all their properties; for, except under some extravagant restrictions on what one admits as the self, I do not see why one should expect so much more control over one's mental products than over one's bodily products — which are sometimes quite surprising. (Kreisel 1967).

By contrast, the alternative route taken by intuitionists, who identify truth with provability, faces severe problems in the light of Gödel’s theorem. This will be discussed in Sect. 4.7.
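To make 'impredicative' concrete before leaving this section, here is the standard textbook example (my illustration, not one given in the book). The least upper bound of a bounded, non-empty set $S \subseteq \mathbb{R}$ is usually defined as

$$\sup S = \text{the least element of } \{\,u \in \mathbb{R} : \forall s \in S\ (s \leq u)\,\},$$

i.e., by quantifying over the totality of all real upper bounds of S, a totality to which $\sup S$ itself belongs. The definition thus singles out an object by reference to a collection that already contains it, which is exactly the circularity Poincaré and Russell objected to; predicative reconstructions of analysis, of the kind Feferman studies, have to recover the least upper bound principle in some restricted form.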

4.4 Are Numbers Universals? As discussed in Chap. 3, empiricism and nominalism are close allies in philosophy. Both nominalists and empiricists reject universals; nominalists think universals are superfluous, or incomprehensible, while empiricists reject them because there is no evidence for the existence of such things. Then if we accept numbers, and think of them and of other mathematical objects constructed out of numbers as universals, as several philosophers of mathematics do, we end up with an exception to nominalism. But if we admit exceptions to nominalism in one area of discourse, why not accept universals tout court? The alternative, to keep nominalism intact and conceive of numbers as individuals, is in my view the preferable one. This is, however, not the received view. Many in the debate hold that numbers are universals because they hold that all abstract objects are universals. This seems to be, e.g., Quine's stance. But I see no good reason to assume that abstract things, if they exist, must be universals. Why couldn't we accept abstract individuals? In particular, we may classify numbers as individuals. Hence even a staunch nominalist who rejects all universals may accept mathematical objects in his universe of discourse, provided they are individuals. The crucial question is how one draws the distinction between universals and individuals and here I take inspiration from Aristotle. In Categories he distinguishes between primary substances (ουσία), i.e., individual objects, and universals; the former cannot, but the latter can, be predicated of things. He writes: 'that which is called a substance most strictly, primarily, and most of all—is that which is neither said of a subject nor in a subject' (Cat.2a11). Aristotle here uses the word 'subject' for the referent of a singular term, not as we do as a grammatical term, so let us rephrase: 'A term for an individual thing cannot occur as a general term, only as a singular term'. Terms for universals, by contrast, can occur both as singular and general terms; in Aristotle's words, universals can be said of subjects and be in subjects. The question of the existence of universals is the question of whether we should assume universals as referents of general terms or not. Using modern logic we may formulate this question as: should we quantify over universals, i.e., should we delve into second order quantification or not?


Goodman, Quine and van Fraassen, (of which Goodman is a self-proclaimed nominalist, the two others by implication)5 reject universals and hence second order quantification. Modern empiricists and medieval nominalists alike expel universals from the universe of discourse because such things don’t contribute to explanation. Goodman writes: The nominalism I have described only demands that all entities admitted, no matter what they are, be treated as individuals. . . . .to treat entities as individuals for a system is to treat them as values of the variables of the lowest order in the system. (Goodman 1972, 157)

In predicate logic this means that only first order quantification is admitted, i.e., that universals, conceived as referents of predicates, are dismissed from the universe of discourse. Goodman’s clause, ‘no matter what they are,’ is, I think, meant to point out that also collections of things not naturally thought of as individuals nevertheless are individuals. Social units, such as The European Commission, (which consists of 27 persons), are logically speaking individuals, since they are referents of singular terms. ‘The European Commission’ cannot be used as a general term. Numerals are singular terms referring to numbers, hence numbers are individual things. Thus one may stick to nominalism in the usual sense of rejecting universals, while at the same time accept the existence of numbers. Dismissing universals from our universe of discourse is thus not the same as dismissing abstract objects. The distinction abstract/concrete and the distinction individual/universal are orthogonal. So for example, Quine rejects universals but accept abstract objects such as sets and numbers. Others, such as Armstrong, accept universals, but claim that they are concrete things, i.e., existing in space and time.

4.5 From Natural Numbers to Reals I accept that natural numbers exist, they are individual objects. We quantify over them and identity criteria are available. We implicitly postulate numbers as referents to numerical expressions when we accept arithmetical identities, such as 5+3=8, as true. When considering abstract objects one might ask how we distinguish between the object and its linguistic representation, in the sense of the type, for example a definite description; why say that there exist, in addition to the linguistic type something else, an abstract thing referred to by that very description? The need for something more than mere linguistic items enters when we introduce identities. For an identity statement of the form a = b may be described as

5 Quine claims he is not a nominalist, his argument being that he accepts abstract objects in his ontology, thus seemingly assuming that nominalism means rejecting all abstract objects. But I follow Goodman in holding that nominalism is to be identified with the rejection of second order quantification, and with that Quine agrees.

expressing that there is a thing being the referent of, at least, the two singular terms a and b. We say, for example, ‘Seven is the fourth prime number.’ The singular term ‘seven’ is a name for the number seven and this entity is also the referent of ‘the fourth prime number’. Suppose there were an abstract object a, an individual which only could be referred to by this singular term. In other words, no identity statement other than a = a about this object is possible. If so, one could reasonably ask, ‘what purpose is fulfilled by postulating an object as the referent for the term ‘a’? What is lost if we dismiss it from our universe of discourse? I submit nothing. In other words, there is reason to require of any existing object we talk about that it must satisfy a non-trivial identity criterion. Thus Quine’s ‘No entity without identity.’, albeit he gave other arguments for requiring identity. And by explicitly stating an identity we may say that we have postulated an object. Frege was the first, as far as I know, to require identity as a necessary condition on existence in his debate with Hilbert on mathematical existence. Hilbert claimed that consistency is sufficient for mathematical existence, whereby Frege protested, claiming that more is needed, viz., satisfaction of an identity criterion.
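A standard example of such an identity criterion, and only my illustration here, is the criterion for cardinal numbers that Frege discussed in the Grundlagen (often called Hume's Principle): the number of Fs is identical with the number of Gs if and only if the Fs and the Gs can be mapped one-to-one onto each other,

$$\#F = \#G \iff \text{there is a bijection between the } F\text{s and the } G\text{s}.$$

An identity statement such as '5 + 3 = 8' is then non-trivial in the required sense: two different singular terms are claimed to pick out one and the same object, and the criterion tells us what settles such claims.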

4.5.1 Against Reduction of Mathematics to Set Theory Russell & Whitehead had the view that mathematics could be reduced to pure logic, thus writing Principia Mathematica. This project was, one may think, motivated by a desire to show that mathematics was absolutely certain without relying on synthetic a priori principles. But the result was not as clear-cut as expected. In order to achieve their goal Russell & Whitehead were forced to introduce axioms that were not obviously logical, such as the axiom of choice, and some set theoretical principles. It was gradually recognized that mathematics had not been reduced to pure logic, but to logic + set theory. Hermann Weyl and Henri Poincaré were critical, thinking that the natural number system is in no need of any foundation. Feferman comments on Weyl's position: In the introduction to [Das Kontinuum] Weyl criticized axiomatic set theory as a "house built on sand" (though the objects of, and reasons for, his criticism are not made explicit.) He proposed to replace this with a solid foundation, but not for all that had come to be accepted from set theory; the rest he gave up willingly, not seeing any other alternative. Weyl's main aim in this work was to secure mathematical analysis through a theory of the real number system (the continuum) that would make no basic assumptions beyond that of the structure of natural numbers N: . . . Weyl did not attempt to reduce . . . reasoning about N to something supposedly more basic. In this respect Weyl agreed with Henri Poincaré that the natural number system and the associated principle of induction constitute an irreducible minimum of theoretical mathematics, and any effort to "justify" that would implicitly involve its assumption elsewhere. . . . unlike Brouwer, Weyl accepted uncritically the use of classical logic at this stage (though at a later date he was to champion Brouwer's views). (Feferman 1998, 51ff)


I completely agree with Weyl and Poincaré that the natural number system, including the principle of induction, is not in need of any deeper foundation. Broadly speaking, Weyl’s position is a kind of epistemological naturalism applied to mathematics. From early on each of us learn elementary arithmetic, and there is no point in trying to justify it. From which point of view could one ask for justification of natural linguistic practice? We use numerals and other expressions for natural numbers in ordinary language as a matter of course. The idea that the use of the natural number system in ordinary language needs a justification is an illicit demand, an instance of the belief that such practical knowledge needs an a priori justification. It does not. This is fine so long as we confine the discussion to mathematical objects for which there are names or definite descriptions, viz., natural numbers and finite constructions built upon them. But what to do with reals? It is well known that most reals have no names or descriptions. The set of reals is uncountably infinite, whereas names of things can be counted. Real numbers raise a crucial ontological and epistemological problem.
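The cardinality argument behind the last point can be spelled out briefly (my sketch, standard material). Any name or definite description is a finite string over a finite alphabet, so there are at most countably many of them, whereas Cantor's diagonal argument shows that the reals are uncountable:

$$|\{\text{finite strings over a finite alphabet}\}| = \aleph_0, \qquad |\mathbb{R}| = 2^{\aleph_0} > \aleph_0.$$

Hence all but countably many real numbers can never be named, described or individually referred to, which is why the reals, unlike the natural numbers, cannot be handled one by one as referents of singular terms.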

4.5.2 Platonism Versus Constructivism and Reals One may accept that there are reals lacking names or definite descriptions without conflicting with the postulate that any object we talk about must satisfy an identity criterion; for, obviously, an object that lacks any linguistic representation is not talked about in the sense of being a referent for the singular term in a sentence. But should we accept reals? Isn't it incoherent to say that there are objects which in some sense are our constructions but which we never have written or talked about, nor ever will? Do we really need such things? For what purpose? And in what sense have such things been constructed? This brings us, again, to the dispute between classical and constructive mathematics. The problem is clearly visible when we quantify over the domain of reals. Consider for example the intermediate value theorem, which says that for any continuous function f on an interval [a, b] ⊂ R and for any number u such that f(a) < u < f(b) or f(a) > u > f(b) there is an x ∈ [a, b] such that u = f(x). It is provable in classical mathematics, i.e., using double negation elimination; but it is not constructively provable. Not accepting theorems for which we lack constructive proofs is certainly a drawback in the concrete use of mathematics and one may argue that the raison d'être of mathematics is its usefulness. Philosophers with a pragmatist bent are particularly prone to argue along these lines. So for example Quine, who had strong inclinations towards keeping ontology minimal, nevertheless accepted reals. He argued that we need mathematics for doing modern science; mathematical objects are in practice indispensable. I take him primarily to have thought about the use of reals in doing calculus.


Not accepting theorems for which we lack constructive proofs is certainly a drawback in the concrete use of mathematics, and one may argue that the raison d'être of mathematics is its usefulness. Philosophers with a pragmatist bent are particularly prone to argue along these lines. So, for example, Quine, who had a strong inclination to keep ontology minimal, nevertheless accepted reals. He argued that we need mathematics for doing modern science; mathematical objects are in practice indispensable. I take him to have been thinking primarily of the use of reals in doing calculus.

Most philosophers agree on the indispensability of mathematics for science, but what conclusions can be drawn concerning mathematical objects? Feferman (1998, ch. 14) discusses the indispensability argument and concludes:

My conclusion from all this is that even if one accepts the indispensability arguments, practically nothing philosophically definitive can be said of the entities which are then supposed to have the same status – ontologically and epistemologically – as the entities of natural science. That being the case, what do the indispensability arguments amount to? As far as I'm concerned, they are completely vitiated. (op. cit., p. 297)

Feferman agrees with most mathematicians in accepting reals, but he sees the problem: ‘But as long as science takes the real number system for granted, its philosophers must eventually engage the basic foundational question of modern mathematics: “What are the real numbers, really?”’ (Feferman 1998, 298). My answer is: they are human constructions.

4.6 Constructions of Numbers

4.6.1 Constructions of Integers and Rationals

Mathematical thinking begins with the construction of positive integers as abstractions from visible collections of things. Almost all small children, researchers tell us, can early on compare small collections consisting of up to three or four items and see whether they contain an equal number of things. For higher natural numbers, where vision is not sufficient to determine whether two collections consist of equal numbers of objects, they can soon learn to count (which basically is to learn to attach different words to each step in a succession of repeated actions) and decide whether two collections are equal in number. The cardinal principle, i.e., that the last word in counting the items in a collection is a term for the number of items counted, is mastered at around 3½ years, according to Wynn (1990). These habits are reflected in one of the definitions of the natural numbers. The step to talking about numbers is an abstraction from talking about collections of objects; sometimes the items in two different collections can be mapped one-to-one onto each other, thus giving us a way of attributing a common feature to these collections: they consist of an equal number of things. It seems indeed justified to call this process a construction of numbers; it is an acquired cognitive and linguistic habit. The next step is to learn elementary arithmetic, beginning with addition. My grandchildren, like most small children, count fingers when adding positive integers. Klara, 6 years old (when I'm writing this) and one of my grandchildren, knows that 5+3=8 and she knows how to check; she first counts five fingers, and then continues with three more, counting aloud 'sex', 'sju', 'åtta' (Swedish for 'six', 'seven' and 'eight'), keeping track of the number of words said. Then she knows that five plus three is eight.
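As an aside, the counting-on procedure can be written down as a toy algorithm (my illustration, not a claim about the underlying cognitive mechanism): addition is treated as repeated application of the successor step, one step per finger or counting word.

def count_on(a, b):
    """Add b to a by taking b successor steps, as in finger counting."""
    total = a
    for _ in range(b):      # one counting word ('sex', 'sju', 'åtta', ...) per step
        total = total + 1   # a single successor step
    return total

print(count_on(5, 3))  # prints 8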


When we describe Klara's knowledge in semantic terms we say that she knows that the expressions 'fem plus tre' and 'åtta' refer to the same object, the number eight. (But, of course, she doesn't know that she knows that!) We need abstract entities, the numbers three, five and eight, in order to give a semantics for her holding this identity true. Quine (1960) discusses the linguistic steps associated with reification, positing objects, claiming that we have clear evidence that a speaker posits an object when she uses essential pronouns, as in 'Look, there's a raven. It is black.' If we replace 'It' by 'A raven' we lose the point of the second utterance, viz., that it is the same raven which is black. He further elaborates on reification, pointing out that it is connected to the use of a number of grammatical constructions, viz., terms with divided reference, subjunctive clauses, essential pronouns and plural constructions, all being linguistic devices for distinguishing objects. So it is no coincidence that most children begin to grasp the cardinal principle at an age of circa 3.5 years; at this age they are normally language users who can manage pronouns, plural constructions and counting. So we have a plausible naturalistic account of how we implicitly postulate positive integers and learn elementary arithmetic. It is no philosophical problem to understand the further constructions of negative and rational numbers. But when we consider reals we are in deep waters. Reals are more problematic than natural numbers and rationals since reals cannot be counted and most reals have no names or other linguistic representations.

4.6.2 Reals and Infinity

Real numbers are classically defined as infinite sequences of rationals, thus treating such sequences as actual and completed infinities. If we think of abstract objects as basically being human constructions we are now in trouble, for certainly no infinite set has ever been constructed by any human. Is it really legitimate to talk about an infinite set as an object?

Aristotle distinguished between potential infinity and actual infinity. The series of natural numbers, for example, is infinite in the sense of never being completed; starting anywhere in the series we can always go to the next number by applying the successor operator. The series of natural numbers is thus infinite in the sense that it is always possible to continue counting; there is no end. This notion of infinity is unproblematic. But Aristotle rejected the notion that the set of natural numbers is completed, an actually existing infinite set N. (Aristotle's aim with the distinction between actual and potential infinity was to reject some of Zeno's paradoxes.) Aristotle's view was for a very long time generally accepted among mathematicians. Here are some quotes:

Gauss (2011, 216): I protest against the use of infinite magnitude as something completed, which is never permissible in mathematics. (letter to Schumacher, 1831)


Cantor (1883, 205, n.3): I cannot ascribe any being to the indefinite, the variable, the improper infinite in whatever form they appear, because they are nothing but either relational concepts or merely subjective representations or intuitions (imaginationes) but never adequate ideas.

Jonathan Lear (1980, 193): It is easy to be misled into thinking that, for Aristotle, a length is said to be potentially infinite because there could be a process of division that continued without end. Then it is natural to be confused as to why such a process would not also show the line to be actually infinite by division. . . . [I]t would be more accurate to say that, for Aristotle, it is because the length is potentially infinite that there could be such a process. More accurate, but still not true, strictly speaking. Strictly speaking there could not be such a process, but the reason why there could not be is independent of the structure of the magnitude: however earnest a divider I may be, I am also mortal. Even at that sad moment when the process of division terminates, there will be more divisions which could have been made. The length is potentially infinite not because of the existence of any process, but because of the structure of the magnitude.

I do think the protests against the notion of the actual infinite are valid. And, certainly, viewing numbers as human constructions one can scarcely claim that a number is identical with an actually performed infinite construction. However, there is an alternative conception of reals not built upon actual infinite sets. This conception is the starting point in constructive analysis.

4.6.3 Constructive Analysis

Constructivism in mathematics began with Brouwer, who rejected the view that mathematical objects exist independently of us humans. Brouwer's version of constructivism, intuitionism, has not commanded wide support, but Bishop's version (Bishop and Bridges 1985) has attracted more interest. Billinge (2003, 176) describes the situation thus:

At the beginning of the 1960s the prospects for constructive mathematics looked bleak. Brouwer's Intuitionistic Mathematics and Russian Constructive Mathematics following Markov had inspired much work in logic and meta-mathematics, but had singularly failed to grab the imaginations of ordinary mathematicians. Analysts, algebraists, measure theorists, and the like continued to work exclusively with classical mathematics. However, in 1967 Errett Bishop published Foundations of Constructive Analysis. This book changed the prospects for constructivism entirely. Bishop showed how it was possible to do ordinary analysis within a constructive framework.

In Bishop and Bridges (1985) reals are defined as follows. A sequence $(x_n)$ of rational numbers is regular if

$$|x_m - x_n| \leq m^{-1} + n^{-1} \qquad (m, n \in \mathbb{Z}^+) \tag{4.1}$$

A real number is a regular sequence of rational numbers. Two real numbers $x = (x_n)$ and $y = (y_n)$ are equal if

$$|x_n - y_n| \leq 2n^{-1} \qquad (n \in \mathbb{Z}^+) \tag{4.2}$$

Consider two regular sequences $(x_n)$ and $(y_n)$ of rational numbers. If the sequences converge towards each other they will sooner or later be so close that they fulfil 4.2. In that case Bishop and Bridges declare them to be identical, i.e., the same real number. Thus by Eqs. 4.1 and 4.2 reals are defined in terms of rational approximations, without appeal to completed infinite sets, and real numbers are accepted as mathematical entities.
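As a rough illustration of the idea (a sketch of mine, not Bishop and Bridges' own presentation), a constructive real can be modelled as a rule that, for each positive integer n, delivers a rational approximation satisfying the regularity bound 4.1; equality in the sense of 4.2 can then only be tested up to a chosen index.

from fractions import Fraction

def sqrt2(n):
    """Rational approximation of the square root of 2 at index n, computed by
    Newton iteration; the error is far smaller than 1/n, so the sequence is regular."""
    x = Fraction(2)
    for _ in range(n.bit_length() + 3):  # ample iterations for 1/n accuracy
        x = (x + 2 / x) / 2
    return x

def apart(x, y, n):
    """Check whether regular sequences x and y violate the equality bound 4.2 at
    index n; failing to find a violation is not, by itself, a proof of equality."""
    return abs(x(n) - y(n)) > Fraction(2, n)

print(float(sqrt2(1000)))                                # ≈ 1.4142135623...
print(apart(sqrt2, lambda n: Fraction(141, 100), 1000))  # True: 1.41 differs from √2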

From an empiricist point of view Bishop's constructivism is congenial. However, like all constructivists he dismisses bivalence and the law of excluded middle. In this respect I beg to disagree. As argued in Sect. 4.3, we may impose certain restrictions on concept formation so as to avoid paradoxical constructions, thereby giving a reason to hold on to bivalence in mathematics. Viewing bivalence as a norm on predicate construction is actually not so big an aberration from common standards in mathematics and logic as might seem at first glance. An inquisitive person might ask: 'Why are philosophers and scientists so obsessed with logic; why should we follow only logical rules when reasoning?' And the answer is something like: you should follow logical rules as much as possible, because they are beneficial for your thinking: if you use only logical rules you have a guarantee that the conclusion is true, provided the premises are true. So to follow logical rules may be seen as an instrumental norm, helping us to diminish the risk of believing false statements. Similarly with bivalence: it is instrumental in mathematical reasoning. But we need to impose restrictions on predicate construction in order to avoid paradoxes. The constructivist alternative of identifying truth with constructive proof and dismissing bivalence is not appealing, in particular when considering Gödel's incompleteness theorem.

4.7 Gödel's First Incompleteness Theorem and the Law of Excluded Middle

Constructivists in the philosophy of mathematics identify truth with constructive proof, a position labelled 'semantic anti-realism'. (My constructivism, which may be labelled 'holistic constructivism', is different; I view mathematical objects as being constructed, and that does not entail an identification of truth with proof.) As we have seen, this identification entails rejection of the principle of bivalence. Some critics of semantic anti-realism have referred to Gödel's first incompleteness theorem to undermine this identification. Gödel's first incompleteness theorem tells us that in any consistent formal theory rich enough to include elementary arithmetic one can formulate true but unprovable sentences. A condensed way of expressing this theorem is: truth transcends provability.


Since this theorem is constructively provable, this result at first sight seems to justify the conclusion that every well-formed formula is either true or false, irrespective of the existence of any direct proof. But it does not; the fact that there are unprovable but true sentences in a theory does not entail that truth-value gaps in general are excluded. What it undermines is the identification of 'true' with 'provable in a formal theory'. It does not contradict the identification of 'true' with 'provable', the latter predicate taken in a general sense. We may recall that the proof of Gödel's first incompleteness theorem consists in the construction of a sentence with the intended meaning 'I am not provable in T'; since it is proved not to belong to the set of provable sentences in T, it is true. In the words of Pataut:

What the theorem shows is that the extension of the predicate 'recognisable as true' exceeds the extension of the predicate 'provable in P', which is a quite different matter. It shows that the extensions of these two predicates do not coincide. What we have acknowledged so far is that the truth-conditions of A are transcendent with respect to its provability in P. (Pataut 1998, 73)

But, as Raatikainen has pointed out, Gödel's incompleteness theorem poses a more severe threat to the intuitionist's identification of truth with provability than is usually recognised:

The whole picture I want to consider here is beautifully expressed by Sundholm: 'Proofs begin with immediate truths (axioms), which themselves are not justified further by proof, and continue with steps of immediate inference, each of which cannot further be justified by proof' Sundholm (1983, 162). I shall next argue that the two above ideas are incompatible. (Interestingly, also Beeson (1985) denies the decidability of the proof relation. He ends up with this conclusion somewhat differently than the way I do.) For simplicity, let us focus on provability in the language of arithmetic L(HA). Now given a finite sequence of formulas, it is certainly possible to check effectively whether every step in it is an application of an intuitionistically acceptable rule of inference. But how about the premises? Only if one can in addition see that all the premises of a derivation are intuitionistically true can one say that one has a proof of the conclusion at hand. This is at least in principle possible if axiomhood is a decidable property. However, in the intuitionistic setting, it cannot be! For if it was, intuitionistic provability could be captured by a formalized system. And then, by Gödel's theorem, there would be truths that are unprovable, contrary to the basic principle of intuitionism, which equates truth with provability. . . . If one cannot tell whether the premises used in a derivation are acceptable, that is, true or not, one cannot tell whether one has a genuine proof before one's eyes or not, contrary to the standard assumption of contemporary intuitionism. (Raatikainen 2004, 143–44)

I am convinced by Raatikainen's argument. We have good reasons not to identify truth with 'provable' or 'recognisable as true'. We are justified in holding on to the principle of bivalence as good practice for language use. When and if a paradox is discovered we have reason to readjust language use by imposing restrictions on predicate constructions. It is a matter of cost-benefit analysis, which was also Quine's conclusion in Quine (1981c).


4.8 Summary

1. All objects, including numbers and other mathematical entities, are in a Kantian sense constructed; they are the results of cognitive and linguistic actions.
2. Numbers and other types of mathematical objects are not constructed one by one, but collectively, when sortal predicates are utilised in making assertions. For example, all reals are constructed once the predicate REAL NUMBER is established and an identity criterion is given.
3. Unrestricted use of predicates sometimes results in paradoxes, which is a reason not to accept unconditionally the universal validity of the principle of bivalence.
4. So far as we know, all known paradoxes result from impredicativity. Hence one has reason to restrict predicate construction so as to avoid impredicativity.
5. Supposing that one can always impose such restrictions on predicates, one may hold on to the principle of bivalence as a pragmatic decision in theory construction.
6. We can then use the law of excluded middle and double-negation elimination, i.e., accept classical logic in the sciences.
7. But we may still hold that mathematical objects are the results of human linguistic actions.

Chapter 5

Induction and Concept Formation

Abstract The topic of this chapter is the induction problem. The views of Hume, Goodman, Quine and Wittgenstein are discussed, and their common stance, that inductive thinking is a natural habit among us humans, is stressed. Such natural habits make up the basis for concept formation, a point made by e.g. Wittgenstein in On Certainty. The demand for ultimate justification of induction should be rejected as a rationalistic mistake.

5.1 Induction in the Naturalistic Perspective

All sciences except mathematics and logic apply inductive reasoning when drawing general conclusions from observed phenomena. Such inferences are ampliative; the conclusion is logically stronger than its premises. By contrast, a logically valid conclusion from a set of premises is not ampliative; the conclusion does not say more than its premises. Rules for logical reasoning have been studied since antiquity and there is almost universal agreement about the validity of at least the basic logical laws encapsulated in first order predicate logic; disagreement concerns the law of excluded middle and extensions to second order logic and modal logics. Many philosophers have in a similar vein tried to formulate basic rules for inductive reasoning. A popular idea has been that, whereas a logically valid inference ends in true statements whenever the premises are true, inductive inferences result in conclusions which are probable to some degree whenever the premises are true. But, alas, efforts in this direction have been in vain; no theory to this effect has so far survived reasonable criticism. So one is prone to ask: is there really any inductive logic to be found? Is it possible to formulate general and formal rules by which we can justify inductive reasoning?

Hume stated the problem clearly: there are two possible ways of justifying a proposition, either to show that it follows logically from other propositions held true, or to show that it is supported by experience. Neither can be used in a general justification of the use of induction: if we argue that past experiences show us that inductive reasoning quite often is successful and that therefore continued use of induction is justified, our reasoning is circular.


Neither can logic provide any justification, and since there are no other options, inductive reasoning cannot be given any justification at all.

Many philosophers have tried to rebut Hume's skeptical conclusion, without success in my view. In particular, using probability theory in the effort to formulate an inductive logic is of no help; one may easily recognise that Hume's original argument still applies when one tries to use probability theory, for the simple reason that knowledge about probabilities is based on previous experiences.

Popper (1992) claimed that science should be based on rationally justified methods, and since induction fails this criterion, science should do without induction. Instead we should adopt falsificationism as the only rational scientific method; we should do our best to try to falsify our hypotheses, and if the attempts fail we can say no more than that our hypothesis is corroborated. In Popper's terminology the word 'corroborated' means no more than 'so far not falsified'; hence, if we denounce inductive reasoning we cannot say that a corroborated hypothesis is more probable than before, or attribute any degree of belief to it. However, Popper has been criticised for being inconsistent, since he further argued (Popper and Schilpp 1974, 1192–3) that we have reason to think that of two hypotheses, neither of which is falsified, the one which is better corroborated, i.e., has survived more tests, is closer to the truth, has more truthlikeness, than the other. In short, corroboration is a sign of truthlikeness. This argument is a species of induction, as Newton-Smith once pointed out. Popper admitted that it is a 'whiff of inductivism' and Newton-Smith retorted 'This is a full-blown storm.' (Newton-Smith 1981, 68). I cannot but agree with Newton-Smith. For my own part I have no problem accepting that degree of corroboration is a sign of truthlikeness (supposing that we can give a better analysis of truthlikeness than Popper's failed attempt), but Popper cannot coherently take this stance since he rejected inductive reasoning tout court. It is a telling fact that not even Popper succeeded in formulating a scientific methodology totally devoid of inductive reasoning.

The induction problem is still with us; we use a form of inference which we see no way of defending. Quine once characterised our situation with characteristic wit: "The Humean predicament is the human predicament." (Quine 1969, 72). As with many other predicaments, the solution is, I believe, to reconsider the tacit presuppositions at work when formulating the problem. My suggestion is to start by asking: why do we want a general justification of induction? The usual answer is that it is the business of epistemology to provide a foundation for the sciences. Science is the kind of human activity that should fulfil the highest standards of rationality, and that means that we, ideally, should be able to justify our scientific methods. A lot of specific methods are species of induction, hence we need a general justification of induction. Such a stance is part of a common view that philosophy, in particular epistemology, is more basic and fundamental than the special sciences. This view is motivated by the idea that scientific knowledge should be as certain as possible, and since certainty cannot be had without a secure foundation for the general principles, epistemology should provide such a foundation. But this train of thought is in my view erroneous.
It is based on a rationalistic outlook, the notion that we humans are able to know, a priori, something about the relations between human minds and the external world. I can't see how such a priori knowledge is possible.

Epistemology has traditionally tried to answer the question what we ought to believe, not merely to describe what we in fact believe; it has a normative component. It results in recommendations for how to proceed in scientific thinking. But do scientists care? Often not; it seems to me an obvious fact that in cases where philosophically motivated epistemic principles conflict with actual scientific practice, the latter usually wins. Practicing scientists never care about the induction problem. Science proceeds well without any justification of induction, and one may be forgiven for wondering why we philosophers should bother.

Established inductive inference principles sometimes result in false conclusions. That is unavoidable, but it naturally triggers the question how to diminish the frequency of mistakes. One obvious suggestion is the recommendation to use double-blind tests as much as possible in statistical testing. This has no doubt decreased the risk of making errors when inferring from samples to populations. And it is obvious that the recommendation to use double-blind tests is based on inductive reasoning. In other words, inductive reasoning is improved by applying inductive reasoning at a meta-level. Rationalist philosophers are prone to criticise this form of reasoning as circular. But I'm not impressed; it seems to me to be a good circle, not a vicious one.

Should we then stop doing epistemology as a pointless enterprise? I think not. Instead we should reconsider our picture of the relation between philosophy and empirical science. Like many present-day epistemologists, I adopt a naturalistic stance, i.e., I view epistemology as part of our scientific and empirical study of the world; epistemology is the study of how the cognitive apparatus of humans works and under what conditions the resulting cognitive states represent real states of affairs. In such an endeavour no factual a priori knowledge is needed. The age-old problem of rebutting total skepticism could just as well be left aside as a purely internal problem for philosophers; no scientist is bothered and, in fact, not even the skeptics themselves. They continue to live their lives, unreflectively acting as if they believe most of the same things as the rest of us.

Traditional epistemology results in epistemic norms. The critic might now claim that, as an empirical study, naturalised epistemology cannot entail any norms and so it cannot do its work. My reply is that it can result in statements of normative form ('Do so and so!'), but that does not entail the existence of a kind of entities, NORMS. We may well accept a declarative sentence as true without accepting that the sentence describes a FACT, and we may similarly accept the validity of a normative statement without accepting the existence of any norms. Epistemic principles have the form 'do so and so in order to obtain knowledge', or maybe 'do so and so in order to minimise the risk of drawing false conclusions'. Such statements can be reformulated as conditionals, such as 'if you want to get knowledge, do so and so'. Such a sentence could be the conclusion (an inductive one!) of empirical investigations of our cognitive faculties and earlier failures.
Now, our goal of obtaining knowledge is often left unsaid, as a tacit condition, since in many contexts it is obvious; we follow Grice's rule of not saying obvious things.


Hence we just utter the consequent, which is a sentence of normative form. Thus, the normative form of epistemic principles can be explained as a conditional where the antecedent, 'We want to know', is tacit. Many norms have this character. For example, the social norm 'Do not play music loudly if you live in a flat' could be interpreted as tacitly presupposing that people normally want to have good relations with their neighbours; in order not to jeopardise that goal, they should avoid disturbing them. And the conditional is based on an inductive inference from one's own and others' experiences.

In the naturalistic view epistemology is fallible and revisable, as is all our knowledge. There is no vantage point from which to judge whether a particular method is good or bad; such a judgement must be made from within the sciences. The conditionals we believe and express as sentences of normative form, leaving the condition tacit, are the results of empirical investigations and everyday experiences.

Naturalists oppose transcendentalism, i.e., the Kantian philosophy of stating a priori conditions for empirical knowledge. Kant was right in insisting that (propositional) knowledge presupposes concepts, but wrong in assuming that the mind and its functional structure could be studied a priori from a transcendental point of view, as if it were something outside the natural world. The naturalist move is to say that our formation of concepts is a natural process, which can be studied by science in the usual way; that is done in cognitive science. (I do not claim, for certain, that naturalists have explained the intentional character of mental concepts; but the naturalist sets precisely that as his task.) We humans are part of the natural world and our cognitive capacities are our means of interacting with our environment. And knowledge about our cognitive capacities, for example our ability to form useful concepts and to apply induction, has no a priori status; these capacities are functions in the natural world which we may study by ordinary scientific means.

5.2 Justification in the Naturalistic Perspective

Propositional knowledge consists of true, justified beliefs, according to the standard definition. Justification is a relation between beliefs, or statements expressing these beliefs; one statement can contribute to the justification of another statement. No matter how we analyse the relation, it is obvious that a general demand for justification will result in an endless regress; if B justifies A, one will immediately ask for a justification of B, and so on. In practice we must stop somewhere, and epistemological foundationalists have thought that the endpoints must be some kind of a priori and self-justifying statements.

But is there any a priori ground for empirical knowledge, a set of basic and self-evident statements? I think not; any statement can be doubted; even the simplest observation or the most obvious logical principle can be, and has been, doubted. Now the sceptic attacks: from the statement 'Any particular statement can be doubted' it follows, he claims, that 'All statements can be doubted'. But this is an invalid inference.


The premise can be paraphrased as 'For all x, if x is a statement, it is possible that x is false' and the conclusion as 'It is possible that for all x, if x is a statement, x is false'. This is an invalid inference, no matter which modal logic you adopt. The same point has been made, in a different context, by Davidson:

Yet, it has seemed obvious to many philosophers that if each of our beliefs about the world, taken alone, may be false, there is no reason why all such beliefs might not be false. This reasoning is fallacious. It does not follow, from the fact that any one of the bills in my pocket may have the highest serial number, that all the bills in my pocket may have the highest serial number, or from the fact that anyone may be elected president, that everyone may be elected president. (Davidson 1991, 192)
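In symbols the point is one of scope (my notation: $Sx$ for 'x is a statement', $Tx$ for 'x is true', $\Diamond$ for possibility). The premise and the purported conclusion are, respectively,

$$\forall x\,(Sx \rightarrow \Diamond\neg Tx) \qquad \text{and} \qquad \Diamond\,\forall x\,(Sx \rightarrow \neg Tx),$$

and the second does not follow from the first in any standard modal logic, just as Davidson's serial-number example illustrates.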

So it is perfectly consistent to say that none of my beliefs are beyond doubt, that any one of them might be false, and at the same time hold that most of my beliefs are true. Doubts about a particular belief are based on other beliefs not in doubt. But how can there be starting points in chains of justification which are not justified? This is, certainly, a problem for traditional epistemology. But in the naturalistic perspective we do not ask for ultimate justification; instead we look for intersubjective agreement on observation reports; such agreements make up the empirical basis in the empirical sciences, the endpoints where chains of justification begin. And this is the reason why it is appropriate to change from talking about beliefs to talking about statements/sentences; beliefs are subjective states, statements are open to intersubjective inspection.

We humans are normally able to agree about shared observations. When several people at the same spot and speaking the same language observe an event, they normally agree on at least some descriptions of it, so long as no intentional notions are used. Since any observed situation may be described in many different ways, people may disagree about what should be called the most salient description of what happens, but that is another matter. Some descriptions of observed events are agreed to be true.1 This does not mean that agreement is a guarantee of the truth of the sentences agreed upon. But it is a basis for empirical knowledge in the sense of a starting point in an ongoing discourse. Rejection of a previously agreed sentence is possible, if coherence arguments against it, emanating from our background knowledge, are strong enough. But this in turn depends on agreement about the truth of other observation sentences.2

Intersubjective agreement is not a species of justification, since justification is a relation between beliefs and between sentences expressing these beliefs, whereas agreement is not a species of belief; it is a kind of collective action.

1 Quine and Davidson discussed translations between different languages, both concluding that different incompatible translations of a foreign language are compatible with all possible observations. But the basis for any translation is agreement to dissent or assent to occasion sentences among those on the spot. See Quine (1960) and Davidson (1973b).
2 See also Sellars' Empiricism and the Philosophy of Mind, chapter VIII, for an argument, based on a Kantian perspective, to roughly the same conclusion.


People may agree to assent to a sentence uttered on a particular occasion while having different beliefs about its meaning. We ask for justification when we doubt a certain statement made. In cases when two or more people at the same spot are able to observe something and agree on the observation, the demand for justification has come to an end. Consider for example several tourists on a guided tour somewhere in Africa. One in the group suddenly exclaims 'Look, an elephant!' No one seeing the elephant would ask for justification. Such intersubjective agreements function as implicit determinations of the extensions of the predicates used in observation reports. This fact is most clearly recognised when we reflect on how infants learn their first language. For example, we teach a little child words for colours by pointing; we point to a number of hues of, e.g., blue and say 'This is blue' (if we speak English). Learning to use 'blue' correctly requires repetition, situations where we point at blue things and say 'blue'. After some time the child can correctly identify blue things. No one will ask for reasons. We have taught the child to use the predicate 'blue' correctly. In other words, we have taught it the (approximate) extension of this predicate.

The extension of the predicate 'blue' is somewhat vague. How would a child classify a hue between blue and green, if it has only learnt the words 'blue' and 'green'? It depends on its internal dispositions for similarity among colour hues. If the child perceives the unclear case as more similar to blue than to green, it will call it 'blue', otherwise 'green'. Thus classifications of perceived objects are determined by spontaneous perceptions of similarities. This is a point made by Quine (1969). One might wonder to what extent people agree on what is more or less similar. Small differences are to be expected, but it is a fact about us humans that when it comes to colours people in general agree. (I myself, however, sometimes disagree because I'm colourblind, a genetic condition affecting almost 8% of men.)

In the first stages of learning one's mother tongue such learning of predicates is common. Wittgenstein argued this point in at least two places in his oeuvre. The first is in §§143–202 of Philosophical Investigations, where we find his famous discussion of the notion 'to follow a rule'. He discussed a simple rule of arithmetic, addition, and considered the possibility of explicitly stating rules for its application in particular cases. When we do so we get another rule, and the application of this rule in turn requires yet another rule. Very soon we find that we just do things without any justification. Wittgenstein arrives at the conclusion in §202: 'And hence also "obeying a rule" is a practice'. The point of this remark is, I believe, that the request for general justification cannot be met and that the search for it is a misconception of the task of philosophy.3

3 There is an enormous debate about this famous passage in Philosophical Investigations. To me it is obvious that Wittgenstein's point is that language usage is open-ended and based on habits. The demand for ultimate definitions of meanings of linguistic expressions is a modern version of the rationalists' demand for fundamental justification of knowledge, a demand that Wittgenstein totally rejects. And we empiricists agree.


The second place is remark 150 in Wittgenstein et al. (1969):

150. How does someone judge which is his right and which his left hand? How do I know that my judgment will agree with someone else? How do I know that this colour is blue? If I don't trust myself here, why should I trust anyone else's judgment? Is there a why? Must I not begin to trust somewhere? That is to say: somewhere I must begin with not-doubting; and that is not, so to speak, hasty but excusable: it is part of judging.

To judge, to express one's beliefs, is to apply predicates. I interpret Wittgenstein as saying that those beliefs/statements which we hold true without justification function as criteria for the use of the predicates occurring in such statements, i.e., as partial implicit definitions of these predicates. And the same applies when we agree on descriptions of what we observe. An observation sentence agreed upon may be viewed as having the function of a partial implicit definition of the predicate used. Asking for justification of such a sentence is to misunderstand its function.

Every chain of justification ends in such partial and implicit definitions; at every moment we unreflectively hold true some beliefs while doubting others. This holds true even in logic; if we for example try to justify modus ponens we find ourselves using modus ponens, as is nicely shown by Lewis Carroll in the famous dialogue 'What the Tortoise Said to Achilles' (Carroll 1895). The discussion is about a certain inference in Euclidean geometry. Achilles asks the Tortoise to accept the conclusion Z upon the premises A and B:

A: Things that are equal to the same are equal to each other.
B: The two sides of this Triangle are things that are equal to the same.
Z: The two sides of this Triangle are equal to each other.

The Tortoise accepts A and B but does not yet accept the conclusion Z. Achilles and the Tortoise agree that in order to accept Z one needs to accept A, B and the hypothetical C: If A and B are true, then Z must be true. So they agree to make this completely explicit by writing in a notebook:

A: Things that are equal to the same are equal to each other.
B: The two sides of this Triangle are things that are equal to the same.
C: If A and B are true, then Z must be true.
Z: The two sides of this Triangle are equal to each other.

Achilles now maintains that logic tells us that Z is true. However, the Tortoise still expresses doubts about Z, and Achilles then repeats the move. He asks the Tortoise to accept:

D: If A, B and C are true, then Z must be true.

The Tortoise now accepts A, B, C and D, but he still expresses some doubts about Z. Achilles once more repeats his move and the dialogue continues indefinitely. The point Lewis Carroll wanted to make was, I think, that we cannot really say that the general rule modus ponens justifies its instances.


Rather, the inference rule modus ponens must be seen as a description of how we in fact use the if-then construction. The naturalist has only to add that this is our way of thinking and talking. If someone fails to use the if-then construction correctly, the only thing one can do is to give examples of its use; fundamental rules cannot be proved. Hence, explicitly accepting modus ponens as a valid inference is the same as accepting it as an implicit definition of the sentence operator 'if . . . then . . .'.

Similarly, many basic beliefs, when expressed as sentences held true, function as implicit definitions of predicates occurring in these sentences. In science we often introduce new predicates in this way. To my knowledge the earliest example is the introduction of MASS. In Principia Newton explicitly introduced the word 'mass' as short for 'quantity of matter'. This expression in turn was 'defined' in the very first sentence of Principia: "The quantity of matter is the measure of the same, arising from its density and bulk conjointly." But this formulation is, I believe, a rhetorical move against Descartes, who held that quantity of matter is volume, for one is immediately prone to ask how Newton defined 'density'; obviously he cannot, on pain of circularity, define density as mass per volume unit. The empirical basis for the introduction of 'quantity of matter', i.e., 'mass', is the discovery of conservation of momentum made by John Wallis, Christopher Wren and Christiaan Huygens almost 20 years before the publication of Principia. They found that two colliding balls always change velocity in constant proportions. Hence one can attribute constant quantities to bodies, viz., their masses. Newton extensively rehearses their findings in the first Scholium (after Corollarium VI) in Principia and it is clear that this is the empirical basis for the introduction of the predicate 'mass'. I will discuss this in more detail in Chap. 10.
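A minimal formal rendering of the Wallis–Wren–Huygens regularity (my sketch of the standard reconstruction, not Newton's own notation): if two colliding bodies 1 and 2 always change their velocities in a fixed proportion characteristic of the pair,

$$\frac{\Delta v_1}{\Delta v_2} = -k_{12},$$

then one may introduce constants $m_1$ and $m_2$ with $m_2/m_1 = k_{12}$, so that the observed regularity reads

$$m_1\,\Delta v_1 + m_2\,\Delta v_2 = 0,$$

i.e., conservation of momentum; the masses are precisely the constant quantities that make the observed proportionality explicit.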

Those basic beliefs, which constitute endpoints in instances of chains of justification, concern many sorts of things, such as observable objects in the vicinity of the speaker, the meaning of sentences uttered by other people, common opinions about nature and society, etc. Popper, by the way, was fully aware of this; in his methodology he introduced basic statements as a set of truths taken for granted for the time being in a particular case of testing a hypothesis. But a basic statement can be doubted if other things appear more safe and consistency requires that something is given up. These basic beliefs are mostly taken for granted as part of the background to our discourse and interaction with other people. For example, it is a non-negotiable fact about us humans that we take it for granted that ordinary things in our vicinity, including other people, have roughly the same properties from one moment to another and will respond roughly in the same manner. We also take it for granted that other people in general mean something with their words, and we do not really doubt that our fellow humans think, have feelings and experiences more or less similar to ours. We follow a large number of rules, linguistic as well as non-linguistic, in every moment of our waking life without really justifying them. This is our way of being as animals with linguistic competence.

I see a resemblance between Carroll's and Wittgenstein's stance on ultimate justification. And, of course, the idea traces back to Hume's position in the Treatise, where he discussed the skeptic's doubt about the veracity of our immediate experiences of external objects. Hume concluded that a convincing argument cannot be given, but that this does not lead to doubts about the existence of external objects:

Thus the sceptic still continues to reason and believe, even tho' he asserts, that he cannot defend his reason by reason; and by the same rule he must assent to the principle concerning the existence of body, tho' he cannot pretend by any arguments of philosophy to maintain its veracity. . . . We may well ask What causes induce us to believe in the existence of body? but 'tis in vain to ask Whether there be body or not? That is a point, which we must take for granted in all our reasonings. (Hume 1986, 238)

Thus Hume did not aspire to justify the claim that our experiences are caused by external objects. Instead he stated that it is an empirical fact about us that we do believe that our perceptions are perceptions of external physical objects, and we do believe that these objects may cause each other's motions. It belongs to our nature to assume that external objects exist and cause our impressions. One may say that, in Hume's view, someone who claims to be skeptical concerning the existence of external objects and other mundane things is not serious; he professes skepticism, but that is just empty talk. Hume's stance is the first exposition of epistemological naturalism.

The most explicit proponent of epistemological naturalism is Quine. The common trait in Hume's and Quine's positions is the stance that justification of beliefs from a vantage point outside the realm of empirical knowledge is impossible. The difference between Hume and Quine is that Quine thinks it possible to give a scientific explanation of the interaction between our mind and the external world, whereas Hume is satisfied without such an explanation; he just notes that certain ways of thinking belong to our nature. The attempted scientific explanation of the interaction between the external world and the mind is by Quine seen as epistemology in a new key. Epistemology is thus not the foundation for empirical knowledge, but an integrated part of it; as Quine puts it, 'a chapter of psychology and hence of natural science.' (Quine 1969, 82). Had Hume endeavoured to justify the veracity of our impressions by somehow arguing that they reflect the real nature of things, he would have ended in rationalistic metaphysics, thereby contradicting his own empiricist principles. For my own part I would say that naturalism is the natural development of empiricism.

Epistemological foundationalists assume that there must be endpoints in chains of justification, statements that we accept as certain without being justified by something else. In older times some such statements were called 'self-evident', but this label has come into disrepute; there are in the history of mathematics examples of statements once held to be self-evident that we now dismiss as false. (One example is Euclid's axiom that the whole is greater than any of its parts.) It is obvious that there must be endpoints of justification, but the foundationalists' mistake is to conceive of these endpoints as certain knowledge in the sense of justified true beliefs. To repeat, those endpoints are statements that function as implicit definitions of predicates; that is why they are certain. This was the core point of Wittgenstein's On Certainty.


5.3 Evidence and Justification

A core problem in epistemology has always been to analyse how the external world relates to our beliefs. This is but one aspect of the mind-body problem. Wilfrid Sellars formulated it as a question about the relation between the space of causes (i.e., the external world) and the space of reasons, our internal world. It seems indeed reasonable to say that external states of affairs may cause some of our beliefs. We take it for granted that we must recognise a state of affairs before it can cause any belief, but since the term 'recognise' belongs to the space of reasons and describes a mental act, we have not really made any progress. Rather soon one realises the profound truth of Sellars' phrasing: our terms can be divided into two categories, those which belong to the space of reasons and those which belong to the space of causes, and there seems to be no bridge between these two types of terms.

But what about the term 'evidence'? It is often used to relate items in the space of causes to items in the space of reasons; an object, a state of affairs, a data set may all be called 'evidence' for the content of a belief and for the sentence expressing that belief. (I'm not claiming that using 'evidence' is the solution to the mind-body problem; I'm just describing how we use this term.) Typically, we say that things observed constitute evidence for observation reports. An amount of DNA found on a knife with which a person has been murdered is strong evidence for the proposition that the murderer has that DNA. We believe it certain that the person having this DNA has touched the knife, and the sentence 'The person with this DNA has touched this knife' then justifies the sentence 'That person was the murderer'. As was discussed in Sect. 3.8, usage has it that 'justification' relates things that are truth-apt, whereas 'evidence' has a broader use; it may also relate a fact or state of affairs to something that is truth-apt. Hence if A justifies B, then A is evidence for B, but the converse is not always true. The claim that something, an object, a data set or an observation, is evidence is defeasible; another piece of evidence may overrule it.

How do we know that A is evidence for B in those cases where A is an object with a certain property and B is a sentence? Just as in the case of learning colour words, it is a matter of learning to use the word 'evidence'. My observation of a person being at a particular place and time is evidence for the statement that that person was at that particular place at that particular time; this may be taken as a partial implicit definition of 'evidence'. By paraphrasing Lewis Carroll's dialogue about modus ponens in Sect. 5.2 I indicated that the situation is similar in logic, i.e., no ultimate justification is to be found. Even logicians are sailors on Neurath's boat.4

4 Cf. 'Wie Schiffer sind wir, die ihr Schiff auf offener See umbauen müssen, ohne es jemals in einem Dock zerlegen und aus besten Bestandteilen neu errichten zu können.' (Neurath 1932) ('We are like sailors who must rebuild their ship on the open sea, without ever being able to dismantle it in a dock and reconstruct it from the best components.')


However, there is a difference between deductive and non-deductive reasoning. Deductive principles are valid independently of the context of application, whereas no such general principle applicable to all cases of non-deductive reasoning can be found. Non-deductive reasoning can only be appraised case by case, in specified contexts. (We need not here distinguish between induction and abduction; both are non-deductive forms of inference.)

5.4 Induction and Concept Formation

In the naturalistic view the problem of induction is thus not that of justifying induction in general; that is impossible. But it is obvious that we do not consider all instances of inductive thinking equally good; we have strong intuitions that some conclusions are much more reliable than others. Hence, we should reformulate the induction problem as the task of describing more thoroughly our inductive practices and giving an account of the methodological role induction has in our scientific work. We should try to explain why we think that certain inductions are more trustworthy than others.

This is roughly Goodman's way of viewing the matter in his Goodman (1955). More precisely, he asked what kinds of predicates are used in (normal) inductive reasoning. To illustrate the problem he construed the artificial predicate grue, defined as true of things examined before some time in the future, AD 3000 say, and found to be green, or examined after AD 3000 and found blue. All emeralds so far examined are thus both green and grue. Without further constraints simple induction tells us that we have equal reason to assume that the first emerald to be examined after the year 3000 will be green as well as grue, i.e., blue. One prediction, at least, will ultimately fail, and we all believe that emeralds will continue to be green. But why? This is the induction problem in the new key.

Goodman's formulation of the problem is that some predicates are projectible and some others are not. Obviously, we need to know the conditions for a predicate being projectible. Goodman suggested that the notion of entrenchment could be used in order to distinguish between projectible and non-projectible predicates. But why do some predicates become entrenched? Goodman gave no answer. Being a naturalist I will here propose an evolutionary explanation: we humans have in the course of time evolved certain cognitive habits, viz., those that have made us more apt for survival and reproduction. We have invented concepts, which we use in predictions of future events, and sometimes this is successful.

Let us consider colour concepts; they are not a priori in the sense of being innate, independent of any experience. They are not even universal; different cultures divide the colour spectrum differently, see e.g. Berlin and Kay (1969) and Saunders (2000). There are languages which have only two, three, or four colour words. Consider for example people speaking a language which does not distinguish between green and blue (and there are such people according to Berlin and Kay 1969); they have one colour word for all hues from blue to green, let us call it 'tribegrue'. How should we express Goodman's problem in their language?


One option is to translate all three colour words, 'blue', 'green' and 'grue', as 'tribegrue', in which case Goodman's point is lost. Hence, Goodman may be taken to have shown that predicates are cultural phenomena, and what from one background appears artificial from another background appears natural.

Even though Goodman did not answer his own question, his analysis is a step forward, because he replaced the quest for a general justification of induction with the more empirical question 'under what conditions can a particular instance of induction be expected to be successful?' However, stating the problem in terms of the distinction between projectible and non-projectible predicates, taken one at a time, is not satisfying. Goodman overlooked a crucial component of the situation, viz., the identification of the referents of the singular terms used in our observation statements. When we for example ask which predicate to use in generalisations about emeralds, green or grue, we should also consider the rules we follow in the identification of emeralds. The question is thus not which single predicate, green or grue, to use in a particular case of inductive reasoning, but the correct pairing of predicates. In the sentence 'This emerald is green' we have two predicates, 'emerald' and 'green'. Obviously, we use a predicate, 'emerald', in the identification of the referent of the noun phrase in the sentence. In general, any inductive conclusion has the form 'For all x, if Ax, then Bx'; hence the real question concerns the relation between the predicates 'A' and 'B'. The induction problem can now be reformulated as: for which pairs of predicates A and B is it reasonable to expect that if an object satisfies A, it also satisfies B?

In the case of grue or green emeralds it is rather simple. We know that emeralds consist of the mineral beryl contaminated with chromium. This metal makes the mineral green, according to fundamental physical laws. The necessary and sufficient condition for something to be an emerald is that it is a gem made up of beryl containing chromium. The same condition entails, via scientific laws, that it is green, independently of time. Hence if something satisfies the predicate 'emerald', it also satisfies the predicate 'green'. I rely here on the concept 'physical law' and on the fact that laws are justified empirically by being generalisations of observations, to be further discussed in Chap. 10. Hence the argument depends on previous inductions. This is no vicious circle, as already pointed out.

Speakers of a language which does not distinguish between green and blue will of course say that all emeralds are tribegrue, and this inductive conclusion is correct if the conclusion that all emeralds are green is correct. So long as they have no practical need of distinguishing blue from green objects, their use of 'tribegrue' is successful and in no need of improvement. Perhaps members of this community will at some time find it useful to make finer colour distinctions. The history of science is full of conceptual developments of this kind; the most obvious cases are perhaps the introduction of finer distinctions between diseases. Two diseases may have similar symptoms but very different etiology, in which case it is necessary to keep them apart as different diseases if one wants to prescribe cures.


Suppose we have observed a regularity in nature: so far, all observed objects are such that if they satisfy a predicate A, they also satisfy another predicate B. Let us assume that both predicates are expressions taken from our vernacular, without use of scientific theory. We thus have two options: either to assume that the regularity so far observed is a mere coincidence, i.e., an accidental generalisation, or else to assume that it reflects a hidden structural feature. Taking the first option is to guess that sooner or later we will hit upon a counterexample. The second option is to guess that the generality 'for all x, if Ax, then Bx' is true. If this is correct, we have found a natural law.

Suppose we have found a natural law by inductive reasoning. Isn't the existence of such laws a bit astonishing? Why is it the case that an indefinite number of objects satisfy two logically unrelated predicates? Is not the most reasonable assumption that the probability of such a state of affairs is zero? The history of science suggests two ways of explaining such regularities.

The first possibility is to derive the regularity, or some version close to it, from a set of more fundamental and independently acceptable principles. A telling example is the general law of gases. This law began life as Boyle's observation that the product of pressure and volume of a portion of gas is constant. Later, Jacques Charles in 1787 and Joseph Louis Gay-Lussac in 1808 found that this constant depends on temperature, and still later the complete general law of gases was formulated when the concept of mole was available. For some time this law appeared to be an empirical regularity, a brute fact. However, we now know that it can be derived from the principle of energy conservation, given the identification of absolute temperature as the mean translational kinetic energy of the particles making up the gas. So it is not just an empirical fact that the two open sentences 'x is a gas' and 'The pressure, volume and temperature of x satisfy the equation pV = nRT' are both satisfied by the same objects. It follows from a basic principle, given some auxiliary assumptions.
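A standard kinetic-theory sketch of the derivation alluded to above (my summary, under the usual auxiliary assumptions of an ideal gas of N non-interacting particles): the pressure exerted on the container walls satisfies

$$pV = \tfrac{2}{3}\,N\,\langle E_{\mathrm{kin}}\rangle,$$

and identifying absolute temperature through $\langle E_{\mathrm{kin}}\rangle = \tfrac{3}{2}\,k_B T$ gives

$$pV = N k_B T = nRT,$$

since $N = n N_A$ and $R = N_A k_B$.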
This brings us to the second way of explaining the remarkable fact that an indefinite number of objects all satisfy two unconnected predicates. Many scientific predicates start their lives as part of our vernacular; 'energy' and 'force' are two obvious examples. As science advances, vague notions are sharpened and changed into scientific predicates with explicitly defined criteria of application. And, of course, many new predicates are introduced by implicit or explicit definitions. The crucial point is that in this process of conceptual development a well-established regularity is normally not given up. Suppose we have such a well-established generality, 'For all x, if Ax, then Bx', and hit upon a putative counterexample, an object which satisfies A but not B. Logically we have two options: either to drop the regularity and accept that it has been falsified, or to change the criteria of application of the predicate A so that the putative counterexample can be excluded.

A simple example of the latter is the history of the concept of fish. Aristotle had observed that dolphins have lungs, that the mothers give birth to living offspring and feed them with milk; hence he clearly recognised that they were not fishes. (He classified dolphins, porpoises and whales in the genus cetacea.) But his insights were forgotten and for a long time these mammals were classified as fishes. But fishes have gills, while cetaceans have no gills, so how was this conflict to be resolved? It was John Ray (1627–1705) who, in Ray (1693), finally recognised that dolphins, porpoises and whales are not fishes. Thus our predecessors did not give up the generality 'All fishes have gills'; instead dolphins were reclassified as not being fishes. The intuitive criteria for being a fish, 'animal swimming in the seas with mouth, fins and eyes', or something of the kind, were sharpened by additional clauses.

Another example is provided by atomic theory, in particular Prout's hypothesis that all elements have atomic weights which are integer multiples of the atomic unit, equivalent to the weight of a hydrogen atom. However, soon after the formulation of this hypothesis (at the beginning of the nineteenth century) it was found that the atomic weight of chlorine is 35.5, indicating that chlorine in fact does not consist of a whole number of atomic units. But the hypothesis was not given up; instead one guessed, correctly, that chlorine extracted from naturally occurring compounds is a mixture of two isotopes with different masses, Cl-35 and Cl-37. Hence naturally occurring chlorine is not really one single substance but two, and the measured atomic weight of chlorine is the weighted mean of Cl-35 and Cl-37. Thus the identification of substances was improved.
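As a check on the isotope explanation, the measured atomic weight is recovered as a weighted mean (the abundance figures below are approximate modern values, not given in the text):

\[
\bar{m}_{\mathrm{Cl}} \;\approx\; 0.758 \times 35 \;+\; 0.242 \times 37 \;\approx\; 35.5 .
\]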

These are two examples of a possible and sometimes reasonable strategy, viz., to keep the regularity and redefine the criteria of application of the predicate in the antecedent. New counterexamples might trigger new adjustments of the criteria of application of predicates. The logical endpoint of this process is reached when the set of necessary conditions for satisfaction of the predicate in the consequent is a subset of those for the predicate in the antecedent; in such a case no further counterexample is possible, and we have arrived at a fundamental law, which is a (partial) implicit definition of one of the predicates in the law sentence. I'll discuss this in detail in Chap. 10. One may observe that I here use 'fundamental law' in an epistemic sense. In an axiomatic exposition of a theory other laws may be chosen as fundamental, since such a choice may result in the most elegant exposition of the theory in question. This might be called the logical sense of 'fundamental'.

Inductive reasoning is intimately connected with theory development, but both inductivists and falsificationists have told a distorted story. The inductive process also involves concept development. Inductive reasoning is our way of improving our predictive capacity. The success of empirical science, and in particular the usefulness of induction, is explained fundamentally in the same way as other evolutionary processes; it is the result of adaptation and competition, in this case adaptation of concepts to the way the world is structured and competition among theories.5

5 The phrase 'adaptation to the way the world is structured' should not be read as a commitment to some version of structural realism; it is merely another way of saying that hypotheses sometimes are disproved by experiments.


Summarising the argument, the answer to the question above is that the two predicates in a successful inductive generalisation are in fact conceptually dependent on each other, or can be so made, either by deriving the observed regularity from fundamental laws, or else because the criteria of application for the predicate in the consequent are a subset of those for the predicate in the antecedent. This argument applies, of course, not to ordinary language, but only to a well-structured scientific theory. So-called 'laws' expressed in ordinary language are not strict regularities. It is obvious that this procedure of refining the criteria of application of concepts, when repeated at more and more abstract and general levels, ends in a set of fundamental laws and principles which are accepted as fundamental without being derived from other principles. Now the same question recurs, with still stronger force: how is it possible that an indefinite number of objects satisfy two logically unrelated predicates? And the answer is still the same: these fundamental laws function as implicit (partial) definitions of theoretical predicates. This will be shown in detail in Chap. 10.

5.5 Induction as a Heuristic Device

The picture emerging from all this is that induction should not be seen as a particular form of reasoning for which one needs an independent and non-empirical justification, but as a heuristic device in theory construction. We observe in a number of cases a regularity using two more or less well-defined predicates. Sometimes we believe that the observed regularity reflects a structural feature in nature. This naturally induces the scientist to try to invent a theory which reflects this structural feature, and the goal is reached when the theory entails the empirical regularity or some formulation reasonably close to it.

What I have just said resembles to some extent what Aristotle claims in Posterior Analytics. According to Hankinson (1998, 168), Aristotle's word 'ἐπαγωγή' (epagoge), which usually is translated as 'induction', should not be interpreted as denoting an inference principle, but rather as a causal term:

The method in which we arrive at first principles is called by Aristotle 'epagoge'. Starting from individual perceptions of things the perceiver gradually, by way of memory, builds up an experience (empeiria), which is 'the universal in the soul, the one corresponding to the many' (Posterior Analytics 2.19.100a6-8); and it is this which provides the arche, or first principle: 'These dispositions are not determinate and innate, nor do they arise from other more knowledgeable dispositions, but rather from perception, just as when a retreat takes place in battle, if one person makes a stand, another will too, and so on until the arche has been attained.' (2.19.100a9-13) This process gives us universals (such as 'man') without which we cannot utter assertoric sentences, which in turn lead to higher-order universals, such as 'animal' from particular species. (2.19.100a15-b3). It is described in causal, not inferential terms (which is why 'induction' is misleading): the world simply impresses us in such a way that we come to internalize ever wider and more inclusive concepts. We are by nature equipped to take on form in this way; if we are diligent and unimpaired, our natural faculties will see to it that we do so. Thus in a relatively literal sense we just come to see that Callias is a man and, ultimately by the same process, what it is to be a man.


Hankinson here in fact says that Aristotle was a naturalist in the sense here given, and it seems to me that he has evidence for this interpretation. Furthermore, Hankinson's remark that 'we just come to see that Callias is a man and, ultimately by the same process, what it is to be a man' is another way of saying that endorsing the truth of the sentence 'Callias is a man' is to hold that that sentence is a partial implicit definition of the predicate 'man'. It is thus clear that no justification for this sentence is needed, or indeed possible.

Who is to say, in advance, that a particular inductive conclusion is justified or unjustified? In retrospect we can say of a particular inductive step and the resulting theory that it was successful or unsuccessful, and hence in a sense justified or unjustified as the case may be. But we cannot decide that in advance. In this perspective, to ask for a general rule for accepting or rejecting an inductive generalisation would amount to assuming that we could know a priori the structure of reality and the future development of a scientific theory. A traditional metaphysician might think that is possible, but a naturalist does not.

A critic might say that all this presupposes what should be proved, viz. that nature is regular and not completely chaotic. The account only makes sense if there really are regularities to be found. I agree that a general faith in the existence of regularities is presupposed, but that is also part of the naturalistic viewpoint. If nature were not sufficiently stable over longer periods of time, no biological evolution could have taken place and we would not be here to ask questions. The problem is not to justify the general assumption of regularity, since the demand for such a justification, again, is precisely what the naturalist rejects. Instead, the task is to discover which particular regularities there are in nature; that there are such regularities can be inferred from the fact that we human beings are here asking these very questions. Answering these questions is precisely the task of natural science.

5.6 Summary

The request for a general justification of induction should be rejected as based on a rationalistic outlook. Inductive thinking is a basic natural habit among all humans, as well as other animals. It is an empirical fact.

Goodman stated what he called the new riddle of induction as the problem of distinguishing between projectible and non-projectible predicates. This misses the crucial point that the things quantified over in a suggested inductive generalisation are identified by a definite description involving at least one predicate. So the proper formulation of the induction problem in the new key is which pairs of predicates are used in successful inductive generalisations.

Inductive reasoning in scientific contexts sometimes results in concept refinement or concept formation. If a suggested inductive generalisation of the form 'All A:s are B:s' is faced with counter-evidence, one can either sharpen the criteria for fulfillment of predicate A or widen the criteria for satisfaction of B. This can be repeated, and the process ends when no counterexamples are possible. In that case we have in fact a consequence of fundamental laws; the criteria for satisfaction of B are part of the criteria for satisfaction of A. This reasoning depends on previously known fundamental laws, so the question is how these are established. I will discuss that in detail in Chap. 10.

Chapter 6

Explanation, Unification and Reduction

Abstract Two developments of Hempel's theory of explanation are discussed in this chapter: unification in the vein of Friedman, and theoretical reduction in the vein of Nagel. The problems in Friedman's theory of unification are traced to its purely syntactical analysis. By contrast, Nagel's account of theory reduction contains from the very outset non-formal aspects, which gives room for an account of how theory relates to the world, i.e., for an account of non-formal semantics. So a successful reduction in Nagel's sense may be viewed as ontological reduction; one kind of object is conceived as a sub-category of a broader range of objects. This is one important aim of physics.

6.1 Introduction

Scientific explanation has, besides prediction, always been a prime goal of scientific research. But what more precisely is explanation? The seminal paper on explanation is Hempel and Oppenheim (1948), where the Deductive-Nomological model, abbreviated DN-model, was introduced. A DN-explanation is a deduction of the explanandum from a set of premises among which there is at least one scientific law. Hempel & Oppenheim illustrate their account with the explanation of why a piece of metal expands when it is heated. This expansion was explained by the conjunction of two facts: (i) the law that all metals expand when heated and (ii) the particular fact that this metal was heated.

Laws are universally generalised conditionals, but it cannot be this feature of laws that contributes to the explanation, for if we omit mention of lawhood, we get, in the case above, the following dialectic: 'Why does this piece of metal expand when being heated? Answer: it is always the case that metals expand when being heated and this piece was heated.' This doesn't seem to be much of an explanation. So the explanatory force, if there is one, of the DN model is based on the implicit assumption that laws are not merely true, universally generalised conditionals; it crucially depends on the implicit assumption that laws in some sense are necessary, in contrast to mere accidental generalisations, which do not explain. (I will discuss necessity of laws in Sect. 10.10.)
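To fix ideas, the DN pattern just described can be set out schematically for the metal example (the notation is mine, not Hempel and Oppenheim's):

\[
\begin{array}{ll}
\text{Law:} & \forall x\,\bigl(Mx \wedge Hx \rightarrow Ex\bigr) \quad \text{(all metals expand when heated)}\\
\text{Condition:} & Ma \wedge Ha \quad \text{(object } a \text{ is a metal and was heated)}\\
\hline
\text{Explanandum:} & Ea \quad \text{(object } a \text{ expanded)}
\end{array}
\]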


But Hempel & Oppenheim didn't discuss the concept of law in their paper; they took it for granted. So laws, qua laws, explain, according to Hempel & Oppenheim, individual observations. However, explanations of individual events are most often not of much scientific interest. The focus of the debate about explanation has been wrong in my view. What really needs an explanation is not why a particular piece of metal expands when being heated, but why it is generally so. In science we rarely want an explanation of a truly singular observation; such an observation is in many cases dismissed as a mistake or a random event. Primarily, we want explanations of types of repeated phenomena, described by general statements; these are the things that elicit wonder.

The DN-model cannot, however, easily be adapted so as to exhibit how we explain general statements. How, for example, should we explain that all pieces of metal expand when heated? We cannot derive it from another law and a particular fact, obviously. Perhaps we can derive it using only laws? This idea was the starting point for two different routes in the ensuing discussion of explanation. One was taken by Nagel (1961, 1970), according to which theoretical explanation consists in the reduction of an entire theory to another, more basic one. The other route was taken by Friedman (1974), who started the discussion of explanation as unification; he claimed that explanation aims at understanding, and that this is achieved by unification: the idea that a set of laws may be derived from a smaller set of laws fulfilling certain criteria, and that this constitutes the explanation of the thus derived laws.

Reduction is a relation between theories, whereas unification relates laws. Theories consist fundamentally of laws, so it seems reasonable to hold that theoretical reduction is a species of unification, broadly construed, albeit not exactly in the sense proposed by Friedman. So let us start with the more general idea, explanation as unification.

6.2 Friedman on Unification

Friedman (1974) proposed improving the original DN model by suggesting that a scientific explanation is achieved when completely different kinds of regularly occurring phenomena, each kind being described by a law, are shown to be derivable1 from a smaller set of laws. Explanation is unification, and unification is reduction of the number of laws not derived from other laws. This was the core idea in Friedman's paper, starting the discussion about explanation as unification. Why is unification explanatory? Friedman's view was that explanation aims at understanding and unification achieves precisely this: we understand more of nature when being shown that different kinds of phenomena actually are consequences of a small set of 'deeper' and more general laws.

1 It is of course not the phenomena themselves that are derivable, but descriptions of them that can be derived.


The idea was not completely new. Quine hinted at unification, without using the word, in Quine (1960, 21):

We may think of the physicist as interested in systematizing such general truths as can be said in common-sense terms about ordinary physical things. But within this medium the best he achieves is a combination θ of ill-connected theories about projectiles, temperature changes, capillary attraction, surface tension, etc. A sufficient reason for his positing extraordinary physical things, viz., molecules and subdivisible groups of molecules, is that for the thus-supplemented universe he can devise a theory θ' which is simpler than θ and agrees with θ in its consequences for ordinary things. (As it happens he does a bit better. Besides being simpler than θ his θ' excels θ on the score of familiarity of underlying principles. . . . Moreover, even those of its consequences that can be stated in common-sense terms about ordinary things exceed those of θ and apparently without including sentences that there is reason to deny.)

Friedman (1974) suggested that if we can derive a set of laws from a smaller set of laws, and if the derived laws are acceptable independently of the laws in the smaller set, then this smaller set explains the derived laws. Explanation is reduction of the number of brute general truths, expressed as laws, postulates or principles; it is a unification of an area of discourse. I have some sympathy with this general idea, but there are well-known problems. One immediately observes that it is easy to reduce, in a trivial sense, the number of laws just by constructing a conjunction of several laws; but this conjunction is of course not explanatory, and neither does it reflect what physicists actually do when they succeed in constructing a better unified theory starting from a number of laws. Thus, in order to block this, Friedman introduced the notion of independent acceptability, characterised by two conditions:

(1) If S ⊢ Q, then S is not acceptable independently of Q.
(2) If S is acceptable independently of P and Q ⊢ P, then S is acceptable independently of Q.

Some sentences can be partitioned into smaller sentences that are each acceptable independently of the original one, and this process may be continued until one reaches sentences that are not further divisible into independently acceptable parts; such sentences are called K-atomic:

I will say that a sentence S is K-atomic if it has no partition; i.e., if there is no pair (S1, S2) such that S1 and S2 are acceptable independently of S and S1 & S2 is logically equivalent to S. (Friedman 1974, 17)

Explanation is then, Friedman suggests, reduction of the number of independently acceptable consequences of a sentence S, conK(S). His definition of explanation is thus:

(D1) S1 explains S2 iff S2 ∈ conK(S1) and S1 reduces conK(S1).

However, Friedman is not entirely satisfied, since the addition of an independently acceptable law, S3 say, to the explanans would give as a result that the conjunction of S1 and S3 does not explain S2. Friedman thinks this is undesirable and suggests a refined version to avoid this:


(D1’) S1 explains S2 iff there exists a partition  of S1 and an Si   such that S2  con(Si )and Si reduces cons(Si ) But alas, there are profound problems with Friedman’s account, as the debate clearly has shown. The most devastating one was identified by Salmon (1989), who showed that any lawlike generalisation can be partitioned using two or more predicates such that the union of their extensions equals the extension of the predicate in the antecedent of the original generalisation. For example (1) ‘All humans are mortal’ can be split up into (2) ‘All men are mortal’ and (3) ‘All women are mortal’. Since (1) presumably is not independently acceptable of either (2) or (3), whereas (2) and (3) are thus independently acceptable of (1), the conditions in Friedman’s definition is met. But (1) is not K-atomic, and it seems plausible to assume that similar partitions can be made in the extensions of all predicates; hence there seem to be no K-atomic predicates at all. Salmon’s examples shows, I think, that any merely syntactical criterion will not work as a clarification of the notion of unification. Any discussion confined to merely logical relations between statements does not seem to come to grips with the basic intuition in the unification account of explanation. Kitcher (1981, 1989) tried another way of explicating unification, but his account has not convinced many people in the field. The difficulties have triggered several other philosophers of science to repair or to suggest alternative accounts, see for example Barnes (1992); Jones (1995); Halonen and Hintikka (1999); Weber and van Dyck (2002); Gemes (1994). Morrison (2000) contains a detailed account of all the prime examples of unification in physics, together with discussions of several philosophical issues about unification, e.g. unification and realism, unification and reduction and unification and explanation. A reasonable conclusion is that no convincing account of unification has so far been presented. So let us proceed to reduction.

6.3 Nagel on Theory Reduction Nagel’s account of theory reduction in Nagel (1961, 1970) might be viewed as a kind of unification in so far as theory reduction consists in showing that an entire theory is a special case of another more encompassing theory. In any case, and more importantly, Nagel himself held that theory reduction is explanation: Reduction, in the sense in which the word is here employed, is the explanation of a theory or a set of experimental laws established in one area of inquiry, by a theory usually though not invariably formulated for some other domain. (Nagel 1961, 338)

The basic idea is that reduction of one theory (set of laws) to another is achieved if (i) the laws in the reduced theory can be derived from the laws in the reducing theory and (ii) the concepts in the reduced theory can be defined in terms of those in the reducing theory. There is no requirement that the number of laws be reduced, in contrast to unification in Friedman's sense.


It is already clear from this formulation that the problems encountered in Friedman's account of unification will not occur in theory reduction thus conceived. And Nagel stressed that the formal criteria were not sufficient. Sarkar (2015) comments:

Nagel's analysis of reduction has two components: a formal model and an extended discussion of nonformal conditions that scientifically significant reductions should satisfy. These nonformal conditions are better regarded as substantive assumptions about reduction. . . . The relevant distinction is sometimes put forward as one between syntactic and semantic conditions. This is misleading since the relevant nonformal or substantive assumptions do not generally consist of interpretations (models) of uninterpreted structures; rather they often introduce new claims including contextual criteria about the roles and value of theoretical developments. (op.cit. p. 44)

The semantic conditions mentioned in the quote above are those of formal semantics in the vein of model theory in logic, and I agree with Sarkar that this notion of semantics is not relevant. What Nagel discussed, and what is of relevance, is how our theories relate to the real world. This topic, which may be called non-formal or empirical semantics, was briefly discussed in Sect. 3.5, where it was shown that a consequence of Löwenheim-Skolem's theorem is that the use of indexicals, together with pointing gestures, is necessary for establishing theory–world relations. This conclusion was by Luntley (1999) called 'Russell's insight':

The semantic power of language to represent derives from the semantic power of context-sensitive expressions. (p. 285)

Theoretical reduction may be called ontological unification because one kind of objects is shown to be a subclass of a broader category. This is not reduction in the sense that the objects in the reduced theory are eliminated. If we hold that science has succeeded in reducing thermodynamics to statistical mechanics, optics to electromagnetism and genetics to molecular theory, we still don't say that gases, light rays or genes are eliminated. But we know what kind of more general types of things each of these kinds of things belong to. Showing that a category of objects is a subclass of a broader category consists in showing that the laws for the broader category also apply to the objects talked about in the reduced theory; this is theoretical reduction. It is obvious that Friedman's problem with K-atomicity is no problem in Nagel's account. By their very nature theoretical predicates in the reducing theory are not K-atomic; that is why they are able to incorporate another theory.

Nagel's model of reduction was heavily criticised in the 1980s and 1990s. One critique was that one cannot derive the laws in the reduced theory in the strict logical sense of the word 'derivation'; one needs to make approximations and simplifying assumptions here and there in the derivations. Several other difficulties were also spotted, so around the year 2000 Nagel's model was considered refuted. But the tide has turned, and for a decade or so several authors have defended Nagel's model. Sarkar (2015) contains an overview of the debate and a strong defense of Nagel. He concludes:


What should a critical assessment of Nagel's model be? The discussion so far has defended Nagel on eight points against his critics and some "defenders" who have seriously misinterpreted him: (1) Most importantly, reduction is a type of explanation. (2) There is no problem with regarding reduction as a relation between theories so long as "theory" is interpreted broadly to include a wide variety of truth-bearing representational structures. (3) Consistent with the last two points, the pertinent issues about reduction are epistemological, rather than ontological. (4) The logical form of the connections between the reduced and reducing theories can be varied. In particular, they are not restricted to being synthetic identities. (5) The connections between the reduced and reducing theories need not be lawlike. Multiple realization is not a problem for reduction. (6) The reduced theory must be derived from the reducing theories with the help of these inter-theoretic connections. However, these derivations may involve context-dependent approximations including dapproximations - this means that what is derived may be an approximation to the reduced theory. (7) Elimination of the reduced theory (or the entities it postulates) is not a goal of reduction. (8) Valuable reductions typically enhance the further development of the reduced theory. (op.cit. p. 54)

Of these eight points I accept seven, both as correct descriptions of Nagel's views and as a description of theoretical reduction in physics. The point I disagree with is item 3, Sarkar's claim that the epistemological aspect is the pertinent one, not the ontological. I believe it is the opposite. Why do physicists and others think that, e.g., the reduction of optics to electromagnetism and of thermodynamics to statistical mechanics are great scientific achievements? It is certainly not because optics before Maxwell, or thermodynamics before its final reduction to statistical mechanics, were viewed with skepticism which finally was rebutted when optics and thermodynamics were shown to be parts of wider and better supported theories. Both optics and thermodynamics were well-established scientific theories by the middle of the nineteenth century. Thermodynamics was in fact better supported by evidence than statistical mechanics, and reduction attempts were for a long time not completely successful. The main problem was that the ratio Cp/Cv, i.e., specific heat at constant pressure over specific heat at constant volume, is different for different gases, and the observed values of this ratio could not be derived from classical statistical mechanics alone. One needs to take into account quantisation of energy levels in molecules, Bohr's hypothesis about stationary states, to get it right. This completed the reduction, which then was viewed as a great scientific achievement; not because it added evidence to thermodynamics, but as an ontological reduction: gases consist of molecules following the laws of classical mechanics plus Bohr's postulate. It seems clear that theoretical reductions were valued as great scientific achievements because they simplified ontology; light was shown to be electromagnetic waves and gases were shown to consist of particles that followed the laws of statistical mechanics.
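The quantitative point can be indicated with a standard textbook sketch (mine, not the author's): classical equipartition assigns (1/2)k_B T per quadratic degree of freedom, so

\[
C_{V} = \tfrac{f}{2}\, nR, \qquad C_{p} = C_{V} + nR, \qquad
\gamma = \frac{C_{p}}{C_{V}} = \frac{f+2}{f},
\]

giving \(\gamma = 5/3 \approx 1.67\) for a monatomic gas (f = 3) and \(\gamma = 7/5 = 1.40\) for a diatomic gas whose vibrations are frozen out (f = 5); counting the vibrational degrees of freedom classically would instead give \(\gamma = 9/7 \approx 1.29\), in conflict with room-temperature measurements. Which modes are frozen out is exactly what the quantisation of energy levels settles.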

Are there at present any attempts at theoretical reduction in physics? An obvious case is string theory, which unites the standard model, which is a quantum theory, and gravitation into one single theory. From the basic laws of string theory one gets the graviton, the particle carrying gravitational interaction. Thus a consequence of string theory is that gravitation also is quantized, and one has a fully unified theory describing all kinds of interactions.

String theory has now been pursued for more than forty years by a great many theoretical physicists. But alas, no testable new predictions are yet in sight, so one might think that string theory is a mistaken idea, a blind alley. In spite of this lack of empirical support, many theoretical physicists seem not to despair, however; they continue to work on this research programme. Why? I think there are two reasons for this stubbornness. One is that in cosmology one needs a theory unifying the standard model and general relativity for handling the states of black holes; here a unified theory is absolutely necessary. The other is, I think, a more metaphysical belief that it must be possible to construct a unified theory that covers all physical phenomena. Presently we have two theories: the standard model, describing how physical systems interact by exchanging quanta, and general relativity, built upon the assumption that spacetime can be described as a continuous manifold. Thus, these theories are built on contradictory postulates; the fundamental idea in the standard model is that interactions are quantized (see Chap. 16), whereas in general relativity interactions are described as continuous. Since the standard model unites three of the four types of interactions, i.e., the electromagnetic, the weak and the strong nuclear forces, it seems plausible to assume that the next step is to quantize also gravitation, which is done in string theory. So there is a strong belief among theoretical physicists that it must be possible to construct a theory that unites all four types of interactions into one unified theory, a TOE, a theory of everything, and string theory is a candidate satisfying this goal.2 Moreover, the basic postulate of string theory is that all objects are constituted of one-dimensional fundamental objects, strings, satisfying a simple principle, and this simplicity is deeply satisfying for physicists. Lee Smolin comments:

Indeed, the whole set of equations describing the propagation and interactions of the forces and particles has been derived from the simple condition that a string propagates so as to take up the least area in spacetime. The beautiful simplicity of this is what excited us originally and what has kept many people so excited; a single kind of entity satisfying a simple law. Smolin (2006, 184)

Thus, if successful, string theory would bring about the strongest possible theoretical reduction, leading to ontological unification; only one kind of object, strings, would be needed at the fundamental level and all physical phenomena could be derived from the basic law for these strings. But evidence is lacking. String theory is an illustration of my thesis in Chap. 3 that strong explanatory power is no evidence for truth. Physicists have been able to construct a mathematical structure that incorporates the standard model and general relativity theory, but this construction doesn't prove that other ways of uniting these two theories are impossible, as is explicitly shown by Jaksland (2019), to be further discussed in Sect. 17.5. Lacking empirical evidence for a theory, we have no reason to believe in its truth.

2 In Johansson and Matsubara (2011) we discuss string theory as driven by the goal of unification and evaluate it according to four well-known general methodologies, viz., logical positivism, Popper's falsificationism, Kuhn's theory of scientific revolutions and Lakatos' Methodology of Scientific Research Programmes.

6.4 Explanation and Understanding

Over the last decade a growing number of philosophers have assumed that the goal of explanation is understanding, see e.g. Faye and De Regt (2019) and references therein. For me this goes without saying, and I can only express astonishment that it has not been taken for granted from the very beginning of the debate about scientific explanation. I guess that the lack of interest in the goal of understanding in earlier times was based on the assumption that understanding is a purely psychological phenomenon, and philosophy of science has been focused on philosophical, i.e., logical, ontological and semantical aspects of science, not on its psychological aspects. But after the naturalistic turn in epistemology, psychological aspects of scientific thinking enter the picture; epistemological naturalism, see Sect. 3.3, is the stance of basing epistemology on observations of how we actually perceive and think, i.e., our psychology. We start with what appears to be valid empirical knowledge about our cognition; then, at a later stage of theoretical development, we may find reason to reject some beliefs about our cognitive activities if they conflict with the rest of our theory. There are no sacrosanct and indubitable truths in empirical science.

I'm repeating the argument of Sect. 3.3. Epistemological naturalism is the rejection of the view of epistemology as a first philosophy, independent of empirical findings. Starting with purely a priori principles one cannot arrive at any conclusions about the empirical world, nor about our relations to it.

If we hold, e.g., statistical mechanics to be true and can derive all of phenomenal thermodynamics from it (given some identifications of quantities, such as temperature with mean kinetic energy), we understand thermodynamical phenomena better. For example, one understands why the pressure of a gas confined within a closed container increases when the temperature increases. Theoretical reduction enhances understanding.

6.5 Summary

Most scientists and philosophers of science have a strong urge for explanation. The aim of explanation is understanding. Understanding nature consists in part of knowing what kinds of things there are and how they are related to each other. Therefore successful ontological reduction increases understanding and has explanatory value.


Understanding is a subjective phenomenon. People differ in many respects relevant for explanations, such as background knowledge, and they have different demands on the form of explanations. The reasonable conclusion to draw is that people will differ vastly concerning the effectiveness or value of a purported explanation. There is however substantial agreement about at least some explanations in the history of physics and chemistry, such as the explanation of a large number of observed regularities, including those of phenomenal thermodynamics, by the fundamental laws of classical mechanics, explanations of electric, magnetic and optical phenomena using Maxwell's equations, and explanations of the diversity of matter based on atoms. The common feature in these explanations is ontological reduction. It seems that in these cases Ockham's razor is operative in determining what an explanation should achieve, viz., ontological simplicity.

However, ontological reduction seems not to be a necessary condition for successful explanation. There are in the history of physics cases of successful explanations not consisting in a reduction of ontology. A particularly obvious case is the explanation of the constancy of free fall acceleration. Before learning classical mechanics, all people took for granted that heavier bodies fall with greater acceleration than lighter ones. Careful observations of falling bodies reveal, however, the falsity of this assumption: all bodies have the same free fall acceleration. This fact is explained by classical mechanics, so this theory has obvious explanatory value. It is hard to see any common formal or epistemological feature of all scientific explanations; they are irreducibly context dependent.
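For completeness, here is the textbook form of the free-fall explanation referred to above (a minimal sketch, assuming the equality of inertial and gravitational mass):

\[
m\,a \;=\; F \;=\; \frac{G M m}{r^{2}}
\quad\Longrightarrow\quad
a \;=\; \frac{G M}{r^{2}} \;\approx\; 9.8\ \mathrm{m/s^{2}} ,
\]

which is independent of the mass m of the falling body; M and r are the Earth's mass and radius.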

Chapter 7

Realism, Theory-Equivalence and Underdetermination of Theories

Abstract The focus of this chapter is the realism-antirealism debate in philosophy of science. The two central doctrines of scientific realism, viz., (i) central terms in mature theories refer to existing things, and (ii) scientific theories are approximately true, are discussed. In so far as 'term' is understood as 'general term', the first doctrine conflicts with nominalism and is thus rejected. The second doctrine, it is argued, should however be accepted also by empiricists. The underdetermination argument, generally thought to be a great hurdle for scientific realism, takes for granted that two empirically equivalent theory formulations really express different theories. This assumption can be countered by an argument taken from Quine, showing that empirically equivalent theory formulations can always be translated into each other. Hence, one may hold that empirically equivalent theories are merely different theory formulations of the same theory. Thus both empiricists and scientific realists can reject the underdetermination argument. The debate about versions of structural realism is then discussed. The notion that a mere structure, thought of as being invariant under isomorphisms, could represent the physical world is criticised on the basis of Löwenheim-Skolem's theorem. The ontology of quantities is the next topic. The crucial problem is identity criteria for quantities; since they are additive, one can for example ask how many forces are acting on a certain accelerating body. Is it one, or several? If the latter is the correct answer, what is the true decomposition of the vector sum? The conclusion is that quantities are not real things; quantitative predicates have extensions but no references.

7.1 The Physical Content of Theories

Physical theories rely heavily on the use of abstract mathematics, but they are intended to be about the physical world, not about mathematical entities. This raises semantical questions regarding the connection between the mathematical formalism of a theory and the theory's claims concerning physical reality. Two positions regarding this relation, at each end of the spectrum, are logical positivism and scientific realism.


The logical positivists held that theoretical sentences are mere calculational devices, devoid of content. Theoretical sentences lack truth value; only descriptions of observable phenomena are truth-apt. Scientific realists take the opposite view, claiming that (i) the best current scientific theories are at least approximately true and (ii) the central terms of the best current theories are genuinely referential (Leplin 1984, 1). In short, theories in the current sciences truly, but partly, represent the world.

Hawking and Mlodinow (2010) caused quite a stir by claiming that philosophy is dead, referring to philosophical debates about realism in science. But despite this claim, they did not hesitate to comment on the debate about realism concerning scientific theories. They introduced the term 'model-dependent realism', writing that,

Model-dependent realism short-circuits all this argument and discussion between the realist and anti-realist school of thought. According to model-dependent realism, it is pointless to ask whether a model is real, only whether it agrees with observation. (Hawking and Mlodinow 2010, 45-46)

This sounds pretty similar to classical anti-realist instrumentalism; why is it called a species of realism? The position is hardly original. The weak points of logical positivism were thoroughly discussed in Chap. 3; I just want to repeat my criticism that it is incoherent to use a theoretical sentence, which according to logical positivism is not truth-apt, as a step in a derivation in which each step is justified as being truth-preserving. And there is a huge literature criticising other aspects of logical positivism. Few, if any, nowadays regard it as a viable view. Van Fraassen developed a better version of empiricism in his (van Fraassen 1980), and a still better position, 'the empirical stance', as his (van Fraassen 2002) is called. I have discussed these views in Chap. 3, concluding with my own version of empiricism, tentatively called 'nominalistic empiricism'. Therefore I will in this chapter only discuss scientific realism, the alternative to empiricist views. What are the strengths and weaknesses of this position?

7.2 Arguments About Scientific Realism

The two theses cited above are the common ground for all scientific realists. I accept (i) but not (ii), insofar as 'central term' is understood as 'general term'. But I am skeptical both towards the realists' no-miracle argument and towards the counter-argument, viz., underdetermination of theory by evidence. The main argument for scientific realism, the so-called 'no-miracle argument', has it that the tremendous success of modern science would be a miracle if it were not the case that modern scientific theories were at least approximately true and that those things they postulate really exist. In short, scientific realism is the best explanation for the success of science.


This argument appears initially convincing and in perfect accord with how scientists talk. Particle physicists, for example, talk without hesitation about electrons, quarks, photons etc., and if we were to ask them whether they believe in the existence of those things I guess they would answer 'Sure'. On the question of truth, however, scientists hesitate. Instead of claiming that their theories are true, or approximately true, they often say that they have models. By using this word they indicate limits of scope. A model is intended to be an at least approximately correct description of classes of phenomena under some more or less tacit restrictions. In fluid mechanics, for example, substances such as water, air or oil are modeled as portions of continuous matter, thus neglecting the fact that they consist of molecules. But the assumption of continuous matter simplifies the calculations, and one can estimate under which conditions neglecting the granularity of the substance is irrelevant. Thus, in particular applications where the scope restrictions are satisfied, models are viewed as true descriptions of the phenomena. Still, the word 'true' is avoided in order not to suggest that all parts of the model should be taken as literally true. I would say that if we are able to make a number of testable predictions by using a certain model, it is on a par with theories in being truth-apt. Talk about models instead of theories is a way of signaling that there are limits of application, something which is left more implicit in talk about theories. Thus, in the debate between scientific realists and their critics one may take the term 'theory' to include also what is often called 'models'.

I concur with scientific realists in holding that at least the best theories in mature sciences, for example physics, are at least approximately true. But their argument that the success of science would be a miracle if this were not the case does not convince. It is one thing to claim that we have evidence enough for believing a theory to be true, another to claim that the truth of the theory explains anything.

There are three main arguments against scientific realism. The first is van Fraassen's and many others' critique of the no-miracle argument. Van Fraassen claims that the best explanation for the success of science is that scientific theories are empirically adequate, not that they are approximately true; the truth of their theoretical parts doesn't increase the success rate. It is obvious that in the dispute with van Fraassen scientific realists beg the question. The second argument is that explanatory power is no evidence for the truth of a theory, since explanatory power is a pragmatic, not an epistemic, feature of theories (van Fraassen 1980, ch. 5). In Chap. 3 I discussed this matter, agreeing with van Fraassen. Realists, of course, dispute this argument, holding that explanatory power in part is an epistemic feature, a reason for belief. The third argument against scientific realism is underdetermination of theory by data; it is claimed that one can always conceive of an alternative theory to a presently successful one, an alternative that has exactly the same empirical consequences, but which postulates other objects with other properties. Hence there is no valid reason to believe that our presently cherished theory is in fact true. In short, theory is underdetermined by data. Realists have generally accepted this as a grave problem for realism. But the underdetermination argument does not convince me.


7.2.1 Defusing Underdetermination

I am skeptical about the correctness of the underdetermination thesis because it presupposes that we have a criterion for distinguishing empirically equivalent theories, and what could that be? Some suggestions have been made, but so long as no generally agreed answer is in sight the underdetermination argument is weak indeed. All agree that merely using different vocabularies is not sufficient; words can be replaced by other words without affecting what the theory says about the world. The term 'theory' is supposed to refer to an abstract object and we are justified in asking for an identity criterion: what is required for two theory formulations to be formulations of the same theory?

Another obvious example of a trivial difference between theory formulations occurs if we conjoin a theory T with an additional sentence S having no logical connection to T and no empirical consequences of its own. The difference between T and T+S is obviously trivial; we have no good reason to say that T+S is another theory than T; S consists merely of some extra words without any connection to reality. Surely, if we say that T1 and T2 are different but empirically equivalent theories, not just different formulations of the same theory, we must be able to tell their difference. Only if we have an identity criterion and its converse, a principle of individuation, for theories, and can show that the suggested alternative theories fail the identity criterion, do we have good reason to claim that we have different theories. In my view, the default option when confronted with two empirically equivalent theories is that they are just different formulations of the same theory. Many philosophers of science seem to take the opposite view that by default two theories are different things, unless they satisfy an identity criterion.

At first sight it appears obvious that if two theories T1 and T2, while being empirically equivalent, postulate different kinds of unobservable objects, they tell different stories about the world. But on second thought one may ask: how do we know that the objects postulated in the two theories really are different kinds of things? Perhaps the two theories just give different labels to the objects talked about? For example, isn't it very reasonable to say that describing something as a body at a certain place during a certain period of time and talking about the material content of that same spacetime region endowed with a certain degree of hardness are just two different ways of talking about the same thing? What is to be counted 'same object' and 'same theory'?


Quine (1981b, 19) has in fact provided a strong argument against underdetermination, albeit it was not so labelled by him. Consider two theories T1 and T2 which have exactly the same empirical consequences; they are empirically equivalent. (This might be difficult to establish, but that is another question.) T1 postulates a set of objects {O^1_n} and uses a set of predicates P^1_1, P^1_2, ..., P^1_k to describe these objects. T2 postulates another set of objects {O^2_m} and uses another set of predicates P^2_1, P^2_2, ..., P^2_j. Those who adhere to T1 say that their theory is true about its objects, and adherents to T2 hold that their theory is the true one about the objects belonging to the set {O^2_m}. Since these theories are assumed to be empirically equivalent, they imply exactly the same observation sentences. It is now always possible to construct a mapping between {O^1_n} and {O^2_m} in such a way that true sentences in T1 are mapped onto true sentences in T2. The mapping need not be one-to-one. Predicates in T1 and T2 may sort the world differently, and one may map an object in T1 onto a complex of objects in T2, and vice versa. Quine calls such a mapping a 'proxy function'. The result is that an adherent to T1 can say that T2 is merely a reformulation of my theory: a sentence S in T2 is the map of my sentence S*, and (sets of) objects postulated in T2 are maps of those objects I postulate in my theory. An adherent to T2 can similarly say that T1 is a reformulation of his theory. So long as they agree on all observable consequences of their respective theories this mapping is always possible.
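A toy illustration of a proxy function may be helpful (the example and code are my own construction, not Quine's): T1 talks about bodies being hard, T2 about spacetime regions being filled with hard stuff, and a mapping of objects and predicates carries the sentences held true in T1 onto sentences held true in T2.

# Two toy 'theories' with different domains and predicates, and a proxy
# function that preserves the truth values of all atomic sentences.

t1_domain = {"body1", "body2"}
t1_extensions = {"Hard": {"body1"}}           # 'Hard' is true of body1 only

t2_domain = {"region1", "region2"}
t2_extensions = {"HardStuffAt": {"region1"}}  # 'HardStuffAt' is true of region1 only

proxy = {"body1": "region1", "body2": "region2"}   # object mapping
predicate_map = {"Hard": "HardStuffAt"}            # predicate reinterpretation

def true_in(extensions, predicate, obj):
    # An atomic sentence 'predicate(obj)' is true iff obj is in the predicate's extension.
    return obj in extensions[predicate]

# Every atomic sentence of T1 is mapped onto a T2 sentence with the same truth value:
for pred, t2_pred in predicate_map.items():
    for obj in t1_domain:
        assert true_in(t1_extensions, pred, obj) == true_in(t2_extensions, t2_pred, proxy[obj])
print("truth values preserved under the proxy function")

The sketch only checks atomic sentences over finite domains, but it indicates the sense in which an adherent of T1 can treat T2 as a relabelling of T1's own ontology.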

Quine doesn’t discuss underdetermination, nor identity criteria for theories, in this paper, but it is a little step to say that if T1 and T2 are empirically equivalent, they are merely two formulations of the same theory, since reinterpretation as described above always is possible. The underdetermination argument is thus defused. Furthermore, since it is always possible to construct a mapping, a proxy function, between two empirically equivalent theories, one may conclude that an empiricist may consistently hold empirical equivalence to be the proper identity criterion for theories.1 Scientific realists would certainly disagree. They might bring forth the argument that even though one can map the extension of a predicate in T1 (i.e. a subset of those objects postulated in T1) onto the extension of a set of predicates in T2, (under the condition that truths in T1 are mapped onto truths in T2) and vice versa, the two theories postulate different properties and relations. Co-extensionality of predicates is not sufficient for saying that T1 and T2 are different formulations of the same theory, if one is a realist about properties; two distinct properties can be co-extensional. This is a possible argument for scientific and metaphysical realists since they accept universals in their ontology and the difference between T1 and T2 may be that they postulate different properties and relations in the world. However, this has not been the main realism response. Many realists have instead moved to a position called structural realism.

1 Halvorson (2019, section 4.5) has, inspired by Quine, discussed translations between theories. But there is a crucial difference between Halvorson's and Quine's views: Halvorson presupposes that the objects quantified over can be identified and individuated independently of the theory at hand, while Quine holds, as usual, that individuation and identity among objects are entirely theory-dependent. I believe Quine is right on this matter.


7.2.2 Structural Realism

Structural realism is a partial retreat from scientific realism triggered by underdetermination arguments. The core idea is that only what is common to a set of theories having the same empirical support is real, and this common feature is said to be the structure, not its objects. Hence Ladyman and Ross (2007) called their influential book 'Every Thing Must Go'. Structuralism had an influential precursor in Carnap. In his (Carnap 1928/2003) he wrote:

For science wants to speak about what is objective, and whatever does not belong to the structure but to the material (i.e., anything that can be pointed out in a concrete ostensive definition) is, in the final analysis, subjective. One can easily see that physics is almost altogether desubjectivized, since almost all physical concepts have been transformed into purely structural concepts. . . . From the point of view of construction theory, this state of affairs is to be described in the following way. The series of experiences is different for each subject. If we want to achieve, in spite of this, agreement in the names for the entities which are constructed on the basis of these experiences, then this cannot be done by reference to the completely divergent content, but only through the formal description of the structure of these entities. (p. 29)

By ‘construction theory’ Carnap refers to the construction of a formal system in which the concepts forming the subject-matter of the system are introduced by means of axioms and definitions. However, it is far from clear to me what Carnap refers to with ‘formal description of the structure of these entities’. In any case, he apparently dismissed things identified by being ‘pointed out in a concrete ostensive definition’, since these are subjective things. On this point I beg to disagree. Our basic observations in physics are observations of bodies moving around, hence Carnap dismisses such observations as subjective. The very act of making the observation is of course a subjective phenomenon, but the observation report is intersubjectively available. And if several people on the spot agree on the truth of an observation report, this is sufficient as objective knowledge, since it is intersubjective. Objectivity in the sense of correspondence with facts is a metaphysical notion which has no place in science. Structural realists elaborated on Carnap’s idea, seeing the structure of a theory as its crucial aspect. They gave up one of the realist doctrines, viz. that central terms refer to real existing things, while retaining the other, viz., that theories are at least approximately true. A theory is true if its structure maps the structure of the relevant part of reality. But isn’t there a profound difference between Carnap and structural realists? Carnap talked about structural concepts, and Carnap was certainly no metaphysical realist believing that concepts referred to universals, whereas structural realists seem to hold that structures are real. For Carnap ontological questions are pseudo-questions; his central idea was that objects were relative to framework and frameworks can be chosen at will. Structural realists, not distinguishing, as Carnap did, between framework and empirical content, drew the conclusion that there could not be any reason to believe

Structural realists, not distinguishing, as Carnap did, between framework and empirical content, drew the conclusion that there could not be any reason to believe in individual objects; things must go. But what, then, is left of realism? What is real? Structure, they said; structures are what exist. Structural realism divides into epistemic and ontic structural realism. Epistemic structural realism is the stance that structure is what is saved during theory changes and that this is what is knowable. Ontic structural realism (OSR) goes further and claims that structure is all there is. But what more exactly is a structure? Could one say that structures are those things that relate non-existent but purportedly existing individual things? No, that would be analogous to the notion that the smile of the Cheshire cat2 remains when the cat has disappeared.

OSR is inspired by modern physics, in particular quantum field theory and general relativity. Those theories are built upon abstract set-theoretical structures such as Hilbert spaces and differentiable manifolds. These are supposed to map physical structures. Here is how Ross and Ladyman express OSR:

According to OSR, if one were asked to present the ontology of the world according to, for example, GR one would present the apparatus of differential geometry and the field equations and then go on to explain the topology and other characteristics of the particular model (or more accurately equivalence class of diffeomorphic models) of these equations that is thought to describe the actual world. There is nothing else to be said, and presenting an interpretation that allows us to visualize the whole structure in classical terms is just not an option. Mathematical structures are used for the representation of physical structure and relations, and this kind of representation is ineliminable and irreducible in science. Hence, issues in philosophy of mathematics are of central importance for the semantic approach in general and the explication of structural realism in particular. (Ladyman and Ross 2007, 159)

So what really exists, according to OSR, are physical structures which are represented by mathematical structures. The individual objects, the nodes in these structures, do not really exist. But isn’t that analogous to saying that the smile of the Cheshire cat exists but not the cat? This disastrous conclusion may be avoided if structures are conceived as individual things, not relations, which is what a nominalist perhaps could do, although I don’t see how singular terms for structures, without placeholders for the objects thus structured, would be possible. In any case we need clear identity criteria for structures. One choice would be to state them in terms of equivalence classes under isomorphisms. But that applies only to mathematical structures; the crucial question is how these relate to physical structures. Let us recall Löwenheim-Skolem’s theorem and Russell’s insight, see Sect. 3.5. We need non-theoretical resources, indexicals and pointing gestures, in order to connect our theory to physical reality. How do we identify the physical counterpart to a certain equivalence class of structures by pointing? I don’t see how that could be done.

2 The Cheshire cat is a figure in Lewis Carroll’s Alice in Wonderland.


Isomorphisms and other mathematical objects are extremely valuable when making inferences in physics. But their utility does not entail that these mathematical objects represent physical entities.

Recently some authors (including Keizo Matsubara and myself in a joint paper) have arrived at a similar stance when it concerns spacetime in string theory. String theory requires a 10-dimensional manifold, and for quite some time string theorists believed that physical spacetime therefore must be 10-dimensional, if string theory is true or approximately so. But the existence of dualities, in particular AdS-CFT (Anti-de Sitter/Conformal Field Theory) duality, undermines this conclusion. In this kind of duality, a theory set in X dimensions can be physically equivalent to another in Y < X dimensions. This suggests that the dimensionality of the space in which string theory is formulated need not have anything to do with the dimensionality of physical spacetime; see further the discussion in Matsubara and Johansson (2018) and references therein.

OSR is a response to the underdetermination problem, but it is an awkward, not to say inconsistent, position, claiming that relations exist without relata. A much better answer to underdetermination is to follow Quine: objects in one theory can always be mapped onto objects in an empirically equivalent theory satisfying the condition that sentences held true in one theory are mapped onto truths in the other theory. It is of course crucial that identification and individuation of objects depend on theory formulation, but there is no other option; how else would we individuate unobservable objects postulated in physical theories? Structural realists are right in stressing the importance of structural features of physical theories. But they are wrong to reify them and wrong to claim that singular terms do not refer. If we want a plausible account of how we obtain knowledge in physics, it is the other way round. This issue, how to give a semantics based on empiricist principles for physical laws, is further discussed in Chap. 10.

7.3 Existence

Reflections about what exists are inevitable when one hears about modern physics. Are space and time really amalgamated into spacetime? Does spacetime exist in itself, independently of any stuff filling it? Are there really gravitational forces, or are they merely an effect of the curvature of spacetime? Is it really possible that our universe came into being out of absolutely nothing at the Big Bang? Are there really in the world such weird things as objects which sometimes behave as particles, sometimes as waves? Are there really six compactified spatial dimensions, as is suggested by string theory? Questions about existence multiply when one learns more and more about modern physics. If there is any discipline that triggers ontological questions, it is physics.

Ontological questions are intimately connected with semantics. In ordinary parlance we take for granted that when we ascertain a sentence, we also ascertain the existence of the referent of the singular term (or terms) in that sentence. But troubles immediately arise when using an empty name in making a true assertion. Hence
we need to paraphrase ordinary sentences into first order logic and use Russell’s invention of translating names as definite descriptions together with quantifiers and variables to avoid such formulations. The connection between semantics and ontology can thus be stated: when a true sentence is paraphrased in first order logic, those things being the values of bound variables in this sentence must exist.

Now there is often a choice regarding how to formulate a physical theory in first order logic. Consider for example electromagnetism, whose ontology has been debated for at least 70 years: is it a theory about charged particles, a theory about fields, or a theory about both? (This question will be discussed thoroughly in Chap. 11.) The theory itself doesn’t tell us what exists and thus not which is the correct paraphrase in first order logic. We have determined what exists only when we have determined our preferred, or best, or correct way of expressing our believed theory in first order predicate logic.

The debate about the ontology of electromagnetism is not a debate about the existence of middle-sized bodies which we directly observe; such things are presupposed in applications of electromagnetic theory. The question is what more to accept in our ontology. Since philosophers may, and in fact do, disagree about ontology without disagreeing about the truth of observation sentences, it is clear that ontology is not fully determined by empirical evidence. Empirical evidence consists of intersubjectively agreed observation sentences; we may agree about their truth while disagreeing about their semantics, i.e., how to paraphrase the theory into first order predicate logic and whether the general terms used in these observation sentences refer or not. And realists also accept universals as part of ontology, while some empiricists do not.

Surely, most people never express their beliefs in first order logic, while still believing many things to exist. But the point is not what individuals believe: the point of the discussion is what a theory says there is, and in order to establish that, we need to express the theory in some language which allows expressions of the form ‘there are things such that . . .’; this is done when we express our theory in first order predicate logic. An expression of this form is formalised as the existential quantifier together with a variable, and quantification is done over the objects of discourse, hence the label ‘objectual quantification’. We accept the things in the domain we quantify over. But this is not the only possible option; there is an alternative view, substitutional quantification, proposed by Ruth Barcan Marcus. An existence claim ∃xAx is then understood to be true if there is a proper name t such that when we substitute t for the variable x in the open sentence Ax, we get a true sentence. This interpretation of quantification is proposed as a solution to the problems of quantification into modal and intensional contexts; see her ‘Modalities and Intensional Languages’ and ‘Quantification and Ontology’, reprinted in Barcan Marcus (1993). We empiricists are in general skeptical towards quantified modal logic; it raises difficult epistemological questions. The formal semantics of quantified modal logic in terms of possible worlds is fine, but how do we know what is true in a possible but non-actual world? At best we know some truths in our world. Therefore, from an empiricist point of view one had better avoid delving into quantified modal logic.
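
To illustrate the paraphrase strategy, and the difference between the two readings of the quantifier, with a schematic example (the predicate letters are of course arbitrary and not drawn from any particular theory): a vernacular sentence of the form ‘The F is G’, where the name has been traded for a definite description, is paraphrased as

∃x(Fx ∧ ∀y(Fy → y = x) ∧ Gx)

On the objectual reading, holding this sentence true commits us to an object in the domain of quantification satisfying F; on the substitutional reading it is true provided some proper name t makes the open sentence Fx ∧ ∀y(Fy → y = x) ∧ Gx true when t is substituted for x, and no commitment to objects in a domain is incurred.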


In Chap. 10 I will give my analysis of the necessity of laws without interpreting ‘necessarily’ as a modal operator in the object language. So I see no need to further consider substitutional quantification in this context.

7.4 Are Physical Quantities Real?

7.4.1 Universals

Medieval philosophers distinguished between three metaphysical doctrines: (i) accepting universals ante res, i.e., metaphysical realism, (ii) rejecting universals ante res but accepting universals in rebus, i.e., that universals exist only as instantiated in objects, and (iii) rejecting both universals ante res and in rebus, while accepting that words and ‘words in the mind’, i.e. concepts, are universals, which is medieval nominalism. Empiricists naturally join the nominalists in rejecting both universals ante res and in rebus; no empirical evidence for the existence of such things is possible.3

What are, then, the consequences for the status of physical quantities? I accept all presently accepted physical theories as at least approximately true; in this respect I agree with scientific realists. Theories, when explicitly stated, consist of sentences. These sentences can be paraphrased in first order predicate logic. Having done so, we can discern what ontological commitments we have made, viz., those things needed as values of bound variables. Thus referents of singular terms and bound variables occurring in a thus paraphrased theory, which we hold true, must be accepted in the ontology, no matter whether they are observable or not. But properties or relations, purported referents of general terms, are not needed since we only utilise first order logic. We need not assume that general terms in our theories have references, only that they have extensions. Nominalists and empiricists need no universals.

Scientific realists may protest by saying that we actually need properties and relations when formulating scientific explanations. My answer, given in Sect. 3.8, is that scientific explanation has no evidentiary value. Empirical evidence for a physical theory consists only of observation reports, sentences that say the same, or nearly the same, as generalised observation conditionals derived from the theory. Explanations belong to pragmatics.

3 Nor are universals in mind or language assumed by modern empiricists. The distinction between individual and universal words and mind states is nowadays made using the token-type distinction.


7.4.2 Physical Quantities

Physical quantities such as LENGTH, TIME, CHARGE and FORCE are general terms with agreed criteria of application, which are explicitly stated in the SI system. Since we empiricists dismiss universals, we also dismiss quantities conceived as universals. Therefore I may be allowed to use the word ‘quantity’ as shorthand for ‘quantitative predicate’ in what follows.

But what about definite descriptions containing quantity words, as in ‘A force of √89 newton acted on the body causing its acceleration’? If we construe this sentence as consisting of two singular terms, ‘a force of √89 newton’ and ‘the body’, and a general term ‘. . . acted on . . .’, this particular force might be viewed as an entity without assuming any universal. But I see a great obstacle to adopting this perspective: how are forces in this sense individuated?

Forces are represented as vectors, and in mathematics we have no problem with individuating numbers and vectors. The number 5 can be referred to by e.g. ‘3+2’, or ‘20/4’, or ‘the third prime’, and in many other ways. Similarly, a vector in three-dimensional space such as (3, 4, 8) can be given in a number of different ways, for example as (1, 1, 1) + (2, 3, 7). Now, let us assume that this mathematical vector represents a force acting on a body. Should we say that one force of the strength √89 N acts on this body, or should we say that two forces with different directions, one in the direction (1, 1, 1) with the strength √3 N and one in the direction (2, 3, 7) with the strength √62 N, act on the body so that their vector sum is √89 N? Or should we choose any other composition of two or more forces adding up to √89 N in the direction (3, 4, 8)? Obviously we can’t say that all three forces, the two components as well as their sum, exist, for in so doing we would have doubled the action on the body, giving it double the acceleration.

The natural move for anyone accepting forces is to say that it is the component forces that exist and act on a moving body. And we know how to identify the component forces acting on a body; it is done by identifying their sources and the kinds of interactions that the body of interest is involved in. For example, the gravitational force on a body has its sources in other bodies and electromagnetic forces come from charged particles. Let us as a simple example consider the gravitational forces on the earth. We can in practice neglect all other bodies except the sun and the moon, so we have one force coming from the sun and one from the moon. This is the physically correct way of dividing the total force on the earth into components, and so we have a way to individuate forces. But this just highlights the point that forces are dispensable as part of the ontology of physics, because the individuation of forces crucially depends on the individuation of bodies. If it is impossible to identify a force except by identifying its source, why say that it is an entity in its own right? Using Ockham’s razor we may simply say that it is an attribute of that body, or an attribute of pairs of bodies, as when we use expressions of the type ‘force between . . . and . . .’. Similar arguments can be constructed regarding all quantities; they are additive and the question of the correct decomposition can always be raised.
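
To make the arithmetic of this example explicit, here is a minimal Python check of the decomposition used above (the numbers are just those in the text; the script is merely illustrative):

    import math

    f_total = (3, 4, 8)          # total force vector, components in newtons
    f1 = (1, 1, 1)               # one candidate component force
    f2 = (2, 3, 7)               # the other candidate component force

    def magnitude(v):
        return math.sqrt(sum(x * x for x in v))

    # The components add vectorially to the total force ...
    assert tuple(a + b for a, b in zip(f1, f2)) == f_total

    print(magnitude(f1))         # sqrt(3)  ≈ 1.73 N
    print(magnitude(f2))         # sqrt(62) ≈ 7.87 N
    print(magnitude(f_total))    # sqrt(89) ≈ 9.43 N
    # ... but their strengths do not: sqrt(3) + sqrt(62) ≈ 9.61 N, not sqrt(89) N.

The decomposition is mathematically arbitrary; which decomposition is the physically correct one is settled, as argued above, only by identifying the sources of the interactions.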

7.5 The Use of ‘Model’ in Physics

When physicists talk about models they mean simplified descriptions of classes of concrete physical systems. One example is the simple harmonic oscillator, which is said to be a model of pendulums and other oscillating systems. The core element in this model is the equation d²θ/dt² + (k/m)θ = 0. Physicists interpret this equation as relating physical attributes of one and the same system to each other; θ is the system’s angular deviation, t its time, k its spring constant and m the mass of the oscillating system. In short, the equation by itself doesn’t contain any information telling us that it is about physical phenomena; this information is contained in the background knowledge among physicists. This background consists of three implicit assumptions: (i) the variables take numbers as values, (ii) units are attributed in a standard way, tacitly taken for granted by all physicists, to these numbers and (iii) these numbers with their units are attributed to one and the same physical object. Taking this background knowledge into account we understand how models are related to concrete physical phenomena.

When we call the harmonic oscillator a model of pendulums we indicate that there is not a perfect match between the dynamics of any real physical pendulum and that of the harmonic oscillator. We make at least three simplifying assumptions about the real pendulum. The first is to restrict its amplitude so that one can put sin θ ≈ θ. The second is to assume that the length of the string doesn’t change (i.e., that it is completely inelastic!) and the third is that we can neglect damping. All three conditions are strictly speaking impossible to satisfy in real cases, but it is possible to calculate how close to this theoretical model a certain pendulum is. So we can with a fair amount of justification say that the harmonic oscillator equation is a model of a particular physical pendulum, or of a class of such things, given some approximations. Models are abstract representations of physical systems.
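
As a rough indication of how good the first simplifying assumption is, one can compute the relative error of replacing sin θ by θ at various amplitudes; a minimal Python sketch, purely illustrative:

    import math

    # Relative error of the small-angle approximation sin(theta) ≈ theta,
    # the approximation behind the harmonic-oscillator model of a pendulum.
    for deg in (1, 5, 10, 20, 30):
        theta = math.radians(deg)
        rel_error = (theta - math.sin(theta)) / math.sin(theta)
        print(f"{deg:2d} degrees: relative error {rel_error:.3%}")

For amplitudes below about five degrees the error is of the order of a tenth of a percent, while at thirty degrees it is several percent; this is one simple way of quantifying how close a particular pendulum is to the model.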

There is also another use of the term ‘model’, adopted by e.g. Suppes (2003), van Fraassen (1980) and Halvorson (2019, 22). This usage comes from Tarski, who conceived of a theory as a set of uninterpreted formulas containing individual constants and variables. When the individual constants are given values and the domains of the variables are identified such that all sentences in the theory are true, we have a model of the theory. If the domains of some variables are constituted by physical objects we have a physical model of the theory. The model is thus something concrete, a system of physical objects. Thus Suppes writes, quoting Tarski: ‘A possible realisation in which all valid sentences of a theory T are satisfied is called a model of T.’ (Suppes 2003, 17) and later he refers to this quote in the following sentences:

It is very widespread practice in mathematical statistics and in the behavioral sciences to use the word model to mean the set of quantitative assumptions of the theory, that is the sentences which in a precise treatment would be taken as axioms or, if they are themselves not explicit enough, would constitute the intuitive basis for formulating a set of axioms. In this usage a model is a linguistic entity and is to be contrasted with usage characterized by the definition from Tarski, according to which a model is a non-linguistic entity in which a theory is satisfied. (op. cit. p. 20)

He continues: It does not seem to me that these (the different usages of ’model’) are serious difficulties. I claim that the concept of model in the sense of Tarski may be used without distortion as a fundamental concept in all of the disciplines from which the above quotations are drawn. In this sense I would assert that (the meaning of) the concept of model is the same in mathematics and in the empirical sciences. The difference between these disciplines is to be found in their use of the concept.

Suppes never mentions the usage of the word ‘model’ in the sense of a simplified description of a portion of reality, a physical, chemical or biological system. This usage is totally different from Tarski’s use of ‘model’. In Tarski semantics (and thus in the semantic view of theories) the second term place in the predicate ‘. . . is a model of . . .’ is to be filled with terms for theories, i.e., abstract entities, whereas in the empirical sciences the second term place is to be filled with terms for portions of the empirical world. So the extension of the predicate ‘. . . is a model of . . .’ differs sharply between the two ways of using this expression. From an empirical point of view, those entities called ‘models of a theory’ are just as theoretical as theories. Such things have in themselves no connections to reality, despite being called ‘models’. As earlier pointed out, non-theoretical elements, indexicals together with pointing gestures, are needed for connecting a theory, hence also a model in the sense of Tarski semantics, to something in the natural world. This is an immediate consequence of Löwenheim-Skolem’s theorem, see Sect. 3.5.

7.6 Theories of Principle vs Constructive Theories

As was argued in Sect. 5.6 (and I return to this topic in Sect. 11.5), some fundamental scientific predicates are the results of generalisations of repeated observations. Many disagree. There is a debate about this topic, which started with some comments by Einstein, who in Einstein (1919(1954)) distinguished between theories of principle and constructive theories. As an example of a constructive theory Einstein mentioned the kinetic theory of gases, where from the hypothesis that gases consist of molecules moving around one constructs descriptions of observable phenomena. As an example of a theory of principle he mentioned his own theory of relativity, where observed empirical regularities are generalised to higher principles. The advantage of theories of principle is, according to Einstein, their logical perfection and the certainty of their basis, whereas the advantage of constructive theories is their visualisability, completeness and versatility. Einstein did not claim any advantage of one type of theory over the other. Later philosophers have found Einstein’s description of constructive theories close to the view that a theory is a set of models. Thus Balashov and Janssen (2003) write:

In a theory of principle, one starts from some general, well-confirmed empirical regularities that are raised to the status of postulates (e.g., the impossibility of perpetual motion of the first and second kind, which became the first and second laws of thermodynamics). With
such a theory, one explains the phenomena by showing that they necessarily occur in a world in accordance with the postulates. Whereas theories of principle are about the phenomena, constructive theories aim to get at the underlying reality. In a constructive theory one presupposes a (set of) model(s) for some part of physical reality (e.g., the kinetic theory modeling a gas as a swarm of tiny billiard balls bouncing around in a box). One explains the phenomena by showing that the theory provides a model that gives an empirically adequate description of the salient features of reality. (Balashov and Janssen 2003, 331)

Constructive theories are thus seen as examples of model theory, in which a theory is conceived as a family of models. (This is Tarski’s use of ‘model’ as discussed above.) A successful physical theory has models that are intended to represent real phenomena, thereby explaining them. The relation between model and reality is rarely spelled out in detail. Van Fraassen described the relation as an isomorphism: that a theory is empirically adequate means that it has a model such that a substructure of the model is isomorphic to the phenomena (van Fraassen 1980). It is far from clear to me what this more exactly amounts to in a concrete situation. The common way of displaying fit between theory and observations is to calculate a function f(x) whose values can be observed and to display, in the same picture, the theoretical curve f(x) together with a set of observed values with error bars. If the curve always, or nearly always, lies within the error bars of the data points, we conclude that theory fits observations. Would van Fraassen call such a fit an isomorphism? Well, in a loose sense one might describe a good fit between data points and a theoretically calculated curve as an isomorphism. But it is hard to see this fit as an instance of an isomorphism in the mathematical sense. However, wasn’t van Fraassen’s aim to provide a rather precise and stringent account of the relation between theory and reality?
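
A minimal Python sketch of the fit criterion just described (the theoretical function and the data are invented for illustration): the theoretical curve counts as fitting the observations if it passes within the error bar of every data point.

    def fits_within_error_bars(theory, xs, ys, errs):
        # True if theory(x) lies within y ± err for every observed data point.
        return all(abs(theory(x) - y) <= err for x, y, err in zip(xs, ys, errs))

    def theory(x):
        # hypothetical theoretical prediction f(x) = 2x + 1
        return 2 * x + 1

    xs = [0.0, 1.0, 2.0, 3.0]             # invented measurement points
    ys = [1.1, 2.9, 5.2, 6.8]             # invented observed values
    errs = [0.2, 0.2, 0.3, 0.3]           # invented error bars

    print(fits_within_error_bars(theory, xs, ys, errs))   # True for these numbers

Whatever one wants to call this relation, it is a comparison of numbers, not an isomorphism in the mathematical sense.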


But let us return to the claim that constructive theories explain phenomena. At least in some cases we get a feeling of having got a fine explanation when deriving an observed regularity from some fundamental principles. For example, when seeing the derivation of the general law of gases from the principle of energy conservation in ensembles of randomly moving molecules, we understand why a gas in a closed container becomes warmer when the pressure is increased. But would we say we understand this phenomenon if the assumption that gases consist of ensembles of particles were a completely new idea, never suggested before? I think not. I believe that the explanatory force of the derivation of the law of gases crucially depends on the fact that atoms and molecules were accepted as the constituents of matter in advance and independently of any knowledge about gases. This was discussed in the previous chapter.

Theories of principle are said to merely describe phenomena, not to explain them. But what exactly is the criterion for distinguishing between descriptions and explanations? Another way of posing the question is: what explains what? Acuna (2016) claims that in the specific case of the debate whether Minkowski spacetime explains Lorentz invariance or whether it is the other way round, both parties base their arguments on tacit ontological assumptions; those who hold that Minkowski spacetime explains Lorentz invariance base their view on substantivalism about spacetime, those who hold that Lorentz invariance explains Minkowski spacetime are relationalists. I think Acuna is right in this assessment; I only want to add that this is one more illustration of my thesis in Sect. 3.8 that explanatory value is strongly context-dependent. The basic question is what you accept as starting points, explanans, in an explanation. We empiricists ask for principles that are generalisations of observations; metaphysicians ask for something more profound. Metaphysicians usually propose something claimed to be known a priori. But how could one derive empirical consequences from purely a priori principles? Nancy Cartwright (1989, ch. 2) wrote ‘No causes in, no causes out’ and I think she would join me in the analogous ‘No empirical content in, no empirical content out’.

Adherents of constructive theories hold that if a certain structure can be mapped onto, or can be recovered from, a class of empirical phenomena, this structure explains the latter. One is immediately tempted to ask what is required of an explanation. We can all agree that it is uncontroversial to say that if a certain class of phenomena has a common structure, a certain invariance under some class of coordinate transformations, this structure is a description of common traits. But is that an explanation? If so, what are the criteria for an explanation? Many philosophers of science have a strong urge for explanations structured as in logic and mathematics, i.e., starting from axioms or other a priori principles. But physics is different; it does not consist of purely formal theories, and that must be taken into account when discussing physical explanation.

As the reader may infer from Chaps. 5 and 10, I hold that physical theories can and should be interpreted as theories of principle. My account in Chap. 10 of how we arrive at fundamental laws of physics fits Balashov & Janssen’s characterisation above: ‘In a theory of principle, one starts from some general, well-confirmed empirical regularities that are raised to the status of postulates.’ As I will show, the postulates thus established as generalisations of experiments function as implicit definitions of theoretical predicates, and these function as fundamental laws from which other laws are derived.


7.7 Summary

1. I accept the realist view that scientific theories in mature sciences are at least approximately true. I also accept that singular terms in those theories refer to objects, no matter whether they are observable or not. I reject the doctrine that general terms have references.
2. I don’t find the positive arguments for scientific realism, the no-miracle argument and the argument from explanatory power, convincing. Both arguments beg the question in the debate with empiricists. But consistency requires us to hold all sentences in a theory we accept to be at least approximately true. Empiricists have no good reason to dismiss this realist doctrine.
3. The debate about underdetermination lacks clearly stated identity criteria for theories. If two theories are held to be different and not merely two different theory formulations, we need a convincing argument to the effect that they fail a clearly stated identity criterion. Quine has given a strong argument to the effect that two empirically equivalent theories may be viewed merely as two theory formulations, not two different theories.
4. Physical theories are about the physical world; they are not merely logical or syntactical items. The connection between a theory and the empirical world is established by our use of indexicals and gestures together with the use of expressions belonging to the theory. This crucial fact is generally overlooked by scientific realists.
5. The antirealist camp has, following the logical empiricists, mistakenly dismissed reference to unobservable objects. Gravitational fields, to take one example, are not observable, but empiricists nevertheless have good reason to accept such entities, if they are quantified over in a theory held true. (But see the discussion about gravitational fields and matter in Chap. 17.)
6. Neither scientific realists nor empiricists have distinguished between singular and general terms in the discussions of realism. One cannot coherently deny that singular terms and variables of first order quantification refer to existing things, if the theory is held to be true, or approximately so. But general terms are in no need of referents.
7. Since terms for physical quantities are predicates, and we have no need for referents of predicates, we can safely say that no quantities conceived as universals exist. The perennial debate about the reality of forces is, from an empiricist view, decided once one realises that adding quantities to the ontology doesn’t add to the theory’s predictive power. Quantities are ontologically superfluous.
8. Models in physics are models of portions of reality, not models of theories. The latter use of ‘model’ is adopted in the semantic view of theories. These two ways of using the term ‘model’ are quite different, and this difference is rarely kept in mind.

Part III

Philosophy of Physics

Chapter 8

Causation in Physics

Abstract The conclusion of this chapter is that there are no causal laws in physics. Physical laws relate quantities to each other, but do not say anything about cause and effect. Furthermore, neither forces nor causes are accepted in the ontology since ‘force’ and ‘cause’ are general terms. But physical theory is very often used in causal discourse. Often we use physical laws to connect events, in which case we label them ‘cause’ and ‘effect’. These terms come from our agent perspective; when we want to achieve a certain goal, we ask what to do, and that is the context in which we label these two events or states of affairs as ‘cause’ and ‘effect’.

8.1 Introduction

Hume is famous for rejecting the notion that causes necessitate their effects in any deeper sense of ‘necessitate’ than mere strict regularity. His position, expressed in modern terms, was that the necessary truth conditions for a sentence of the form ‘A causes B’ are three: (i) A precedes B, (ii) A and B are contiguous and (iii) A-events are always followed by B-events. (Hume didn’t use the word ‘event’, but that is unimportant for this discussion.) All three conditions have been heavily debated and Hume himself added caveats to all. He admitted (i) that in the limit cause and effect could be simultaneous, (ii) that cause and effect need not be directly contiguous, there may be intermediate events which make up a continuous chain, and (iii) that if there are several causes of an effect, an individual cause only increases the probability of the effect. This conception of causation has always been the common ground for all empiricists.

Kant wasn’t satisfied with Hume’s skepticism about empirical knowledge in general and with his account of causation in particular. But he clearly recognised the cogency of Hume’s critique of the empiricist idea that objects cause our representations of them. According to Hume we cannot in principle have any knowledge about this causal relation, since it requires knowledge of both relata, and how could we know that there really is an external object corresponding to our impression? All we directly know is what is in our minds, according to classical empiricists. Classical empiricism invites skepticism.


Kant’s response to this epistemological problem, briefly mentioned in Sect. 3.4, was his ‘Copernican revolution’, as he himself called it. Objects of cognition come into being, are formed, as distinct objects in the cognitive process. Their basic features can be recognized by analysing the structural features of our mind, viz. our forms of intuition, space and time, and the 12 fundamental categories in our thinking, among which we find causality (Kant et al. 2003, A80/B106). Objects as they appear to us cannot be said to be re-presentations of independently existing things; that is an impossible stance, since it presupposes a point of view not available to us humans. When thinking that our impressions are representations of independently existing objects, we use our mind as a subject reflecting on a relation between two objects, of which the very mind doing the thinking is one. This is incoherent; the mind cannot in the same act be both subject and object. Kant concluded that objects are contents of our presentations. Therefore they have by necessity a number of features that reflect the constitution of our mind. Knowledge consists of judgements, which come in 12 fundamental forms, according to Kant. One of these is ‘X causes Y’, hence cause is a fundamental predicate, one of the 12 categories.

Kant’s ‘deduction’ of the categories is one of the most difficult arguments in philosophy and I have no intention to judge whether it is a sound argument. But I do think his conclusion, phrased more colloquially, that causal thinking is a basic feature of humans, is correct. It can be justified by empirical studies of linguistic practice. One cannot directly observe judgements, only their linguistic counterparts, i.e., uttered and written declarative sentences. Thus an empiricist may accept Kant’s conclusion that CAUSE is a fundamental concept in our thinking without being convinced by his transcendental reasoning; instead one can come to this conclusion based on observations of linguistic practice.

Observing linguistic practice we easily recognise how fundamental causal idiom is both in our vernacular and in scientific discourse. There are lots of widely used common expressions which have a clear causal sense, not only ‘cause’ and ‘effect’, but also ‘leads to’, ‘brings about’, ‘makes happen’, ‘produces’, ‘does’, ‘starts’, etc. We have a natural tendency to think and talk in causal terms; it is no less basic than talk about e.g. persons or the weather. Our uses of the verbs ‘do’ and ‘make’ are obvious examples, and both indicate a doer, not necessarily a human, which is a cause of a state of affairs. In short, I do think that Kant was right: causal thinking is a basic feature of us humans. And I don’t see any need for a transcendental argument for this conclusion;
observations of linguistic practice suffice.1 Using causal notions is fundamental linguistic practice, but has it any place in physics, or in scientific theory in general?

8.2 Causes and Laws

Kant and several other philosophers held that something they called ‘the causal law’ could be known a priori. Empiricists are skeptical, because they are generally skeptical about the very possibility of a priori knowledge about the empirical world. Russell famously expressed his rejection of this notion as follows:

Russell’s view is shared by among others Redhead (1990), Batterman (2002) and Norton (2003). Others disagree. A careful analysis of Russell’s view is Ross and Spurrett (2007). In this paper the authors report that scientists regularly use causal idiom: A search for articles in which the word ‘cause’ appeared in the on-line archives of Science between October 1995 and June 2003 returned a list of results containing 8288 documents, averaging around 90 documents per month, in which the word ‘cause’ occurred. ‘Effect’ was more popular - 10456 documents for the same period, around 112 per month. (op. cit. p. 60)

I submit that if the authors of this paper had included other causal expressions, such as ‘leads to’, ‘results in’, ‘brings about’ etc., the figures would have been still higher. So was Russell wrong? The authors’ answer is that under a certain interpretation of Russell’s claim he was not. Russell correctly observed that laws of physics never utilise causal idiom, and since he thought that all of science ultimately was based on, or could be reduced to, physics he concluded that in no parts of science proper there are any causal laws. This latter assumption, that all of science could be reduced to, or based on, physics, is highly controversial. Ross & Spurrett concludes that a general dismissal of the law of causality is doubtful, but if we restrict its dismissal to physics Russell is justified.

1 If there exists a natural language lacking causal idioms, one would be justified in concluding that causal thinking is a local phenomenon in western and western-influenced cultures. But that could only be established by translating the foreign language into, for example, English. This raises the problem of how we determine the correctness of a translation in which no translated sentence contains any of the causal words ‘cause’, ‘effect’, ‘make happen’, ‘bring about’ etc. The general problem is discussed by Quine (1960) and Davidson (1973a). Davidson’s conclusion is that just by attributing a language to the foreigner and translating it to our own language, we assume a lot of common features between the foreigner’s world-view and our own. Since causal thinking is a basic feature of our world-view, it seems to me unavoidable that some of the expressions in the foreign language would be translated to causal idiom in English.


The basic reason for there being no causal laws in physics is that all laws of physics relate quantities to each other, while ‘cause’ relates events or states of affairs.2 As will be discussed in Chap. 10, when expressed in first order predicate logic, laws are universally generalised conditionals. In these generalisations we quantify over bodies, fields or, more generally, over physical systems, and these objects are attributed quantities related by equations. Laws in physics do not quantify over events or states of affairs. Newton’s second law, for example, tells us that the force f, the mass m and the acceleration a, when attributed to the same body, satisfy the equation f = ma. The law doesn’t say anything about cause and effect. Perhaps there are causal laws in other sciences, but I leave this issue for another occasion.

However, in spite of there being no causal laws in physics, and although physicists never refer to the law of causality, physics papers still often contain causal claims. Why, then, are causal expressions used in physics? Like several other philosophers, e.g., Eagle (2007),3 I believe that causal notions enter physical discourse when we use physics for causal explanations or when we suggest what to do in order to achieve a certain goal. In both cases we move from the theoretical perspective to the agent perspective on the world; we consider how to use physical knowledge. In this perspective the pragmatic aspect of language use comes to the fore. In the case of planning actions this is obvious, less so in explanations. But as argued in Sect. 3.8, there is no good reason to accept that explanations, causal or not, should be given any evidentiary value; their role in discourse is purely pragmatic. As an illustration consider our use of the general law of gases:

pV = nRT    (8.1)

where, as usual, p stands for pressure, V for volume, n for the number of moles, R is the general gas constant and T is temperature. Expressed in predicate logic this law is ‘For any portion of a gas, its pressure p, volume V, number of moles n and temperature T satisfy the equation pV = nRT, where R is the general gas constant.’ Obviously, this law doesn’t say anything about causes. But we regularly use it for giving causal explanations and making predictions. When we for example explain how a boiler works, we say something like: ‘If we heat the steam in the boiler, the pressure of the steam will increase, according to the general law of gases, since the volume and the number of moles of the gas are constant.’ The heating of the steam causes its pressure to increase.

2 It might be objected that variables often are described as causally related, an example being ‘high blood pressure causes heart infarction’. But values of variables are states of affairs or events.

3 Eagle writes: ‘Causation is context-dependent: it is sensitive to which events or variables are included in the model, and some think it is relative to default values for the variables also. Causation is partial and local. These are precisely the features which causal accounts do not share with authoritative physical accounts, and it is precisely these deficiencies which the exclusion arguments exploits’ (Eagle 2007, 176).


In another situation we use the same law for explaining that an increase in the volume of a fixed portion of gas will cause its temperature to fall. Context determines the direction of the causal explanation. The gas law can be used in causal explanations, but the law itself makes no distinction between cause and effect. It is we humans as agents who make such a distinction by manipulating a variable in order to achieve a certain state change. In other words, the cause is what we manipulate, the effect is another change, and the connection between these changes is determined by the law.
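
To make the boiler example concrete, here is a minimal numerical sketch in Python (all numbers are invented for illustration); note that what the law delivers is only the connection between the two changes, not which of them is the cause:

    R = 8.314               # general gas constant, J mol^-1 K^-1
    n = 100.0               # moles of steam in the boiler (invented value)
    V = 0.5                 # boiler volume in m^3, held constant
    T1, T2 = 400.0, 500.0   # temperature before and after heating, in K

    p1 = n * R * T1 / V     # pressure before heating, in Pa
    p2 = n * R * T2 / V     # pressure after heating, in Pa
    print(p1, p2)           # p2 / p1 = T2 / T1; the equation is symmetric in
                            # the quantities and says nothing about causation

Reading the calculation as ‘the heating caused the pressure increase’ is something we add from the agent perspective.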

Another illustration is our use of Maxwell’s equations. They have very often been interpreted in causal terms: charges cause electric fields and, when moving, they cause magnetic fields. As will be discussed in Chap. 11, this leads to deep problems connected with so-called ‘self-fields’. If we use the causal argument that a charged particle cannot be acted upon by its self-field, we run into contradictions. On the other hand, if we calculate the motion of a charged particle and assume that its motion is determined by the total field = external field + self-field, we get the result that in cases where no external field is present its motion is accelerated by its own field! This is certainly wrong. The problem will be thoroughly discussed in Chap. 11. For now I only want to point out that a causal reading of Maxwell’s equations is impossible. But we can use Maxwell’s equations for calculating the effects of certain actions, for example increasing the charge in a piece of metal. Similarly, assuming that matter causes gravitational fields is, first, not justified by Einstein’s equation and, second, results in conceptual problems. This will be discussed in Chap. 17.

8.2.1 Causation and Relativity Theory

In both philosophical and scientific papers discussing the implications of relativity theory we often find arguments based on the notion that causes can be transmitted with at most the velocity of light. Hence, light signals are conceived as transmitters of causation and as such treated as causal processes. This notion was developed by Salmon (1984), inspired by Reichenbach (Reichenbach 1999). Salmon’s analysis was criticised by Hitchcock (1995) as being too weak for sorting out causal explanations, and Salmon adjusted his conditions in Salmon (1997). Salmon had for most of his life a realist view of causation, but ultimately he retreated to a pragmatic view when he admitted the relevance of context:

The major obstacle to the creation of a fully objective and realistic theory of cause-effect relations is the fact that the instances we tend to select are highly context dependent. . . . Cause-effect statements are almost always—if not always—context dependent. (Salmon 2001, 123, 125)

The context dependence of causal explanations comes from the general context dependence of explanations, as already pointed out. This does not undermine the claim that the velocity of light is an upper limit for transmission between cause
and effect. Hume held that a necessary condition on causation was that cause and effect must be contiguous or connected by a contiguous chain of events. This notion can be given a more precise formulation in physical terms: the contact between cause and effect consists of transmissions of conserved quantities, such as energy or momentum in the form of photons, between physical systems. Hence, a necessary condition for something being the cause of a certain effect is that the event termed ‘the cause’ must be connected to the effect by a physical signal. This does not entail that such an event has explanatory relevance, as Hitchcock pointed out. But the very expression ‘explanatory relevance’ indicates that this is a contextual and pragmatic aspect of our use of ’cause’.

8.3 Are Forces Causes?

The headline of this section might give the impression that I think of forces and causes as entities. I do not. As pointed out in the previous chapter, I conceive of quantitative predicates, including FORCE, as having extensions but no references, in accordance with my overarching nominalism. My talk about quantities should be understood as talk about quantitative predicates. And similarly, the two-place predicate ‘. . . is the cause of . . .’ has extension but no reference. So there are strictly speaking no causes; the question ‘are forces causes?’ should thus be understood as ‘is the extension of the predicate FORCE a subset of the extension of the predicate CAUSE?’.

But we also use the term ‘cause’ in descriptions of singular events or states of affairs, i.e., individual things, and nominalism does not exclude such items. One might thus hold that individual causes exist; causes are events or states of affairs. However, that invites the idea that causes make up a subcategory of events, which is wrong, for one and the same event can be the cause of one event and the effect of another event. The crucial thing is that ‘cause’ is a relational word. Saying that an event is a cause, we mean that it is the cause of something else, and that must be taken care of when paraphrasing vernacular expressions in logical notation. So for example we may paraphrase ‘The cause of the first world war was the assassination of Archduke Franz Ferdinand’ as ‘There are events x and y such that x is the assassination of Franz Ferdinand and y is the first world war and x caused y’. We have thus quantified over events, and CAUSE is a two-place predicate.

Hence there are no individual things, events or states of affairs which are causes in my nominalist ontology. Neither are there any forces, as already said. It is no contradiction to say that an event is a cause of another event while denying that there are causes, since the term ‘cause’ is relational; if something is called a cause it is implicit that it is a cause of something else. Thus, we should paraphrase the colloquial ‘cause’ as the two-place predicate ‘. . . is the cause of . . .’. And I have repeatedly claimed that predicates do not refer to universals, hence there are no causal relations referred to by this predicate.
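
In symbols, using illustrative predicate letters not drawn from the text, the paraphrase reads

∃x∃y(Ax ∧ Wy ∧ C(x, y))

where ‘Ax’ abbreviates ‘x is the assassination of Archduke Franz Ferdinand’, ‘Wy’ abbreviates ‘y is the first world war’ and ‘C(x, y)’ is the two-place predicate ‘. . . is the cause of . . .’. The quantifiers range over events; ‘cause’ occurs only as a predicate, never as a singular term.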


Returning to forces, one should keep in mind that the uses of ‘force’ in physics and in colloquial talk are radically different; everyday use of ‘force’ reflects an Aristotelian conception according to which a force is that which moves something, whereas the physical quantity FORCE is identical to MASS TIMES ACCELERATION. In colloquial talk people sometimes use the words ‘force’ and ‘cause’ more or less synonymously.

8.4 CAUSE Is Agent-Related

The agent-relative analysis of causation was forcefully argued by Pearl (2000) and Woodward (2003). We ask for causes mainly in two types of contexts: either we ask for an explanation of an event or state of affairs, or we ask for the cause because we want to do something in order either to prevent or to promote a future state of affairs. We ask ‘What should I do?’, and a correct answer must specify a cause of the future state in focus. This conclusion does not contradict the claim that a necessary condition for A being the cause of B is that a physical signal of some sort goes from A to B, quite the opposite; physical contact is a necessary but not a sufficient condition for the use of the predicate ‘. . . causes . . .’. When saying that in order to achieve a certain state of affairs B you should do A, this can only be true if there is a physical connection between A and B.

8.5 Summary

Russell was fundamentally right: causes have no place in physical theories proper. The two-place predicate CAUSE has its proper place in discourse from an agency perspective, when we deliberate over what to do or ask for explanations. That does not conflict with the fact that books and articles in physics are full of causal idiom; a great part of such publications is about the use of physical knowledge in deliberations about human affairs. A necessary condition for causation, however, is physical contact, an exchange of some conserved quantity between the objects related as cause and effect. We use physical laws in determining causes and effects of given events or states of affairs; but physical laws are not causal laws.

Chapter 9

SPACE, TIME and BODY; Three Fundamental Concepts

Abstract The starting point in this chapter is the observation that physics is based on observations of bodies moving around, hence the three fundamental concepts in physics are SPACE, TIME and BODY, corresponding to the three quantities DISTANCE , TIME and MASS . These three concepts, and their corresponding quantities, are mutually dependent; the definition of one of these quantities presupposes the other two. There is a debate among physicists concerning how many of the natural constants are fundamental, a question which is closely connected to the question of the number of fundamental quantities. Some agree with Gauss holding that DISTANCE, TIME and MASS are the fundamental quantities, thus concluding that three constants are fundamental, others hold that one or two of these can be reduced away, and some hold that theoretical physics can be done without any natural constants at all. The conclusion drawn in this chapter is that when we apply physics to concrete experiments and observations we need to put in the values of three natural constants, a conclusion in harmony with the starting observation that we, at the fundamental level, observe bodies with mass moving around in space and time.

9.1 Observations

As argued in Chap. 3, the basis for empirical knowledge is observation reports that several people on the spot unhesitatingly agree upon. But, then, what do we observe? Some argue that we first and foremost observe physical bodies, others that in addition we observe events, a third group hold that we also observe states of affairs, motions and some properties such as colours and shapes. The list is not exhaustive. One may suppose that people belonging to the same culture easily agree on what is primarily observable, which in western cultures are people, animals, trees, etc., in short, bodies. We might also agree on observations of particular instances of colours, shapes and some visible relations, things which in metaphysics are called ‘tropes’, but I won’t take a stance on this issue. I only take it for granted that we all agree on observing bodies; this is the most uncontroversial ontological basis.


Now suppose people from two radically different cultures, speaking different languages, of which neither part knows the other language, try to convey to each other what they observe, each talking his own language. This was the situation discussed by Quine at the beginning of Word and Object. If you don’t understand what their words mean, is it then possible to infer, merely from their observable behaviour, how they conceptualise the observable world? Do they primarily see bodies, instantiated properties, motions, events or what? In order to decide that we need a translation manual, and Quine’s conclusion was that the chosen translation is underdetermined by all possible evidence in terms of overt behaviour. Different translations of a foreign language, encapsulating different ontologies, are possible. Whether foreign cultures perceive and think in terms of bodies moving around, or in terms of properties being instantiated at certain spots, or in some other way never imagined by western philosophers, cannot be conclusively determined just by observing a foreigner’s behaviour and hearing her utterances.

One may observe that Quine only considered evidence in terms of overt behaviour, including assent and dissent to utterances of speakers at the spot. But even if we follow Quine in dismissing talk about intentional states and admit only evidence formulated in strictly behavioural terms, other evidence for hypotheses about people’s conceptualisation of the environment might be found. And in fact that has been done. Cognitive research on prelinguistic children gives some evidence for the conclusion that humans in general conceptualise the environment as consisting of bodies, in particular moving bodies, see e.g. (Spelke et al. 1995a,b,c). Already from an age of circa six months, children are able to identify and re-identify moving bodies, a conclusion drawn from observations of their behaviour, obviously, since they don’t talk. I will suppose this is true about all humans, and it is easy to understand why that is the case: we humans, as other animals, are primarily interested in food, enemies and mating partners, and all these things are bodies, so this way of cognising the environment may be an effect of evolution. We might also discern other kinds of things in our environment, but bodies seem to be a basic ingredient in our ontology. There are reasons to think it a universal trait.

Two observers of an experiment believing in opposing theories must be able to agree on observations; otherwise performing experiments and making observations in order to decide what evidence there is for a theory is pointless. Observations must at least be objective in the sense that people can arrive at inter-subjective agreement about observation reports. And we may conclude from the previous paragraphs that basic observation reports are about bodies. We observe photos, measurement devices, computer screens and many other things, all being bodies. One might argue that we also observe other things, such as events or causes, but bodies cannot be left out.

A well-known illustration from the history of physics is the observation of the deflection of light from distant stars during a solar eclipse. According to relativity theory, photons are sensitive to the gravitational field, whereas pre-relativistic theory has no such consequence. The famous Eddington solar eclipse expedition sent out by the Royal Society in 1919 reported that light is deflected by 1.64 seconds of an arc
when passing the sun. Einstein had predicted the deflection to be 1.75 seconds of an arc. All involved agreed that this observation refuted pre-relativistic theory and confirmed relativity theory, no matter their previous convictions, see e.g. (Frank 1947, 141). What did people actually observe? Pictures taken by the solar expeditions; i.e., the observed things were bodies. Of course, a lot of calculations and measurements on these pictures must be done in order to get a result that can be compared with the predictions of relativity theory and classical mechanics, and in fact those involved also agreed on the validity of these calculations and the theories upon which they are based. For example, the optical theory upon which the telescopes and cameras were constructed is relied upon in disputes about cosmology. In modern physics an enormous amount of theory is so well established that inferences drawn from raw data, using such uncontroversial theories, are normally relied upon by all parties in a debate. But the ultimate evidence for any scientific theory consists of observations of bodies. In Sect. 11.6 I will further discuss observations.
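
As an aside, Einstein’s predicted value can be recovered from the standard general-relativistic expression for the deflection of light grazing the sun, 4GM/(c²R); a minimal Python check with rounded SI values of the constants:

    import math

    G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
    M_sun = 1.989e30    # solar mass, kg
    c = 2.998e8         # speed of light, m/s
    R_sun = 6.963e8     # solar radius, m

    deflection_rad = 4 * G * M_sun / (c**2 * R_sun)        # deflection in radians
    deflection_arcsec = math.degrees(deflection_rad) * 3600
    print(round(deflection_arcsec, 2))                     # ≈ 1.75

The point in the text is of course independent of the calculation: what was actually observed were photographic plates, i.e., bodies.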

9.2 How Does a Theory Connect to the World?

The physical world consists of many unobservable things according to our best modern theories. How, then, do we connect a physical theory (or any empirical theory), which postulates many unobservable entities, to the observable world? In other words, how do we manage to give a theory empirical significance? A necessary condition is that we must somehow ascertain that at least some of its terms refer to things observed by us. How is this reference relation established? This connection cannot be established by any theory alone; this is a consequence of Löwenheim-Skolem’s theorem, as shown in Sect. 3.5. We need non-theoretical resources, indexicals together with pointing gestures, in order to connect theory to reality, a conclusion by Luntley (1999) called ‘Russell’s insight’:

The semantic power of language to represent derives from the semantic power of context-sensitive expressions. (op. cit. p. 285)

The same conclusion can be reached without using Löwenheim-Skolem's theorem. A name cannot by itself point out what it is a name of, and a definite description, however elaborated, is insufficient to definitely identify a particular object. A definite description 'The F' contains at least one general term, and in order to sort out a particular physical object as The F, we must use space and time coordinates. But the physical application of coordinates presupposes reference to bodies, which constitute the physical instantiation of the origin and spatial directions of the coordinate system used. So in the final step we need to point, either directly to the object referred to, or to objects making up the coordinate basis. Sunny Auyang (1995) makes the same point:

. . . to say something specific, we must use some demonstratives, labels, or names, such as this or that, left or right, i or j. These are admittedly conventional, yet we cannot simply discard them because without them we cannot express experiences in which we encounter particular things. (op. cit., p. 162)

This doctrine of Russell and Auyang relates to a point Bohr often made when discussing the interpretation of quantum theory: . . . [It] is decisive to recognize that, however far the phenomena transcend the scope of classical physical explanation, the account of evidence must be expressed in classical terms. (Bohr 1951, 209)

By 'classical terms' I take Bohr to mean positions and velocities attributed to observable bodies, e.g. measurement devices; his point is that we cannot express evidence for quantum theory in terms of wave functions, quantum systems or other things talked about in quantum theory; the theory must have observable consequences, descriptions of situations we humans can locate in space and time.

Talk about observable bodies would not work as common ground for linguistic interaction about the environment unless two or more people at the same spot could easily agree on which objects in the vicinity they are talking about. If a speaker points in a certain direction but wants others to focus on something other than a body, for example the shape or the colour of the object in front of the speaker, she needs to add something in order to convey the message; mere pointing is not sufficient. Everyone who has taught small children words for colours and shapes can confirm this. The default option is that we talk about bodies; that is a fundamental mode of operation of our cognitive apparatus.

The first, to my knowledge, to use Löwenheim-Skolem's theorem for criticising the notion that a theory by itself can determine referential relations to external things was Putnam. He ends his Putnam (1980) thus:

On any view, the understanding of the language must determine the reference of the terms, or, rather, must determine the reference given the context of use. If the use, even in a fixed context, does not determine reference, then use is not understanding. The language, on the perspective we talked ourselves into, has a full program of use; but it still lacks an interpretation. This is the fatal step. To adopt a theory of meaning according to which a language whose whole use is specified still lacks something - viz. its "interpretation" - is to accept a problem which can only have crazy solutions. To speak as if this were my problem, "I know how to use my language, but, now, how shall I single out an interpretation?" is to speak nonsense. Either the use already fixes the "interpretation" or nothing can. . . . Models are not lost noumenal waifs looking for someone to name them; they are constructions within our theory itself, and they have names from birth. (op. cit., 481–82)

Thus, Putnam had the same insight as Russell and Auyang.

9.3 The Interdependence Between the Predicates PLACE, TIME and BODY

When we talk about bodies we take for granted some principle of identity among such objects, and it is universally accepted that genidentity is the identity criterion for bodies. Genidentity means that two occurrences of a body at different places at different times are occurrences of the same body iff there is a continuous trajectory connecting these two occurrences.1 Hence, the application of the concept of BODY presupposes the concepts of TIME and PLACE, and since places are portions of space, the concept of SPACE is also presupposed. But how are these latter concepts given empirical content? I will now argue that TIME and PLACE in turn depend on BODY. Thus, BODY, PLACE and TIME are mutually dependent in the sense that the characterisation of one of them is made in terms of the other two. Hermann Weyl once argued along somewhat similar lines. He begins his book Space, Time and Matter (Weyl 1952/1918) in a clearly Kantian vein:

Space and time are commonly regarded as the forms of existence of the real world, matter as its substance. A definite portion of matter occupies a definite part of space at a definite moment of time. It is in the composite idea of motion that those three fundamental conceptions enter into intimate relationship.

On this I basically agree, since a body is a portion of matter at a particular place and time; not, however, based on any Kantian reflections on the faculties of the mind (as was Weyl's background), but rather on what different observers can agree upon when looking at the same scene at the same time. No matter what theories about space, time, causes of motion etc. different observers accept, so long as they are not moving with relativistic velocity relative to each other, they can agree on observations of how visible bodies change position, as argued above. Sentences about such things require the predicates PLACE, TIME and BODY, and since a place is a portion of space, the latter is implicitly involved. What, then, does Weyl mean by "those three fundamental conceptions enter into intimate relationship"? My interpretation of "intimate relationship" between these conceptions is that the three concepts SPACE, TIME and BODY reciprocally presuppose each other; together they form a conceptual space for talking about observable physical events. To see this, let's start by analysing our concept of BODY. A body can be described by the following four characteristics:

1. A body exists for some time, however short. A necessary condition for being a body is that it can be identified and later re-identified as the same body.
2. One and the same body cannot exist at two places in space at the same time.2 The term 'place' should here be understood in relation to the size of the body considered. When we talk about big bodies, the places are correspondingly extended. As an example, when saying that the earth occupies different places at different times we mean that the space filled by the earth at one particular point of time is not exactly the same as the space filled at another point of time. Here the individuative word 'place' refers to a rather extended volume. It follows that a body's change of place takes time, or in other words that no displacement can occur with infinite velocity. For this to have a well-defined meaning a coordinate system must be used.
3. Two bodies cannot be at the same place at the same time; they are impenetrable. Hence, clouds, gases and liquids are not bodies in this classical sense.
4. A body may change properties with time while still being the same body.3

A consequence of 1–4 is that we can say that two occurrences of bodies are occurrences of the same body iff they can be connected by a continuous trajectory, which is, as already pointed out, the identity criterion for bodies. (We idealise by thinking about solid bodies that do not disintegrate or are cut apart; this restriction is innocuous for the present discussion.) The four points above, with their ensuing identity criterion, give a reasonable explication of the meaning of BODY as it is used in classical physics, and also to a certain extent of how it is used in ordinary language.

Thus we see that the classical concept of BODY (and also the concept of PARTICLE as used in classical statistical mechanics, but not in quantum theory) presupposes times and places, since we must be able to distinguish different times and places in order to fulfill the four conditions. This seems to support the substantivalist view that time and space (= the set of places) are real and independent of the bodies they contain. But that is a mistaken conclusion, for the concepts TIME and SPACE in turn presuppose the concept of BODY; hence the words 'space' and 'time' cannot be thought of as referring to independently existing entities. This can be seen more clearly by looking at our way of determining time and space intervals. In order to give empirical meaning to the concept of spatial distance (and thus to the concept of point in space) we need a length unit and a standardised procedure for comparing distances. For a long time the standard meter in Paris was used as the length unit, and hence the definition of distance presupposed the (application of the) concept of BODY, since the standard meter is a body. Nowadays the length unit is defined as the distance travelled by light in vacuum during a specified time interval, using the constancy of the speed of light as the bridge between times and distances. Time measurements need clocks, i.e., bodies, so the change of meter standard has not changed the fact that the concept of space interval, i.e. distance, presupposes the concept of BODY.

1 It doesn't matter if space is discontinuous at the Planck scale, which would be the case if spacetime is quantized. We have no use for the concept of body at this scale.

2 This condition is another formulation of Aristotle's point in Physics that a body by its very definition is wholly inside a closed surface.

3 This conflicts with the principle of the indiscernibility of identicals, i.e., that if a=b, then if the sentence "Fa" is true, "Fb" is true as well, no matter which predicate 'F' we consider. On the one hand, this principle appears indeed plausible; on the other hand we ordinarily take for granted that almost all physical bodies change place and other attributes with time while still being the same body. So if 'a' and 'b' refer to a body observed at different times, there are two sentences 'Fa' and 'Fb' which differ in truth value in spite of a=b. This conflict is itself an interesting issue, but here I simply take for granted the common way of thinking and talking about bodies.


Thus Weyl was right. Descriptions of motion require BODY, TIME INTERVAL and SPATIAL DISTANCE and these concepts are reciprocally used in their respective characterisations as empirical concepts. One could say that TIME, SPACE and BODY are part of a conceptual scheme for talking about events and things in the physical world. We invoke positions in space and time to individuate bodies and we use bodies to individuate points in space and time.4 This conclusion might remind the reader of Aristotle’s view on time and change: Not only do we measure change by time, but we also measure time by change, because they are determined by each other; time determines change in the sense that it is a number of change and change does the same for time. (Physics, book 4, 12, 220b14)

By 'change' Aristotle means motion, growth and corruption of bodies, i.e., the things that change are bodies. Aristotle held that time exists, 'inheres' in things, and I reject that notion; but I agree with him in rejecting the notion that time is an independently existing entity. The conclusion that BODY, SPACE and TIME mutually depend on each other is also reflected in general relativity, where the spacetime metric $g_{\mu\nu}$ and the matter-energy distribution described by the tensor $T_{\mu\nu}$ are related to each other according to Einstein's equation:

$$R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R + \Lambda g_{\mu\nu} = \frac{8\pi G}{c^4} T_{\mu\nu} \tag{9.1}$$
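A quick dimensional check of (9.1), my own bookkeeping rather than anything in the original text (it assumes coordinates carrying the dimension of length and a dimensionless metric), shows that the factor $8\pi G/c^4$ is exactly what is needed to express the matter-energy side in the curvature side's units of inverse length squared; this anticipates the point made next about the two sides describing one and the same quantity:

$$[T_{\mu\nu}] = \frac{\mathrm{J}}{\mathrm{m^3}} = \frac{\mathrm{kg}}{\mathrm{m\,s^2}}, \qquad \left[\frac{G}{c^4}\right] = \frac{\mathrm{m^3\,kg^{-1}\,s^{-2}}}{\mathrm{m^4\,s^{-4}}} = \frac{\mathrm{s^2}}{\mathrm{kg\,m}}, \qquad \left[\frac{8\pi G}{c^4}\,T_{\mu\nu}\right] = \frac{1}{\mathrm{m^2}} = [R_{\mu\nu}] = [\Lambda g_{\mu\nu}].$$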

The lhs of this equation is an expression for the structure of spacetime based on the spacetime metric $g_{\mu\nu}$, and the rhs is a way of describing matter-energy. Hence a quantity that can be calculated from the spacetime metric plus some auxiliary information is identical to a quantity that can be calculated from information about matter-energy. This equation may thus be viewed as using field concepts for saying that the concepts SPACE, TIME and MATTER are mutually dependent.

4 This interdependence between the concepts BODY, TIME INTERVAL and SPATIAL DISTANCE is connected to CPT symmetry, since charges are attributes of bodies, see Sect. 13.2.2.

In this account of the concepts of BODY, SPACE and TIME we have presupposed an empirical concept DISTANCE. In mathematics, any scalar function D(x, y) of two variables that is positive definite, symmetric, satisfies the triangle inequality and has zero value for D(x, x) may be called a distance function. But that does not suffice as a physical concept. In order to connect distance functions in an abstract theory to the real world, we need to be able to identify parts of the metric with instances of this concept, i.e., empirically determined distances. How is that done?

Several philosophers, for example Reichenbach, have considered this problem under the label 'principles of coordination'. The question is how to connect a mathematical function to something empirical. A tempting idea is to conceive of this coordination as an example of an isomorphism: values of a mathematical quantity are one-to-one mapped onto a set of empirical facts. But this doesn't work, for the simple reason that an isomorphism, a mapping, is also a mathematical object, and what is it that establishes that, for example, a mathematical function labelled 'distance' in fact represents distances in the physical world? It can be one-to-one mapped onto a set of linguistic representations of distances; but the crucial question is then how to analyse the relation of representation obtaining between 'facts in the world' and their linguistic representations. This lacuna is but one further illustration that theory alone cannot determine its connection to the external world; it is we humans who establish such connections by interacting with the world, pointing and using indexicals in our language, as pointed out in the previous section. Furthermore, it is obvious already from a mathematical point of view that in order to establish a mapping between two domains we must have independent access to objects in both domains, as pointed out by e.g. Putnam (1981, 74). Reichenbach's idea of an isomorphism between theory and the world doesn't solve our problem; extra-theoretical resources, indexicals and gestures, must be invoked; that was Russell's insight, rehearsed in the previous section.
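For reference, the purely mathematical conditions on a distance function mentioned above can be written out explicitly; this is the standard textbook formulation and the symbols are mine, not the author's:

$$D(x,y) > 0 \ \text{for } x \neq y, \qquad D(x,x) = 0, \qquad D(x,y) = D(y,x), \qquad D(x,z) \leq D(x,y) + D(y,z).$$

The point in the text is precisely that satisfying these axioms is cheap; nothing in them says which physical operations count as measuring $D$.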

9.3.1 Bodies and Particles

Both classical and quantum physics are often stated as theories about particles, not bodies. But the word 'particle' means different things in classical and quantum mechanics. In classical mechanics the term 'particle' is used for all kinds of bodies, big and small; so, for example, planets orbiting a star may be called particles. The meaning is simply that when we call an extended object a 'particle', we consciously disregard its extension and inner structure, i.e., we treat it as a mass point when calculating its interactions with other objects. Referring to something as a particle in classical mechanics does not entail that it lacks extension or has infinite mass density.

The concept of body has no application in microphysics, since nothing in microphysics satisfies the third condition for being a body, impenetrability. But we describe the empirical consequences of microphysics against a background of space and time. It follows that saying about a quantum system that it has some property at a certain place in space requires that bodies are available as reference objects when giving the concept of space (and position) empirical content. But that we already knew. This means that we must use the classical conceptual scheme of SPACE, TIME and BODY when reporting quantum experiments, even though we do not think that these concepts are applicable at the fundamental level. This was Bohr's position, expressed in many places, e.g., in the quote in Sect. 9.2.

In quantum physics the word 'particle' is a generic label for such things as electrons, protons, photons, etc. The basic point is the same in quantum and classical mechanics: something referred to with a particle word is treated as a unit in interactions. The difference between classical and quantum theory is that a particle's inner structure and extension cannot be given any description during a quantum interaction, simply because it interacts as an indivisible unit; this is what quantisation of interaction means. By contrast, a classical particle can be given a more detailed description if need be. I will elaborate on that in Chap. 16.

There is a further difference between classical and quantum particles: the latter, both fermions and bosons, lack identity criteria. If we have identified a quantum particle at a certain time and point in space by its interaction with a measurement device, we cannot in general tell whether the occurrence of another particle of the same kind at another time and place is the same as the first particle or not. One might think this is merely an epistemological shortcoming without ontological significance, but that is a mistaken conclusion, as will be shown in Chap. 14.

9.4 Fundamental Quantities

The reasoning above indicates that TIME, DISTANCE and MASS are the three fundamental quantities in physics, since bodies are attributed masses and positions in space and time. (I remind the reader that I dismiss universals, and therefore the term 'quantity' as I use it does not refer to any universal; it should be read as short for 'quantitative predicate', see Sect. 7.4.) Hence, saying that TIME, DISTANCE and MASS are fundamental quantities in physics should be understood as saying that these general terms are the fundamental quantitative predicates.

Independently of the ontological question whether quantities exist or are merely attributes, there is a debate about which quantities are the most fundamental ones and how many of this fundamental sort there are. Some hold that only one quantity is fundamental, some say three quantities are needed, and some hold that no quantity is fundamental; they can all be reduced away as superfluous in physics. Lev Okun, Michael Duff and Gabriele Veneziano had an interesting debate about these things in Duff et al. (2002). Duff has also returned to the topic in Duff (2014).

Theoretical physicists often set the three natural constants c = ℏ = G = 1, i.e. treat each as a pure number without dimension, thereby effectively reducing away the three quantities MASS, TIME and DISTANCE, and in fact all physical quantities, since all others are defined in terms of these three. In fact, one can do theoretical physics without talking about any quantities at all. What remains is a list of dimensionless numbers, such as the fine structure constant, and it is held that these numbers are the objective content of physics. Quantities are not essential.
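To make this concrete, here is a minimal sketch, not from the book, of what 'reducing away' and then restoring units amounts to in practice; the constant values are rounded CODATA figures and the length used as input is merely an illustrative number:

```python
# Sketch (not from the book): converting between SI and "natural" (Planck) units,
# illustrating that setting c = hbar = G = 1 is a rescaling that must be undone
# when theory is compared with measurements.

c = 299_792_458.0          # speed of light, m/s (exact by definition)
hbar = 1.054_571_817e-34   # reduced Planck constant, J*s
G = 6.674_30e-11           # Newtonian constant of gravitation, m^3 kg^-1 s^-2

# Planck units built out of c, hbar and G
l_P = (hbar * G / c**3) ** 0.5   # Planck length in metres
t_P = (hbar * G / c**5) ** 0.5   # Planck time in seconds
m_P = (hbar * c / G) ** 0.5      # Planck mass in kilograms

# In natural units a length is a dimensionless number: L / l_P.
# Restoring units (Okun's arrow '<-') means multiplying by l_P again.
length_m = 0.84e-15                       # an illustrative length in metres
length_natural = length_m / l_P           # dimensionless value used in calculations
length_restored = length_natural * l_P    # back to metres for comparison with experiment

print(f"Planck length {l_P:.3e} m, Planck time {t_P:.3e} s, Planck mass {m_P:.3e} kg")
print(f"{length_m} m -> {length_natural:.3e} (dimensionless) -> {length_restored:.3e} m")
```

The calculation in the middle lines could be carried out entirely with dimensionless numbers; it is only the first and last steps, where units enter, that tie the numbers to measurement procedures.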


But putting a dimensionful constant, such as c, equal to 1 cannot be viewed as a true identity. Lev Okun writes (Duff et al. 2002, 6):

The universal character of c, ℏ, G and hence of mP, lP, tP [i.e., Planck mass, Planck length and Planck time] makes natural their use in dealing with futuristic TOE [Theory Of Everything]. (In the case of strings the role of lP is played by the string length λs.) In such natural units all physical quantities and variables become dimensionless. In practice the use of these units is realized by putting c = 1, ℏ = 1, G (or λs) = 1 in all formulas. However one should not take these equalities too literally, because their left-hand sides are dimensionful, while the right-hand sides are dimensionless. It would be more proper to use arrows '→' (which mean 'substituted by') instead of equality signs '='. The absence of c, ℏ, G (or any of them) in the so obtained dimensionless equations does not diminish the fundamental character of these units. Moreover it stresses their universality and importance. It is necessary to keep in mind that when comparing the theoretical predictions with experimental results one has anyway to restore ('←') the three basic units c, ℏ, G in equations because all measurements involve standard scales. The above arguments imply what is often dubbed as a 'moderate reductionism', which in this case means that all physical phenomena can be explained in terms of a few fundamental interactions of fundamental particles and thus expressed in terms of three basic units and a certain number of fundamental dimensionless parameters.

His conclusion, stated at the beginning of his paper, was to agree with Gauss: The three basic physical dimensions: Length L, time T and mass M with corresponding metric units: cm, sec, gram, are usually associated with the name of C.F. Gauss. In spite of tremendous changes in physics, three basic dimensions are still necessary and sufficient to express the dimension of any physical quantity. The number three corresponds to the three basic entities (notions): space, time and matter. It does not depend on the dimensionality of space, being the same in spaces of any dimension. It does not depend on the number and nature of fundamental interactions. For instance, in a world without gravity it still would be three.

Michael Duff however disagrees: In my view, this apparent contradiction arises from trying to use two different sets of units at the same time, and really goes to the heart of my disagreement with Lev about what is real physics and what is mere convention. In the units favored by members of the Three Constants Party, length and time have different dimensions and you cannot, therefore, put c = 1 (just as you cannot put k = 1, if you want to follow the conventions of the Seven Constants Party). If you want to put c = 1, you must trade in your membership card for that of (or at least adopt the habits of) the Two Constants Party, whose favorite units do not distinguish length from time. In these units, c is dimensionless and you may quite literally set it equal to one. (Duff et al. 2002, 22)

And in his Duff (2014) he writes: I argue that the laws of physics should be independent of one's choice of units or measuring apparatus. This is the case if they are framed in terms of dimensionless numbers such as the fine structure constant α. For example, the Standard Model of particle physics has 19 such dimensionless parameters whose values all observers can agree on, irrespective of what clock, rulers, scales... they use to measure them. Dimensional constants, on the other hand, such as ℏ, c, G, e, k..., are merely human constructs whose number and values differ from one choice of units to the next. In this sense only dimensionless constants are "fundamental".

Thus Duff holds that one need not, for example, distinguish times from lengths. This is a natural point of view once one has fully accepted the theory of relativity.
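For concreteness, the paradigm dimensionless number Duff appeals to, the fine structure constant, is built out of dimensionful constants whose units cancel; in SI form:

$$\alpha = \frac{e^2}{4\pi\varepsilon_0\,\hbar c} \approx \frac{1}{137.036},$$

a value all observers agree on regardless of their choice of units.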


But in a discussion of epistemological foundations one cannot do that; relativity theory is not a basis for empirical knowledge but a result of a long and profound inquiry. Measuring times and distances requires different operations and different measurement devices, unless we have accepted electromagnetism in advance. That c is a constant is a consequence of the theory of electromagnetism (see Sect. 10.7), not a basic assumption. So from an epistemological point of view Duff is wrong; we need three quantities (TIME, LENGTH and MASS), and hence three units (s, m, kg), in order to start any stringent inquiry into the physical aspects of the external world. At a later stage of scientific development it is theoretically possible to transform to other quantities and other units as basic. It is even possible to rescale and put c = ℏ = G = 1, thus reducing these dimensional constants to pure numbers. This is fine so long as we only (!) do calculations, but as Okun observed, when comparing and applying theory to reality we need to restore units. Thus the question 'How many quantities are fundamental?' is not precise enough: we must distinguish between fundamental in an epistemological sense and fundamental in a logical/theoretical sense. In the epistemological sense, three quantities, TIME, DISTANCE and MASS, are fundamental, just as Gauss and Okun claimed, whereas in pure theory quantities are not needed at all.

One may recollect how physics textbooks are written: equations are given and derivations made in a formal way, using only letters without units. But since the physics community has to a very large extent agreed on what letters to use for quantities (m for mass, B for magnetic field, etc.), the reader can interpret the formalism as being physics and not merely an uninterpreted calculus. The crucial thing is that the validity of calculations does not depend on our interpreting certain letters as standing for specific quantities, whereas the physical meaning of these expressions is given by units attached to the letters standing for quantities.

I do recognise as very important the aim of formulating fundamental physical laws as objectively as possible, and in that endeavour trying to avoid as far as possible dependence on particular decisions made by individual humans. But it seems to me that the baby has gone with the bath water for Duff; for the Standard Model, encapsulating 19 dimensionless parameters, is surely not intended to be an uninterpreted calculus. It is still claimed to be a physical theory and if so, it must somehow be connected to empirical observations, i.e., measurements using measurement devices. And these measurement procedures are indicated by the dimensions attached to numbers. Saying that the distance to the sun is 8 light minutes, we say that if a certain measurement procedure is applied to the distance between the earth and the sun we would get the number 8. A theory without any dimensions at all cannot be anything but a formal structure.

And let us once more rehearse the conclusion drawn from Löwenheim-Skolem's theorem: no theory can by itself tell us anything whatsoever about nature. We need extra resources; in physics we need physical units whose meanings are given by concrete measurement procedures in which we use indexicals together with pointing gestures for connecting theory to reality.
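To unpack the unit-restoring step in the 8-light-minute example just given (my arithmetic, using the rounded figure in the text):

$$8 \ \text{light minutes} = 8 \times 60\ \mathrm{s} \times c \approx 480\ \mathrm{s} \times 3.00 \times 10^{8}\ \mathrm{m/s} \approx 1.4 \times 10^{11}\ \mathrm{m},$$

i.e. the dimensionless number 8 only becomes a claim about the world once the unit, and with it the measurement procedure, is specified.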


Duff is right in pointing out that dimensionless constants are fundamental in the sense that they determine the theoretical structure of the Standard Model, if the Standard Model is thought of as nothing but mathematics. That doesn't conflict with the view that LENGTH, TIME and MASS are empirically fundamental dimensions of physics. Theory should not depend on any conventional choice made by an individual, or group of individuals. But it cannot be made totally independent of human actions in general. If we try to formulate physics, or any theory about the world, from a viewpoint in which no human observers are present, we in fact make the same mistake as classical metaphysicians. A theory is a human construct, and if we forget this we implicitly assume that no humans are needed to connect theory to reality. This is the basic mistake of metaphysical realists; they try to describe the relation between theory and reality from a 'God's eye point of view', or, expressed in more modern terms, from a point of view from 'nowhere and nowhen'. This is incoherent, and Kant rightly criticised Leibniz and Wolff for it in his first Critique.

9.5 Summary

1. SPACE, TIME and BODY are, from an epistemological point of view, the fundamental concepts in physics. From a purely theoretical point of view, in which the exposition of physics begins with a purely mathematical structure, other concepts may be a better start.
2. SPACE, TIME and BODY are mutually dependent; any one of them requires the other two in descriptions of how to apply them in concrete situations.
3. The fundamentality of these concepts is mirrored in the fact that we need units for time intervals (second), spatial intervals (meter) and quantity of matter = mass (kg) when describing our most direct observations, as Gauss once claimed. It is when units are attached to numbers that the transition from pure theory (i.e., mathematics) to physical theory with predictive power is made.
4. The mutual interdependence of SPACE, TIME and BODY entails that neither classical substantivalism nor relationalism is a viable view. Points in space and time cannot be identified without reference to material objects, but neither can material objects be identified and individuated without inserting time intervals and spatial distances between them.
5. Einstein's equation says that the matter-energy tensor and the spacetime metric determine each other. That is a field version of the claim that SPACE, TIME and BODY are mutually dependent concepts.

Chapter 10

Laws

Abstract This chapter contains a thoroughly empiricist account of laws of physics. Laws of physics are divided into fundamental and derived laws. Fundamental laws are those universally generalised conditionals which function as contextual definitions of theoretical predicates introduced into physical theory. The necessity attributed to both fundamental and derived laws is interpreted as a qualifier to the semantic predicate 'true'; laws are necessarily true because they are definitions, or consequences of such definitions. Viewing nomological necessity in this way blocks the use of quantified modal logic in the semantics and ontology of physics. Holding that fundamental laws are definitions doesn't mean that they are devoid of empirical content. Fundamental laws are those universally generalised conditionals accepted as true on the basis of extensive observations of a number of singular phenomena; but the theoretical predicate used in formulating a universally generalised conditional was constructed for this very purpose. The general argument is illustrated by an analysis of the fundamental laws in classical mechanics, relativity theory, classical electromagnetism and quantum theory.

10.1 Introduction

The concept of a law of nature has been debated by philosophers for a long time and many views have been proposed. I have discerned at least eleven different positions in the debate: (i) laws are contingent relations between universals (Tooley 1977; Dretske 1977; Armstrong 1983), (ii) laws are axioms and theorems in a complete theory about the world (Lewis 1983, 1986), (iii) laws are those universally generalised conditionals true in all possible worlds (Pargetter 1984; McCall 1984; Vallentyne 1988), (iv) laws are relations between essential properties (Bird 2007; Bigelow et al. 1992), (v) there are no laws (van Fraassen 1989; Mumford 2004), (vi) laws are grounded in causal powers (Ellis 1999), (vii) laws are grounded in invariances based on dispositional properties (Woodward 1992), (viii) laws belong to non-maximal sets of counterfactually stable propositions (Lange 2009), (ix) laws are relatively a priori principles for empirical knowledge (Friedman 2001), (x) laws are primitives (Maudlin 2007; Carroll 1994) and (xi) laws are metatheoretic propositions (Roberts 2008). The list is not complete.

The debate has been characterised by Earman (2002) in the following way:

It is hard to imagine how there could be more disagreement about the fundamentals of the concept of law of nature - or any other concept so basic to the philosophy of science - than currently exists. A cursory survey of the recent literature reveals the following oppositions (among others): there are no laws of nature vs. there are/must be laws; laws express relations between universals vs. laws do not express such relations; laws are not/cannot be Humean supervenient vs. laws are/must be Humean supervenient; laws do not/cannot contain ceteris paribus clauses vs. laws do/must contain ceteris paribus clauses. One might shrug off this situation with the remark that in philosophy disagreement is par for the course. But the correct characterisation of this situation seems to me to be "disarray" rather than "disagreement". Moreover, much of the philosophical discussion of laws seems disconnected from the practice and substance of science; scientists overhearing typical philosophical debates about laws would take away the impression of scholasticism - and they would be right!

Earman's remark that the discussion about laws is disconnected from the substance and practice of science is indeed true and, in my view, one reason why it has been so inconclusive. In this chapter I will try to avoid this mistake. A fruitful approach is, I believe, to begin the discussion about laws with some concrete examples from physics, examples that everyone interested in the debate would accept as prime examples of scientific laws. My aim is then to discern the reasons why everyone agrees that these examples are laws and what information scientists themselves convey by thus calling them "laws". For it is an astonishing fact that there are many uncontroversial examples of laws; thus the extension of the predicate "natural law" is not much in dispute. By contrast, the metaphysics of laws is highly controversial among philosophers, hence also the meaning of "natural law". I assume that the meaning of a general term determines its extension, but not the other way round. Metaphysical disputes about laws are disputes about the meaning of the term "natural law".

One aspect of this dispute is whether terms occurring in laws, such as "mass", "charge", "force", "current", etc., refer to quantitative properties and relations, or whether we should conceive of them as general terms with extensions but lacking reference. Many positions in the debate seem to be motivated by metaphysical convictions about the existence of universals, such as properties, essences, relations or irreducible dispositions. Led by these convictions many philosophers try to define the concept of law in terms of the preferred metaphysical notion. This is not my cup of tea. I share empiricists' general skepticism concerning the explanatory force of postulating such things as properties, essences, relations or dispositions, and, moreover, I don't think that is the kind of reason scientists have for calling certain sentences of theories they hold true "laws".

All empiricists concur, I believe, with Hume's criticism of the idea of hidden powers being responsible for the lawful regularities in nature. But I cannot rest content with Hume's psychological explanation of why we tend to think there are lawful necessities in nature. His observation that we are conditioned to expect the continuation of an observed regularity is certainly correct, but that cannot be the full explanation of our beliefs in laws (and our use of the associated notion of physical necessity), because it is easy to conceive of situations where we have this expectation without referring to a law or principle being operative. My goal in this chapter, then, is to discern the reasons why some true sentences in science are called "laws" without postulating any metaphysics.

Van Fraassen (1989) took a harsher route by dismissing the concept of natural law as unnecessary. Being a leading empiricist, he criticised the first three options listed above (these were the main alternatives when he published his (1989)) as failing the goal of analysing the concept of law. (Several newer ideas would fall prey to more or less similar criticism.) But, he claims, this failure is no reason for concern, for we have no need for the concept of natural law. One can give a fully satisfactory account of science without assuming that there is a specific category of propositions, laws. In the strong metaphysical sense of "law", according to which laws are necessary de re propositions, I agree with van Fraassen; we have no need for such things. But the expressions "natural law", "physical law", "scientific law" etc. are commonly used, so one is prone to ask: "What is the point of making a distinction between some sentences, called 'laws', and other true, general sentences?" And what is this distinction based upon, if not a difference in modality?

Van Fraassen has, of course, not convinced opponents of a more metaphysical bent. Several philosophers, for example Bird (2007) and Bigelow et al. (1992), hold that laws are grounded in relations between essential properties of things and are therefore necessary. The empiricist's natural reaction is to ask: how do we know that? Observable phenomena cannot be used to distinguish the support for "It is a law that P" from the support for "P". By the same token, the empirical support for a sentence of the type "a is F" and for "a is necessarily F" is the same, so we have no empirical reason to make modal distinctions. Metaphysicians accept that, of course, arguing that we need assumptions about modal properties for explaining lawhood, not for making correct predictions. Well, I will here explain our calling some sentences "laws" without using modalities, so modality is not needed. But of course, it all depends on what we require of an explanation. Van Fraassen's conclusion that the concept of law is not needed for ascertaining the empirical adequacy of a theory is correct. But scientists use the concept, so they use it for some other purpose.

Maudlin (2007) argues that we should view laws as primitives, not analysable in terms of necessitation, counterfactuals, dispositions etc.; it is rather the other way round. I agree that we should not try to analyse the concept of law in terms of these metaphysical notions; no scientist has, to my knowledge, ever claimed that a universally generalised conditional in a theory is a law because it fulfils the criteria of any of these popular concepts in philosophical discourse. However, one is immediately led to ask how we obtain knowledge about laws, qua laws, if they are primitive. Maudlin admits this difficulty: "To the epistemological questions I must, with Armstrong, admit a degree of skepticism. There is no guarantee that the observable phenomena will lead us correctly to infer the laws of nature." (op. cit., p. 17).


One should observe that Maudlin talks about inferences from observable phenomena, not from observed phenomena. Everyone knows that inductive generalisations from observed phenomena to general statements about observable phenomena are uncertain, no matter whether we call the conclusion a "law" or not. So I take Maudlin to be a bit sceptical about the inference from a generalisation of observations to its being a law. In this chapter I will suggest a solution to this problem, viz., that in some cases of inductive generalisation we introduce a new predicate in order to formulate the regularity; thus the conclusion of the induction also functions as an implicit definition of the new predicate. These generalisations are in an epistemological sense fundamental laws, which is one subcategory of laws that I will discern in this chapter. The two other subcategories are derived laws and laws that are explicit definitions of new quantities.

But before arguing these points in some detail, some preliminary reflections are necessary. In the next section I will discuss the extension of the predicate "law of nature", in Sect. 10.3 I will show how to bring equations to the standard logical form of laws, and in Sect. 10.4 I will consider some semantical issues. In Sect. 10.5 I will discuss in more detail induction and concept formation, and in Sects. 10.6, 10.7, 10.8 and 10.9 I will analyse some laws in, respectively, classical mechanics, relativity theory, electromagnetism and quantum mechanics. Finally, in Sect. 10.10 I will give my explanation of why we say that laws are necessary. Postponing the discussion of physical necessity to the end of the chapter is motivated by two considerations: (i) I treat physical necessity as a semantic predicate, not as a modal operator in the object language, and (ii) I explain physical necessity in terms of laws, not the other way round.

10.2 The Extension of the Predicate "Law of Nature"

Quite often scientists do not use the word "law" when describing the core of scientific theories; instead they talk about "equations", "principles", or "postulates", as in "Schrödinger's equation", "Pauli's exclusion principle" or "Einstein's postulates". However, it is pretty obvious that these labels refer to things philosophers would call "laws of nature". And many scientists use the word "law" as a generic label for these things; Penrose, for example, has called his magnum opus The Road to Reality: A Complete Guide to the Laws of the Universe (Penrose 2005). Henceforth, I will assume that the extension of the concept of law in physics comprises a large number of equations, principles and postulates. Whether there are laws in chemistry, biology and other natural sciences depends on the analysis to be given for these fields, and I leave that for another occasion. Hence, it is implicit that "laws" here means "physical laws".

Laws in physics do not contain any ceteris paribus clauses, in contrast to so-called 'laws' in many other disciplines. The reason is obvious if one accepts my account of physical laws to be given in this chapter. Earman and Roberts (1999) have the same view, based on other arguments.


The set of laws seems to be a rather heterogeneous collection, even if we consider only physical laws, and I am unable to give a fully unified account of them. But I will discern some types which together at least cover all the well-known examples.

10.3 The Logical Form of Laws A common but not undisputed view is that laws have the logical form ∀x(Ax → Bx), i.e., that they are universally generalised conditionals, UGCs, for short. (Adherents to the theory that laws are relations between universals hold that such relations provide the metaphysical grounds for calling a true UGC a “law”.) Some simple laws are easily seen to fit this schema, such as “All pieces of metal expand when heated”, or “All portions of gas expand in proportion to increase of temperature when heated under constant pressure”. But these are of lesser interest; laws, properly so called by scientists, are more precise. For example, the rather imprecise sentence about the expansion of metals under heating has been replaced by a family of precise laws that for each metal states a coefficient for the increase in length per unit length and unit increase in temperature. Laws that relate quantities to each other in the form of equations are not obviously of the UGC form; some interpretative work is needed to show that. Consider for example the law of gravitation f =

Gma mb 2 rab

(10.1)

which gives the gravitational force f between two masses $m_a$ and $m_b$ at a distance $r_{ab}$ from each other. (G is the universal gravitational constant.) Bodies are attributed mass, and force and distance are attributed to pairs of bodies. These quantitative attributions can be expressed in the notation of predicate logic as:

The body a has mass $m_a$: $M(a, m_a)$
The body b has mass $m_b$: $M(b, m_b)$
The distance between a and b is $r_{ab}$: $D(a, b, r_{ab})$
The gravitational force between a and b is f: $F(a, b, f)$

Equation (10.1) is valid for all pairs of bodies, so the implicit generalisation is to all pairs of bodies. The letters symbolising mass, distance and force magnitudes, i.e., $m_a$, $m_b$, $r_{ab}$ and $f_{ab}$, are functions of the variables a and b. We quantify over material objects.1

1 So the application of the law of gravitation presupposes that we have identity criteria for bodies. The law itself does not presuppose the existence of bodies; it would be vacuously true if there were no bodies. But, of course, we would never be able to discover this law if there were no bodies.

Now the complete law of gravitation can be expressed as:

Law of Gravitation:

$$\forall a\,\forall b\;\big[M(a, m_a) \,\&\, M(b, m_b) \,\&\, D(a, b, r_{ab}) \,\&\, F(a, b, f_{ab}) \;\leftrightarrow\; f_{ab} = \frac{G m_a m_b}{r_{ab}^{2}}\big] \tag{10.2}$$

This sentence is not exactly of the canonical form ∀x(Ax → Bx): it is a biconditional instead of a conditional, and it is a double generalisation instead of a single one. But these are minor points; to include this and similar cases, we could simply say that laws are universally generalised conditionals or biconditionals. However, it is well known that many true sentences have this form without being laws. ("All prime ministers of Sweden are shorter than 2 m" is a case in point, taking "are" non-temporally.) So being a true, universally generalised conditional or biconditional is at most a necessary condition for being a law; our problem is to say what more is needed.

It is clear that we need a criterion for distinguishing between two classes of true sentences of this form, laws and the rest, usually called accidental generalisations. This was the central problem emerging in Goodman's seminal paper (Goodman 1946), where he discussed the problem of distinguishing between true and false counterfactuals. He found that true counterfactuals were associated with laws, whereas false ones were associated with accidental generalisations. But then, what is the distinction between true UGCs being laws and those being accidental generalisations? Since Goodman was a staunch empiricist and nominalist he tried, unsuccessfully, to solve the problem without drawing on modal notions. His conclusion was that no such distinction could be drawn, and the reader is tempted to conclude (though Goodman did not) that we need stronger resources than first order predicate logic for this task.

So the question is: what further conditions, beyond being a true UGC, should a sentence fulfil to count as a law-sentence? A very common idea is that laws are, in some sense, necessary. This form of necessity is often referred to as natural, physical or nomological necessity. In Sect. 10.10 I will discuss the relation between the predicates '... is a natural law' and '... is necessary'. Many philosophers argue that a sentence is a law because it is necessary, but in my view it is the other way round. That is to say, I will first give an account of why some sentences in physical theories are labelled "laws", and then explain the predicate "physically necessary" using the predicate "natural law".

10.4 Semantics and Ontology

Saying that we quantify over physical bodies, as in Eq. (10.2), when we express a law in first order predicate logic entails a commitment to bodies as referents for the variables. This is uncontroversial, but what about the existence of forces, masses, electromagnetic fields etc., i.e., all the quantities in physics? Do they exist? Clearly, we may consistently hold that e.g. Newton's second law, $f = ma$, is true, while denying that there are any forces, masses or accelerations. Using the predicates $M(x, m_x)$ for "mass of x is $m_x$", $A(x, a_x)$ for "acceleration of x is $a_x$" and $F(x, f_x)$ for "force on x is $f_x$", Newton's second law is:

$$\text{Newton II}: \quad \forall x\;\big[M(x, m_x) \,\&\, A(x, a_x) \,\&\, F(x, f_x) \;\leftrightarrow\; f_x = m_x a_x\big] \tag{10.3}$$

If this law is true, but not vacuously so, there exists at least one object which is the referent of the variable x, and this referent can be attributed the three quantities FORCE, MASS and ACCELERATION fulfilling the condition f = ma. Prima facie, one might think that quantities are the referents of quantitative predicates. But there is no need to reify. There must be a referent for the singular term in a true sentence, but the predicate in a true sentence need not refer; it suffices that the object talked about belongs to the extension of the predicate. The nominalist stance about quantitative predicates, and all general terms, is that they do not refer. Since I do not invoke universals as referents of quantitative predicates in my ontology, I can allow myself to use the word "quantity" as short for "quantitative predicate". In order to avoid any use-mention confusion I use, as before, SMALL CAPITALS when talking about quantities = quantitative predicates.2,3

The reader may observe that FORCE is a three-place predicate in Newton's law of gravitation (and in Coulomb's law) whereas in Newton's second law it is a two-place predicate. Furthermore, in Newton's second law it may take vectors4 as arguments at the second argument place, whereas this is not so in the law of gravitation. This tension may be resolved by recognising that expressions of the form "the force between x and y is z" may be viewed as short for "the magnitude of the force on x is z ∧ the magnitude of the force on y is z ∧ the forces are oppositely directed" (assuming as usual that no other bodies are sufficiently close to these two and that the bodies have no charge, as is the usual assumption when discussing the law of gravitation). The differences in syntax for "force between" and "force on" do not lead to any incoherence; as usual, the context is sufficient to determine what the label "f" stands for in a particular case.

2 A quantitative predicate is a general term with well-defined rules for application (given in the SI system); it is not merely a word or string of words. Therefore I need something other than quotation marks when indicating that I talk about such predicates.

3 If we simply define the property of having mass as belonging to the set of objects satisfying the predicate "mass of ... is ... kg", and similarly for other quantities, there is of course neither any problem, nor any gain, in accepting that quantitative predicates refer to properties and relations. The real ontological dispute is between those who hold that properties and relations are something else than mere sets of objects and those who deny that.

4 These vectors are mathematical objects, which I accept in my ontology; but there is no need to assume that a vector in the mathematical sense represents, or corresponds to, a physical universal, see Sect. 9.4. Moreover, numbers, and all mathematical objects constructed from numbers, are most naturally viewed as individuals, not universals, see Chap. 4.

I guess that some readers, those who call themselves realists, are now inclined to ask: "But do you really deny that there are masses, forces, electromagnetic fields, energy, etc., in the real physical world? Don't we have good reasons to say that these things exist and that our discovery of them is the best explanation for the success of physics?" This argument, which is of the form "inference to the best explanation", is often rehearsed by realists as their core argument for scientific realism. My reply is: what one counts as the best explanation for the success of science, in this case physics, depends very much on one's metaphysical world view. What to count as a scientific explanation is a highly controversial issue, and the question about the best explanation is, if possible, even more controversial. Van Fraassen, to mention the most well-known anti-realist, points out that even if we endorse the rule of inference to the best explanation, the scientific realist needs an extra premiss:

The realist asks us to choose between different hypotheses that explain regularities in certain ways; but his opponent always wishes to choose among hypotheses of the form 'theory T is empirically adequate.' So the realist will need his special extra premiss that every universal regularity in nature needs an explanation, before the rule will make realists of us all. (van Fraassen 1980, 21)

Van Fraassen's most general argument against the no-miracle argument is that explanations belong to pragmatics; they have no evidentiary value; explanations don't count either for or against a theory.5 In general, I see no added explanatory value in assuming that quantitative predicates refer to physical universals. Predictive power is the prime epistemic demand upon scientific theories, and explanatory force is to a great extent context sensitive, see Sect. 3.8. Two persons agreeing about a certain theory's testability and predictive power may nevertheless disagree vastly about its explanatory value, due to their background assumptions and world views. A particularly illustrative case is quantum mechanics, where all agree on its astonishingly accurate predictive power, whereas there are still, 90 years after its formulation, profound disagreements about its interpretation. The different interpretations are clearly based on different metaphysical presuppositions. The conclusion to be drawn is that explanatory power cannot be used as an argument for realism about physical properties and relations; it begs the question. For further discussions of explanations see Chap. 6.

Perhaps the most severe problem for those who believe that quantitative predicates refer to properties is to provide identity criteria for such properties. The problem is that quantitative predicates can be transformed into each other via natural constants. For example, if they hold, as I guess they would, that length and time are different properties, they have a problem with the common convention of putting the velocity of light equal to unity without dimension! Doing so enables us to measure distances as times, i.e. to hold that the quantities LENGTH and TIME are coextensional predicates. (And we are accustomed to talking about lengths in time units in astrophysics.) One cannot at the same time accept that putting c = 1 without dimension is a mere convention and still distinguish TIME and LENGTH as referring to different properties.

5 This and the foregoing paragraph are not identical to those in the published paper (Johansson 2019). A referee pointed out that my earlier formulation gave the impression that van Fraassen accepted an instance of the inference to the best explanation, which he did not.


Henry Kyburg (1997) discussed the ontological status of quantities and arrived at the position that quantities are functions whose ranges are magnitudes. One may think that Kyburg assumes that magnitudes are properties of physical objects, states of affairs or events. If so, I beg to disagree; Carnap's view, that values of quantities are real numbers, is all we need.

10.5 Induction, Concept Formation and Discovery of Fundamental Laws

Our belief in laws of nature is grounded in observations of the results of systematic experiments. In most cases the connections between a particular law and observations are indirect, being transmitted by long chains of derivations, assumptions about measurement instruments, etc. For example, one cannot directly observe electric fields and electric charges and observe whether values of these quantities instantiate or conflict with Maxwell's first equation. No hypothesis, or law, can be tested in isolation; in testing we always assume a certain amount of background information, which could contain mistaken assumptions. This conclusion has often been called the Duhem-Quine thesis, albeit the exact formulation of this thesis is a matter of debate.

Sometimes we observe a regularity in a series of experiments, and sometimes this regularity is still observed when the experimental series is prolonged, in which case we make an inductive inference to the general conclusion. The formulation of such a generality is sometimes accompanied by the introduction of a new quantity, a quantitative predicate so far not thought of. In such cases the inductive generalisation is a candidate for being a scientific law. These two steps are fundamental in the development of a new theory and, as we will see in the cases of classical mechanics and electromagnetism, to be discussed in Sects. 10.6 and 10.8, this is how fundamental laws are established.

There is a degree of circularity in the application of a physical theory to concrete situations. Consider for example electromagnetism: in order to determine whether a system is sufficiently isolated we need to know whether there are any measurable electromagnetic fields from external sources affecting the system in question, and that we cannot know unless we have determined a way of measuring these fields. In principle no system is ever completely isolated, of course; the rest of the world is not infinitely far away, and hence the probability of interaction is not exactly zero. So the question of isolation is a practical question: is the system being observed well enough isolated that possible interactions with the rest of the world only affect the system's state within the margins of error? But this is precisely the reason why one cannot, other than analytically, separate discoveries of laws and the introduction of precise quantities in theory development. If we fail to sufficiently isolate the system, we will sooner or later hit upon a case where unknown factors interfere and disturb the predicted outcomes, thus producing a counter-instance.
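For reference, the Maxwell equation alluded to above, usually listed first (Gauss's law), reads in SI form

$$\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0},$$

where neither the field $\mathbf{E}$ nor the charge density $\rho$ is directly observable, which is the point being made.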


Neither in practice nor in conceptual analysis can we proceed by first defining a set of new quantities and then performing experiments to see how they relate to each other. Observing, experimenting and developing quantitative concepts are inseparably intertwined, as will be clearly shown in Sects. 10.6, 10.7, 10.8 and 10.9. This is why fundamental laws are at the same time implicit definitions of new predicates and carriers of empirical content.

10.5.1 Laws, Physical Theories and Observations: Top-Down Or Bottom-Up?

My conception of physical theories might be described as "bottom-up": theory construction starts with descriptions of observed regularities. By contrast, the common view is that a physical theory is a mathematical structure built upon some abstract principles, whose laws are declared to be fundamental in the logical sense. Starting from "above", the concepts occurring in fundamental laws are not yet given any physical interpretation; they merely stand in mathematical relations to each other. The physical interpretation is only given when part of this structure, the empirical "edges", is compared with observations, or, as in the semantic view of theories, a part of the structure is thought of as a mapping of observed phenomena. In this top-down view one faces the task of explaining how mathematical equations and functions relate to observations. According to classical empiricism this is provided by "coordination principles" (Reichenbach 1920). In the words of Friedman (2001, 76): "They serve as general rules for setting up a coordination or correspondence between the abstract mathematical representations … and concrete empirical phenomena to which these representations are intended to apply." Somewhat similar views are expressed by many philosophers of physics, van Fraassen (1980) being one clear example.

The problem with this statement is the word "phenomena". In order to set up a correspondence between general statements in a theory ("abstract mathematical representations") and something else, you must describe that something else, "the phenomena", in some way. One cannot establish any correspondence between a mathematical structure and something which is not yet organised as the content of a perception. In other words, the correspondence is a correspondence between a mathematical structure and a part of the contents of our observations, i.e., descriptions of observations. The question is what predicates to use in such descriptions. If these are purely empirical predicates whose application rules are fully independent of any theory, we may truly ask how there could be a correspondence between "phenomena" thus described and theoretical statements constructed independently of any description of empirical "phenomena".

Think for example of electromagnetic theory: it describes the dynamics of charged particles in electric and magnetic fields; it relates electric and magnetic fields and the motion of charged particles to each other. But we cannot directly observe electric fields, magnetic fields, or charges.


What we observe are physical bodies in space and time. (Observing the reading of a meter of some kind is obviously an observation of a body at a certain place.) In order to establish a correlation between descriptions in terms of moving bodies and electromagnetic predictions we need to sort out those bodies that are sensitive to electric and/or magnetic fields and compare their motions with theory. But in doing so we use the electromagnetic concepts. For example, we attribute charge to some bodies, and different charges to different bodies with the same mechanical properties. So there is no longer any theory-independent individuation of the things attributed electromagnetic properties in the empirical realm. Descriptions of "phenomena", in the sense intended by Friedman and others, depend on the theory. I cannot see how it is possible to sort out those motions of observable bodies that are related to electric and magnetic interactions without using electromagnetic concepts, or some others with the same extensions. So the correspondence does not have the character of a correspondence between items in two conceptually independent realms.

It is of course easy to set up a correspondence between two domains containing different types of entities if the individuation of things in one domain is determined by the individuation of things in the other domain, as in the "correspondence" between facts (German "Tatsache") and sentences (German "Sätze") in Wittgenstein's Tractatus. No empirical investigations are needed, or indeed possible, to check this correspondence; it is in a profound sense trivial. (Wittgenstein claimed that it could be shown, albeit not talked about; but I doubt the intelligibility of this statement.) Certainly, the relation between theory and empirical evidence is not of this kind. So how should we understand the correspondence between mathematical structures and empirical phenomena from the viewpoint of Friedman, Reichenbach and others using this concept? In fact I don't see how one could give substantial content to the notion of correspondence in the sense intended by Reichenbach and his followers. Hence I don't think the notions of correspondence or coordinating principles are useful for understanding the relation between theory and empirical evidence; either the correspondence is completely trivial or else it is impossible.

A similar critique may be directed against van Fraassen's notion of an isomorphism between "the empirical substructure" of a model of a theory and "appearances", the latter being characterised as follows: "[T]he structures that can be described in experimental and measurement reports we can call appearances" (van Fraassen 1980, 64). The problem with this conception is that experimental and measurement reports are almost never void of theoretical predicates, such as "mass" or "electric field", and these are defined within a system of equations. Hence most appearances cannot be described without employing theoretical concepts, on van Fraassen's view. And how are we to describe structures of appearances without using theoretical concepts? So the isomorphism between an empirical substructure and appearances cannot be conceived as an independent empirical check on the model, or as a relation between theory and evidence. Van Fraassen faces the same problem as Kuhn: accepting that descriptions of experiences depend on the theory to be tested conflicts with the empiricist stance that there is a basis, a subset of empirical observation reports, which is theory-independent.


Van Fraassen's conception of scientific theories is one version of the general idea that a theory is a set of models, and my critique of van Fraassen's conception of the relation between model and theory applies generally. Models must be described in order to be explicitly related in any way to a theory, and descriptions of models require theoretical predicates.6

Reflections similar to these might have been the reason why Kuhn (1970) drew the conclusion that there are no theory-independent observations whatsoever. This general conclusion is false; there is a meagre basis of theory-independent observations, in physics exemplified by positions and motions of nearby visible bodies. But Kuhn had a point; sometimes we introduce new theoretical concepts when generalising our observations, the result being what I call "fundamental laws".7

Summarising this section: my view is that observation reports, ordinarily so called, in many cases utilise theoretical predicates. But it is possible to discern a subset which does not utilise any such theoretical terms. This subset is the ultimate empirical basis for theory construction. Some fundamental theoretical concepts are constructed in the process of inductive generalisation from such observation reports, and using these we can go on to explicitly define new useful quantities and thus construct an empirical theory. I will now show that classical mechanics, relativity theory and classical electromagnetism fit this account of laws, and how they are based on observations.

10.6 Laws and Fundamental Quantities in Classical Mechanics

10.6.1 The Discovery of Momentum Conservation and the Introduction of MASS and FORCE

Classical mechanics consists of kinematics and dynamics. Kinematics describes the motion of physical bodies, usually called "particles" in the theoretical exposition since their inner structure is not considered, while dynamics is the theory about interactions between particles.

6 A similar point was made by Halvorson: 'Thus, it is doubtful that there are any "language-free" account of mathematical structures and hence no plausible language-free semantic view of theories.' (Halvorson 2019, 173).
7 In the postscript to the second edition of The Structure of Scientific Revolutions, Kuhn used the concept of disciplinary matrix instead of the concept of paradigm. The first component of the disciplinary matrix is the set of symbolic generalisations, and it seems pretty clear that by this term he refers to what we usually call scientific laws. But why didn't he use the term "law"? One reason was, I think, that using the term "law" one is inclined to miss his point that the terms in a theory get their meaning implicitly (just as I argued above), by being used in the theory, not by any explicit definition.


Classical mechanics is from an epistemological point of view the fundamental physical discipline; motions of bodies are clearly the most directly observed events. But it is also basic from a conceptual perspective, because all physical quantities ultimately are defined in terms of TIME, DISTANCE and MASS. This fact is easily recognised when looking at the definitions of the SI units, and I have thoroughly discussed this topic in Chap. 9.

TIME and DISTANCE are the two fundamental quantities in kinematics; these two are used when describing particles' positions, velocities and accelerations. These quantities are operationally defined in terms of how to use meter sticks and clocks in measurements.8 In performing such measurements we take for granted that the physical objects utilised as measurement devices are invariant when being moved from one place or time to another: that meter sticks don't change length and that clocks tick at the same rate when moved from one place to another. These are not purely empirical assumptions; if we have determined concrete procedures by which to compare time intervals and distances, i.e., instructions about time and distance measurements, we have stated how to apply the truth conditions for, for example, the statement that two objects at different places or at different times have the same length.

But then, how is it possible to replace e.g. a time unit with a better one? Why did we replace the definition of one second as 1/86400 of the mean solar day with a certain number of oscillations of a certain kind of electromagnetic radiation? Well, one reason was that according to our theory of gravitation the length of the solar day varies slightly, whereas quantum theory tells us that nothing affects the frequency of that radiation. This topic is discussed at considerable length in van Fraassen (2008, 130 ff).

How the fundamental units (such as the meter and the second) are to be applied in practical measurements is decided by the General Conference on Weights and Measures (CGPM), and these decisions may change; for example, the meter definition was changed from an ostensive one ("One meter is the length of the meter prototype in Sèvres") first, in 1960, to one based on the wavelength of a certain krypton radiation, and then, in 1983, to one based on the distance travelled by light in vacuum during a very short period of time. But this change didn't affect the lengths attributed to objects (within a very small margin of uncertainty), and since this is what counts, and not the intension of the expression "length", we may conclude that the theory-ladenness of this predicate is innocent.
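For reference, the operational content of the two current definitions just mentioned can be compressed as follows (a paraphrase, not the official wording):

one second = the duration of 9 192 631 770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of caesium-133;

one meter = the distance travelled by light in vacuum during 1/299 792 458 of a second.

Both definitions are recipes for measurement procedures rather than descriptions of a referent, which is the point at issue here.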

8 This view has often been criticised with the argument that changes of operational definitions would change the meaning of quantitative predicates, which is taken to be unacceptable. My reply is: so much the worse for the concept of meaning. When the definition of the meter unit was changed from being based on the meter prototype to a certain distance travelled by light in vacuum, the extension of the predicate "one meter" underwent a slight change, since its precision increased. But since I have no need for referents of quantities, there is no conceptual problem here. Why bother about meaning?


It may be observed that "fundamental" here means "fundamental relative to the theory at hand". It is not a claim about fundamentality in an absolute or metaphysical sense. The reason why we need two fundamental quantities in kinematics is that, so long as we do not consider relativity theory, we need two kinds of measuring instruments (meter sticks and clocks) to measure and observe kinematic quantities.9 One also needs some geometry and arithmetic in doing mechanics, but these disciplines belong to mathematics; no measuring instruments are needed.

How, then, do we proceed to dynamics? The actual history is illuminating. Using only kinematical quantities, Descartes failed to construct an empirically adequate theory about interactions between bodies. But some years later John Wallis took the first step in advancing a successful dynamics, according to Rothman (1989, 85). In a report to the Royal Society in 1668 Wallis described his measurements of collisions of pendulums. Huygens and Wren performed similar experiments. All three found that there is a constant proportion between the velocity changes of two colliding bodies:

Δv1 / Δv2 = constant    (10.4)

which can be written

k1 Δv1 = −k2 Δv2    (10.5)

The minus sign is introduced so as to have both k1 and k2 positive. By testing with different bodies they found that the constants really are constants following the bodies, i.e., they are permanent attributes of the bodies. These constant attributes are their masses, and we may choose a mass prototype giving us the unit. So we have

m1 Δv1 = −m2 Δv2    (10.6)

9 When we proceed to relativity theory we can, since the velocity of light is a universal constant, reduce the number of fundamental quantities to only one, viz., TIME, since distances can be expressed in terms of times for light travel. So, considering physics in its entirety, we may say that only one quantity is fundamental. But we have arrived at this conclusion using classical theories as starting points (i.e., classical mechanics and electromagnetism) and these theories presuppose two fundamental quantities, TIME and LENGTH. One might say, following Wittgenstein, that once we have climbed the ladder we may throw it away!
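A hypothetical numerical illustration (the numbers are mine, not from any of the historical reports) of how Eq. (10.6) fixes mass ratios: suppose that in a collision body 1's velocity changes by Δv1 = −0.30 m/s while body 2's changes by Δv2 = +0.10 m/s. Then

m1/m2 = −Δv2/Δv1 = 0.10/0.30 = 1/3,

so if body 2 is chosen as the prototype, with mass 1 kg by stipulation, body 1 is assigned the mass 1/3 kg. Nothing over and above the recorded velocity changes is needed for the assignment.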


This is the law of momentum conservation, a law that from an epistemological point of view must be said to be fundamental in physics.10

The very first line of Newton's Principia is the definition "The quantity of matter is the measure of the same, arising from its density and bulk conjointly."11 This quantity he then calls "mass". But how is density to be measured without using the quantity mass? In fact, Newton relied on the findings of Wallis, Huygens and Wren, as is clear from the Scholium following Corollary VI in the first section of the first book of Principia. Wallis, Wren and Huygens had introduced the concept of quantity of matter without using the word "mass".

If we now divide both sides of Eq. (10.6) by the collision time, we get (neglecting the difference between differentials and derivatives, since this is of no relevance for the present argument):

m1 a1 = −m2 a2    (10.7)

Let us further introduce the term "force", labelled "f", as shorthand for the product of mass and acceleration. This gives us Newton's second and third laws:

N2: f = ma    (10.8)

N3: f1 = −f2    (10.9)

Thus we have obtained Newton's second and third laws on the basis of an observed regularity, viz., momentum conservation during collisions between bodies.

Forces are often thought to be causes of accelerations; when a body changes its velocity, we say it has been affected by a force. This force is the momentum change of another body, perhaps a remote one, in which case the momentum exchange is transmitted by a field. Thus the claim that force is defined as dp/dt is compatible with the common conception that forces are causes; it is the momentum change of another body that is the cause of an observed body's momentum change. However, if we want to use the causal idiom, we must say that cause and effect occur simultaneously. Furthermore, the notion that forces are causes is hardly compatible with my stance that quantitative predicates do not refer to anything, since causes are normally presupposed to be a kind of entity. But causes are not entities; see Chap. 8.
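Spelled out (this is only the algebra behind Eqs. (10.7)-(10.9), not an additional assumption): for two colliding bodies momentum conservation says that p1 + p2 is constant, so

d(p1 + p2)/dt = 0,  i.e.  dp1/dt = −dp2/dt.

Defining fi := dpi/dt, which for constant mass equals mi ai (the second law read as a definition), immediately gives f1 = −f2, Newton's third law.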

10 Konopinski's account of classical mechanics in Konopinski (1969) begins similarly by considering collisions; he states that "The Principle that the total momentum of any isolated system is conserved forms part of the basic framework on which all physical theory has been constructed." (p. 35).
11 Mach criticised this definition and rightly observed that mass must be defined using observations of interactions between bodies: "Definition 1 is, as has already been set forth, a pseudo-definition. The concept of mass is not made clearer by describing mass as the product of the volume into density as density itself denotes simply the mass of unit volume. The true definition of mass can be deduced only from the dynamical relations of bodies." (Mach 1960, 241).



10.6.2 Types of Laws in Classical Mechanics

I have so far discerned three different types of laws in classical mechanics:

• Fundamental laws are those UGCs which at the same time express generalisations about observations and function as implicit definitions of new quantities. (I will generalise and give a more precise definition of a fundamental law in Sect. 10.9.)
• Explicit definitions of new quantities, i.e., quantitative predicates.
• Derived laws, which logically follow from fundamental laws and explicit definitions of new quantities.

Let's now look at the law of gravitation to see whether it fits into any of these categories:

f = G m1m2/r²    (10.10)

The force can be replaced by its definiens, ma, so we have (identifying m = m1):

a = G m2/r²    (10.11)

Thus we can derive the acceleration of a body,12 given knowledge of the mass of the other gravitating body and its distance. Now we can check the law of gravitation by measuring the body's acceleration, and we may find that prediction and observation always coincide. So the law of gravitation seems to be a purely empirical law.

But isn't this remarkable? How could it be that the quantities MASS, ACCELERATION, FORCE and DISTANCE, defined independently of the law of gravitation, without exception also satisfy this extra condition? Collisions between bodies and gravitational interactions seem to be quite different kinds of events; it appears to be a cosmic coincidence, a brute fact that cannot be further explained. But, surely, there must be an explanation. The first step is, as is well known, to realise that we are talking about two different mass concepts, INERTIAL MASS and GRAVITATIONAL MASS; INERTIAL MASS is defined using the regularity observed in collisions, GRAVITATIONAL MASS is defined using the regularity observed when bodies interact at a distance. But this does not really remove our bewilderment, for now one asks instead: how could it be that the gravitational and inertial masses of all bodies are proportional?
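Spelled out with the two mass concepts kept apart (a standard way of putting the point, not specific to this book): for a body with inertial mass m_i and gravitational mass m_g falling towards a body of gravitational mass M,

m_i a = G m_g M/r²,  so  a = (m_g/m_i) · GM/r².

The observed fact that all bodies at the same place fall with the same acceleration thus amounts to the ratio m_g/m_i being the same for all bodies, which is the proportionality referred to in the text.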

12 I have here presumed that we are talking about a body involved only in gravitational interaction.


Newton saw it, but had no explanation. It was Einstein who solved the problem within the general theory of relativity. The solution is simply that gravitational and inertial mass is the same quantity, since gravitation and inertia at bottom are not different kinds of phenomena. This is the basic idea of general relativity.

This step, by the way, strongly supports my view, presented in Sects. 7.4 and 10.4, that quantities should be understood as quantitative predicates, not as physical universals. It is not only superfluous to assume that quantitative predicates refer to universals, it also raises obstacles for our understanding of relativity theory. If we accept GTR and assume that quantitative predicates refer to universals, we must either say that the two predicates GRAVITATIONAL MASS and INERTIAL MASS refer to the same universal, or that they refer to two different universals with the same extension. Both alternatives give us more problems than they solve. The very point of a quantitative predicate is that its identity is determined purely extensionally, and that removes any need to postulate a referent for it.

Einstein's crucial step was to generalise the relativity principle used in the special theory of relativity. Special relativity is restricted to inertial, i.e. non-accelerated, systems. In the general theory of relativity this restriction is removed; all coordinate systems, whether accelerated or not, are equally legitimate and the laws should have the same form in all of them. This has the consequence that the distinction between gravitation and inertia disappears. (Einstein's argument was that we cannot by local observations decide whether the force on a body is gravitational attraction from another body, or inertia due to the system being accelerated.)

Returning to classical mechanics considered per se and disregarding relativity theory, we may conclude that INERTIAL MASS is determined by the law of momentum conservation, that GRAVITATIONAL MASS is determined by the law of gravitation and that FORCE is explicitly defined as ma. Thus the law of gravitation satisfies the definition of a fundamental law (an empirical regularity and simultaneously an implicit definition of a new concept). In classical mechanics we now have two fundamental laws, momentum conservation and the law of gravitation, and two fundamental quantities, defined by these laws. Together with TIME, DISTANCE and functions of these, we have a complete set of fundamental quantities in classical mechanics. All other quantities, such as FORCE, ENERGY, WORK, POWER, ANGULAR MOMENTUM etc., can be explicitly defined in terms of the kinematic concepts + INERTIAL MASS + GRAVITATIONAL MASS.

There are several different theory formulations of classical mechanics, but they all rely on the kinematical quantities + MASS, although at first glance one might think otherwise. The fundamental notions in, for example, Hamilton's and Lagrange's versions of classical mechanics are generalised coordinates and their corresponding momenta, which are treated as independent variables. But when applying the theory to observable phenomena one identifies MOMENTUM as mv, where v is measured in the chosen generalised coordinate. And just as in my account, FORCE is introduced as a derivative notion, this time as the derivative of a potential function.
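A minimal illustration of this point (standard textbook material, not an argument peculiar to this book): for a single particle with Lagrangian L = ½mẋ² − V(x), the canonical momentum is

p = ∂L/∂ẋ = mẋ,

and the Euler-Lagrange equation d/dt(∂L/∂ẋ) = ∂L/∂x gives mẍ = −dV/dx; "force" reappears as (minus) the derivative of the potential, and MASS enters exactly as in the Newtonian formulation.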


So the empirical foundation of any version of classical mechanics consists of descriptions of observed regularities in which MASS is used. This is the reason why I concur with Gauss, who famously held that TIME, DISTANCE and MASS are the fundamental quantities in physics.

Being an empiricist, I believe it crucial to state, for any empirical theory we may consider, the empirical basis consisting of theory-independent observations. In physics (and I would claim also in the rest of natural science) this basis consists of observations of bodies at particular places at particular times.13 In biology and chemistry we may be interested in how things smell or taste, or what colour they have, but still, the things observed are bodies at particular places. So I see no alternative to taking the kinematical concepts of classical mechanics, i.e., time, distance and the time derivatives of distance, as pre-theoretical and given in advance, whereas mass (inertial and gravitational) is introduced with our formulation of the laws of momentum conservation and gravitation respectively. The number of fundamental quantities implicitly defined and the number of fundamental laws must be the same. All other useful quantities, such as FORCE, KINETIC ENERGY, POWER, etc., can then be introduced by explicit definitions in terms of previously defined quantities.

It is impossible to state the law of momentum conservation without using the concept of INERTIAL MASS, and this is the crucial point. Equation (10.6), interpreted as short for a UGC ("For every pair of colliding bodies …"), is at the same time an inductive generalisation from a set of observations and a contextual, i.e., implicit, definition of the quantity INERTIAL MASS. By similar reasoning we may conclude that Eq. (10.10) is at the same time an implicit definition of GRAVITATIONAL MASS and an inductive generalisation of observations.14 In the very construction of quantitative concepts in classical mechanics we use fundamental laws of nature as definitions, or better: discovering new laws and constructing new quantitative concepts go hand in hand; they are closely related processes. The traditional view that one first has to define one's concepts and then apply them in describing one's observations is incorrect. This is, by the way, one good reason to dismiss the analytic-synthetic distinction as a fundamental premiss in epistemology.

Newton's second law is an explicit definition of the quantitative predicate FORCE; it does not express any generalisation of observations, and "force" can always be replaced by "mass times acceleration". No one has ever directly observed a force and compared it to the product of mass and acceleration.

13 Bridgman once expressed an almost similar opinion: "What we observe are material bodies with or without charges (including eventually in this category electrons), their positions, motions, and the forces to which they are subject." (Bridgman 1960, 58). I disagree on two points: (i) we never observe forces, we only observe moving bodies, and (ii) electrons are not bodies.
14 My view that some laws are implicit definitions has some affinities with Herbert Simon's view (1970) on axioms of physical theories: "In the former case, new definable terms are likely to enter the system embedded in statements of physical law. These statements will partake of the nature both of definitions and of laws." (Simon 1970, 22-23).


The critical reader might point to calculations in statics, where we attribute several forces to an element in e.g. a building. Nothing moves, so there are no accelerations; still we analyse the stability of the building by calculating forces at different points. Doesn't this indicate that we in physics assume forces? No. Consider an element in a construction. Since it is not moving, the vector sum of all forces upon this element is zero. We can replace each force fi with its definiens mi ai, and the total acceleration is of course zero. Talking about forces is convenient but logically superfluous. As argued earlier, we have no reason to assume that quantitative predicates refer to universals, in this case that the predicate FORCE refers to a force; we only need to assume the existence of those things talked about, i.e., that the singular terms refer. We may paraphrase a sentence attributing a force to an object as in the second paragraph of Sect. 10.4.

A similar conclusion can be drawn about other predicates, such as colour words. We benefit greatly from our ability for colour discrimination and our use of colour predicates, but that does not entail that we have reason to believe in the existence of referents of colour words in predicate position. Why not? Because postulating referents for predicates has no additional testable consequences beyond those of the sentences in which the colour words are used; no empirical evidence could be had for such assumptions. In ordinary discourse we talk as if colours exist. But if so, how many colours are there in reality? It is a well-known fact that different cultures divide the spectrum differently, so identity criteria for colours are relative to culture, and there are no arguments for holding that any one way of differentiating colours is the correct one; we make colour distinctions when we need them. The sensible conclusion is that there are no colours in the real world.

Newton's third law is a consequence of momentum conservation and the definition of force. There are lots of such relations between quantities derivable from the definitions, some of which are called "laws". So if we want to keep close to the established use of the expression "physical law", we should say that some laws are consequences of other laws.

I will now extend the discussion to fundamental laws in two other theories: the special theory of relativity and classical electromagnetism.

10.7 Laws in Special Theory of Relativity

The special theory of relativity is built on two fundamental postulates, the relativity principle and the constancy of the speed of light. The relativity principle says that experimental results should be the same in all inertial systems, or in Einstein's words:

Special principle of relativity: If a system of coordinates K is chosen so that, in relation to it, physical laws hold good in their simplest form, the same laws hold good in relation to any other system of coordinates K' moving in uniform translation relatively to K. (Lorentz 1952, Part A, §1)


How do we know this is true? The basic motivation derives from an objectivity demand on physical descriptions, viz., that the physical content of the description of the state of a physical system should be independent of the observer's perspective. Thus if two observers move with a constant velocity relative to each other, they should give similar descriptions of a physical system they both observe. So we know the relativity principle is true because we hold it true; apparent violations are explained as mistaken observations of, e.g., the uniformity of the observer's motion.

Galilei apparently was the first to formulate a relativity principle, and Newton followed suit. However, Newton did not view it as a fundamental principle for objective descriptions; he claims to have derived it as a corollary (Corollary V) in Principia. But that derivation is a non sequitur, as shown by Harvey Brown (2005, ch. 3). I think it fair to say that Einstein was the first to conceive of it as a fundamental epistemological principle, a requirement of observer independence. The relativity principle does not fit into any of the three categories of laws so far identified, and that is perhaps a reason why it is not called a law. It is a condition for objective descriptions of nature.

The constancy of the velocity of light is generally stated as a basic postulate of special relativity. However, it is in fact not a fundamental law; it follows from the relativity principle, Newton's laws and Maxwell's equations, as shown by Feynman et al. (1964, 18-5) and Dunstan (2008). Dunstan concludes:

Special relativity derives directly from the principle of relativity and from Newton's laws of motion. The parameter values of a=1 or k=0 were compatible with all experimental information available in Newton's day. However, Maxwell's equations permit a more accurate determination, from Faraday's and Ampère's experimental work and Maxwell's own introduction of the displacement current. Discussions of the Michelson and Morley experiment and of theories of the ether are quite unnecessary. The behaviour and the mechanism of the propagation of light are not at the foundations of special relativity. (op. cit. p. 1865)

The parameter a in the transformation formula is 1/√(1 − ε0μ0v²) and k = −ε0μ0, i.e. k = −1/c² (so that, with ε0μ0 = 1/c², a is just the familiar Lorentz factor). The constancy of the speed of light is thus a derived law. It is rather well known, too, that Einstein was not primarily motivated by the negative outcome of Michelson and Morley's attempt to measure the ether wind when he stated that the velocity of light is constant and an upper limit for all velocities. His fundamental inspiration was the thought experiment of an observer travelling with the same velocity as an electromagnetic wave front. He realised that such an observer would see the front as a stationary electromagnetic field, and that contradicts Maxwell's equations. Hence, the assumption of an observer travelling at the velocity of light must be wrong. So Dunstan's proof is a mere spelling out of an older insight.

It is interesting to note that we begin with two kinematical quantities in classical mechanics, TIME and DISTANCE, which require two distinct kinds of measurement devices, and then, based on this theory + electromagnetism, we have constructed a more general theory, the special theory of relativity, which entails that we can reduce the number of fundamental kinematical quantities to one: we no longer need meter sticks, only clocks!15



10.8 Laws of Electromagnetism

The conceptual structure of electromagnetism is more convoluted than that of mechanics. The first thing to notice is that although the words "electricity" and "magnetism" were used long before there was any theory about these phenomena, the more precise quantitative concepts were not fully developed until Maxwell published his treatise (Maxwell 1873) and introduced the DISPLACEMENT CURRENT. The second thing to notice is that one cannot find any single law in electromagnetism that individually introduces a new quantity; it is only jointly that a set of laws implicitly defines the electromagnetic quantities. There is general agreement that the fundamental laws are Maxwell's equations and Lorentz's law, see e.g. Feynman et al. (1964, ch. 18), so these laws together should function as joint implicit definitions of the fundamental quantities in electromagnetism. And indeed they do.

The effects of electromagnetic interactions are observable as changes in the motions of bodies.16 So in electromagnetic theory we need a law that connects electromagnetic quantities to mechanical quantities attributed to bodies, such as mass and velocity, which is done by Lorentz's law:

F = q(E + v × B)    (10.12)

where F stands for the force on a body, q for its charge, v for its velocity, E for the electric field and B for the magnetic field. (Boldface letters stand for vector quantities.) Then we need laws that implicitly define CHARGE, ELECTRIC FIELD and MAGNETIC FIELD. That is done by Maxwell's equations:

∇ · E = ρ/ε0    (10.13)

∇ · B = 0    (10.14)

∇ × E = −∂B/∂t    (10.15)

∇ × B = (4πk/c²) J + (1/c²) ∂E/∂t    (10.16)

15 The connected question about the number of fundamental dimensional constants is a topic of debate, see Sect. 9.4 and (Duff et al. 2002).
16 There is no other option, as observed by e.g. Born (1924, 189): "Electromagnetic forces are never observable except in connection with bodies".

There are no independent definitions of the electromagnetic quantities, so these laws must also function as implicit definitions of these quantities. This means that the number of independent laws must equal the number of fundamental electromagnetic quantities. In order to see this clearly, we cannot simply count the number of equations in the form given above, since several of the quantities are vectorial and each component of such a quantity is independent of the others. Furthermore, these equations are invariant under Lorentz transformations, so for the present purpose it is more convenient to express Maxwell's equations in Lorentz invariant form with the help of the tensor

F^{μ,ν} = ⎛  0    Ex   Ey   Ez ⎞
          ⎜ −Ex   0    Bz  −By ⎟
          ⎜ −Ey  −Bz   0    Bx ⎟
          ⎝ −Ez   By  −Bx   0  ⎠

its dual F_{μ,ν}, and the fourcurrent J = (ρ, Jx, Jy, Jz), as the two equations:

J^β = ∂F^{β,α}/∂x^α    (10.17)

0 = ∂_α F_{β,γ} + ∂_γ F_{α,β} + ∂_β F_{γ,α}    (10.18)

where the inhomogeneous equation expresses Eqs. (10.13) and (10.16) and the homogeneous one expresses (10.14) and (10.15). Now, (10.17) and (10.18) are in fact each four independent equations, and since Lorentz's law consists of three independent equations, we have in total 11 equations. That equals the number of quantities we need to determine: three components of the electric field, three of the magnetic field, four components of the fourcurrent and, finally, total charge. In other words, Maxwell's equations + Lorentz's law together completely determine the quantities ELECTRIC FIELD, MAGNETIC FIELD, TOTAL CHARGE and FOURCURRENT, given only the directly observable properties of a system. Hence, we may say that Maxwell's equations + Lorentz's law together constitute implicit definitions of the fundamental electromagnetic quantities; they are the fundamental laws of electromagnetism. Other electromagnetic quantities are explicitly defined in terms of the fundamental quantities, and all other laws of electromagnetism are derivable from the fundamental ones + explicit definitions.
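Tallying the count just given (this merely arranges the numbers already stated in the text):

independent equations: inhomogeneous (10.17): 4; homogeneous (10.18): 4; Lorentz's law (10.12): 3; in total 11.
quantities to determine: E: 3; B: 3; fourcurrent J: 4; total charge: 1; in total 11.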


10.9 Fundamental Laws that Do Not Introduce New Quantities

There are quite a number of basic principles which are generally said to be fundamental laws, but which do not establish relations between quantities. One example is, as we just saw, the relativity principle. Three other examples are:

Conservation of charge: ∀x, if x is a closed system and q(x) is the total charge in x, then ∂q/∂t = 0.

Pauli's exclusion principle: ∀x∀y∀z, if x is a quantum system, y and z are fermions belonging to that system and y ≠ z, and if S(y) is the ordered quadruple of quantum numbers of y and T(z) is the ordered quadruple of quantum numbers of z, then S(y) ≠ T(z).

Quantisation of interaction: ∀x, if x is a quantum system, x emits or absorbs energy E only in discrete portions E = hν.

For each of these I will now explain why we are prone to say that they are laws. Charge conservation is one of the conservation laws, which all have the same form, i.e., they express the conservation of a quantity in a closed system. The interesting thing about these laws is that we have no independent criterion for what to count as a closed system. In other words, if an experiment indicates that, e.g., charge or energy is not conserved in a system, one has two options: either to reject the assumption that the observed system is closed, or to accept that conservation of the quantity is violated. A well-known example of this is the first experiments (around 1932) in which weak interactions (as they later were called) were studied. Neutrinos are produced in such interactions and they carry energy. But neutrinos were not then known or observed; they interact only very rarely. So the experiments seemed to violate energy conservation. This was also suggested by Bohr, but Pauli disagreed and instead held that the system was not closed; he suggested that a so far unknown particle had been produced and carried away the missing energy. A theory was developed and some 20 years later new experiments confirmed the existence of neutrinos.

The conservation laws jointly define what we mean by a closed system, provided that the quantities involved are independently defined. But they do not satisfy the definition of a fundamental law given above, because closed system is not a quantitative concept. However, conservation laws are in an important respect similar to fundamental laws as defined above, in that they are generalisations of observations and implicitly and partly define the theoretical concept closed system; each conservation law contributes to the determination of the identity criteria for things satisfying the predicate "closed system".

Conservation laws are viewed as so certain that it is inconceivable that any physicist would have doubts. The reason is that, using Noether's theorem, they can all be derived from symmetry requirements. Noether's theorem says roughly that if a system is symmetric under a continuous parameter transformation, the quantity conjugate to that parameter is conserved. A little more precisely: to every differentiable symmetry generated by local actions there corresponds a conserved current.
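A worked special case (the standard textbook derivation, included here only to illustrate the content of the theorem): consider, for simplicity, a system with one degree of freedom whose Lagrangian L(q, q̇) has no explicit time dependence, ∂L/∂t = 0. Then

dL/dt = (∂L/∂q) q̇ + (∂L/∂q̇) q̈,

and using the equation of motion ∂L/∂q = d/dt(∂L/∂q̇) this becomes dL/dt = d/dt(q̇ ∂L/∂q̇). Hence

d/dt ( q̇ ∂L/∂q̇ − L ) = 0,

i.e., the energy function is conserved: invariance under time translation yields energy conservation.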


In the derivation of Noether's theorem one uses the Lagrangian, so the theorem does not apply to systems that cannot be modelled by a Lagrange function. This corresponds to systems not being closed, i.e., dissipative systems; hence conservation applies only to closed systems. Thus, time translation symmetry entails energy conservation, spatial translation symmetry entails momentum conservation, rotation symmetry entails angular momentum conservation and gauge invariance entails charge conservation. (Gauge invariance is invariance under phase transformations of the electromagnetic vector potential.)

The symmetry requirements are thus the basis for the conservation laws. These symmetry requirements may in turn be understood as being part of objectivity requirements on descriptions of physical systems. For example, the requirement that the Lagrangian for a system be invariant under the transformation t → t + Δt is an objectivity requirement: the objective features of the system described by the Lagrangian do not depend on the choice of when to start the clock, i.e., when to put t = 0, which means that transforming a description given by one observer into that of another who has started his clock earlier (or later) should leave the description invariant. So the requirement of invariance under time translations is an objectivity demand. Similar considerations apply to spatial translations, rotations and phase transformations in electromagnetism. Symmetry under certain parameter transformations is a necessary condition for objective descriptions of the physical world. The relativity principle is, as we saw, another such condition for objective description.

Pauli's exclusion principle and quantisation of interaction are two of the basic principles of quantum mechanics. They both describe properties of quantum systems. What, then, are the identity criteria for a quantum system? Just as with closed systems, we have no independent criteria for the identity of quantum systems, so these two principles contribute to establishing identity criteria for quantum systems and for their representation in the formalism. For example, if an observation report purports to indicate a violation of Pauli's exclusion principle, the scientific community again has two options: either to give up Pauli's exclusion principle, or to dismiss the assumption that the system was closed. All scientists would say that the condition of being closed was not fulfilled. In other words, if two or more fermions are found to have the same quantum numbers, they must belong to different quantum systems, which means that one may not construct a tensor product of their wave functions. For if we were to construct such a tensor product, we would treat the two systems as now being one system which interacts with the environment as a unit, and such a unit cannot contain two fermions with the same set of quantum numbers. (This will be further discussed in connection with the measurement problem.) We may remember that in the quantum world one cannot identify quantum systems by spatiotemporal criteria, because of their wavelike behaviour during propagation. Individuation and identity among quantum systems are given by our theory, by the way we manipulate wave functions.
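In the standard formalism (this is textbook quantum mechanics, added here only to connect the principle to the notation used below) the exclusion principle is encoded by antisymmetrisation: two fermions of one system in single-particle states |ψ⟩ and |φ⟩ are jointly described by

(|ψ⟩|φ⟩ − |φ⟩|ψ⟩)/√2,

which vanishes identically if |ψ⟩ = |φ⟩. Hence no two fermions belonging to the same quantum system can share all their quantum numbers, while fermions of distinct, non-interacting systems are not so constrained.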


Two quantum systems prepared in initial states |ψ⟩ and |φ⟩ respectively, which do not belong to the same ray,17 are thus distinguished as different systems, and Pauli's exclusion principle applies within each system separately. So a fermion being part of the system |ψ⟩ can have exactly the same set of quantum numbers as a fermion in |φ⟩. But if the two systems interact and the total state is the tensor product |ψ⟩|φ⟩, no two fermions can be in the same state in this joint system. So Pauli's exclusion principle contributes to determining criteria for identity and individuation of quantum systems. This means that the new combined system must be treated as a unit when it interacts with other systems by exchanging energy, momentum or other conserved quantities.

That a closed system is attributed definite quantum numbers is a consequence of the fundamental quantum principle, discovered by Planck, that exchange of energy occurs only in discrete portions; in other words, interactions between quantum systems are quantised. Thus Pauli's exclusion principle and quantisation of interaction are in a general sense fundamental laws, albeit they do not fit my definition of a fundamental quantitative law, because they do not express relations between quantities. However, there are profound similarities. Fundamental quantitative laws have two features: they are generalisations of observations and they implicitly define a theoretical quantity. The first feature is also present in the conservation laws, Pauli's exclusion principle and quantisation of interaction, which are all supported by observations (though in a more indirect way). The second feature is not exactly the same, but it has a close analogue; a definition of a quantity must contain information about how to determine values of that quantity, whereas definitions of the concepts closed system and quantum system require information about identity and individuation among systems satisfying these descriptions, since we quantify over them.18 One might ask why we do not require the same of quantities. The answer is that we need no quantitative properties in the ontology. As already argued, there are no good reasons to say that quantitative predicates refer to properties; it suffices that they have extensions.

We may now generalise and make more precise the informal characterisation of a fundamental law given earlier, by generalising from quantities to theoretical predicates in general:

Definition of Fundamental law: A physical law is a fundamental law if and only if (i) it belongs to the set of implicit definitions of theoretical predicates used in a physical theory, (ii) it is supported by observations, and (iii) it is part of a theory which enables us to make testable predictions.

17 Rays are sets of wave functions; two wave functions |ψ⟩ and |φ⟩ belong to the same ray if |ψ⟩ = c|φ⟩ for some complex number c, see Weinberg (1995, 49).
18 This is an application of Quine's "no entity without identity". Using the expressions "for all x" and "there is an x" in a meaningful way requires an identity criterion for entities in the domain.


Another use of the expression "fundamental law" is to be found among adherents of the syntactic view of theories, such as Carnap and Gardner (1995) and Hempel (1970). In this tradition the intended meaning of "fundamental law" is "logically fundamental". It is well known that one and the same theory can be given different formulations, with different laws being the fundamental ones in this logical sense; the best example is perhaps classical mechanics, which can be given a Newtonian, Lagrangian, Hamiltonian or d'Alembertian formulation, each with different axioms. So "fundamental" in this logical sense must be relativised to theory formulation. By contrast, my conception of fundamental law is not relative to theory formulation. Those true universally generalised conditionals which satisfy the conditions in the definition given above are fundamental in an epistemic and semantic sense.

10.10 Lawhood and Necessity

Consider the well-rehearsed contrast between

#1 All spheres of gold are less than 1 km in diameter.
#2 All spheres of U-235 are less than 1 km in diameter.

We believe that #1 and #2 are both true. (If someone were to discover a counter-instance to #1, a huge heap of gold somewhere in the universe, one could simply take a bigger diameter.) Knowing that U-235 is a radioactive isotope for which the critical mass is 52 kg (a sphere with a diameter of 17 cm), we are prone to say that #2 is a law, whereas #1 is not. It is also natural to say that #1 is contingently true, whereas #2 must be true, i.e., it is necessary. In fact, we are prone to say of all laws that they are necessary. Why?

The specific kind of necessity attributed to laws is often called "physical necessity" (or "nomological necessity"). Should we now say that a certain sentence p is (or expresses) a law because it is necessary, or should we say that since p is a law it is necessary? Those positions in the debate about laws that postulate universals or relations between universals as the metaphysical basis for lawhood would naturally say that p is a law because it is necessary. I am not tempted to go in that direction. The previous discussion is, I think, a plausible explanation of why at least some important laws are classified as such without talking about necessity. So I prefer to explain physical necessity in terms of lawhood. We have three cases to consider: fundamental laws, derived laws and explicit definitions of new theoretical predicates.

1. Why do we say that fundamental laws are necessary? What is the intended meaning of "necessary" in this context? If we use a quantitative predicate such as ELECTRIC FIELD in a theory, we need a definition of that predicate and, as shown above, Maxwell's equations function jointly as implicit definitions of this and the other electromagnetic quantities.


Thus, these equations are necessary conditions for the coherent use of ELECTRIC FIELD in our calculations. Often we abbreviate: instead of saying that Maxwell's equations are necessary conditions for theoretical descriptions of electromagnetic phenomena, we simply say that they are necessary. Since the sentence "p is a necessary condition for q" has the form of a material conditional, we do not intend any modal distinction at the level of the object language when we express this conditional with the short version "p is necessary". Then, since "necessary" here is not intended as marking a modal distinction, the logical form of "p is necessary" is not that of □p, but rather that of '"p" is necessary'; i.e., the statement that a law L is necessary may be understood as the statement that it is a necessarily true part of the theory. Since the law sentence said to be necessary, i.e., necessarily true, is talked about, not used, we must put the law sentence in quotation marks. Thus we do not enter quantified modal logic at all.

Saying that laws are necessary in the sense given above does not entail that they are absolutely certain, or that violations are inconceivable. Electromagnetism might one day be replaced by a better theory, but such a replacement means changing the electromagnetic laws, hence changing the extensions of the quantitative predicates CHARGE, CURRENT, MAGNETIC FIELD etc., if these words would still be used in the new theory.

2. If a sentence q is derivable from another sentence p, it follows that we have established the material conditional p → q. Hence, q is a necessary condition for p, which we in ordinary parlance may express by saying "q is necessary", thus as before suppressing p as unnecessary to mention in the context at hand (and, as is usual in ordinary parlance, disregarding the use-mention distinction). Not mentioning a condition is common in natural language in cases where the speaker and listener assume mutual awareness of it. For example, if I say to my visitor "Now you must hurry", we both understand the use of "must" as indicating a tacit condition, such as "if you want to catch the train", which we both want to be fulfilled. This is also similar to our use of "must" in mathematics and logic; we may say, for example, "if 3x + 32 = 83, then x must be 17". The truth of the sentence "x = 17" is a necessary condition for the truth of "3x + 32 = 83". Hence, since a derived law is a necessary condition for the truth of the fundamental laws from which it is derived, we say of derived laws that they are necessary.

3. Explicit definitions are usually not called "necessary", but it is entirely correct to say that a definition of a technical term is a necessary condition for the meaningful use of that term in discourse. Hence we may reasonably say that having a definition of a quantity (or any other theoretical concept) is a necessary condition for the use of it in a theory. For example, we may say that Newton's second law is a necessary condition for the use of the quantity FORCE in calculations and predictions in mechanics. Accepting classical mechanics means accepting f = ma as giving the extension of FORCE. Again, the sense of "necessary" here intended is simply "necessary condition", i.e., the consequent in a material conditional with a tacit antecedent.


And as before, the word "necessary" is here a semantic predicate, not a sentence operator.

A true accidental generalisation, such as #1, is not necessary in this sense; in #1 we use predicates which are defined independently of that sentence, and neither is it a consequence of such definitions. So my explanation of our saying that laws are necessary suffices for distinguishing between #1 and #2. In short, all the laws that constitute a particular theory, fundamental laws, explicit definitions and all their logical consequences, are necessary conditions for our acceptance and use of the concepts in that theory.19 No modal distinctions in the object language are assumed by using the word "necessary" in this sense. Thus we have the means to discern some true UGCs as laws, and it is their status as laws that motivates our calling them necessary. Not all the logical consequences of a set of laws are UGCs; we may also derive singular conditional statements from a set of laws, and it is entirely reasonable to say that such statements too are physically necessary. So we arrive at the following definition:

Physical Necessity: p is physically necessary if p is a law, or a logical consequence of a set of laws.

Both the expressions "is a law" and "is physically necessary" are thus used as predicates taking sentences as arguments, not as sentence operators. This is not common; usually "necessary" is taken as a sentence operator. But if we do that and apply it to quantified sentences, we enter quantified modal logic. In this realm we arrive, via the Converse Barcan Formula and Distribution of Necessity, at what Quine (1976c) called 'Aristotelian essentialism', i.e., a distinction between essential and contingent properties. Being an empiricist, I find this too much metaphysics for my taste. Any such metaphysical commitments are avoided if we conceive of "necessary" as a semantic predicate, a modifier of "true".

Van Fraassen (1977) argued that physical necessity, which he conceived as a sentence operator, is a species of verbal necessity. This is a possible stance so long as one does not apply "necessity" to sentences containing quantifiers, and van Fraassen didn't discuss that. But this is a bit astonishing, since law sentences are UGCs. It seems to me that he succeeds in arriving at his conclusion only by avoiding quantified modal logic. Henry Kyburg (1990) similarly argues that necessity should be construed as a semantic predicate, but I disagree with him about the status of laws and quantities, as already mentioned in Sect. 10.4.

19 This was arguably a core idea in Kuhn's talk about paradigms in Kuhn (1970). But he would have gained much clarity had he talked about extensions of predicates instead of paradigms, metaphysical assumptions etc. But then the book might have been less famous.


10.11 Summary

In this chapter I have not been able to cover all physical laws, but I do think that I have given good reasons for the thesis that most physical laws fit into one of the three types of laws here described: (i) fundamental laws, which are generalisations of observations and at the same time implicit definitions of either quantitative predicates used in reporting generalised observations, or identity criteria for systems being quantified over; (ii) laws that are explicit definitions of quantities; and (iii) laws that are derivable from other laws.

The expressions "… is a law" and "… is physically necessary" are best viewed as predicates in the metalanguage, not operators in the object language. Saying that laws are necessary may be interpreted as talk about law sentences; we thereby distinguish a subclass of true sentences. One may consistently say of laws that they are necessary, i.e., necessarily true, in this sense, since they establish rules for the use of general terms in the theory, without granting the existence of any metaphysical categories such as essences, dispositions or relations between universals. Theoretical predicates in physics ultimately get their meaning, i.e. their rules of application, from observations, and theory construction in physics must ultimately be built upon directly observable things, i.e. bodies, attributed measurable quantities.

This account of laws is Humean in spirit. But in contrast to Hume, who held that necessity just is a projection of our expectations, I hold that attributing physical necessity and lawhood to some sentences in scientific theories is motivated by conceptual and epistemological arguments.

Chapter 11

Electromagnetism: Fields or Particles?

Abstract In this chapter it is shown that classical electromagnetism can be viewed either as a theory about charged particles acting at a distance on each other, or as a pure field theory, but that a double ontology, in which particles interact with electromagnetic fields, is untenable. One can, however, switch between a field and a particle ontology. This freedom is not available in relativistic quantum electrodynamics: there, as was shown by Malament, only a field ontology is possible; a particle ontology does not fit relativistic quantum electrodynamics.

11.1 Introduction: What Is Real: Fields, Particles or Both?

Some physicists hold that electromagnetic fields are not real, but merely calculational devices; the electromagnetic field at a certain point is nothing else than an expression for the effect distant charged particles would have on a charged particle at that point. For example, Wheeler and Feynman (1949, 426) proposed such an interpretation of electromagnetism quite some time ago:

This description of nature differs from that given by the usual field theory in three respects: (1) There is no such concept as “the” field, an independent entity with degrees of freedom of its own. (2) There is no action of an elementary charge upon itself and consequently no problem of an infinity in the energy of the electromagnetic field. (3) The symmetry between past and future in the prescription for the fields is not a mere logical possibility, as in the usual theory, but a postulational requirement.

Others, in particular quantum field theorists such as Weinberg, take the opposite view, holding that only fields exist:

The inhabitants of the universe were conceived to be a set of fields - an electron field, a proton field, an electromagnetic field - and particles were reduced to mere epiphenomena. In its essentials, this point of view has survived to the present day, and forms the central dogma of quantum field theory: the essential reality is a set of fields subject to the rules of special relativity and quantum mechanics; all else is derived as a consequence of the quantum dynamics of these fields. (Weinberg 1977, 23)

A philosopher who has elaborated this field view in a Kantian vein is Auyang (1995). There is also a third option concerning the ontology of EM, viz., to hold


that both charged bodies and electromagnetic fields exist. This appears to be a common view among both physicists and philosophers and, moreover, it is the view found in many textbooks on electromagnetism. Several philosophers have joined the debate, see for example Lange (2002), Frisch (2005, 2008), Belot (2007), Muller (2007), Vickers (2008) and Pietsch (2010). In this chapter I will argue: (i) A double ontology comprising both particles and fields is problematic. Either we should think of electromagnetism as a theory about charged particles directly interacting with each other, or as a theory of fields whose local interactions are manifested as field quanta, called ‘particles’. (ii) From a purely theoretical point of view the choice doesn’t matter much as far as classical electromagnetism is concerned; it is possible to formulate it in first order predicate logic either as a theory about particles or as a theory about fields, and there is, as shown by Quine (1981b, 17–19), a general method for translating a theory about one kind of objects into a theory assuming another kind of objects, provided these theories are empirically equivalent. (iii) From an empiricist point of view one must accept as existing those objects that are directly observable. Empirical predictions of electromagnetism are predictions of the motion of charged bodies, and since bodies are represented in the theory as particles, this should be the empiricists’ choice of ontology. (iv) In quantum electrodynamics one is forced to choose a field ontology, since a particle ontology is impossible, as proved by Malament and others. So-called ‘quantum particles’ are field quanta, not particles with identity criteria, and field quanta cannot be treated as individuals.1 In Sect. 11.2 I will first discuss how we may identify the ontological commitments of a theory and present Quine’s method for changing ontology between two empirically equivalent theory formulations, and in Sect. 11.3 I will spell out how this may be done in classical electromagnetism. In Sect. 11.4 I will rehearse a recent debate about the consistency of classical electromagnetism. The outcome of that debate was that the source of the inconsistency is contradictory assumptions about self-fields. Since self-fields are necessary in a consistent theory but conceptually awkward, I will in Sect. 11.5 discuss the relation between particles and fields and give my arguments against a double ontology. In Sect. 11.6 I argue for the need to accept bodies in our ontology, since it is these things we directly observe when testing our theories. In Sect. 11.7 I will discuss the problem with a particle ontology in quantum electrodynamics, concluding that fields are the entities we may accept as real and that a particle ontology of relativistic quantum theory is impossible. There is thus a tension between classical and quantum electrodynamics, a tension which is nothing else than the well-known measurement problem of quantum mechanics.

1 This does not contradict Wheeler & Feynman’s stance, since their paper explicitly concerns classical electromagnetism.


11.2 Ontological Commitment

The theoretical skeleton of physics consists of a number of equations relating physical quantities to each other and rules for measuring these quantities. These equations and rules do not contain much ontological commitment, if any at all. But when we describe the content of these equations in complete sentences we must commit ourselves to some ontology. (Example: ‘The electromagnetic field at point x determines the motion of a charged particle at that point.’ The speaker of this sentence is committed both to the existence of an electromagnetic field and of a particle.) We cannot avoid making ontological assumptions when we express an abstract theory in complete sentences. The ontological question related to electromagnetism may, therefore, be stated as: Which things are we committed to accept as existing when we accept electromagnetism as (approximately) true? Are there really any fields? Are particles real? Do both fields and particles exist? It is desirable to have a general methodology for answering these questions, and, luckily, one such is available. It was proposed by Quine quite some time ago in Quine (1976a), a paper he read in 1939 at the fifth International Conference for the Unity of Science in Cambridge, Mass. The idea is now well known, famously expressed by Quine as: ‘To be is to be the value of a variable.’ In other words, we accept as existing those things which are needed as values of variables in a theory we believe to be true, when this theory is expressed in first order predicate logic. I fully endorse this principle and also Quine’s ensuing criterion for acceptance of a purported kind of entity, viz., that we need an identity criterion for acceptable objects in our ontology, famously phrased by Quine as ‘No entity without identity’. The argument for this principle is that in order to legitimately postulate a kind of entity we need a criterion which tells us when two distinct singular terms refer to the same thing. If this condition is not fulfilled, there is no clear sense in talking about a particular thing a, i.e., using expressions containing a term referring to a. Surely, we do not want to say that the only way of talking about the particular object a is to use the name ‘a’ and that no other singular term can be used to refer to this object. For if that were the case, a critic might reasonably say that we have no reason to distinguish between the linguistic item ‘a’ and its purported reference, the object a, in cases where the purported referent is a theoretical, postulated entity.2 But one cannot directly read off the ontology of electromagnetism from an ordinary textbook, because there is no unique way of expressing it in first order predicate logic. It is possible to quantify over fields, over particles, or over both particles and fields. Which paraphrase should we choose? This choice reflects our ontological commitments. Before we continue, a comment about the word ‘particle’ is in order. In physical theories the word ‘particle’ is often used, but we should not interpret it to mean a permanently existing object without extension. When occurring in an expression

2 The medieval notion of haecceity, ‘thisness’, hence conflicts with Quine’s demand on identity.


such as ‘A particle with mass m and charge q …’ it cannot literally mean a point object, for if that were the case we would postulate an object with infinite mass and charge density, and that conflicts with physical theory. The reference of ‘particle’ is simply an object whose spatial extension and inner structure we, in a particular context, disregard. We treat it as a unit in interactions with other things, disregarding its inner dynamics, if it has any. Hence, in classical mechanics and classical electromagnetism, the word ‘particle’ may be interpreted as referring to a body, a spatially extended object which can be identified and later re-identified as the same body. (These things, moreover, are the ultimate things we observe when we submit our theories to empirical testing.) By contrast, in quantum mechanics and quantum field theory, the word ‘particle’ means a field quantum; photons are quanta of the electromagnetic field, electrons are quanta of the electron field, etc., as Weinberg put it in the quotation above. (In passing it may be observed that in this field picture the question of the identity of electrons, photons, etc., never occurs; electrons are simply definite portions of charge and photons portions of radiation energy.) It may be noted that Weinberg’s dictum that particles are mere constructions, or as he calls them, ‘epiphenomena’, out of local fields, may be correct, but being a construction, or an epiphenomenon, does not entail being unreal in the sense of being impossible to identify as the value of a variable used when formulating electromagnetism in first order predicate logic. Consider the parallel of money: money is a social construction, as Searle (1995) has shown, but, surely, we distinguish between real money and counterfeit. I get money, real money(!), as salary every month from my university. No one doubts money is a kind of construction; still, money is real in any reasonable sense. Another example of real but constructed objects may be provided by numbers. Platonists and constructivists disagree about the metaphysical status of numbers, but they agree that numbers can be identified and talked about. Platonists believe numbers to exist independently of humans; constructivists view them as constructions. Brouwer, for example, held that numbers are constructed out of our intuition of time. Constructivists may be said to embrace an anti-realist metaphysics, but numbers viewed as constructed objects clearly satisfy Quine’s criterion of being possible values of variables, and Platonists agree on that. So if we adopt Brouwer’s constructivism, we may say that numbers are real in the sense in which I here use that word, albeit constructions. So holding that particles are constructions out of fields doesn’t entail that they are unreal and unsuitable as values of our variables; even a constructed object may have a clear identity criterion and be acceptable in our universe of discourse. But both fermions and bosons lack identity criteria, as will be thoroughly discussed in Chap. 14.


11.2.1 Alternating the Ontology of a Theory

We may adopt either an ontology of fields or an ontology of electrically charged bodies as the entities talked about in classical electromagnetism; both are possible, and accepting electromagnetism as an approximately true theory does not force us to make a choice. (But a particle interpretation of relativistic quantum theory is impossible, as will be discussed in Sect. 11.6.) The general argument for such a possibility was given by Quine (1981b). He showed that a theory about one class of objects can be translated into another, empirically equivalent theory formulation about another kind of objects using what he calls ‘proxy functions’. The idea is this: assume that in theory T1 a set of objects {a_i} is assumed to exist, these objects being the values of the variables in T1. Now assume someone has invented another, empirically equivalent theory T2 in which another kind of objects {b_j} are the values of the variables. Suppose the sentence P(x), being part of T1, is true of each member of a subset {a_k} of {a_i}, and the sentence Q(y) is true of each member of a subset {b_l} of {b_j}. Let us now consider a mapping f which to each element in {a_i} associates an element in {b_j}, such that {a_k} is mapped onto {b_l}. It is always possible to construct this mapping so as to be surjective, i.e. so that every element in {b_l} is the image of some element in {a_k}. The extension of the predicate Q(y) is the image of that of P(x). Thus if P(a_n), a_n ∈ {a_k}, is true in T1, then Q(f(a_n)) is true in T2. The procedure can be repeated for the extension of every predicate in T1, and thus we have every reason to say that T1 and T2 are merely two formulations of the same theory, although T1 is about one class of objects and T2 about another class of objects. This procedure is always possible so long as T1 and T2 are empirically equivalent; thus we have good reason to accept Quine’s conclusion that ‘Structure is what matters to theory and not its choice of objects.’ (Quine 1981b, 20).3 In this quote from Quine the word ‘structure’ is not a singular term with reference but a predicate used in talk about theories. Endorsing Quine’s dictum may at first sight seem to conflict with accepting bodies in the ontology, but that is not so. We basically observe bodies, as pointed out in Sect. 11.1, so these are necessary components in the ontology. The question is what more is needed in theory construction, and this may be taken as the point of Quine’s argument; one may observe that a condition for changing theory formulation by using proxy functions is empirical equivalence. The structure that matters to electromagnetism consists of the fundamental laws, i.e., Maxwell’s equations plus the Lorentz force law. These state relations between electromagnetic quantities. Furthermore, via Lorentz’ law and Newton’s second law, the electromagnetic quantities are connected to the directly measurable quantities MASS and ACCELERATION of observable bodies, and this provides electromagnetic theory

3 Because I endorse this view of Quine’s, the reader might think that I accept structural realism. I do not, since that would mean that I would accept different structures as entities that could be values of variables. But this conflicts with nominalism. See also my discussion of structuralism in Sect. 7.2.2.


with a foundation in observations. Quine’s point is that we have a choice when constructing singular terms, referring to objects, and general terms, out of these quantitative predicates. I will illustrate the point by discussing Maxwell’s first equation.
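Before turning to that illustration, the proxy-function construction described above may be summarised schematically (the notation ext(·) for the extension of a predicate is my shorthand):

\[
f:\{a_i\}\twoheadrightarrow\{b_j\},
\qquad
\mathrm{ext}(Q) \;=\; f\bigl[\mathrm{ext}(P)\bigr] \;=\; \{\,f(x) : P(x)\,\},
\]
\[
\text{hence}\qquad P(a_n)\ \text{true in } T1 \;\Longrightarrow\; Q(f(a_n))\ \text{true in } T2 .
\]

Nothing in this construction depends on what the objects a_i and b_j are; all that is preserved, and all that needs to be preserved, is the pattern of extensions, which is Quine’s point about structure.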

11.3 Semantics of Classical Electromagnetism

Accepting a law as true does not entail that we must accept that the general terms used in the law refer to universals; it suffices that these general terms have extension. So if we assume that charged particles exist, we may consistently hold electromagnetism to be true while denying that electromagnetic fields, thought of as properties of charges, exist. But we must clearly say what kind of objects the general terms utilised in electromagnetism are true of. Let us, as an example, consider Maxwell’s first equation and express it in first order predicate logic. I begin with its integral form:

\[
\oint_S \mathbf{E}\cdot d\mathbf{S} \;=\; \int_V \rho\, dV \qquad (11.1)
\]

This equation says that the total flux of the electric field E through a sphere S enclosing a space volume V equals the volume integral of the charge density ρ, which is the total charge q, in that volume. Assuming that charges are attributes of bodies, we may now express Maxwell’s first law as a statement about charged bodies:

Maxwell’s first equation: For all charged bodies x, the charge q of x satisfies the equation q = ∮_S E · dS, where S is a closed surface surrounding x and no other charged body is inside S.

Here I have tried to express explicitly the tacit assumptions made when using Maxwell’s equation for calculating fields and/or charges. The crucial thing is that ‘q’ is a parameter, not a variable bound by a quantifier. Hence we do not assume that it refers to anything that we are bound to accept as existing. The label ‘q’ is in concrete cases replaced by a number expressing the quantity of charge attributed to the body, and quantities, i.e., quantitative attributes, are not entities. (Aristotle held the same view, by the way.) Similarly for the electric field: it may be viewed as a quantitative attribute of bodies, not a thing that we need to accept as existing. Maxwell’s first equation is a fundamental law of electromagnetism. Holding this version of it true entails that we accept that the following two conditions are satisfied:

1. There exist charged bodies which are the referents of x.
2. These bodies satisfy the predicate ‘… has a charge q that satisfies the equation q = ∮_S E · dS, where S is a closed surface surrounding a body (or bodies) having a total charge q.’
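Written out in first order notation, the paraphrase looks roughly as follows (the predicate letter B for ‘charged body’ and the surface parameter S(x) are my shorthand, not part of the theory):

\[
\forall x\,\Bigl(B(x)\;\rightarrow\;q(x)=\oint_{S(x)}\mathbf{E}\cdot d\mathbf{S}\Bigr),
\]

where S(x) is any closed surface enclosing x and no other charged body. Only the variable x is bound by a quantifier; ‘q’ and ‘E’ occur as quantitative predicates or parameters, just as stated above.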


It is thus not assumed that the expressions ‘q’, ‘E’ or ‘∮_S E · dS’ refer to any properties; what is needed is only that these predicates have extension, i.e., are true of existing things. Let us now turn to the differential form of Maxwell’s first equation:

\[
\nabla\cdot\mathbf{E} \;=\; \frac{\rho}{\varepsilon_0} \qquad (11.2)
\]

This is naturally interpreted as stating an equivalence between the divergence of the electric field and the charge density, i.e., as requiring an ontology of fields instead of particles and attributing charge densities to fields:

Alternative formulation of Maxwell’s first equation: For all electric fields E, the divergence of E at a point x is proportional to the charge density at that point.

Here we have switched ontology by quantifying over fields instead of particles. A field is identified by its field value at each point in spacetime. But we know that the differential and the integral form of Maxwell’s equation are two formulations of the same law; so we have a clear case of swapping ontology without changing either the structure or the empirical content of the theory. Without going through the same procedure with the other laws of electromagnetism, I presume that this kind of reinterpretation between electromagnetism as a theory of charged particles and as a theory of fields is possible. Moreover, the very fact that one may disagree about the ontology without disagreeing about electromagnetism’s empirical correctness illustrates Quine’s point. So, given electromagnetism, we may either interpret it as a theory about particles, or about fields. But why not say that both particles and fields exist? This seems to be the common view among physicists. However, I see a conceptual problem in doing so. The problem is most clearly seen by considering the status of so-called self-fields. This brings us to a recent debate concerning a purported inconsistency of classical electromagnetism.

11.4 Inconsistency of Classical Electromagnetism?

Mathias Frisch argues (Frisch 2005, 32–34) that classical electromagnetism is inconsistent. He states four premises, all held to be true in electromagnetism, that entail a contradiction:

1. There are discrete finitely charged particles.
2. Charged particles function as sources of electromagnetic fields in accord with the Maxwell equations.
3. Charged particles obey Newton’s second law (and thus in the absence of non-electromagnetic forces, their motion is governed by the Lorentz force law).
4. Energy is conserved in particle-field interactions, where the energy of the electromagnetic field and the energy flow are defined in the standard way.


Belot (2007) and Muller (2007) have both discussed this argument and arrived at roughly similar verdicts: the formal derivation of the contradiction is correct, but the inconsistency comes from an inconsistent application of E, i.e., the electromagnetic field, in the equations. Their argument is, in short, that Frisch in one expression for the energy assumes that the force on a charged particle is due to the total electric field:

\[
\mathbf{F} = q\,(\mathbf{E}_{\mathrm{tot}} + \mathbf{v}\times\mathbf{B}) \qquad (11.3)
\]

where E_tot = E_ext + E_self, i.e. the total field acting on the charge is the sum of the field from other charges and the self-field emanating from the charge itself, whereas in another calculation of the energy he in fact uses only the external field in calculating the force, and hence the energy. No wonder an inconsistency arises. One might think that there is something fishy about the idea of a charged particle acting on itself via its self-field, hence that the Lorentz force law should explicitly and consistently be expressed as saying that the force on a particle is produced only by external fields. Feynman et al. (1964, sec. 28.5) discuss this solution, but immediately reject it:

However, we have then thrown away the baby with the bath! Because the second term in … Eq. (28.9) [i.e. the force on a particle due to its self-field], the term in $\dddot{x}$, is needed. That force does something very definite. If you throw it away, you’re in trouble again. When we accelerate a charge, we must require more force than is required to accelerate a neutral object of the same mass; otherwise energy wouldn’t be conserved. The rate at which we do work on an accelerating charge must be equal to the rate of loss of energy per second by radiation. . . . We still have to answer the question: Where does the extra force, against which we must do this work, come from? . . . For a single accelerating electron radiating into otherwise empty space, there would seem to be only one place the force could come from - the action of one part of the electron on another part.

So consistency demands of us that we hold that the self-field contributes to the force on a charged particle. The somewhat astonishing fact is that even in the absence of external fields, it requires more work to accelerate a charged particle than an uncharged particle of similar mass! So the self-field must be taken into account in an exact calculation. (See also Bauer and Dürr (2001, Theorem 1 and Lemma 5) or Komech and Spohn (2000, Proposition 2.3) for a proof of the need to take self-fields into account.) So a consistent application of Maxwell’s equations and Lorentz’ law requires that self-fields be included in E. One may observe Feynman’s last phrase ‘the action of one part of the electron on another part’. Thus he does not conceive of the self-field as something distinct from the charged particle; it is ‘another part’ of it. One may assume that Feynman adheres to an ontology purely of particles, thinking of fields only as calculational devices, as he did in his joint paper with Wheeler, quoted above. This view is plausible when we think of classical electromagnetism, where the word ‘particle’ refers to a body, an extended object. However, Feynman’s application to electrons (in the quote above) is troublesome, as we shall see when discussing relativistic quantum field theory; electrons cannot be conceived as individual objects, they are merely field quanta.
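The energy balance Feynman appeals to can be illustrated with the standard Larmor formula (a textbook result quoted only for illustration, not derived in this book; SI units, a point charge q with acceleration a undergoing periodic motion):

\[
P_{\mathrm{rad}} \;=\; \frac{q^{2}a^{2}}{6\pi\varepsilon_{0}c^{3}},
\qquad
\oint \mathbf{F}_{\mathrm{self}}\cdot\mathbf{v}\,dt \;=\; -\int_{\mathrm{period}} P_{\mathrm{rad}}\,dt .
\]

The extra work needed to accelerate the charged body, compared with an uncharged body of the same mass, is thus exactly the energy carried away by radiation over a cycle.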


11.5 Why Not a Double Ontology?

Belot, in contrast to both Feynman and Weinberg, adopts a double ontology: ‘The Maxwell-Lorentz equations (under the present understanding) describe a genuine interaction between the electromagnetic field and a charged particle that already treats the self-field of the particle.’ (op.cit. p. 268). This leads to trouble. Belot’s position seems to be the common one: the real world is populated both by charged particles and electromagnetic fields (including self-fields), and electromagnetism is a theory describing how these entities interact. But I beg to disagree! If talk of interaction, exchange of energy, is to have any meaning, one must be able to identify the interacting objects independently of each other. This is impossible when it comes to the self-field of a charged particle; the only way to identify the self-field is by determining it by its source, the charged particle. This is the reason why Feynman (in the quotation above) abstains from concluding that the self-field is something else than the charged particle. The concept EXCHANGE OF ENERGY BETWEEN PARTICLE AND FIELD requires for its meaningful application that we can give independent identifications of both relata; but since that is impossible in the case of self-fields, the concept has no application. We should say, instead, that either there are fields and charge densities are attributes of the fields, or that there are charged bodies and fields are attributes of these bodies. As was shown at the beginning of Sect. 11.3, in neither case are we forced to say that attributes exist. This conclusion should be rather straightforward already when looking at Maxwell’s first equation: knowledge about the field on the surface of a closed region determines the total charge inside that region, and vice versa. Neither the field, nor the charge, has any further relevant properties enabling us to treat them as distinct entities. This makes it hard, I would say impossible, to think of the relation between charge and field as a (causal) relation between different things. The natural interpretation is to say that the field on a closed surface and the charge inside are but two descriptions of the same state of affairs. It is often said that charges are the sources of electric fields. This should not be interpreted in causal terms, nor should it be viewed as stating the ontological priority of charges over fields. I take talk about sources of fields as indicating an epistemological point: knowledge about charges enables us to infer values of the electric field at different points. Taking a single-ontology view, either conceiving fields as attributes of charged particles, or charges as attributes of fields, it is immediately clear that we must include the so-called ‘self-field’ term in the expression for E when calculating the force using Lorentz’ law in order to have a consistent theory, just as we should include all particles in the particle description of the situation. My conclusion, so far, is that of the three possible ontologies for classical electromagnetism we should reject the particle-and-field ontology as deeply troublesome; either we should conceive electromagnetism as a theory about particles or about


fields. We may switch between a particle ontology and a field ontology, but we should not think of these two kinds of entities as interacting with each other. There is a profound analogy in this respect between Maxwell’s equations and Einstein’s field equations in general relativity theory, the fundamental law of GTR:

\[
R_{\mu\nu} \;-\; \frac{1}{2}\,g_{\mu\nu}R \;+\; \Lambda\, g_{\mu\nu} \;=\; \frac{8\pi G}{c^{4}}\, T_{\mu\nu} \qquad (11.4)
\]

These 16 equations4 (both μ and ν take the values 0, 1, 2, 3) may be interpreted as stating that two quantitative descriptions of the world, the stress-energy tensor T_μν and the spacetime description R_μν − ½ g_μν R + Λ g_μν, i.e. a function of the metric tensor g_μν, are proportional. By itself Eq. (11.4) does not say that the universe consists of two interacting entities, matter-energy and spacetime. There is no causal mechanism going from the matter-energy distribution to the spacetime geometry, or vice versa. When expressing the content of these equations in complete sentences with subject-predicate structure we either say something like ‘the matter/energy of the universe has a certain spacetime structure’ or ‘the spacetime structure of the universe has a certain matter-energy distribution.’ The point is that by itself, Eq. (11.4) does not determine what to treat as the object of predication and what to treat as an attribute. We may say that in modern physics the distinction between object and property/relation is merely a matter of linguistic convention. I’m repeating the point made in Sect. 11.3. However, there are more things than the fundamental equations to consider in the discussion about the ontology of electromagnetism. One is epistemology: what do we need as objects when expressing the observations supporting our theory? Another concerns relativistic constraints and the consequences of quantisation.

11.6 What Do We Observe?

Physics, like any empirical theory, must make contact with the external reality as observed by us humans. And what we observe, independently of any theory, are first and foremost medium-sized bodies. (In an experiment we observe detectors, and these are medium-sized bodies.) No matter what we think of the causes of bodies’ motions, electromagnetic fields or whatever, we easily agree on statements about positions and state changes of visible bodies. In classical mechanics and classical electromagnetism such bodies are represented as particles, so particles are unavoidable in our ontology. By contrast, fields are never directly observed; the presence of electric and magnetic fields is inferred from observations of bodies. So

4 Because of symmetry there are in fact only 10 independent equations.


we need bodies in our ontology anyway; this is a fact about ourselves as observers of the external world. This empiricist stance does not entail that unobservable things are non-existing. (This was the big mistake made by the Vienna Circle, in my view.) Being an empiricist doesn’t mean that one rejects the existence of unobservable things; it only means that all evidence for a theory ultimately consists of observations. There is, however, a metaphysical argument for adopting a field ontology instead of a particle ontology. For if we conceive of the physical world as populated by particles, bodies confined within well-defined portions of space interacting with each other, we face the age-old conundrum: how could two things at different places interact without anything in between transmitting the interaction? How is action-at-a-distance possible? The desire to get rid of this conundrum has been, I guess, a strong reason to adopt a field ontology instead of a particle ontology. Our hostility to the notion of action-at-a-distance comes from an illicit tacit assumption about space, viz., that it is a sort of ‘container’ for physical events. If we reject this picture and take general relativity into account, we must say that spatial distance is relative to the observer. The objective distance measure is the spacetime interval, and the spacetime interval between two events connected as the emission and absorption of a photon is zero. (And exchanging photons is how particles, i.e. bodies, interact in electromagnetism.) There is, from an observer-independent point of view, no distance at all between these two events; in fact they might better be described as two descriptions of the same event. (Cf. coin flipping: ‘heads up’ and ‘tails down’ describe the same outcome.) So I don’t think we should take action-at-a-distance in electromagnetism as a problem. But there is another obstacle to a particle interpretation of electromagnetism when we take quantisation into account and adopt a relativistic perspective.
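The claim about the vanishing interval can be written out explicitly (with the sign convention, mine to choose, in which timelike intervals are positive):

\[
\Delta s^{2} \;=\; c^{2}\,\Delta t^{2} - |\Delta\mathbf{x}|^{2} \;=\; 0
\quad\text{for light, since } |\Delta\mathbf{x}| = c\,\Delta t ,
\]

where Δt and Δx are the temporal and spatial separations, in any inertial frame, between the emission event and the absorption event.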

11.7 Relativistic Quantum Electrodynamics

The particle interpretation of quantum theory has come under heavy criticism from, among others, David Malament (1996), who argued that there can be no relativistic quantum theory of (localizable) particles, which entails that quantum electrodynamics cannot be interpreted in terms of particles. The paper started a debate, and Halvorson and Clifton (2002) have defended Malament against several objections. Malament’s argument is based on four conditions which seem entirely reasonable for any relativistic quantum theory describing anything that could be called ‘a particle’. The conditions are Translation Covariance, Non-negative Energy, Localizability and Locality. Translation covariance is the requirement that covariant translation in spacetime of all vectors should not change the predicted outcomes of experiments. This is a fundamental requirement in general relativity. This requirement and the energy condition are obvious constraints on any relativistic theory. The localizability


condition states what we mean by a particle, viz., an object that can be found in a well-defined, restricted portion of space. The locality condition is weaker than the traditional condition that no object can travel with infinite speed. It merely says that the projection operators P_Δ1 and P_Δ2 (for disjoint spatial sets Δ1 and Δ2) commute, i.e., that the probability of detecting a particle in Δ1 is statistically independent of whether a detection experiment is performed in Δ2, and vice versa. From these assumptions Malament proves:

Theorem If the structure (H, a ↦ U(a), Δ ↦ P_Δ) satisfies conditions (1)–(4), then P_Δ = 0 for all spatial sets Δ.

Malament comments:

We can think about it this way. Any candidate relativistic particle theory satisfying the four conditions must predict that, no matter what the state of the particle, the probability of finding it in any spatial set is 0. The conclusion is unacceptable. So the proposition has the force of a “no-go theorem” to the extent that one considers (1) through (4) reasonable constraints. (op.cit. p. 6)

Halvorson and Clifton point out that this doesn’t show that it is impossible to construct particles as supervenient on localized fields, but they formulate a theorem which, on very reasonable assumptions, excludes this possibility. So an interpretation of electromagnetism that takes as its ontological basis electrons and other charged quantum particles, conceived as being confined to definite volumes in space, is out of the question. The field interpretation is the only remaining option; a field is by its very nature not confined to limited portions of space, so it does not satisfy the localizability condition. This fact is related to another well-known feature of so-called ‘quantum particles’, viz., that in general they lack identity criteria.5 Since they lack identity criteria, we cannot quantify over them and treat them as objects interacting with other objects in quantum field theory. (And this is the fundamental reason, I think, why we got into the trouble with self-fields.) It has been argued, for example by Segal (1964) and Barrett (2001), that the empirical evidence supporting relativistic quantum field theory consists of observations of particles, i.e., objects being at a particular place at the time of observation:

It is an elementary fact, without which experimentation of the usual sort would not be possible, that particles are indeed localized in space at a given time. (Segal 1964, 145)

Halvorson and Clifton comment:

It seems to us, however, that the moral we should draw from the no-go theorems is that Segal’s account of observation is false. In particular, it is not (strictly speaking) true that we observe particles. Rather, there are ‘observation events’, and these observation events are consistent (to a good degree of accuracy) with the supposition that they are brought about by (localizable) particles. (Halvorson and Clifton 2002, 23)

5 Steven French (1989) discusses identity criteria in physics and entertains the possibility of attributing to quantum particles a primitive identity, a form of ‘thisness’. I don’t see any gain in accepting this proposal. It appears to me to be a case of obscurum per obscurius.


Basically I agree with Halvorson and Clifton, adding that these observation events are events occurring in macroscopic bodies, detectors; we don’t observe particles, we observe bodies. What we consider as tests of physical theories are observations of state changes of macroscopic bodies, and this does not conflict with the assumption that these state changes consist in exchange of energy and/or momentum between such macroscopic bodies and fields. What is required is that the field has a certain non-zero amplitude in the spacetime region where the detector is situated and observed. But quantum field theory has no account of the interaction between fields and macroscopic bodies; in that theory interactions occur between fields. In other words, we need a solution to the measurement problem, which will be given in Chap. 16. Thus, we should be careful to distinguish classical and quantum contexts when using the word ‘particle’. In the classical domain it means a body whose extension and inner structure we disregard, but to which we may attribute a definite trajectory, whereas in the quantum realm the word ‘particle’ and its cognates (electron, photon, etc.) signify a portion of a conserved quantity, a field quantum, lacking identity and a well-defined trajectory. We should not think of particles triggering detectors; instead we should think of fields triggering detectors. Classical electromagnetism is a theory about electromagnetic interaction between bodies, represented in theory by particles, whereas quantum electrodynamics is a theory about fields. The problem of giving an account of the transition from quantum electrodynamics to classical electrodynamics is basically the measurement problem.

11.8 Summary

The natural interpretation of classical electromagnetism as describing how charged particles interact with electromagnetic fields is untenable. Particles and fields cannot be thought of as interacting. Either we should think of fields as calculational devices, the electric field at a certain point being a description of the effects distant particles may have on a test particle at that point, or we may take the opposite view, holding that charged particles are nothing else than descriptions of electric fields. From a purely theoretical point of view both positions are possible. But since bodies are fundamental from an epistemological point of view, and bodies are represented in classical electromagnetism as particles, the choice for an empiricist must be to adopt a particle ontology for classical electromagnetism. When moving to the quantized version of electromagnetism, quantum field theory, we must choose an ontology of fields, because quantum particles, both fermions and bosons, lack identity criteria, which means that these purported objects cannot be treated as objects of predication. Thus there is a tension between quantum electrodynamics and classical electromagnetism, and this tension is at bottom the measurement problem of quantum mechanics.

Chapter 12

Propensities

Abstract In this chapter it is shown that in quantum theory there is a place for talk about propensities. A quantum system being in a certain state may be ascribed a certain propensity to change into another particular state. This propensity can be calculated without knowledge of any frequencies. Observations of frequencies can be used to test the correctness of the calculated propensities. The use of propensities is fruitful when discussing irreducibly random state changes. In all other cases, where randomness is an effect of incomplete knowledge of initial conditions or of the dynamics, one cannot attribute other than trivial propensities, 0 or 1, to state changes. It is furthermore important to realise that conditional probabilities are not conditionals. If we equate a statement of the form ‘prob(A|B)= x’ with ‘if B, then prob(A)=x’, x must be 0 or 1. This point has often been missed in discussions about the interpretation of probability as propensity.

12.1 Introduction

Philosophers have long argued about the interpretation of probability. What do we mean when we talk about probabilities? Is it a measure of our degree of belief in propositions, or is it an objective attribute of events, situations, objects or states of affairs, such as a long-run frequency or a dispositional attribute? I think the reasonable stance is to say that we mean different things in different situations. We may call many measure functions satisfying Kolmogorov’s axioms ‘probabilities’, and there is no reason not to expect to find many quite different domains of inquiry in which it is possible to construct a mapping from a set of events, objects or propositions onto the compact interval [0,1] of real numbers fulfilling these axioms. My interest in this chapter is in propensities and objective chances. The relation between propensities and chances is intimate. Propensities are attributes of physical objects, whereas chances are ascribed to events. The connection is: if a physical object has a propensity p to undergo a change from a state α to a state β, then and only then does this change have a chance p of materialising. One could perhaps attribute


propensities both to objects and to events, but I prefer to keep the terminology straight by using different attributes for different kinds of objects. A propensity, in my use of this term, is a dispositional attribute of a physical object.1 Under proper circumstances the disposition will result in relative frequencies that can be observed; relative frequencies are manifestations of these dispositions. Observations of relative frequencies give us evidence about propensities; they are not the same as propensities. My use of ‘propensity’ is akin to how Popper (1990) used the term, although not the same. There are three differences. The first is that Popper attributes propensities to physical situations, not to physical objects in isolation. Secondly, Popper views propensities as causes, which I do not. Thirdly, Popper holds that, for example, a fair die has a propensity of 1/6 to come up 4, whereas I hold that, objectively, in any concrete situation it is either zero or one, as the case may be. This last difference reflects a difference regarding the scope of genuine indeterminateness. I’ll return to that in the next section. ‘Chance’ in my use of the term is not the same as David Lewis’ term ‘chance’. Lewis holds that chance is objective single-case probability (Lewis 1999, 227). This he analyses in terms of objectified credence, i.e., credence conditionalised on actual history and our best theory. His New Principal Principle2 connecting objectified credence, C(·|·), with chance, P(·), runs

\[
C(A \mid TH) \;=\; P(A \mid T) \qquad (12.1)
\]

where ‘A’ refers to any proposition, ‘T’ to our best theory of the world and ‘H’ to the actual history up till now. In Lewis’ view this principle is an analysis of chance; hence chance for Lewis is an epistemic concept (and thus it is presupposed that T is true) and would change with changes in what we take to be our best theory. Since Lewis contemplates the problem that future events might change the frequencies of events and thus the chance of an earlier event, it is quite clear that in his view chance cannot be a physical attribute of events, because if it were, future events could affect earlier events, which would amount to an instance of backwards causation, which is impossible. My use of ‘chance’ is not like that; in my use, chance is a quantitative attribute of events that is independent of what we consider our best theory to be. Hence an independent characterisation of propensity (and hence chance), the objective measure function on physical dispositions, is needed. It might be argued that talk about propensities, chances and dispositions in general is superfluous metaphysics. What we observe are frequencies, so why not

1 Attributes, dispositional or not, are no entities. They are predicates and I don’t recognise referents to any predicates.
2 Lewis introduces in Lewis (1999, 243) his New Principal Principle as an improved version of the original principle C(A|HT) = P(A), because he had spotted problems with the old version.


simply say that probabilities are relative frequencies? Let’s be sound empiricists. But that is not easily done. Suppose we identify probability with relative frequency. Relative frequencies can be attributed to finite or infinite sequences of trials. No one has ever performed an infinite series of trials, so, sticking to sound empiricism, we might think of relative frequencies in finite series. But that does not conform to our intuitions. Take for example dicing. Suppose we perform 600 rolls of a die and get a 4 114 times, i.e., we have a relative frequency of 0.19. I think most of us would nevertheless say that the probability for 4 ‘really’ is 1/6, provided we don’t have any reason to suspect that the die is not fair. We believe that if we continued rolling, the relative frequency would approach the ‘true’ value. So our intuitive idea is that probabilities are the relative frequencies we would get in the long run, i.e. in infinite series of trials. The basic reason is that we intuitively conceive of probabilities as stable objective properties of objects or situations; if we identified probability with the actual frequency in a finite series, we would often be forced to say that the probability for e.g. 4 when dicing changes from one series of tosses to the next. But we are strongly convinced that in the long run the frequency will stabilize at 1/6. However, the formulation ‘the relative frequency we would get in the long run’ means ‘the relative frequency we would get, if we were to perform an infinite series of trials’, which is a subjunctive conditional, the truth value of which cannot easily be settled. The sceptic might ask: How could you know in advance what we would get? Perhaps the long-run frequency for 4 is 0.18, while the die is still fair. The natural reply would be something like this. If the long-run frequency really is 0.18, the die cannot be fair. All sides of a fair die have equal chances of coming up, so the probability distribution would be uniform in an infinite series of trials. But then, how do we know that all sides have equal chance, i.e., that the die is fair? Usually, lacking evidence to the contrary, we start by assuming a uniform distribution. Then, if we encounter evidence against a uniform distribution, we reject the uniformity and the fairness assumption. But if observed relative frequencies are the only means for determining fairness, then the attribution of propensities and objective chances, as something distinct from relative frequency in the long run, is idle. However, evidence concerning fairness can be obtained independently of frequencies of outcomes. A fair die is mirror-symmetric in three dimensions. This is a categorical attribute, more or less directly observable. Now, let’s suppose that by observing its symmetries we can convince ourselves that a die is fair, while the long-term frequency for e.g. 4 still deviates from 1/6. Could that happen? Yes, of course it could; the initial conditions in the series of trials might not be uniformly distributed. I refer to such things as the orientation and angular momentum of the die at the start of each toss, etc. All these conditions are clearly observable. So suppose further that we have made observations and found that the distributions of these parameters in an actual finite series of trials of a fair die are all uniform. Could it still be the case that the relative frequency of e.g. 4 deviates from 1/6?
I would say no, the argument being that the outcome of dicing is determined by initial conditions and deterministic laws, viz., the laws of classical mechanics. If the


initial conditions in a sequence of trials are uniformly distributed, the outcomes will also be uniformly distributed. I’m here assuming that dicing is a realisation of classical mechanics, a deterministic theory, and that quantum fluctuations have no influence on the dynamics. Since the initial conditions are observable, we can use classical mechanics to calculate the outcome. Since the outcome is determined by laws and initial conditions, the probability for, say, 4 in a single case is either zero or one, as the case may be. We no longer have a chance event. Thus it is clear that when we call dicing a chancy event, we do so because we have no detailed information about all those factors that contribute to the outcome. By saying that dicing is chancy, we don’t imply that the outcomes are genuinely indeterministic, only that we have no complete control of the process. That goes also for the use of ‘chance’ and ‘random’ in many other contexts, for example in the discussion of causes of diseases. If we compare two persons, similar in a number of relevant respects, but differing in that one got a particular disease and the other not, we are prone to say that it was mere chance that one fell ill and not the other. However, few would conclude that there is nothing more to do or say about this disease; most take for granted that further research is meaningful and that it is rational to hope to find relevant differences between those who fell ill and those who didn’t. We see that in many cases ‘randomness’ means ‘lack of information’ or ‘lack of experimental control’. But not in all cases; there are genuinely random events in nature, viz., indeterministic ones. Or at least, so I believe. An event is deterministic iff a description of it can be derived from initial conditions and deterministic laws. A deterministic law together with a given set of values for all parameters and independent variables in this law entails definite values of the dependent variables. So a deterministic law is a mapping from the set of possible initial conditions to the set of values of the dependent variables. We are here considering dynamical laws, i.e., laws describing how states of physical systems evolve in time. Hence a law is deterministic iff only one state at any particular time is possible given the law and a complete specification of initial conditions. If more than one state at each point of time is thus compatible, the law is indeterministic. In other words, in the deterministic case the probability distribution over possible states is a singularity, i.e., concentrated on a single state.
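Returning to the dicing example: the contrast between finite-series frequencies and the long-run value 1/6 can be illustrated with a small simulation (a hypothetical sketch; the seed and the numbers are illustrative only):

import random

def relative_frequency(n_rolls, face=4, seed=None):
    # Roll a fair (uniformly random) die n_rolls times and return
    # the relative frequency with which the given face comes up.
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == face)
    return hits / n_rolls

# A short series may deviate noticeably from 1/6 ≈ 0.167,
# while longer series tend to stabilise around that value.
for n in (600, 60_000, 6_000_000):
    print(n, round(relative_frequency(n, seed=1), 4))

Of course, such a pseudo-random simulation is itself fully deterministic given the seed, which only underlines the point that calling the outcomes ‘random’ here signals lack of information, not genuine indeterminism.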

12.2 Objectivity and Chanciness

Is lack of control or lack of information an objective feature of events? No. One may reasonably say that, objectively, the single-case chance for, say, 4 upon dicing is either zero or one, not 1/6, because conditional on the initial conditions and the force situation, the outcome is determined to be either 4 or not 4. But of course, most people would, if queried, say that the probability for 4 when using a fair die is 1/6. Thus we have a reason to distinguish between objectivity and intersubjectivity;


Objectively the chance for 4 is either zero or one, but intersubjectively the chance is 1/6. In many contexts we equate objectivity with intersubjective agreement. In the present case this is wrong, if we take intersubjective agreement as meaning ‘almost universal agreement’. For if a person had detailed information about the initial conditions for each roll of a die, he would be able to predict the outcome with complete certainty. That means that he would say that the probability for e.g. 4 is either zero or one in each particular case, and he would be right and the majority wrong. The most reasonable stance would be to say that most people’s credence in the proposition ‘the outcome of the next toss is 4’ is 1/6, whereas the objective probability is either zero or one, depending on the initial conditions. There is no conflict in saying that objectively the probability for 4 in a single toss is either zero or one as the case may be, whereas the subjective credence is 1/6. Lewis’ Principal Principle (both the old and the new version) tells us that the objectified credence, i.e., the credence we ought to have in a proposition, conditional on our best theory and full information about the history up till now, should equal the objective chance, in this case zero or one. This seems perfectly reasonable; in the example just discussed, the objectified credence is either zero or one. The conclusions to be drawn are two: (i) we should distinguish between events that are genuinely random and those that we usually call random, in spite of their being deterministic, due to lack of complete information, and (ii) we should restrict the concept of objective chance to events we believe are genuinely random. But are there any such genuinely indeterministic events?

12.3 Indeterminism and Objective Chance

I don’t believe the world is completely deterministic. We have good reason to believe that at least in the quantum world there are genuinely indeterministic events, for example decays of individual unstable nuclei, absorption of photons in atoms and, in general, all state transitions associated with irreversible interactions. Predictability entails determinism, but the converse implication does not hold, since there are physical situations that are completely deterministic but not predictable, as observed already by Laplace. So if we introduce the concepts PROPENSITY and OBJECTIVE CHANCE we should construe them so as to cover only those genuinely indeterministic events, i.e., those events such that even if we know everything there is to know about the conditions, the outcome is not predictable. The distinction determinism/indeterminism is a piece of metaphysics, because whether a system is indeterministic or not cannot be finally decided. The reason is that in order to prove a system indeterministic, we need to know that our theory about that system is complete. For suppose we have a theory with indeterministic laws. In order to know that real systems obeying these laws really are indeterministic we need to exclude the possibility that our theory is incomplete and could be completed with hitherto hidden variables that determine the seemingly undetermined outcomes. But that cannot be done, for in order to do that we need to somehow


directly compare theory and reality as it is in itself, unmediated by our observations and concepts, which is impossible. So we are always free to entertain the hope that an apparently indeterministic theory will sooner or later be replaced by a deterministic and complete one. So, for example, since Einstein couldn’t accept that nature fundamentally is indeterministic, he believed that quantum theory is incomplete and therefore possible to improve; he is reported to have said ‘Der liebe Gott würfelt nicht.’ (‘God does not play dice.’) When it comes to a non-fundamental theory we are often in a position to say that it is not complete or that it is an approximation, because we compare this non-fundamental theory with a more fundamental one. But with fundamental theories we have nothing better to compare with. This is the situation as regards quantum theory; it is a fundamental and indeterministic theory. (String theory is in a sense more fundamental, but it is a quantum theory and thus indeterministic.) The other fundamental theory, the general theory of relativity, is deterministic, and since we have no other fundamental theories to consider, we can in what follows focus on quantum theory as the sole candidate for a theory in which we have any need for the concepts PROPENSITY and OBJECTIVE CHANCE. The question for us now is whether it is possible to define a measure function over genuinely indeterministic events in physical systems described by quantum theory which fulfils the following conditions: (i) it satisfies Kolmogorov’s axioms for a probability measure, and (ii) it is possible to derive probability distributions for state changes directly from the axioms of the theory. If there exist such probabilities I will call them propensities when attributed to physical objects and chances when attributed to state changes of these objects. Thus, an object has a propensity of x% to undergo a particular state change iff the objective chance for this state change is x%. Physical objects are attributed propensities and events are attributed chances. The point of condition (ii) is that it guarantees that probability distributions can be calculated without use of frequencies or credence functions. This is necessary because otherwise propensities are idle; if the only way to acquire information about probability distributions is to observe relative frequencies, there is no point in introducing propensities as something distinct from frequencies. Similarly, if we define propensity using credence functions, however objectified, they are objectified credences, which in essence are not attributes of external objects, but of our minds. (Our minds are in a sense objects in the world, of course, but the point is that propensities are thought of as attributable mainly to things that are not minds.) So the question is: how do we determine probabilities for what we believe to be genuinely random events in quantum physics? The answer is easy: the squared modulus of the scalar product of two normalised state functions in the same Hilbert space fulfils Kolmogorov’s axioms for a probability measure, and these squared scalar products, expressing transition probabilities, provide the required propensities. Let’s imagine that we have a physical system which for some time is isolated from the rest of the world. Its state at time t is Ψ(t) and its evolution in time


is determined by the time-dependent Schrödinger equation. This time evolution is deterministic; no chanciness occurs. However, sometimes there occurs a collapse of the state, either spontaneously or induced by some external condition. As will be shown in Chap. 16, the collapse is a non-linear, non-unitary, irreversible and indeterministic event. Against the background of quantum mechanics, the other features follow from indeterminism, as we will see in Chap. 16.

The common view is that the collapse occurs if and only if a measurement is performed upon the system. This view, however, faces severe difficulties. From a strictly physical point of view, measurements are but examples of ordinary interactions, i.e., interactions between the measurement device and the measured system. As is well known, the measurement device could, theoretically, be included in the description of the measured system, and this combined system obeys the deterministic Schrödinger equation. It means that it will not collapse, but observations show that it has; this is the measurement problem of quantum mechanics. In Chap. 16 I will briefly discuss the most well-known views on the measurement problem. They can be divided into collapse and no-collapse interpretations. I do think there are overwhelmingly strong reasons to adopt a collapse interpretation and I will give my views in Chap. 16. The interest in propensities and objective chances depends on that stance, for if one believes that the world is deterministic there is no use for these two concepts. So let us for the present discussion assume that genuinely random events occur and that quantum mechanics cannot be completed with hidden variables.

How, then, do we calculate the probability distribution over the space of possible outcomes in a particular case? The recipe is as follows. The collapse is a state change Ψ → φk where φk is one out of several possible states {φi}. This set of state functions {φi} can always be chosen so as to constitute a complete orthonormal set of functions, spanning the same Hilbert space in which the original state function Ψ is defined. If this is done, the scalar products {⟨Ψ, φi⟩}, usually labelled transition probabilities, each give a number between zero and one, provided Ψ is normalised. (The expression ⟨Ψ, φk⟩ can be visualised as the overlap between the two functions, and the bigger the overlap, the higher the probability that the system changes its state from Ψ to φk.) The conditions of completeness and orthonormality together guarantee that the scalar product is a function fulfilling the axioms for a probability measure. So here we have a way of determining propensities without relying on frequencies or credence. Hence, the two conditions on propensities are fulfilled: the axioms for a probability measure are satisfied and they are defined without using frequencies or credence functions, which means that in any model of the axioms of quantum mechanics we can calculate transition probabilities, i.e., propensities.
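To see how the two conditions can be met in practice, here is a minimal numerical sketch (my own illustration, not the book's), reading the transition probabilities as the squared moduli of the overlaps: a normalised state is expanded in an arbitrary orthonormal basis of a finite-dimensional Hilbert space, and the resulting numbers lie in [0, 1] and sum to one, as a probability measure over the possible collapses requires.

```python
import numpy as np

# A minimal sketch (not from the book): a normalised state Psi expanded in an
# orthonormal basis {phi_i} of a finite-dimensional Hilbert space. The squared
# moduli of the scalar products <phi_i, Psi> behave as a probability
# distribution over the possible collapses Psi -> phi_i.

dim = 4
rng = np.random.default_rng(0)

# Arbitrary complex state, normalised.
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi /= np.linalg.norm(psi)

# Orthonormal basis: here simply the standard basis vectors of C^4.
basis = np.eye(dim, dtype=complex)

# Transition probabilities |<phi_i, Psi>|^2.
probs = np.array([abs(np.vdot(basis[i], psi)) ** 2 for i in range(dim)])

print(probs)          # each value lies in [0, 1]
print(probs.sum())    # completeness + normalisation => sums to 1.0
assert np.all(probs >= 0) and np.isclose(probs.sum(), 1.0)
```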


12.4 Conditional Propensities

Paul Humphreys (1995) claimed that propensities do not obey Kolmogorov’s axioms for a probability measure. His argument is that the axioms entail Bayes’ theorem, i.e., a formula for inverting conditional probabilities, but propensities are non-invertible. In the case of only two alternative events, B and not-B, Bayes’ theorem is

P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|¬B)P(¬B)]        (12.2)
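As a quick arithmetic illustration of Eq. (12.2), with made-up numbers chosen to mimic the photoelectric case discussed below (they are not Humphreys' own figures): when B is necessary for A, Bayes' formula drives the inverted probability to one, whereas the 'propensity intuition' would keep it at the unconditional value.

```python
# A worked illustration of Eq. (12.2) with hypothetical numbers (not
# Humphreys' own figures). A = electron emission, B = exposure to light
# above the threshold frequency.

p_A_given_B    = 0.3   # propensity to emit, given exposure
p_A_given_notB = 0.0   # exposure is necessary for emission
p_B            = 0.5   # probability of exposure

# Bayes' theorem (12.2): invert the conditional probability.
p_B_given_A = (p_A_given_B * p_B) / (
    p_A_given_B * p_B + p_A_given_notB * (1 - p_B)
)

print(p_B_given_A)   # 1.0 -- whereas the 'propensity intuition' would expect
                     # the propensity for exposure to remain p_B = 0.5,
                     # unaffected by the later emission.
```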

Humphreys gave some examples, such as this one: when light above a frequency threshold hits a metal surface, electrons will be emitted from the surface. Electron emission is an indeterministic event, so one can say that an electron has a certain propensity p to be emitted, conditional on the metal being exposed to light above the threshold frequency. Using Bayes’ formula for calculating propensities, one can calculate the inverse, viz., the propensity for the metal being exposed to such light, conditional on an electron being emitted. Since exposure to light above the threshold is a necessary condition for emission, this latter probability equals one, and so the propensity must be the same. But this is not what we get when using Bayes’ formula. Humphreys concluded that no formal analysis of propensities that has the following statement as a theorem is adequate:

Cond. A. If the probability of A given B exists, so does the probability of B given A.

I disagree with Humphreys on this point. I will first discuss his examples and then proceed to a general discussion about asymmetries of conditional propensities. Humphreys writes (op. cit., p. 558):

Whether or not a particular electron is emitted is an indeterministic matter, and hence we can claim that there is a propensity p for an electron in the metal to be emitted, conditional upon the metal being exposed to light above the threshold frequency. Is there a corresponding propensity for the metal to be exposed to such light, conditional on an electron being emitted, and if so, what is its value?

Humphreys naturally concludes, since propensity is a disposition, that the latter propensity is unaffected by a later emission of an electron from the metal. But this disagrees with the probability value we can calculate using Bayes’ formula.

If we try to formalise Humphreys’ example, we must have him assuming a two-dimensional outcome space consisting of the four outcomes (L = metal exposed to light above threshold, E = emission of an electron): {(L, E), (¬L, E), (L, ¬E), (¬L, ¬E)}. We can now assign a probability distribution over this outcome space and then calculate the conditional probabilities definable in it. But the problem with this example is his mixture of macroscopic and microscopic concepts, viz., ‘exposure to light’ and ‘electron emission’, and one might wonder whether a consistent microscopic description would make his example convincing. So let us take ‘L’ to mean one single photon, not a huge number, arriving at the metal


plate. But what would that mean? A photon has a well-defined position only in interactions, i.e., when it is produced or absorbed; during propagation it cannot be attributed a position like ‘arriving at a metal plate’, unless it interacts with the metal plate. A mere arrival of a photon is no event that can be identified by any physical criteria. The photon ‘arrives’ if and only if it gives off its energy to the metal, and that is the same event as an electron being emitted. Thus, our two-dimensional outcome space would in this reading collapse to one dimension and no conditional propensities can be defined on it.

In the next section of his paper Humphreys has a more detailed and general argument to the effect that propensities conflict with the axioms for probability. Using another physical event, the reflection and transmission of photons on a half-silvered mirror, he states the following regarding propensities (T = transmission, I = impinge, R = reflection of a photon, B = background conditions):

(i) Pr(T|IB) = p > 0
(ii) 1 > Pr(I|B) = q > 0
(iii) Pr(T|¬IB) = 0

The last assumption says that the propensity for a photon being transmitted through the mirror equals zero if the photon does not impinge on the mirror. Furthermore he states an independence principle saying that the propensity for impinging is independent of transmission and reflection:

CI: Pr(I|TB) = Pr(I|¬TB) = Pr(I|B)

The assumptions (i), (ii), (iii) and CI together with Bayes’ theorem yield a contradiction. So either propensities cannot be identified with probabilities or one of the four assumptions is not true of propensities. I take the latter stance.

In order to analyse the situation we need, as before, to be clear about how Humphreys conceives the outcome space. Since transmission and reflection of a photon exclude each other and also exhaust the possibilities, the outcome space must be

{(I, T), (I, R), (¬I, T), (¬I, R)}

The background conditions are assumed to be stable. They are thus idle in the description of the events in the outcome space and we can suppress ‘B’. As before, the problem is with the macroscopic descriptions of what are said to be microscopic indeterministic events. The impinging of a photon on a mirror is no event that can be distinguished by any criteria, unless it interacts with the mirror. A photon cannot be said to be somewhere when it does not interact with the surroundings, as has already been pointed out. So let’s try to interpret Impinge as the photon really interacting with the mirror. If so, Impinge must be identical to the event of the photon either being reflected or transmitted through the half-silvered mirror; thus I = T ∨ R. If the mirror is exactly half-silvered, a photon that interacts has equal chances of being reflected or transmitted. Since ¬T = R and Pr(I|¬T) = Pr(T ∨ R|¬T) = Pr(T ∨ R|R), we have

Pr(I|T) = 1        (12.3)
Pr(I|¬T) = 1        (12.4)

So the first equality of CI is correct. (But I think Humphreys’ motive for CI was that Impinge is causally independent of transmission and reflection; this thought surfaces later in the paper, in which he discusses causation and probability.) What then about the second equality, i.e., what is Pr(I)? In order to fulfill CI, we must now put Pr(I) = 1. This can only be true if we conceive of an experiment where only photons observed to have been reflected or transmitted through the half-silvered mirror are counted. But if so, assumption (ii) is false. If we on the other hand conceive of a situation in which assumption (ii) is true, we must think of more photons than those transmitted or reflected being counted. But if so, the second equality of CI is false. Hence, Humphreys’ premises cannot be fulfilled in any consistent description of the type of experiment considered. Of course, we know very well that probabilities can only be attributed to well-defined experiments, and this is true no matter whether we interpret probabilities as manifestations of propensities or not. If an object has a certain propensity all by itself to do something, then, of course, if we want to calculate it we must determine the outcome space clearly.
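The tension can be laid out in a few lines of arithmetic (a sketch of my own, with illustrative numbers): under the reading I = T ∨ R the two equalities of CI hold but Pr(I) = 1, contradicting assumption (ii); keeping assumptions (i)-(iii) instead and applying Bayes' theorem forces Pr(I|T) = 1, contradicting the second equality of CI.

```python
# Sketch of the argument above, with illustrative numbers. Under the reading
# I = T v R (the photon 'impinges' iff it is transmitted or reflected), the
# outcome space effectively contains only (I,T) and (I,R).

p_T = 0.5            # exactly half-silvered mirror: transmission
p_R = 0.5            # reflection; p_T + p_R = Pr(I)

p_I = p_T + p_R      # = 1, so Humphreys' assumption (ii), Pr(I) < 1, fails

# Both equalities required by CI do hold under this reading:
p_I_given_T    = 1.0   # Pr(I|T): a transmitted photon has impinged
p_I_given_notT = 1.0   # Pr(I|not-T) = Pr(I|R): a reflected photon has impinged

# By contrast, keeping assumption (ii), Pr(I) = q < 1, together with (i) and
# (iii), and applying Bayes' theorem forces Pr(I|T) = 1 != q, so the second
# equality of CI fails instead:
p, q = 0.5, 0.8
p_I_given_T_bayes = (p * q) / (p * q + 0.0 * (1 - q))   # Pr(T|not-I) = 0

print(p_I, p_I_given_T_bayes)   # 1.0 and 1.0: either way CI + (i)-(iii) clash
```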

12.5 Conditionals vs Conditional Probabilities

However, the general question remains: is it meaningful to talk about conditional propensities? Even though Humphreys’ counterargument failed, there might still be problems with a propensity interpretation of conditional probabilities. The first question we have to answer is ‘what is the meaning of a conditional probability in the domain of genuinely chancy events, i.e., of quantum state transitions?’. So let’s return to these and discuss in some more detail how to apply a probability measure to them.

A transition is a change from a state Ψ to a state φk, the latter being one out of a complete set of orthonormal states {φi} spanning a Hilbert space in which we find Ψ. But the same state Ψ could also evolve into a state ξj belonging to another orthonormal complete set spanning the same Hilbert space. For example, a spin-half particle may change spin state from |spin(up)⟩ in the x-direction either to |spin(up)⟩ or |spin(down)⟩ in the z-direction, or to |spin(up)⟩ or |spin(down)⟩ in the y-direction. However, the state Ψ cannot under identical external conditions ‘choose’ between evolving into a state φi or into a state ξj, if these two states are eigenstates of non-commuting operators. The choice of the set of possible outcomes is made in the preparation step of the experiment by a suitable arrangement of external potentials, as will be thoroughly discussed in Sect. 16.4. So one can in this case say: ‘The probability for a spin-half particle, being in the spin-up state in the x-direction, to change into spin-up in the z-direction, is 50%, if the external conditions are such that the particle with certainty will align its spin along the z-direction.’ So the


probability distribution is conditional on preparation, viz., on the type of interaction to be performed. This is, however, not a conditional probability: it has the form ‘if B, then prob(A) = x’, not ‘prob(A|B) = x’. These two statement forms mean different things. It is known that if we equate these expressions we get the disastrous consequence that the probability for any event A is either zero or one, see Edgington (1995).

A quantum system S can be attributed a propensity to undergo a change from a state Ψ to a state φk; it is the overlap integral. In order to check this propensity we must prepare a test situation such that S is forced into the state Ψ and then physically acted upon in a way represented by an eigenoperator to the state φk. Of course, a mathematical object does not act upon the real physical system; the operator operates on the function representing the real system and this mathematical operation may be thought to represent a physical action on the system described by that function. This means that the outcome space is determined by the type of interaction to be performed, i.e., by the chosen operator. If we choose another sort of interaction, represented by an operator not commuting with the first, then the outcome spaces in these two interaction types necessarily differ. This means that the propensity for a state change Ψ → φk can only be tested if the appropriate type of interaction takes place. So propensities only manifest themselves as frequencies in certain well-defined experiments. But this is nothing peculiar to propensities; the same goes for dispositional properties in general. The solubility of sugar in water can only be manifested if sugar is put into water. Nancy Cartwright (1980) thinks (and I agree, see Chap. 16) that such quantum transitions occur irrespective of whether the interaction is a measurement or not, while adherents of the Copenhagen interpretation and many others deny that. But this difference is immaterial for the present discussion.

Next, let’s return to Humphreys’ argument that inverting conditional propensities results in absurd statements; so far I have only rejected his examples, not the general statement. Consider a set of possible transitions from a state Ψ to members of a set of states {φi}, where this set is a complete and orthonormal set of states in the same Hilbert space H in which Ψ ‘lives’. The outcome space is the set of such transitions, i.e., {(Ψ → φ1), (Ψ → φ2), (Ψ → φ3), ...}. An event is a subset of this set, for example {(Ψ → φ1), (Ψ → φ2)}. Let’s call this event E12, while E1 is the label for (Ψ → φ1). Now we know what we mean by a conditional probability, such as P(E1|E12); it is the probability for E1, given that E12 occurs. If the distribution over the outcome set is uniform and the outcomes are mutually disjoint, this equals 50%. Could such a formula be given a meaning in terms of propensities? Does it make any sense to say ‘The propensity for E1, given that E12 occurs, is 50%’? I think it does. First, I see no reason not to say that a system in state Ψ has a certain propensity to change either as (Ψ → φ1) or as (Ψ → φ2); it is the sum of the individual propensities because E1 ∩ E2 = ∅. Secondly, I see no reason not to calculate the ratio of two propensities, and since Pr(E1|E12) = Pr(E1 ∩ E12)/Pr(E12), we


have given the truth conditions for conditional propensities in terms of things that are meaningful. What then happens if we invert conditional propensities using Bayes’ formula? In this case we get P(E12|E1), which is the propensity for a system in state Ψ to change into either the state φ1 or the state φ2, given that it changes into φ1. I don’t see any reason to say this is absurd or meaningless, and this propensity is obviously equal to one. So we have here at least one domain in which there are good reasons to believe that there are genuinely indeterministic events that can meaningfully be attributed both unconditional and conditional propensities. Moreover, quantum theory provides us with measure functions for state transitions fulfilling the axioms for a probability measure. So I think Humphreys is wrong.

It should now be pretty clear that the reason why Humphreys found it impossible to interpret conditional probabilities as propensities has nothing to do with the nature of propensities, but with the implicit assumption that conditions prepared by the experimenter could be viewed as events in the outcome space and thus used in calculating conditional probabilities. Since those conditions are determined by the experimenter, he can decide the frequencies for different preparations. Hence they are not stochastic variables. But these preparation events do not belong to the outcome space and any frequency attributed to such events is not involved in calculations over the outcome space. This is the fundamental reason why we need to clearly keep apart statements of the forms ‘if A, then p(B) = x’ and ‘p(B|A) = x’; the conditional can be used to state probabilities under preparation conditions, whereas the conditional probability expresses a ratio between probabilities. It should be pretty obvious that if we equate these things we make a mistake, and this conclusion is confirmed by Edgington’s proof.
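A small numerical sketch (my own construction) ties the pieces together: the propensities are computed as squared overlaps with an orthonormal basis, the propensity for the disjunctive event E12 is the sum of the two disjoint propensities, and the inverted conditional propensity P(E12|E1) comes out as one, just as claimed.

```python
import numpy as np

# A sketch (my own construction): propensities for transitions Psi -> phi_i
# computed as squared overlaps, then combined into the conditional
# propensities discussed above. E1 = (Psi -> phi_1), E12 = E1 or (Psi -> phi_2).

dim = 4
basis = np.eye(dim, dtype=complex)

# A normalised state with equal weight on all four basis states.
psi = np.array([1, 1, 1, 1], dtype=complex)
psi /= np.linalg.norm(psi)

p = np.array([abs(np.vdot(basis[i], psi)) ** 2 for i in range(dim)])

p_E1  = p[0]
p_E12 = p[0] + p[1]          # disjoint outcomes, so propensities add

p_E1_given_E12 = p_E1 / p_E12     # = 0.5 for the uniform distribution
p_E12_given_E1 = p_E1 / p_E1      # Pr(E12 & E1) = Pr(E1), so this equals 1

print(p_E1_given_E12, p_E12_given_E1)    # 0.5 1.0
```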

12.6 The Scope of Genuine Randomness

Are there other genuinely random events in nature? According to our present theories, the answer is no. When judging the scope of the concept of genuine, irreducible indeterminism, our best option is to rely on our best fundamental theories about the world. They are relativity theory and quantum theory, more precisely, the standard model. Like most philosophers I believe that mental events supervene on physical events, i.e., I hold that if a mental change (within a person) has occurred, then there has also occurred a physical change. Supervenience can be stated as the slogan ‘no mental difference without a physical difference.’ The converse is however not true; many physical changes in our bodies are not accompanied by any mental change. It goes without saying that even biological and chemical changes supervene on physical ones, hence all events, physical, chemical, biological, social, psychological, historical etc., supervene on facts described in these two basic theories of physics. This is the minimal content of physicalism. Put differently,


If the physicist suspected there was any event that did not consist in a redistribution of the elementary states allowed for by his physical theory, he would seek a way of supplementing his theory. Full coverage in this sense is the very business of physics, and only of physics (Quine 1981a, 98)

Of course, psychologists, historians and other researchers outside physics couldn’t care less about physics and mostly they are perfectly right in so doing. But my point is ontological. I need not commit myself to any form of reductionism for the purpose of the present argument. Now, indeterminism enters quantum theory during collapses and, as Bohr rightly observed, indeterminism is connected with the quantum of action, see Chap. 16. When aggregating to macroscopic descriptions, randomness disappears. So, for example, even though it is genuinely indeterminate whether a singular photon entering (or better ‘being in the vicinity of’) a material surface will pass through, be absorbed or reflected, when we aggregate and consider the proportions of these events when a flash of light hits a surface, there is no longer any indeterminism. Transmittance, reflectance and absorption coefficients can be determined to any desired degree of accuracy in advance. And since all chemical, biological and psychological phenomena are built up of basic physical events, there is no additional chanciness in nature. If there were genuinely random events describable with, for example, biological predicates, we should have indeterministic laws expressed in non-reducible biological predicates. But we have none. I submit that all apparent randomness in statistical mechanics, biology, chemistry, psychology etc., is due to lack of complete information. So all genuine indeterminism is traceable to indeterminism in quantum interactions.

12.7 Summary

The concepts PROPENSITY and CHANCE are intended to be attributes of physical objects and of events, respectively, attributes that explain observed frequencies. It means that we have use for PROPENSITY and CHANCE as objective features only if there are genuinely indeterministic events in nature. For if we believe that all events are determined by deterministic laws and initial conditions, all ascriptions of probabilities reflect lack of knowledge, not genuine indeterminacy. In such a case all probabilities collapse to zero or one when complete information is attained. If so, all probabilities and relative frequencies are measures of human ignorance, not attributes of observed objects and systems.

The concepts PROPENSITY and CHANCE are useful only if their values can be calculated without access to observed relative frequencies. This is possible in quantum mechanics; one can calculate the overlap integral of a system’s state before and after a state transition, and this integral returns numbers which fulfil the axioms for a probability measure, provided the initial state function is normalised. In this way we can test hypotheses about propensities by observing relative frequencies.

Chapter 13

Direction of Time

Abstract Time is directed, which is often thought to conflict with the fact that all laws in fundamental physics are symmetric under time reversal. This is however a confusion of two concepts. That time is directed is reflected in the asymmetry of the predicates ‘before’ and ‘after’. A reversal of the time parameter in physics, on the other hand, is a mere change of convention, from labelling later times with bigger numbers to labelling them with decreasing numbers. Just as the directions of axes in spatial coordinate systems are mere conventions, the direction of the time axis is a mere convention, and it is a fundamental requirement that fundamental physics be invariant under changes of such conventions. We can in many cases easily decide which of two events occurred before the other one without having any time measuring device or any memory of these events. Many physical systems change state irreversibly; this is the basis for the asymmetry of the predicates ‘before’ and ‘after’. It is further shown that a universal direction of time does not require any universal clock undergoing irreversible changes. It suffices that there are a number of partly isolated physical systems, each existing for some time and undergoing irreversible changes. If they partly overlap in time we can construct a universal time parameter beginning with the Big Bang. A universal time direction can thus be constructed without using the second law of thermodynamics.

13.1 Introduction

We live in a world in which there is an obvious difference between past and future, both as regards our ‘lived world’ and as regards the physical world, the universe and its parts. By contrast, most laws in fundamental physics (exceptions will be discussed in due course) are time symmetric. Since physicists and philosophers have strong confidence in fundamental laws and also believe that explanations should use fundamental laws as ultimate starting points, there is an explanatory problem: why is the real world asymmetric in time when almost all fundamental laws are time symmetric and reversible? It is well known that from a set of laws, all of which are symmetric in time, one can derive only time symmetric propositions.


Numerous attempts to bridge this explanatory gap have been made, and many have contributed to the discussion, see for example Earman (1974, 2002), Horwich (1987), Coveney and Highfield (1991), Mackey (1993), Halliwell et al. (1994), Savitt (1995), Price (1996), Prigogine and Stengers (1997), Albert (2000), Zeh (2007), Mersini-Houghton and Vaas (2012) and Albeverio and Blanchard (2014).

An obvious starter, suggested by many participants in the debate, is to base an explanation of the observed time asymmetry on the second law of thermodynamics, although this law is not viewed as a fundamental law. This law tells us that entropy in an isolated system consisting of a vast number of interacting objects increases with time, if entropy is not already maximal. However, Price (1996) criticises attempts to base the directedness of time on this law for begging the question. The explanation relies, he claims, on the assumption that objects in the system normally are uncorrelated before they interact and correlated after interaction. This assumption seems very plausible from our ordinary perspective on events, as is observed in the certainty with which we use ‘earlier’ and ‘later’ for ordering events in time; but it is utterly implausible when time is reversed. Hence, using the second law as an explanation for the direction of time actually presupposes time asymmetry, i.e., that we can, independently of any theory, tell the difference between ‘before event e’ and ‘after event e’. I will discuss time asymmetry and the second law of thermodynamics further in Sect. 13.7.

Is perhaps the direction of time a mere appearance? Perhaps the real world is symmetric in time and reversible? Or is the arrow of time a real trait of the world? If so, how do we integrate such an assumption into fundamental physics? Not only philosophers but also some physicists are troubled; so for example Antony Zee (2003, 99, footnote) writes: ‘Incidentally, I do not feel that we completely understand the implications of time reversal invariance.’ Like most people I take for granted that the arrow of time is a real trait of the world and not just a mere appearance. How, then, is it possible to account for this observation using physical theories all based on laws which are symmetric in time? And what is the physical basis for the direction of time?

The first question reflects a confusion of TIME REVERSAL SYMMETRY and DYNAMICAL REVERSIBILITY. My first point in this chapter is that clearly distinguishing between these two concepts enables us to hold that dynamical irreversibility and time reversal symmetry are compatible. Secondly I will discuss the reason for demanding time reversal symmetry and the more general CPT symmetry (Charge conjugation, Parity and Time reversal) of fundamental laws. Finally, I will discuss how the direction of time relates to irreversible state changes in physical systems in a way that to my knowledge has not hitherto been discussed.


13.2 Time Reversal and Dynamics of Motion

13.2.1 Time Reversal in Classical Mechanics

Time reversal is usually associated with reversal of motion. Consider a particle moving from x1 to x2 during a time interval Δt, i.e., its average momentum is p = mΔx/Δt. Now suppose we perform a time reversal, i.e., a reversal of the time parameter: this will change the sign of Δt and hence the representation of momentum, since the variable p will also change sign. But a similar effect would be achieved if we instead keep the direction of the time axis and reflect the particle against a wall; this would change the sign of Δx and hence the sign of momentum. (Similarly, if we reverse the x-axis, while keeping time and direction of motion relative to other bodies, we get a reversal of momentum. A joint reversal of the spatial axis and time will however not change momentum. I’ll discuss this topic further in the next subsection.) So the result of inverting the time axis and of reflecting the motion of the particle while keeping the time direction is similar at the level of representation; both actions are represented by the vector transformation p → −p.

In other words, the mathematical operation of changing the sign of variables representing velocity and momentum can represent two quite different things. It can either be a result of reversal of the coordination of the time axis, or a change of the direction of motion in a previously chosen time basis. This is of course a trivial observation, but the point is that so long as one only studies theory in abstraction from applications to real physical experiments this triviality will go under the radar.

Considered in isolation, a time reversal operation is a mere change of the mathematical representation of time relations between events. Just as the choice of direction and zero point of the coordinate axes when describing the spatial positions of events is a matter of convention, one can choose the direction (and zero point) of the time axis; no physical fact forces us to label later times with increasing numbers. Of course, ordinarily, when we represent the time ordering of events we always choose timing so that if an event a is earlier than another event b we represent times with numbers so that ta < tb. However, this is, just as the choice of zero point, a purely conventional affair. Nothing prevents us from inverting the numerical representation of the times of events, and such a reversal is only a matter of replacing one convention with another. If we reverse the time axis in our coordinate representation of the times of physical events, so that later times are represented by decreasing numbers, all times and time derivatives and linear functions thereof will change sign, without anything real having changed. Hence, by itself a change of sign of these variables cannot without further assumptions be taken to represent a change of direction of dynamical evolution. So how do we distinguish between the two cases?

A theory is a set of postulates and derived sentences. In the case of physical theories these postulates and sentences express relations between quantities (i.e., quantitative predicates), but these sentences do not by themselves contain any semantic connection to any extra-linguistic states of affairs; they only relate words to other words. In order to make contact with the physical world, a theory must be


completed with ‘here-and-now’ sentences, i.e., sentences where indexicals are used together with pointing gestures for establishing reference to real things and events. This is a general lesson from semantics, by Luntley (1999, 285) labelled ‘Russell’s insight’, see the discussion in Sect. 9.2. Three obvious examples of such indexicals are terms for times, places and persons. When we label a certain time point ‘zero’, we must point to a particular clock at a certain point of time and dub its state at that moment ‘zero’. We can then label a certain later state of the clock, its showing ‘10’ for example, as t = 10 s. But, as already pointed out, we could just as well invert the time coordinates, having a negative sign on the clock’s numerals and saying that t = −10 s. This is reversal of the time axis in our representation system. When we say that a clock changes its pointer state from t = t1 to t = t1 + 1 we have said that the state t = t1 occurs before the other state. So changing the coordinates doesn’t change the direction of the succession of events. Thus the notion of reversing the time axis presupposes that we have a way of telling, independently of attributing time coordinates, which of two events is the earlier one, at least in some cases. I will return to that topic in Sect. 13.4. It follows that time reversal symmetry, which is a general feature of many fundamental equations in physics, does not tell us anything about the possibility of reversing the motion or dynamical evolution of a physical system or a body. It only tells us that the fundamental equations (except those which explicitly contain odd powers of time and time derivatives) are invariant under the time reversal operation, which is a change of a convention.
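The 'trivial observation' above can be spelled out with a few numbers (an illustration of my own): the same sign flip of the momentum variable results either from relabelling the time axis or from physically reflecting the particle while keeping the time convention fixed.

```python
# A sketch: the same sign change of the momentum variable can represent either
# a relabelling of the time axis or a real reversal of motion. Numbers are
# arbitrary.

m  = 2.0          # mass
dx = 3.0          # displacement x2 - x1
dt = 1.5          # elapsed time t2 - t1, later times labelled by bigger numbers

p = m * dx / dt                   # average momentum, +4.0

# Case 1: reverse the time axis (a change of convention). Later times are now
# labelled by smaller numbers, so the time difference changes sign. Nothing
# physical has changed, yet the momentum variable flips sign.
p_relabelled = m * dx / (-dt)     # -4.0

# Case 2: keep the time convention but reflect the particle against a wall,
# so the displacement changes sign. Now something physical has changed,
# yet the representation is the same sign flip.
p_reflected = m * (-dx) / dt      # -4.0

print(p, p_relabelled, p_reflected)
```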

13.2.2 CPT Symmetry

It is generally required that physical theories should be invariant under changes of human conventions, and this demand in turn is an application of an objectivity demand. A fully objective theory shall describe nature from no particular point of view. An application of this objectivity demand is to require independence of particular choices of parametrisation of physical magnitudes. This is the fundamental reason for requiring general covariance, i.e., invariance of the form of physical laws under arbitrary differentiable coordinate transformations. The essential idea is that coordinates do not exist a priori in nature, but are only artifices used in describing nature; hence they should play no role in the formulation of fundamental physical laws. But that demand has no implications concerning reversibility of motion. However, symmetry under only a time reversal is not a correct demand, since TIME is intimately connected to SPACE and CHARGE, and hence to all quantities


defined in terms of these.1 This may be observed by looking at the definitions of the units for time (second), distance (meter) and charge (coulomb) in the SI system. In the SI definitions we see that not only those quantities and units that are called ‘derived’ are defined in terms of others, but also several of the fundamental ones. It is only the time unit, the second,2 that is truly independent of other units: so for example, electric current in a certain direction is defined as the derivative dq/dt, i.e., the amount of charge q passing through a plane perpendicular to that direction per unit time. So a time reversal operation T : t → −t must, in order to be a mere change of convention, be accompanied by reversal of those magnitudes that in this way depend on the direction of the time parameter. And, conversely, a time reversal not accompanied by reversal of such magnitudes does not by itself reflect a mere convention reversal. TIME and its unit are deeply integrated into the entire system of physical quantities and units. It is in fact the most fundamental one, as one may infer from how the definitions of the units in the SI system are related; the second is the only unit which is defined without using any other unit.

It is well known that we should require CPT invariance, i.e., invariance under the joint operations of time reversal, parity reversal and charge conjugation, for any truly perspective-independent description of nature, see Lüders (1957). It is furthermore proven that violation of CPT invariance entails violation of Lorentz invariance, see Greenberg (2002). Lorentz invariance is invariance under coordinate changes in 4-dimensional spacetime. Choice of a particular inertial system is a purely conventional matter; hence we should require Lorentz invariance of any theory claiming perspective-independence. Since Lorentz invariance implies CPT invariance, the latter is a consequence of this objectivity demand. Hence, time reversal symmetry when adjoined to charge conjugation and parity reversal cannot possibly tell us anything about the possibility of reversing a dynamical process. This feature, DYNAMICAL REVERSIBILITY, is quite another thing than time reversal symmetry; it will be further discussed and defined in Sect. 13.5.

But what, then, is the meaning of a time reversal not accompanied by reversal of quantities defined in terms of time, such as electric current? If we do so in e.g. Maxwell’s fourth equation

∇ × B = (4πk/c²)J + (1/c²)∂E/∂t        (13.1)

we get

1. In Chap. 9 I showed that SPACE, TIME and BODY mutually depend on each other. Adding CHARGE to this set of fundamental and mutually dependent quantities is necessary since charge is an attribute of bodies and particles.
2. The SI definition of a second: ‘The second is the duration of 9 192 631 770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the cesium-133 atom.’


∇ × B = (4πk/c²)J − (1/c²)∂E/∂t        (13.2)

which is simply a falsity, no matter what you think about the direction of time.

13.2.3 Time Asymmetry in Weak Interactions

It is well known that weak interactions violate CP symmetry. By the CPT theorem this means that weak interactions also violate T symmetry. Recently, the Babar collaboration (Lees 2012) has reported direct observations of violations of T symmetry in B meson decays, i.e., observations of parameters calculated independently of observation of violations of CP symmetry. This should not come as any surprise; all it shows is that determining time direction, space orientation and positive/negative charge of charged particles are connected decisions; they cannot be made independently of each other without violating fundamental physics. But it does not show that time intrinsically has any direction.

13.2.4 Time Reversal in Quantum Mechanics

There are two aspects of the reversal of time coordination in quantum mechanics to consider: its impact on the time evolution operator U = exp(−iHt/ħ) and on the Schrödinger equation. Let’s start with the Schrödinger equation. It is not in a strict sense invariant under time reversal. Schrödinger’s equation is (H denotes the Hamilton operator):

HΨ(r, t) = −iħ ∂Ψ(r, t)/∂t        (13.3)

Applying the time reversal operator T : t → t′ = −t to Eq. (13.3) gives us

H′Ψ′(r, t′) = iħ ∂Ψ′(r, t′)/∂t′        (13.4)

where it is assumed that H′ = THT⁻¹ = H and Ψ′ = TΨ. But this does not give us any new predictions, because all measurable quantities are quadratic in the wave function. We further recognise, by looking at the explicit form of the Hamilton operator H = −(ħ²/2m)∇² + V(r), that Schrödinger’s equation is CPT invariant, as it should be.3,4

Let us continue to the effect of time reversal on the wave function Ψ(r, t). A change of the interpretation of the time parameter, so that future times are now represented by negative numbers, doesn’t affect anything in the formalism. After reversal of the time coordinate the dynamical state of the system Ψ after a time interval t′ = −t1 has elapsed is Ψ(r, −t1). This formula by itself doesn’t tell us whether the time coordinate −t refers to a time earlier or later than t = 0. The operator equation

TU⁻¹T = U        (13.5)

can easily be proved using the fact that U and T commute. In a paper by Hwang (1972), this equation is interpreted as saying that the result of three operations, a time reversal, a negative time displacement −t1 and another time reversal, is the same as a positive time displacement t1 of the system. Hwang concludes: ‘In other words, we reach the same dynamic state via the positive time system and the negative time system. Conversely, the dynamic system cannot be used to distinguish the positive and negative time directions.’ (op. cit., p. 322). I completely agree with this. I only want to point out that the word ‘reach’ is a bit misleading in this context: we calculated the state of a system at a certain time, given we knew the state at another time, chosen as t = 0. No real experiment is done. Hwang further concludes that the dynamical evolution of the system is irreversible, which, he claims, can be seen by applying the time displacement operator U(t) to Eq. (13.5):

UTU⁻¹T = UU ≠ 1        (13.6)

The left-hand side of this equation will give the dynamical state of a system after a time reversal, a negative time displacement by −t1, another time reversal and then a time displacement by t1. If the system were reversible, the result of these operations would yield the original state. But that is not the case, because the operator U² is not the identity operator; the result is instead the state at a time displacement of 2t1. However, Hwang’s reasoning is here mistaken on two counts. First, he has put too strong demands on reversibility. For if two successive and equal time translations, one in forward time, one in backward time, return the system to the original state, independently of the length of the chosen time interval, we may conclude, by

3. One may observe that representing the system’s potential energy by a continuous potential function V(r) actually is a classical approximation, which strictly speaking conflicts with the fundamental postulate in quantum theory, viz., that all interaction is discretised.
4. If the potential function depends on time, this dependence must come from the time dependence of something treated as external to the system described by a particular instance of the Schrödinger equation.


generalising over all shorter time intervals, that it cannot really leave the original state; under these conditions the system is in a stationary state and no changes are possible. Secondly, even if Hwang were to add a condition to avoid this consequence, the reasoning is incorrect, since he implicitly assumes that the time evolution operator U(t) = exp(−iHt/ħ) changes the state of the system Ψ. This is wrong. Quantum states are represented by rays, denoted R, which are defined as sets of normalised vectors Ψ in Hilbert space fulfilling the condition that Ψ and Ψ′ belong to the same ray iff Ψ′ = cΨ for some complex number c, see Weinberg (1995, 49). The reason is that multiplying the state function by a complex number doesn’t change the expectation values of any observable. See also Sects. 10.9 and 14.5.
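A small numerical check (my own, with a randomly chosen observable and state) of the point about rays: two normalised vectors differing only by a phase factor yield identical expectation values for any Hermitian operator, so they describe the same quantum state.

```python
import numpy as np

# Check: two normalised state vectors differing only by a phase factor give
# the same expectation value for any observable (Hermitian operator).

rng = np.random.default_rng(1)
dim = 3

# Random Hermitian observable A.
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
A = (M + M.conj().T) / 2

# Normalised state and a phase-shifted copy belonging to the same ray.
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi /= np.linalg.norm(psi)
psi_prime = np.exp(1j * 0.7) * psi          # same ray, different vector

expval  = np.vdot(psi, A @ psi).real
expval2 = np.vdot(psi_prime, A @ psi_prime).real

print(expval, expval2)
assert np.isclose(expval, expval2)
```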

13.3 Time Symmetry and Electromagnetic Radiation

Quite a number of physicists and philosophers have discussed the direction of time and its relation to the properties of electromagnetic radiation. The core problem is that there are two solutions to Maxwell’s equations, the advanced and the retarded one, which seems incompatible with our observations of electromagnetic phenomena. We never observe any situation where later states determine earlier states of affairs, situations which are assumed to be represented by the advanced solution. There is thus a time symmetry in the solutions to the electromagnetic wave equation, a symmetry which is not observed in the real world. But, again, the participants in the debate conflate dynamical reversibility with symmetry under the time reversal operation. Maxwell’s equations in differential form are:

∇ · E = ρ/ε0        (13.7)
∇ · B = 0        (13.8)
∇ × E = −∂B/∂t        (13.9)
∇ × B = (4πk/c²)J + (1/c²)∂E/∂t        (13.10)

Using the scalar and vector potentials φ and A, as implicitly defined by

E = −∇φ − ∂A/∂t        (13.11)
B = ∇ × A        (13.12)

we can express the solutions to Maxwell’s equations in terms of these potentials:


retarded solution:

φ(r, t) = ∫ ρ(r′, t − |r − r′|/c) / (4πε0|r − r′|) dr′        (13.13)
A(r, t) = ∫ J(r′, t − |r − r′|/c) / (4πε0|r − r′|) dr′        (13.14)

advanced solution:

φ(r, t) = ∫ ρ(r′, t + |r − r′|/c) / (4πε0|r − r′|) dr′        (13.15)
A(r, t) = ∫ J(r′, t + |r − r′|/c) / (4πε0|r − r′|) dr′        (13.16)

The only difference between these solutions is the sign in the time argument of the charge density ρ and current J. In the retarded solution the charge and current at a position r′ propagate their effects to the position r at the velocity of light, hence the charge and current at r′ at an earlier time t − |r − r′|/c contribute to φ(r, t) and A(r, t). In the advanced solution there is a plus sign in the time argument, which means that the charges and currents at other positions at later times contribute to these potentials.

It is indeed natural to describe the contents of these equations in causal terms; I used the word ‘propagate’, which naturally is interpreted causally. Thus, in the debate all assume that equations (13.13) and (13.14) describe how the potentials at a point causally depend on the charges and currents at earlier times at other places, whereas (13.15) and (13.16) describe how the potentials causally depend on charges and currents at other places at future times. The advanced solution has then been dismissed with the argument that future states of affairs cannot have any causal effect on earlier ones. However, I don’t think we should use the concept of cause for this purpose, nor in describing in words the contents of these equations. As was shown in Chap. 8, CAUSE has an agent-relative component, MANIPULABILITY; a cause is in many cases an event we can manipulate (or could have manipulated) in order to make the effect happen (or to stop it from happening).

So how should we understand the existence of both retarded and advanced solutions without relying on causal intuitions? Let us start by looking for symmetry properties of Maxwell’s equations. They are invariant under the joint transformation t → −t, B → −B and J → −J. This joint transformation is a CPT transformation, i.e., the joint transformations of inverting the time axis, space reflection and charge conjugation. (By contrast, Maxwell’s equations are not invariant under only time inversion.) As already argued, CPT transformations are transformations between choices of coordinate axes and labelling of charges. (Of course, it is purely a matter of convention that electrons are attributed negative and protons positive charge.)


If we now perform the joint transformation of time reversal, parity reversal and charge conjugation on the retarded solution we get the advanced one, and vice versa. So these two solutions are merely the same solution expressed using two different conventions. But this doesn’t give us any direction of time; nothing in these equations tells us what occurs before or after a particular pair of values of the potentials is realised. Hence, classical electromagnetism by itself, not interpreted causally, does not give any direction to time.

13.4 Conditions for Time and Space Co-ordination

In the previous sections I took the predicate ‘…earlier than…’ (and its inverse ‘…before…’) as primitive. We use our memory to decide which of two directly experienced events comes first. Later we may parametrise this relation by using a clock and attributing real numbers to these events. We can in many cases decide which of two states of an object occurred before the other. Consider two pictures of one and the same person, for example one taken when she is 1 year old, the other when she is 10 years old; we can immediately say which picture was taken first without knowledge of the times when the pictures were taken. This entails that some observable physical objects change irreversibly with time. This in turn presupposes an identity criterion for observable physical objects, i.e., bodies, a criterion which determines whether two occurrences of a body are occurrences of the same body. And this criterion is well known, viz., genidentity; two occurrences of a body are occurrences of the same body iff they can be connected by a continuous trajectory in spacetime. (No time direction is here presupposed!) The practical problem of actually tracking a body is often solved by instead checking a unique attribute that does not change with time, such as a piece of DNA in the case of organic objects; but such criteria are subordinate to genidentity. The fact that we in some cases definitely can say that of two observations of the same body one must have occurred before the other shows that at least some bodies undergo irreversible state changes. Now it’s time to consider the distinction between reversible and irreversible state changes, and I will follow Hwang in labelling the crucial concept DYNAMICAL REVERSIBILITY.

13.5 Definition of DYNAMICAL REVERSIBILITY

Unfortunately, Hwang, in the paper earlier mentioned, does not give us any precise definition of his concept of DYNAMICAL REVERSIBILITY. He implicitly characterised it by saying that a classical system is reversible if it is possible to change the direction of motion of all particles in the system, without changing the time axis. The last clause is important because a mere change of the sign of


the velocity coordinate does not suffice; as already pointed out, such a change can either represent a time reversal or a reversal of motion without changing the time parametrisation. The quantum analogue to reversal of the momentum variable is reversal of the sign of the momentum operator, again while preserving the time parametrisation. However, if the momentum of a quantum system is well-defined, position is not, and, moreover, variables are in quantum theory replaced by Hermitian operators. Quantum systems do not evolve in phase space but in Hilbert space. We thus need a more general concept of reversibility than reversal of momentum.

It seems reasonable to say that the dynamics of a system, be it a classical or a quantum one, is reversible if it evolves from a state A to a state B ≠ A during a time interval Δt1 and then, after another time interval Δt2, has returned to the original state A. But again, this cannot be interpreted as meaning that the time interval can be chosen freely, for then we could generalise to all time intervals, however short, again resulting in no change at all. Hence we must block this generalisation.

A classical state is identified by its position in phase space. The identity criterion for quantum states was discussed at the end of Sect. 13.2, where I endorsed Weinberg’s stance that rays represent quantum states. Two wave functions differing only by a complex number belong to the same ray and this is the proper identity criterion, since two state functions differing by a phase factor, a complex number, give the same probability distributions for all observables. Now it seems reasonable to say that the evolution of a quantum system Ψ(t) is reversible if Ψ(t) and Ψ(t + Δt) belong to the same ray R but at the same time there are intermediary states which do not belong to that R. For example, suppose we have a system which evolves through the states A, B, C etc., without returning to any previous state, while the time parametrisation is kept unchanged. Suppose then that the conditions are changed, either by nature or by an experimenter, so that the system will, with increasing time (remember that a choice of time parametrisation by itself does not entail a choice of what is earlier and what is later!), go through the same states in the reversed order: …C, B, A. Wouldn’t we call the evolution through the states A, B, C… reversible, if such a change of conditions is physically possible? I think so.

Against this background I propose the following definition as a reasonable explication of the rather informal notion of reversibility: The evolution of a system S is dynamically reversible if and only if (i) we have chosen a direction of the time parameter5 and do not change that during calculations of its states during the relevant period of time, and (ii) it is possible to rearrange the physical circumstances of S at a time t + Δt1 so that, if the system during the time before t + Δt1 has evolved from the state Ψ(t), belonging to a ray R, to another state Ψ(t + Δt1) not belonging

5. This choice consists in determining whether later times are parameterised with higher or lower numbers.


to R, then it will with certainty evolve to Ψ(t + Δt1 + Δt2) = cΨ(t) for some time interval Δt2 and complex number c. If a state evolution is not dynamically reversible, it is dynamically irreversible.6

I have chosen to require certainty, i.e., probability equal to one, in the definition of dynamical reversibility, and some might think that it would be more faithful to our common-sense understanding of reversibility, i.e., ‘possibility to reverse’, to have a much lower probability in the definition. But, as we will see in Sects. 13.7 and 13.8, for the purpose of finding a basis for the direction of time the important thing is whether there are systems that do not satisfy the chosen definition of the predicate DYNAMICALLY REVERSIBLE; any weaker condition will not suffice.
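Here is a toy numerical sketch of the definition (my own construction, with ħ = 1 and an arbitrary two-level Hamiltonian): the system first evolves out of its original ray; the circumstances are then 'rearranged' by reversing the sign of the Hamiltonian, with the time parametrisation left untouched, and after an equal further interval the system is back in the original ray, so this evolution counts as dynamically reversible.

```python
import numpy as np
from scipy.linalg import expm

# Toy sketch of the definition (hbar = 1): a two-level system evolves under H
# for dt1, ending up outside the original ray; then the circumstances are
# 'rearranged' by reversing the sign of the Hamiltonian (the time
# parametrisation is left untouched) and after dt2 = dt1 the system is back
# in the original ray.

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
H = 0.8 * sigma_x                      # arbitrary simple Hamiltonian
dt1 = 1.3

psi0 = np.array([1, 0], dtype=complex)             # initial state
psi1 = expm(-1j * H * dt1) @ psi0                  # state at t + dt1
psi2 = expm(-1j * (-H) * dt1) @ psi1               # after reversed conditions

overlap_mid = abs(np.vdot(psi0, psi1))             # < 1: left the original ray
overlap_end = abs(np.vdot(psi0, psi2))             # = 1: back in the same ray

print(overlap_mid, overlap_end)
```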

13.6 When Is a Quantum System Dynamically Reversible?

The equation of motion in quantum mechanics, i.e., the Schrödinger equation, is anti-symmetric under the time reversal operation: the operator ∂/∂t changes sign but the Hamiltonian doesn’t. This fact does not tell us anything about the possibility or impossibility of reversal of the dynamical motion of quantum systems. It is often assumed that the existence of an inverse evolution operator is both sufficient and necessary for a dynamical process to be reversible. This is however wrong. The necessary and sufficient criterion for the existence of an inverse to an operator acting on a Hilbert space is its being unitary or anti-unitary, and such operators can be written in the form

6. By contrast, Tolman (1979, 103-104) gives a more standard definition of dynamical reversibility: ‘Thus we see, corresponding to any possible motion of a system of the kind mentioned [i.e., Lagrangian with constant H], that there would be a possible reverse motion in which the same values of the coordinates would be reached in the reverse order with reversed velocities. This is the content of the principle of dynamical reversibility.’ Obviously, he didn’t consider the fact that reversing the time parameter would change the representation of the time order of states and of momentum, without any real reversal of motion; in that case earlier states would still be earlier states!

U = exp(−iωt); ω = H/ħ        (13.17)

Now, the question is: what type of physical events is represented by applying such an operator or its inverse to a state function? Assume that we know the state of a system at a certain time t = 0, Ψ(0). We then let the operator U(t) operate on this system’s description in order to calculate what state we have at a later time t (or at an earlier time, if we had inverted the time axis!), provided that the total energy of the system is correctly represented by the Hamiltonian H. Conversely, U(t)⁻¹ when applied to Ψ(0) yields the state of the system at an earlier time −t, because U(t)⁻¹ = U(−t). Hence, applying the inverse operator U⁻¹ does not represent dynamical reversibility. Neither does it by itself represent reversal of the time axis. It is merely a way of calculating states at earlier times, provided the situation is correctly represented by the chosen Hamiltonian.

It appears that no physical action on the system is presupposed when using unitary or anti-unitary operators. They are just calculational devices, useful for calculating parameter values at different times. And furthermore, since the states U(t)⁻¹Ψ(0) and Ψ(0) differ by a complex number only, they belong to the same ray and so these two formulas are merely two descriptions of the same quantum state. In other words, applying the unitary operator U to a quantum state does not represent a change of the state. This is contrary to the common view. In fact, the necessary condition for getting correct results is that the system in question is not acted upon by any external influences, because such external influences are theoretically represented as changes of the Hamiltonian. If this condition is fulfilled the Hamiltonian is a constant and, given the state description for any chosen time, we can calculate the state description at any other time, earlier or later, using the operators U(t) or U(t)⁻¹. And since two such state descriptions differ only by a complex number, they describe the same state. Again, the so-called unitary evolution of a quantum state is no evolution; it is merely a redescription of the same state.

When the quantum system is an elementary particle we may arrive at the same conclusion by informal reasoning. Elementary particles, taken one at a time, have no history, they don’t change; it is only systems of elementary particles that may change with time. This is a consequence of the meaning of ‘elementary particle’; it has no parts, no inner structure, and therefore it cannot be attributed any state variable whose values change with time. This is in fact an application of Aristotle’s analysis in Aristoteles and Ross (1936, book 1) of CHANGE as merger of elementary unchangeable objects or as dissolution of complexes of such objects. The application of the concept of change presupposes that there are things that don’t change, and elementary particles have this function in modern physics. Hence, if the wave function is a description of an elementary particle and the Hamiltonian is constant, this elementary particle does not change. Time asymmetry is only possible in aggregates of elementary particles, which is the topic of classical and quantum statistical mechanics. Thus, it is now time to focus on the role of the second law of thermodynamics and its quantum analogue for timing.


13.7 Time and Entropy

13.7.1 Time Reversal and the Second Law of Thermodynamics

The second law of thermodynamics says that any isolated system consisting of a number of interacting parts will increase its entropy S(t)7 with time, if the entropy is not already maximal.8 This is usually expressed as

dS/dt ≥ 0    (13.18)

Now if we apply the time reversal operator T : t → −t, which represents a mere reversal of the time axis in the description of physical events, this change affects only the differential dt, not dS. Thus the second law becomes

dS/dt ≤ 0    (13.19)

which contradicts the former formulation. Hence, time reversal cannot be a mere change of convention, it could be argued. This argument is, however, flawed. In Eq. (13.18) we represent future times with increasing numbers, as usual. Then if t2 refers to a later time than t1, the difference t2 − t1 is a positive number. Hence, if entropy increases during that interval, we have

ΔS = (dS/dt)(t2 − t1) ≥ 0    (13.20)

Now apply the time reversal operation, which changes the time interval t2 − t1 into a negative number. (Remember that '−t2' after reversal refers to the same time as 't2' referred to before time reversal, which is to say that −t2 is a later time than −t1.) As the sign of the differential also changes, we still have that

ΔS = (dS/dt)(t2 − t1) ≥ 0    (13.21)

i.e., states at future times are still characterised as having higher entropy, albeit future times are labelled by smaller numbers than earlier times. The second law is independent of time parametrisation. A spontaneous state change towards lower entropy at later times will not be found, except for very short intervals. As discussed in Tolman (1979, ch. XII) a quantum statistical analogue to the second law, the H-theorem, can be derived. The argument above applies also in this case and Eq. (13.20) is still true.

7 I'm here presupposing Gibbs' definition of entropy S = −k_B Σ_i p_i ln p_i.
8 Since it is a statistical law, it is possible that entropy decreases for short periods of time.
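As a toy numerical illustration of the argument around Eqs. (13.18)–(13.21) — my own sketch, not the author's, assuming an arbitrary two-state system with equal transition rates — the Gibbs entropy of a relaxing probability distribution never decreases, and relabelling the time axis t → −t changes only the numbers attached to the states, not which states have the higher entropy.

# Toy sketch (mine, not the author's): Gibbs entropy S = -k_B sum p_i ln p_i
# for a hypothetical two-state system relaxing to equilibrium. Later states have
# higher entropy whatever numerical labels we attach to the time axis.
import numpy as np

k_B = 1.0
def gibbs_entropy(p):
    p = p[p > 0]
    return -k_B * np.sum(p * np.log(p))

p = np.array([0.99, 0.01])                    # initial distribution, far from equilibrium
dt, entropies, times = 0.01, [], []
for step in range(500):                       # dp1/dt = -(p1 - p2), dp2/dt = p1 - p2
    times.append(step * dt)
    entropies.append(gibbs_entropy(p))
    flow = p[0] - p[1]
    p = p + dt * np.array([-flow, flow])

S = np.array(entropies)
print(bool(np.all(np.diff(S) >= -1e-12)))     # True: dS/dt >= 0 throughout

# Relabelling the time axis (t -> -t) reverses the labels, not the order:
relabelled = [-t for t in times]              # physically later states keep the higher entropy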


As already mentioned, Price (1996) argued that in the derivation of the second law one in fact presupposes the distinction before–after. This is clearly true also in quantum statistical mechanics, where the so-called 'Stoßzahlansatz', which says that before a collision between two particles their motions are uncorrelated, is an explicit premise in the derivation of the H-theorem. We may conclude that none of the laws of physics so far discussed can be used to explain the direction of time; the distinction before–after as applied to events is epistemologically prior to all theory.

In Chap. 9 I argued that the predicates BODY, SPACE and TIME are mutually dependent and fundamental in physics. In Chap. 10 I showed that MASS, which is attributed to bodies, is the fundamental quantity in dynamics. We may now add that the direction of time, more precisely the predicate . . . BEFORE . . . taking events and states of affairs as arguments, is also fundamental from an epistemological point of view. When we attribute times to events and states of affairs we get the direction of time, since we know that some state changes are irreversible. The concept of entropy is highly theoretical, less understood by lay people than most physical concepts. From an epistemological point of view it cannot be fundamental; it must be defined in terms of more directly accessible features of the world. So I don't see any problem in saying that a state of lower entropy of a system usually occurs before a state of higher entropy, in other words, using the predicate . . . BEFORE . . . in giving the direction of time as the basis for saying that entropy increases with time.

13.7.2 Entropy Function Defined on Hilbert Spaces?

A collection of elementary particles moving around randomly has no history, no intrinsic time. An elementary particle has no internal structure, nothing that can change and function as a record of its history, as already pointed out in Sect. 13.6. It follows that no entropy operator, i.e., an operator with monotonically increasing expectation values, can be defined on Hilbert spaces, i.e., on spaces of state functions for individual quantum systems. This was proved by Misra et al. (1979) and Lindblad (1983)9 independently of each other. However, an entropy operator can be defined as acting on distribution functions over vectors in Hilbert space; such operators are called 'Lyapunov operators'. Then it can be shown that a Lyapunov operator with monotonically increasing expectation value is not factorisable into ordinary operators operating on the same Hilbert space on which the density distribution is defined. In other words, a Lyapunov operator cannot be factorised into operators acting on state functions for individual systems. Misra et al. conclude:

9 Since one can reverse the time parameter as part of a CPT transformation without changing any physics, we can generalise and simply say: there is no operator with monotonically increasing or decreasing expectation value.


. . . We show that irreversibility (expressed as the existence of entropy superoperator M for the measuring apparatus) implies the classical nature of apparatus in that the distinction between pure and mixed states is lost. (op. cit., pp. 71–72)

In other words, orthodox quantum mechanics, in which observables are represented by Hermitian operators acting on vectors (rays) in Hilbert space, does not, and cannot, contain any operator with monotonically increasing expectation value; only operators acting on ensembles can have this property. The conclusion to be drawn is that there is no physical observable attributable to pure quantum states that can serve as the physical basis for the direction of time; a system in a pure state does not change with time unless acted upon by other systems. Hence the distinction between earlier and later states cannot be drawn in quantum mechanics for individual quantum systems. This distinction is possible only when we aggregate into larger systems of interacting quantum systems. We have arrived at the same conclusion as Boltzmann: time directedness is attributable only to aggregates of interacting systems, not to their elementary parts.

13.8 The Arrow of Time and Clocks

State variables, both in classical and quantum theories, are functions of time, which is a parameter, not a variable. That means that in applications of theory to the study of real physical systems we must use a device not taking part in the dynamics of the studied system as the clock mechanism. This is the reason, by the way, that TIME is not an operator in quantum mechanics; operators replace classical variables, not parameters. This external clock is a physical system consisting of two parts, one that oscillates (a certain number of oscillations defines the time unit) and one that counts the oscillations. Any counter which keeps track of the number of oscillations will do. This counter must by necessity undergo irreversible state changes when counting; how else would we have a stable record of the number of elapsed time units? This shows that measuring times presupposes that there are irreversible state changes in some physical systems. As we saw in the previous subsection, such systems cannot be in pure quantum states; there cannot be clocks in a world consisting only of quantum mechanically pure states. So the very ascription of time evolution to any physical system presupposes that systems that do not behave as pure quantum states are available. This will be relevant for the discussion of the measurement problem in Chap. 16.

One could say that the oscillation counter is the most basic type of memory; its pointer state at any chosen point of time is a function of what has happened to it since a given starting point. The same is true of our minds; the phenomenal experience of the direction of time is based on the fact that we have a memory; when we observe an event and store it in our memory our mind/brain undergoes a state change which in a weak sense is irreversible. (Since memories can be lost, one might think that it
is a reversible process. But the mind/brain states before experiencing an event and after having forgotten it are not the same.) In fact, the very meaning of the term 'memory' presupposes a distinction between earlier and later, a time direction. Thus, one might argue that the analysis of the direction of time in terms of clocks presupposes precisely that which is supposed to be analysed. And the same can be said about the concept of DYNAMICAL IRREVERSIBILITY. But I see no problem here; the direction of time is a fundamental feature of our world (including our experiences) and the most we can do is to inquire into its physical basis. In short, the physical meaning of the passing of time is that some physical systems change their states irreversibly.

13.8.1 Entropy of Clocks

A time measuring device records the number of its oscillations, but this does not entail anything about its entropy; it can decrease or increase. In fact, normally it decreases. When, for example, we store information, about elapsed time or whatever, in computer memories, their entropy decreases. And the same is true of our own memories: when we store information we decrease entropy by increasing order. It is usually said that computers, brains and all kinds of systems that store information need energy from some source, but in fact their expenditure is not energy but negentropy.10 We eat food to use some of its negentropy to build up our tissues and feed our organs, in particular the brain. But we emit the same amount of energy as heat radiation, provided we do not gain weight. What we need for keeping our brain (and the rest of the body) functioning is fundamentally negentropy, which is carried by some edible objects. Similarly with other memory systems that can be used as (part of) clocks, such as computer memories; they need energy, but they release (normally) the same amount of energy, while increasing their negentropy. We cannot feed our computers with energy in the form of heat. One could say that things in which energy is stored in the form of chemical bonds or as energetic carrier particles are the carriers of negentropy. Intake into a system, our brain or an artefact, of these carriers is necessary for that system to function as a memory; but the memory system's net expenditure of energy is zero; its expenditure is negentropy. The fact that negentropy-increasing systems can be used as time keepers does not exclude the possibility of using systems that develop in the opposite direction as time keepers.

10 Negentropy is defined as 'the specific entropy deficit of the ordered sub-system relative to its surrounding chaos', see Mahulikar and Herwig (2009).


13.8.2 Direction of Time Without a Universal Clock

In the long run the memory of any individual object, be it a person or a mechanical device, deteriorates, and its functioning as memory comes to an end when the negentropy storing mechanism is broken. Similarly with systems containing radioactive isotopes; sooner or later the time keeping function comes to a halt. But the point is that a universal time system can be constructed by using many such systems, each functioning for a limited period, provided they pairwise partially overlap in time. This is how historians are able to tell us in which year certain important events in the past occurred. We know for example that the first Olympic Games were held in 776 BC. This knowledge is made possible by overlapping calendars; the Greeks counted years and held Olympic Games every fourth year for a long period. Then for some time the Roman calendar overlapped with the Greek one, so we can tell the year of the first Olympic Games in the Roman calendar, and later the Christian calendar overlapped with the Roman. No single memory or calendar system is used for counting years in our history, but if we can determine one and the same year in two calendars and if the time period, the sidereal year, is the same, we can jump from one calendar (physically stored in some way) to another and determine the time of events occurring over very long periods of time. And the same principle is used for timing over geological and cosmic ages. Each time keeping device functions for some time, none for ever.

The DIRECTION OF TIME is based on a huge number of irreversible processes in physical systems, each having a limited existence in time. All these systems have a memory system that counts the number of events of some kind. This counting is in many cases, from a physical point of view, an accumulation of negentropy. So often, but not always, the physical basis for talking of 'later times' is 'states of memory systems with higher negentropy', i.e., lower entropy! This does not contradict the assumption that the total entropy of the universe is increasing. For the second law tells us that if a system increases its negentropy (i.e. decreases its entropy) during a time interval, it must have interacted with its surroundings, and a decrease in negentropy, i.e. an increase of entropy, must have taken place in these surroundings. Neither does it contradict the possibility that the total entropy of the universe may decrease during shorter periods of time. It is theoretically possible to measure times and have a time direction without there really being any overall change of entropy in the universe. It is not impossible that in a part of the universe not accessible to us here on earth the entropy decreases, while the entropy increases in our part of the universe.

The method of using overlapping time measuring devices to get a universally applicable quantitative predicate TIME with direction is analogous to the way the metric is determined in curved spacetime. There is no metric applicable to all portions of spacetime 'at once' so to speak. Instead one uses local coordinates defined on patches of the spacetime manifold. These patches are 'glued' together and jointly they cover the portion of spacetime of interest. By this method we can
give meaning to the quantity SPACETIME INTERVAL applied to any pair of points in curved 4D spacetime. As Einstein once said, TIME is that which we measure by clocks. Clocks contain a part that counts oscillations of some sort. Storing information about the number of performed oscillations does not require entropy increase; rather, it often requires negentropy increase. This means that no monotonically increasing quantity attributable to the universe as a whole is needed in order to attribute a time direction to the universe or any of its parts. Each such part, such as a clock, a biological organism, a star, etc., is composed of many elementary particles. It can be attributed a time direction as long as it exists, and since many such parts overlap in time, we can construct a universal directed quantity TIME.
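A toy sketch of my own (not the author's) of the chaining just described; the calendar names and offsets are hypothetical placeholders, the point being only that once one and the same year is identified in two overlapping calendars, a fixed offset lets us 'glue' them together and date events across systems none of which spans the whole timeline.

# Toy sketch (mine, not the author's); calendar names and offsets are hypothetical.
# Once a common year is identified in two calendars: year_in_B = year_in_A + offset[(A, B)].
offsets = {("calendar_A", "calendar_B"): 23, ("calendar_B", "calendar_C"): -753}

def convert(year, path):
    # follow a chain of pairwise overlapping calendars, e.g. A -> B -> C
    for a, b in zip(path, path[1:]):
        year += offsets[(a, b)]
    return year

print(convert(1, ["calendar_A", "calendar_B", "calendar_C"]))   # -729 in calendar C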

13.9 Time and Big Bang

According to the common view among cosmologists, the universe began its existence some 13.7 billion years ago with a 'Big Bang'. One cannot pose the question what happened before that event, since the conditions for applying TIME before that event are not fulfilled. One could figuratively say that space, time and matter began to exist simultaneously; but remember, figuratively! Time and space are no entities! The relation between change and time is discussed by Aristotle in Physics, book 4, and he concludes (220b15): 'Not only do we measure the movement by the time, but also the time by the movement, because they define each other.' So even if there is something before Big Bang which does not change in any way, there is no time before Big Bang. Or to express the point less metaphysically: in order to attribute time to states of affairs or events there must be differences, a state of affairs that changes into another state of affairs; no change, no time, and no time, no change. In this sense Big Bang marks the beginning of time and space simultaneously with the formation of matter-energy. We see here in cosmology a reflection of the conclusion arrived at in Chap. 9: the concepts SPACE, TIME and MATTER are mutually dependent on each other.

13.10 Summary

• The time reversal operation T : t → −t cannot be taken to represent the informal notion of interchanging the relations 'before' and 'after' as applied to pairs of events.
• Time reversal symmetry and its generalisation CPT symmetry are formal requirements on physical theories and do not entail anything about the dynamics of physical systems.


• There exists no observable that can be attributed to individual quantum systems in pure states which can function as a basis for a distinction between earlier and later. There is no time direction in pure quantum states.
• Direction of time is based on the physical fact that there are physical systems composed of many interacting parts that undergo irreversible state changes.
• Recording elapsed time from a starting point requires a clock, a physical system having memory, i.e., a system that for certain periods of time undergoes irreversible state changes.
• No monotone physical quantity attributable to the entire universe is needed for applying the concept of time directedness to the evolution of the universe. It is sufficient that there exists a number of limited physical systems, each undergoing irreversible state changes for limited periods of time and overlapping with some of the other ones.

Chapter 14

Identity, Individuation, Indistinguishability and Entanglement

Abstract In this chapter it is argued that, from an empiricist perspective, identity criteria for particles derive from Maxwell-Boltzmann, Fermi-Dirac and Bose-Einstein statistics respectively. Identity of quantum states is determined by probability distributions for observables. It means that the time evolution of an isolated quantum system, which changes only the complex phase, is no state change. Quantum systems are individuated by theory, which means that a tensor product of two state functions represents one single system, not two interacting ones. It follows that a system of two entangled particles is one system and cannot be viewed as composed of two interacting objects.

14.1 Introduction

The questions of identity and individuation in particle physics have confused many and attracted a lot of interest among laymen, physicists and philosophers. A physics student's first confrontation with this topic is the concept of identical particles presented in quantum mechanics textbooks. The reflective student may reasonably ask when learning that two or more particles are identical: How could two things be truly identical? If a and b are identical, the labels 'a' and 'b' must refer to the same thing, and if so, how could they be said to be two things? It is clear that the word 'identical' is in quantum theory not used in its usual sense. The core problem is: how could one use count terms allowing plurals, such as 'photon', and at the same time say that the counted things are absolutely identical? If they are identical, how do we count them? The empirical way of discussing these matters is to begin with statistics, i.e., predicted statistical distributions of measurement results. There are three such distributions in physics: Maxwell-Boltzmann, Fermi-Dirac and Bose-Einstein statistics. An inquiry into the derivation of these statistics reveals what we need to understand about identity and individuation. However, many philosophers are not content with just using statistics to determine questions about identity and individuation in the quantum world. One wants
answers to questions such as 'Are electrons real individual objects, independently of how we count them?' This question has been lively debated by, among others, Becker Arenhart (2013), Becker Arenhart and Krause (2014), Krause (2010), Dorato and Morganti (2011), French and Redhead (1988, 1989); French (1989); French and Krause (2006), Halvorson and Clifton (2002), Ladyman and Bigaj (2010), Moreland (1998), Muller and Saunders (2008); Muller and Seevinck (2009) and Saunders (2003, 2006).

What, then, are the criteria for identity? Some claim that it is not an empirical matter and argue that particles simply are individuals; they have a primitive 'thisness' which cannot be further defined. In this view, individuation has nothing to do with what we observe. This is a desperate stance. Most philosophers in the debate have not been so desperate; they have tried to explain and make coherent our use of count terms such as 'electron' and 'photon' by holding that quantum particles are, in some sense, individuals, without conflicting with quantum mechanics. Becker Arenhart (2013) discusses various such attempts and concludes that none has succeeded:

In fact, the search for support in quantum mechanics for the most adequate metaphysical view on individuality puts the friend of individuality in an unpleasant position: to make individuality compatible with quantum mechanics, she has to weaken her account of individuality. As we shall see, those suggestions we have just mentioned form some kind of progressive sequence, each step of which proposes a weaker notion of individuality, with less claims to empirical support from quantum mechanics. The result of the progressive weakening of the concept of individuality, as we shall argue, is that individuals end up withering away; in their place, non-individuals remain.

I see no need whatsoever for any metaphysical theory about individual objects, a theory which goes beyond what our best empirical theories tell us. That fermions and bosons are not individuals is no mystery to be explained. We can, under certain conditions, count the number of quantum particles in a system, provided the term 'quantum particle' is properly understood. But that does not require individuality of that which is counted; quantisation of conserved quantities explains our ability to count without individuating the counted items. If we know the total amount of a conserved quantity in a system, we can count the number of quanta. Knowing the occupation number for an electromagnetic field does not require that we have counted the number of photons in the field; the field is one object and it does not consist of a number of photons conceived as individual things. The occupation number tells us how much energy of a certain frequency the field contains, and since interaction is quantized, this can be expressed as a certain number of emissions of electromagnetic energy of this frequency, i.e., the number of photons of this frequency that can be emitted from the system. This requires no individuality among photons. Thus, the number used to tell the amount of a conserved quantity is a cardinal number, but it is not an ordinal number, since the portions thus counted cannot be ordered; they are all exactly alike. That one can use a natural number as a cardinal number but not as an ordinal number in real physical applications is certainly astonishing, but it does not conflict with any logical principle.


Trying to answer the metaphysical question whether quantum particles really are individuals is to make a mistake similar to that made by metaphysicians such as Leibniz and Wolff. Kant pointed out that they tried to say something about the world as it is in itself, disregarding the structure and function of our cognitive apparatus doing the thinking. Trying to say something about the real world from a point of view from nowhere and nowhen, as Thomas Nagel (1986) puts it, is not possible. But we need not rely on any transcendental analysis of the mind as was Kant's approach; it suffices to consider how we use our predicates, and that is an empirical inquiry. We empiricists fully agree with Kant's criticism of metaphysical realism. Let us stay with our means, in this case our best empirical theories about the relevant parts of the world. In the case of individuation in the quantum realm it is quantum mechanics, and that theory simply tells us that fermions and bosons are to be treated as non-individuals, portions of conserved quantities all exactly alike, when calculating empirical consequences. What more do we need?

A related topic is identity and individuation of quantum systems as described by wave functions. It is often taken for granted that if we have two systems |Ψ⟩ and |Φ⟩, these are different objects even after their tensor product |Ψ⟩|Φ⟩ is constructed and subjected to an evolution operator. But this is wrong; we need to consider principles of individuation also for quantum systems, which will be discussed in Sect. 14.5. There are, certainly, individual objects in the quantum world, but fermions and bosons are no such things. It is quantum systems, properly individuated and identified by empirical means, that are the individual objects.

14.2 Maxwell-Boltzmann Statistics

Consider n macroscopic balls, all qualitatively alike, distributed among k possible states. Assume that any ball can enter any particular state with the same probability, a common assumption called the Equipartition principle. In Maxwell-Boltzmann statistics all particles are treated as individuals. Hence if particle j1, being in state sm, and particle j2, being in state sn, swap states, we get a new total ensemble state. If we generalise this to all particles in the ensemble and calculate the distribution function, we get the Maxwell-Boltzmann distribution. The condition in the calculation for this distribution is thus that particles can be treated as individuals, i.e. that they each satisfy an identity criterion. Since they are qualitatively exactly similar, they can only differ by being at different positions, i.e., following different trajectories in space. So when moving a particle from one state to another, its identity, connected to its label, must be its trajectory. But what if two particles occupy exactly the same place at a certain moment? Then identity in terms of trajectories fails, hence we must exclude that possibility. Hence, the particles must fulfil the conditions for being treated as bodies, cf. Chap. 9. In other words, if it is in principle possible to track the motion in space
of individual particles in an ensemble, then Maxwell-Boltzmann statistics apply. And this hypothesis can be tested, since the statistics is an observable feature of ensembles.
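A small combinatorial sketch of my own (not the author's), with arbitrary small numbers n and k, of what treating the particles as individuals amounts to in the counting: each labelled particle can sit in any of the k states, swapping the states of two labelled particles yields a distinct total ensemble state, and there are k^n microstates in all.

# Sketch (mine, not the author's): Maxwell-Boltzmann counting with labelled,
# individual particles. n and k are arbitrary illustrative numbers.
from itertools import product

n, k = 3, 2
mb_microstates = list(product(range(k), repeat=n))   # one state per labelled particle
print(len(mb_microstates), k ** n)                   # 8 8

s = (0, 1, 1)                                        # particles 0, 1, 2 in states 0, 1, 1
swapped = (s[1], s[0], s[2])                         # particles 0 and 1 swap states
print(s != swapped)                                  # True: a new total ensemble state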

14.3 Fermi-Dirac Statistics

Fermi-Dirac statistics applies to ensembles where (i) the particles lack individuality and (ii) no two particles can be in the same state. Hence, if we as before consider a particle j1, being in state sm, and a particle j2, being in state sn, and imagine them swapping states, so that particle j1 is in state sn and particle j2 is in state sm, we will not get a new total ensemble state; it remains the same, if the particles are fermions. That means that there is no physical characteristic, not even position, that can be connected to particle labels. Hence fermions are not individuals. They are mere portions of certain quantities. The distribution function resulting from these assumptions is the Fermi-Dirac statistics. Whether an ensemble of particles satisfies FD statistics or not is an observable feature, so one can empirically decide whether the particles satisfy conditions (i) and (ii) or not. And fermions do satisfy them.

But how is it possible to count fermions in a system if they are not individuals? The answer is that fermions are definite portions of a conserved quantity. Hence, saying truly of a system that it contains n electrons is the same as saying that it contains n elementary negative charges; in other words, the system has a total negative charge of ne, and since total charge is an observable, such a statement can be known to be true. But it does not presuppose that electrons are individuals. Furthermore, one should be aware of the fact that such portions of conserved quantities are distinct portions only in interactions. Consider the situation where two sources emit electrons of the same frequency and momentum. When we introduce a detector into this field and detect an electron, it is not possible to determine from which of the emitters a certain detected electron came. The conclusion to be drawn is that the conserved quantity, in this case negative charge, is discretized into distinct portions only during interactions. There are no electrons, not even in the sense of discrete portions of negative charge, in the field, as was shown in Chap. 11.
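The corresponding sketch for Fermi-Dirac counting (again mine, not the author's, with arbitrary n and k): since labels carry no physical information and no two particles may share a state, a microstate is nothing but a set of n occupied states out of k, and 'swapping' two fermions returns the very same set.

# Sketch (mine, not the author's): Fermi-Dirac counting. A microstate is a set
# of occupied states; there are C(k, n) of them and label swaps change nothing.
from itertools import combinations

n, k = 2, 4
fd_microstates = list(combinations(range(k), n))
print(len(fd_microstates))                    # 6 = C(4, 2)

occupied = frozenset({0, 3})
print(occupied == frozenset({3, 0}))          # True: swapping labels is no change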

14.4 Bose-Einstein Statistics

14.4.1 Elementary Bosons

Elementary bosons (photons, W-bosons, Z-bosons, gluons, the Higgs boson and gravitons, if gravitation is quantised) are particles with integer spin, which means that several bosons of the same kind can be in the same state. In other words, swapping particle labels for two bosons in a quantum state does not change the
total wave function. So bosons lack individuality. No wonder they are often labelled 'identical particles'. Physicists are not bothered by the paradoxical character of this expression. When we say of two photons that they are identical, we in fact claim that there are not really two things at all, just two portions of radiation energy. We may say that an electromagnetic field contains n photons of frequency ν if the field energy is nhν. This is how we talk about photons in field theory.
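For completeness, the analogous counting sketch for bosons (mine, not the author's, with arbitrary n and k): any number of label-free particles may share a state, so a microstate is a multiset of occupied states and there are C(n + k − 1, n) of them.

# Sketch (mine, not the author's): Bose-Einstein counting over multisets of
# occupied states; label swaps again change nothing.
from itertools import combinations_with_replacement

n, k = 2, 4
be_microstates = list(combinations_with_replacement(range(k), n))
print(len(be_microstates))                    # 10 = C(5, 2)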

14.4.2 Composite Bosons

Composite bosons are made up of several elementary particles such that the total spin is an integer. This is true of, e.g., stable nuclei with even mass number, such as Helium-4. Since all Helium-4 nuclei can be in the same quantum state, they can form a Bose-Einstein condensate. Still, swapping particle labels within such a state does not result in a different state.

14.5 Individuation and Identity of Quantum States

A quantum system may be described by a wave function Ψ(t) which gives its state at the time t. The wave function determines the probability distribution for all observables on this system. If we, as I do, accept that these probability distributions reflect genuine indeterminacy, not epistemological uncertainty, we thus have a complete description of the system's state. Multiplying this wave function by a complex number z does not change the probability distributions for the system's observables. So Ψ(0) and zΨ(0) describe the same state. Wave functions differing by only a complex number are said to belong to the same ray in Hilbert space; thus it is rays that represent quantum states, not particular wave functions, as pointed out by Weinberg (1995, 49). See also Sect. 10.9.

An important consequence of this criterion for identity of quantum states concerns time evolution. The time evolution operator U(t) = exp(−iHt/ℏ), which is unitary, returns such a complex number for given values of H and t when applied to a wave function Ψ(0). Hence unitary evolution does not change probability distributions of observables so long as the Hamiltonian is a constant, i.e., so long as no energy has been exchanged between the system under consideration and its environment. Time evolution is no state change so long as energy is conserved. This has consequences for the discussion about the measurement problem, as we will see in Chap. 16.

That a mere change of time is no real state change is easy to accept when one realises that the choice of zero point on the time axis is a convention and the time evolution operator functions as representing transformations between such
conventional choices. For, surely, real physical states do not depend on conventional choices of parametrisation. State descriptions of course depend on choice of parametrisation; but one and the same state can be described in different ways, using different time parameter values.

14.6 Individuation of Quantum Systems

It is generally assumed, for example by von Neumann (1932/1996), that the result of an interaction between two quantum systems is represented by the tensor product of their respective wave functions. This is wrong. The crucial question is what we mean by 'interaction'. I take it that an interaction between two objects, of whatever kind, is an exchange of some conserved quantity such as energy or momentum. Moreover, it must be possible to identify the interacting objects before and after the interaction as being the same individuals. This proved to be impossible for fermions in two-particle states, which is the reason why swapping particle numbers in such a state is no interaction, no change of any kind, as shown earlier. So forming the tensor product of two systems does not represent any interaction in the ordinary sense of this word. Rather it represents uniting the two systems into one system. This is obvious already from the syntactic features of the tensor product; it is one term both in the grammatical and in the mathematical sense. Hence its reference, if it has any reference at all, is treated as one single object in the theory. This object may be described as a compound of two parts, each described by a separate wave function. But the crucial thing is that since the tensor product of these two wave functions is one term, we cannot represent internal interactions, such as energy exchanges, between its parts. Any account of an interaction between the parts of a system requires that each interacting part is represented by a singular term in the description of this interaction. This is a general feature of language, not confined to quantum theory. Thus, the identity of quantum systems is determined by the identity of state functions. It is our theory that determines identity and individuation of its objects. If we don't want to enter into deep metaphysics, there is no alternative but to use our presently best theory of the world for this purpose.
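A small sketch of my own (not the author's) of the formal point that the tensor product is one term: the product of two state vectors is a single vector in the joint space, and a singular-value (Schmidt) decomposition separates those joint vectors that can be written as one such product (rank 1) from those, like the singlet state of Eq. (14.1) below, that cannot (rank 2).

# Sketch (mine, not the author's): a tensor product is one vector in the joint
# space; Schmidt rank 1 means it factorises into two single-system vectors.
import numpy as np

up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])
product_state = np.kron(up, down)                    # one 4-component vector
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

def schmidt_rank(state):
    return np.linalg.matrix_rank(state.reshape(2, 2), tol=1e-12)

print(schmidt_rank(product_state))            # 1: factorisable
print(schmidt_rank(singlet))                  # 2: not a product of two vectors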

14.7 Entanglement

The word 'entanglement' is, in quantum theory, used for describing e.g., a singlet state of two electrons:

Ψ_tot = (1/√2) [ |φ1(↑)⟩|φ2(↓)⟩ − |φ1(↓)⟩|φ2(↑)⟩ ]    (14.1)


where up and down arrows indicate spin up and spin down in a measured direction. (The state function for two photons with opposite helicity is analogous.) Upon a measurement on this state the superposition will collapse and one will either observe that particle 1 has spin up and particle 2 spin down, or vice versa. There is no time parameter in the state description, hence as soon as one has observed the spin of one of the particles one can with complete certainty know the other particle's spin, no matter the distance between them. It means that the anti-correlation between the two particles obtains instantaneously. Therefore no signal can travel between them. This is what theory tells us and it is empirically confirmed by several research groups. A recent experiment confirmed the entanglement of two photons with anticorrelated helicities over a distance of 1200 km, see Yin (2017).

At first sight one would think that the anti-correlation was established at the preparation of the experiment, when the two particles were in close contact. But since the anti-correlation is independent of the direction in which the measurements are done, it would mean that the two particles had definite spins in all directions, which is impossible; it violates Heisenberg's indeterminacy relations. But then, how is it possible to have a strict anti-correlation without a physical mechanism? This is really perplexing and I will tell a little story in order to illustrate how astonishing it is.

Imagine that you have two colleagues who always wear ties at work. Both have four different ties: red, green, blue and yellow. (Ok, slightly implausible!) Having observed your two colleagues for an extended number of days you find that they always wear ties in complementary colours. If one has a green tie, the other has red; if one has yellow, the other has blue. You have never observed any exception. The most reasonable explanation is of course that they keep in touch every morning, texting an SMS to each other for example, having decided in advance to always follow the rule to wear complementary coloured ties in order to make people wonder. Now suppose you can prove that no messaging between the two occurs, nor any other physical contact by phone or personal meeting before coming to work. If so, one is prone to assume that they agreed on a certain rule in advance, in the way spies send messages to their headquarters using a code book. Now suppose you can reject this hypothesis as well, by looking at the sequence of one person's choice of colours on his ties; it is random in the sense of not being compressible by any algorithm. (This cannot be conclusively proved, according to Chaitin (1975), but let us for the sake of illustration assume it has been proved.) I now guess that you all would say that such a state of affairs is impossible. Two independent sequences of events cannot be strictly correlated without there being a physical mechanism transmitting information between them, or else both being consequences of some common cause. A strict statistical correlation in an infinite sequence of pairs of events and no mechanism producing the correlation is not comprehensible. (This is the basic intuition behind Reichenbach's principle.) But such a strict anti-correlation obtains in singlet states, transmission of signals can be excluded, and hypotheses about hidden variables can also be rejected.

As usual, the way out is to consider what tacit assumptions are made in thinking about singlet states. There are at least three such tacit assumptions, all of which are false.


1. The first tacit assumption is to conceive of 'particles' as confined to limited portions of space. They are not. As will be further discussed in Chap. 16, they are extended objects, standing or propagating waves, which are more or less everywhere in space when not interacting with other objects.

2. The second tacit assumption is that values of observables are discovered in measurements. This is wrong; they are better described as being produced in the measurement interaction. As will be shown in Chap. 16, if we e.g., assume that an electron just before measurement has the measured value, we immediately get a conflict with quantum mechanics, except in measurements of the second kind. A measurement of the first kind is not an observation of a pre-existing value of an observable, but a state change resulting in the measured object getting the measured value during that very measurement. This will be discussed in detail in Chap. 16.

3. The third tacit assumption is to think of two electrons being in a singlet state as two distinct objects. This is wrong. As was shown in the previous section, a joint quantum system |Ψ1⟩|Ψ2⟩ is not an entity composed of two individual things, but one single thing, since |Ψ1⟩|Ψ2⟩ is one term, referring to one object.

The conclusion to draw is that when we measure e.g., the spin of one of the electrons in the singlet state we in fact measure the spin of the entire singlet system. Since the singlet state is broken up in the measurement interaction, we have thereafter two decoupled wave functions and therefore two distinct entities. Hence we can immediately after that measurement attribute definite spins in the measured direction to each of the two particles. This gives rise to non-local correlations. Since before the measurement there are no parts of the singlet system that have independent dynamics, the notion of transmission of information about the value of an observable from one part of the system to another part is not applicable. The expression 'the spin of one of the electrons' cannot be understood as saying that the electrons individually have definite spins before measurement; they get definite spins in the measurement interaction.

Upon measurement we interact with a system containing two units of negative charge vastly spread out in space. An interaction event is an exchange of a conserved quantity with this system. It occurs in a detector at a particular place, see further discussion of measurements in Chap. 16. The pointer variable of this detector is in the measurement set-up correlated with a spin in a particular direction. Since the entire system has zero spin and we measure a positive or negative value, the other unit of charge must then be attributed the opposite value. All this happens in the moment of triggering the first detector; observing the state of the other detector then confirms the anti-correlation.1

1 Which of the two detections happens before the other is no observer-independent fact. An observer in the laboratory system may say that the spin of particle 1 was measured before that of particle 2, whereas another observer moving with high speed along a direction parallel with the line between the two detectors may say the opposite. This is a reason to reject the notion that the first detection causes the state of the other particle to have a definite spin.


One might think that this contradicts relativity theory, but that is wrong. Relativity theory says that no signal can go faster than the velocity of light. But since the two parts of a singlet system are not two distinct objects, there is no signaling going on. One may recollect the fact that sending a signal from one object to another consists of one object emitting a photon and another object absorbing that photon. Thus signaling is an interaction between two distinct and discernible objects. No such signaling between the pair in the singlet state can occur, since it does not consist of two distinct objects. At bottom it is discreteness of interaction, to be thoroughly discussed in Chap. 16, which is responsible for entanglement. A singlet system is one indivisible unit and interacts with other things by exchanging quanta of conserved quantities. The number of quantum particles in the system is of no relevance, since quantum particles are not individuals, as was shown in Sects. 14.3 and 14.4.
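As a numerical check of my own (not the author's) on what the formalism itself predicts for the singlet state of Eq. (14.1): measuring the two spins along the same axis gives perfect anti-correlation, and along axes at angle θ the correlation is −cos θ; no mechanism transmitting information between two parts is invoked anywhere in the calculation.

# Sketch (mine, not the author's): correlations predicted by the singlet state.
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
up, down = np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

def spin_along(theta):
    # spin observable (in units of hbar/2) along a direction in the x-z plane
    return np.sin(theta) * sx + np.cos(theta) * sz

def correlation(theta_a, theta_b):
    A = np.kron(spin_along(theta_a), np.eye(2))      # measured on the first unit
    B = np.kron(np.eye(2), spin_along(theta_b))      # measured on the second unit
    return float(np.real(singlet.conj() @ (A @ B) @ singlet))

print(round(correlation(0.0, 0.0), 6))                      # -1.0: perfect anti-correlation
theta = 1.1
print(np.isclose(correlation(0.0, theta), -np.cos(theta)))  # True: E = -cos(theta)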

14.8 Summary

The discussion about identity and individuation in the quantum domain, in particular the nature of quantum 'particles', has been intense among philosophers, much less so among physicists. From a physical point of view, what matters is correct empirical predictions, not semantic and metaphysical questions about identity. We may remember what Wittgenstein repeatedly claimed: look at how words are used! Physicists use particle words, such as 'electron', 'photon', etc., not as terms for individual objects. We empiricists see no need to deviate by postulating some sort of hidden metaphysical layer of 'true' individuals. The use of these count terms is possible because all interactions are quantized, occurring in discrete steps. We can measure and tell the amount of a conserved quantity by telling how many portions there are. But these portions are no individual things. It is the mistaken assumption that particles are individual objects that causes our bewilderment when learning about entanglement and non-local correlations.

Chapter 15

Quantum Waves and Indeterminacy

Abstract The topic of this chapter is an analysis of wave-particle duality. This duality is explained by observing that quantum systems propagate as waves but interact as particles. The latter feature is a consequence of quantisation of interaction. Hence, wave functions refer to quantised fields that propagate as waves. It follows that the uncertainty relations reflect indeterminacy, not epistemological limits.

15.1 Introduction

Quantum objects show both wave and particle properties in experiments, which triggers an ontological question: what are they really, independent of observations? Surely, they can't be waves and particles at the same time; a particle has, by its very definition, a determinate position at each point of time and travels along a continuous trajectory, whereas a wave is a spread-out object which cannot be attributed any precisely defined trajectory. My stance is, and I think physicists who have thought about the problem agree, that quantum objects behave as waves during propagation in space and time, but interact with other objects, no matter how small or big, as particles. In the case of multi-particle systems one may remember that fermions and bosons cannot be treated as individuals. A system of n fermions, described by a wave function of the form |Ψ1⟩|Ψ2⟩ . . . |Ψn⟩, is one object which has a certain propensity to interact at each point in space. Swapping particle labels in this wave function does not give us a new state. The need for a 3N-dimensional functional space for this wave function does not entail anything about the dimensionality of physical space in which we observe things.1 The dimensionality of physical space will be further discussed in relation to the dimensionality of the target space in string theory in Chap. 17.

1 I have discussed this topic in Johansson (1992, 72–3).


Why, then, do these wave-like things behave as particles when interacting with each other? The answer is discreteness of interaction, a fundamental property of matter, discovered by Planck.

15.2 Quantum Systems Are Fields Which Propagate As Waves

Interactions between objects occur only in discrete steps; this is the fundamental feature of quantum theory, discovered by Planck, see further Sect. 16.7. Such discrete interaction events are in relevant aspects similar to collisions between particles, classically conceived. When two classical particles, i.e., bodies about which we disregard their volumes and inner structures, collide, they exchange momentum and energy at well defined places. Similarly with interactions between waves. We may perhaps understand the analogy better by considering the following thought experiment. Assume that a series of plane waves, each being one single object without any parts (as if water waves were not made up of molecules), approach a line of detectors, each capable of absorbing kinetic energy from an incoming wave, see Fig. 15.1. Assume further that interaction between a wave and the row of detectors is quantized, i.e., it can occur only in discrete portions of a size big enough to trigger one detector. Quantisation means that a wave cannot interact with more than one detector, even though its energy is big enough to trigger more than one. (This is the conclusion drawn from Planck's radiation law, as will be discussed in Sect. 16.7.) It means that when a wave arrives at the line of detectors, it will trigger at most one of these, although it approaches all at once. When a wave arrives at the detector row, we observe at most one detector being triggered and we are tempted to think that a particle, an object not big enough to hit more than one detector, had arrived at the triggered detector. Given discreteness of interaction, the observation of a series of incoming waves triggering different detectors is just like the observation of a series of particles hitting the detectors. This reasoning does not depend on any particular characteristic of the detectors. The only thing that matters is that they are individual objects which can change state independently of each other. Hence quantisation of interaction entails that no matter how spatially spread out an incoming object is, it interacts with only one detector at each point of time. Thus we have explained the particle behaviour of waves, given the principle of quantisation of interaction.

But what is, really, a quantum wave? The fundamental dynamical equation of quantum mechanics is Schrödinger's equation

iℏ ∂Ψ/∂t = HΨ    (15.1)


Fig. 15.1 Plane waves approaching a row of sensors

where the Hamiltonian is

H = −(ℏ²/2m) ∇² + V(r, t)    (15.2)

Thus, by obtaining knowledge about the system's mass and the potential function V we can solve Schrödinger's equation, getting a family of complex wave functions of the form

Ψ(r, t) = exp(−i(kr − ωt))    (15.3)

This function describes the state evolution of a quantum system. But what is this system? How is it identified in a concrete experimental situation? And how could an expression consisting of a real and an imaginary term refer to anything in reality? It is understandable that the logical positivists and most physicists once held that these solutions don't describe anything in the real world. They held that the wave function is just a mathematical tool which enables us to calculate distributions of outcomes of all possible measurements. And since such predictions always proved correct, all is fine. Why bother about reference?

As was pointed out in Chap. 7, this position is incoherent, in so far as we use the solutions to Schrödinger's equation as singular terms in sentences being part of quantum theory. The wave functions must refer to real physical objects. If we believe the theory to be true, we hold its component sentences to be true, and if a wave function occurs as a singular term in such a sentence, this wave function must have a reference in the real world, not only in the realm of mathematics. In a concrete experimental situation one must be able to determine at least the mass of the system and the external potential. By attributing a constant mass to the system, it is clear that it is a physical object which can interact with its surroundings subject to momentum conservation; this is what the analysis given in Chap. 10 tells
us. A physical object which exists over an extended region in space and has a well defined value (a vector or a scalar) of a state variable at each point is what we call a field. So this is what wave functions refer to, viz., fields in space and time. This is the view taken by quantum field theorists. Here is, again, a quote from Steven Weinberg:

The inhabitants of the universe were conceived to be a set of fields - an electron field, a proton field, an electromagnetic field - and particles were reduced to mere epiphenomena. In its essentials, this point of view has survived to present day, and forms the central dogma of quantum field theory; the essential reality is a set of fields subject to the rules of special relativity and quantum mechanics; all else is derived as a consequence of the quantum dynamics of these fields. (Weinberg 1977, 23)

The only problematic aspect of this is that the wave function, which is a complete description of the system, usually has complex values. It means that the wave function cannot represent any directly observable quantity of the field.

15.2.1 Probability Amplitudes

The referents of the solutions to Schrödinger's equation are quantum systems. These are fields since they have an amplitude at every spacetime point, i.e., the fields are in principle everywhere. However, do we really need to assume that a physical object described as a field must be everywhere just because the field has a non-zero value everywhere? It may be considered as a mathematical fiction. If, for example, a field's amplitude at a particular point in spacetime is, say, less than 10^−100, should we then say that this field exists at that point or not? I would say no, since the probability that it interacts at that spacetime point is 10^−200; we can safely dismiss such a probability. This is a practical decision; we may in any particular case decide where to put the cut-off and I don't see any need to take a principled stance on this matter. Why should we assume that a mathematical representation of something in nature exactly, in minute detail, reflects the real thing?

According to the Born rule, the square of the amplitude of the wave function at a particular point is the probability to detect the system at that point, hence the common label 'probability amplitude'. As is well known, one cannot identify the probability for detecting a system at a particular point with the probability that the system is at that point just before and independent of the measurement. For such an identification would, by generalising, mean assuming that a system has well defined positions at all times, which would mean that it is a particle at all times, contrary to results of interference experiments. Our conclusion must be that the probability for detection at a particular point is the probability for an interaction between the system and the measurement device at that point. One can calculate this probability without having any record of series of measurements of similar events. These probabilities are transition probabilities, see Sect. 12.3. Hence, one is justified to interpret such probabilities as propensities, i.e., dispositions to change state.


One may make a further identification: the squares of, respectively, the real and the imaginary part of the wave function are real functions with a constant phase difference of π/2. Thus one may identify these squares with the system's kinetic and potential energy respectively. This identification is suggested by Chen (1990, 1993). The correct way of understanding the wave function is that it refers to a quantised field, i.e. a field which propagates in space and time according to Schrödinger's equation. Or one may just as well call it a wave. It has no definite position, it is well spread out and it has an amplitude at each spacetime point.
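The following simulation sketch (mine, not the author's, with an arbitrary Gaussian toy amplitude) ties the thought experiment of Fig. 15.1 to the Born rule: each incoming extended wave triggers exactly one detector, drawn with probability given by the squared amplitude, and the familiar detection statistics emerge only over many repetitions.

# Sketch (mine, not the author's): a spread-out wave meets a row of detectors
# but, interaction being quantised, triggers at most one of them; which one is
# drawn with probability |amplitude|^2 (Born rule).
import numpy as np

rng = np.random.default_rng(0)
n_detectors, n_waves = 20, 10_000
x = np.linspace(-3, 3, n_detectors)
amplitude = np.exp(-x ** 2 / 2)                       # toy wave amplitude over the row
p_detect = amplitude ** 2 / np.sum(amplitude ** 2)

hits = rng.choice(n_detectors, size=n_waves, p=p_detect)   # one click per incoming wave
counts = np.bincount(hits, minlength=n_detectors)

print(np.allclose(counts / n_waves, p_detect, atol=0.02))  # True: statistics ~ |amplitude|^2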

15.3 Indeterminacy, Not Uncertainty!

By now it should be completely clear that the so-called uncertainty relations do not express uncertainty in measurements of the values of observables, but rather indeterminacies. For when we attribute two conjugate observables, for example x and px, to one and the same object they obey the relation

ΔxΔpx ≥ ℏ    (15.4)

This is a theorem of quantum mechanics and its derivation does not depend on any assumption about measurements. But there is no lower bound for two non-conjugate observables such as x and py:

ΔxΔpy ≥ 0    (15.5)

This difference is not explicable if we interpret Δ as an expression for measurement uncertainty: why should there be any difference between the two cases? But interpreted as indeterminacy, i.e., as measures of the spread of the waves, it is immediately understandable. A wave packet propagating in the x-direction must have a certain spread both in coordinate and in momentum space. But if we have determined its position along the x-direction, it could be vastly spread out in a perpendicular direction and therefore have a well defined momentum in that direction. Fourier analysis of wave packets plus quantisation of interaction tell us these things.

Another strong argument for the indeterminacy interpretation of Heisenberg's uncertainty relations is the following. If quantum particles at all times have well defined positions independently of our observations, this applies to an electron orbiting a nucleus. If so, its motion is accelerated when orbiting the nucleus. Since the electron has charge, it must then radiate energy and rapidly spiral down to the nucleus according to Maxwell's equations. But it remains in a stationary energy state, except when emitting or absorbing a photon; this is Bohr's postulate. So either Maxwell's equations, i.e., the electromagnetic theory, is false, or the electron does not move non-uniformly. There is no reason to think Maxwell's equations are not
valid, hence the electron does not orbit the nucleus. Since it nevertheless has kinetic energy, this cannot be due to the electron's motion through space. Therefore its energy must be vibrational energy. Hence it is some sort of standing wave while being bound to a nucleus. It means that it has no well-defined position while being bound to a nucleus; its position is not uncertain but indeterminate. One may recollect that Bohr never claimed that electrons move in orbits around the nucleus; his postulate was that electrons are in stationary energy states when not emitting or absorbing energy. To be in a stationary energy state does not entail that the electron moves around.
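A numerical illustration of my own (not the author's) of the Fourier point made above, for an arbitrary Gaussian wave packet: measuring the spreads by standard deviations, Δx·Δk comes out at the Fourier lower bound 1/2, so the position-momentum bound holds independently of any assumption about measurement (conventions for the Δ's differ, which is why the bound is sometimes quoted as ℏ rather than ℏ/2).

# Sketch (mine, not the author's): position and wave-number spreads of a
# Gaussian wave packet are tied together by Fourier analysis alone.
import numpy as np

x = np.linspace(-50, 50, 2 ** 14)
dx = x[1] - x[0]
sigma = 1.7
psi = np.exp(-x ** 2 / (4 * sigma ** 2))              # Gaussian wave packet
psi /= np.sqrt(np.sum(np.abs(psi) ** 2) * dx)         # normalise

delta_x = np.sqrt(np.sum(x ** 2 * np.abs(psi) ** 2) * dx)

k = 2 * np.pi * np.fft.fftfreq(x.size, d=dx)          # wave numbers
dk = k[1] - k[0]
phi = np.fft.fft(psi)
phi /= np.sqrt(np.sum(np.abs(phi) ** 2) * dk)         # normalise in k-space
delta_k = np.sqrt(np.sum(k ** 2 * np.abs(phi) ** 2) * dk)

print(round(delta_x * delta_k, 3))                    # 0.5, the Fourier lower bound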

15.4 Summary

Quantum systems are best conceived as fields: objects which are spread out in space and have a field value at each spacetime point. The dynamics of these fields is described by complex wave functions. The amplitudes of these wave functions are not observables, but the squared amplitudes are identified with probabilities for detections, according to Born's rule. These fields behave as particles in interactions with detectors. That is a consequence of the discreteness of interaction, a fundamental fact which is a consequence of Planck's radiation law.

Chapter 16

The Measurement Problem

Abstract In this chapter an explanation of the collapse during measurements is presented. First, it is argued that measurements are interactions between systems and that measurements differ from other interactions only in that they are observed, which is of no physical relevance. Secondly, it is shown that indeterministic, discrete and irreversible interactions, i.e., collapses, necessarily follow from Planck's postulate that interactions occur in discrete portions ΔE = hν. So the collapse during a measurement is a consequence of the fact that measurement interactions necessarily are quantised. This fact is lost in von Neumann's account of measurements, because he represented the first step in the measurement process as the formation of the tensor product of the wave functions of the measurement device and the measured object, respectively. This tensor product is one singular term referring to one single object; hence von Neumann lost the ability to say anything at all about any interaction between the measured object and the measurement device.

16.1 Introduction

The measurement problem has occupied physicists and philosophers ever since 1932, when von Neumann published his seminal work 'Mathematische Grundlagen der Quantenmechanik'. In this book von Neumann introduced a distinction between two kinds of measurements, later labelled, by Pauli, 'measurements of the first kind' and 'measurements of the second kind'. Characteristic of a measurement of the second kind is that the value of the measured observable can be predicted with certainty. (Theoretically, this value is an eigenvalue to a Hermitian operator operating on the state function representing the state of the measured system before the interaction takes place.) If the outcome cannot be thus predicted, it is a measurement of the first kind. In such cases the state Ψ of the system before measurement can be described as a superposition of eigenstates φ_i to the Hermitian operator O, corresponding to the measured observable. In such cases we have



$$\Psi = \sum_{i=1}^{n} c_i \varphi_i \qquad (16.1)$$

where the set {φ_i} is a complete set of orthonormal eigenstates to O and the coefficients c_i are complex numbers, which, when squared, give us the probabilities for the different possible outcomes. The measurement result is one of the eigenvalues to the operator O, corresponding to one of the eigenstates. Immediately after the measurement the system is in such an eigenstate. Hence, a measurement of the first kind is a reduction of the superposition to one of its components:

$$\Psi = \sum_{i=1}^{n} c_i \varphi_i \;\rightarrow\; \varphi_k \quad \text{for some } k \qquad (16.2)$$

This change is also referred to as a collapse of the wave function and is theoretically represented by a projection operator P_k acting on the superposition:

$$P_k \sum_{i=1}^{n} c_i \varphi_i = c_k \varphi_k \quad \text{for some } k \qquad (16.3)$$

Such a collapse is a discontinuous, indeterministic and irreversible state change. It is discontinuous since there exists no continuous time evolution operator that destroys the superposition. Any continuous evolution operator has the form $U(t) = \exp(-iHt/\hbar)$, where H is the Hamiltonian. Such operators are unitary and linear. It is well known that U is unitary iff H is Hermitian. Since $\exp(-iHt/\hbar) \neq 0$ for all times t, and the operator is linear, all components in the superposition will survive the unitary and linear evolution. Since Ψ is not an eigenstate to the operator O, it follows that neither is UΨ; hence no continuous evolution operator can transform Ψ to an eigenstate to an operator O unless Ψ already is an eigenstate to that operator. Hence, the state evolution (16.2) cannot be continuous. The collapse is also indeterministic; the original state function Ψ, which by assumption is a complete description of the state, does not contain any information about which term will survive the collapse in a particular case; we only have a probability distribution over the set of possibilities. Hence we cannot predict with certainty which possibility will be realised in a particular case.

Footnote 1: The superposition may also be an integral over a continuous variable.
Footnote 2: Some measurements may not result in a definite value but only in an interval for the value of the measured observable.
Footnote 3: This might be due either to fundamental indeterminism in nature or to the fact that the state function Ψ is an incomplete description of the state. Many philosophers and physicists, Einstein for example, have assumed that quantum mechanics is incomplete and that nature is deterministic. But all efforts to show that quantum mechanics is an incomplete theory have been in vain and there is no positive evidence that quantum mechanics is wrong. In any case, the collapse is discontinuous and indeterministic, and the present author aims at an interpretation of our presently accepted quantum mechanics, not to improve the theory or replace it with a better one; that may be a task for physicists.

16.2 Von Neumann’s Account of Measurements

235

Finally, since we can apply the same argument to the reversed state change φ_k → Ψ, it follows that (16.2) is irreversible in the sense that it is impossible to arrange conditions such that with certainty we can force the system back to its former state. By contrast, if the time evolution of a quantum system can be described by a unitary operator $U(t) = \exp(-iHt/\hbar)$, it is continuous, deterministic and reversible. Hence, any particular system at each period of time evolves either discontinuously, indeterministically and irreversibly, or continuously, deterministically and reversibly. These three features are a package deal in quantum mechanics. According to standard quantum mechanics, discontinuous state changes occur only during measurements of the first kind. This is what the projection postulate says. However, there are very strong reasons to conclude that this restriction to measurements is false. Discontinuous state changes are common, a fact which is a consequence of Planck's discovery that interactions in the form of energy exchanges always occur in discrete steps. Since a measurement involves exchange of energy between the measurement device and a measured object, the discontinuous state change during the measurement can be explained by the discreteness of interactions, thus making the projection postulate superfluous. The details of the argument will be given in this chapter. Why, then, did von Neumann introduce the projection postulate in his (von Neumann 1932/1996)? The reason is, I believe, his mistaken account of measurement interactions.

Footnote 4: It is of course not in general true that a discontinuous state change must be indeterministic. But in quantum mechanics, where physical states are represented by state vectors, or rather rays, in Hilbert spaces, and where continuous changes are represented by unitary operators, these features go together.
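To make the 'package deal' concrete, here is a minimal toy sketch (mine, not the author's): a two-component superposition with a diagonal toy Hamiltonian (ℏ = 1). Unitary evolution, being linear, leaves every component of the superposition in place, whereas a collapse as in (16.2)–(16.3) picks one component at random with Born probability |c_k|² and discards the rest.

```python
import numpy as np

rng = np.random.default_rng(0)

# A two-component superposition Psi = c1*phi1 + c2*phi2 in an orthonormal basis.
c = np.array([0.6, 0.8], dtype=complex)      # |c1|^2 = 0.36, |c2|^2 = 0.64

# Continuous evolution U(t) = exp(-iHt/hbar): with a diagonal toy Hamiltonian the
# unitary only rotates phases, so both components survive with unchanged weights.
H_diag = np.array([1.0, 2.5])                # toy energy eigenvalues, hbar = 1
t = 0.7
U = np.diag(np.exp(-1j * H_diag * t))
print(np.abs(U @ c) ** 2)                    # [0.36, 0.64] -- nothing collapsed

# A collapse, by contrast: one component k is selected with Born probability
# |c_k|^2 and the state is projected onto it and renormalised.
probs = np.abs(c) ** 2
probs = probs / probs.sum()                  # guard against floating-point rounding
k = rng.choice(len(c), p=probs)
post = np.zeros_like(c)
post[k] = 1.0                                # the surviving eigenstate phi_k
print(k, np.abs(post) ** 2)                  # e.g. 1 [0. 1.]
```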

16.2 Von Neumann’s Account of Measurements According to von Neumann the measurement process proceeds in two steps; first the system to be measured (described by a state function of the form  = ci φi ) is brought into contact with the measurement apparatus M, which after a measurement will be in one of the pointer states {mi }. This coming into contact is, according to von Neumann, theoretically represented as the formation of the tensor product ||M =

n 

ci φi mi

(16.4)

i=1

and indeterministic and the present author aims at an interpretation of our presently accepted quantum mechanics, not to improve the theory or replace it with a better one; that may be a task for physicists. 4 It is of course not in general true that a discontinuous state change must be indeterministic. But in quantum mechanics, where physical states are represented by state vectors, or rather rays, in Hilbert spaces, and where continuous changes are represented by unitary operators, these features go together.


This results in a one-to-one correlation between the possible pointer states of the measurement apparatus and the possible states of the measured object. This joint system evolves continuously, guided by a unitary operator $U(t) = \exp(-iHt/\hbar)$. But an observer will observe one of the possible values corresponding to one of the components in the superposition. So the observation brings about a collapse to one of the components. Von Neumann then argued that we may include the observer's body, represented by the state function |O⟩, in the quantum mechanical analysis, thus writing

$$|\Psi\rangle|M\rangle|O\rangle = \sum_{i=1}^{n} c_i\,\varphi_i\, m_i\, o_i \qquad (16.5)$$

where {oi } consists of the possible observation states of the observer. Thus also the observer’s body is in a superposition state. But finally the observer becomes aware of a definite outcome. Von Neumann concludes that somewhere in the link from measured object to observer’s mind we must draw a line, make a ‘cut’, between the quantum system and something that observes it, but that it doesn’t matter where we draw the line. The only thing which cannot be described as a superposition is the mind state of the observer. This argument strongly suggests that the collapse is a change of information state of the observer, although von Neumann did not explicitly draw that conclusion. This in turn suggests that the wave function is not a description of the object’s state but of the information state of observers. This view was integrated into the Copenhagen Interpretation, CI for short, although it was never Bohr’s view. For a long time it was almost the received understanding of the collapse, but times have changed and presently it has few adherents.
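As a minimal numerical sketch of what a state of the form (16.4) looks like (my illustration, with hypothetical amplitudes): the correlated system-plus-apparatus state is a single vector in the product space, and it is entangled, i.e., it cannot be factored back into one vector for the system and one for the apparatus.

```python
import numpy as np

# System amplitudes and basis states (hypothetical numbers), pointer states m1, m2.
c = np.array([0.6, 0.8], dtype=complex)
phi = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
m = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]

# Von Neumann's correlated state: sum_i c_i (phi_i tensor m_i) -- one single vector.
joint = sum(ci * np.kron(p_i, m_i) for ci, p_i, m_i in zip(c, phi, m))
print(joint)                                        # [0.6, 0, 0, 0.8]

# Entanglement check: a product state kron(a, b) reshaped to a 2x2 matrix has
# rank 1; this vector has rank 2, so it is not a product of two individual states.
print(np.linalg.matrix_rank(joint.reshape(2, 2)))   # 2
```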

16.3 The Copenhagen View on Measurements

A core component in the Copenhagen Interpretation was the conviction that classical concepts are necessary for the communication of observations. I have already, in Sect. 9.2, quoted Bohr expressing this view. On this point I think Bohr is absolutely right. Evidence consists of measurement results expressed in observation sentences, a point I argued in Chap. 3. Measurement results are pointer states of measurement devices. The crucial thing is that a pointer state must be stable for some time, so that several people on the spot can observe the pointer state and agree that they see the same state. This means two things: (i) people on the spot must be able to agree on the identification of the measurement device without use of any theory, i.e., using indexicals together with pointing gestures, and (ii) this device must not change pointer state when being observed, i.e., when exchanging photons with an observer. (I take it for granted that observations are visual processes. This is perhaps not absolutely necessary; a blind person could also identify a measurement device and, attaching some auxiliary equipment to it, observe its detector state. But the same argument applies even if quanta other than photons in the visible part of the spectrum are exchanged between observers and a measurement device.) This means that the pointer variable must be a classical one. It follows that any attempt to give a quantum mechanical description of the measurement device as being in a superposition state of possible values of the pointer variable conflicts with this requirement on observations. Adherents to CI accept the projection postulate as a necessary component in a theoretical account of the measurement process.

With the rising popularity of scientific realism in the 1970s, CI lost popularity. It was considered to lack explanatory force. The common thought was, and still is, that we need a physical explanation of the collapse of the wave function during measurements, an explanation which starts with basic physical principles; the projection postulate is no such principle, since it says that the collapse occurs during measurements, and measurements do not belong to fundamental physics. A number of different explanations have been suggested. All add extra assumptions to standard quantum theory. For example, the GRW (Ghirardi-Rimini-Weber) interpretation has it that there is a fundamental law to the effect that any quantum system will collapse with a certain probability per unit time, quite independently of any conditions. The many worlds interpretation assumes that there is a multitude of parallel worlds such that each component of a superposition exists in a distinct world; we live in one of these and therefore we observe only one of the components after the measurement. In the pilot-wave interpretation a hidden variable is postulated, describing the position of the particle as following a continuous trajectory even when not observed. (The pilot wave can be deduced from the standard formalism.) Each of these interpretations thus starts with an extra assumption which, when added to standard quantum theory, doesn't give any new empirical predictions. Hence we empiricists look upon them with skepticism; they definitely look like superfluous metaphysics. One is tempted to say: these extra assumptions are just as much in need of an explanation as what they aim to explain, the collapse. One may reasonably ask, for example, an adherent of the many worlds interpretation: Is your wonder about the world really diminished when you explain the collapse, an observed fact, with the existence of a great number of unobserved worlds? My own solution to the measurement problem, to be given in the rest of this chapter, is not open to that critique. It is based only on a well-established empirical fact, the discreteness of energy exchanges, and standard quantum theory.

16.4 My Own View: A Collapse Interpretation

My take on the measurement problem has three components: (i) pointer states of measurement devices are necessarily classical variables, (ii) discontinuous state changes are common events in the world, not confined to measurements, and (iii) the ultimate cause of discontinuous state changes such as collapses during measurements is the discreteness of interactions, discovered by Planck.

The measurement device is by its very nature such that its pointer states are observable and intersubjective agreement about the actual state is possible; that is what we mean by a pointer state, and Bohr was right in stressing this point, as already noted. This requires that a pointer state is stable for at least a time span long enough to allow for repeated observations. If the first observer changed the pointer state just by observing it, it could not function as a pointer state, since there would be no intersubjective agreement about what is observed. So let us now have a closer look at measurement processes.

16.5 Three Steps of a Measurement: Preparation, Correlation and Detection

One single measurement is of little use in physics; we usually perform series of measurements. In tests of quantum mechanics such measurements are made on ensembles of identically prepared quantum systems. A measurement series on such an ensemble consists of three steps, viz., preparation, correlation and detection. The preparation stage has the purpose of ascertaining that all individual quantum systems in the ensemble to be measured upon are in the same quantum state, so that one and the same wave function describes them all. In the second step the systems are directed into a place in which there is a potential gradient, originating from e.g. an electric or magnetic field, the function of which is to establish a correlation between the possible values of the observable to be measured and a position variable representing different places of detection. The detection is a registration in macroscopic equipment of some sort (photo-detectors, GM-counters, etc.). These detectors are placed at different places and the place of detection determines the value of the observable of interest. Spin measurements are typical examples. Spin-half systems are either in the state spin(up) or in the state spin(down) in a given direction, and hence two detectors at different positions along that direction are used; a click in the 'upper' detector is a detection of spin(up) and a click in the 'lower' detector means a detection of spin(down). It is obvious that spin(up)/spin(down) is relative to a chosen spatial direction. The final observations are thus registrations at different positions, so the ultimate observable is a position variable. Several authors, e.g., Margenau (1958) and Cartwright (1980), claim that this is always the case and I think they are right; no matter which observable one wants to measure, one needs to establish a correlation between the values of the observable of interest and positions of detections. So the second step is necessary and its function is to establish such a correlation.

The external potential enters the formalism as a component of the Hamiltonian; systems to be measured upon pass through a portion of space where there is such a potential (in the case of spin measurements it is a magnetic field). It is possible that a quantum system when passing through the potential exchanges energy and momentum with the external field, in which case it may be deflected so much that it will not arrive at any of the detectors. If so, these events are not counted in the statistics. If, on the other hand, the system does not exchange energy and momentum with the field when passing through it, its energy will be the same, hence the Hamiltonian is a constant of the motion. But the external potential has nevertheless done something with the system; it has changed the spatial distribution of its probability amplitude. In the case of a spin measurement of fermions, which have two possible spin values, the spatial distribution behind the SG-magnet looks like two well-separated peaks instead of a single wavelet. Thus we get a correlation between spin values and positions of detections, if detectors are placed where the two peaks propagate. So far no measurement has been performed, since no detector has made any registration. Furthermore, since by assumption the Hamiltonian is a constant of the motion, no energy exchange has occurred between the SG-magnet and the quantum system; hence it must be possible to undo the change of the spatial probability distribution indicated as two beams behind the SG-magnet in Fig. 16.1. This follows from the fact that in such a case the wave function has developed continuously and the system is still in the same eigenstate as before entering the potential. Such an experiment is depicted in Fig. 16.2. An ensemble of systems propagating in the x-direction is prepared to all be in the state 'spin-up' in the y-direction. After passing the first Stern-Gerlach magnet oriented in the z-direction the beam divides into two. This should not be interpreted as meaning that individual particles are confined to one of the beams, since that would be an example of adopting the epistemic interpretation of the uncertainty relations, which is untenable. (The epistemic and the ontic interpretations of the uncertainty relations were discussed in Sect. 15.3.) It is now possible to reunite these beams, using an electric current in a wire directed in the y-direction between the beams.

[Fig. 16.1: A beam of particles/waves with spin up in the z-direction passing a Stern-Gerlach magnet with the field gradient in the y-direction.]

[Fig. 16.2: A spin-half system passing two Stern-Gerlach magnets.]

Such a current is encircled by a magnetic field and will have opposite effects on the beams, thus uniting them at some point. If we direct this united beam into a new Stern-Gerlach magnet oriented in the y-direction, we will observe the entire beam being deflected in the positive y-direction, thus showing that all systems are still in the state 'spin-up' in the y-direction. This possible experiment has been discussed in the literature, see for example Wigner (1963), Albert (1992) and Omnés (1994). Wigner writes (op. cit., p. 10): 'Even though the experiment indicated would be difficult to perform, there is little doubt that the behaviour of particles and of their spins conforms to the equations of motion of quantum mechanics under conditions considered.'

Omnés agrees and writes: 'This result unfortunately has to be accepted without any experimental check because the experiment is too difficult to perform.' (op. cit., p. 72).

Although this particular experiment may be difficult to perform, other similar experiments have been performed and from them we may draw the same conclusion. So, for example, Rausch et al. (1992) report experiments with neutron waves being split into two spatially separated beams and then united again. The results fully agree with the general conclusion that when splitting wave packets into two spatially separated beams no reduction of the superposition occurs. Single neutrons cannot be attributed any definite position in either path; they spread out over both paths. Thus it is clear that collapses do not occur in the correlation stage, when systems pass the first Stern-Gerlach magnet. The collapse occurs in the final stage, i.e., in the detector, because it is here that energy (and other conserved quantities) is exchanged. Those changes brought about by, e.g., a magnetic field gradient on the spatial distribution of the probability amplitude in a system, which occur without exchanging energy with the environment, I will call influences, whereas those events where the environment exchanges conserved quantities such as energy and momentum with a quantum system I call interactions. So the wave function evolves continuously during an influence from the environment, but changes discontinuously during an interaction with it. I take it for granted that the epistemological fact that the described procedure is a measurement is of no relevance for these conclusions; nothing in the argument depends on anything being in fact observed; what is required is only that the pointer variable is observable and does not change when being observed.

The fundamental flaw in von Neumann's account of the measurement interaction is his assumption that the measurement device is in a superposition of pointer states. The pointer states {m_i} cannot as a matter of principle form a superposition and still be pointer states. This follows from our requirement on being a pointer state. Here is a more detailed physical argument. In order for pointer states to form a superposition, that superposition must be an eigenstate to an observable incompatible with the one actually measured. For if a quantum system is in a superposition state relative to one set of operators, there exist other operators not commuting with the former, such that this quantum system is in an eigenstate to the latter. For example, if the states {φ_i} are |spin(up)_y⟩ and |spin(down)_y⟩, thus being eigenstates to the spin-y operator, a superposition of these is an eigenstate to the spin-z operator, as can be seen in Eq. 16.6:

$$|\mathrm{spin(up)}_y\rangle = \begin{pmatrix} 1 \\ i \end{pmatrix}, \quad |\mathrm{spin(down)}_y\rangle = \begin{pmatrix} i \\ 1 \end{pmatrix}, \quad |\mathrm{spin(up)}_z\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 \\ i \end{pmatrix} - \frac{i}{2}\begin{pmatrix} i \\ 1 \end{pmatrix} \qquad (16.6)$$

The two possible pointer states of the measurement device may thus be described as $M_1 = |\mathrm{detectorclick}_{y(\mathrm{up})}\rangle$ and $M_2 = |\mathrm{detectorclick}_{y(\mathrm{down})}\rangle$. Spin-up in the z-direction is the same state as the superposition in the y-direction, so there should be a corresponding state for the measurement device:

$$|\mathrm{detectorclick}_{z(\mathrm{up})}\rangle = \frac{1}{\sqrt{2}}\left(|\mathrm{detectorclick}_{y(\mathrm{up})}\rangle + |\mathrm{detectorclick}_{y(\mathrm{down})}\rangle\right) \qquad (16.7)$$

But, obviously, the two detectors cannot be at those positions where they actually are and at the same time be at other places where z-spin can be detected. Pointer states are macroscopic position variables which do not follow the superposition principle, so the state (16.7) is not possible. Von Neumann's account of the measurement interaction is wrong. That two observables are incompatible is the mathematical counterpart of the physical fact that they cannot be measured simultaneously. In the case of spin measurement this is easily recognised. When we want to measure spin in the y-direction, the correlation between position and spin is established in the second stage, when the spin system passes an SG-magnet oriented in the y-direction. Obviously the SG-magnet cannot at the same time be oriented in the z-direction. In the formalism this is reflected by the fact that the operators S_y and S_z don't commute.

Equation 16.4 is von Neumann's theoretical representation of the physical contact between measured object and measurement device. This representation erases the individuality of the measurement device and the measured object, respectively. As was shown in Sect. 14.5, a tensor product of two state functions is one term and therefore it cannot possibly refer to two distinct objects. The referent of the tensor product, if there is one, is theoretically treated as one object. It follows that we cannot in the formalism represent an interaction between its parts, the measured object and the measurement device. Therefore, since this single object is described as being in a superposition state, no components can represent pointer variables. Stated in probability terms, if we have a joint state |φ_1⟩|φ_2⟩, no marginal probabilities for observable values attributed to one of the systems |φ_1⟩ or |φ_2⟩ can be defined; hence they cannot be treated as individual objects. Writing the tensor product for two quantum systems is a kind of change of perspective; we look upon that combined system from outside and treat it as an indivisible unit. The Hamiltonian now operates on the tensor product for the combined system, and if this combined system has not exchanged energy with the rest of the world, the total Hamiltonian is constant. This is perfectly compatible with internal energy exchanges between components of the combined system having occurred. Hence the combined system's evolution, as viewed from outside, may be continuous, while discrete internal exchanges of conserved quantities have occurred. So let us now forget von Neumann's way of describing the measurement device and instead view it as an individual object, which can be attributed values of variables unconditioned on other systems. How does a macroscopic pointer variable for a measurement device get a definite value? It is the result of the discreteness of interactions.
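A quick numerical check of Eq. 16.6 and of the non-commutativity just mentioned may be helpful (my sketch; the vectors are kept unnormalised, as they are written in (16.6), and the spin operators are taken in units of ℏ/2):

```python
import numpy as np

# The (unnormalised) y-eigenvectors and the z-up vector from Eq. (16.6).
up_y, down_y = np.array([1, 1j]), np.array([1j, 1])
up_z = np.array([1, 0])

# up_z is the stated combination of the y-eigenvectors:
print(np.allclose(0.5 * up_y - 0.5j * down_y, up_z))       # True

# up_y and down_y are eigenvectors of sigma_y with eigenvalues +1 and -1:
sigma_y = np.array([[0, -1j], [1j, 0]])
print(np.allclose(sigma_y @ up_y, up_y), np.allclose(sigma_y @ down_y, -down_y))

# S_y and S_z do not commute (their commutator is non-zero):
sigma_z = np.array([[1, 0], [0, -1]])
print(sigma_y @ sigma_z - sigma_z @ sigma_y)                # = 2i * sigma_x, non-zero
```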

16.6 Discreteness of Interactions

A measurement is an exchange of energy between the measured object (or a secondary object) and a detector; this is necessary in order to change the pointer state from 'not triggered' to 'triggered'. Since all exchange of energy occurs in discrete steps, so does the measurement interaction. But how do we know that exchange of energy occurs in discrete portions? This was Planck's discovery, see Planck (1900). When Planck published this famous paper there were two approximate laws for the intensity distribution of black-body radiation, Wien's and Rayleigh-Jeans'; Wien's law was empirically adequate for high frequencies, but not for lower ones, whereas the Rayleigh-Jeans law gave almost correct predictions for low frequencies, but not for higher frequencies. (It predicted that the energy density approached infinity as the radiation frequency increased, the so-called ultraviolet catastrophe.) So it was obvious that each law was based on at least one wrong assumption. In retrospect we see that the mistaken assumption in both was that emission of energy from the radiating matter is a continuous process. By instead postulating that emission of radiation occurs in discrete steps (see Footnote 5 below), Planck was able to derive the correct law now named after him:

$$u(\nu, T) = \frac{8\pi h \nu^{3}/c^{3}}{e^{h\nu/kT} - 1} \qquad (16.8)$$

where u(ν, T) is the energy density, ν the frequency of the emitted radiation, T the absolute temperature, k Boltzmann's constant, c the velocity of light and h Planck's constant. We easily recognise that the total energy density (integrated over all frequencies) approaches infinity when h → 0; hence it is immediately clear that h > 0. Since $h = \Delta E/\nu$, it follows from the correctness of Eq. 16.8 that ΔE, i.e., energy changes in the radiating body, occur in discrete portions, and this was in fact an assumption in Planck's derivation of his law. An immediate consequence of the discretised energy states of the atoms making up the radiating object is that interaction between matter and radiation fields is likewise discretised. This is a fundamental fact, the discovery of which started the development of quantum theory.

A consequence of discreteness of interaction had in fact been observed quite some time before Planck found his radiation law, viz., the discrete lines in the emission spectra of the elements, such as hydrogen. The hydrogen spectrum contains four such emission lines in the visible part of the spectrum, the frequencies of which satisfy the well-known Rydberg formula. Rydberg published his formula in 1888, but had no theoretical underpinning for it. It was Bohr who, much later, gave the explanation: the hydrogen atom can only change its energy in discrete steps. Since these well-defined energy steps are proportional to the frequencies of the emitted radiation during the change, the proportionality constant being Planck's constant, we have an explanation. And the same goes for other atoms. So around 1910 it was established by the leading researchers that interaction between electromagnetic fields and matter occurs in discrete portions. Here is how Bohr expressed this fundamental principle:

Its [quantum theory] essence may be expressed in the so-called quantum postulate which attributes to any atomic process an essential discontinuity, or rather individuality, completely foreign to classical theories and symbolised by Planck's quantum of action. (Bohr 1928, p. 581)

And Einstein wrote:

It should be strongly emphasised that according to our conception the quantity of light emitted under conditions of low illumination (other conditions remaining constant) must be proportional to the strength of the incident light, since each incident energy quantum will cause an elementary process of the postulated kind, independently of the action of other incident energy quanta. In particular, there will be no lower limit for the intensity of incident light necessary to excite the fluorescent effect. (Einstein 1905a, 1965)

Footnote 5: According to Kuhn (1978) Planck believed he had only changed the calculation procedure. It was Einstein who realised that Planck in fact had introduced a revolutionary new physical postulate.
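To illustrate numerically what a non-zero h does in Eq. (16.8), here is a minimal sketch (my own, with an arbitrary example temperature) comparing Planck's law with the Rayleigh-Jeans expression $u_{RJ} = 8\pi\nu^{2}kT/c^{3}$, which is what (16.8) reduces to when $h\nu \ll kT$: the two agree at low frequencies, while at high frequencies the Rayleigh-Jeans value keeps growing (the ultraviolet catastrophe) and Planck's law is cut off exponentially.

```python
import numpy as np

h = 6.626e-34   # Planck's constant, J s
k = 1.381e-23   # Boltzmann's constant, J/K
c = 2.998e8     # speed of light, m/s
T = 5000.0      # example temperature, K

def planck(nu):
    """Energy density of Eq. (16.8)."""
    return (8 * np.pi * h * nu**3 / c**3) / np.expm1(h * nu / (k * T))

def rayleigh_jeans(nu):
    """The classical (h -> 0) low-frequency limit of Eq. (16.8)."""
    return 8 * np.pi * nu**2 * k * T / c**3

for nu in (1e12, 1e14, 1e15, 1e16):
    print(f"nu = {nu:.0e} Hz   Planck = {planck(nu):.3e}   Rayleigh-Jeans = {rayleigh_jeans(nu):.3e}")
```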

Thus we see that the discreteness of interactions actually consists of two closely related, but distinct, features:

1. Every individual exchange process involves one object giving away energy and another object taking up energy (and possibly other conserved quantities), and this exchange is independent of every other interaction process. It is this aspect Bohr refers to when he writes that "the quantum postulate attributes to any atomic process an essential individuality." The individual exchanges between atoms and fields are all "event atoms". It means that an atom or molecule gives away one indivisible portion of energy (a photon) or takes up one such portion in every exchange process.

2. The exchange process cannot be analysed further as a sequence of incremental changes. It is a discontinuous state change, either from the state (in quantum field notation) |ground state atom + excited field⟩ to |excited atom + de-excited field⟩ or vice versa. No intermediate states are possible; it is a discontinuous state change.

Thus, Einstein's interpretation of Planck's radiation law and Bohr's interpretation of the discreteness of stationary states in matter could now be joined in the statement: All interactions between matter and radiation fields occur in discrete steps. As argued in Chap. 11, in electromagnetism we have a choice between conceiving fields as the objects postulated by the theory and particles as mere excitation states of these fields, or conversely viewing particles as the objects talked about and fields as mere calculational devices; but a double ontology is impossible; one cannot consistently say that material particles interact with electromagnetic fields. Either we should say that interactions occur between fields, or between material objects. Hence we may reformulate: All interactions are discretised. Both Bohr and Einstein pointed out that the discreteness of exchange of conserved quantities is represented in quantum theory by Planck's quantum of action. In other words, it enters the theory with the well-known formula

$$\Delta E = h\nu \qquad (16.9)$$

This formula tells us that any energy change ΔE in an atom, molecule or any piece of matter is proportional to the frequency ν of the radiation field interacting with that piece of matter, and thus to the frequency of the emitted or absorbed photon. So we may formulate the principle of discreteness of interaction:

Principle of discreteness of interaction (Disc): All exchange of energy occurs in discrete steps ΔE = hν.
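As a simple worked example of the magnitudes involved (illustrative numbers of my own, not from the text): for visible light with frequency $\nu \approx 5\times10^{14}\,\mathrm{Hz}$, one discrete step is

$$\Delta E = h\nu \approx 6.63\times10^{-34}\,\mathrm{J\,s} \times 5\times10^{14}\,\mathrm{s^{-1}} \approx 3.3\times10^{-19}\,\mathrm{J} \approx 2\,\mathrm{eV}.$$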

The discontinuity of interactions makes it impossible to use continuous functions of time for the mathematical representation of interaction processes at the fundamental level. In macroscopic theories such as classical mechanics, classical electromagnetism, thermodynamics and relativity theory one uses continuous functions of time for such descriptions. But in the quantum world we need other tools. In other words, already from the empirical fact that all interactions are discretised we may infer that solutions to Schrödinger's equation cannot be used to describe the measurement process, since these solutions are continuous functions of time and the measurement interaction is a discontinuous state change. Hence, no-collapse interpretations are ruled out, because they all take for granted, as a premise, that all state changes are continuous. All no-collapse interpretations therefore need to somehow explain away what we can infer from a measurement of the first kind, viz., that the measured object has changed its state discontinuously. In standard expositions of quantum mechanics the projection postulate is added in order to account for this discontinuity. But few, if any, are content with it as a fundamental postulate. Why should discontinuous state changes occur only during one class of measurements, which is what the projection postulate says? Can it be replaced by some more general principle? Yes, this is precisely the role of the discreteness of interactions. It is a perfectly general empirical principle, established beyond doubt. In order to see its role more clearly we need to consider how discontinuity is mathematically represented in the formalism of quantum mechanics. So let us begin by taking a close look at quantisation, the transformation from classical descriptions of the dynamics of physical systems in terms of continuous variables to a description using Hermitian operators with discrete spectra.

16.7 From Classical to Quantum Mechanics

16.7.1 Replacing Operators for Variables

Borrowing a concise formulation from Rovelli (2004, p. 14), to quantise is a technique for searching for a solution to a well-defined inverse problem, viz., finding a quantum theory with a given classical limit. So we use the correspondence principle as a condition on any acceptable quantum theory. It says that in situations where the total quantity of action is much bigger than Planck's constant, the predictions of quantum theory should be identical to those of classical mechanics. Consider, as an example, a piece of matter made up of a large number of atoms, say $10^{23}$ or more. This object will usually have a very large number of densely packed energy levels. The result of this is that usually it can interact with radiation fields of almost any frequency. In other words, it would be hard, not to say impossible, to observe any difference between such a system and a classical one in which interactions were truly continuous. In other words, in the limit of big systems the new mathematics must give the same, or very nearly the same, predictions as classical mechanics. We can formulate the principle as follows:

Correspondence principle: In situations where we can disregard the fact that Planck's constant is non-zero, i.e., where the discreteness of interactions can be neglected when calculating the values of variables, quantum and classical mechanics should make the same empirical predictions. (See Footnote 6 below.)

How, then, do we find the correct quantum description of the time evolution of physical systems? Obviously, it cannot be done by using continuous functions of time as descriptions of their dynamical evolution. In making the transition from classical to quantum mechanics we need to replace continuous real-valued variables by something else. That something else was found to be Hermitian operators operating on classes of state functions describing quantum states. Hermitian operators are needed since they have real eigenvalues and the eigenvalues are taken to be the observable values; surely, we observe only real values of observables (Footnote 7). Since the eigenvalues of some Hermitian operators are often discrete, it is possible to construct a mathematical representation of the discreteness of values of observables, with suitably chosen operators. These operators operate on state functions that represent states of physical objects. How, then, are the proper Hermitian operators replacing classical variables found? The recipe is to use the following correspondence rule, which is the mathematical counterpart to the informal correspondence principle, as a guide:

Correspondence: For any two Hermitian operators A and B acting on a Hilbert space H and replacing classical variables A and B it holds that

$$[\hat{A}, \hat{B}] = i\hbar\,\{A, B\} \qquad (16.10)$$

where {A, B} is the Poisson bracket for the pair (A, B) (Footnote 8). It is immediately clear that in cases where we can approximate h to be zero, which means disregarding the fact that interactions are discrete, all operators commute and quantum predictions will be the same as those of classical mechanics. One should keep in mind that it is not assumed that all Hermitian operators have a counterpart among the set of observables of a system. It is only assumed that for any observable a Hermitian operator can be found.

Footnote 6: There is also another version of the correspondence principle, which says that in the limit of big quantum numbers the predictions of quantum mechanics should equal those of classical mechanics. These two versions are not strictly equivalent. For our purpose it is the first version which is useful.
Footnote 7: Thus I accept the validity of the eigenvector-eigenvalue link, in contrast to modal interpretations, in which it is given up.
Footnote 8: In canonical coordinates the Poisson bracket is defined as $\{f, g\} = \sum_{i=1}^{N}\left(\frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i}\right)$. If the Poisson bracket differs from zero, the conjugate coordinates $(p_i, q_i)$ are functionally dependent on each other.


In this I follow what I take to be the received view among physicists. For example, Dicke and Wittke (1960, 101) write: "With every physical observable there is associated an operator." Two observables that do not simultaneously have definite values are said to be incompatible, and two such observables are replaced by non-commuting operators. If one such observable has a precise value before the measurement, those incompatible with it cannot be attributed definite values; we can only calculate a probability distribution over the set of possible values. Why is that so? Why can't we simultaneously determine all observables? The traditional argument, stemming from Heisenberg (1930), was that all measurements by necessity induce an uncontrollable disturbance on the system. The word 'disturbance' suggests that the system has definite values of all observables, but that it is impossible for us to know what the correct values are. This view, the epistemic interpretation of the uncertainty relations, has been heavily criticised, see Sect. 15.3. The other view, the ontic interpretation, is in my view the correct one; hence we should use the expression 'indeterminacy relations', not 'uncertainty relations'. Adopting the ontic view amounts to rejecting value determinism, the view that all observables have definite values at all times, independently of being measured, and conversely, rejecting value determinism is to adopt the ontic view. Quite a number of interpreters, perhaps the majority nowadays, seem to reject value determinism and thus to adopt the ontic view. Every preparation of a quantum system, properly so called, results in a division of its observables into determinate and indeterminate ones. This is so because the preparation consists in selecting, and bringing under experimental control, systems with well-defined states, i.e., systems described by known wave functions, and these functions are eigenfunctions to some operators while not to others. As was thoroughly discussed in Sect. 10.6, in classical mechanics the step from a mere description of the motion of bodies to an account of their interactions started with Wallis's, Wren's and Huygens' experiments with colliding balls. In these experiments there is no problem with the individuation of bodies; they are clearly visible as distinct things. But in quantum theory we cannot rely on observation when individuating among the interacting objects. We must rely on theory to individuate between quantum systems. This step is a necessary precursor for the analysis of interactions. This is the topic of the following subsection.
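As a concrete numerical instance of the correspondence rule (16.10) (my own sketch): classically the angular momentum components satisfy the Poisson bracket relation $\{L_x, L_y\} = L_z$, so the corresponding operators should satisfy $[\hat{J}_x, \hat{J}_y] = i\hbar \hat{J}_z$. This can be checked directly in the spin-1/2 representation $\hat{J}_i = (\hbar/2)\sigma_i$:

```python
import numpy as np

hbar = 1.0  # work in units where hbar = 1

# Pauli matrices and the spin-1/2 angular momentum operators J_i = (hbar/2) * sigma_i.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
Jx, Jy, Jz = (hbar / 2) * sx, (hbar / 2) * sy, (hbar / 2) * sz

# [Jx, Jy] should equal i * hbar * Jz, mirroring the classical {Lx, Ly} = Lz.
commutator = Jx @ Jy - Jy @ Jx
print(np.allclose(commutator, 1j * hbar * Jz))   # True
```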

16.7.2 Interaction and Individuation of Quantum States

Interactions, as I use this word, are exchanges of conserved quantities between physical systems. Interactions presuppose identity criteria for the interacting systems; to say that a system emits a portion of a conserved quantity and that this portion is absorbed by another system presupposes that we have a criterion for distinguishing and identifying these two systems as separate individuals. So what are the principles of identity and individuation for quantum systems? In Chap. 10 I argued that the conservation principles, the principle of quantisation of interaction and the Fermi rule jointly function as principles of individuation of quantum systems. It means that we treat that system to which we can ascribe a definite portion of a conserved quantity, and about which we can say that it absorbs or emits portions of conserved quantities, as an individual object. Individuation is part of theory. No other option is, so far as I can see, available (Footnote 9).

As already pointed out, when we form the tensor product of two wave functions belonging to different rays, we get a new wave function. Its parts cannot be thought of as referring to individual systems; the tensor product is one term, see Sects. 14.6 and 16.6. Hence, according to quantum mechanics we may not think of the components of the tensor product as each describing a distinct individual. Therefore, von Neumann's theoretical account of the measurement, starting with forming a tensor product of the wave functions for the measured object and the measurement device, cannot be a description of an interaction. One might say that this tensor product is a description of the coupled system seen from an external point of view where the coupled system's internal dynamics is neglected; this is unavoidable since the tensor product is one term, not two.

An interaction is an exchange of a conserved quantity between two distinct quantum systems, and conserved quantities are ascribed to isolated systems, so the concepts of individuation and exchange of energy (and other conserved quantities) are mutually dependent. One may think this is a vicious circle, but it is not. Suppose we find a system, seemingly closed, where energy is not conserved. From a logical point of view we have two options: either to give up the principle of energy conservation or to say that the condition of a closed system was not satisfied. I guess that for most physicists this choice is easy; the assumption of a closed system is judged to be false, based on coherence considerations plus Noether's theorem. And this has in fact happened, viz., in the first experiments indicating weak interactions. Neutrinos carrying energy are produced in weak interactions, but since they were not independently detectable with the technology of the time (the 1930s) it seemed that energy conservation was violated. But most physicists couldn't believe that, so they instead postulated that a new, so far unknown, particle had been produced. This assumption marked the beginning of a new direction in the development of quantum theory. Independent evidence for this new particle, the neutrino, was collected much later. Thus, to summarise, when two distinct quantum systems interact by exchanging portion(s) of a conserved quantity, they both change their states. This follows from the principle of individuation among quantum states implicit in quantum theory. These state changes cannot be continuous, since interactions are discretised, and what is to count as a state change, i.e., the individuation of quantum states, is determined by quantum theory itself.

Footnote 9: Cf. Quine (1981b, 12): "Any coherent general term has its own principle of individuation, its own criterion of identity among its denotata."

16.7.3 Unobserved Interactions

I guess that no one will contest the assumption that Nature just by itself, without any human intervention, is such that a quantum system may enter into a physical environment that physically resembles those which we may arrange in experiments. Not that Nature itself would isolate one portion and call it a 'quantum system', of course, but the physical interactions we try to describe would be the same. Take for example, again, the Stern-Gerlach experiment described above. Certainly, nature is full of (i) inhomogeneous magnetic fields, (ii) electrons passing such fields, and (iii) macroscopic objects that take up energy from incoming electrons. Having accepted this, we have in fact also accepted that spin operators applied to state functions may stand for all states of this physical type, be they part of measurements in the ordinary sense or not.

What, then, about a measurement of the second kind, i.e., a measurement which is not associated with a collapse? Surely, exchange of quanta of energy, i.e., photons, between the detector and the measured object occurs also in such a measurement. So it is not a completely passive observation after all. The reason why we don't represent this interaction by Eq. (16.2) is that the prepared state is an eigenstate to the Hermitian operator representing the type of influence exerted on that system. For example, in a spin measurement of the second kind no influencing magnet is introduced between the preparation and the detection (because the preparation consists in having the particle beam pass through an SG-magnet and stopping one of the two beams behind the magnet), so the detection is merely a confirmation that the preparation has succeeded. But a detection is made, and in that process energy quanta are exchanged. So, exchange of quanta of energy does not necessarily mean collapse of a superposition, but collapses are always associated with exchange of quanta of energy.

16.7.4 Measurements of Continuous Observables

An anonymous referee of an earlier version of this chapter asked whether the generality of the discreteness of interactions is compatible with the fact that there are continuous observables. The answer is yes, because concrete measurements always return rational numbers, never irrational ones. This is so because a measurement of a quantity is always a comparison between the observed magnitude and the unit for the measured quantity. We need irrational numbers for some purposes, but not for reporting measurement results. So when we say that, e.g., the observable POSITION ALONG THE X-AXIS is continuous, this does not entail that concrete position measurements may return arbitrary real numbers; they always return rational numbers. This is compatible with the fact that measurement interactions are discrete state changes.

16.8 State Evolution and Time Dependent Hamiltonians

The Hamiltonian operator H replaces the classical variable energy in quantum theory. It is a mathematical device needed for calculating what happens with quantum states under given energy conditions. Often solutions to Schrödinger's equation are calculated under the assumption that the Hamiltonian is a constant, i.e., that the system under consideration is not interacting with external fields during the time interval of interest. Thus the time evolution operator has the form $U(t) = \exp(-iHt/\hbar)$, where H, the Hamiltonian, is a constant of the motion. But this gives us no real state change. As earlier pointed out (see Sect. 10.9), physical states are identified by rays, which are defined as sets of state functions differing only by a complex number; Ψ1 and Ψ2 belong to the same ray iff Ψ1 = cΨ2, where c is a complex number, see e.g. Weinberg (1995, 49). So an isolated system which does not exchange energy or some other conserved quantity with its surroundings does not change its physical state. This might at first seem counter-intuitive, but if we realise that a time translation t → t + Δt is a displacement of the zero point Δt seconds backwards on the time scale, we realise that such a change is no real physical change.

When a system exchanges a discrete portion of energy ΔE = hν with its surroundings at some point of time t′, we describe the time dependence of H(t) as

$$H(t) = \begin{cases} H_1, & \text{if } t < t' \\ H_2, & \text{if } t > t' \end{cases}$$

where $H_1\psi = E\psi$ and $H_2\psi = (E + \Delta E)\psi$ in a case where the system ψ has absorbed energy ΔE from its surroundings. At the time t′ no definite Hamiltonian can be given; it is at this time point that the exchange occurs; this is the quantum jump (Footnote 10). Representing the interaction process as an instantaneous event occurring at one point in time t′ may be viewed as an idealisation, but the salient feature is that there are no intermediate energy states between E and E + ΔE (or vice versa); this is Bohr's postulate of stationary states. And the lack of such intermediate states can be represented by a step function.

Footnote 10: It is not unusual to consider systems where the Hamiltonian is a continuous function of time. Such a representation is in fact to treat the interaction between a field potential and the quantum system classically. It may be a reasonable approximation, but it is not consistent with the fundamental principle of quantum theory that interactions are discretised.
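A trivially small sketch of such a step-function Hamiltonian (toy numbers of my own, ℏ = 1): the 'Hamiltonian' here is just the energy eigenvalue of the state before and after the jump, with no intermediate value in between.

```python
# Piecewise-constant energy across a quantum jump at time t_prime (toy values).
E, dE, t_prime = 1.0, 0.4, 2.0

def H(t):
    """Energy eigenvalue before (E) and after (E + dE) the discrete exchange."""
    if t == t_prime:
        raise ValueError("no definite Hamiltonian at the jump time itself")
    return E if t < t_prime else E + dE

print([H(t) for t in (0.0, 1.0, 1.9, 2.1, 3.0)])   # [1.0, 1.0, 1.0, 1.4, 1.4]
```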


The conclusion to be drawn from Planck's discovery is thus that discrete exchanges of energy between physical systems occur frequently, whether observed or not. An interaction between a measurement device and a measured object is an exchange of energy between them: either the device takes up energy from the measured object or vice versa. What distinguishes measurement interactions from other interactions is that the state changes of the measurement device are, or can be, observed by humans. So the collapse of the wave function during measurements of the first kind is not something specific to measurements. (And already the existence of measurements of the second kind, where no collapse is registered, indicates that the measurement qua measurement cannot be the cause of the collapse.) The fundamental cause of the collapse is the quantisation of interaction.

16.9 A Semi-formal Derivation of Collapse

So far I have given an informal argument for the thesis that the collapse of the wave function during a measurement of the first kind is a consequence of the discreteness of interaction. I will now give a more formal proof, albeit not fully formal in the logical sense. We represent states of quantum systems by rays in Hilbert spaces, and observables are represented by Hermitian operators acting on vectors (rays) in Hilbert spaces. Given this mathematical structure, we can derive collapse from the empirical fact that interactions between quantum systems are discretised. Before doing the derivation we recapitulate the characterisation of the core concepts and the postulates needed.

Definitions:
1. Interaction: An interaction is an exchange of energy between two distinct physical systems.
2. Discreteness: An exchange of a conserved quantity between two systems is discrete iff it occurs in steps that have a minimum magnitude > 0.
3. Collapse: A collapse is a state change of a system that is discontinuous, indeterministic and irreversible.
4. Measurement: A measurement of the value of an observable A attributed to a system S is an interaction between S and a measurement device M, such that A gets a definite value which can be inferred from the pointer state of M. A measurement of the first kind is a measurement of an observable such that its corresponding operator is not an eigenoperator to the state of the system before the measurement.


Premises:
1. Disc: All exchange of energy between different systems occurs in discrete steps $\Delta E = h\nu$, with $h = 6.62 \times 10^{-34}$ Js.
2. Correspondence: For any two Hermitian operators A and B acting on a Hilbert space H and replacing classical variables A and B it holds that

$$[\hat{A}, \hat{B}] = i\hbar\,\{A, B\} \qquad (16.11)$$

where {A, B} is the Poisson bracket for the pair (A, B).
3. Poisson: There are pairs of classical variables (A, B) such that the Poisson bracket {A, B} ≠ 0.

The first premise is an empirical result, whereas Correspondence is, as already pointed out, a heuristic principle for the construction of the mathematics of quantum theory, and Poisson is a well-known feature of classical mechanics in the Hamiltonian formulation. These are the three premises in the derivation.

Derivation of Collapse

First we will derive a well-known theorem of quantum theory (as usual assuming that state functions 'live' in Hilbert spaces) from Correspondence and Poisson. We proceed by first stating four lemmas. We work in the Heisenberg representation, with constant eigenfunctions and time-dependent operators.

Lemma 1 The operators A and B replacing a pair of classical variables A and B for which it holds that {A, B} ≠ 0 don't commute.

This is an immediate consequence of Correspondence and Poisson.

Lemma 2 If two operators don't commute, there exists no complete set of functions spanning a Hilbert space on which the operators operate, such that all functions in the set are eigenfunctions to both operators.

Proof by contraposition: Suppose (1) that Ψ is any function belonging to a complete set spanning a Hilbert space H, and (2) that Ψ is an eigenfunction to two operators A and B with eigenvalues a and b respectively, i.e., AΨ = aΨ and BΨ = bΨ. Then $[A, B]\Psi = AB\Psi - BA\Psi = Ab\Psi - Ba\Psi = bA\Psi - aB\Psi = ba\Psi - ab\Psi = 0$. Since this is valid for all functions in the set spanning the Hilbert space, we have that A and B commute. Hence if they don't commute, there exists no complete set of eigenfunctions to both operators.

Lemma 3 If a system S is in a state Ψ which is an eigenstate to a Hermitian operator A, but not to another Hermitian operator B, such that [A, B] ≠ 0, then, if S evolves into an eigenstate φ_k to B, it will make a discontinuous state change.

Proof: Suppose $\Psi = \sum_i b_i \varphi_i$ is an eigenstate to an operator A and {φ_i} is a complete set of eigenstates to B. A continuous evolution of the state Ψ is represented by a unitary operator $U(t) = \exp(-iHt/\hbar)$. Such operators are linear: $U(t)\sum_i b_i \varphi_i = \sum_i b_i\, U(t)\varphi_i$. In other words, all components in the superposition exist at all times for which U is defined. That means that the system will never, by a continuous evolution, enter into one of the eigenstates to B. Hence if it has evolved into an eigenstate φ_k to B, this state change was discontinuous.

Lemma 4 If the state evolution $\Psi = \sum_i b_i \varphi_i \rightarrow \varphi_k$ is discontinuous, then this state change is indeterministic and irreversible.

Proof If φk is one of several non-zero components in bi φi then the probability that  will change into φk is |bk |2 < 1. Hence this change is indeterministic. By the same argument the change from φk to  is indeterministic. Therefore the state change from  to another state φk is also irreversible.11 Since a discontinuous, indeterministic and irreversible state change is a collapse, by modus ponens we have from Lemmas 3 and 4: Theorem If [A, B] = 0, and the system S is in an eigenstate to A, S will collapse when changing into an eigenstate to B. This is of course no news; the purpose of this derivation is simply to show that it is derived from Correspondence and Poisson. Now it is time to consider what can be inferred from the empirical premise Disc. Using Lemma 1 we have by Modus Ponens: Conclusion 1 If Disc, then there exists pairs of Hermitian operators (A, B) acting on the Hilbert space H and replacing classical variables (A, B) such that [A, B] = 0. By Conclusion 1 and Theorem we have. Conclusion 2 If Disc, then if a system S is in an eigenstate to an operator A, it will collapse when it evolves into an eigenstate to another operator B for which it holds that [A, B] = 0. Since a measurement on a system S is an exchange of energy between S and a measurement device M, the inference from Disc to Collapse is valid also for those interactions being measurements: Conclusion 3 If Disc, then, if a system S is in an eigenstate to an operator A and an eigenvalue to an operator B, such that [A, B] = 0, is determined in a measurement, S will collapse during that measurement. It is thus shown that the projection postulate is not needed as an independent assumption, it is replaced by Disc. But Disc is not any new postulate, it was the starting point of quantum theory and has been accepted as true for more than 100 years. We have derived collapse from the fundamental empirical fact Disc, i.e., that interactions between systems are discretised, taking Correspondence and Poisson for granted. Disc is the basic empirical postulate of quantum mechanics.

11 Cf. the notion of dynamical reversibility defined in Sect. 13.5. There is a stronger notion of irreversibility, viz., complete impossibility of reversing a system to an earlier state. But that is not the sense in which the collapse is irreversible.
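To make the steps of the derivation concrete, here is a minimal numerical sketch (my own illustration, not part of the book's argument), using the Pauli matrices σz and σx as stand-ins for the operators A and B:

```python
# Minimal sketch of the derivation's three steps, with Pauli matrices as A and B.
import numpy as np

sz = np.array([[1, 0], [0, -1]], dtype=complex)   # A: eigenstates |0>, |1>
sx = np.array([[0, 1], [1, 0]], dtype=complex)    # B: eigenstates (|0> ± |1>)/sqrt(2)

# Lemmas 1-2: the commutator is non-zero, so A and B share no complete eigenbasis.
commutator = sz @ sx - sx @ sz
print(np.allclose(commutator, 0))                 # False

# Lemma 3: unitary (continuous) evolution is linear, so it never removes a component.
H = sx                                            # an arbitrary Hermitian Hamiltonian

def U(t, hbar=1.0):
    """Unitary time evolution exp(-iHt/hbar), built via eigendecomposition of H."""
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * w * t / hbar)) @ V.conj().T

phi_plus = np.array([1, 1], dtype=complex) / np.sqrt(2)    # eigenstates of B
phi_minus = np.array([1, -1], dtype=complex) / np.sqrt(2)
psi = np.array([1, 0], dtype=complex)                       # eigenstate of A, not of B
b_plus, b_minus = phi_plus.conj() @ psi, phi_minus.conj() @ psi

t = 0.7
lhs = U(t) @ psi
rhs = b_plus * (U(t) @ phi_plus) + b_minus * (U(t) @ phi_minus)
print(np.allclose(lhs, rhs))                      # True: both components persist

# Lemma 4: if psi nevertheless ends up in phi_plus, the change is probabilistic,
# with Born probability |b_plus|^2 < 1.
print(abs(b_plus) ** 2)                           # 0.5
```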


Correspondence and Poisson may be viewed as representation rules, i.e., rules for constructing a mathematical representation of observational data in the quantum realm, as is clearly expressed in e.g., Dicke and Wittke (1960, 102–3). One may observe that quantum mechanics is constructed so as to have classical mechanics as its classical limit, as is clear from using Poisson and Correspondence as premises. This is necessary since an intersubjectively observable state of a measurement device is a value of a classical variable. This was also Bohr's position:

[H]owever far the phenomena transcend the scope of classical physical explanation, the account of all evidence must be expressed in classical terms. The argument is that simply by the word "experiment" we refer to a situation where we can tell others what we have done and what we have learned and that, therefore, the account of the experimental arrangement and of the results of the observations must be expressed in unambiguous language with suitable application of the terminology of classical physics. (Bohr 1951, 209)

16.10 Summary

A measurement is necessarily an exchange of energy between the measurement device and the measured object, alternatively a secondary object. An exchange of energy is a discontinuous state change, a consequence of the fact that Planck's constant is non-zero. No continuous function or operator can therefore correctly describe the time evolution of the measured object during such a measurement. Von Neumann represented the interaction between measurement device and measured object by the tensor product of their respective wave functions. This representation does not give us any information about the energy exchange between the interacting objects, since the tensor product is one term referring to the compound system, which thus is treated as one single object. In order to mathematically represent internal dynamics, interactions, in this compound system, one needs an individual term for each part in the interaction. It was von Neumann's misrepresentation of the measurement interaction that forced him to add the projection postulate.
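As a small illustration of this point (my own sketch, not von Neumann's full formalism), the tensor product of the object state and the apparatus state is a single vector in the product space; nothing in that one term singles out the energy exchanged between the parts:

```python
# The tensor product of two state vectors is one vector for the compound system.
import numpy as np

psi_object = np.array([1, 0], dtype=complex)      # state of the measured system S
phi_apparatus = np.array([0, 1], dtype=complex)   # state of the measurement device M

compound = np.kron(psi_object, phi_apparatus)     # one 4-component vector for S + M
print(compound)                                   # [0.+0.j 1.+0.j 0.+0.j 0.+0.j]

# Any operator acting on this single vector treats the compound as one object;
# there is no separate term here that represents an energy transfer from S to M.
```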

Chapter 17

What Is Spacetime?

Abstract The topic of this chapter is the debate between adherents of relationalism and substantivalism respectively, in the setting of relativity theory. In special relativity theory one may consistently hold that spacetime intervals are mere attributes of pairs of material objects. Even though spacetime is often treated as an object in relativity theory, there is no reason to think that it is a physical object; we may very well accept it as a mathematical object without concluding that this mathematical object has a counterpart in the physical world. By the same argument one may conclude that even if string theory is a nearly correct theory, the fact that there are 10 (or 11) spacetime dimensions in this theory doesn't say anything about the number of spacetime dimensions in the real world. When we go to general relativity we have no obvious way to distinguish between object and attribute. Einstein's equation, the fundamental principle in general relativity, tells us that a function of the spacetime metric is equal to the matter-energy tensor. The equation merely relates two quantities to each other. One is then inclined to ask which object these quantities are attributes of, and the answer is that general relativity doesn't make any distinction between object and attribute. One may say that either we can treat the matter-energy distribution as a description of what exists and the function of the spacetime metric as an attribute of that object, or reverse this and hold that spacetime with its metric is what exists. The structure of general relativity transcends the structure of our common sense thinking in terms of objects with attributes.

17.1 Introduction

The ontology of space and time has been a focus of philosophical debate for centuries: are space and time existing things, some sort of substances, a doctrine called substantivalism, or merely a means for describing the motion of a portion of matter relative to other such portions, which is called relationalism? The fusion of space and time into spacetime in relativity theory has reshaped the debate but not decided the matter.


At first sight one might have thought that relativity theory should have handed victory to relationism, but on second thought one realises that it is not to be expected that an empirical theory by itself should decide a question about ontology. And the debate is more lively than ever. A couple of decades ago Rovelli wrote: I believe that we are going through a period of profound confusion, in which we lack a general coherent picture of the physical world capable of embracing what, or at least most of what, we have learned about it. The “fundamental scientific view of the world” of the present time is characterized by an astonishing amount of perplexity, and disagreement, about what time, space, matter and causality are. (Rovelli 1997, 180)

I find Rovelli's characterisation correct; the debate about time, space, matter and causality is as intense as ever. In this chapter I will spell out what a thoroughly empiricist and nominalist view of spacetime and matter looks like. The ontological question about spacetime might be stated as: is there any such thing as a geometrical object spacetime in the real world? But, then, what do we mean by 'geometrical object'? One should clearly distinguish between two senses of 'geometry', mathematical geometry and physical geometry. This distinction is not upheld as clearly as it ought to be in philosophical debates about space and spacetime. Mathematical geometry is a purely a priori inquiry. It is based on axioms which implicitly define some mathematical objects, see the discussion in Chap. 4. Physical geometry, on the other hand, is a theory where some geometrical concepts, such as POINT and DISTANCE, are given physical meaning. For example, the spatial distance between two points is in physical geometry given in terms of the time it takes for photons to travel between these points. The ontological question is not whether there are geometrical objects in mathematics; that goes without saying. The question is whether one can identify some entities in the physical world, spacetime and spacetime intervals, by only describing their geometrical properties, i.e., without implicitly referring to material objects. In other words, is there in the physical world any such thing as an independently existing geometrical object, spacetime? Mathematical geometry can be done a priori; it has nothing to do with the empirical world. And the mathematical object called 'spacetime', a differentiable 4-dimensional manifold with a metric, ⟨M, gμν⟩, is useful for calculational purposes. Statements about this mathematical object are truth-apt. But does this mathematical entity correspond to, or represent, anything in the physical world? Empiricists are prone to answer 'No' and I wholeheartedly agree. There is no reason to think of spacetime as part of our physical ontology. I recognise it as nothing but a mathematical entity. However, matter-energy distribution and spacetime geometry are very closely related according to GTR. Viewing spacetime geometry as merely an attribute of portions of matter is not unproblematic, because the spacetime metric is part of the dynamics in GTR. It is not uncommon to describe this intimate relation in causal terms: the metric of spacetime affects matter. However, this formulation cannot be taken literally as a description of a causal relation between two independently existing


things. How, then, should we characterise the relation between matter-energy and spacetime metric? How to give an ontology for GTR? The problem was already discernible at the start of relativity theory. Einstein based it on two primitives, rods and clocks, by which operational meanings of DISTANCE and TIME were given. One might ask, what else could he have done in order to give empirical content to the concepts TIME and SPATIAL DISTANCE? But Einstein felt uneasy about taking rods and clocks, i.e., material bodies, as primitive in relativity theory.

17.2 The Role of Rods and Clocks in Relativity Theory

We regularly refer to space and time intervals in physics, but these cannot be identified independently of material objects, which is an argument for relationism. Isn't that in conflict with the conclusion in Sect. 9.3 that identity criteria for bodies are given in terms of genidentity, continuous trajectories in space and time? Doesn't that presuppose that points in space and time can be identified independently of bodies? No. The conclusion in Chap. 9 was that the predicates MASS, TIME INTERVAL and DISTANCE are mutually dependent on each other. And points in space and time are entities that pairwise may satisfy the predicates TIME INTERVAL and DISTANCE respectively.1 Hence, they satisfy the demand of being values of variables, which is a necessary condition for being treated as existing entities. Since identity criteria for spacetime points can be given, I have no problem with spacetime points in my ontology. But their identity criteria are ultimately given in terms of bodies, hence relationalism. I have repeatedly argued that from an epistemological point of view, bodies are primary. In relativity theory this view is explicit, spatial distances and time intervals are given in terms of rods and clocks.2 But as already noted, Einstein was a bit uneasy with this view of taking rods and clocks as primitive elements of relativity theory:

It is . . . clear that the solid body and the clock do not in the conceptual edifice of physics play the part of irreducible elements, but that of composite structures, which must not play any independent part in theoretical physics. But it is my conviction that in the present stage of development of theoretical physics these concepts must still be employed as independent concepts; for we are still far from possessing such certain knowledge of the theoretical

1 More precisely, the predicate DISTANCE is the three-place predicate 'The distance between . . . and . . . is . . . m' and the first two term positions are to be filled with points in 3D space. In relativity theory this predicate is replaced by SPACETIME INTERVAL and the term positions are to be filled with events in spacetime.
2 As was observed in Sect. 9.5, in the definitions of the SI units rods are no longer necessary since the meter is defined in terms of the time unit. Only clocks are needed as fundamental. But that is a consequence of relativity theory!


principles of atomic structure as to be able to construct solid bodies and clocks theoretically from elementary concepts. (Einstein 1921 (1954))

Similarly, Brown concludes: In a nutshell, the idea is to deny that the distinction Einstein made in his 1905 paper between the kinematical and dynamical parts of the discussion is a fundamental one, and to assert that relativistic phenomena like length contraction and time dilation are in the last analysis the result of structural properties of the quantum theory of matter. (Brown 2005, preface)

It is indeed true that length contraction and time dilation are consequences of properties of matter. Length contraction and time dilation are functions of the Lorentz factor γ = 1/√(1 − v²/c²), where v is the velocity of a moving object in the observer's frame of reference and c is the velocity of light. The constant c is in turn determined by c = 1/√(ε₀μ₀), derivable from Maxwell's equations, where ε₀ and μ₀ are respectively the dielectric constant (permittivity) and the magnetic permeability of vacuum. These constants contribute to determine the binding forces between atoms and hence contribute to determine the lengths of rods and the oscillation rate of clocks. Our length and time units are determined by certain features of matter, which in turn are in part determined by natural constants attributed to vacuum, i.e., non-matter. Once again we see how closely related the concepts of TIME, DISTANCE and BODY are. Their mutual dependencies, described in Chap. 9, are valid also in relativity theory. It may seem that there is a tension, if not an outright contradiction, between saying that lengths and time intervals ultimately depend on structural properties of matter and 'It is on the possibility of measuring distance that ultimately the whole of dynamics rests.' (from the first quote from Brown at the beginning of this chapter). But this tension is dissolved once we consider how we in practice measure distances and time intervals. We must base our knowledge about the natural world on something we can agree upon irrespective of theoretical convictions, and those are observation reports of positions and motions of bodies. Such reports are based on agreement about the use of measurement devices such as rods and clocks. Agreement about these matters is pivotal and the ultimate basis for physics and for all empirical science, as argued in Sect. 3.8. Scientific knowledge is a communal thing, not primarily the content of subjective mind states. So bodies, in particular those used as measurement devices, are primitive from an empirical point of view. This does not conflict with the fact that the structures and properties of bodies, among them rods and clocks, are derived from our theory of matter, i.e., quantum theory. There is a kind of circularity, but it is not vicious. We have built our theory of matter upon observations of pointer states of measurement instruments, i.e., bodies. Then we use our theory of matter to infer that, e.g., the length of bodies varies with temperature and with other external circumstances. Hence we need to impose, e.g., temperature conditions on bodies used as length units. This in turn is a reason to improve measurements by replacing the meter standard in terms of a physical body with something more stable. Since 1983 the length unit is instead defined as the distance light travels in vacuum during a specified time. One may observe that this


conclusion was in part based on observations and measurements using the length unit defined by the meter prototype. The new definition utilises other bodies, i.e., clocks. The history of time measurements further illustrates my point. The time unit has been changed several times since the beginning of astronomy. The first unit, the sidereal year, was for a long time sufficient for astronomy. With the advance of physics one realised that the length of the sidereal year was a function of several slightly varying factors; hence there is reason to establish a more stable time unit. In short, using the sidereal year as a time unit in the calculation of the motion of the earth in classical mechanics, one realised that this unit, the time for the orbit of the earth around the sun, was not constant. So a better time unit was needed. Presently we define the time unit in terms of a number of oscillations in the radiation emitted in the transition between two hyperfine levels of Cesium-133. This definition doesn't depend on any other physical quantity; it utilises the mathematical concept of natural number and presupposes only our ability to count. Summarising, we must base our empirical theories on observations of bodies and these observations must be agreed upon no matter which physical theories one believes. If no such agreement is possible, nothing resembling modern science is possible. The choice of 'base objects' is informed by our best theories, and may be changed with the advancement of science. There is no vicious circularity in this process.
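A brief numerical sketch (my addition, using the standard SI values) of the definitional chain just described: the second via the Cs-133 transition, the metre via the light-travel definition, c via the vacuum constants, and the Lorentz factor that governs length contraction and time dilation:

```python
import math

# The SI second: 9 192 631 770 periods of the Cs-133 hyperfine transition radiation.
cs133_periods_per_second = 9_192_631_770

# Since 1983 the metre is fixed by the light-travel definition: the distance light
# covers in vacuum in 1/299 792 458 s, so c is a defined constant.
c = 299_792_458.0                     # m/s, exact by definition

# c is also derivable from the vacuum constants of electromagnetism: c = 1/sqrt(eps0*mu0).
mu0 = 4 * math.pi * 1e-7              # vacuum permeability (pre-2019 defined value), H/m
eps0 = 1 / (mu0 * c**2)               # vacuum permittivity, F/m
print(1 / math.sqrt(eps0 * mu0))      # ~299792458.0

# Length contraction and time dilation depend on the same c via the Lorentz factor.
def gamma(v):
    """Lorentz factor 1/sqrt(1 - v^2/c^2) for speed v in m/s."""
    return 1 / math.sqrt(1 - (v / c) ** 2)

print(gamma(0.6 * c))                 # 1.25
```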

17.3 GTR: The Relation Between Spacetime Structure and Matter Distribution

According to the general theory of relativity, GTR, the metric gμν of the spacetime manifold is not any background for descriptions of the dynamics of matter, but part of the dynamics. The metric depends on the matter-energy distribution and, conversely, the matter-energy distribution depends on the metric, as can be seen from Einstein's equation:

Rμν − ½ gμν R + gμν Λ = (8πG/c⁴) Tμν      (17.1)

where the expression on the left hand side is a function of the spacetime metric gμν and Tμν on the right hand side is the matter-energy tensor. This equation does not say anything about which is the entity and which is its attribute, nor does it indicate which is more primary or primitive in some sense. The most obvious reading is that


we have two quantitative expressions3 that are declared to be equal.4 No object-attribute relation can be inferred from this equation. One is justified in concluding that spacetime metric and matter-energy distribution are two sides of the same coin. But what is the coin? Is it the 4-dimensional manifold? No. Is it spacetime? No. Here is how Rovelli explains what general relativity says:

In general relativity, the metric and the gravitational field are the same entity: the metric/gravitational field. In this new situation, . . . what do we mean by spacetime and what do we mean by matter? Clearly, if we want to maintain the spacetime-versus-matter distinction, we have a terminology problem. (Rovelli 1997, 188)

I agree completely. It is common to think that matter is the source of the gravitational field, thus inviting a distinction between two kinds of entities, portions of matter and fields. Clearly, such a conception cannot be justified by reference to Einstein's equation. But source-talk may be viewed as metaphorical. Equation 17.1 tells us that the matter-energy tensor is identical to a function of the metric. So why postulate two kinds of entities? Why think that matter is the source of the gravitational field, why not say the opposite? Or why not simply say that there is only one kind of object in the world, matter fields, which can be given either a description using the mass-energy tensor or one in terms of the metric tensor, with Eq. 17.1 giving us the translation between them, just as Rovelli says. Matter fields may interact when contiguous (being at the same place, taken in a sense relative to the scale). Talk of matter fields interacting with each other presupposes a method for distinguishing them. What could that be? And how to make sense of the notion of two matter fields being at the same place? As in the case of electromagnetism and quantum mechanics, empirical use of the theory requires that we use observable bodies, identified by indexicals together with pointing gestures, as reference points in descriptions of fields and their interactions. Identifying a body is to identify it at a particular place at a particular time. (This was the conclusion in Chap. 9; BODY, PLACE and TIME are mutually dependent predicates.) Once we have this basis in place we can introduce matter fields as entities and then declare clocks and rods to be matter fields satisfying certain specific conditions. The fact that attributes of rods and clocks can be derived from fundamental postulates in relativity theory does not conflict with holding that rods and clocks are primary from an epistemological point of view, as I have pointed out earlier. I have no objection against saying that spacetime is a differentiable 4-dimensional manifold with a certain metric, ⟨M, gμν⟩, since this is a mathematical entity. It is useful in physics, but there is no reason to suppose it corresponds to anything in the natural world, if the presumed corresponding physical entity is taken to be something distinct from matter-energy.

3 Or, rather, since the tensors are symmetric 4×4 matrices, thus containing 10 independent numbers, we have 10 independent equations, each equating two quantitative expressions.
4 The same is true of all equations in physics: they state identities between quantitative expressions. No equation tells by itself which objects are attributed these quantities; see the discussion in Sect. 10.3.
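To make footnote 3 above concrete, here is a small sympy sketch (my illustration) of the count of independent components on each side of Eq. 17.1:

```python
# Both sides of Eq. 17.1 are symmetric 4x4 tensors, so the equation packs
# 10 independent scalar equations relating the two quantitative expressions.
import sympy as sp

n = 4
# Independent components of a symmetric rank-2 tensor in n dimensions: n(n+1)/2.
print(n * (n + 1) // 2)                                   # 10

# Build a generic symmetric 4x4 matrix and count its distinct symbolic entries.
G = sp.Matrix(n, n, lambda i, j: sp.Symbol(f"g_{min(i, j)}{max(i, j)}"))
print(len(G.free_symbols))                                # 10
```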


After all, we are accustomed to using lots of mathematical objects without having any reason to assume that these mathematical objects correspond to things in the real world. Rovelli again:

If we are forbidden to define positions with respect to external objects, what is the physical meaning of the coordinates x and t? The answer is: there isn't one. In fact, it is well known by whoever has applied general relativity to concrete experimental contexts that the theory's quantities that one must compare with experiments are the quantities that are fully independent from x and t. Assume that we are studying the dynamics of various objects (particles, planets, stars, galaxies, fields, fluids). We describe this dynamics theoretically as motion with respect to x and t coordinates, but then we restrict our attention solely to the positions of these objects in relation to each other, and not to the position with respect to the coordinates x and t. The position of any object with respect to the coordinates x and t, which in prerelativistic physics indicates their position with respect to external reference system objects, is reduced to a computational device, deprived of physical meaning, in general relativity. But x and t, as well as their later conceptualizations in terms of reference systems, represent space and time, in pre-relativistic physics: the reference system construction was nothing but an accurate systematization, operationally motivated, of the general notion of matter moving on space and time. In general relativity, such a notion of space and time, or of spacetime, has evaporated; objects do not move with respect to spacetime, nor with respect to anything external; they move in relation to one another. General relativity describes the relative motion of dynamical entities (fields, fluids, particles, planets, stars, galaxies) in relation to one another. (Rovelli 1997, 190)

Thus Rovelli views GTR as vindicating Leibnizian relationalism. I agree; we have no reason to postulate a separate physical entity, curved spacetime; it is a mathematical entity. Adding the postulate that this mathematical entity corresponds to something in the real world doesn't change any predictions from GTR. So viewing spacetime as a physical entity is a superfluous piece of metaphysics.

17.4 Spacetime Functionalism

In 'Minkowski Space-time: A Glorious Non-Entity' (Brown and Pooley 2006), the authors argued against viewing spacetime as a kind of substance and Brown (2005) contains an extended argument for the same conclusion. This has triggered some opponents to develop a view they call 'spacetime functionalism'. Eleanor Knox, for example, reads Brown's book as an argument for relationalism, which she finds an implausible position when it comes to general relativity. She writes:

Many see strong hints in Brown's work that he intends to advocate a new form of relationism. In particular, a focus on the explanatory status of the Minkowski metric makes many of Brown's arguments look like attempts to block an abductive inference to the existence of Minkowski spacetime. For the most part, both arguments against Brown's view, and extensions of the view, take this to be the obvious reading. But not all aspects of Brown's book suggest relationism. I'll argue in what follows that, while the standard reading makes a great deal of sense when applied to his statements about Minkowski spacetime, it commits Brown to an implausible relationism in the case of general relativity. This leaves us with another option: Brown's dynamical relativity is best read not


as a salvo in the substantivalism wars, but rather as an account of what counts as spacetime structure in a given physical theory. (Knox 2019, 119)

As its name suggests, spacetime functionalism identifies spacetime with a function: “the metric field is spacetime because of what it does [. . .] and not by way of what it is” Knox (2019, 120). One wonders what could be meant by the expression ‘the metric field is spacetime because of what it does’? It seems implausible to interpret Knox as thinking of spacetime as a mathematical object, for then one would ask, how could a mathematical object do anything? Mathematical objects are no agents. Perhaps the intended meaning is that we humans do something with the metric field, viz., we use it for predictions and calculations? But it seems more plausible to read Knox as taking for granted that the metric field, identified as spacetime, is a physical entity. Another proponent of functionalism writes: I find the overall framework of spacetime functionalism attractive. Many geometrical structures are used in physics – state space, momentum space, configuration space. Not all of these deserve the name ‘spacetime’ (Brown 2005, Section 8.2). Nor is it plausible that the difference between spacetime and these other geometries is a primitive one. It must have something to do with each geometric structure’s role in the laws. As Knox rhetorically asks, “If our conceptual grasp on spacetime is not exhausted by the role it plays in our theory, what might the extra ingredient be?” (Knox 2019, 121, 9–10) I can’t think of anything. (Baker 2019)

The last reflection in the quotation above indicates that Knox and Baker do not realise the consequence of Löwenheim-Skolem's theorem called Russell's insight: 'The semantic power of language to represent derives from the semantic power of context-sensitive expressions' (Luntley 1999, 285), see Sect. 3.5. As I have pointed out several times in this book, we need indexicals and gestures in order to connect a theory to something in the real world. Knox and Baker, on their side, assume that the meaning of 'spacetime' is fully given by its role within our theory. That is true so long as we take for granted that spacetime is a purely mathematical object. But wasn't the very goal of functionalism to reject that view? In Sect. 7.5 I discussed how models and theories get physical meaning and concluded that "the physical meaning of the terms 'model' and 'theory' as these words are used by physicists, comes from the units attached to the variables occurring in the expressions used. These units are defined, either directly, or in most cases indirectly, via descriptions of measurement procedures." And we use indexicals and pointing gestures when identifying what these descriptions describe. The Lorentzian manifold ⟨M, gμν⟩ is a mathematical object used for calculating spacetime intervals. Spacetime intervals are expressed in length units such as meters or lightyears. These units are not parts of the Lorentzian manifold with its metric. It is we humans who attach units to the numbers calculated with the use of the Lorentzian manifold when reporting the results of measurements and calculations. And this is, again, necessitated by a consequence of Löwenheim-Skolem's theorem: a theory in itself cannot tell what it is about. We humans determine its physical meaning by using non-theoretical resources, indexicals and gestures, when applying the theory in concrete cases.
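A minimal sketch (my example, restricted to the flat Minkowski case) of the point that the units are attached by us: the metric only turns coordinate differences into a number, which we then report in metres when applying the theory:

```python
import numpy as np

c = 299_792_458.0                                   # m/s
eta = np.diag([-1.0, 1.0, 1.0, 1.0])                # Minkowski metric, signature (-,+,+,+)

def interval_squared(event_a, event_b):
    """Spacetime interval s^2 between two events given as (t [s], x, y, z [m])."""
    dt, dx, dy, dz = np.array(event_b) - np.array(event_a)
    d = np.array([c * dt, dx, dy, dz])              # coordinate differences in length units
    return d @ eta @ d                              # a pure number until we add 'm^2'

# Two events 1 microsecond and 200 m apart: timelike separation (negative s^2 here).
print(interval_squared((0, 0, 0, 0), (1e-6, 200.0, 0, 0)))   # about -4.99e4 (m^2)
```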


17.5 String Theory and Spacetime

17.5.1 The Dimensionality of Space

Until the start of string theory around 1980 no one seriously contemplated the possibility that space has more than three dimensions and spacetime four. (Relationalists interpret the claim that space has three dimensions as short for the claim that the description of the motion of material objects requires three independent coordinates, thus not postulating any physical object space.) But string theory changed this. Consistency requires that string theory is done on a 10-dimensional manifold and it has often been assumed that this entails that the physical world has more dimensions than the 'observable' three + time.5 The additional six dimensions are 'compactified', which means that they are rolled up in very small circles of size comparable to the Planck length. Hence these extra dimensions will never be 'observed', in the sense that we will never be able to observe objects whose trajectories require 10 independent coordinates. Since a closed string 'living' on such a compactified dimension is attributed length, these strings are treated as real physical objects. I have two misgivings about this train of thought. The first and most obvious one is that the reasoning has a clear tone of substantivalism; the 10-dimensional spacetime is often talked about as if it were a physical entity. I have above given my reasons for rejecting that idea in the case of 4D spacetime and they apply independently of its dimensionality. We certainly need the concept of spatial distance when talking about the motions of bodies, but there is no need to conceive distances as entities, no matter the dimensionality of the position function. The second doubt is that even if we accept that spacetime is a physical entity, there is no reason to think that a 10-dimensional differentiable manifold, the mathematical entity in use in string theory, must correspond to a physical entity having the same number of dimensions. Why think so? Isn't it possible that some of these dimensions in the manifold merely represent internal degrees of freedom? A simple parallel from classical mechanics might help explain the idea. A full description of the total state of a molecule, for example NH3, requires much more than giving its position and velocity in 3D space; in addition we need to state its vibration and rotation states, and these are internal degrees of freedom. No one would infer that space has more than 3 dimensions just because the state evolution of many molecules requires more than 3 spatial and 3 linear momentum coordinates. The number of independent degrees of freedom is not the same as the dimensionality of space. Since I do not recognise space or spacetime as physical entities, nor spatial distances as properties, conceived as universals, of anything, the question 'How

5 I have put quotation marks around 'observable' to indicate that dimensions of space are not really observable. What we observe are bodies moving around and we need three independent coordinates to describe trajectories of bodies.


many dimensions has spacetime?' may simply be rejected as being based on a false presupposition. But one may rephrase the question as 'How many independent parameters are needed for describing the motion of physical objects?' No one questions that a description of the trajectory of a body or particle needs three spatial coordinates + time. Vibrations and rotations of extended bodies do not affect their trajectories. Hence the degrees of freedom associated with such internal states give us no reason to assume that objects move from place to place in more than three dimensions. The empirical meaning of the claim that space is three-dimensional and time one-dimensional is that trajectories in space are three-dimensional (and trajectories in state-space are six-dimensional, since the state of a particle is given by its position and momentum in 3D space), taking time as the independent parameter. Analogously, if strings are real physical objects which satisfy the fundamental equations of motion in string theory, we should not be astonished that more degrees of freedom are needed. Strings, if they exist, are extended objects; hence they may be ascribed internal degrees of freedom. But that is no reason to assume that there is an entity spacetime, no matter what number of dimensions we attribute to this purported entity. Some string theorists claim that the distinction between motion in spacetime and internal degrees of freedom is not relevant at the scales of string theory; all dimensions should be treated similarly. Well, that suits me fine, since I recognise spacetime as nothing but a mathematical entity, as already said. Mathematical entities are useful for calculations, but they do not tell us anything about what exists in the world. So I have no objection against treating all 10 dimensions in the manifold used in string theory similarly. But I see no reason to assume that there is a physical entity corresponding to a mathematical object spacetime, whatever its dimensions. String theory doesn't say anything about the number of independent parameters needed for defining continuous trajectories of observable physical objects, which would be the dimensionality of a physical space, if such a thing existed. Another argument for the view that the 10-dimensional manifold needed in string theory is just a mathematical object is the existence of dualities, in particular the AdS-CFT correspondence. It says that a theory on an Anti-de Sitter (AdS) space of n dimensions is physically equivalent to a conformal field theory (CFT) on its boundary hypersurface of n − 1 dimensions. So the fact that string theory is formulated in a 10-dimensional AdS space doesn't say anything about the dimensionality of physical space, since there is a dual conformal field theory formulation which has exactly the same physical content. This strongly suggests that the dimensionality of the mathematical space in which a theory is formulated has no direct consequences for the number of dimensions needed in a dynamical theory of strings. (The argument was already discussed in Sect. 7.2.2.) The topic is further discussed in e.g. Matsubara and Johansson (2018).
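A back-of-the-envelope sketch (my addition) of the degrees-of-freedom point made with the NH3 example: extra dynamical parameters are not extra dimensions of the space in which trajectories run:

```python
def classical_dof(n_atoms, linear=False):
    """Split the 3N coordinates of an N-atom molecule into translation/rotation/vibration."""
    total = 3 * n_atoms                      # positions in ordinary 3D space
    translation = 3                          # motion of the centre of mass
    rotation = 2 if linear else 3
    vibration = total - translation - rotation
    return translation, rotation, vibration

# NH3: 4 atoms, non-linear -> 12 coordinates = 3 translational + 3 rotational + 6 vibrational.
print(classical_dof(4))                      # (3, 3, 6)

# Only the 3 translational coordinates (plus time) describe where the molecule moves;
# the other 9 are internal degrees of freedom, not extra dimensions of space.
```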


17.5.2 String Theory and GTR

String theory has still, after 40 years, not been able to make any new testable predictions, and one may be forgiven for being skeptical about its testability at all. The relevant length scale of string theory is far below what we are able to observe. It has been said that an experimental test would require an accelerator bigger than the known universe. However, some argue that we have at least one piece of empirical evidence for string theory since we can derive GTR from it. Thus Zwiebach writes:

String theorists sometimes say that string theory already made at least one successful prediction: it predicted gravity! [. . .] There is a bit of jest in saying so—after all, gravity is the oldest known force in nature. I believe, however, that there is a very substantial point to be made here. String theory is the quantum theory of a relativistic string. In no sense whatsoever is gravity put into string theory by hand. It is a complete surprise that gravity emerges in string theory. Indeed, none of the vibrations of the classical relativistic string correspond to the particle of gravity. It is a truly remarkable fact that we find the particle of gravity among the quantum vibrations of the relativistic string. (Zwiebach 2009, 11)

This would have some force if string theory were the only known way to derive GTR. But Jaksland (2019) has shown that GTR can be obtained in several other ways. The obvious conclusion is that the derivability of GTR from string theory has no evidentiary value. Jaksland writes:

Jaksland continues: If this holds, i.e., if general relativity is multiply realizable in quantum gravity, then the “remarkable fact that we find the particle of gravity among the quantum vibrations of the relativistic string” and that “general relativity is naturally incorporated in the theory” are no guarantee that the stringy account of the world is on the right track! More generally, the present paper argues that there are indications that theories of quantum gravity can recover general relativity without their posited microstructure being even approximately that of the actual world.

Jaksland’s argument is convincing and my conclusion is, so far, that string theory is nothing but mathematics. Adding string theory to our two fundamental theories, the standard model and GTR, has no testable consequences not derivable from GTR or the standard model alone. There is no object in string theory that even indirectly can be connected to anything in the observable world. Thus string theory fails the criterion for being an empirical theory. In conclusion, we may once again rehearse the lesson from Löwenheim-Skolem’s theorem: no theory by itself can tell us what it is about. We need observable things and events identified by indexicals together with pointing gestures. Strings, branes and other theoretical entities in string theory lack connections, so far as we know, to such things. In this respect it differs profoundly both from GTR and the standard


model of QFT. These latter theories are also highly theoretical and use a lot of abstract mathematics. But in GTR and QFT it is possible to identify some parts of the mathematical apparatus with observable things and events and that gives those theories physical meaning.

17.6 Summary

When discussing our understanding of relativity theory it is of utmost importance to distinguish between mathematics and physics, in particular to distinguish between mathematical and physical geometry. The fundamental mathematical concepts used in GTR are SPACETIME, DIFFERENTIABLE MANIFOLD and METRIC. All three are primarily mathematical entities and the core issue is whether they represent, or refer to, physical counterparts. There is no good reason to identify any of these with entities in the physical world. What exist according to GTR are the usual physical things, particles, bodies, stars, galaxies, etc. They can be described using either matter-energy predicates or functions of the spacetime metric, and Einstein's equation gives us the connection between these descriptions. String theory is nothing but mathematics; so far, no observable consequences can be drawn from string theory. This is not only a deplorable fact concerning its lack of evidence; it is also the reason why no physical meaning can be attributed to string theory.

Chapter 18

Summary and Conclusions

Abstract This chapter summarises the main line of thought of the entire book. The starting points are four:
1. Empirical evidence is the only evidence there is, or could be, for any scientific theory.
2. Nominalism: assuming universals is superfluous.
3. A thoroughly Kantian view on the relation between objects and cognitive and linguistic acts. A perspective from outside, a God's eye point of view on the relation between things in the world and the human mind, is impossible. Objects are discerned and distinguished from each other in our cognitive and linguistic acts. The act is primary in relation to its object. This applies to all cognition of objects, visible and invisible, concrete and abstract ones.
4. Löwenheim-Skolem's theorem: no theory can by itself say what it is about. In order to give a definite interpretation of a theory in terms of objects, events and states of affairs in the real world, we need to use predicates of the theory together with indexicals and gestures; this determines the references of some singular terms.

As I wrote in the preface, there are two aims of this book: (i) to present a thoroughly empiricist position in epistemology, and (ii) to apply it in philosophy of physics. My version of empiricism is presented at the end of Chap. 3. When a name for this position is convenient it may be called 'nominalistic empiricism'. The two core elements of this position are: (i) empirical evidence is the only evidence there can be for any theory, and (ii) nominalism. Empirical evidence consists, ultimately, of token observation statements agreed upon by people on the spot. Subjective experiences cannot be evidence for anything just because they are subjective; it is when we express our experiences using declarative sentences that we get something intersubjective, which may be agreed upon as true and as being evidence for this or that hypothesis or theory. Explanatory force, simplicity, coherence or other purported virtues of a theory are no reasons for belief in that theory. No such features can be empirical evidence.


Nominalism is the thesis that there are no universals. General terms may be true of a number of things, i.e., have non-empty extensions, but they lack reference. This thesis is motivated by Ockham's razor. If we to a first order theory, which has a number of empirical consequences, add the postulate that some of its general terms refer to properties and/or relations, we don't get any new testable predictions. So properties and relations are superfluous entities; we have no need for them. This version of nominalism does not generally exclude abstract things. Those abstract objects that satisfy the condition for being individuals are accepted, if quantified over in a theory we hold true. An important category of abstract individuals are numbers. This empiricist outlook is combined with (i) a Kant-inspired view on the relation between knowledge-acts and objects of knowledge, and (ii) a thorough application of Löwenheim-Skolem's theorem to questions about relations between theory and reality. Kant held that objects are the results of our cognitive acts; the knowledge act, the judgement, is prior to the object of the act. The cognitive act is not a cognition of any pre-existing and definite object; instead the object is discerned and singled out as an individual in the cognitive act. Objects of cognition are in a sense our constructions. This is his famous 'Copernican revolution'. The classical empiricist conception of cognitive acts as some kind of causal processes from the real objects to the mind is based on a God's eye perspective on the relation between the world and the human mind. But it is the human mind that is taking this perspective, so it is an incoherent position. Modern metaphysical and scientific realists make the same mistake. Being an empiricist I endorse Kant's view that objects are the results of our cognitive actions; objects are objects for us. Kant discussed mainly cognition of visible physical objects, but the conclusion is valid for the cognition of all things, concrete and abstract, visible or not. Therefore we should not think of the results of our cognitive acts as re-presentations of objects. It is more correct to say that they are presentations. Translating this into semantics we get that truth is prior to reference, both in the epistemic and the logical sense. Holding a sentence of the form 'a is F' as being true is to claim that there is at least one object satisfying F. When this general term has divided reference (not being a mass term), we, implicitly or explicitly, adopt a principle of individuation among the objects satisfying it. In other words, use of general terms with divided reference is our way of structuring the world into distinct objects. An independent reason for viewing truth as primary is the context principle, first formulated by Frege: only in the context of a complete sentence can a term have a well defined meaning, hence a determinate reference. Why then bring Löwenheim-Skolem's theorem into the picture? The reason is mainly negative, viz., to reject a number of versions of the view that a physical theory is an abstract structure, often viewed as a set of models. If we start with this perspective, we may ask, how do we know that this abstract structure is about physical phenomena? How do we know that it is about anything at all? How do we compare its models to physical reality? Löwenheim-Skolem's theorem tells us


that cannot be done without using some extra-theoretical items, i.e., indexicals used together with pointing gestures; using such devices we can establish, empirically, the reference relation between some singular terms and their references. This fits well with the thesis that empirical evidence at bottom consists of token declarative sentences uttered in concrete situations and combined with gestures so that people on the spot can agree about the reference of the indexicals used in those sentences.

Mathematical knowledge and empiricism The topic of Chap. 4 is mathematical knowledge and the existence of mathematical objects. Two questions are important from an empiricist point of view: (i) how is it possible to get any knowledge about mathematical objects, and (ii) what is the relation between mathematics and physics? My answer to the first question is based on an application of the Kantian view on objects: mathematical objects, as all objects, are the results of human cognitive acts; therefore it is possible to obtain knowledge about them. So, for example, when we hold true that every natural number has a successor, we thereby say that the predicate 'natural number' has non-empty extension. So when stating mathematical axioms we thereby affirm the existence of things satisfying these axioms. As to the second question, it is uncontroversial to view mathematics as a tool when doing physics; we use mathematics when calculating consequences of our hypotheses and often it is no problem to distinguish between mathematical and physical statements. But when using geometrical concepts the picture is sometimes blurred. For an empiricist it is crucial to observe that geometrical concepts such as SPACE, SPACETIME and DISTANCE can be given either a mathematical or a physical reading. We must carefully distinguish between mathematical and physical geometry. Having done that we may realise that there is no good reason for any empiricist to hold that e.g., SPACETIME is a physical entity. It is a useful mathematical entity without counterpart in the real world.

Empiricism and general philosophy of science In Chaps. 5, 6, and 7 I have discussed some topics in general philosophy of science, viz., induction, explanation and scientific realism, from the perspective of my nominalistic empiricism. In Chap. 5 I argue that inductive thinking is a natural habit, and habits need no justification. Hence the demand for a general justification of induction is rejected; it is a rationalistic mistake to require justification of all beliefs. Goodman's view that the induction problem should be formulated as the question which predicates are projectible and which are not is a step forward, but he overlooked the fact that there are two predicates in an inductive conclusion. The question is what pairs of predicates are used in a successful inductive generalisation. In his example of green or grue emeralds, the question is which pairing of predicates can be expected to succeed in predictions, ⟨emerald, green⟩ or ⟨emerald, grue⟩?


In Chap. 6 I discussed explanation, theory reduction and unification, the two latter usually regarded as forms of explanation. These topics do not belong to epistemology, since no form of explanation has evidentiary value according to empiricism. Still we often require explanations in science; that is also a natural habit. We may well ask for explanations and desire stronger and stronger unity in physics and in science in general. But success in these efforts cannot be regarded as evidence for anything.

The topic of Chap. 7 was the realism-antirealism debate in current philosophy of science. Scientific realists minimally adhere to two doctrines, viz., (i) that scientific theories in the mature sciences are approximately true and (ii) that central terms in these theories refer. I agree on (i) but not on (ii), in so far as 'central term' is understood as 'general term'. Scientific realists have to my knowledge never made the distinction between singular and general terms in their arguments for scientific realism, but it is clear from the debate that by 'term' they mean 'general term'. A central topic in the debate between scientific realists and anti-realists is the force of the realist argument that the success of modern science is best explained by the theories being true. Anti-realists have responded by using the underdetermination argument. I find neither the inference to the best explanation nor the underdetermination argument convincing. First, explanations have no evidentiary value. Second, the underdetermination argument takes for granted that two different but empirically equivalent theory formulations express different theories, a presupposition which I doubt. Using so-called proxy functions, introduced by Quine, i.e., truth-preserving mappings from one set of objects in one theory onto another set of objects in another theory, one may regard these two apparently different theories as merely different theory formulations, provided they are empirically equivalent. The underdetermination argument has force only if one accepts the existence of universals, which I don't. Another topic of debate is the question whether theoretical entities exist. This focus is in my view a mistake. The distinction observable/theoretical is, as van Fraassen has pointed out, a pragmatic one; it changes with time as technological applications of new scientific achievements become available. A more useful distinction is that between individuals and universals. This distinction is made in terms of the semantic distinction singular-general term, which can be applied universally to all declarative sentences that can be paraphrased in first order predicate logic. Individuals are the referents of singular terms occurring in true sentences; universals, if they exist, are the referents of general terms. And, to repeat, an adherent of nominalistic empiricism holds that universals are superfluous; postulating universals as referents of the general terms in a theory we hold true doesn't change the theory's predictive power. No loss is suffered by dismissing universals.


Empiricism in philosophy of physics The topic of Chap. 8 is causation in physics. Many philosophers hold that causal relations somehow make up the 'glue' that holds the world together. I disagree. First, in sentences of the form 'x causes y', 'cause' is a two-place predicate, and according to nominalism predicates have no references; hence there are no causal relations. Secondly, when we say of an event or state of affairs E1 that it is a cause, this should be understood as short for an expression of the form 'E1 is the cause of E2'. The effect E2 is often not mentioned since it is taken to be known as background to the uttering of a token of this sentence. So we have no need for the one-place predicate '. . . is the cause' when ordinary discourse is regimented into first order predicate logic; hence we have no need to quantify over objects satisfying it. There are no causes in my world-view.

In Chap. 9 I argued three things. First, that the connection between theory and reality consists of sentences expressed in concrete situations where several people on the spot can agree on the truth of the uttered sentence. The referent of the singular term or terms is in such situations identified by the use of indexical words joined with gestures. It is necessary and sufficient that people on the spot can agree on what thing one talks about when hearing expressions such as 'this detector' or 'that instrument'. The objects talked about are primarily bodies. A strong argument for the necessity of the use of indexicals and gestures is derived from Löwenheim-Skolem's theorem. Secondly, I showed that the three concepts SPACE, TIME and BODY are mutually dependent; each of them is defined in terms of the other two. Furthermore, ultimate evidence in physics consists of observation sentences about bodies at times and places. Thirdly, I argued that Gauss was right in claiming that the fundamental quantities of physics are DISTANCE, TIME and MASS. All other quantities can be defined in terms of these. The fact that one can, in modern physics, reduce for example distances to times, using the constancy of the velocity of light, does not refute Gauss' claim, since the constancy of the velocity of light is a consequence of classical electromagnetism; it is not a basic observable fact.

Chapter 10 contains my empiricist account of laws. The core problem with laws is to explain our intuition that laws are not only true, but necessarily so, in contrast to true accidental generalisations. My explanation is built on the analysis of epistemologically fundamental laws, viz., those laws that are implicit or explicit definitions of physical concepts. Non-fundamental laws are universal statements being logical consequences of the fundamental laws. The necessity of definitions and logical consequences of such definitions is unproblematic both for the empiricist and his opponents. No metaphysical assumptions about powers, necessitation relations, dispositions etc., are needed. In order to avoid delving into quantified modal logic, which entails a distinction between necessary and accidental attributes of objects, I construe necessity as a modifier of the semantic predicate 'true'. The statement 'The law of momentum conservation is necessarily true' is paraphrased as 'It is necessarily true: "In


all closed systems the total momentum is conserved."' By this move we block distribution of 'necessarily' into a quantified sentence, since the entire expression being the argument for 'necessarily true' is a name without any semantic parts.

Chapter 11 is a discussion of the ontology of electromagnetism. The common view is that electromagnetism describes how charged particles interact with electromagnetic fields, thus assuming a double ontology of fields and particles. Such a double ontology is deeply problematic, which is shown by the fact that consistency requires the introduction of so-called self-fields and that charged particles interact with their own self-fields also in the absence of fields emanating from other particles. This must be wrong, and the conclusion is that in classical electromagnetism one can either quantify over particles and regard fields as mere mathematical auxiliaries to use in calculations, or swap ontology and quantify over fields, viewing particles as mathematical auxiliaries. The fundamental laws of electromagnetism, Maxwell's equations plus the Lorentz force law, allow both ways of giving an ontology to the theory. In relativistic quantum electrodynamics this choice is however not possible. A convincing argument by Malament shows that a particle interpretation of quantum electrodynamics is impossible. Our only possible choice is the field ontology; hence we are brought to quantum field theory, in which particles are field quanta, not individual things. So there is an ontological tension between classical and relativistic quantum electrodynamics. This tension is related to the measurement problem, if not at bottom the same thing.

Chapter 12 is a defense of the propensity interpretation of transition probabilities in quantum theory. These transition probabilities can be calculated without knowledge about frequencies, whereas in all other cases we need knowledge about frequencies when determining propensities. Hence, there is no point in introducing the concept of propensity in those latter cases; it would be a superfluous addition to the vocabulary. But transition probabilities in quantum mechanics are different.

Chapter 13, about the direction of time, introduces the important distinction between time reversal symmetry and dynamical reversibility. Time reversal symmetry has no consequences for what can and what cannot happen in the world; it is merely a formal requirement, justified by the observation that the time reversal operator T: t → −t merely changes the direction of the coordinate axis labelled 'time'. It is part of CPT symmetry, which is a consequence of the demand that the physical content of any theory should be invariant under a set of coordinate transformations. Dynamical reversibility, on the other hand, means that it is possible to reverse a real state change and force a system back to its initial state by some kind of intervention. Many arguments in the debate about the direction of time have conflated these two things. The direction of time is the fact that a lot of state transitions in the physical world are dynamically irreversible, and this enables one to order many sequences of states using the predicate '. . . later than . . .' without having any clock. The global direction of time consists of the amalgamation of many such overlapping sequences of states. There is no such thing as 'time in itself'. As all other quantities, TIME is a general term lacking reference.


Chapter 14 is devoted to questions about identity and individuation in quantum theory. The three statistical distributions, Maxwell-Boltzmann, Fermi-Dirac and Bose-Einstein statistics, give different principles of individuation for particles. Which statistics an ensemble satisfies can be decided by purely empirical means; hence empirical observations are used for answering questions of individuation and identity among macroscopic particles, fermions and bosons.

Chapter 15 contains a discussion of the wave character of quantum systems. The conclusion is that all quantum systems behave as a sort of wave during propagation, but interact as particles. This is illustrated and explained in that chapter.

Chapter 16 is devoted to the measurement problem. We observe collapses, i.e., discrete, random and irreversible state changes during some measurements. Many philosophers think this cannot really be true of measurement interactions; they assume that all state evolution is continuous, deterministic and reversible, thus adhering to one or other no-collapse interpretation of the measurement process. But I see no good reason to take for granted that all state changes must be continuous; this is to assume that Schrödinger’s equation is valid for all dynamical evolution, including measurement interactions, despite observations to the contrary. That is not compatible with empiricism. We observe states of quantum systems that cannot be the result of continuous evolution from the prepared states. I have in this chapter shown that one can derive, using only the standard quantum formalism, the collapse from the principle of discreteness of interaction, which is a well-established empirical fact. Thus the projection postulate is shown to be superfluous.

Chapter 17 is a discussion of spacetime, relativity theory and gravitation. My main point is the same as Rovelli’s, viz., that spacetime structure and mass-energy distribution are two sides of the same coin. Einstein’s equation says that two quantitative functions are identical. It doesn’t say anything about which is the object and which are its attributes. When trying to express what Einstein’s equation says in non-mathematical language, we can either say that the spacetime structure has a certain mass-energy distribution, or reverse the roles and say that the object which exists is the mass-energy distribution, of which a spacetime structure is predicated. GTR doesn’t decide the matter; in fact, GTR is difficult to formulate in terms of objects with attributes. This conclusion is relational in spirit, but in fact it goes one step further. Classical relationalism says that space and time points are reducible to attributes of material objects, which are assumed to be the objects in the world. The inverted view was never considered by Leibniz and his followers. The preconception of the world as consisting of things with properties is in stark conflict with the structure of general relativity. When we try to express the content of Einstein’s equation in ordinary declarative sentences, we are forced to say either that the world consists of material objects spread out in space, or that it is a spacetime field with certain ‘bumps’ here and there. True declarative sentences by their nature require objects, and this might be viewed as a result of linguistic evolution; conceptualising the immediate environment in this way may have been profitable for survival and reproduction.
But the large-scale structure of the universe is so different from anything we experience in ordinary life that the difficulty of expressing the fundamental law of GTR in terms of objects and attributes is just what one might expect.
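To make the point about Einstein’s equation concrete, here is its standard textbook form (conventional notation, not a quotation from Chap. 17):

\[
R_{\mu\nu} - \tfrac{1}{2}R\,g_{\mu\nu} = \frac{8\pi G}{c^{4}}\,T_{\mu\nu}.
\]

The left-hand side is built from the metric and its derivatives, i.e., the spacetime structure; the right-hand side is the stress-energy tensor, i.e., the mass-energy distribution. The equation asserts the identity of two tensor fields on the manifold, and nothing in it singles out one side as the object and the other as its attribute.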

The chapter ends with some reflections on string theory. Since no testable consequences of string theory that are not derivable from the standard model or relativity theory have so far been derived, we have no empirical evidence for it. Furthermore, lacking such consequences, an empiricist may view string theory as nothing but mathematics. There is nothing, so far as we know, that connects it to the physical world. Using Löwenheim-Skolem’s theorem once again, the theory could be interpreted to be about numbers and nothing else.
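For reference, the version of Löwenheim-Skolem’s theorem relied on here and in Chap. 9 can be stated as follows (a standard model-theoretic formulation, not the book’s own wording): if a first-order theory in a countable language has an infinite model, it also has a model whose domain is the set of natural numbers, obtained by transporting the structure along a bijection onto \(\mathbb{N}\). Taken purely formally, then, a theory cannot by itself determine what its quantifiers range over; something extra-linguistic, such as indexical words and gestures used in concrete situations, is needed to fix its subject matter.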

Bibliography

Achinstein, P. (1981). Can there be a Model of Explanation? Theory and Decision, 13, 201–227. Achinstein, P. (1984). The pragmatic character of explanation. In PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association, 1984, Volume Two: Symposia and Invited Papers (pp. 275–292) The University of Chicago Press on behalf of the Philosophy of Science Association. Acuna, P. (2016). Minkowski spacetime and Lorentz invariance: The cart and the horse or two sides of a single coin? Studies in History and Philosophy of Science, Part B, 55, 1–12. Albert, D. Z. (1992). Quantum mechanics and experience. Cambridge, MA: Harvard University Press. Albert, D. Z. (2000). Time and chance. Cambridge, MA: Harvard University Press. Albeverio, S., & Blanchard, P. (2014). Direction of time. Cham: Springer International Publishing. Aristoteles, & Barnes, J. (1975). Aristotle’s posterior analytics. Oxford: Clarendon Press. Aristoteles, & Ross, W. D. (1936). Aristotle’s physics: A revised text with introduction and commentary. Oxford: Clarendon Press. Aristotle, & Ackrill, J. L. (1963). Aristotle’s Categories and De Interpretatione. Oxford: Clarendon. Armstrong, D. M. (1983). What is a law of nature? Cambridge: Cambridge University Press. Auyang, S. Y. (1995). How is quantum field theory possible? New York: Oxford University Press. Azzouni, J. (2004). Deflating existential consequence: A case for nominalism. New York: Oxford University Press. Baker, D. J. (2019). On Spacetime Functionalism. http://philsci-archive.pitt.edu/15860/. Balashov, Y., & Janssen, M. (2003). Presentism and relativity. British Journal for the Philosophy of Science, 54, 327–346. Barcan Marcus, R. (1993). Modalities. Philosophical Essays. New York: Oxford University Press. Barnes, E. (1992). Explanatory unification and the problem of asymmetry. Philosophy of Science, 54(4), 558–571. Barrett, J. A. (2001). On the Nature of Measurement Records in Relativistic Quantum Field Theory. http://philsci-archive.pitt.edu/00000197/. Batterman, R. W. (2002). The devil in the details. Oxford: Oxford University Press. Batterman, R. W. (2010). On the explanatory role of mathematics in empirical science. The British Journal for the Philosophy of Science, 61, 1–25. Bauer, G., & Dürr, D. (2001). The Maxwell-Lorentz system of a rigid charge. Annales Henri Poincaré, 2, 179–196. Becker Arenhart, J. R. (2013). Wither away individuals. Synthese, 190(16), 3475–3494.


Becker Arenhart, J. R., & Krause, D. (2014). From primitive identity to the non-individuality of quantum objects. Studies in History and Philosophy of Modern Physics, 46, 273–282. Beeson, M. (1985). Foundations of constructive mathematics. Berlin/Heidelberg: Springer. Belot, G. (2007). Is classical electrodynamics an inconsistent theory? Canadian Journal of Philosophy, 37, 263–282. Berlin, B., & Kay, P. (1969). Basic color terms: Their universality and evolution. Berkeley, CA: University of California Press. Bigelow, J., Ellis, B., & Lierse, C. (1992). The world as one of a kind: Natural necessity and laws of nature. British Journal for the Philosophy of Science, 43(3), 371–388. Billinge, H. (2003). Did bishop have a philosophy of mathematics? Philosophica Mathematica, 11(2), 176–194. Bird, A. (2007). Nature’s metaphysics, laws and properties. Oxford: Oxford University Press. Bishop, E., & Bridges, D. S. (1985). Constructive analysis. Berlin: Springer. Bohr, N. (1928). The quantum postulate and the recent development of atomic theory. Nature, 121, 580–589. Bohr, N. (1951). Discussions with Einstein on epistemological problems in atomic physics. In P. A. Schilpp (Ed.), Albert Einstein: Philosopher-scientist (pp. 199–241). New York: Tudor Publishing Company. Born, M. (1924). Einstein’s theory of relativity. London: Methuen. Boyd, R. (1984). The current status of scientific realism. In J. Leplin (Ed.), Scientific realism (pp. 41–82). Berkeley: University of California Press. Bridgman, P. W. (1960). The logic of modern physics. New York: Macmillan. Brown, H. R. (2005). Physical relativity: Spacetime structure from a dynamical perspective. Oxford: Oxford University Press. Brown, H. R., & Pooley, O. (2006). Minkowski space-time: A glorious non-entity. In D. Dieks (Ed.), The ontology of spacetime [I] (pp. 67–92). Amsterdam: Elsevier. Bueno, O. (2014). Nominalism in the philosophy of mathematics. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (spring 2014 ed.). Metaphysics Research Lab, Stanford University. Cantor, G. (1883). Grundlagen einer allgemeinen mannigfaltiglehre. ein mathematischphilosophisher versuch in der leher de unendlichen. Leipzig: Teubner. Carnap, R. (1928/2003). The logical structure of the world and pseudoproblems in philosophy. Chicago and La Salle: Open Court. Carnap, R. (1938). Logical foundations of the unity of science. In International encyclopedia of unified science (Vol. 1, pp. 42–62). Chicago: University of Chicago Press. Carnap, R., & Gardner, M. (1995). An introduction to the philosophy of science. New York: Dover. Carroll, L. (1895). What the tortoise said to Achilles. Mind, 4(14), 278–280. Carroll, J. W. (1994). Laws of nature. Cambridge: Cambridge University Press. Cartwright, N. (1980). Measuring position probabilities. In P. Suppes (Ed.), Studies in the foundations of quantum mechanics. East Lansing: Philosophy of Science Association. Cartwright, N. (1989). Nature’s capacities and their measurement. Oxford: Clarendon Press. Chaitin, G. J. (1975). A theory of program size formally identical to information theory. Journal of the Association for Computing Machinery, 22(3), 329–340. Chen, R. (1990). The real wave function as an integral part of Schrödinger’s basic view on quantum mechanics. Physica B, 167, 183–184. Chen, R. (1993). Schrödinger’s real wave equation (continued). Physica B, 190, 256–258. Coveney, P., & Highfield, R. (1991). The arrow of time: The quest to solve science’s greatest mystery. London: Flamingo. Creath, R. (2014). 
Logical empiricism. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy, Metaphysics Research Lab, Stanford University, (spring 2014 ed.). Davidson, D. (1973a). On the very idea of a conceptual scheme. In Proceedings and Adresses of the American Philosophical Association (Vol. 47, pp. 5–20) Davidson, D. (1973b). Radical interpretation. In Inquiries into truth and interpretation (pp. 125– 140). Oxford: Clarendon Press.

Davidson, D. (2001). Epistemology and truth. Subjective, intersubjective, objective. Oxford: Clarendon Press. Davidsson, D. (1991). Epistemology externalized. Dialectica, 45(2–3), 191–202. Dicke, R. H., & Wittke, J. P. (1960). Introduction to quantum mechanics. Reading: AddisonWesley Publishing Company. Dorato, M., & Morganti, M. (2011). Grades of individuality. A pluralistic view of identity in quantum mechanics and in the sciences. Philosophical Studies, 163(3), 591–610. Dretske, F. (1977). Laws of nature. Philosophy of Science, 44, 248–268. Duff, M. (2014). How Fundamental are Fundamental Constants? https://arxiv.org/pdf/1412.2040. pdf. Duff, J. B., Okun, L. B., & Veneziano, G. (2002). Trialogue on the number of fundamental constants. Journal of High Energy Physics, 3, 23. Dummett, M. (1993). The seas of language. Oxford: Clarendon Press. Dunstan, D. J. (2008). Derivation of special relativity from Maxwell and Newton. Philosophical Transactions of the Royal Society A, 366(1871), 1861–1865. Eagle, A. (2007). Pragmatic causation. In H. Price & R. Corry (Eds.), Causation, physics, and the constitution of reality: Russell’s republic revisited (Chap. 7). Oxford: Oxford University Press. Earman, J. (1974). An attempt to add a little direction to the problem of the direction of time. Philosophy of Science, 41, 15–47. Earman, J. (2002). Laws, Symmetry, and Symmetry Breaking; Invariance, Conservation Principles, and Objectivity? http://philsci-archive.pitt.edu/878/1/PSA2002.pdf. Earman, J., & Roberts, J. (1999). Ceteris Paribus, there is no problem of provisos. Synthese, 118(3), 439–478. Edgington, D. (1995). On conditionals. Mind, 104, 235–329. Einstein, A. (1905a). Ueber einen die Erzeugung und Verwandlung des Lichtes betreffenden heuristischen Gesichtspunkt. Annalen der Physik, ser 4, 17, 132–148. Einstein, A. (1905b). Zur Elektrodynamik bewegter Körper. Zeitschrift für Physik, 17(10), 891– 921. Einstein, A. (1919(1954)). What is the theory of relativity? (pp. 227–232). New York: Bonanza. Einstein, A. (1921(1954)). Geometrie und Erfahrung, Erweite Fassung des Festvortrages gehalten an der Preussischen Akademie.Trans. by S. Bargmann as ‘Geometry and Experience’ (pp. 232– 246). New York: Bonanza. Ellis, B. (1999). Causal powers and laws of nature. In H. Sankey (Ed.), Causation and laws of nature (pp. 19–35). Dordrecht: Kluwer Academic Publishers. Faye, J., & De Regt, H. W. (2019). Introduction: Norms, naturalism, and scientific understanding. Journal for General Philosophy of Science, 50, 323–326. Feferman, S. (1998). In the light of logic. New York: Oxford University Press. Feferman, S. (2005). Predicativity. In S. Shapiro (Ed.), The Oxford handbook of philosophy of mathematics and logic (pp. 590–624). New York/Oxford: Oxford University Press. Feynman, R. P. (1967). The character of physical law. Cambridge, MA: MIT Press. Feynman, R., Leigthon, R., & Sands, M. (1964). The Feynman lectures on physics II. Reading: Addison-Wesley Publishing Company. Field, H. H. (1980). Science without numbers: A defence of nominalism. Oxford: Blackwell. Frank, P. (1947). Einstein. His life and times. New York: Alfred A. Knopf. French, S. (1989). Identity and individuality in classical and quantum physics. Australasian Journal of Philosophy, 67(4), 433–446. French, S., & Krause, D. (2006). Identity in physics. A historical, philosophical, and formal analysis. Oxford: Oxford University Press. French, S., & Redhead, M. (1988). Quantum physics and the identity of indiscernibles. 
The British Journal for the Philosophy of Science, 39, 233–246. French, S., & Redhead, M. (1989). Why the principle of the identity of indicernibles is not contingently true either. Synthese, 78, 141–166. Friedman, M. (1974). Explanation and scientific understanding. Journal of Philosophy, 71(1), 5–19.


Friedman, M. (2001). Dynamics of reason. Stanford: CSLI Publications. Frisch, M. (2005). Inconsistency, asymmetry, and non-locality. A philosophical investigation of classical electrodynamics. Oxford: Oxford University Press. Frisch, M. (2008). Conceptual problems in classical electrodynamics. Philosophy of Science, 75, 93–105. Gadamer, H.-G. (1989). Truth and method (2nd revised ed.). London: Sheed and Ward. Gauss, C. F. (2011). Werke, volume 8. Cambridge: Cambridge University Press. Gemes, K. (1994). Explanation, unification and content. Nous, 28(2), 225–240. Gibson, R. F. (2004). The Cambridge companion to quine. Cambridge: Cambridge University Press. Goodman, N. (1946). The problem of counterfactual conditionals. Journal of Philosophy, 44, 113–128. Goodman, N. (1955). The new riddle of induction. In Fact, fiction and forecast (pp. 59–83). Cambridge, MA: Harvard University Press. Goodman, N. (1972). A world of individuals. In Problems and projects (pp. 155–172). BobsMerrill company. Greenberg, O. W. (2002). CPT violation implies violation of Lorentz invariance. Physical Review Letters, 89(23), 1602. Haack, S. (2009). Evidence and inquiry. A pragmatist reconstruction of epistemology (second expanded ed.). Amherst/New York: Prometheus Books. Halliwell, J. J., Pérez-Mercader, J., & Zurek, W. H. (1994). Physical origins of time asymmetry. Cambridge: Cambridge University Press. Halonen, I., & Hintikka, J. (1999). Unification: It’s magnificent but is it explanation? Synthese, 120(1), 27–47. Halvorson, H. (2019). The logic in philosophy of science. Cambridge: Cambridge University Press. Halvorson, H., & Clifton, R. (2002). No place for particles in relativistic quantum theories? Philosophy of Science, 69, 1–28. Hankinson, R. J. (1998). Cause and explanation in ancient greek thought. Oxford: Clarendon. Hawking, S. W., & Mlodinow, L. (2010). The grand design. London: Bantam. Heisenberg, W. (1930). The physical principles of the quantum theory. Chicago: Chicago University Press. Hellman, G. (1989). Mathematics without numbers: Towards a modal-structural interpretation. Oxford: Clarendon Press. Hempel, C. (1970). On the ‘standard conception’ of scientific theories. In M. Radner & S. Winokur (Eds.), Minnesota studies in the philosophy of science (Vol. 4, pp. 142–163). Minneapolis: University of Minnesota Press. Hempel, C., & Oppenheim, P. (1948). Studies in the logic of explanation. Philosophy of Science, 15, 135–175. Hitchcock, C. (1995). Discussion: Salmon on explanatory relevance. Philosophy of Science, 62(2), 304–320. Horwich, P. (1987). Asymmetries in time: Problems in the philosophy of science. Cambridge, MA: MIT Press. Hume, D. (1986). Treatise on human understanding. London: Penguin. Humphreys, P. (1995). Why propensities cannot be probabilites. The Philosophical Review, 94(4), 557–570. Hwang, T.-Y. (1972). A new interpretation of time reversal. Foundations of Physics, 2(4), 315– 326. Jaksland, R. (2019). The multiple realizability of general relativity in quantum gravity. Synthese. Johansson, L.-G. (1992). Understanding quantum mechanics: A realist interpretation without hidden variables. Stockholm: Almqvist & Wiksell International. Diss. Stockholm: Univerity, 1993. Johansson, L.-G. (2009). Propensities. In L.-G. Johansson, J. Österberg, & R. Sliwinski (Eds.), Logic, ethics and all that Jazz. Essays in honour of Jordan Howard Sobel (Uppsala philosophical studies, Vol. 57, pp. 161–175). Uppsala: Uppsala University.


Johansson, L.-G. (2017). The ontology of electromagnetism. Studia Philosophica Estonia, 10(1), 25–44. Johansson, L.-G. (2018). Induction and epistemological naturalism. Philosophies, 3(4), 31. Johansson, L.-G. (2019). An empiricist view on laws, quantities and physical necessity. Theoria, 85(2), 69–101. Johansson, L.-G., & Matsubara, K. (2011). String theory and general methodoloby: A mutual evaluation. Studies in History and Philosophy of Modern Physics, 42, 199–210. Jones, T. (1995). How the unification theory of explanation escapes asymmetry problems. Erkenntnis, 43(2), 229–240. Kant, I., Timmermann, J., & Klemme, H. F. (2003). Die drei Kritiken. Kritik der reinen Vernunft . . . Hamburg: Meiner. Kitcher, P. (1981). Explanatory unification. Philosophy of Science, 48(4), 507–531. Kitcher, P. (1989). Explanatory unification and the causal structure of the world. In W. Salmon & P. Kitcher (Eds.), Minnesota studies in the philosophy of science (Vol. 13, pp. 410–505). Minneapolis: University of Minnesota Press. Knox, E. (2019). Physical relativity from a functionalist perspective. Studies in History and Philosophy of Science Part B, 67, 118–124. Komech, A., & Spohn, H. (2000). Long-time asymptotics for the coupled Maxwell-Lorentz equations. Communications in Partial Differential Equations, 25, 559–584. Konopinski, E. J. (1969). Classical descriptions of motion. The dynamics of particle trajectories, rigid rotations and elastic waves. San Francisco: W.H. Freeman and Company. Krause, D. (2010). Logical aspects of quantum (non-)individuality. Foundations of Science, 15(1), 79–94. Kreisel, G. (1967). Mathematical logic: What has it done for the philosophy of mathematics? In R. Shoenman (Ed.), Bertrand Russell. Philosopher of the century (pp. 201–272). London: George Allen & Unwin. Kuhn, T. S. (1970). The structure of scientific revolutions (2nd ed.). Chicago: University of Chicago Press. Kuhn, T. (1978). Black-body theory and the quantum discontinuity 1894–1912. Chicago: University of Chicago Press. Kyburg, H. (1990). Science & reason. Oxford: Oxford University Press. Kyburg, H. (1997). Quantities, magnitudes, and numbers. Philosophy of Science, 64(3), 377–410. Ladyman, J., & Bigaj, T. (2010). The principle of the identity of indiscernibles and quantum mechanics. Philosophy of Science, 77, 117–136. Ladyman, J., & Ross, D. (2007). Every thing must go. Metaphysics naturalized. Oxford: Oxford University Press. Lange, M. (2002). An introduction to the philosophy of physics: Locality, fields, energy and mass. Oxford: Blackwell. Lange, M. (2009). Laws and lawmakers. Oxford: Oxford University Press. Lear, J. (1980). Aristotelian infinity. Proceedings of the Aristotelian Society, New Series, 80, 187–210. Lees, J. P. et al. (BABAR collaboration). (2012). Observation of time—Reversal violation in the B0 Meson system. Physical Review Letters, 109, 211801. Leplin, J. (1984). Scientific realism. Berkeley: University of California Press. Lewis, D. (1983). New work for a theory of universals. Australasian Journal of Philosophy, 61, 343–377. Lewis, D. (1986). Counterfactuals. Oxford: Basil Blackwell. Lewis, D. (1999). Humean supervenience debugged. In Papers in metaphysics and epistemology. Cambridge: Cambridge University Press. Lindblad, G. (1983). Non-equilibrium entropy and irreversibility. Dordrecht: Reidel. Lorentz, H. A. (1952). The principle of relativity: A collection of original memoirs on the special and general theory of relativity. London: Dover. Lüders, G. (1957). Proof of the TCP theorem. 
Annals of Physics, 2, 1–15.


Luntley, M. (1999). Contemporary philosophy of thought: Truth, world, content. Oxford: Blackwell. Mach, E. (1960). The science of mechanics: A critical and historical account of its development (6th ed.). LaSalle: Open Court. Mackey, M. C. (1993). Time’s arrrow. The origins of thermodynamic behavior. New York: Springer. Mahulikar, S. P., & Herwig, H. (2009). Exact thermodynamic principles for dynamic order existence and evolution in chaos. Chaos, Solitons and Fractals, 41(4), 1939–1948. Malament, D. (1996). In defense of dogma: Why there cannot be a relativistic quantum mechanics of (localizable) particles. In R. Clifton (Ed.), Perspectives on quantum reality (pp. 1–10). Dordrecht: Kluwer Academic Publishers. Margenau, H. (1958). Philosophical problems concerning the meaning of measurement in physics. Philosophy of Science, 25, 23ff. Matsubara, K., & Johansson, L.-G. (2018). Spacetime in string theory: A conceptual clarification. Journal for General Philosophy of Science, 49, 333–353. Maudlin, T. (2007). The metaphysics within physics. Oxford: Oxford University Press. Maxwell, J. C. (1873). A treatise on electricity and magnetism. Oxford: Clarendon. McCall, S. (1984). Counterfactuals based on real possible worlds. Nous, 18, 463–477. Mersini-Houghton, L., & Vaas, R. (2012). The arrows of time: A debate in cosmology. Berlin/Heidelberg: Springer. Misra, B., Prigogine, I., & Courbage, M. (1979). Lyapounov variable: Entropy and measurement in quantum mechanics. Proceedings of the National Academy of Sciences of the United States of America, 76, 4768–4772. Monton, B. (2007). Images of empiricism. Essays on science and stances, with a reply from Bas C. van Fraassen. Oxford: Oxford University Press. Moreland, J. P. (1998). Theories of individuation: A reconsideration of bare particulars. Pacific Philosophical Quarterly, 79, 51–63. Morrison, M. (2000). Unifying scientific theories: Physical concepts and mathematical structures. New York: Cambridge University Press. Muller, F. (2007). Inconsistency in classical electrodynamics? Philosophy of Science, 74, 253–277. Muller, F., & Saunders, S. (2008). Discerning fermions. British Journal for the Philosophy of Science, 59, 499–548. Muller, F., & Seevinck, M. P. (2009). Discerning elementary particles. Philosophy of Science, 76(2), 179–200. Mumford, S. (2004). Laws in nature. London: Routledge. Musgrave, A. (1985). Realism versus constructive empiricism. In P. M. Churchland & C. Hooker (Eds.), Images of science: Essays on realism and empiricism, with a reply from Bas C. van Fraassen (Chap. 9). Chicago: University of Chicago Press. Nagel, E. (1961). The structure of science: Problems in the logic of scientific explanation. New York: Harcourt, Brace and World. Nagel, E. (1970). Issues in the logic of reductive explanations. In H. E. Kiefer & M. K. Munitz (Eds.), Mind, science and history (pp. 117–137). Albany: State University of New York Press. Nagel, T. (1986). The view from nowhere. New York: Oxford University Press. Neurath, O. (1932). Protokollsätze. Erkenntnis, 3, 204–214. Newton, I., Cohen, I. B., & Whitman, A. M. (1687/1999). The principia: Mathematical principles of natural philosophy. Berkeley: University of California Press. Newton-Smith, W. H. (1981). The rationality of science. Boston: Routledge & Kegan Paul. Norman, J., Sylvan, R., & Priest, G. (1989). Paraconsistent logic: Essays on the inconsistent. München: Philosophia. Norton, J. D. (2003). Causation as Folk Science. www.philosophersimprint.org/003004/. Omnés, R. (1994). 
The interpretation of quantum mechanics. Princeton: Princeton University Press. Pargetter, R. (1984). Laws and modal realism. Philosophical Studies, 46, 335–347.


Passmore, J. (1967). Logical positivism. In P. Edwards (Ed.), The encyclopedia of philosophy (Vol. 5, pp. 52–57). New York: Macmillan. Pataut, F. (1998). Incompleteness, constructivism and truth. Logic and Logical Philosophy, 6, 63–76. Pearl, J. (2000). Causality. Models, reasoning, inference. Cambridge: Cambridge University Press. Penrose, R. (2005). The road to reality. A complete guide to the laws of universe. London: Vintage Books. Pietsch, W. (2010). On conceptual issues in classical electrodynamics. Prospects and problems of an action-at-a-distance interpretation. Studies in History and Philosophy of Science Part B, 41, 67–77. Planck, M. (1900). Zur Theorie des Gesetzes der Energiverteilung im Normalspektrum. Verhandlungen der Deutschen Physikalischen Gesellschaft, 2, 237–345. Poincaré, H. (1906). Les mathématiques et la logique. Revue de métaphysique et de morale, 14, 294–317. Poincaré, H. (1952). Science and hypothesis. New York: Dover publication. Popper, K. (1990). A world of propensities. Bristol: Thoemmes. Popper, K. (1992). The logic of scientific discovery. London: Routledge. Popper, K., & Schilpp, P. A. (1974). The philosophy of Karl Popper (1st ed.). La Salle: Open court. Price, H. (1996). Time’s arrow & archimedes’ point: New directions for the physics of time. New York: Oxford University Press. Prigogine, I., & Stengers, I. (1997). The end of certainty: Time, chaos and the new laws of nature. New York: The Free Press. Putnam, H. (1980). Models and reality. The Journal of Symbolic Logic, 45(3), 464–482. Putnam, H. (1981). Reason, truth and history. Cambridge: Cambridge University Press. Quine, W. V. O. (1951). Two dogmas of empiricism. The Philosophical Review, 60, 20–43. Quine, W. V. O. (1960). Word and object. Cambridge, MA: MIT Press. Quine, W. V. O. (1969). Natural kinds. In Ontological relativity and other essays (pp. 114–138). New York: Columbia University Press. Quine, W. V. O. (1976a). A logistical approach to the ontological problem. In The ways of paradox and other essays (2. enlarged and revised ed., pp. 197–202). Cambridge, MA: Harvard University Press. Quine, W. V. O. (1976b). Implicit definition sustained. In The ways of paradox and other essays (2. enlarged and revised ed., pp. 133–136). Cambridge, MA: Harvard University Press. Quine, W. V. O. (1976c). Three grades of modal involvement. In The ways of paradox and other essays (revised and enlarged ed., pp. 158–176). Cambridge, MA: Harvard University Press. Quine, W. V. O. (1981a). Goodman’s ways of worldmaking. In Theories and things (pp. 96–99). Cambridge, MA: Belknap Press of Harward University Press. Quine, W. V. O. (1981b). Things and their place in theories. In Theories and things. Cambridge, MA: The Belknap Press of Harvard University Press. Quine, W. V. O. (1981c). What price bivalence? In Theories and things (pp. 31–37). Cambridge, MA: The Belknap Press of Harvard University Press. Quine, W.V. O. (1992). Structure and nature. The Journal of Philosophy, 89(1), 5–9. Raatikainen, P. (2004). Conceptions of truth in intuitionism. History and Philosophy of Logic, 25, 131–145. Rausch, H., Wölwitsch, H., Clothier, R., Kaiser, H., & Werner, S. A. (1992). Time of flight neutron interferemetry. Physical Review, A46, 45–57. Ray, J. (1693). Synopsis methodica animalium quadrupedum et serpentini generis. Londini: Impensis S. Smith & B. Walford. Redhead, M. (1990). Explanation. In D. Knowles (Ed.), Explanation and its limits (pp. 135–54). Cambridge: Cambridge University Press. Reichenbach, H. (1920). 
Relativitätstheorie und Erkenntnis Apriori. Berlin: Springer. Reichenbach, H., & Reichenbach, M. (1999). The direction of time. Mineola: Dover. Roberts, J. T. (2008). The law-governed universe. Oxford: Oxford University Press.


Ross, D., & Spurrett, D. (2007). Notions of cause: Russell’s thesis revisited. The British Journal for the Philosophy of Science, 58(1), 45–76. Rothman, M. A. (1989). Discovering the natural laws. New York: Dover. Rovelli, C. (1997). Halfway through the woods: Contemporary research on space and time. In J. Earman & J. D. Norton (Eds.), The cosmos of science (pp. 180–223). Pittsburgh: University of Pittsburgh Press. Rovelli, C. (2004). Quantum gravity. Cambridge: Cambridge University Press. Russell, B. (1906). On some difficulties in the theory of transfinite numbers and order types. Proceedings of London Mathematical Society, 4, 29–53. Russell, B. (1913). On the notion of cause. Proceedings of the Aristotelian Society, 13, 1–26. Russell, B. (1919). Introduction to mathematical philosophy. London: Routledge. Salmon, W. C. (1984). Scientific explanation and the causal structure of the world. Princeton: Princeton University Press. Salmon, W. (1989). Four decades of scientific explanation. Minneapolis: University of Minnesota Press. Salmon, W. (1997). Causality and explanation: A reply to two critiques. Philosophy of Science, 64(3), 461–477. Salmon, W. (2001). A realist account of causation. In M. Marsonet (Ed.), The problem of realism (pp. 106–133). Burlington: Ashgate. Sarkar, S. (2015). Nagel on reduction. Studies in History and Philosophy of Science, 53, 43–56. Saunders, B. (2000). Revisiting basic color terms. Journal of the Royal Anthropological Institute, 6, 81–99. Saunders, S. (2003). Physics and Leibniz’s principles. In K. Brading & E. Castellani (Eds.), Symmetries in physics: Philosophical reflections (pp. 289–397). Cambridge: Cambridge University Press. Saunders, S. (2006). Are quantum particles objects? Analysis, 66, 52–63. Savitt, S. F. (1995). Time’s arrows today: Recent physical and philosophical work on the direction of time. Cambridge: Cambridge University Press. Schwarz, J. T. (2006(1966)). The pernicious influence of mathematics on science. In R. Hersch (Ed.), 18 unconventional essays on the nature of mathematics (Chap. 13, pp. 231–235). New York: Springer. Scorzato, L. (2014). A simple model of scientific progress. In L. Felline, F. Paoli, & E. Rossanese (Eds.), New developments in logic and philosophy of science (Vol. 3, pp. 45–57). College Publications. Searle, J. R. (1995). The construction of social reality. New York: Free Press. Segal, I. E. (1964). Quantum fields and analysis in the solution manifolds of differential equations. In W. Martin & I. E. Segal (Eds.), Proceedings of a Conference on the Theory and Applications of Analysis in Function Space (pp. 129–153). Cambridge, MA: MIT Press. Sellars, W., Rorty, R., & Brandom, R. (1997). Empiricism and the philosophy of mind. Cambridge, MA: Harvard University Press. Simon, H. (1970). The axiomatization of physical theories. Philosophy of Science, 37(1), 16–26. Smolin, L. (2006). The trouble with physics: The rise of string theory, the fall of a science, and what comes next. Boston: Houghton Mifflin Co. Spade, P. V., & Panaccio, C. (2019). William of Ockham. In E. N. Zalta (Ed.), The Stanford encyclopedia of philosophy (spring 2019 ed.). Metaphysics Research Lab, Stanford University. Spelke, E., Vishton, P., & Von Hofsten, C. (1995a). Object perception, object-directed action and physical knowledge in infancy. In The cognitive neurosciences (pp. 165–179). Cambridge, MA: MIT Press. Spelke, E., Kestenbaum, R., Simons, D., & Wein, D. (1995b). Spatiotemporal continuity, smoothness of motion and object identity in infancy. 
British Journal of Developmental Psychology, 13, 113–142. Spelke, E., Gutheil, G., & Van de Valle, G. (1995c). The development of object perception. In Visual cognition: An invitation to cognitive science (Vol. 2, pp. 297–330). Cambridge, MA: MIT Press.


Sundholm, G. (1983). Constructions, proofs and the meaning of logical constants. Journal of Philosophical Logic, 12, 151–172. Suppes, P. (2003). Representation and invariance of scientific structures. Stanford: CSLI. Tegmark, M. (2008). The mathematical universe. Foundations of Physics, 38(2), 101–150. Tolman, R. C. (1979). The principles of statistical mechanics. New York: Dover Publications. Tooley, M. (1977). The nature of law. Canadian Journal of Philosophy, 7, 667–698. Vallentyne, P. (1988). Explicating lawhood. Philosophy of Science, 55, 598–613. van Fraassen, B. C. (1977). The only necessity is verbal necessity. Journal of philosophy, LXXIV(2), 71–85. van Fraassen, B. C. (1980). The scientific image. Oxford: Oxford University Press. van Fraassen, B. C. (1985). Empiricism in the philosophy of science. In P. M. Churchland & C. Hooker (Eds.), Images of science: Essays on realism and empiricism, with a reply from Bas C. van Fraassen (Chap. 11). Chicago: University of Chicago Press. van Fraassen, B. C. (1989). Laws and symmetry. Oxford: Clarendon. van Fraassen, B. C. (2002). The empirical stance. New Haven: Yale University Press. van Fraassen, B. C. (2008). Scientific representation: Paradoxes of perspective. Oxford: Oxford University Press. van Fraassen, B. C., & Churchland, P. M. (1985). Images of science: Essays on realism and empiricism, with a reply from Bas C. van Fraassen. Chicago: University of Chicago Press. Vickers, P. (2008). Frisch, Muller, and Belot on an inconsistency in classical electrodynamics. British Journal for the Philosophy of Science, 59(4), 767–792. von Neumann, J. (1932/1996). Mathematische Grundlagen der Quantenmechanik (2. aufl., [nachdr. der ausg.]) Berlin: Springer, 1932 ed. Berlin: Springer. Weber, E., & van Dyck, M. (2002). Unification and explanation: A comment on Halonen and Hintikka, and Schurz. Synthese, 131(1), 145–154. Weinberg, S. (1977). The search for unity, notes for a history of quantum field theory. Daedalus, 106(4), 17–35. Weinberg, S. (1995). The quantum theory of fields: Volume 1, foundations. Cambridge: Cambridge University Press. Weyl, H. (1952/1918). Space, time and matter. New York: Dover Publications. Wheeler, J. A., & Feynman, R. P. (1949). Classical electrodynamics in terms of direct interparticle action. Reviews of Modern Physics, 21, 425–433. Whitehead, A. N., & Russell, B. (1910). Principia mathematica. Cambridge: Cambridge University Press. Wienerkreis. (1929). Wissenschaftliche Weltauffassung der Wiener Kreis (The Scientific Conception of the World: the Vienna circle). http://evidencebasedcryonics.org/pdfs/viennacircle.pdf. Wigner, E. (1960). The unreasonable effectiveness of mathematics in the natural sciences. Communications in Pure and Applied Mathematics, 13(1), 1–14. Wigner, E. (1963). The problem of measurement. American Journal of Physics, 31, 6–15. Wittgenstein, L. (1953). Philosophical investigations. Oxford: Blackwell. Wittgenstein, L., & Russell, B. (1974). Tractatus Logico-Philosophicus. London: Routledge & Kegan Paul. Wittgenstein, L., von Wright, G. H., & Anscombe, G. E. M. (1969). Über Gewissheit: On Certainty. Oxford: Blackwell. Woodward, J. (1992). Realism about laws. Erkenntnis, 36, 181–218. Woodward, J. (2003). Making things happen: A theory of causal explanation. Oxford: Oxford University Press. Wynn, K. (1990). Children’s understanding of counting. Cognition, 36, 155–193. Yin, J. et al. (2017). Satellite-based entanglement distribution over 1200 kilometers. Science, 356(6343), 1140–1144. Zee, A. (2003). 
Quantum field theory in a Nutshell. Princeton: Princeton University Press. Zeh, H. D. (2007). The physical basis of the direction of time (5th ed.). Berlin/Heidelberg: Springer. Zwiebach, B. (2009). A first course in string theory. Cambridge: Cambridge University Press.

Index

A Acceleration is motion of motion, 17 Accidental generalisations, 144 Achinstein, P., 43 Action-at-a-distance, 179 Actual infinity, 67 Actuality and potentiality, 17 Acuna, P., 115 AdS-CFT duality, 108, 264 Advanced solution, 205 Agreement on observation reports, 49, 77, 128, 258 Albert, D., 198, 240 Albeverio, S., 198 Analytic-synthetic distinction, 35, 156 Aristotelian view on forces, 13 Aristotle, 4, 14, 15, 26, 67, 85, 87 Aristotle on change, 16 Aristotle on directions, 16 Aristotle on individuals vs. universals, 62 Aristotle on time and change, 133 Aristotle’s categories, 17 Aristotle’s forms, 16 Armstrong, D., 139 Auyang, S., 129, 169 Axiom of choice, 64 Axioms are implicit definitions, 36 Azzouni, J., 54

B Babar collaboration, 202 Balashov & Jansen, 113 Barcan Marcus, R., 109 Barrett, J., 180

Basic statements, 80 Batterman, R., 44, 121 Bauer & Durr, 176 Bayes’ theorem, 190 Becker Arenhart, J., 218 Beeson, M., 70 Belot, G., 170, 176, 177 Berkeley, G., 56 Berlin & Kay, 83 BE statistics, 220 Bigaj, T., 218 Bird, A., 139 Bishop, E., 68 Black-body radiation, 242 Bodies are empirically primitive, 257, 258 Bodies as basic in ontology, 128 Body, 131 Bohr, N., 130, 195, 243, 254 Boltzmann, L., 212 Born, M., 159 Born rule, 230 Bose-Einstein condensate, 221 Bosons are not individuals, 220 Boyd, R., 8, 42 Boyle’s law, 85 Bridgeman, P., 156 Brouwer, L., 68, 172 Brown, H., 158, 258, 261 Burali-Forti paradox, 59

C Cantor, G., 68 Cardinal vs. ordinal number as applied to quanta, 218


Carnap, R., 33, 34, 106, 147, 164 Carroll, L., 79 Carroll, J. W., 140 Cartwright, N., 115, 193, 238 Cauchy, A.L., 45 Causation and relativity theory, 123 Causation as strict regularity, 119 Cause as category, 120 Causes and forces, 8 Causes, context dependence, 123 Ceteris paribus clauses, 142 Chaitin, G., 223 Chance as quantitative attribute, 184 Chance resulting from incomplete information, 186 Change, Aristotle’s view, 209 Charles, J., 85 Chen, R., 231 Cheshire cat, 107 Circularity, non-vicious, 258 Clifton, R., 179, 180, 218 Clocks have two parts, 212 Collapse is indeterministic, 189 Collapse of the wave function, 234 Colour concepts, 83 Colour spectrum divided differently, 83 Colour words, learning, 78 Compactified dimensions, 108, 263 Comte, A., 29 Concepts signify, 47 Conditional propensities, 190 Conditionals vs. conditional probabilities, 192 Conservation laws, 161 Conservation of charge, 161 Conservation of momentum, 19, 152 Constancy of velocity of light, 20, 158 Constants of the Standard Model, 137 Constructive empiricism, 38, 39 Constructivism in mathematics, 54, 55 Context principle, 49 Coordination principles, 148 Correspondence principle, 246 Corroboration, 74 Counterfactuals, 144 Coveney, P, 198 CPT symmetry, 198, 200 Creath, R., 34

D Davidson, D., 49, 77 Definite proportions, 86 Derived laws, 154 Descartes, R., 18, 28, 152

Index Determinism vs. predictability, 187 Determinism versus predictability, 10 Deterministic events, 186 Deterministic laws, 186 Dielectricity constant, 258 Dimensionless equations, 136 Dirac’s delta function, 46 Direction of time, 9, 198 Discreteness of interactions, 225, 245 Disease distinctions, 84 DN-model, 91 Dorato, M., 218 Double blind testing, 51 Dretske, F., 139 Duff, M., 135 Duhem-Quine thesis, 147 Dummett, M., 59 Dunstan, D., 20, 158 Dynamical reversibility, 198, 201 Dynamical reversibility, definition of, 207 Dynamics, empirical basis for, 30

E Eagle, A., 122 Earlier than, 206 Earman, J., 140, 142, 198 Eddington’s solar eclipse expedition, 128 Edgington, D., 193 Eigenstate, 234 Einstein, A., 20, 155, 157, 188, 215, 243, 244, 257 Einstein’s equation, 9, 178, 259 Ellis, B., 139 Emeralds, 84 Empirical adequacy, 38 Empirical substructure, 38 Empiricism, 27 Empiricist principle and empirical knowledge support each other, 48 Empiricist stance, 40 Energy is discretized, 21 Entanglement, 11, 222 Entropy, 198 Epistemic interpretation, 247 Epistemic norms, 75 Epistemologically fundamental vs. logically fundamental, 137 Epistemological naturalism, 28, 81, 98 Epistemology a social enterprise, 49 Epistemology is not first philosophy, 28 Equipartition principle, 219 Essential pronouns, 67 Evidence, 82

Evidence and reason for, 41 Explanation as unification, 44 Explanation of concepts, 4 Explanation of phenomena, 4 Explanations are perspective dependent, 42 Explanatory force, 42 Explicit definitions, 154

F Falsificationism, 74 FD statistics, 220 Feferman, S., 55, 60, 64, 66 Fermions are not individuals, 220 Feynman, R., 10, 20, 158, 176 Fictionalism, 54 Field, H., 54 Force, 31 Force, definition of, 153 Forces as causes, 124 Forces, rejection of, 30 Forms of intuition, 29, 57 Foundationalism, epistemic, 76 Frank, P., 129 Frege, G., 49, 64 French, S., 180, 218 Friedman, M., 92, 139, 148 Frisch, M., 170, 175 Fundamental law, definition of, 163 Fundamental laws, 142, 154 Fundamental laws are implicit definitions, 86 Fundamental laws, epistemic vs. logical ones, 86 Fundamental quantitative predicates, 135 Fundamental units, 151

G Gadamer, H.-G., 12 Galilean relativity, 19 Galilei, G., 18, 158 Gauss, C.F., 67, 136, 156 Gay-Lussac, J.L., 85 General law of gases, 85, 122 Genidentity, 130, 206 Genuine randomness, 187, 194 Geometry, mathematical vs. physical, 256 Gibbs entropy definition, 210 Gibson, R., 37 Gödel’s incompleteness theorem, 61, 69 Goldbach’s conjecture, 55 Goodman, N., 47, 63, 83, 144 Greenberg, O., 201 Grice’ rule, 75

Grue, 83, 84

H Haack, S., 41 Halliwell, J., 198 Halvorsen, H., 179, 180, 218 Hankinson, R., 87 Harmonic oscillator, 112 Hawking, S., 102 Heisenberg’s matrix mechanics, 6 Heisenberg, W., 247 Hellman, G., 54 Hempel, C., 164 Hempel & Oppenheim, 92 Hermeneutics, 12 Hermitian operators, 246 Hidden variables, 10 Hilbert, D., 64 Hilbert space , 189 Hilbert’s programme, 33 Hitchcock, C., 123 Holism, 51 Holistic constructivism, 56 Horizon of understanding, 12 Horwich, P., 198 H-theorem, 211 Hume, D., 4, 27, 56, 73, 81, 119 Humpreys, P., 190 Huygens, C., 19, 30, 80, 152 Hwang, T., 203, 206

I Ideas, 27 Identical particles, 217 Identity criterion, 7, 65 Identity criterion for bodies, 131, 206 Identity criterion for theories, 105 Implicit definition of predicates, 79–81, 86, 115 Impressions, 27 Incompatible observables, 247 Independent acceptability, 93 Indeterminacy relations, 231 Indeterminism, 234 Indeterminism and quantum of action, 195 Indexicals, 35, 95, 129, 137, 200, 265 Indiscernability of identicals, 132 Individuals, 62 Individuation of forces, 111 Individuation of properties, 5 Individuation of quantum systems, 219 Individuation, part of theory, 248

Induction, a heuristic device, 87 Induction, Aristotle’s conception, 87 Induction as concept formation, 86 Induction problem, 5, 83 Inductive habits, 50 Inductive inference, 38 Influences vs. interactions, 240 Inference to the best explanation, 39, 146 Interactions, 240, 247 Interactions are discretised, 228 International Union of Pure and Applied Physics (IUPAP), 151 Introspection is infallible, 27 Irreversible state change, 235

J Jaksland, R., 265 Johansson, L.-G., 264 Justification, 82

K Kant, existence is not a predicate, 54 Kant, I., 9, 29, 56, 76, 119, 121, 138, 219 Kant’s transcendentalism, 29 Kinematical quantities, 18 Kitcher, P., 94 Knox, E., 261 Kolmogorov’s axioms, 183, 188 Komech & Spohn, 176 König’s paradox, 59 Konopinski, E., 153 Krause, D., 218 Kreisel, G., 61 Kuhn, T., 41, 150, 166, 243 Kyburg, H., 147, 166

L Ladyman, J., 106, 107, 218 Lange, M., 139, 170 Law of causality, 121 Law of excluded middle, 55 Law of gravitation, 143, 154 Laws are implicit definitions, 156 Laws as instruments for predictions, 33 Lear, J., 68 Length contraction, 258 Leplin, J., 102 Levi-Civita, T., 21 Lewis, D., 139, 184 Lindblad, G., 211

Locke, J., 56 Logical positivism, 35, 101 Lorentz’ law, 159 Lorentz invariance entails CPT invariance, 201 Löwenheim-Skolem’s theorem, 34, 95, 107, 113, 137, 262, 265 Lüders, G., 201 Luntley, M., 35, 95, 129 Lyapunov operators, 212

M Mach, E., 4, 29, 35 Mackey, M., 198 Magnetic permeability, 258 Mahulikar, S., 213 Malament, D., 170, 179 Mapping a theory onto another one, 104 Margenau, H., 238 Mass, 30, 31 Mass, definition of, 80, 152 Mass, inertial and gravitational, 154 Mathematical platonism, 54 Mathematical representation, change of, 199 Matsubara, K., 264 Matter and form, 17 Matter fields as entities, 260 Maudlin, T., 140, 141 Maxwell, J., 159 Maxwell’s equations, 20, 159, 204 Maxwell’s first equation, 174 MB statistics, 219 Measurement of the second kind, 249 Measurement problem, 11 Measurements of continuous observables, 249 Measurements of the first and second kind, 233 Mersini-Houghton, L., 198 Metric in curved spacetime, 215 Mill, J.S., 29 Misra, B., 211 Mixture of macroscopic and microscopic concepts, 190 Models, 103, 112 Model theory, 112, 114 Momentum conservation, discovery of, 80, 152 Moreland, J., 218 Morgantini, M, 218 Morrison, M., 94 Motion, forced vs. natural, 16 Motion, teleological in Aristotle’s physics, 17 Muller, F., 170, 176, 218 Mumford, S., 139 Musgrave, A., 43

N Nagel, E., 92, 94 Nagel, T., 219 Natural constants, 135 Naturalised epistemology, 75 Naturalism, 9 Natural kinds, 48 Natural law, discovery of, 85 Negentropy, 213 Neumann, J., 222 Neurath’s boat, 49, 58, 82 Neutrino, discovery of, 248 New Principal Principle, 184 Newton’s second law, 144 Newton, I., 18, 19, 30, 40, 80, 153, 158 Newton on induction, 40 Newton’s laws, 31 Newton-Smith, W., 74 No entity without identity, 7, 58, 64, 163, 171 Noether’s theorem, 161, 248 Nominalism, 26, 63, 110 Nominalistic empiricism, 51 Nominalist theory of reference, 26 No-miracle-argument, 43, 102 Nomological necessity, 10, 144 Non-local correlations, 224 Non-locality, 11 Non-standard analysis, 46 Normativity of epistemology, 75 Norton, J., 121

O Objectified credence, 184, 187 Objective chance, 183 Objectivity vs. intersubjectivity, 187 Objectivity as intersubjective agreement, 57 Objects as phenomena, 56 Objectual quantification, 109 Observation conditional, 41 Observation reports, 30, 34, 42, 106, 110 Occasion sentences, 49, 58 Occupation number, 218 Ockham, 4, 26, 47 Ockham’s razor, 26 Okun, L., 135 Omnes, R., 240 Ontic interpretation, 247 Ontological commitment, 171 Ontological parsimony, 48 Ontological structural realism, 107 Ontological unification, 95 Overlapping calendars, 214

P Pargetter, R., 139 Parmenides, 16 Particle, 171 Particles and fields, 8 Particles, classical and quantum, 134 Particles/fields, not both, 170 Passmore, J., 36 Past, present and future are indexicals, 10 Pataut, F., 70 Pauli’s exclusion principle, 161 Pauli, W., 33, 233 Pearl, J., 125 Penrose, R., 142 Perception of things as similar, 78 Personal presupposition, 26 Phenomena, 148 Physicalism, 194 Physical necessity, 164 Physical necessity, definition of, 166 Physical quantities, 5 Pietsch, W., 170 Planck, M., 163, 228, 242 Planck’s radiation law, 21, 243 Platonism, 172 Poincaré, H., 10, 29, 31, 35, 60, 64 Pooley, O., 261 Popper, K., 42, 74, 80, 184 Position variable, 238 Potential infinity, 67 Predicativity, 60 Predictability entails determinism, 187 Price, H., 198, 211 Priest, G., 47 Prigogine, I., 198 Primary substance, 26, 62 Primitive ‘thisness’, 218 Principal Principle, 187 Principia as a paradigm, 19 Principle of bivalence, 55 Principles of coordination, 133 Probability for interaction, 230 Projectible predicates, 83 Projection operator, 234 Projection postulate, 235 Propensities, 183, 188, 230 Propensities as transition probabilities, 188 Proxy function, 105, 173 Putnam, H., 34, 130

Q Quantisation, 21 Quantisation of conserved quantities, 218

Quantitative predicates have no references, 5 Quantities, 110, 111 Quantities, fundamental, 135 Quantity of matter=mass, 17, 153 Quantum of action, 244 Quine’s radical holism, 36 Quine, W.V., 3, 4, 7, 35, 37, 41, 47, 58, 63, 65, 67, 70, 74, 78, 81, 93, 104, 128, 170, 171, 173, 195, 248

R Raatikainen, P., 70 Randomness as incomplete information, 186 Rationalism, 27 Rausch, H., 240 Ray, J., 86 Rays, 209, 212, 221, 250 Rays, definition of, 163 Redhead, M., 121 Reducing distance to time, 5 Reference and identity, 7 Reichenbach, H., 123, 133, 148, 149 Reichenbach’s principle, 223 Relationalism, 115, 255 Relativity principle, 157 Rest a natural state, 16 Rest and uniform motion relative to observer, 18 Retarded solution, 205 Ricci-Curbastro, G., 21 Richard’s paradox, 59 Roberts, J., 140 Rods and clocks in spacetime, 257 Ross, D., 107, 121 Rothman, R., 152 Rovelli, C., 245, 256, 261 Rule-following, 78 Russell, B., 121 Russell’s insight, 35, 95, 107, 129, 200, 208, 262 Russell & Whitehead, 64

S Salmon, W., 94, 123 Sarkar, S., 95 Saunders, S., 83, 218 Savitt, S., 198 Schrödinger’s wave mechanics, 6 Schwartz, J., 6, 7, 53 Scientific realism, 8, 101, 102 Scorzato, L., 45

Searle, J., 172 Second, definition of, 201 Second law of thermodynamics, 198 Seevinck, M., 218 Segal, I., 180 Self-field, 176 Sellars, W., 49, 82 Semantic anti-realism, critics of, 69 SG magnet, 238 Simon, H., 156 Simplicity depends on language, 45 Simplicity, not an evidentiary value, 45 Singlet state functions are singular terms, 224 SI system, 201 Smolin, L., 97 Space, 9 Space of causes vs. space of reasons, 82 Spacetime, 9 Spacetime as fourdimensional manifold, 20 Spacetime interval, 20 Spelke, E., 128 Spin measurements, 224, 238 Spurrett, D., 121 Standard Model of particle physics, 136 Stipulative definition, 36 Stoßzahlansatz, 211 String theory, 108 Structural realism, 106 Substance, 17 Substantivalism, 115, 255 Substantivalism/relationism, 19 Substitutional quantification, 109 Success criteria are confirmed predictions, 19 Sundholm, G., 70 Supervenience, 194 Suppes, P., 112 Symmetry entails conservation, 162

T Tarski, A., 112 Tegmark, M., 6, 53 Tensor product is one term, 222, 242 Tensor products, 222 Tensor products are singular terms, 248 Tensors, 20 Terms and their referents, 63 Theoretical vs. unreflected similarity, 48 Theoretical concepts are implicitly defined, 35 Theoretical reduction, 92 Theories of principle vs. constructive theories, 113 Theory of everything, 97 Thermodynamics, second law, 210

Things as they are in themselves, 29 Three-body problem, 10 Time, 9 Time an external parameter, 212 Time asymmetry in weak interactions, 202 Time dilation, 258 Time direction based on irreversible changes, 213 Time measurements, 132, 259 Time parameter reversal, 199 Time reversal vs. dynamical reversibility, 208 Time reversal symmetry, 198 Time, the fundamental quantity, 5 Tolman, R., 208, 211 Tooley, M., 139 Transcendental vs. empirical level, 29 Transition probability, 189, 230 Tribegrue, 84 Tropes, 127 T-schema, 50 Truthlikeness, 74 U Uncertainty relations, 231 Underdetermination, 103, 104 Unification, 92 Unitary evolution is no state change, 221 Unitary operator, 235 Universally generalised conditionals (UGC), 143 Universals, 26, 62 Universals ante res, 27, 110

Universals in rebus, 27, 110 Unobserved interactions, 249

V Value determinism, 247 Van Fraassen, B., 4, 38, 42, 63, 103, 114, 139, 141, 146, 148, 151, 166 Veneziano, G., 135 Verifiability criterion, 33 Vickers, P., 170 Vienna Circle, 32 Von Neumann, J., 233, 235

W Wallis, J., 19, 30, 80, 152 Weierstrass, K., 45 Weinberg, S., 163, 169, 207, 250 Weyl, H., 64, 131 Whales are not fishes, 85 Wheeler, J., 169 Wigner, E., 6, 53, 240 Wittgenstein, L., 37, 78, 81, 149, 225 Woodward, J., 125 Wren, C., 19, 30, 80, 152 Wynn, K., 66

Z Zee, A., 198 Zeh, H., 198 Zeno, 16 Zwiebach, B., 265