Quantum Fields and Processes: A Combinatorial Approach 9781108241885


CAMBRIDGE STUDIES IN ADVANCED MATHEMATICS 171
Editorial Board: B. Bollobás, W. Fulton, A. Katok, F. Kirwan, P. Sarnak, B. Simon, B. Totaro

QUANTUM FIELDS AND PROCESSES

Wick ordering of creation and annihilation operators is of fundamental importance for computing averages and correlations in quantum field theory and, by extension, in the Hudson–Parthasarathy theory of quantum stochastic processes, quantum mechanics, stochastic processes, and probability. This book develops the unified combinatorial framework behind these examples, starting with the simplest mathematically, and working up to the Fock space setting for quantum fields. Emphasizing ideas from combinatorics, such as the role of the lattice of partitions for multiple stochastic integrals by Wallstrom–Rota and combinatorial species by Joyal, it presents insights coming from quantum probability. It also introduces a "field calculus" that acts as a succinct alternative to standard Feynman diagrams and formulates quantum field theory (cumulant moments, Dyson–Schwinger equation, tree expansions, 1-particle irreducibility) in this language. Featuring many worked examples, the book is aimed at mathematical physicists, quantum field theorists, and probabilists, including graduate and advanced undergraduate students.

John Gough is professor of mathematical and theoretical physics at Aberystwyth University, Wales. He works in the field of quantum probability and open systems, especially quantum Markovian models that can be described in terms of the Hudson–Parthasarathy quantum stochastic calculus. His more recent work has been on the general theory of networks of quantum Markovian input-output systems and their applications to quantum feedback control.

Joachim Kupsch is professor emeritus of theoretical physics at the University of Kaiserslautern, Germany. His research has focused on scattering theory, relativistic S-matrix theory, and infinite-dimensional analysis applied to quantum field theory. His publications have examined canonical transformations, fermionic integration, and superanalysis. His later work looks at open systems and decoherence, and he coauthored a book on the subject in 2003.

CAMBRIDGE STUDIES IN ADVANCED MATHEMATICS
Editorial Board: B. Bollobás, W. Fulton, F. Kirwan, P. Sarnak, B. Simon, B. Totaro

All the titles listed below can be obtained from good booksellers or from Cambridge University Press. For a complete series listing visit: www.cambridge.org/mathematics.

Already published:
132 J. Väänänen Models and Games
133 G. Malle & D. Testerman Linear Algebraic Groups and Finite Groups of Lie Type
134 P. Li Geometric Analysis
135 F. Maggi Sets of Finite Perimeter and Geometric Variational Problems
136 M. Brodmann & R. Y. Sharp Local Cohomology (2nd Edition)
137 C. Muscalu & W. Schlag Classical and Multilinear Harmonic Analysis, I
138 C. Muscalu & W. Schlag Classical and Multilinear Harmonic Analysis, II
139 B. Helffer Spectral Theory and Its Applications
140 R. Pemantle & M. C. Wilson Analytic Combinatorics in Several Variables
141 B. Branner & N. Fagella Quasiconformal Surgery in Holomorphic Dynamics
142 R. M. Dudley Uniform Central Limit Theorems (2nd Edition)
143 T. Leinster Basic Category Theory
144 I. Arzhantsev, U. Derenthal, J. Hausen & A. Laface Cox Rings
145 M. Viana Lectures on Lyapunov Exponents
146 J.-H. Evertse & K. Győry Unit Equations in Diophantine Number Theory
147 A. Prasad Representation Theory
148 S. R. Garcia, J. Mashreghi & W. T. Ross Introduction to Model Spaces and Their Operators
149 C. Godsil & K. Meagher Erdős–Ko–Rado Theorems: Algebraic Approaches
150 P. Mattila Fourier Analysis and Hausdorff Dimension
151 M. Viana & K. Oliveira Foundations of Ergodic Theory
152 V. I. Paulsen & M. Raghupathi An Introduction to the Theory of Reproducing Kernel Hilbert Spaces
153 R. Beals & R. Wong Special Functions and Orthogonal Polynomials
154 V. Jurdjevic Optimal Control and Geometry: Integrable Systems
155 G. Pisier Martingales in Banach Spaces
156 C. T. C. Wall Differential Topology
157 J. C. Robinson, J. L. Rodrigo & W. Sadowski The Three-Dimensional Navier–Stokes Equations
158 D. Huybrechts Lectures on K3 Surfaces
159 H. Matsumoto & S. Taniguchi Stochastic Analysis
160 A. Borodin & G. Olshanski Representations of the Infinite Symmetric Group
161 P. Webb Finite Group Representations for the Pure Mathematician
162 C. J. Bishop & Y. Peres Fractals in Probability and Analysis
163 A. Bovier Gaussian Processes on Trees
164 P. Schneider Galois Representations and (φ, Γ)-Modules
165 P. Gille & T. Szamuely Central Simple Algebras and Galois Cohomology (2nd Edition)
166 D. Li & H. Queffélec Introduction to Banach Spaces, I
167 D. Li & H. Queffélec Introduction to Banach Spaces, II
168 J. Carlson, S. Müller-Stach & C. Peters Period Mappings and Period Domains (2nd Edition)
169 J. M. Landsberg Geometry and Complexity Theory
170 J. S. Milne Algebraic Groups
171 J. Gough & J. Kupsch Quantum Fields and Processes

Quantum Fields and Processes A Combinatorial Approach JOHN GOUGH Aberystwyth University

J OAC H I M K U P S C H University of Kaiserslautern

University Printing House, Cambridge CB2 8BS, United Kingdom One Liberty Plaza, 20th Floor, New York, NY 10006, USA 477 Williamstown Road, Port Melbourne, VIC 3207, Australia 314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India 79 Anson Road, #06–04/06, Singapore 079906 Cambridge University Press is part of the University of Cambridge. It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning, and research at the highest international levels of excellence. www.cambridge.org Information on this title: www.cambridge.org/9781108416764 DOI: 10.1017/9781108241885 © John Gough and Joachim Kupsch 2018 This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published 2018 Printed in the United States by Sheridan Books, Inc. A catalogue record for this publication is available from the British Library Library of Congress Cataloguing-in-Publication Data Names: Gough, John, 1967– author. | Kupsch, Joachim, 1939– author. Title: Quantum fields and processes : a combinatorial approach / John Gough, Aberystwyth University, Joachim Kupsch, University of Kaiserslautern. Description: Cambridge : Cambridge University Press, 2018. | Series: Cambridge studies in advanced mathematics ; 171 | Includes bibliographical references and index. Identifiers: LCCN 2017036746 | ISBN 9781108416764 (Hardback : alk. paper) Subjects: LCSH: Combinatorial analysis. | Quantum field theory. | Probabilities. 
Classification: LCC QA165 .G68 2018 | DDC 530.14/3015116–dc23 LC record available at https://lccn.loc.gov/2017036746 ISBN 978-1-108-41676-4 Hardback Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

To Margarita, Sigrid, and John Junior.

Contents

Preface  page xi
Notation  xiii

1 Introduction to Combinatorics  1
  1.1 Counting: Balls and Urns  1
  1.2 Statistical Physics  3
  1.3 Combinatorial Coefficients  14
  1.4 Sets and Bags  17
  1.5 Permutations and Partitions  19
  1.6 Occupation Numbers  22
  1.7 Hierarchies (= Phylogenetic Trees = Total Partitions)  25
  1.8 Partitions  27
  1.9 Partition Functions  31
2 Probabilistic Moments and Cumulants  37
  2.1 Random Variables  37
  2.2 Key Probability Distributions  39
  2.3 Stochastic Processes  42
  2.4 Multiple Stochastic Integrals  44
  2.5 Iterated Itô Integrals  48
  2.6 Stratonovich Integrals  51
  2.7 Rota–Wallstrom Theory  54
3 Quantum Probability  56
  3.1 The Canonical Anticommutation Relations  57
  3.2 The Canonical Commutation Relations  59
  3.3 Wick Ordering  69
4 Quantum Fields  74
  4.1 Green's Functions  74
  4.2 A First Look at Boson Fock Space  86
5 Combinatorial Species  89
  5.1 Operations on Species  91
  5.2 Graphs  94
  5.3 Weighted Species  95
  5.4 Differentiation of Species  97
6 Combinatorial Aspects of Quantum Fields: Feynman Diagrams  99
  6.1 Basic Concepts  99
  6.2 Functional Integrals  103
  6.3 Tree Expansions  113
  6.4 One-Particle Irreducibility  115
7 Entropy, Large Deviations, and Legendre Transforms  122
  7.1 Entropy and Information  122
  7.2 Law of Large Numbers and Large Deviations  126
  7.3 Large Deviations and Stochastic Processes  133
8 Introduction to Fock Spaces  138
  8.1 Hilbert Spaces  138
  8.2 Tensor Spaces  140
  8.3 Symmetric Tensors  144
  8.4 Antisymmetric Tensors  151
9 Operators and Fields on the Boson Fock Space  157
  9.1 Operators on Fock Spaces  157
  9.2 Exponential Vectors and Weyl Operators  172
  9.3 Distributions of Boson Fields  177
  9.4 Thermal Fields  183
  9.5 q-deformed Commutation Relations  184
10 L²-Representations of the Boson Fock Space  189
  10.1 The Bargmann–Fock Representation  189
  10.2 Wiener Product and Wiener–Segal Representation  191
  10.3 Itô–Fock Isomorphism  193
11 Local Fields on the Boson Fock Space: Free Fields  198
  11.1 The Free Scalar Field  199
  11.2 Canonical Operators for the Free Field  231
12 Local Interacting Boson Fields  237
  12.1 Interacting Neutral Scalar Fields  237
  12.2 Interaction with a Classical Current  246
13 Quantum Stochastic Calculus  252
  13.1 Operators on Guichardet Fock Space  252
  13.2 Wick Integrals  262
  13.3 Chronological Ordering  264
  13.4 Quantum Stochastic Processes on Fock Space  267
  13.5 Quantum Stochastic Calculus  269
  13.6 Quantum Stratonovich Integrals  276
  13.7 The Quantum White Noise Formulation  277
  13.8 Quantum Stochastic Exponentials  279
  13.9 The Belavkin–Holevo Representation  282
14 Quantum Stochastic Limits  292
  14.1 A Quantum Wong–Zakai Theorem  292
  14.2 A Microscopic Model  312

References  316
Index  322

Preface

It is probably safe to say that very few people have developed an interest in the physical sciences because of combinatorics. Yet combinatorics makes its presence felt in modern mathematics and physics in a fundamental and elegant manner, and goes far beyond standard school problems such as determining how many ways there are to rearrange the letters of the word MISSISSIPPI. It has been argued in many places that modern physics owes much to the work of Ludwig Boltzmann, who in many ways was the first scientist to think like a modern physicist. Certainly, he was the first to use probability in an essential way, and his ideas on the microscopic foundations of thermodynamics directly influenced Gibbs and primed both Planck and Einstein at the start of the twentieth century. The machinery needed for modern mathematical physics was assembled in the twentieth century, and concentrated on the areas of functional analysis and probability theory, moving inexorably toward the description of stochastic processes and quantized fields. Behind this, however, was the realization that combinatorics played an important role. In writing this book, we have been influenced by several recurring ideas in mathematical physics that all have an underlying combinatorial core. Wallstrom and Rota were the first to notice that several disparate strands, such as multiple Itô integrals, Wick products, and normal ordering, could be conveniently expressed in terms of the combinatorics of the lattice of partitions. They, in turn, were influenced by Meyer's book, Quantum Probability for Probabilists (1995). In many respects, our focal point for this book has been (Bose and Fermi) Fock space: this is the framework for quantum field theory and the quantum stochastic calculus of Hudson and Parthasarathy, and is well known to probabilists.
The creation and annihilation operators, satisfying the canonical (anti-)commutation relations, are then central objects, and we give a good deal of attention to both their mathematical and physical aspects. The elements of combinatorics that we cover here are those arising from quantum fields and stochastic processes. However, this gives ample room to bring in modern approaches, especially the combinatorial species of Joyal, to give a more algebraic feel than the traditional Feynman diagram approach. We combine elements of species with the Guichardet notation for symmetric Fock spaces in order to construct a field calculus that leads to explicit combinatorial formulas in place of the diagrammatic expansions. We have tried to resist the temptation to show off the often surprising and striking combinatorial expressions that arise. In truth, combinatorics appears as a tool in many branches of mathematics – and very frequently in the proofs of mathematical physics propositions – but doing justice to these varied and multifaceted techniques would be beyond the scope of this book.

We are grateful for the input of many colleagues over the years. We would especially like to thank Robin Hudson, Luigi Accardi, Yun Gang Lu, Igor Volovich, Wilhelm von Waldenfels, Hans Maassen, Madalin Guta, Hendra Nurdin, Matthew James, Joe Pulé, and Aubrey Truman, and to acknowledge the enormous influence of the extended quantum probability community at large. The completion of the project is in no small part due to Professor Oleg Smolyanov, whose constant refrain, "you need to write a book," propelled us forward. We are also very grateful to Diana Gillooly and the staff at Cambridge University Press for their help and advice while writing. Finally, we would like to thank our families – Margarita and John Junior, and Sigrid – for their support and patience in waiting for a book dedicated to them.

Notation

General Mathematical
ran(f)   range of a function f
$A^{T}$   transpose
$A^{*}$   adjoint/Hermitian conjugate/complex conjugate
spec(A)   spectrum of a matrix/operator A
$\Delta_n(t, s)$   the simplex $\{(t_n, \ldots, t_1) : t \geq t_n > t_{n-1} > \cdots > t_1 \geq s\}$
$\Delta_n^{\sigma}(t, s)$   the simplex $\{(t_n, \ldots, t_1) : t \geq t_{\sigma(n)} > t_{\sigma(n-1)} > \cdots > t_{\sigma(1)} \geq s\}$
$\Theta(x)$   Heaviside function

Spaces, Norms, Etc.
C   complex numbers
R   real numbers
Z   integers
N   the natural numbers {1, 2, 3, ...}
N+   nonnegative integers {0, 1, 2, 3, ...}
N−   the set {0, 1}
M   Minkowski spacetime
M̂   dual Minkowski spacetime
h, H   Hilbert space (always assumed separable!)
$L^2(\mathbb{R}, \mathbb{C})$   complex-valued square integrable functions on R
$\ell^2(\mathbb{N}, \mathbb{C})$   complex-valued square summable sequences
B(h)   bounded operators on a Hilbert space h
T(h)   trace-class operators on a Hilbert space h
$\Gamma(h)$   Fock space (bosonic) over a Hilbert space h
$S_n$   permutations on {1, 2, ..., n}
$\Gamma_+(h)$   Boson Fock space over a Hilbert space h
$\Gamma_-(h)$   Fermion Fock space over a Hilbert space h
⊗   tensor product
∨   symmetric tensor product
∧   antisymmetric tensor product

Combinatorics
$x^{\underline{n}}$   falling factorial powers, i.e., $x(x-1)(x-2)\cdots(x-n+1)$
$\left[{n \atop m}\right]$   Stirling number of the first kind
$\left\{{n \atop m}\right\}$   Stirling number of the second kind
Perm(X)   permutations on a set X
Part(X)   partitions of a set X
Pair(X)   pair partitions of a set X
Power(X)   power set (set of finite subsets) of a set X
$H_n(x)$   Hermite polynomial, $= \sum_{k=0}^{\lfloor n/2 \rfloor} \frac{(-1)^k n!}{2^k k! (n-2k)!} x^{n-2k}$
⋄   Wick product

Quantum Mechanics
ψ(x)   wavefunction
|ψ⟩   state vector/ket
ρ   density operator
⟨ψ|φ⟩   Dirac notation for inner product
P   projection
D(β)   Weyl displacement operator
|exp β⟩   exponential vector (Bargmann state)
⋆   group product for the Heisenberg group
:X:   Wick (normal) ordering of an operator X

Classical Probability
P   probability measure
E   expectation
A   σ-algebra
Ω   sample space
$\chi_E$   indicator function for event E
$[[X, Y]]_t$   the quadratic covariance, that is, $\int_0^t dX_s \, dY_s$
$X_t^{[n]}$   the process $\int_0^t (dX_s)^n$
$\int^{-}_{[0,t]^n} dX^{(n)}_{t_n} \cdots dX^{(1)}_{t_1}$   the diagonal-free stochastic integral
$e^{\diagup X_t}$   the diagonal-free stochastic exponential
X δY   Stratonovich differential (i.e., $X\,dY + \frac{1}{2} dX\, dY$)
$T_S e^{Z_t}$   Stratonovich time-ordered exponential

Quantum Field Theory
Φ   space of all field configurations
J   space of all source fields (= test functions in the dual of Φ)
$\varphi_X = \prod_{x \in X} \varphi(x)$
$G_X$   Green's functions (moments = expectation of $\varphi_X$)
$K_X$   connected Green's functions (= cumulants)
A(f)   annihilation operator with test function f
W(f) ≡ D(if)   Weyl unitary with test function f
$\Gamma(U)$   second quantization of a unitary U
$d\Gamma(H)$   differential second quantization
$\hat\Phi(f)$   Segal's field operator, $= A(f) + A^*(f)$
P, P↑   Poincaré group, the orthochronous Poincaré group

1 Introduction to Combinatorics

How we count things turns out to have a powerful significance in physical problems! One of the oldest problems stems from undercounting and overcounting the number of possible configurations a particular system can have – mathematically, this is usually due to the fact that objects are mistakenly assumed to be indistinguishable when they are not, and vice versa. However, one of the great surprises of physics is that identical particles are fundamentally indistinguishable. In this chapter, we will introduce some of the basic mathematical objects that occur in physical problems, and give their enumeration. Statistical mechanics is one of the key sources of ideas, so we spend some time on the basic concepts here, especially as partition functions are clear examples of generating functions that we will encounter later on. We will recall some of the basic mathematical concepts in enumeration, leading on to the role of generating functions. At the end, we make extensive use of generating functions, exploiting the methods for dealing with partition functions in statistical mechanics, but for specific combinatorial families such as permutations and partitions. We start, however, with the touchstone for all combinatorial problems: how to distribute balls in urns.

1.1 Counting: Balls and Urns

Proposition 1.1.1 There are $K^N$ different ways to distribute N distinguishable balls among K distinguishable urns.

The proof is based on the simple observation that there are K choices of urn for each of the N balls. Suppose next that we have more urns than balls.


Figure 1.1 Occupation numbers of distinguishable balls in distinguishable urns.

Proposition 1.1.2 The total number of ways to distribute N distinguishable balls among K distinguishable urns so that no urn ends up with more than one ball is (later we will call this a falling factorial)

$$K^{\underline{N}} \equiv K(K-1)\cdots(K-N+1).$$

The argument here is simple enough: if we have already distributed M balls among the urns, with no urn having more than one ball inside, and we now want to place in an extra ball, we have K − M empty urns remaining to choose from; therefore, $K^{\underline{M+1}} = K^{\underline{M}}(K - M)$ with $K^{\underline{1}} = K$.

Let $n_k$ denote the number of balls in urn k; we call this the occupation number of the urn. See Figure 1.1. We now give the number of possibilities leading to a given set of occupation numbers, subject to the constraint of a fixed total number of balls, $\sum_{k=1}^{K} n_k = N$.

Proposition 1.1.3 The number of ways to distribute N distinguishable balls among K distinguishable urns so that a prescribed number $n_k$ of balls is in the kth urn, for each k = 1, ..., K, is the multinomial coefficient

$$\binom{N}{n_1 \ldots n_K} = \frac{N!}{n_1! \, n_2! \cdots n_K!}.$$

The proof here is based on the observation that there are $\binom{N}{n_1}$ ways to choose the $n_1$ balls to go into the first urn, then $\binom{N-n_1}{n_2}$ ways to choose the next $n_2$ balls to go into the second urn, and so on, leading to

$$\binom{N}{n_1}\binom{N-n_1}{n_2}\cdots\binom{N-n_1-\cdots-n_{K-1}}{n_K} \equiv \frac{N!}{n_1! \, n_2! \cdots n_K!}.$$

Suppose, however, that the balls are in fact indistinguishable, as in Figure 1.2! Then we do not distinguish between distributions having the same occupation numbers for the urns.


Figure 1.2 Occupation numbers of indistinguishable balls in distinguishable urns.

Proposition 1.1.4 There are $\binom{N+K-1}{N}$ ways to distribute N indistinguishable balls among K distinguishable urns.

Proof Take, for example, K = 6 urns and N = 8 balls and consider the distribution represented by the occupation sequence (1, 2, 0, 4, 1, 0); then encode this as follows:

•| • •| | • • • •| • |

which means one ball in urn 1, two balls in urn 2, no balls in urn 3, and so on. In this encoding, we have N + K − 1 symbols (balls and sticks), N of which are balls and K − 1 of which are sticks (separations between the urns). In any such distribution, we must choose which N of the N + K − 1 symbols are to be balls, and there are $\binom{N+K-1}{N}$ different ways to do this. Each way of selecting these symbols corresponds to a unique distribution, and vice versa.

Proposition 1.1.5 The number of ways to distribute N indistinguishable balls among K distinguishable urns, if we only allow at most one ball per urn, is $\binom{K}{N}$. That is, we must choose N out of the K urns to have a ball inside.

These enumerations turn out to be of immediate relevance to sampling theory in statistics. We note that if we have a set of K items and we draw a sample of size N, then if we make no replacement there will be $\binom{K}{N}$ such samples (imagine placing a ball into urn j if the jth element is selected!). If replacement is allowed, then the number of samples is $\binom{N+K-1}{N}$.
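The five propositions above are easy to confirm by brute-force enumeration for small N and K. The following Python sketch (an illustration, not part of the original text) encodes each distribution of distinguishable balls as a tuple assigning an urn index to every ball; the indistinguishable-ball counts then fall out by collecting occupation sequences.

```python
from itertools import product
from math import comb, factorial, perm

# Every distribution of N distinguishable balls into K distinguishable urns
# is a tuple assigning an urn index to each ball.
def distributions(N, K):
    return list(product(range(K), repeat=N))

N, K = 4, 3
all_dists = distributions(N, K)
assert len(all_dists) == K ** N                       # Proposition 1.1.1

# Proposition 1.1.2: at most one ball per urn gives the falling factorial.
N2, K2 = 3, 5
injective = [d for d in distributions(N2, K2) if len(set(d)) == N2]
assert len(injective) == perm(K2, N2)                 # 5 * 4 * 3 = 60

# Proposition 1.1.3: prescribed occupation numbers give a multinomial coefficient.
target = (2, 1, 1)
matching = [d for d in all_dists
            if tuple(d.count(k) for k in range(K)) == target]
assert len(matching) == factorial(N) // (factorial(2) * factorial(1) * factorial(1))

# Proposition 1.1.4: indistinguishable balls = distinct occupation sequences.
occupations = {tuple(d.count(k) for k in range(K)) for d in all_dists}
assert len(occupations) == comb(N + K - 1, N)

# Proposition 1.1.5: indistinguishable balls with at most one per urn.
occ01 = {tuple(d.count(k) for k in range(K2)) for d in injective}
assert len(occ01) == comb(K2, N2)
```

Replacing the small values of N and K by any others leaves the assertions intact, a useful sanity check when these enumerations reappear later.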

1.2 Statistical Physics Counting problems surfaced early on in the theory of statistical mechanics, and we recall the basic setting next.


Figure 1.3 Three microstates, each corresponding to N = 24 particles, with 8 “on” (the black boxes).

1.2.1 The Microcanonical Ensemble

Two-State Model

Ludwig Boltzmann pioneered the microscopic derivation of the laws of thermodynamics. To understand his ideas, we consider a very simple model of a solid material consisting of N particles, where each particle can be in either one of two states: an "off" state of energy 0 and an "on" state of energy ε. The total energy is therefore U = εM, where M is the number of particles in the "on" state. In Figure 1.3, we have three typical examples where N = 24 and U = 8ε, that is, in each of these we have 8 "on" states out of a total of 24. Each of these configurations is referred to as a microstate, and we say that they are consistent with the macrostate (U = 8ε, N = 24). Boltzmann's idea was that if the system was isolated, so that the energy U was fixed, the system's internal dynamics would make it jump from one microstate to another, with only microstates consistent with the fixed macrostate (U, N) allowed. (In other words, the number of particles, N, and their energy U, are to be constants of the motion – whatever that happens to be.) Here the total number of microstates consistent with macrostate (U = εM, N) is then

$$W(U, M) = \binom{N}{M}.$$

He then made the ergodic hypothesis: over a long enough period of time, each of these microstates is equally likely; that is, the system may be found to be in a given microstate with frequency 1/W. Therefore, long time averages would equate to averages over all the microstates consistent with the macrostate, with each microstate having equal probabilistic weight 1/W. The latter probability system is known as the microcanonical ensemble. We note that the set of all microstates consistent with (U = 8ε, N = 24) also includes some less-than-random-looking configurations such as the ones shown in Figure 1.4. But, nevertheless, they each get equal weight: here $W = \binom{24}{8} = 735{,}471$. At this resolution, we would expect to see the system run through the possible microstates, so that the picture over time would appear something like static on a TV screen. Configurations with a discernible pattern, as in Figure 1.4,


Figure 1.4 Another three microstates consistent with the macrostate (U = 8ε, N = 24). Despite their apparent structure, each has the same 1/735,471 chance to occur as the more random ones in Figure 1.3.

may flash up briefly from time to time, but most of the time we are looking at fairly random-looking configurations such as in Figure 1.3. If N is large, and the particles are small, then we would expect to be looking most of the time at a uniform gray – the shade of gray determined by the ratio M/N.

Boltzmann's remarkable proposal was that the entropy associated with a macrostate (U, N) is the logarithm of the number of consistent microstates,

$$S(U, N) = k \ln W(U, N),$$

where k is a scale factor fixing our eventual definition of the temperature scale. In the present case, we have $W(U = \varepsilon M, N) = \frac{N!}{(N-M)! \, M!}$. If we take N large with U = Nu for some fixed ratio u, then using Stirling's approximation, $\ln N! = N \ln N - N + O(\ln N)$, we find that the entropy per particle in the bulk limit (N → ∞) is

$$s(u) = \lim_{N \to \infty} \frac{S(U = Nu, N)}{N} = -k p_0 \ln p_0 - k p_1 \ln p_1,$$

where $p_1 = \frac{M}{N} \equiv \frac{u}{\varepsilon}$ and $p_0 = 1 - p_1$. (Note $p_1$ is the proportion of particles that are "on" in each of these microstates, with $p_0$ the proportion "off.") Alternatively, we may write this as

$$s(u) = -k \left[ \frac{u}{\varepsilon} \ln \frac{u}{\varepsilon} + \left(1 - \frac{u}{\varepsilon}\right) \ln \left(1 - \frac{u}{\varepsilon}\right) \right].$$

From thermodynamics, one should identify the temperature T via the relation

$$\frac{1}{T} = \frac{\partial s}{\partial u} \equiv \frac{k}{\varepsilon} \ln \left( \frac{\varepsilon}{u} - 1 \right),$$

and so in this model we have

$$u = \frac{\varepsilon}{e^{\varepsilon/kT} + 1}.$$

Somewhat surprisingly, this artificial model actually shows very good qualitative behavior for small values of u – see, for instance, Callen (1985, chapter 15). (Note that for 0 ≤ u < ε/2, the temperature will be positive, but becomes negative for higher values ε/2 < u ≤ ε. Negative temperatures do actually


Figure 1.5 A microstate consistent with the macrostate (U = 8ε, N = 24) in the Einstein model: one of 7,888,725.

make physical sense, however, and are encountered in the related model of a two-state ferromagnet.)

Einstein's Model

A related model is the Einstein model for a crystalline solid. The difference is that each particle can have allowed energies 0, ε, 2ε, 3ε, .... This time we may depict a microstate as in Figure 1.5, where the number n in each box tells us that the corresponding particle has energy nε (or, perhaps more physically, that there are n quanta in the box!). We see that the number of microstates consistent with a macrostate (U = εM, N) will be

$$W = \binom{N + M - 1}{M}.$$

In other words, this is the number of ways of distributing M indistinguishable quanta among N distinguishable boxes. This time, again using Stirling's identity, one may show that the entropy per particle in the bulk limit is

$$s(u) = k \left[ \frac{u}{\varepsilon} \ln \left( 1 + \frac{\varepsilon}{u} \right) + \ln \left( 1 + \frac{u}{\varepsilon} \right) \right],$$

and that

$$u = \frac{\varepsilon}{e^{\varepsilon/kT} - 1}.$$
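The microstate counts quoted for the two models, and the passage to the entropy per particle via Stirling's approximation, can be confirmed numerically. The Python check below is illustrative only (it takes k = 1 throughout).

```python
from math import comb, log

# Two-state model: number of microstates with M = 8 "on" out of N = 24 particles.
W_two_state = comb(24, 8)
assert W_two_state == 735471

# Einstein model: M = 8 indistinguishable quanta in N = 24 boxes.
W_einstein = comb(24 + 8 - 1, 8)
assert W_einstein == 7888725

# Bulk limit: (1/N) ln C(N, pN) -> -p ln p - (1 - p) ln(1 - p), with k = 1.
def s_finite(N, p):
    return log(comb(N, round(p * N))) / N

def s_limit(p):
    return -p * log(p) - (1 - p) * log(1 - p)

p = 1 / 3
errors = [abs(s_finite(N, p) - s_limit(p)) for N in (60, 600, 6000)]
assert errors[0] > errors[1] > errors[2]  # the error shrinks as N grows
```

The error in the finite-N entropy per particle decays like (ln N)/N, which is why the bulk limit is reached so quickly in practice.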

1.2.2 The Canonical Ensemble

Boltzmann's ergodic hypothesis marks the introduction of probability theory into physics. So far we have not used probability explicitly; however, this changes if we consider the situation depicted in Figure 1.6. For simplicity, we work with the two-state model. We fix the total energy $U_{tot}$ of the $N_{tot} = N + N'$ particles, and then Boltzmann's principle tells us that each microstate of the total system is equally likely. However, although the total number of "on" particles is constant, the number that are inside the system will vary, with a probability distribution determined by the ergodic hypothesis. Let $M_{tot} = U_{tot}/\varepsilon$, and suppose we have x particles in the "on" state in the system – then there are $\binom{N}{x}$ microstates


Figure 1.6 A system with N particles and microstate ω forms a subsystem of a larger system of N + N′ particles.

of the system consistent with this, and $\binom{N'}{M_{tot}-x}$ microstates possible for the complement. Allowing for all possible x leads to

$$\sum_{x=0}^{M_{tot}} \binom{N}{x} \binom{N'}{M_{tot}-x} = \binom{N+N'}{M_{tot}}.$$

If N, N′, and $M_{tot}$ are large and of the same order, then it turns out that the largest term in the sum comes from x ≈ ρN, where $\rho = M_{tot}/N_{tot}$. In fact, this one term alone dominates to the extent that one may ignore all the other terms. This is related to the large deviation principle, which we discuss in Chapter 7. The energy of the subsystem is now a random variable, and Boltzmann's hypothesis tells us its distribution.

We now consider the situation where the number N of particles in our system is large but fixed, but we take N′ → ∞. We do this in such a way that the average energy per particle $\bar u = \varepsilon M_{tot}/N_{tot}$ is a constant. Suppose we have a fixed microstate ω of our subsystem with energy E(ω), that is, ε times the number of "on" particles in the subsystem. Then the probability of the particular microstate, ω, occurring is

$$p_{N'}(\omega) = W(U', N') / W(U_{tot}, N_{tot})$$



$$= \binom{N'}{\frac{\bar u - u}{\varepsilon} N + \frac{\bar u}{\varepsilon} N'} \bigg/ \binom{N+N'}{\frac{\bar u}{\varepsilon}(N+N')},$$

where u = E(ω)/N is the energy density of the subsystem. Now making the approximation that $W(uN, N) \approx e^{N s(u)/k}$, we find

$$k \ln p_{N'}(\omega) = N' \, s\!\left( \bar u + (\bar u - u)\frac{N}{N'} \right) - (N + N') \, s(\bar u)$$

$$= -N s(\bar u) + N' \left[ s\!\left( \bar u + (\bar u - u)\frac{N}{N'} \right) - s(\bar u) \right] \to -N s(\bar u) + N s'(\bar u)(\bar u - u),$$


since s is differentiable. Specifying the average energy per particle in the bulk limit N′ → ∞ to be ū is equivalent to fixing the temperature T via the relation $1/T = s'(\bar u)$, from which we see that $\ln p_{N'}(\omega) \to (F - uN)/kT$, where $F = N(\bar u - T s(\bar u))$. (Note that the relation between ū and T is one-to-one. The variable F = U − TS is the Helmholtz free energy in thermodynamics.) That is, we obtain the probability

$$p_{can.}(\omega) = \frac{1}{Z} e^{-E(\omega)/kT},$$

where the normalization is given by the canonical partition function

$$Z_N = e^{-F/kT} = \sum_{\omega} e^{-E(\omega)/kT},$$

where the sum is over all microstates consistent with having a fixed number N of particles. The probability distribution that we obtain in this way is called the canonical ensemble and is interpreted as saying that our subsystem is in thermal equilibrium with a heat bath at temperature T. The derivation presented in the preceding is actually very general. We relied on the relation $W(uN, N) \approx e^{N s(u)/k}$, but not the specifics of the entropy per particle, s(u). So the same argument will go through so long as s(u) exists and defines a monotone increasing, strictly concave function of u.

1.2.3 The Grand Canonical Ensemble

We now describe the situation in statistical mechanics where we have a gas consisting of a number of particles, N, each of which can have one of K distinguishable states with energy values ε₁ ≤ ε₂ ≤ · · · ≤ ε_K; see Callen (1985). We allow both the energy and the number of particles to vary: we allow microstates ω, which have N(ω) particles and energy E(ω), and introduce the grand canonical ensemble

$$ p_{\rm g.c.}(\omega) = \frac{1}{\Xi}\, e^{-[E(\omega)-\mu N(\omega)]/kT}, $$

where the normalization is given by the grand canonical partition function

$$ \Xi = \sum_{\omega} e^{-[E(\omega)-\mu N(\omega)]/kT}, $$

where the sum is now over all microstates – unrestricted in both number and energy. We introduce the standard notation of the inverse temperature β = 1/kT; the parameter μ is known as the chemical potential, and the alternate parameter z = e^{βμ} is called the fugacity.

1.2 Statistical Physics


We will make extensive use of the following lemma.

Lemma 1.2.1 (The $\sum\prod \leftrightarrow \prod\sum$ Lemma) Let M and K be countable sets, let M^K denote the set of sequences m = (m_k)_{k∈K} where m_k ∈ M, and let f : K × M → ℂ. We have the formal series relation

$$ \sum_{\mathbf m \in M^K}\; \prod_{k\in K} f(k, m_k) \;=\; \prod_{k\in K} \Big( \sum_{m\in M} f(k,m) \Big). $$

Proof If we expand out the right-hand side, we find that we get a sum over terms of the form $\prod_{k\in K} f(k, m_k)$, where all possible values m_k ∈ M will occur. Written in terms of the sequences m = (m_k)_{k∈K}, this gives the left-hand side. □

In many cases, the expression will be convergent.

1.2.4 Maxwell–Boltzmann Statistics

Here we assume that the particles are all distinguishable. Suppose that the jth particle has energy ε_{k(j)}; then the sequence of numbers k = (k(1), ..., k(N)) determines the state of the gas. In particular, the set of all possible configurations is Ω_{N,K} = {1, ..., K}^N, and we have #Ω_{N,K} = K^N. We give a Boltzmann weight e^{−βE(k)} to a state k ∈ Ω_{N,K}, where the total energy is $E(\mathbf k) = \sum_{j=1}^N \varepsilon_{k(j)}$. We shall be interested in the canonical partition function

$$ Z_N(\beta) = \sum_{\mathbf k \in \Omega_{N,K}} e^{-\beta E(\mathbf k)} = \sum_{\mathbf k\in\Omega_{N,K}} \prod_{j=1}^{N} e^{-\beta \varepsilon_{k(j)}} \equiv \Big( \sum_{k=1}^{K} e^{-\beta\varepsilon_k} \Big)^{\!N}, $$

where we used the $\sum\prod \leftrightarrow \prod\sum$ Lemma for the last part. The associated grand canonical partition function is

$$ \Xi(\beta, z) = \sum_{N=0}^{\infty} z^N Z_N(\beta) = \frac{1}{1 - z \sum_{k=1}^{K} e^{-\beta\varepsilon_k}}. $$

This was recognized as leading to an unphysical answer, as some of the thermodynamic potentials (in particular, the entropy) ended up being nonextensive. This was known as the Gibbs paradox, and its resolution was to apply a correction factor 1/N! to each Z_N(β), nominally to account for indistinguishability of the gas particles and to crudely correct for overcounting of possibilities. This now leads to the physically acceptable form

$$ \Xi(\beta, z) = \sum_{N=0}^{\infty} \frac{1}{N!}\, z^N Z_N(\beta) = \exp\Big( z \sum_{k=1}^{K} e^{-\beta\varepsilon_k} \Big). $$
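These closed forms are easy to verify numerically on a small system. The following Python sketch (our own illustrative check, not from the text; the energy values are arbitrary) compares the brute-force configuration sum with the product form of Z_N(β), and checks the Gibbs-corrected grand canonical series against the exponential:

```python
import math
from itertools import product

def Z_MB(beta, energies, N):
    """Canonical partition function by brute-force sum over the K^N configurations."""
    return sum(math.exp(-beta * sum(energies[k] for k in cfg))
               for cfg in product(range(len(energies)), repeat=N))

beta, z, energies = 0.7, 0.3, [0.0, 0.5, 1.3]
single = sum(math.exp(-beta * e) for e in energies)

# Z_N(beta) = (sum_k e^{-beta eps_k})^N, by the sum/product interchange.
for N in range(1, 5):
    assert abs(Z_MB(beta, energies, N) - single**N) < 1e-9

# Gibbs-corrected grand canonical series: sum_N z^N Z_N / N! = exp(z * single).
Xi = sum((z * single)**N / math.factorial(N) for N in range(40))
assert abs(Xi - math.exp(z * single)) < 1e-12
```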

1.2.5 Bose–Einstein Statistics

Bosons are fundamentally indistinguishable particles. We are able to say, for instance, that n_k particles have the kth energy value ε_k, for each k, but physically there is no more detailed description to give – the particles have no identities of their own beyond that. The set of all possible configurations is therefore

$$ \Omega^+_{N,K} = \Big\{ (n_1,\dots,n_K) \in (\mathbb N_+)^K : \sum_{k=1}^K n_k = N \Big\}, $$

where ℕ₊ ≜ {0, 1, 2, ...}. In particular, we have $\#\Omega^+_{N,K} = \binom{N+K-1}{N}$. The energy associated with a state n = (n₁, ..., n_K) is then $E(\mathbf n) = \sum_{k=1}^K \varepsilon_k n_k$, and we are led to the Boson canonical partition function

$$ Z^+_N(\beta) = \sum_{\mathbf n\in \Omega^+_{N,K}} e^{-\beta E(\mathbf n)} = \sum_{\mathbf n\in\Omega^+_{N,K}} \prod_{k=1}^{K} e^{-\beta\varepsilon_k n_k}. $$

This time the associated grand canonical partition function is

$$ \Xi^+(\beta,z) = \sum_{N=0}^\infty z^N Z_N^+(\beta) = \sum_{(n_1,\dots,n_K)\in(\mathbb N_+)^K}\; \prod_{k=1}^K \big(z e^{-\beta\varepsilon_k}\big)^{n_k} = \prod_{k=1}^K \sum_{n=0}^\infty \big(z e^{-\beta\varepsilon_k}\big)^n = \prod_{k=1}^K \frac{1}{1 - z e^{-\beta\varepsilon_k}}, \qquad (1.1) $$

where again we use the $\sum\prod \leftrightarrow \prod\sum$ Lemma at the last stage. The thermodynamic potentials have the correct scaling properties, and we do not have to resort to any ad hoc corrections of the type needed for Maxwell–Boltzmann statistics.
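The product formula (1.1) can be checked against a truncated brute-force sum over occupation numbers; in the sketch below (our own, with arbitrary illustrative energies) the truncation at n_max introduces only a geometrically small error:

```python
import math
from itertools import product as cartesian

def Xi_bose_bruteforce(beta, z, energies, nmax):
    """Grand canonical sum over occupation numbers (n_1,...,n_K), truncated at nmax."""
    total = 0.0
    for occ in cartesian(range(nmax + 1), repeat=len(energies)):
        total += math.prod((z * math.exp(-beta * e))**n for e, n in zip(energies, occ))
    return total

def Xi_bose_product(beta, z, energies):
    """Closed form (1.1): product of geometric series, one per mode."""
    return math.prod(1.0 / (1.0 - z * math.exp(-beta * e)) for e in energies)

beta, z, energies = 1.0, 0.2, [0.0, 0.4, 1.1]
approx = Xi_bose_bruteforce(beta, z, energies, nmax=60)
assert abs(approx - Xi_bose_product(beta, z, energies)) < 1e-9
```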


1.2.6 Fermi–Dirac Statistics

Fermions are likewise indistinguishable particles; however, we can never have more than one in the same energy state ε_k. The set of all possible configurations is therefore

$$ \Omega^-_{N,K} = \Big\{ (n_1,\dots,n_K) \in (\mathbb N_-)^K : \sum_{k=1}^K n_k = N \Big\}, $$

where ℕ₋ ≜ {0, 1}. In particular, we note that $\#\Omega^-_{N,K} = \binom{K}{N}$. The Fermion canonical partition function is

$$ Z^-_N(\beta) = \sum_{\mathbf n\in\Omega^-_{N,K}}\; \prod_{k=1}^{K} e^{-\beta\varepsilon_k n_k}, $$

and, similar to the Boson case, the grand canonical partition function is

$$ \Xi^-(\beta,z) = \sum_{N=0}^\infty z^N Z^-_N(\beta) = \prod_{k=1}^K \sum_{n=0}^1 \big( ze^{-\beta\varepsilon_k} \big)^n \equiv \prod_{k=1}^K \big( 1 + z e^{-\beta\varepsilon_k} \big). $$

1.2.7 Entropy of Statistical Ensembles

Let Ω be the set of all microstates. The set of all probability measures, Δ, is the set of vectors π = (p(ω))_{ω∈Ω} with p(ω) ≥ 0 and $\sum_{\omega\in\Omega} p(\omega) = 1$. Let Y(ω) be some real-valued function on Ω; then its average for a particular π ∈ Δ is

$$ \mathbb E[Y] \triangleq \sum_{\omega\in\Omega} Y(\omega)\, p(\omega). \qquad (1.2) $$

Let us take N(ω) and E(ω) to be, respectively, the number of particles and the energy associated with a particular microstate ω. We introduce the following sets of microstates: Ω(N) = {ω ∈ Ω : N(ω) = N} and Ω(U, N) = {ω ∈ Ω : E(ω) = U, N(ω) = N}. We write Δ_N and Δ_{U,N} for those measures supported on Ω(N) and Ω(U, N), respectively. In general, for π ∈ Δ we define its entropy (Boltzmann's H-function) to be

$$ H(\pi) = -\sum_{\omega\in\Omega} p(\omega) \ln p(\omega). \qquad (1.3) $$

The microcanonical ensemble is then distinguished as the measure in Δ_{U,N} which maximizes H(π). Indeed, we have W(U, N) = #Ω(U, N), and the maximizing π is the uniform measure p(ω) = 1/W(U, N), with the maximum value then being

$$ \max_{\pi\in\Delta_{U,N}} H(\pi) = \ln W(U,N). \qquad (1.4) $$

The Boltzmann entropy is

$$ S_{\rm micro.}(U,N) = k \max_{\pi\in\Delta_{U,N}} H(\pi). \qquad (1.5) $$

Similarly, we may define

$$ S_{\rm can.}(\bar U, N) = k \max_{\pi\in\Delta_N :\, \mathbb E[E]=\bar U} H(\pi), \qquad (1.6) $$

and

$$ S_{\rm g.c.}(\bar U, \bar N) = k \max_{\pi\in\Delta :\, \mathbb E[E]=\bar U,\ \mathbb E[N]=\bar N} H(\pi), \qquad (1.7) $$

and the maximizing probability measures are the canonical and grand canonical ensembles, respectively. For instance, in the canonical ensemble case, we may employ Lagrange multipliers α, β and equivalently seek the maximizer over π ∈ Δ_N of

$$ H(\pi) - \alpha \sum_{\omega} p(\omega) - \beta \sum_\omega E(\omega)\, p(\omega). $$

Treating the p(ω) as independent, where N(ω) = N, we have ∂/∂p(ω) = 0, implying −1 − ln p(ω) − α − βE(ω) = 0, or

$$ p(\omega) \equiv \begin{cases} \frac{1}{Z_N}\, e^{-\beta E(\omega)}, & N(\omega)=N; \\ 0, & \text{otherwise}, \end{cases} $$

with $Z_N = \sum_{\omega\in\Omega(N)} e^{-\beta E(\omega)}$. This is, of course, the canonical ensemble at inverse temperature β = 1/kT. For this ensemble, we have

$$ \mathbb E^{\beta,N}_{\rm can.}[E] = \frac{1}{Z_N} \sum_{\omega :\, N(\omega)=N} E(\omega)\, e^{-\beta E(\omega)} = -\frac{\partial}{\partial\beta} \ln Z_N = \frac{\partial}{\partial\beta}(\beta F), $$

and we take the unique parameter value of β such that $\mathbb E^{\beta,N}_{\rm can.}[E] = \bar U$. This then yields

$$ S_{\rm can.}(\bar U, N) = -\frac{1}{T}\big( F - \bar U \big), $$

consistent with the definition F = U − TS from thermodynamics.


We have made some tacit assumptions about the various thermodynamic variables appearing in the preceding. In practice, the limit values of the intensive variables per particle will be convex or concave functions, but not necessarily strictly so – that is, they may have a linear part. These are associated with phase transitions, and in such cases the various ensembles may be inequivalent. For more on this, see the review of Touchette (2009) and the references therein.

1.2.8 Integer Partitions

We digress slightly with a tale about a French aristocrat, the Chevalier de Méré, who posed the following early question in probability theory: given a roll of three fair dice, is it more likely to get a total score of 11 or a total score of 12? His answer was that both events were equally likely, and here is the argument. We obtain 11 from three dice in p(11, 3) = 6 ways,

11 = 6 + 4 + 1 = 6 + 3 + 2 = 5 + 5 + 1 = 5 + 4 + 2 = 5 + 3 + 3 = 4 + 4 + 3,

and 12 from three dice in p(12, 3) = 6 ways,

12 = 6 + 5 + 1 = 6 + 4 + 2 = 6 + 3 + 3 = 5 + 5 + 2 = 5 + 4 + 3 = 4 + 4 + 4.

The problem, however, came to the attention of the famous mathematician Pascal, who gave a different answer. The situation where we roll a 6, a 5, and a 1 is not in fact a single outcome but, rather, is an event corresponding to six distinct outcomes – the three dice are distinguishable, so we can go to finer detail and say which die takes which value. Likewise, rolling two 5's and a 1 is an event corresponding to three distinct outcomes, depending on which die is to be the 1. But the event corresponding to three 4's is a single outcome. Pascal argued that de Méré undercounted the events. The numbers of possibilities according to Pascal are p̃(11, 3) = 6 + 6 + 3 + 6 + 3 + 3 = 27 and p̃(12, 3) = 6 + 6 + 3 + 3 + 6 + 1 = 25, respectively. So a total score of 11 is more likely.

In retrospect, Pascal figured out the principle of distinguishability and applied Maxwell–Boltzmann statistics to the dice – as dice are macroscopic entities, this is the correct thing to do. De Méré, on the other hand, was using Bose–Einstein statistics for the dice.

Let us denote the number of de Méré events corresponding to rolling a total score of n from m dice by p(n, m). Is there a nice formula for these numbers? Well, let us assume that each die has K faces with scores 1, 2, ..., K, each with

a probability 1/K to occur. Suppose that we roll n₁ 1's, n₂ 2's, n₃ 3's, and so on, and obtain a total score of n; then

$$ n = \underbrace{K + \cdots + K}_{n_K\ \text{terms}} + \cdots + \underbrace{2 + \cdots + 2}_{n_2\ \text{terms}} + \underbrace{1 + 1 + \cdots + 1}_{n_1\ \text{terms}}, \qquad (1.8) $$

and in the process we must have rolled m = n_K + ··· + n₂ + n₁ dice. We refer to (1.8) as an integer partition – more specifically, an integer partition of n into m parts.

Lemma 1.2.2 The numbers p(n, m) of integer partitions of n into m parts have the following generating function:

$$ \sum_{n,m} p(n,m)\, z^m x^n = \prod_{k=1}^{K} \frac{1}{1 - z x^k}. $$

Proof Let n = (n₁, n₂, ..., n_K) ∈ (ℕ₊)^K give a de Méré event; then it corresponds to rolling $m = \sum_{k=1}^K n_k$ dice and getting a total score of $n = \sum_{k=1}^K k n_k$, from (1.8). Each de Méré event corresponds to an integer partition, and we have

$$ \sum_{n,m} p(n,m)\, z^m x^n = \sum_{\mathbf n\in(\mathbb N_+)^K} z^{\sum_{k=1}^K n_k}\, x^{\sum_{k=1}^K k n_k}. $$

However, this is just the Bose grand canonical partition function (1.1) with x = e^{−β} and ε_k = k. □

This result, in fact, goes back to Euler. Note that we may even take our dice to have K = ∞ faces if desired.
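Both countings, and the generating function itself, can be checked by direct computation; the sketch below (our own illustration, with K = 6 faces) compares Pascal's ordered count, de Méré's unordered count, and the coefficients p(n, m) obtained by multiplying in the factors 1/(1 − z x^k) one at a time:

```python
from itertools import product

# Pascal's counting: all 6^3 ordered outcomes of three distinguishable dice.
outcomes = list(product(range(1, 7), repeat=3))
assert sum(1 for o in outcomes if sum(o) == 11) == 27
assert sum(1 for o in outcomes if sum(o) == 12) == 25

# De Mere's counting: unordered outcomes, i.e. Bose-Einstein statistics for dice.
unordered = {tuple(sorted(o)) for o in outcomes}
assert sum(1 for o in unordered if sum(o) == 11) == 6
assert sum(1 for o in unordered if sum(o) == 12) == 6

def p_table(K, max_n, max_m):
    """p(n, m): integer partitions of n into m parts of size <= K, built from
    the generating-function factors 1/(1 - z x^k), multiplied in one at a time."""
    p = [[0] * (max_m + 1) for _ in range(max_n + 1)]
    p[0][0] = 1
    for k in range(1, K + 1):
        for n in range(k, max_n + 1):      # ascending n allows any number of k-parts
            for m in range(1, max_m + 1):
                p[n][m] += p[n - k][m - 1]
    return p

p = p_table(K=6, max_n=12, max_m=3)
assert p[11][3] == 6 and p[12][3] == 6     # de Mere's counts recovered
```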

1.3 Combinatorial Coefficients We will now introduce formal notations for some of the combinatorial objects appearing in the preceding.

1.3.1 Factorials

The nth falling factorial power is defined by¹

$$ x^{\underline n} \triangleq x(x-1)(x-2)\cdots(x-n+1), \qquad (1.9) $$

as well as the nth rising factorial power

$$ x^{\overline n} \triangleq x(x+1)(x+2)\cdots(x+n-1). \qquad (1.10) $$

(We also set $x^{\underline 0} = x^{\overline 0} = 1$.) The falling and rising binomial coefficients may then be defined to be²

$$ \binom{n}{k}_{\!-} \triangleq \frac{n^{\underline k}}{k!} = \frac{n(n-1)\cdots(n-k+1)}{k!}, \qquad \binom{n}{k}_{\!+} \triangleq \frac{n^{\overline k}}{k!} = \frac{n(n+1)\cdots(n+k-1)}{k!}. $$

We remark that the falling binomial coefficients $\binom{n}{k}_-$ are just the familiar n-choose-k binomial coefficients $\binom{n}{k}$. It is easy to derive the following properties. The falling and rising binomial coefficients satisfy:

1. $\binom{n}{k}_+ = \binom{n+k-1}{k}_-$.
2. $\binom{n}{k}_+ = (-1)^k \binom{-n}{k}_-$.
3. Recurrence relations,
$$ \binom{n}{k}_{\!-} = \binom{n-1}{k-1}_{\!-} + \binom{n-1}{k}_{\!-}, \qquad \binom{n}{k}_{\!+} = \binom{n}{k-1}_{\!+} + \binom{n-1}{k}_{\!+}, $$
with $\binom{n}{0}_\pm = 1$ and $\binom{0}{k}_\pm = 0$ if k ≠ 0. The first of these can be used to generate the usual binomial coefficients through the Pascal triangle construction.
4. Generating relations,
$$ \sum_{k\ge 0} \binom{n}{k}_{\!\pm} t^k = (1 \mp t)^{\mp n}. $$

The generating relations are both instances of the general binomial theorem $(1+t)^p = \sum_{k\ge 0}\binom{p}{k} t^k$, valid for |t| < 1.

¹ We refer to these as the Pochhammer symbols, though the notation we use here is due to Graham, Knuth, and Patashnik (1988). The symbol $x^{\underline n}$ is pronounced "x to the n falling."
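The identities above are easily verified numerically; in the sketch below (helper names are our own) property 1 and the generating relation for the falling coefficients are checked directly:

```python
import math

def falling(x, n):
    """x(x-1)...(x-n+1); the empty product (n = 0) is 1."""
    return math.prod(x - j for j in range(n))

def rising(x, n):
    """x(x+1)...(x+n-1)."""
    return math.prod(x + j for j in range(n))

def binom_minus(n, k):
    return falling(n, k) // math.factorial(k)

def binom_plus(n, k):
    return rising(n, k) // math.factorial(k)

for n in range(1, 7):
    # Property 1: (n k)_+ = C(n+k-1, k).
    for k in range(0, 7):
        assert binom_plus(n, k) == math.comb(n + k - 1, k)
    # Generating relation: sum_k (n k)_- t^k = (1 + t)^n.
    t = 0.5
    lhs = sum(binom_minus(n, k) * t**k for k in range(n + 1))
    assert abs(lhs - (1 + t)**n) < 1e-12
```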

² There is no established notation for these, and we use + and − due to the eventual connection with Bose and Fermi counting statistics.

1.3.2 Stirling Numbers

We now derive the connection between the ordinary and falling factorial powers. We begin by remarking that the right-hand sides of (1.9) and (1.10) may be expanded to obtain a polynomial in x of degree n with integer coefficients. The Stirling numbers of the first and second kind are defined, respectively, as the coefficients $\big[{n\atop m}\big]$ and $\big\{{n\atop m}\big\}$ appearing in the relations

$$ x^{\overline n} \equiv \sum_m \Big[{n\atop m}\Big]\, x^m; \qquad x^n \equiv \sum_m \Big\{{n\atop m}\Big\}\, x^{\underline m}. \qquad (1.11) $$

The nth Bell number, B_n, is then defined to be

$$ B_n \triangleq \sum_{m=1}^{n} \Big\{{n\atop m}\Big\}. \qquad (1.12) $$

The Stirling numbers satisfy the following properties:

1. $\big[{n\atop n}\big] = \big\{{n\atop n}\big\} = \big\{{n\atop 1}\big\} = 1$.
2. The Stirling numbers of the first kind are integers satisfying $\big[{n\atop m}\big] \ge 0$. It turns out that this is also true of the second kind numbers.
3. We have $\big[{n\atop m}\big] = 0 = \big\{{n\atop m}\big\}$ unless m ∈ {1, ..., n}.
4. The set of rising power polynomials $\{x^{\overline n} : n = 0, 1, 2, \dots\}$ is linearly independent, as is the set of falling power polynomials.
5. The Stirling numbers are dual in the sense that if we introduce the pair of infinite matrices s and S with entries $s_{nm} = \big[{n\atop m}\big]$ and $S_{nm} = \big\{{n\atop m}\big\}$ for n, m ∈ {0, 1, 2, ...}, then sCS = SCs = I, where C is the checkerboard signed matrix $C_{nm} = (-1)^{n+m}$. In particular, s⁻¹ = CS and S⁻¹ = Cs. This follows from observing that
$$ x^{\underline n} \equiv \sum_m (-1)^{n+m} \Big[{n\atop m}\Big]\, x^m, \qquad x^n \equiv \sum_m (-1)^{n+m} \Big\{{n\atop m}\Big\}\, x^{\overline m}, $$
and so, from the defining relations and linear independence, we see for instance that
$$ \sum_k (-1)^{n+k} \Big\{{n\atop k}\Big\} \Big[{k\atop m}\Big] = \delta_{nm}. \qquad (1.13) $$
6. The Stirling numbers satisfy the recurrence relations (Stirling's identities)
$$ \Big[{n+1\atop m}\Big] = \Big[{n\atop m-1}\Big] + n \Big[{n\atop m}\Big], \qquad (1.14) $$
$$ \Big\{{n+1\atop m}\Big\} = \Big\{{n\atop m-1}\Big\} + m \Big\{{n\atop m}\Big\}, \qquad (1.15) $$
with $\big[{1\atop 1}\big] = 1 = \big\{{1\atop 1}\big\}$. (This follows readily from the relations $x^{\overline{n+1}} = x^{\overline n}(x + n)$ and $x \times x^{\underline m} = (x - m + m)\, x^{\underline m} = x^{\underline{m+1}} + m\, x^{\underline m}$.)
7. $\sum_{m=1}^{n} \big[{n\atop m}\big] = 1^{\overline n} = n!$. (Substitute x = 1 in the defining relation (1.11). Note that there is no such simple expression for the Bell numbers.)

From the Stirling identities, the Stirling numbers may then be generated recursively using a construction similar to Pascal's triangle, viz.,

1

2

3

4

5

6

1 2 3 4 5 6

1 1 2 6 24 120

1 3 11 50 274

1 6 35 225

1 10 85

1 15

1

  n m

n\m

1

2

3

4

5

6

1 2 3 4 5 6

1 1 1 1 1 1

1 3 7 15 31

1 6 25 90

1 10 65

1 15

1

  n m

The first few Bell numbers are

n     1   2   3   4    5    6     7     8
Bn    1   2   5   15   52   203   877   4140

1.4 Sets and Bags In this section, we fix a set X. A sample of size n drawn from X is a sequence x1 , x2 , . . . , xn of elements taken one after the other from X. Ultimately we are only interested in the elements drawn and not the order in which they were drawn. We say that the sampling was done without replacement if, once an element was drawn, it was not available to be drawn again: in this case, we have that {x1 , . . . , xn } will be a set with n distinct elements. If replacement is allowed, then some elements may be drawn more than once and we refer to such instances as coincidences.


A bag, or multiset, drawn from a set X is an unordered (finite!) collection x₁, ..., xₙ ∈ X where we may have some coincidences; that is, several of the elements of the bag may be the same element of X. A set drawn from X is a bag in which there are no coincidences; that is, no element of X may appear more than once. We shall write Bag(X) and Power(X), respectively, for the collections of bags and sets³ that we can draw from X.

For a countable set X, the occupation numbers for bags drawn from X are the functions n_x : Bag(X) → {0, 1, 2, ...} = ℕ₊, with n_x(A) counting the number of times an element x ∈ X appears in the bag A.

Proposition 1.4.1 There is a one-to-one correspondence between bags drawn from a set X and sequences of occupation numbers n = (n_x)_{x∈X}, that is, elements of (ℕ₊)^X. Sets are then just the bags with no coincidences, and so the occupation numbers for sets are restricted to (ℕ₋)^X. In fact, if X consists of N elements, we have the enumerations

• the number of bags of size m is $\binom{N}{m}_+ \equiv \binom{N+m-1}{m}$;
• the number of sets of size m is $\binom{N}{m}_- \equiv \binom{N}{m}$.

Note that the total number of sets that can be drawn from a set of size N is $\sum_m \binom{N}{m} = 2^N$. The total number of bags will always be infinite! It is worthwhile comparing these observations to the original problem of distributing m indistinguishable balls among N distinguishable urns!

Corollary 1.4.2 Let us define the size function for a bag (or a set) as

$$ N(\cdot) = \sum_{x\in X} n_x(\cdot). \qquad (1.16) $$

Then

$$ \sum_{A\in\mathrm{Bag}(X)} t^{N(A)} = (1-t)^{-N}, \qquad (1.17) $$
$$ \sum_{A\in\mathrm{Power}(X)} t^{N(A)} = (1+t)^{N}. \qquad (1.18) $$

The proof in either case comes down to setting $f(x, n_x) = t^{n_x}$ in the $\sum\prod \leftrightarrow \prod\sum$ Lemma and observing that the sum over bags/subsets equates with a sum over the corresponding occupation sequences. (In fact, it is just a special case of the calculations for the grand canonical Boson and Fermion partition functions!) Fixing t ∈ (−1, 1), we have

$$ \sum_{A\in\mathrm{Bag}(X)} t^{N(A)} = \big( 1 + t + t^2 + \cdots \big)^N = (1-t)^{-N}. $$

The argument for sets is similar. We recognize the generating functions here for $\binom{N}{m}_\pm$, so these clearly give the number of bags/sets of size m that may be drawn from N elements.

³ The set of all subsets of a given set X is traditionally called the power set of X.

1.5 Permutations and Partitions

A permutation of a set of elements is an arrangement of those elements into a particular order. The set of permutations over a set X will be denoted by Perm(X). Suppose for concreteness that X = {x₁, ..., xₙ}; then a permutation σ ∈ Perm(X) is a bijective mapping from X to itself, and this may be uniquely represented by the sequence (σ(x₁), ..., σ(xₙ)), which of course must be a list of all elements of X with no repetition. Perm(X) forms a nonabelian group under composition. We shall use the notation σ⁰ = id, σ¹ = σ, σ² = σ ∘ σ, and so on.

Given a permutation σ ∈ Perm(X), the orbit of an element x ∈ X under σ is the sequence x, σ(x), σ²(x), .... As the orbit must lie within X, it is clear that we eventually must have σᵏ(x) = x for some 0 < k ≤ n: the smallest such value is called the period of the orbit, and clearly the orbit repeats itself beyond this point (σ^{n+k}(x) = σⁿ(x)). The ordered collection [x; σ(x); σ²(x); ...; σ^{k−1}(x)] is referred to as a cycle, or more explicitly a k-cycle. Cycles will be considered to be equivalent under cyclic permutation in the sense that [a₁; a₂; ...; a_k] is not distinguished from [a₂; a₃; ...; a_k; a₁], etc. Thus each k-cycle is equivalent to k sequences, depending on where on the orbit we choose to start.

Clearly, orbits arising from the same permutation σ either coincide or are completely disjoint; this simple observation leads to the cyclic factorization theorem for permutations: each permutation σ can be uniquely written as a collection of disjoint cycles. An element x ∈ X is a fixed point of the permutation σ if x = σ(x) – that is, x forms a 1-cycle. We would like to work out the number s(n, m) of permutations on n elements that have exactly m cycles – these turn out to be the Stirling numbers of the first kind.



Lemma 1.5.1 (Counting Permutations) Let Permm (X) be the set of permutations in Perm(X) having exactly m cycles. If X has size n, then the number of permutations in Perm(X) is given by the Stirling numbers of the first n kind m . Proof Let X = {x1 , . . . , xn } and suppose Permm (X) = s(n, m). Suppose that we have a new element xn+1 not in X, then Perm(X ∪ {xn+1 }) has s(n + 1, m) permutations. Some of these will have xn+1 as a fixed point – these permutations will then have m − 1 other cycles drawn from X, and so there will be s(n, m − 1) of these permutations. The remaining permutations will have xn+1 included in a cycle of period two or more: now if we take any permutation in Permm (X), then we could insert the xn+1 before any one of the elements x ∈ X in the cyclic decomposition – in this way, we would generate all permutations of this particular type, and so there are ns(n, m) such possibilities. We therefore obtain s(n+1, m) = s(n, m−1)+ns(n, m), which is the same recurrence relation (1.14) as the Stirling numbers of the first kind. Clearly   s(1, 1) = 1 = s(n, n) while s(n, m) = 0 if m > n. Therefore, s(n, m) ≡ mn . A partition of a set X is an unordered collection (set!) of nonempty subsets whose union is X. The subsets are called the blocks of the partition. We denote the set of all partitions of X by Part(X), and specifically the set of all partitions into exactly n blocks by Partn (X). A partition will be called a pair partition if each of its blocks contains exactly two elements, and we shall denote the set of all pair partitions of X by Pair (X). Lemma 1.5.2 (Counting Pair Partitions) A finite set X will have no pair partitions if it has an odd number of elements. If X has 2k elements then

|Pair (X) | =

(2k) ! . 2k k!

Proof Without loss of generality, take X = {1, 2, ..., 2k}. We have 2k × (2k − 1) choices for the first pair, then (2k − 2) × (2k − 3) for the second, and so on. This gives a total of (2k)!. However, we have overcounted by a factor of k!, as we do not label the pairs, and by 2ᵏ, as we do not label the elements within each pair either. On the other hand, if X has an odd number of elements, then there is no way to partition it into pairs. □

The number S(n, m) of partitions of n elements into m parts turns out to be the Stirling numbers of the second kind. For instance, the set {1, 2, 3, 4} can be partitioned into 2 blocks in S(4, 2) = 7 ways:


{{1, 2} , {3, 4}} , {{1, 3} , {2, 4}} , {{1, 4} , {2, 3}} ,


{{1} , {2, 3, 4}} , {{2} , {1, 3, 4}} , {{3} , {1, 2, 4}} , {{4} , {1, 2, 3}} ,

and into 3 blocks in S (4, 3) = 6: {{1} , {2} , {3, 4}} ,

{{1} , {3} , {2, 4}} ,

{{1} , {4} , {2, 3}} ,

{{2} , {3} , {1, 4}} ,

{{2} , {4} , {1, 3}} ,

{{3} , {4} , {1, 2}} .

In combinatorial terms, S(n, m) counts the number of ways to distribute n distinguishable balls among m urns, where we do not distinguish the urns and no urn is allowed to be empty.

Lemma 1.5.3 (Counting All Partitions) The number of partitions of a set X of size n having exactly m blocks is the Stirling number of the second kind $\big\{{n\atop m}\big\}$. In particular, the Bell numbers B_n count the number of ways to partition a set X of size n.

Proof Let S(n, m) be the number of partitions of n items into exactly m blocks. We first of all show that we have the formula S(n + 1, m) = S(n, m − 1) + m S(n, m). This is relatively straightforward. We see that S(n + 1, m) counts the number of partitions of a set X = {x₁, ..., xₙ, x_{n+1}} having m blocks. Some of these will have the singleton {x_{n+1}} as a block: there will be S(n, m − 1) of these, as we have to partition the remaining elements into m − 1 blocks. The others will have x_{n+1} appearing with at least some other elements in a block: we have S(n, m) partitions of the remaining elements into m blocks, and we then may place x_{n+1} into any one of these m blocks, yielding m S(n, m) possibilities. Clearly S(1, 1) = 1 and S(n, n) = 1, while S(n, m) = 0 if m > n. The numbers S(n, m) therefore satisfy the same recurrence relation (1.15) as the $\big\{{n\atop m}\big\}$, and so are one and the same. □

The proofs of the enumerations followed from elementary arguments in which we showed that s(n, m) and S(n, m) satisfy the same recurrence relations as the Stirling numbers. We will reestablish this later using the powerful machinery of combinatorial species.


1.5.1 The Lattice of Partitions

A partial ordering of Part(X) is given by saying that π ≤ Σ if every block of Σ is a union of one or more blocks of π. In such situations, we say that π is finer than Σ. We also write π < Σ if we have π ≤ Σ and π ≠ Σ. We then have a minimum (finest) element given by

0(X) ≜ {{x} : x ∈ X},

and a largest (coarsest) element

1(X) ≜ {X},

and for all π ∈ Part(X) we have 0(X) ≤ π ≤ 1(X). We shall often encounter the collection of partitions that are strictly finer than 1(X) (proper partitions!), and give this the following special notation:

Part_<(X) ≜ {π ∈ Part(X) : π < 1(X)}.

Mathematically, the collection Part(X) forms a lattice, meaning that for every pair of partitions π and Σ, there is a coarsest one π ∧ Σ (the meet of π and Σ) that is finer than both, and a finest one π ∨ Σ (the join of π and Σ) that is coarser than both. Specifically, π ∧ Σ is the partition whose blocks are all nonempty intersections of a block from π with a block from Σ.

1.6 Occupation Numbers

1.6.1 Occupation Numbers for Partitions

Given π ∈ Part(X), we let n_k(π) denote the number of blocks in π having size k. We shall refer to the n_k as occupation numbers, and we introduce the functions

$$ N(\pi) = \sum_{k\ge 1} n_k(\pi), \qquad E(\pi) = \sum_{k\ge 1} k\, n_k(\pi). \qquad (1.19) $$

Note that a partition π partitions a set of E(π) = n elements into N(π) = m blocks. It is sometimes convenient to replace sums over partitions with sums over occupation numbers. Recall that a partition π will have occupation numbers n = (n₁, n₂, n₃, ...), and we have N(π) = N(n) = n₁ + n₂ + n₃ + ··· and E(π) = E(n) = n₁ + 2n₂ + 3n₃ + ···.

Lemma 1.6.1 The number of partitions of a set of n elements leading to the same set of occupation numbers n is given by

$$ \nu_{\rm Part}(\mathbf n) = \frac{n!}{(1!)^{n_1}(2!)^{n_2}(3!)^{n_3}\cdots}\;\frac{1}{n_1!\, n_2!\, n_3!\cdots} = n! \prod_{k=1}^{\infty} \frac{1}{(k!)^{n_k}\, n_k!}. \qquad (1.20) $$

The proof follows from the observation that there are n! ways to distribute the n objects; however, we do not label the n_k blocks of size k, nor their contents.

Bell Polynomials

We remark that the multinomials defined by

$$ \mathrm{Bell}(n, m;\, \mathbf z) = \sum_{\mathbf n\,:\,E(\mathbf n)=n,\ N(\mathbf n)=m} \nu_{\rm Part}(\mathbf n)\; z_1^{n_1} z_2^{n_2} \cdots, $$

for complex sequences z = (z₁, z₂, ...), are known as the Bell polynomials. In particular, the Stirling numbers of the second kind take the form

$$ \Big\{{n\atop m}\Big\} = \mathrm{Bell}(n, m; 1, 1, 1, \dots) = \sum_{E(\mathbf n)=n,\ N(\mathbf n)=m} \nu_{\rm Part}(\mathbf n), $$

or

$$ \Big\{{n\atop m}\Big\} = \sum_{E(\mathbf n)=n,\ N(\mathbf n)=m} n! \prod_{k=1}^{\infty} \frac{1}{(k!)^{n_k}\, n_k!}. $$

Theorem 1.6.2 (The composition formula) Let $h(x) = \sum_{n=1}^\infty \frac{1}{n!} h_n x^n$ and $g(x) = \sum_{n=1}^\infty \frac{1}{n!} g_n x^n$ be a pair of analytic functions about the origin. Then their composition h ∘ g has the expansion $\sum_{n=1}^\infty \frac{1}{n!} (h\circ g)_n x^n$, where

$$ (h\circ g)_n = \sum_m h_m\, \mathrm{Bell}(n, m;\, g_1, g_2, g_3, \dots). \qquad (1.21) $$

We introduce some notation based on this result: let h = (h_n)_n and g = (g_n)_n be sequences of scalars, n = 1, 2, 3, ...; then we define h © g to be the sequence whose nth term is (h ∘ g)_n as in (1.21).

Proof We have

$$ (h\circ g)(x) = \sum_{m=1}^\infty \frac{h_m}{m!} \Big( \sum_{k} \frac{g_k}{k!}\, x^k \Big)^{\!m} $$
$$ = \sum_{m=1}^\infty \frac{h_m}{m!} \sum_{\mathbf n\,:\,N(\mathbf n)=m} \binom{m}{n_1, n_2, n_3, \dots} \prod_{k=1}^{\infty} \Big( \frac{g_k}{k!}\, x^k \Big)^{\!n_k} $$
$$ = \sum_{m=1}^\infty h_m \sum_{N(\mathbf n)=m} \frac{1}{E(\mathbf n)!}\, E(\mathbf n)! \prod_{k=1}^{\infty} \frac{(g_k)^{n_k}}{(k!)^{n_k}\, n_k!}\; x^{E(\mathbf n)} $$
$$ = \sum_{m=1}^\infty h_m \sum_{N(\mathbf n)=m} \frac{1}{E(\mathbf n)!}\, \nu_{\rm Part}(\mathbf n) \prod_{k=1}^{\infty} g_k^{n_k}\; x^{E(\mathbf n)} $$
$$ = \sum_{n,m=1}^\infty \frac{1}{n!}\, h_m\, \mathrm{Bell}(n, m;\, g_1, g_2, g_3, \dots)\, x^n. \qquad\Box $$

Along similar lines, we may prove

Faà di Bruno's Formula Let h and g be smooth functions; then

$$ \frac{d^n}{dx^n}\, h\circ g\,(x) = \sum_m \frac{d^m h}{dx^m}\bigg|_{g(x)} \mathrm{Bell}\Big(n, m;\, \frac{dg}{dx}, \frac{d^2 g}{dx^2}, \dots\Big). $$
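The Bell polynomials can be computed by enumerating occupation sequences directly, which makes the identities above easy to test numerically. In the sketch below (our own helper names), Bell(n, m; 1, 1, ...) recovers the second-kind Stirling numbers and hence the Bell numbers, while the choice z_k = (k − 1)! recovers the first kind:

```python
import math

def occupations(n, m):
    """All occupation sequences {k: n_k} with sum k*n_k = n and sum n_k = m."""
    def rec(k, n_left, m_left):
        if k > n:
            if n_left == 0 and m_left == 0:
                yield {}
            return
        for nk in range(min(n_left // k, m_left) + 1):
            for tail in rec(k + 1, n_left - k * nk, m_left - nk):
                if nk:
                    tail = dict(tail)
                    tail[k] = nk
                yield tail
    return rec(1, n, m)

def bell_poly(n, m, z):
    """Bell(n, m; z_1, z_2, ...) with z given as a function of k."""
    total = 0
    for occ in occupations(n, m):
        nu, term = math.factorial(n), 1
        for k, nk in occ.items():          # nu becomes nu_Part(n) of (1.20)
            nu //= math.factorial(k)**nk * math.factorial(nk)
            term *= z(k)**nk
        total += nu * term
    return total

# Bell(n, m; 1, 1, ...) = {n m}; summing over m gives the Bell numbers B_n.
bell_numbers = [sum(bell_poly(n, m, lambda k: 1) for m in range(1, n + 1))
                for n in range(1, 9)]
assert bell_numbers == [1, 2, 5, 15, 52, 203, 877, 4140]

# Bell(n, m; 0!, 1!, 2!, ...) = [n m]: row n = 5 of the first-kind triangle.
row5 = [bell_poly(5, m, lambda k: math.factorial(k - 1)) for m in range(1, 6)]
assert row5 == [24, 50, 35, 10, 1]
```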

1.6.2 Occupation Numbers for Permutations

In the same vein, we may introduce occupation numbers for permutations. Given a permutation σ ∈ Perm(X), we let n_k(σ) be the number of k-cycles in the decomposition of σ. Similarly, setting $N(\sigma) = \sum_{k\ge 1} n_k(\sigma)$ and $E(\sigma) = \sum_{k\ge 1} k\, n_k(\sigma)$, we have that σ is a permutation of E(σ) elements into N(σ) cycles. In general, several distinct permutations on the same set of n objects can lead to the same set of occupation numbers, and the degeneracy is now given by

$$ \nu_{\rm Perm}(\mathbf n) = \frac{n!}{(1)^{n_1}(2)^{n_2}(3)^{n_3}\cdots\; n_1!\, n_2!\, n_3!\cdots} = n! \prod_{k=1}^{\infty} \frac{1}{k^{n_k}\, n_k!}. $$

(This time we must compensate for the k different ways to represent a k-cycle as a sequence.)
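The degeneracy formula ν_Perm(n) can be checked against a brute-force classification of permutations by cycle type (an illustrative check of our own, not from the text):

```python
import math
from itertools import permutations
from collections import Counter

def cycle_type(perm):
    """Occupation numbers {k: n_k} of a permutation's cycle decomposition."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start not in seen:
            j, length = start, 0
            while j not in seen:
                seen.add(j)
                j = perm[j]
                length += 1
            lengths.append(length)
    return frozenset(Counter(lengths).items())

def nu_perm(occ, n):
    """nu_Perm(n) = n! * prod_k 1/(k^{n_k} n_k!)."""
    val = math.factorial(n)
    for k, nk in occ:
        val //= k**nk * math.factorial(nk)
    return val

n = 6
counts = Counter(cycle_type(p) for p in permutations(range(n)))
for occ, count in counts.items():
    assert count == nu_perm(occ, n)      # e.g. the 6-cycles: 720/6 = 120
```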

For Stirling numbers of the first kind, we have

$$ \Big[{n\atop m}\Big] = \sum_{E(\mathbf n)=n,\ N(\mathbf n)=m} n! \prod_{k=1}^{\infty} \frac{1}{n_k!\, k^{n_k}} = \mathrm{Bell}(n, m;\, 0!,\, 1!,\, 2!,\, \dots). $$

1.7 Hierarchies (= Phylogenetic Trees = Total Partitions)

1.7 Hierarchies (= Phylogenetic Trees = Total Partitions) Given a set of n items, say {1, 2, . . . , n}, with n ≥ 2. We partition the set into two or more parts – that is, make a proper partition. We may repeat this procedure by making a proper partition each nonsingleton block until we have eventually “atomized” down to the finest partition 0n = {{1}, {2}, . . . , {n}}. We refer to this as a total partition. For example, let X = {1, 2, . . . , 10}; then an example of a total partition is given by π0 = 110 ≡ {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, π1 = {{1, 2, 5, 7, 8}, {3}, {4, 6, 9, 10}}, π2 = {{1, 2}, {5}, {7, 8}, {3}, {4}, {6, 9, 10}}, π3 = 010 ≡ {{1}, {2}, . . . , {n}}. This gives a sequence 010 = π3 < π2 < π1 < π0 = 110 . This increasingly finer sequence of partitions can be can be described by Figure 1.7. Alternatively, we may sketch this as a (phylogenetic) tree; see Figure 1.8. Each node of the tree corresponds to a block B, and at each stage we have a proper partition π = {A1 , . . . , Am } ∈ Part< (B) and the branches are then labeled

Figure 1.7 A hierarchy as an increasingly finer sequence of partitions of a set down to the finest partition 0n .

Figure 1.8 A hierarchy as a tree.

Each node of the tree corresponds to a block B, and at each stage we have a proper partition π = {A₁, ..., A_m} ∈ Part_<(B); the branches are then labeled by the parts of π. We may draw this section of the tree as a directed graph: a node B with m branches leading down to the nodes A₁, A₂, ..., A_m. The requirement that the partition be proper means that we have at least two branches! Eventually we come to a point where we are down to singletons and the process terminates. Note that the set of leaves of the tree is then just the original set X, and that for any part B appearing in the hierarchy (that is, any node), the actual set B is just the subset of leaves descended from that node. The object obtained in this way is termed a hierarchy.

Let X be a finite set. A hierarchy on X is a directed tree having subsets of X as nodes, with the property that the sets at nodes coming immediately from a parent node form a partition of the set at the parent node. X (or equivalently 1(X)) is the root of the tree, and the leaves (terminal nodes) are the singletons of X (equivalently 0(X)). We denote the collection of all hierarchies on a set X by Hier(X). As we have seen, a hierarchy on X is a strictly ordered sequence of partitions 0(X) = π_r < ··· < π₂ < π₁ < π₀ = 1(X), and, in the tree representation, the integer r gives the length of the longest branch.


We would like to know the values of h_n, the number of hierarchies on a set of n elements. We may work out the lowest enumerations. When n = 2, we have just the one tree: the root with two leaves. When n = 3, there are two topologically distinct trees and, when we count the number of ways to attach the leaves, we have h₃ = 1 + 3 = 4 possibilities. When n = 4, there are five topologically distinct trees, which implies that h₄ = 1 + 4 + 6 + 12 + 3 = 26.

1

2

3

4

5

6

7

8

hn

1

1

4

26

236

2752

39208

660032

. We will say slightly more about the enumeration of hierarchies after we encounter the methods of combinatorial species later.
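Although the species machinery comes later, the values of h_n above can already be reproduced from the recursion implicit in the definition: a hierarchy on n ≥ 2 elements is a proper partition of the root together with a hierarchy on each block, so h_n = ∑ ν_Part(n) ∏_k h_k^{n_k}, the sum running over occupation sequences with at least two parts. A sketch (our own, using the degeneracy formula (1.20)):

```python
import math
from functools import lru_cache

def occupations(n):
    """All occupation sequences {k: n_k} with sum k*n_k = n."""
    def rec(k, n_left):
        if n_left == 0:
            yield {}
            return
        if k > n_left:
            return
        for nk in range(n_left // k + 1):
            for tail in rec(k + 1, n_left - k * nk):
                if nk:
                    tail = dict(tail)
                    tail[k] = nk
                yield tail
    return rec(1, n)

@lru_cache(maxsize=None)
def h(n):
    """Hierarchies on n elements: sum over proper partitions of the root,
    weighting each block of size k by h(k)."""
    if n == 1:
        return 1
    total = 0
    for occ in occupations(n):
        if sum(occ.values()) < 2:          # proper partitions only
            continue
        nu, term = math.factorial(n), 1    # nu: partitions with this shape, (1.20)
        for k, nk in occ.items():
            nu //= math.factorial(k)**nk * math.factorial(nk)
            term *= h(k)**nk
        total += nu * term
    return total

assert [h(n) for n in range(1, 9)] == [1, 1, 4, 26, 236, 2752, 39208, 660032]
```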

1.8 Partitions

Let Part(n) denote the class of partitions on a set of n items, which for definiteness we may take to be {1, ..., n}. The class of all partitions is then Part = ∪_{n≥1} Part(n). We have introduced the occupation numbers n_k as the functions on Part counting the number of blocks of a partition having exactly k elements. We let N denote the sigma-algebra on Part generated by the occupation number functions. We recall that we have already defined two N-measurable functions

$$ E(\pi) = \sum_k k\, n_k(\pi), \qquad N(\pi) = \sum_k n_k(\pi), $$

giving the number of elements being partitioned by π and the number of parts of π, respectively. We say that an N-measurable function ψ is multiplicative if it takes the product form

$$ \psi(\pi) \equiv \prod_{k=1}^{\infty} \lambda_k^{\,n_k(\pi)} \qquad (1.22) $$


for scalars λ1 , λ2 , λ3 , . . . . The sequence λ of scalars uniquely determines the multiplicative function, and vice versa, so we may write ψ = ψλ if we wish to emphasize this dependence.

1.8.1 Coarse Graining and Möbius Inversion

Let us fix a finite set X. We recall that the set of partitions of X is a partially ordered set, in fact a lattice. A function f : Part(X) × Part(X) → C is an incidence kernel if f(π, σ) vanishes in all cases where we do not have π ≤ σ. Incidence kernels can be added in the obvious way, and a product ∗ is given by the convolution

(f ∗ g)(π, σ) = Σ_{π≤ρ≤σ} f(π, ρ) g(ρ, σ).

The algebra spanned by incidence kernels over a general partially ordered set P is called the incidence algebra, I(P), and its identity is the Kronecker delta δ(π, σ), defined to be 1 when π = σ and 0 otherwise. An important example of an incidence kernel is the zeta kernel ζ(π, σ), defined to be 1 when π ≤ σ and 0 otherwise. The zeta kernel has an inverse in the incidence algebra, called the Möbius kernel μ, and we shall construct μ in the following. We first remark that if π ≤ σ, then we can introduce the occupation numbers n_k(π, σ) counting the number of blocks of σ that occur as the union of exactly k blocks of π. We say that an incidence kernel Λ is multiplicative if we have

Λ(π, σ) ≡ Π_k λ_k^{n_k(π,σ)}

for scalars λ1, λ2, λ3, . . . . (We understand n_k(π, σ) ≡ 0 if π ≰ σ.) We may write Λ ≡ Λ_λ for emphasis. We also have the identity

Λ_λ(0_n, 1_n) = λ_n

(1.23)


since 1_n is a single block of size n, i.e. n_k(0_n, 1_n) = 1 if k = n and 0 otherwise.

The collection of partitions ρ such that π ≤ ρ ≤ σ is called a segment, and we denote this as [π, σ]. We note that the segment is isomorphic to the product set

[π, σ] ≅ ×_{k=1}^∞ Part(k)^{n_k(π,σ)},

and we have N(σ) = Σ_k n_k(π, σ) and N(π) = Σ_k k n_k(π, σ). The number of partitions in a segment is #[π, σ] = ν_Part(n(π, σ)).

Let P and P′ be partially ordered sets and fix functions ψ ∈ I(P), ψ′ ∈ I(P′). We define their product ψ × ψ′ on I(P × P′) by

(ψ × ψ′) : ((π, π′), (σ, σ′)) → ψ(π, σ) ψ′(π′, σ′),

in which case (ψ × ψ′) ∗ (φ × φ′) = (ψ ∗ φ) × (ψ′ ∗ φ′).

Lemma 1.8.1 The convolution of two multiplicative kernels is again a multiplicative kernel.

The proof follows directly from our remarks about segments of partitions and products of functions on the incidence algebra. Now let us observe that if we are given a sequence h = (h_n)_{n=1}^∞, then we may define an associated

• Exponential generating function, e_h(x) = Σ_{k=1}^∞ (1/k!) h_k x^k
• Multiplicative function, ψ_h(π) = Π_{k=1}^∞ h_k^{n_k(π)}
• Multiplicative kernel, Λ_h(π, σ) = Π_{k=1}^∞ h_k^{n_k(π,σ)}

Theorem 1.8.2 The map Λ_h → e_h is an anti-isomorphism from the set of multiplicative kernels with convolution as product to the set of scalar functions having zero constant term with composition as product. In particular, e_h ∘ e_g = e_{h©g} and Λ_g ∗ Λ_h = Λ_{h©g}, where © is the product of sequences introduced in (1.21).

Proof Let us remark first that the formula e_h ∘ e_g = e_{h©g} is just a restatement of Theorem 1.6.2. To prove the second part, we may compute the nth term of Λ_g ∗ Λ_h using equation (1.23) to be

(Λ_g ∗ Λ_h)(0_n, 1_n) ≡ Σ_{π∈Part(n)} Λ_g(0_n, π) Λ_h(π, 1_n).

However, we have

Λ_g(0_n, π) = Π_{k=1}^∞ g_k^{n_k(π)},    Λ_h(π, 1_n) = h_{N(π)}.


(The first of these relations follows from the fact that π is made up of n_k(π) blocks of size k consisting of elements of 0_n. The second follows from the fact that 1_n is made up of a single block consisting of the N(π) = Σ_k n_k(π) parts of π.) We therefore get

(Λ_g ∗ Λ_h)(0_n, 1_n) ≡ Σ_{π∈Part(n)} h_{N(π)} Π_{k=1}^∞ g_k^{n_k(π)}
 = Σ_{n : E(n)=n} ν_Part(n) h_{N(n)} Π_{k=1}^∞ g_k^{n_k}
 = Σ_m h_m Bell(n, m; g_1, g_2, g_3, . . .),

where the middle sum runs over occupation sequences n = (n_1, n_2, . . .),

which by Theorem 1.6.2 equals the nth term of h©g.

Corollary 1.8.3 Suppose e_g has a compositional inverse, say e_h = e_g^{−1}; then Λ_g has the convolutional inverse Λ_h, that is, Λ_g ∗ Λ_h = δ.

Theorem 1.8.4 (Möbius inversion formula for partitions) The Möbius kernel is given by

μ(π, σ) = Π_{k≥1} ((−1)^{k−1} (k − 1)!)^{n_k(π,σ)} if π ≤ σ, and 0 otherwise,

where n_k(π, σ) is the number of blocks of σ that occur as the union of exactly k blocks of π.

Proof The zeta kernel is given by ζ = Λ_g with g = (1, 1, 1, . . .), so e_g(x) = e^x − 1. Its inverse is therefore μ = Λ_h, where e_h(e^x − 1) = x, that is, e_h(x) = ln(1 + x) and so h_n = (−1)^{n−1} (n − 1)!. This gives the result.

The zeta transform of ψ : Part(X) → C is Zψ : Part(X) → C, where

Zψ(σ) ≜ Σ_{π∈Part(X)} ζ(σ, π) ψ(π) = Σ_{π≥σ} ψ(π).

From the Möbius inversion formula for partitions, its inverse is then given by

Z^{−1}ψ(σ) ≜ Σ_{π∈Part(X)} μ(σ, π) ψ(π) = Σ_{π≥σ} μ(σ, π) ψ(π).
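For small n, Theorem 1.8.4 can be checked directly by building the partition lattice and inverting the zeta kernel numerically. The brute-force sketch below (all function names are ours, not the text's) recovers μ(0_n, 1_n) = (−1)^{n−1}(n − 1)! for n = 4:

```python
from math import factorial

def set_partitions(elements):
    """Enumerate all partitions of a list, as tuples of frozensets."""
    if not elements:
        yield ()
        return
    first, rest = elements[0], elements[1:]
    for part in set_partitions(rest):
        # put `first` into an existing block, or into a new singleton block
        for i, block in enumerate(part):
            yield part[:i] + (block | {first},) + part[i + 1:]
        yield part + (frozenset({first}),)

def finer(p, q):
    """p <= q in refinement order: every block of p lies inside a block of q."""
    return all(any(b <= c for c in q) for b in p)

n = 4
lattice = [frozenset(p) for p in set_partitions(list(range(n)))]

def mu(p, q, _cache={}):
    """Mobius kernel via mu(p, p) = 1 and mu(p, q) = -sum_{p <= r < q} mu(p, r)."""
    if (p, q) in _cache:
        return _cache[(p, q)]
    if p == q:
        val = 1
    else:
        val = -sum(mu(p, r) for r in lattice
                   if finer(p, r) and finer(r, q) and r != q)
    _cache[(p, q)] = val
    return val

bottom = frozenset(frozenset({i}) for i in range(n))  # 0_n: all singletons
top = frozenset({frozenset(range(n))})                # 1_n: a single block
assert mu(bottom, top) == (-1) ** (n - 1) * factorial(n - 1)  # -6 for n = 4
```

The defining relation ζ ∗ μ = δ also implies Σ_{π} μ(0_n, π) = 0 for n > 1, which the same code confirms.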


1.9 Partition Functions

Let λ = (λ1, λ2, λ3, . . .) be a fixed sequence of complex numbers, and take the multiplicative function ψλ. We consider the "canonical partition functions"

Z_n(λ) = Zψλ(0_n) = Σ_{π∈Part(n)} ψλ(π),
Z̃_n(λ) = Z^{−1}ψλ(0_n) = Σ_{π∈Part(n)} μ(π) ψλ(π),

along with the "grand canonical partition functions"

Ξ(z, λ) = Σ_n (z^n/n!) Z_n(λ),    Ξ̃(z, λ) = Σ_n (z^n/n!) Z̃_n(λ).

Proposition 1.9.1 The grand partition functions take the following form:

Ξ(z, λ) = exp{ Σ_{k≥1} (1/k!) λ_k z^k },    (1.24)

Ξ̃(z, λ) = exp{ Σ_{k≥1} ((−1)^{k−1}/k) λ_k z^k }.    (1.25)

 n  nk (π ) Proof  (λ; z) can be written as n zn! π ∈Part(n) ∞ which now k=1 λk involves an unrestricted sum over partitions of all sizes. We can alternatively describe this as a sum over all occupation sequences:  (λ; z) =

∞   νPart (n)  n

E (n) !

λ k zk

= =

∞  n k=1 ∞ ∞  k=1 n=0 ∞ 

exp

k=1

.

k=1

Using the expression (1.20) for νPart (n), and the have  (λ; z) =

nk





 nk 1 λ k zk nk (k! ) nk !  n 1 k λ z k (k! )n n! λk k z. k!



Lemma, we


Likewise, we pick up the additional factors μ_k = (−1)^{k−1} (k − 1)! when computing the Möbius-transformed version:

Ξ̃(z, λ) = Σ_n (ν_Part(n)/E(n)!) Π_{k=1}^∞ ((−1)^{k−1} (k − 1)! λ_k z^k)^{n_k}
 = Π_{k≥1} exp( ((−1)^{k−1}/k) λ_k z^k ).

The notion of a partition function is borrowed from statistical mechanics, where it relates to the partitioning of energy among various energy states, as opposed to the term partition from combinatorics. The calculation in the preceding proof is similar to the calculation of the grand canonical partition function for the free Bose gas at finite volume, where we have n_k particles in the kth energy level, with N = Σ_k n_k particles in total and an energy E = Σ_k ε_k n_k. In our case, we have ε_k = k.
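Proposition 1.9.1 can be spot-checked by comparing a brute-force sum of ψλ over set partitions with the coefficients of exp(Σ_k λ_k z^k/k!). The script below is our own verification sketch, with an arbitrary rational choice of λ and exact series arithmetic:

```python
from fractions import Fraction
from math import factorial

def set_partitions(elements):
    """Enumerate set partitions as tuples of tuples."""
    if not elements:
        yield ()
        return
    first, rest = elements[0], elements[1:]
    for part in set_partitions(rest):
        for i, block in enumerate(part):
            yield part[:i] + (block + (first,),) + part[i + 1:]
        yield part + ((first,),)

# an arbitrary test sequence: lam[k] plays the role of lambda_k
lam = [Fraction(0), Fraction(2), Fraction(-1), Fraction(3),
       Fraction(1), Fraction(1), Fraction(1)]

def Z(n):
    """Canonical partition function: sum over Part(n) of prod lambda_{|block|}."""
    total = Fraction(0)
    for part in set_partitions(list(range(n))):
        prod = Fraction(1)
        for block in part:
            prod *= lam[len(block)]
        total += prod
    return total

# grand canonical side: n! * [z^n] exp( sum_k lambda_k z^k / k! )
N = 6
g = [Fraction(0)] * (N + 1)
for k in range(1, N + 1):
    g[k] = lam[k] / factorial(k)
exp_g = [Fraction(0)] * (N + 1)
exp_g[0] = Fraction(1)
term = [Fraction(1)] + [Fraction(0)] * N      # running power g^m / m!
for m in range(1, N + 1):
    new = [Fraction(0)] * (N + 1)
    for i in range(N + 1):
        for j in range(N + 1 - i):
            new[i + j] += term[i] * g[j] / m
    term = new
    for i in range(N + 1):
        exp_g[i] += term[i]

for n in range(1, N + 1):
    assert Z(n) == factorial(n) * exp_g[n]    # equation (1.24), term by term
```

The loop at the end confirms (1.24) for n up to 6 with exact rational arithmetic.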

1.9.1 The Hermite Polynomials

As a special case of the partition function, take λ = (λ1, λ2, 0, 0, . . .), that is, we truncate beyond pairs:

ψλ(π) = λ1^{n_1(π)} λ2^{n_2(π)} if n_k(π) = 0 for all k ≥ 3, and 0 otherwise.

We write the partition function as

G(λ1, λ2, z) = Ξ̃(z; λ1, λ2, 0, 0, . . .) = e^{λ1 z − (1/2) λ2 z²},

and expand to get

G(λ1, λ2, z) ≡ Σ_{n=0}^∞ (z^n/n!) H_n(λ1, λ2).

The function H_n(z, 1) is readily seen to be a polynomial of degree n in z, called the nth Hermite polynomial.

Lemma 1.9.2

H_n(λ1, λ2) = (−λ2)^n e^{λ1²/(2λ2)} (∂^n/∂λ1^n) e^{−λ1²/(2λ2)}.

Proof From Taylor's theorem, one finds that the partition function is

G(λ1, λ2, z) = e^{λ1²/(2λ2)} e^{−(λ1 − λ2 z)²/(2λ2)}
 = e^{λ1²/(2λ2)} Σ_{n=0}^∞ (1/n!) (−λ2 z)^n (∂^n/∂λ1^n) e^{−λ1²/(2λ2)},

and the result follows by comparing coefficients of z^n.

Lemma 1.9.3 For each n, the function H_n(λ1, λ2) is analytic in λ1 and λ2, with

H_n(λ1, λ2) = Σ_{k=0}^{[n/2]} ((−1)^k n!/(2^k k! (n − 2k)!)) λ1^{n−2k} λ2^k,

and conversely

λ1^n = Σ_{k=0}^{[n/2]} (n!/(2^k k! (n − 2k)!)) λ2^k H_{n−2k}(λ1, λ2).

Proof Observe that

G(λ1, λ2, z) = e^{λ1 z} e^{−(1/2) λ2 z²} = (Σ_{m=0}^∞ (1/m!) λ1^m z^m)(Σ_{k=0}^∞ (1/k!) (−λ2/2)^k z^{2k}),

and extracting the coefficients of z^n gives the result. The converse comes from likewise expanding e^{λ1 z} = G(λ1, λ2, z) e^{+(1/2) λ2 z²} and comparing coefficients of z^n.

Of course, the first of these relations just says that

H_n(λ1, λ2) = Σ_{n_1+2n_2=n; n_1,n_2≥0} (−1)^{n_2} (n!/(n_1! n_2! 2^{n_2})) λ1^{n_1} λ2^{n_2},

which we could have deduced directly.

Corollary 1.9.4 H_n(λ1, λ2) is a polynomial of degree n in λ1 with leading coefficient unity. Moreover,

H_n(−λ1, λ2) = (−1)^n H_n(λ1, λ2).

Note the rescaling H_n(λ1, λ2) = λ2^{n/2} H_n(λ1/√λ2, 1). The first few instances are

H_0(λ1, λ2) = 1;        H_3(λ1, λ2) = λ1³ − 3λ1λ2;
H_1(λ1, λ2) = λ1;       H_4(λ1, λ2) = λ1⁴ − 6λ1²λ2 + 3λ2²;
H_2(λ1, λ2) = λ1² − λ2; H_5(λ1, λ2) = λ1⁵ − 10λ1³λ2 + 15λ1λ2².


Lemma 1.9.5 The differential recurrence relations for the Hermite polynomials are:

(∂/∂λ1) H_0(λ1, λ2) = 0,
(∂/∂λ1) H_n(λ1, λ2) = n H_{n−1}(λ1, λ2).

Proof Note that, on one hand,

(∂/∂λ1) G(λ1, λ2, z) = z G(λ1, λ2, z) = Σ_{n=0}^∞ (z^{n+1}/n!) H_n(λ1, λ2),

while on the other

(∂/∂λ1) G(λ1, λ2, z) = Σ_{n=0}^∞ (z^n/n!) (∂/∂λ1) H_n(λ1, λ2),

and so comparing powers of z gives the result.

Lemma 1.9.6 The algebraic recurrence relations are:

H_1(λ1, λ2) = λ1 H_0(λ1, λ2),
H_n(λ1, λ2) = λ1 H_{n−1}(λ1, λ2) − (n − 1) λ2 H_{n−2}(λ1, λ2),  (n > 1).

Proof Note first of all that (∂/∂z) G(λ1, λ2, z) = (λ1 − λ2 z) G(λ1, λ2, z); then use the Hermite power series expansion and compare coefficients of z^{n−1} to obtain the result, consistent with the instances listed above.
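The algebraic recurrence gives a quick way to tabulate H_n(λ1, λ2). The snippet below is our own sketch: it builds the polynomials as dictionaries mapping monomials (i, j) ↦ coefficient of λ1^i λ2^j, and reproduces the table of instances above:

```python
def hermite(n):
    """H_n(l1, l2) as {(i, j): coeff} for l1^i l2^j,
    via H_m = l1*H_{m-1} - (m-1)*l2*H_{m-2}."""
    H = [{(0, 0): 1}, {(1, 0): 1}]          # H_0 = 1, H_1 = l1
    for m in range(2, n + 1):
        new = {}
        for (i, j), c in H[m - 1].items():  # l1 * H_{m-1}
            new[(i + 1, j)] = new.get((i + 1, j), 0) + c
        for (i, j), c in H[m - 2].items():  # -(m-1) * l2 * H_{m-2}
            new[(i, j + 1)] = new.get((i, j + 1), 0) - (m - 1) * c
        H.append({k: v for k, v in new.items() if v})
    return H[n]

assert hermite(2) == {(2, 0): 1, (0, 1): -1}               # l1^2 - l2
assert hermite(4) == {(4, 0): 1, (2, 1): -6, (0, 2): 3}    # l1^4 - 6 l1^2 l2 + 3 l2^2
assert hermite(5) == {(5, 0): 1, (3, 1): -10, (1, 2): 15}  # l1^5 - 10 l1^3 l2 + 15 l1 l2^2
```

Note that the recurrence with coefficient (n − 1) — rather than n — is the one that matches these instances.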

By combining the algebraic and differential recurrence relations, one can readily deduce that

(−λ2 ∂²/∂λ1² + λ1 ∂/∂λ1) H_n = n H_n.  (1.26)

This gives a second characterization of the Hermite polynomials: y = H_n(z) is uniquely defined as the polynomial solution with leading coefficient unity to the ordinary differential equation (ODE)

y″ − z y′ + n y = 0  (1.27)

whenever n is a nonnegative integer.

Let γ be standard Gaussian measure, γ(x) dx = (2π)^{−1/2} e^{−x²/2} dx. Then the Hermite polynomials H_n(x) = H_n(x, 1) are orthogonal as elements of L²(R, γ), and in particular

⟨H_n, H_m⟩_{L²(R,γ)} = n! δ_{nm};  (1.28)

they are moreover a complete basis for L²(R, γ).


The Hermite functions h_n(x) are defined to be

h_n(x) = (1/(√(n!) (2π)^{1/4})) e^{−x²/4} H_n(x).  (1.29)

They consequently satisfy the ODEs

(−d²/dx² + (1/4) x²) h_n = (n + 1/2) h_n.  (1.30)

The mapping L²(R, γ) → L²(R) given by f(x) → √(ρ(x)) f(x), where ρ(x) = (2π)^{−1/2} exp(−x²/2) is the standard Gaussian density, is an isometry, viz.

⟨f | g⟩_{L²(R,γ)} = ∫ f̄(x) g(x) ρ(x) dx = ⟨√ρ f | √ρ g⟩_{L²(R)},  (1.31)

and extends to a unitary map. The set of functions {h_n : n = 0, 1, . . .} forms a complete orthonormal basis for L²(R).

1.9.2 The Charlier Polynomials

Another situation of interest is when we take λ1 = (x − λ)/√λ and λ_k = x/(√λ)^k for all k ≥ 2, so that

ψλ(π) = λ^{−E(π)/2} (x − λ)^{n_1(π)} x^{N(π)−n_1(π)},

and we obtain

Ξ̃(z, λ) = exp{ −√λ z + x Σ_{k≥1} (−1)^{k−1} (1/k) (z/√λ)^k }
 = e^{−√λ z} (1 + z/√λ)^x ≡ Σ_{n=0}^∞ (z^n/n!) C_n(x, √λ).

The functions C_n(x, √λ) are polynomials in x of degree n called the Charlier polynomials. Explicitly, multiplying the two series e^{−√λ z} and (1 + z/√λ)^x and extracting the coefficient of z^n,

C_n(x, √λ) = Σ_{k=0}^n (n!/(k!(n − k)!)) (−√λ)^{n−k} λ^{−k/2} x(x − 1) · · · (x − k + 1).

The Charlier polynomials have the property that they are orthogonal with respect to the Poisson measure of intensity λ, that is,

Σ_{x≥0} C_n(x, √λ) C_m(x, √λ) p(x, λ) = n! δ_{n,m},

with p(x, λ) = e^{−λ} λ^x/x!. In particular, they form a complete basis for L²(N, P_λ), the space of square-summable functions on N = {0, 1, 2, . . .} with the Poisson measure P_λ.
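At λ = 1 (so √λ = 1) the falling-factorial expansion of C_n can be checked against the generating function e^{−z}(1 + z)^x with exact arithmetic. The comparison below is our own sketch; the function names are not from the text:

```python
from fractions import Fraction
from math import comb, factorial

def charlier(n, x):
    """C_n(x, 1) from the explicit sum: C(n,k) (-1)^(n-k) * falling factorial."""
    total = 0
    for k in range(n + 1):
        falling = 1
        for i in range(k):
            falling *= (x - i)
        total += comb(n, k) * (-1) ** (n - k) * falling
    return total

def charlier_gf(n, x):
    """Independent check: n! * [z^n] e^{-z} (1+z)^x, for integer x >= 0."""
    coeff = Fraction(0)
    for j in range(n + 1):            # z^j from e^{-z}, z^{n-j} from (1+z)^x
        k = n - j
        if k <= x:
            coeff += Fraction((-1) ** j, factorial(j)) * comb(x, k)
    return coeff * factorial(n)

for n in range(6):
    for x in range(7):
        assert charlier(n, x) == charlier_gf(n, x)
```

Both routes give the same integers at integer arguments, confirming that the explicit sum really is the coefficient sequence of the generating function.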

2 Probabilistic Moments and Cumulants

In this chapter we will look at moments and cumulants of random variables and, more generally, stochastic processes. It turns out that the two most important families, the Gaussian and the Poissonian random variables, have moments that involve the combinatorial enumerations encountered in the introductory chapter. Moreover, the relationship between moments and cumulants is also deeply combinatorial, and we study this in the first half of the chapter. In the second part, we extend this to stochastic processes and, in particular, moments of stochastic integrals.

We will assume the standard Kolmogorov setting of a probability space (Ω, F, P) consisting of a measurable space (Ω, F), with Ω being the sample space and F the σ-algebra of events, and P being a probability measure on the events.

2.1 Random Variables

2.1.1 Moments

A random variable X on a probability space (Ω, F, P) is a real-valued measurable function, and we define expectations according to

E[f(X)] = ∫_Ω f(X(ω)) P[dω] = ∫_R f(x) K_X[dx].

Here K_X is the probability distribution of X, defined as the image measure K_X = P ∘ X^{−1}, where X^{−1}[A] = {ω ∈ Ω : X(ω) ∈ A} for any Borel subset A. That is, the event "X takes a value in A" has probability

Prob[X ∈ A] = K_X[A] = P[X^{−1}[A]].


The expectation of the nth power of X, when it exists, is called its nth moment:

μ_n = E[X^n] = ∫_R x^n K_X[dx].  (2.1)

The (exponential) moment generating function is defined to be

M_X(t) = E[e^{tX}] = Σ_{n≥0} (1/n!) μ_n t^n,  (2.2)

whenever convergent. Note that M_X is the Laplace transform of K_X.

2.1.2 Cumulants

Cumulants κ_n are defined through the relation

Σ_{n=1}^∞ (1/n!) κ_n t^n = ln M_X(t),  (2.3)

or Σ_{n=0}^∞ (1/n!) μ_n t^n = exp{ Σ_{n=1}^∞ (1/n!) κ_n t^n }. One sees from Theorem 1.6.2 (by taking h = exp so that h_m = 1) that the relation between the moments and cumulants is

μ_n = Σ_m Bell(n, m; κ1, κ2, . . .).  (2.4)

The first few terms are

μ1 = κ1,
μ2 = κ2 + κ1²,
μ3 = κ3 + 3κ1κ2 + κ1³,
μ4 = κ4 + 4κ1κ3 + 3κ2² + 6κ1²κ2 + κ1⁴, etc.

Conversely, from Σ_{n=1}^∞ (1/n!) κ_n t^n = ln(1 + Σ_{n=1}^∞ (1/n!) μ_n t^n), we may again use Theorem 1.6.2 (taking h(x) = ln(1 + x) so that h_m = (−1)^{m−1} (m − 1)!) to get

κ_n = Σ_m (−1)^{m−1} (m − 1)! Bell(n, m; μ1, μ2, . . .),  (2.5)


so that the inverse relationship is

κ1 = μ1,
κ2 = μ2 − μ1²,
κ3 = μ3 − 3μ2μ1 + 2μ1³,
κ4 = μ4 − 4μ3μ1 − 3μ2² + 12μ2μ1² − 6μ1⁴, etc.

Note that μ1 = κ1 is the mean, and κ2 is the variance of the random variable!
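These triangular relations are conveniently packaged by the standard recurrence μ_n = Σ_{k=1}^n C(n−1, k−1) κ_k μ_{n−k}, obtained by differentiating M_X = exp(ln M_X). A minimal sketch of this recurrence (our own code, not the book's):

```python
from math import comb

def moments_from_cumulants(kappa, N):
    """kappa[n] = n-th cumulant (index 0 unused); returns [mu_0, ..., mu_N]."""
    mu = [1] + [0] * N
    for n in range(1, N + 1):
        mu[n] = sum(comb(n - 1, k - 1) * kappa[k] * mu[n - k]
                    for k in range(1, n + 1))
    return mu

# standard Gaussian: kappa_2 = 1, all others 0  ->  mu_{2k} = (2k-1)!!
assert moments_from_cumulants([0, 0, 1, 0, 0, 0, 0], 6) == [1, 0, 1, 0, 3, 0, 15]

# Poisson of intensity 2: every cumulant equals 2
mu = moments_from_cumulants([0] + [2] * 4, 4)
assert mu == [1, 2, 6, 22, 94]   # mu_4 = k4 + 4 k1 k3 + 3 k2^2 + 6 k1^2 k2 + k1^4 = 94
```

The Poisson case reproduces exactly the μ4 expansion displayed above, with κ_n = 2 throughout.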

2.2 Key Probability Distributions

We now look at three of the most important examples.

2.2.1 Standard Gaussian Distribution

A continuous real-valued random variable has a standard Gaussian distribution if it has a probability density ρ_X(x) = γ(x) given by

γ(x) = (2π)^{−1/2} e^{−x²/2},  (2.6)

leading to the moment generating function

M_X(t) = (1/√(2π)) ∫_{−∞}^∞ e^{tx} e^{−x²/2} dx = e^{t²/2}.

We see that all cumulants vanish except κ2 = 1. Expanding the moment generating function yields

μ_n = (2k)!/(2^k k!) if n = 2k, and μ_n = 0 if n = 2k + 1.

The nth moment has the combinatorial interpretation as the number of pair partitions of n items.

2.2.2 Poisson Distribution

We take X to be discrete with

Prob[X = n] = (1/n!) λ^n e^{−λ}


for n = 0, 1, 2, . . ., where the parameter λ must be positive. This gives the Poisson distribution of intensity λ. The moment generating function is readily computed and we obtain

M_X(t) = exp{λ(e^t − 1)}.

Taking the logarithm shows that all cumulants of the Poisson distribution are equal to λ: κ_n = λ for all n = 1, 2, . . . . From (2.11), we see that the moments are polynomials of degree n in λ with the Stirling numbers of the second kind S(n, m) as coefficients,

μ_n = Σ_m S(n, m) λ^m,  (2.7)

and in particular λ is the mean (as well as the variance!). Note that we obtain the generating function for Stirling numbers of the second kind,

Σ_{n,m} (1/n!) S(n, m) λ^m t^n = e^{λ(e^t − 1)},  (2.8)

and setting λ = 1 gives Σ_n (1/n!) B_n t^n = e^{e^t − 1}, the generating function for the Bell numbers. We may also expand the moment generating function M_X(t) to obtain the explicit, though not particularly useful, identity

S(n, m) = (1/m!) Σ_{k=0}^m (−1)^{m−k} (m!/(k!(m − k)!)) k^n.
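The explicit alternating sum can be checked against the familiar triangular recurrence S(n, m) = S(n−1, m−1) + m S(n−1, m); the sketch below (our own helper names) also confirms that at λ = 1 the Poisson moments are the Bell numbers:

```python
from math import comb, factorial
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, m):
    """Stirling number of the second kind via the usual recurrence."""
    if n == m:
        return 1
    if m == 0 or m > n:
        return 0
    return stirling2(n - 1, m - 1) + m * stirling2(n - 1, m)

def stirling2_explicit(n, m):
    """The alternating-sum formula from the text."""
    return sum((-1) ** (m - k) * comb(m, k) * k ** n
               for k in range(m + 1)) // factorial(m)

for n in range(8):
    for m in range(n + 1):
        assert stirling2(n, m) == stirling2_explicit(n, m)

# Poisson moments mu_n = sum_m S(n, m) lambda^m; at lambda = 1 these are Bell numbers
bell = [sum(stirling2(n, m) for m in range(n + 1)) for n in range(7)]
assert bell == [1, 1, 2, 5, 15, 52, 203]
```

This ties (2.7) and (2.8) together: the row sums of the Stirling triangle are exactly the Bell numbers B_n.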

2.2.3 Gamma Distribution

Let X be a positive continuous variable with density

ρ_X(x) = Γ(λ)^{−1} x^{λ−1} e^{−x} for x > 0, and ρ_X(x) = 0 for x ≤ 0,

where λ > 0, and the Gamma function is Γ(λ) = ∫_0^∞ y^{λ−1} e^{−y} dy, with Γ(n + 1) = n! for integer n ≥ 0. We refer to this as a Gamma distribution, specifically the Γ(λ) distribution. Its moment generating function is

M_X(t) = (1 − t)^{−λ}.

Now (1 − t)^{−λ} = Σ_{n=0}^∞ (−λ choose n) (−t)^n ≡ Σ_{n=0}^∞ (1/n!) λ^{(n)} t^n, where λ^{(n)} = λ(λ + 1) · · · (λ + n − 1) is the rising factorial, and so its moments are

μ_n = λ^{(n)} = Σ_m c(n, m) λ^m,  (2.9)

where the c(n, m) are the unsigned Stirling numbers of the first kind.

2.2.4 Factorial Moments

We refer to E[X^{(n)}], where X^{(n)} = X(X − 1) · · · (X − n + 1), as the nth falling factorial moment. The moment generator can be written as M_X(t) ≡ E[(1 + z)^X], where we set z = e^t − 1. Taylor expanding about z = 0 leads to (1 + z)^x = Σ_{n=0}^∞ (x choose n) z^n = Σ_{n=0}^∞ (1/n!) x^{(n)} z^n, and so

E[(1 + z)^X] = Σ_{n=0}^∞ (1/n!) E[X^{(n)}] z^n.  (2.10)

Clearly E[(1 + z)^X] acts as the falling factorial moment generating function. From relation (1.11) for the Stirling numbers, we see that the ordinary and falling factorial moments are related by

E[X^n] ≡ Σ_m S(n, m) E[X^{(m)}],
E[X^{(n)}] ≡ Σ_m (−1)^{n+m} c(n, m) E[X^m].  (2.11)

For example, in the case of the Poisson distribution with intensity λ we have E[(1 + z)^X] = e^{λz}, and so we see that the falling factorial moments are just E[X^{(n)}] = λ^n. In general, we have

Σ_n (1/n!) E[X^{(n)}] z^n = E[(1 + z)^X] = M_X(ln(1 + z)),
Σ_n (1/n!) E[X^{[n]}] z^n = E[(1 − z)^{−X}] = M_X(−ln(1 − z)),

where X^{[n]} = X(X + 1) · · · (X + n − 1) is the rising factorial, and from the Faà di Bruno formula

E[X^{(n)}] = Σ_m Bell(n, m; 0!, −1!, 2!, −3!, . . .) E[X^m],
E[X^{[n]}] = Σ_m Bell(n, m; 0!, 1!, 2!, 3!, . . .) E[X^m].


2.3 Stochastic Processes

We now broaden our outlook by going from random variables to stochastic processes (families of random variables parameterized by a time variable). We start by recalling the basic definitions and properties of stochastic integrals. There are a number of equivalent formulations, and we begin by recalling some basic concepts in this direction due to P. A. Meyer, especially from his appendix to Emery (1989). Our aim is to generalize some of the earlier generating function results to encompass the setting of stochastic integration. For definiteness, we shall work with semimartingale processes.

We fix a probability space (Ω, F, P); then a real-valued stochastic process X = X(t, ω) is a jointly measurable function from R_+ × Ω to R. We shall frequently write the process using the notation X or (X_t). A process (X_t) is said to have independent increments if, for each pair of disjoint intervals [s_1, t_1] and [s_2, t_2], the increments X_{t_1} − X_{s_1} and X_{t_2} − X_{s_2} are independent random variables. It is furthermore said to have stationary independent increments if the distribution of X_{t+h} − X_t depends only on h > 0.

The two core examples are the standard Wiener process, (W_t), which has stationary independent increments W_{t+h} − W_t with a mean zero Gaussian distribution of variance h, and the Poisson process of rate ν, (N_t), which has stationary independent increments N_{t+h} − N_t with a Poisson distribution of intensity νh. A celebrated theorem of Wiener gives an explicit construction of a probability space (Ω_W, F_W, P_W), called the canonical Wiener space, where Ω_W = C_0[0, ∞) is the space of all continuous paths x = x(t) with x(0) = 0.

Our aim is to define stochastic integrals ∫_S^T Y_t dX_t for time intervals [S, T] and suitable integrands. To this end, let M be a subdivision (mesh) of the time interval, say t_0 = S < t_1 < t_2 < · · · < t_N = T, and let |M| = max_k (t_{k+1} − t_k). We then define the finite sum

I_M(Y, X) = Σ_k Y_{t_k} (X_{t_{k+1}} − X_{t_k}).

The integrand Y is said to be simple if there exists a subdivision M such that Y_t(ω) = y_k(ω) for t_k < t ≤ t_{k+1} for each k. In this case, I_M(Y, X) defines the stochastic integral ∫_S^T Y_t dX_t. We denote the sigma-algebra on R_+ × Ω generated by the simple processes as F(S). A process X is then said to be an integrator if there is a map I(·, X) extending the finite sum from the simple functions to all uniformly bounded F(S)-measurable functions, and satisfying the dominated convergence theorem in probability.

A filtration is a family, (F_t)_{t≥0}, of sigma-algebras that are nested (that is, F_s ⊂ F_t whenever s < t), more specifically future-continuous (that is,


F_s = ∩_{t>s} F_t for all s ≥ 0). We say that a process (X_t) has future (= right) and past (= left) limits at t if the limits

X_{t±}(ω) = lim_{τ→0±} X_{t+τ}(ω)

exist for each ω. In such cases, we often write (X_{t±}) or even just X_± for the corresponding processes. We also set

X_{t∗} ≜ (1/2)[X_{t+} + X_{t−}],    ΔX_t ≜ X_{t+} − X_{t−},

and these are the instantaneously averaged and the jump discontinuity processes, respectively. These are sometimes shortened to X_∗ and ΔX when no confusion arises. Semimartingales then have the property of possessing both limits and being continuous from the future, that is, X_+ ≡ X. These are known as cadlag processes, from the French designation continu à droite, limites à gauche. The version X_− will then be caglad. Moreover, for a semimartingale X the finite sums I_M(Y, X) converge in probability for any uniformly bounded past-continuous adapted Y as the mesh size |M| → 0. We then denote the limit as ∫_S^T Y_t dX_t.

Given a pair of semimartingales X and Y, it turns out that the following limit in probability exists:

∫_S^T dX_t dY_t ≜ lim_{|M|→0} Σ_k (X_{t_{k+1}} − X_{t_k})(Y_{t_{k+1}} − Y_{t_k}),

and we define their quadratic covariation as¹

[[X, Y]]_t ≜ ∫_0^t dX_s · dY_s.

In particular, [[X, X]]_t is called the quadratic variation of X. For instance, the Wiener process satisfies [[W, W]]_t = t, and in fact a theorem of Lévy characterizes the Wiener process as the only continuous local martingale with W_0 = 0 having this property. The Wiener process is a continuous semimartingale, so

¹ The standard probabilist notation is [X, Y]_t, but we avoid this due to potential confusion with commutators.


we have W = W_− = W_+. For the Poisson process N, we have the other remarkable identity [[N, N]]_t = N_t. Here we have the jump process ΔN, which can equal 0 or 1. In any finite interval, the Poisson process will have only a finite number of jumps.

We may also define the processes

X_t^{[n]} ≜ lim_{|M|→0} Σ_k (X_{t_{k+1}} − X_{t_k})^n ≡ ∫_0^t (dX_s)^n,

where appropriate. In particular, X^{[2]} is the quadratic variation [[X, X]]. For n > 2 we have W^{[n]} = 0 and N^{[n]} = N.

A key result concerning semimartingales is that we have the following integration by parts (IP) formula:

X_t Y_t = ∫_0^t X_{s−} dY_s + ∫_0^t Y_{s−} dX_s + [[X, Y]]_t.

It is useful to write this in the differential notation

d(X_t Y_t) = X_{t−} dY_t + (dX_t) Y_{t−} + dX_t · dY_t.  (2.12)

The stochastic form of the Taylor series is

df(X_t) = f′(X_{t−}) dX_t + (1/2) f″(X_{t−}) (dX_t)² + · · · = Σ_{n=1}^∞ (1/n!) f^{(n)}(X_{t−}) dX_t^{[n]}.

The well-known Itô formula for the Wiener process is

df(W_t) = f′(W_t) dW_t + (1/2) f″(W_t) dt,

while for the Poisson process we have

df(N_t) = [f(N_{t−} + 1) − f(N_{t−})] dN_t.

From this point onward, we assume that all processes are semimartingales.

2.4 Multiple Stochastic Integrals

Let X_t^{(j)} be stochastic processes for j = 1, . . . , n, all taken to have X_0^{(j)} = 0 for convenience. We use the following self-explanatory conventions:

X_t^{(n)} · · · X_t^{(1)} = ∫_0^t dX_{t_n}^{(n)} · · · ∫_0^t dX_{t_1}^{(1)} ≡ ∫_{[0,t]^n} dX_{t_n}^{(n)} · · · dX_{t_1}^{(1)},

thereby representing the product as an integral over the hypercube [0, t]^n.


In the following, we denote by Δ_n^σ(t) the n-simplex in [0, t]^n determined by a permutation σ ∈ S_n: that is,

Δ_n^σ(t) = {(t_n, . . . , t_1) ∈ (0, t)^n : t_{σ(n)} > t_{σ(n−1)} > · · · > t_{σ(1)}}.

The simplex determined by the identity permutation will be written simply as Δ_n(t), that is, t > t_n > t_{n−1} > · · · > t_1 > 0. Clearly ∪_{σ∈S_n} Δ_n^σ(t) is [0, t]^n with the absence of the hypersurfaces (diagonals) of dimension less than n corresponding to some of the t_j being equal. Moreover, the Δ_n^σ(t) are disjoint for different σ. Removing these diagonals makes no difference for standard integrals, as we are ignoring a set of Lebesgue measure zero; however, this is not true for stochastic integrals!

We also define the diagonal-free integral ∫− to be the expression with all the diagonal terms subtracted out. Explicitly,

∫−_{[0,t]^n} dX_{t_n}^{(n)} · · · dX_{t_1}^{(1)} ≜ Σ_{σ∈S_n} ∫_{Δ_n^σ(t)} dX_{t_n}^{(n)} · · · dX_{t_1}^{(1)}.

Take s_1, s_2, . . . to be real variables and let π = {A_1, . . . , A_m} be a partition of {1, . . . , n}; then, for each i ∈ {1, . . . , n}, define s_π(i) to be the variable s_j where i lies in the part A_j.

Lemma 2.4.1 The multiple stochastic integral can be decomposed as

∫_{[0,t]^n} dX_{t_n}^{(n)} · · · dX_{t_1}^{(1)} = Σ_{π∈Part(n)} ∫−_{[0,t]^{N(π)}} dX_{s_π(n)}^{(n)} · · · dX_{s_π(1)}^{(1)}.

Proof The n = 1 case is immediate, as the ∫ and ∫− integrals coincide. In the situation n = 2, we have by the IP formula (2.12)

X_t^{(2)} X_t^{(1)} = ∫_0^t dX_{t_2}^{(2)} X_{t_2−}^{(1)} + ∫_0^t X_{t_1−}^{(2)} dX_{t_1}^{(1)} + ∫_0^t dX_s^{(2)} dX_s^{(1)}
 = ∫_{t>t_2>t_1>0} dX_{t_2}^{(2)} dX_{t_1}^{(1)} + ∫_{t>t_1>t_2>0} dX_{t_2}^{(2)} dX_{t_1}^{(1)} + ∫_0^t dX_s^{(2)} dX_s^{(1)}
 ≡ ∫−_{[0,t]^2} dX_{t_2}^{(2)} dX_{t_1}^{(1)} + ∫−_{[0,t]} dX_s^{(2)} dX_s^{(1)},

and this is the required relation. The higher-order terms are computed through repeated applications of the Itô formula. An inductive proof is arrived at along the following lines. Let X_t^{(n+1)} be another process; then the preceding rule yields

X_t^{(n+1)} Y_t = ∫−_{[0,t]^2} dX_{t_{n+1}}^{(n+1)} dY_{t_n} + ∫−_{[0,t]} dX_s^{(n+1)} dY_s

and we take Y_t = X_t^{(n)} · · · X_t^{(1)}. Assume the formula is true for n. The first term will be the sum over all partitions of {n + 1, n, . . . , 1} in which {n + 1} appears as a singleton, and the second term will be the sum over all partitions of {n + 1, n, . . . , 1} in which n + 1 appears as an extra element in some part of a partition of {n, . . . , 1}. In this way, we arrive at the appropriate sum over Part(n + 1).

Corollary 2.4.2 The inversion formula for off-diagonal integrals is

∫−_{[0,t]^n} dX_{t_n}^{(n)} · · · dX_{t_1}^{(1)} ≡ Σ_{π∈Part(n)} μ(π) ∫_{[0,t]^{N(π)}} dX_{s_π(n)}^{(n)} · · · dX_{s_π(1)}^{(1)},

where μ(π) = Π_{k≥1} ((−1)^{k−1} (k − 1)!)^{n_k(π)} is the Möbius function for partitions.

2.4.1 Expectation of Products of Wiener Integrals

Let us define the random variable W_t(f), for square-integrable test function f, by

W_t(f) ≜ ∫_0^t f(s) dW_s,

and set W(f) = W_∞(f). We have that

W_t(f) W_t(g) = ∫−_{[0,t]^2} f(s_1) g(s_2) dW_{s_1} dW_{s_2} + ∫−_{[0,t]} f(s) g(s) (dW_s)².

As the Wiener process has independent increments, we see that the diagonal-free integrals will vanish:

E[∫−_{[0,t]^2} f(s_1) g(s_2) dW_{s_1} dW_{s_2}] = ∫−_{[0,t]^2} f(s_1) g(s_2) E[dW_{s_1}] E[dW_{s_2}] = 0.

Therefore, we obtain (taking t → ∞) the following identity, known as the Itô isometry:

E[W(f) W(g)] = ∫_0^∞ f(s) g(s) ds.  (2.13)

More generally, we see that

E[W_t(f_n) · · · W_t(f_1)] = Σ_{π∈Part(n)} E[∫−_{[0,t]^{N(π)}} f_n(s_π(n)) · · · f_1(s_π(1)) dW_{s_π(n)} · · · dW_{s_π(1)}].

As (dW_t)² = dt with higher powers vanishing, we only encounter partitions into singletons and pairs. Furthermore, the integrals over a single dW_s will


average to zero, leaving a sum over pair partitions, which is known as the Isserlis formula²:

E[W(f_{2k}) · · · W(f_1)] = Σ_{Pairs(2k)} ∫ f_{p_k} f_{q_k} · · · ∫ f_{p_1} f_{q_1}.  (2.14)

The average over an odd number of terms will vanish. In particular, we see that

E[e^{W(f)}] = Σ_{k=0}^∞ (1/(2k)!) ((2k)!/(2^k k!)) (∫ f(t)² dt)^k = exp{ (1/2) ∫_0^∞ f(t)² dt }.

We remark that we may write the integrals in the form W(f) = ∫ f(t) Ẇ_t dt. Formally, then, we have the mnemonic

E[Ẇ_{t_{2k}} · · · Ẇ_{t_1}] = Σ_{Pairs(2k)} δ(t_{p_k} − t_{q_k}) · · · δ(t_{p_1} − t_{q_1}).

The formal derivative Ẇ is not a genuine stochastic process, but may be thought of as a singular process called white noise.
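The combinatorics behind (2.14) is easy to exercise directly: enumerating the pair partitions of {1, . . . , 2k} gives (2k − 1)!! terms, matching the Gaussian moment (2k)!/(2^k k!) quoted earlier. A brute-force sketch (ours, not the book's):

```python
from math import factorial

def pair_partitions(items):
    """Enumerate perfect matchings (pair partitions) of a list of even length."""
    if not items:
        yield []
        return
    first = items[0]
    for i in range(1, len(items)):
        pair = (first, items[i])
        rest = items[1:i] + items[i + 1:]
        for sub in pair_partitions(rest):
            yield [pair] + sub

for k in range(1, 6):
    count = sum(1 for _ in pair_partitions(list(range(2 * k))))
    assert count == factorial(2 * k) // (2 ** k * factorial(k))   # (2k-1)!!
```

So the number of summands in the Isserlis formula grows as 1, 3, 15, 105, 945, . . . — exactly the even Gaussian moments.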

2.4.2 Expectation of Products of Poisson Integrals

Let us similarly define Poisson integrals of the form

N_t(f) = ∫_0^t f(s) dN_s.

Again we see that

E[N_t(f_n) · · · N_t(f_1)] = Σ_{π∈Part(n)} E[∫−_{[0,t]^{N(π)}} f_n(s_π(n)) · · · f_1(s_π(1)) dN_{s_π(n)} · · · dN_{s_π(1)}].

This time each part of size m will have integrators (dN_s)^m = dN_s, which will be independent of the remaining integrators (as they correspond to nonoverlapping increments by the diagonal-free construction) and average to ν ds. The situation where the f_j are all equal is tractable; here each part of size k makes a contribution ν ∫ f(s)^k ds and we have

E[N_t(f)^n] = Σ_{π∈Part(n)} Π_{k=1}^∞ (ν ∫_0^t f(s)^k ds)^{n_k(π)}.

² The right-hand side of (2.14) is sometimes referred to as the Hafnian of the matrix F, here with components F_{ij} = ∫ f_i f_j dt. This terminology is due to E. Caianiello (1973), who realized that it is essentially the permanent of F: a determinant would yield a Pfaffian!


In particular, using Proposition 1.9.1, we find

E[e^{z N_t(f)}] = Ξ(z; λ) with λ_k = ν ∫_0^t f(s)^k ds
 = exp{ Σ_{k=1}^∞ (z^k/k!) ν ∫_0^t f(s)^k ds }
 = exp{ ν ∫_0^t (e^{z f(s)} − 1) ds }.

The nth cumulant of N_t(f) is therefore ν ∫_{[0,t]} f(s)^n ds. For the shot noise process I_t = ∫ h(s − t) dN_s, we deduce that its nth cumulant³ is ν ∫_0^∞ h(s)^n ds.

We remark that we can likewise introduce a singular process Ṅ_t as the formal derivative of the Poisson process. This time we have the mnemonic

E[Ṅ_{t_n} · · · Ṅ_{t_1}] = Σ_{Part(n)} δ^{n_a}(t_{a(1)}, . . . , t_{a(n_a)}) δ^{n_b}(t_{b(1)}, . . . , t_{b(n_b)}) · · · ,

2.5 Iterated It¯o Integrals Let {Xt : t ≥ 0} be a classical process. Its diagonal-free exponential is defined to be  1 ' − dXtn . . . dXt1 . (2.15) e/Xt  n! [0,t]n n≥0

By symmetry, we have e/Xt =

 1  ' dXtn . . . dXt1 n! σn (t) n≥0

=

'

σ ∈Sn

n≥0 n (t)

dXtn . . . dXt1

  3 The statement that the mean is ν ∞ h (s) ds and the variance is ν ∞ h (s)2 ds is the content of 0 0 Campbell’s theorem. However, this formula gives all cumulants.

2.5 Iterated It¯o Integrals

It is clear that e/Xt is the solution to the integro-differential equation ' t dXs e/Xs− , e/Xt = 1 +

49

(2.16)

0

and we obtain the series expansion through iteration. We may also use the notation of an It¯o time-ordered exponential: t

e/Xt ≡ T I e

0

dXs

.

(2.17)

Theorem 2.5.1 Let {Xt : t ≥ 0} be a stochastic process, and recall the defi nition of the related processes Xt[k] = [0,t] (dXs )k , for k = 1, 2, . . . . Then its diagonal-free exponential is given by e/Xt = eY(t) ,

(2.18)

where Yt =

 (−1)k+1 k

k≥1

Xt[k] .

(2.19)

Inversely, we have Xt =

 1 [k] Y . k! t

(2.20)

k≥1

Proof

We first observe that ' ∞  nk (π )  Xt[k] dXsπ (n) . . . dXsπ (1) = . [0,t]N(π )

k=1

Therefore, e/zXt =

 1 zn n! n≥0



(−1)N(π ) ψλ (π ),

π ∈Part(n)

where ψλ is the multiplicative function on the species of partitions, Part, with (stochastic) coefficients λk = Xt[k] . We then recognize the “grand canonical partition function” from Proposition 1.9.1, that is, ⎧ ⎫ ⎨ (−1)k+1 ⎬   ˜ z, Xt[1] , Xt[2] , . . . ≡ exp Xt[k] , zn e/zXt =  ⎩ ⎭ k k≥1

which yields (2.19). To get the inverse relation, we note that Rt = eY(t) satisfies the stochastic differential equation (SDE) equivalent to the equation (2.16): dRt = (edY(t) − 1) Rt− ,

R0 = 1.

(2.21)

50

Probabilistic Moments and Cumulants

As we have identified Rt = e/zXt , it follows that Rt also solves the SDE dRt = (dXt ) Rt− with Rt . Therefore, we must have dXt = (edY(t) − 1) =

 1  1 [k] (dYt )k ≡ dY , k! k! t k≥1

k≥0

from which we deduce (2.20).

2.5.1 Iterated Wiener Integrals Let us take to be the Wiener process W. Here we have dWs dWs = ds with higher powers vanishing, therefore X (i) s

Wt[1] = Wt , ' t W1[2] = (dW)2 = t, 0

Wt[k]

= 0, k ≥ 0.

We therefore see that   1 ˜ (Wt , t, 0, . . . ; u) = exp uWt − tu2 . e/uWt ≡  2

(2.22)

This implies

' −

[0,t]n

dWtn . . . dWt1 = tn/2 Hn

Wt √ t



where Hn are the Hermite polynomials. The implication that ' σn (t)

dWtn . . . dWt1 =

1 n/2 t Hn n!



Wt √ , t

(2.23)

for any permutation σ , is due originally to It¯o. t More generally, we could set Xt = 0 f (s) dWs in which case Xt[2] = t 2 [k] 0 f (s) ds with Xt = 0 for k ≥ 3, yielding t

e/

0

' fdW

≡ exp 0

for f locally square-integrable.

t

1 fdW − 2

'



t

f 0

2

,

(2.24)

2.6 Stratonovich Integrals

51

2.5.2 Iterated Poisson Integrals

Let $N_t$ be the Poisson process. Then we have the differential rule $(dN_t)^k = dN_t$ for all positive integer powers $k$. Therefore,

$$N_t^{[k]} = N_t, \qquad k = 1, 2, \dots.$$

We then obtain

$$e_{/}^{zN_t} = \tilde{\Xi}(N_t, N_t, \dots; z) = \exp\Bigl( \sum_{k \ge 1} (-1)^{k-1} \frac{z^k}{k}\, N_t \Bigr),$$

and using $\sum_{k \ge 1} (-1)^{k-1} z^k / k = \ln(1+z)$, we end up with

$$e_{/}^{zN_t} = (1+z)^{N_t}.$$  (2.25)

Using the binomial theorem, we deduce that

$$\int_{\sigma_n(t)} dN_{t_n} \cdots dN_{t_1} = \binom{N_t}{n},$$  (2.26)

and so $\int^{-}_{[0,t]^n} dN_{t_n} \cdots dN_{t_1} = n! \binom{N_t}{n}$.

Let us remark that we could repeat the preceding exercise with $N_t$ replaced by the compensated Poisson process $Y_t \triangleq N_t - t$. This has the modified rule $(dY_t)^k = dY_t + dt$ for all positive integer powers $k$ except $k = 1$. This complication is easily dealt with, and the diagonal-free exponentiated random variable associated with the compensated Poisson process $Y_t$ is found to be $e_{/}^{zY_t} = e^{-zt} (1+z)^{Y_t + t}$. We deduce that

$$\int^{-}_{[0,t]^n} dY_{t_n} \cdots dY_{t_1} = C_n(N_t, t),$$

where Cn (x, t) are the Charlier polynomials.
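The binomial formula (2.26) is a purely combinatorial statement about jump times: the ordered simplex integral simply counts strictly increasing $n$-tuples of jumps, and the diagonal-free cube integral counts ordered tuples of distinct jumps. A small illustrative check (the jump times below are hypothetical):

```python
import itertools, math

# jump times of a hypothetical Poisson path on [0, 1]; only their count matters
jumps = [0.11, 0.34, 0.35, 0.62, 0.80, 0.93]
N_t = len(jumps)

for n in range(0, 5):
    # ordered simplex integral: number of n-tuples t_1 < ... < t_n of jump times
    simplex = sum(1 for _ in itertools.combinations(jumps, n))
    assert simplex == math.comb(N_t, n)
    # diagonal-free integral over the cube: n! times as many (ordered, distinct) tuples
    cube = sum(1 for _ in itertools.permutations(jumps, n))
    assert cube == math.factorial(n) * math.comb(N_t, n)
```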

2.6 Stratonovich Integrals

An alternative to the Itô integral is to consider the limit of finite sum approximations of the form

$$\sum_k Y_{t_k^*} \bigl( X_{t_{k+1}} - X_{t_k} \bigr),$$

where we evaluate the integrand at the midpoint

$$t_k^* = \frac{1}{2}(t_{k+1} + t_k).$$

The limit is the Stratonovich, or symmetric, integral, and we denote it by $\int Y_t\, \delta X_t$. For a pair of martingales $X$ and $Y$, its relation to the Itô integral is

$$\int Y_-\, \delta X \equiv \int Y_-\, dX + \frac{1}{2} [[Y, X]]$$

or, in differential form,

$$Y_-\, \delta X = Y_-\, dX + \frac{1}{2}\, dY\, dX.$$

The integration by parts formula now takes on the traditional Leibniz form:

$$d(XY) = (\delta X)\, Y_- + X_-\, (\delta Y).$$

Let us look at a few examples. First, we consider $Z = \int f(W)\, \delta W$. It follows that

$$dZ = f(W)\, \delta W = f(W)\, dW + \frac{1}{2}\, df(W)\, dW \equiv f(W)\, dW + \frac{1}{2} f'(W)\, dt,$$

so that we obtain an equivalent Itô integral of the form $\int f(W)\, \delta W = \int f(W)\, dW + \frac{1}{2} \int f'(W)\, dt$.

More generally, for a diffusion determined by the SDE

$$dX_t = v(X_t)\, dt + \sigma(X_t)\, dW_t, \qquad X_0 = x_0,$$

we may try to write this as an equivalent Stratonovich SDE:

$$dX_t = w(X_t)\, dt + \varsigma(X_t)\, \delta W_t, \qquad X_0 = x_0.$$

We have exploited the fact that diffusions are continuous semimartingales, and so $X_+ = X_- = X$. We then have

$$dX_t = w(X_t)\, dt + \varsigma(X_t)\, dW_t + \frac{1}{2}\, d\varsigma(X_t)\, dW_t
      = \Bigl( w(X_t) + \frac{1}{2} \varsigma'(X_t)\, \sigma(X_t) \Bigr) dt + \varsigma(X_t)\, dW_t,$$

where we used the fact that $(dX_t)^2 = \sigma(X_t)^2\, dt$. We see that $\varsigma \equiv \sigma$ and

$$v(x) \equiv w(x) + \frac{1}{2} \sigma(x)\, \sigma'(x).$$

We refer to $v$ as the Itô drift velocity and $w$ as the Stratonovich drift velocity.
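The drift conversion can be checked numerically: integrating $dX = X \circ dW$ with a midpoint (Heun) predictor–corrector scheme should agree with Euler–Maruyama applied to the equivalent Itô equation $dX = \frac{1}{2}X\, dt + X\, dW$, both approximating $e^{W_t}$. A rough sketch (not from the book); the step count and tolerances are ad hoc:

```python
import math, random

random.seed(7)
T, n = 1.0, 20000
h = T / n
dW = [random.gauss(0.0, math.sqrt(h)) for _ in range(n)]

# Stratonovich SDE dX = X o dW via the Heun predictor-corrector scheme
x_strat = 1.0
for dw in dW:
    pred = x_strat + x_strat * dw                 # Euler predictor
    x_strat = x_strat + 0.5 * (x_strat + pred) * dw

# equivalent Ito SDE dX = (1/2) X dt + X dW via Euler-Maruyama
x_ito = 1.0
for dw in dW:
    x_ito = x_ito + 0.5 * x_ito * h + x_ito * dw

# Stratonovich calculus obeys the ordinary chain rule, so X_T = exp(W_T)
exact = math.exp(sum(dW))
assert abs(x_strat - exact) / exact < 0.05
assert abs(x_ito - exact) / exact < 0.05
```

Both schemes use the same Brownian increments, so the two approximations also agree with each other, illustrating that $v = w + \frac{1}{2}\sigma\sigma'$ is exactly the correction needed.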


Poisson processes are discontinuous, however, so we need to distinguish between $N = N_+$ and $N_-$. We have

$$f(N_-)\, \delta N = f(N_-)\, dN + \frac{1}{2}\, df(N_-)\, dN = \frac{1}{2} \bigl( f(N_- + 1) + f(N_-) \bigr)\, dN.$$

2.6.1 Stratonovich Time-Ordered Exponentials

Let $(Z_t)$ be a semimartingale. The Stratonovich exponential process of $Z$ is defined as the solution $S_t$ of the SDE

$$dS_t = S_{t-}\, \delta Z_t, \qquad S_0 = 1,$$

whenever it exists. The solution will be denoted as $S_t = \mathcal{T}_S e^{Z_t}$. We now relate this to earlier notions of exponentials of stochastic processes.

Proposition 2.6.1 For a semimartingale $Z$, we have $\mathcal{T}_S e^{Z_t} \equiv e_{/}^{X_t}$, where

$$dX = dZ + \frac{1}{2}\, dZ\, dX.$$  (2.27)

Proof Let $S_t = \mathcal{T}_S e^{Z_t}$, so that

$$dS = S_-\, \delta Z = S_-\, dZ + \frac{1}{2}\, dS\, dZ.$$

If we also require $S_t = e_{/}^{X_t}$, then we also have $dS = S_-\, dX$. Substituting in the term $dS = S_-\, dX$, and dividing through by $S_-$, which will be positive when the SDE is well posed, we obtain the consistency condition (2.27).

Example: Wiener Stratonovich Exponentials. Let us set $Z_t = kW_t$ and make the assumption that $dX_t = \alpha\, dW_t + \beta\, dt$. The consistency condition reads as

$$\alpha\, dW + \beta\, dt = k\, dW + \frac{1}{2} k\alpha\, dt,$$

so that $\alpha = k$ and $\beta = \frac{1}{2} k^2$. Therefore, $\mathcal{T}_S e^{kW_t} = e_{/}^{kW_t + \frac{1}{2}k^2 t} \equiv e_{/}^{kW_t}\, e^{\frac{1}{2}k^2 t}$. However, using (2.22) we get $\mathcal{T}_S e^{kW_t} = e^{kW_t}$.

Example: Poisson Stratonovich Exponentials. Now let us try $Z_t = kN_t$ and make the assumption that $X_t = lN_t$. The consistency condition (2.27) then implies

$$l = k + \frac{1}{2} lk,$$


and so $l = \frac{k}{1 - \frac{1}{2}k}$. We therefore find, with the help of (2.25), that

$$\mathcal{T}_S e^{kN_t} = e_{/}^{\frac{k}{1 - \frac{1}{2}k} N_t} \equiv \left( \frac{1 + \frac{1}{2}k}{1 - \frac{1}{2}k} \right)^{N_t}.$$

The problem is well posed so long as $-2 < k < 2$.

2.7 Rota–Wallstrom Theory

Finally, we discuss the relation of the preceding concepts of diagonal-free multiple stochastic integrals to the combinatorial methods introduced by Rota and Wallstrom (1997).

To begin with, let us fix a measurable space $(X, \mathcal{X})$. We consider vector-valued measures on $\mathcal{X}$, that is, maps $M : \mathcal{X} \to V$, where $V$ is a normed vector space, with the properties that $M[\emptyset] = 0$ and $M[\cup_n A_n] = \sum_n M[A_n]$, where the $\{A_n\}$ are pairwise disjoint elements of $\mathcal{X}$ and the sum converges in the norm topology. The problem of defining products of vector-valued measures is not as trivial as for real-valued measures. In particular, we shall take $V = L^2(\Omega, \mathcal{F}, \mathbb{P})$, with the $L^2$-norm topology, where we fix a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. In such circumstances, we shall refer to $L^2(\Omega, \mathcal{F}, \mathbb{P})$-valued measures as random measures.

Let $\pi$ be a partition of $\{1, \dots, n\}$; then we define a map $P^{(n)}_\pi$ from the subsets of $X^n$ to itself by

$$P^{(n)}_\pi B \triangleq \bigl\{ (x_1, \dots, x_n) \in B : i \overset{\pi}{\sim} j \iff x_i = x_j \bigr\},$$

where $i \overset{\pi}{\sim} j$ means that the indices $i$ and $j$ lie in the same block. For instance, if the partition is $\pi = \{\{1,2\}, \{3,5,7\}, \{4,6\}\}$, then $P^{(n)}_\pi B$ consists of the elements in $B$ of the form $(x, x, y, z, y, z, y)$ with $x, y, z$ all distinct. We have that

$$P^{(n)}_\pi B \cap P^{(n)}_{\pi'} B = \emptyset$$

whenever $\pi \ne \pi'$. We also have the projective property

$$P^{(n)}_{\pi'} \circ P^{(n)}_\pi = \delta_{\pi', \pi}\, P^{(n)}_\pi.$$

It is convenient to introduce the map $P^{(n)}_{\ge \pi}$ defined by the (disjoint!) union

$$P^{(n)}_{\ge \pi} B \triangleq \bigcup_{\pi' \ge \pi} P^{(n)}_{\pi'} B,$$  (2.28)

so that $P^{(n)}_{\ge \pi} B = \{ (x_1, \dots, x_n) \in B : i \overset{\pi}{\sim} j \implies x_i = x_j \}$. The map $P^{(n)}_{\ge 0}$ is then just the identity map between subsets of $X^n$. We note that

$$P^{(n)}_0 B \equiv \{ (x_1, \dots, x_n) \in B : \text{no coincidences!} \}, \qquad P^{(n)}_1 B \equiv \{ (x, \dots, x) \in B \}.$$

We shall make the assumption that each $P^{(n)}_\pi$ is measurable from $X^n$ to itself: this is not guaranteed, but will be the case if $X$ is a Polish space and $\mathcal{X}$ is the corresponding Borel $\sigma$-algebra.

The following notion of "good" measures is due to Rota and Wallstrom. Let $M_1, \dots, M_n$ be random measures; then we define $M_1 \otimes \cdots \otimes M_n$ on the product sets of $X^n$ by

$$M_1 \otimes \cdots \otimes M_n [A_1 \times \cdots \times A_n] = M_1[A_1] \cdots M_n[A_n],$$

and we say that the measures are jointly good if $M_1 \otimes \cdots \otimes M_n$ extends to a (unique) random measure on $X^n$, which we also denote as $M_1 \otimes \cdots \otimes M_n$. A random measure $M$ is said to be $n$-good if its $n$-fold product $M^{\otimes n}$ extends in this way. The notion of jointly good is due to Farré, Jolis, and Utzet (2008).

We are then interested in the restricted measures

$$(M_1 \otimes \cdots \otimes M_n)_\pi \triangleq (M_1 \otimes \cdots \otimes M_n) \circ P^{(n)}_{\ge \pi}, \qquad
\mathrm{St}_\pi(M_1 \otimes \cdots \otimes M_n) \triangleq (M_1 \otimes \cdots \otimes M_n) \circ P^{(n)}_\pi.$$

These are then related by the identities

$$(M_1 \otimes \cdots \otimes M_n) \circ P^{(n)}_{\ge \pi} \equiv \sum_{\pi' \ge \pi} (M_1 \otimes \cdots \otimes M_n) \circ P^{(n)}_{\pi'},$$  (2.29)

$$(M_1 \otimes \cdots \otimes M_n) \circ P^{(n)}_\pi \equiv \sum_{\pi' \ge \pi} \mu(\pi, \pi')\, (M_1 \otimes \cdots \otimes M_n) \circ P^{(n)}_{\ge \pi'},$$  (2.30)

which follow immediately from (2.28) and its Möbius inversion.

Now let $X$ be an $L^2$-semimartingale; then we may define a random measure $M_X$ on the Borel subsets of $\mathbb{R}_+$ by extension of

$$M_X : (s, t] \mapsto X(t) - X(s).$$

In particular, given a collection $X^{(1)}, \dots, X^{(n)}$ of such processes, their diagonal-free multiple integrals are identified with

$$\int^{-}_{B} dX^{(n)}_{t_n} \cdots dX^{(1)}_{t_1} \equiv \bigl( M_{X^{(1)}} \otimes \cdots \otimes M_{X^{(n)}} \bigr) \circ P^{(n)}_0 [B].$$

We also note that the cumulant $X^{[n]}$ of a process $X$ occurs as $M_{X^{[n]}} \equiv (M_X)^{\otimes n} \circ P^{(n)}_1$.

Further discussion can be found in the monograph of Peccati and Taqqu (2011).
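The decomposition (2.28) is finite combinatorics, and can be verified exhaustively for small $n$. The sketch below (not from the book) classifies tuples by their induced coincidence partition and checks that $P_{\ge\pi}^{(n)}B$ is indeed the disjoint union of the strata $P_{\pi'}^{(n)}B$ over $\pi' \ge \pi$:

```python
from itertools import product

def kernel(x):
    """Induced partition of the index set: i ~ j iff x_i == x_j."""
    blocks = {}
    for i, v in enumerate(x):
        blocks.setdefault(v, []).append(i)
    return frozenset(frozenset(b) for b in blocks.values())

def coarsens(pi, rho):
    """rho >= pi: every block of pi is contained in some block of rho."""
    return all(any(b <= c for c in rho) for b in pi)

def satisfies(pi, x):
    """Defining property of P_{>=pi}: i ~ j in pi implies x_i == x_j."""
    return all(x[i] == x[j] for b in pi for i in b for j in b)

n = 3
B = list(product(range(3), repeat=n))      # B = X^n with X = {0, 1, 2}
partitions = {kernel(x) for x in B}        # all 5 partitions of {0, 1, 2}
assert len(partitions) == 5

for pi in partitions:
    P_ge = [x for x in B if satisfies(pi, x)]
    union = [x for x in B if coarsens(pi, kernel(x))]
    assert sorted(P_ge) == sorted(union)   # (2.28) as a set identity
    # the strata P_rho B are pairwise disjoint, so their sizes add up
    assert len(P_ge) == sum(sum(1 for x in B if kernel(x) == rho)
                            for rho in partitions if coarsens(pi, rho))
```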

3 Quantum Probability

Our ultimate aim is to study quantum fields; however, in this chapter we look at quantum mechanical systems, with the intention of working through some of the combinatorial aspects relating to commonly occurring operators for quantum states, as an analogy to random variables. The combinatorics associated with creation and annihilation operators for the harmonic oscillator turns out to be particularly relevant as a bridge between what we have already seen and the fields to come later on.

In quantum theory, physical variables are understood as observables, which are modeled as self-adjoint operators acting on a fixed Hilbert space $\mathfrak{h}$. Observables do not necessarily commute, and this is the main point of departure from classical probability. Probabilities enter as a state (expectation) $\mathsf{E}$, which is a map assigning a complex number $\mathsf{E}[A]$ to each observable $A$ (we reserve $\mathbb{E}[\,\cdot\,]$ for classical Kolmogorov averages and $\mathsf{E}[\,\cdot\,]$ for quantum mechanical averages), satisfying the following properties:

1. Linearity: $\mathsf{E}[\alpha A + \beta B] = \alpha \mathsf{E}[A] + \beta \mathsf{E}[B]$;
2. Positivity: $X \ge 0 \implies \mathsf{E}[X] \ge 0$;
3. Normalization: $\mathsf{E}[I] = 1$;

for all observables $X, Y$ and scalars $\alpha, \beta$. Here $I$ denotes the identity operator.

A central result of functional analysis – see Reed and Simon (1972) – is that every self-adjoint operator $X$ has an associated projection-valued measure $P_X$ such that

$$X \equiv \int_{\mathbb{R}} x\, P_X(dx),$$  (3.1)

which is the spectral decomposition of $X$. The distribution of the observable is then $K_X = \mathsf{E} \circ P_X$. This is equivalent to the Born rule, which states that the


probability that the observable $X$ will be measured to have a value in the Borel subset $A$ is $K_X[A] = \mathsf{E}[P_X[A]]$. Note that, despite its quantum origins, $K_X$ is then just a classical probability distribution!

3.1 The Canonical Anticommutation Relations

We start with anticommutation relations because they occur in the simplest form for two-level systems, where the operators are matrices on $\mathfrak{h} = \mathbb{C}^2$. We introduce the annihilation and creation operators

$$a = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad a^* = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$

They satisfy the canonical anticommutation relations

$$a a^* + a^* a = I, \qquad a^2 = a^{*2} = 0.$$

We recall the Pauli matrices

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

They satisfy the commutation relation $[\sigma_x, \sigma_y] = 2i\sigma_z$, et cyclia. We then have

$$a = \sigma_- = \tfrac{1}{2}(\sigma_x - i\sigma_y), \qquad a^* = \sigma_+ = \tfrac{1}{2}(\sigma_x + i\sigma_y),$$

with $\sigma_\pm$ being standard notation for the raising and lowering operators in quantum mechanics.

An orthonormal basis for the two-dimensional Hilbert space $\mathfrak{h}$ is then given by

$$|0\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad |1\rangle = a^*|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$

$|0\rangle$ is called the ground state vector, and we note that $a|0\rangle = 0$. The ground state expectation is then defined to be

$$\mathsf{E}[X] = \langle 0 | X | 0 \rangle = \begin{pmatrix} 0 & 1 \end{pmatrix} \begin{pmatrix} x_{11} & x_{10} \\ x_{01} & x_{00} \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = x_{00}.$$


In general,

$$\mathsf{E}\bigl[ e^{tX} \bigr] = e^{x_+ t}\, p_+ + e^{x_- t}\, p_-,$$

where $x_\pm$ are the eigenvalues of $X$ and (with $\phi_\pm$ the corresponding eigenvectors) $p_\pm = |\langle \phi_\pm | 0 \rangle|^2$. That is, $P[dx] = p_+ \delta_{x_+}[dx] + p_- \delta_{x_-}[dx]$, with

$$\mathrm{Prob}\bigl[ X = x_\pm \bigr] = p_\pm.$$

3.1.1 The Standard Fermionic Gaussian Distribution

The observable

$$Q = a + a^* = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

has eigenvalues $\pm 1$ with equal probability $p_\pm = \frac{1}{2}$ in the ground state. This is the distribution of a fair coin. For reasons that will become apparent later, we may refer to it as the standard Fermionic Gaussian distribution.

The observable

$$\Lambda \triangleq a^* a = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$$

has spectrum $\{0, 1\}$: it has $0, 1$ as eigenvalues, and $|0\rangle, |1\rangle$ as the corresponding eigenvectors.

A related observable is

$$N = a^* a + a + a^* + I = (a + I)^*(a + I) = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix},$$

and it has eigenvalues $\frac{3 \pm \sqrt{5}}{2}$, with associated probabilities $p_\pm = \frac{1}{2} \mp \frac{1}{2\sqrt{5}}$ for the ground state. It is not difficult to see that the moments of $N$ in the ground state are Fibonacci numbers: defining $F_0 = F_1 = 1$ and $F_n = F_{n-1} + F_{n-2}$, one finds $\mathsf{E}[N^n] = F_{2n-2}$ for $n \ge 1$, that is, the sequence $1, 2, 5, 13, 34, \dots$. A little more work shows that (for $\lambda > 0$)

$$N_\lambda = a^* a + \sqrt{\lambda}\, a + \sqrt{\lambda}\, a^* + \lambda = (a + \sqrt{\lambda}\, I)^*(a + \sqrt{\lambda}\, I) = \begin{pmatrix} 1 + \lambda & \sqrt{\lambda} \\ \sqrt{\lambda} & \lambda \end{pmatrix}$$  (3.2)

has ground state distribution $P[dx] = p_+ \delta_{\mu_+}[dx] + p_- \delta_{\mu_-}[dx]$, where

$$\mu_\pm = \frac{2\lambda + 1 \pm \sqrt{4\lambda + 1}}{2}, \qquad p_\pm = \frac{1}{2}\left( 1 \mp \frac{1}{\sqrt{4\lambda + 1}} \right).$$  (3.3)

We may similarly refer to this as the Fermionic Poisson distribution of intensity λ.
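The two-level algebra is small enough to verify by direct matrix computation. The sketch below (illustrative, not the book's code) checks the anticommutation relations and the Fibonacci pattern of the ground-state moments of $N$, which land on every second Fibonacci number:

```python
import numpy as np

a = np.array([[0, 0], [1, 0]], dtype=float)
ad = a.T                                   # a* (real matrix, so transpose)
I = np.eye(2)

# canonical anticommutation relations
assert np.allclose(a @ ad + ad @ a, I)
assert np.allclose(a @ a, 0) and np.allclose(ad @ ad, 0)

# ground state |0> = (0, 1)^T is annihilated by a
vac = np.array([0.0, 1.0])
assert np.allclose(a @ vac, 0)

# moments of N = a*a + a + a* + I in the ground state: E[N^n] = F_{2n-2}
N = ad @ a + a + ad + I
fib = [1, 1]                               # F_0 = F_1 = 1
while len(fib) < 9:
    fib.append(fib[-1] + fib[-2])
for n in range(1, 5):
    moment = vac @ np.linalg.matrix_power(N, n) @ vac
    assert round(moment) == fib[2 * n - 2]  # 1, 2, 5, 13
```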

3.2 The Canonical Commutation Relations

We shall be interested in a pair of mutually adjoint operators $b$ and $b^*$ on a Hilbert space $\mathfrak{h}$ satisfying the canonical commutation relations

$$[b, b^*] = I,$$  (3.4)

where $I$ is the identity operator. Alternatively, we introduce the quadrature operators $Q, P$ defined to be

$$Q = b + b^*, \qquad P = \frac{1}{i}(b - b^*),$$

and these satisfy the equivalent canonical commutation relations $[Q, P] = 2i\, I$. It is clear that we cannot realize these as operators on a finite-dimensional Hilbert space, as taking the trace of equation (3.4) would lead to $0 = \dim(\mathfrak{h})$. The operators may be realized concretely on $\mathfrak{h} = L^2(\mathbb{R}, dq)$ as

$$Q\psi(q) = q\psi(q), \qquad P\psi(q) = -2i\psi'(q).$$

We shall call this the $q$-representation. From time to time, we will adopt the Dirac bra-ket notation $|\psi\rangle$ for the "ket" associated with a vector $\psi$, in which case the $q$-representation is $\langle q | \psi \rangle \equiv \psi(q)$.

It is convenient to adopt a complex phase space picture where we set $\beta = \frac{1}{2}(x + iy) \in \mathbb{C}$. The quadrature coordinates are then $x = 2\,\mathrm{Re}\,\beta$, $y = 2\,\mathrm{Im}\,\beta$; see Figure 3.1. We may introduce a symplectic area on the complex phase space:

$$\mathrm{Im}[\beta_1^* \beta_2] = \frac{1}{4}(x_1 y_2 - y_1 x_2),$$  (3.5)

for $\beta_j = \frac{1}{2}(x_j + i y_j)$.


[Figure 3.1: Signed phase (symplectic) area $\mathrm{Im}[\beta_1^* \beta_2]$.]

We then view $b, b^*$ and $Q, P$ as the "quantized" versions of the variables $\beta, \beta^*$ and $x, y$, respectively.

Proposition 3.2.1 There is a unique (up to normalization) vector $\psi_0$ (with ket written as $|0\rangle$ and referred to as the vacuum vector) such that $b|0\rangle = 0$. The observable $Q$ in this state is a standard Gaussian.

Proof The relation becomes $(q + 2\frac{d}{dq})\psi_0(q) = 0$, which has the normalized solution

$$\psi_0(q) = \langle q | 0 \rangle = (2\pi)^{-1/4} e^{-q^2/4}.$$

The probability density of $Q$ for this state is then

$$|\psi_0(q)|^2 = \gamma(q) \triangleq (2\pi)^{-1/2} e^{-q^2/2}.$$

3.2.1 Weyl Displacement Unitaries

The Weyl displacement operator with argument $\beta \in \mathbb{C}$ is the unitary operator

$$D(\beta) \triangleq e^{\beta b^* - \beta^* b}.$$  (3.6)

Note that we may alternatively write, with $\beta = \frac{1}{2}(x + iy)$,

$$D(\beta) = e^{2i\,\mathrm{Im}[\beta b^*]} = e^{\frac{i}{2}(Qy - Px)}.$$

We will derive the main properties of the displacement operators, and to this end the following (BCH) formula will be indispensable.


Lemma 3.2.2 (The Baker–Campbell–Hausdorff Formula) Let $A$ and $B$ be operators such that $[A, B]$ commutes with both $A$ and $B$; then

$$e^{A+B} = e^A e^B e^{-\frac{1}{2}[A,B]}.$$  (3.7)

Proof Let $R_t = e^{t(A+B)}$ and $S_t = e^{-tA} R_t$; then $\frac{d}{dt} S_t = e^{-tA} B e^{tA} S_t \equiv B_t S_t$, where $B_t = e^{-tA} B e^{tA}$. Now we have that $\frac{d}{dt} B_t = e^{-tA} [B, A] e^{tA} = [B, A]$, since the commutator $[B, A]$ commutes with $A$. Therefore, $B_t = B + t[B, A]$, and so $\frac{d}{dt} S_t = (B + t[B, A]) S_t$. From the initial condition $S_0 = I$, one sees that

$$S_t = e^{tB} e^{-\frac{1}{2} t^2 [A,B]},$$

where the second part follows from the fact that $[A, B]$ commutes with $B$. Setting $t = 1$ yields the result.

As a corollary, if $A$ and $B$ satisfy the conditions of the preceding lemma, then

$$e^A e^B = e^B e^A e^{[A,B]}.$$  (3.8)

Proposition 3.2.3 (The Weyl Form of the Canonical Commutation Relations) The Weyl displacement operators satisfy the relations

$$D(\beta_2) D(\beta_1) = e^{-i\,\mathrm{Im}[\beta_2^* \beta_1]}\, D(\beta_1 + \beta_2).$$  (3.9)

The proof follows directly as an application of the BCH formula. As a corollary, the Weyl displacement operators are unitary and $D(\beta)^{-1} = D(\beta)^* = D(-\beta)$.

We may introduce the related operators $D(\beta, \theta) = e^{-i\theta} D(\beta)$, where $\theta$ is a real phase. If we introduce the law

$$(\beta_1, \theta_1) \bullet (\beta_2, \theta_2) \equiv \bigl( \beta_1 + \beta_2,\ \theta_1 + \theta_2 + \mathrm{Im}[\beta_1^* \beta_2] \bigr),$$

then the collection of pairs $(\beta, \theta)$, with $\beta$ complex and $\theta$ real, forms a group with product

$$D(\beta_1, \theta_1)\, D(\beta_2, \theta_2) \equiv D\bigl( (\beta_1, \theta_1) \bullet (\beta_2, \theta_2) \bigr).$$

This is the Heisenberg group.

Proposition 3.2.4 Let $\beta = \frac{1}{2}(x + iy) \in \mathbb{C}$; then we may write

$$D(\beta) \equiv e^{-|\beta|^2/2}\, e^{\beta b^*} e^{-\beta^* b} = e^{+|\beta|^2/2}\, e^{-\beta^* b} e^{\beta b^*}
         = e^{ixy/4}\, e^{-ixP/2} e^{iyQ/2} = e^{-ixy/4}\, e^{iyQ/2} e^{-ixP/2}.$$


They are all consequences of the BCH formula.

The displacement operators get their name from the following property:

$$D(\beta)^* b D(\beta) = b + \beta,$$  (3.10)

and likewise $D(\beta)^* b^* D(\beta) = b^* + \beta^*$. To see this, let us fix $\beta$ and set $D_t = D(t\beta)$ for real $t$, and put $b_t = D_t^* b D_t$. Then

$$\frac{d}{dt} b_t = D_t^* [b, \beta b^* - \beta^* b] D_t = \beta,$$

so $b_t = b + t\beta$. Setting $t = 1$ gives the result.
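The factorization of $D(\beta)$ in Proposition 3.2.4 can be tested numerically on a truncated Fock space. Truncation spoils the CCR only in the corner of the matrices, so the comparison is made away from the cutoff. A sketch (not from the book), with a hand-rolled series matrix exponential to avoid a SciPy dependency; the dimension, amplitude, and tolerance are ad hoc:

```python
import numpy as np

D = 30                                     # Fock-space truncation
beta = 0.4 + 0.2j

# truncated annihilation operator: b|n> = sqrt(n)|n-1>
b = np.diag(np.sqrt(np.arange(1, D)), 1).astype(complex)
bd = b.conj().T

def expm(M, terms=80):
    """Series matrix exponential (illustrative; fine for these small norms)."""
    out = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

Dbeta = expm(beta * bd - beta.conjugate() * b)
factored = np.exp(-abs(beta) ** 2 / 2) * expm(beta * bd) @ expm(-beta.conjugate() * b)

# compare away from the truncation edge, where the cutoff is felt
assert np.allclose(Dbeta[:10, :10], factored[:10, :10], atol=1e-8)
```

In particular the $(0,0)$ entry recovers $\langle 0|D(\beta)|0\rangle = e^{-|\beta|^2/2}$.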

3.2.2 Number States

The number operator $\Lambda$ is then defined as

$$\Lambda \triangleq b^* b,$$  (3.11)

and the trio $b, b^*, \Lambda$ satisfy the algebraic relations

$$[b, b^*] = 1, \qquad [b, \Lambda] = b, \qquad [b^*, \Lambda] = -b^*.$$  (3.12)

The latter two equalities can be expressed as $\Lambda b = b(\Lambda - 1)$ and $\Lambda b^* = b^*(\Lambda + 1)$, respectively. In the $q$-representation, we have

$$b \equiv \frac{1}{2} q + \frac{\partial}{\partial q}, \qquad b^* \equiv \frac{1}{2} q - \frac{\partial}{\partial q}, \qquad
\Lambda \equiv -\frac{\partial^2}{\partial q^2} + \frac{1}{4} q^2 - \frac{1}{2}.$$

The eigenfunctions $|n\rangle$ of $\Lambda$ in the $q$-representation are the square-integrable solutions $h_n$ to the differential equation

$$\Bigl( -\frac{\partial^2}{\partial q^2} + \frac{1}{4} q^2 - \frac{1}{2} \Bigr) h_n(q) = n\, h_n(q),$$

and these are the Hermite functions

$$\langle q | n \rangle = h_n(q) = \frac{1}{\sqrt{n!}} \sqrt{\gamma(q)}\, H_n(q),$$

where $n = 0, 1, 2, \dots$, and the $H_n$ are the Hermite polynomials. (Again, $\gamma$ is the standard Gaussian density.) The vectors $|n\rangle$ are called the number vectors, and we exploit the fact that $\{h_n(x) : n = 0, 1, 2, \dots\}$ form a complete orthonormal basis for the Hilbert space $L^2(\mathbb{R}, dx)$ to get the resolution of identity

$$\sum_{n \ge 0} |n\rangle\langle n| = 1.$$  (3.13)

The vectors $|n\rangle$, $n \in \mathbb{N}$, are then the complete set of eigenstates of $\Lambda$, specifically $\Lambda|n\rangle = n|n\rangle$. The case $n = 0$ is just the vacuum vector! It therefore follows that $\Lambda$ has spectrum equal to $\mathbb{N} = \{0, 1, 2, \dots\}$. We have that $b|0\rangle = 0$. From the identity $\Lambda b = b(\Lambda - 1)$, we also see that $b|n\rangle$ is an eigenvector of $\Lambda$ with eigenvalue $n - 1$, for $n \ge 1$, since $\Lambda b|n\rangle = (b\Lambda - b)|n\rangle = (n-1)\, b|n\rangle$, and therefore $b|n\rangle$ is proportional to $|n-1\rangle$. In fact, since $\langle n | b^* b | n \rangle = \langle n | \Lambda | n \rangle = n$, we find that $b|n\rangle = \sqrt{n}\, |n-1\rangle$. Likewise, $b^*|n\rangle$ is an eigenvector of $\Lambda$ with eigenvalue $n + 1$, and a similar argument shows that $b^*|n\rangle = \sqrt{n+1}\, |n+1\rangle$. We summarize as follows:

$$b^* |n\rangle = \sqrt{n+1}\, |n+1\rangle, \qquad b|n\rangle = \begin{cases} \sqrt{n}\, |n-1\rangle, & n \ge 1; \\ 0, & n = 0. \end{cases}$$  (3.14)

The operator $b^*$ is therefore referred to as a raising operator and $b$ as a lowering operator. Indeed, we can obtain all number vectors by repeated application of the raising operator:

$$|n\rangle = \frac{(b^*)^n}{\sqrt{n!}}\, |0\rangle.$$  (3.15)

(3.15)

3.2.3 Bargmann Vectors The Bargmann vector with argument β ∈ C is the vector defined by ∗

| exp(β)  eβb |0 .

(3.16)

Note that | exp(0) is the vacuum state |0 . Proposition 3.2.5 The Bargmann vectors satisfy the properties | exp(β) = e|β|

2 /2

D(β) |0 ,

−α ∗ β−|α|2 /2

D(α)| exp(β) = e

β1∗ β2

exp(β1 )| exp(β2 ) = e

| exp(β + α) ,

,

−1/4 − 14 q2 +βq− 12 β 2

q|β = (2π )

e

.

64

Quantum Probability ∗

Proof The first relation follows from the fact that e−β b |0 ≡ |0 . We there∗ ∗ fore have | exp(β) ≡ eβb e−β b |0 , and using Proposition 3.2.4 we may rewrite in terms of D(β). The second relation follows from D(α)| exp(β) = e|β|

2 /2

|β|2 /2

=e

= e|β|

2 /2

= e−α

D(α)D(β) |0 ∗

e−iImα β D(α + β) |0 ∗

e−iImα β e−|α+β)| | exp(α + β)

∗ β−|α|2 /2

2

| exp(β + α) .

To prove the third, we first note that

0|D(β)|0 = e−|β|

2 /2





0|eβb e−β b |0 = e−|β|

2 /2

0|0 = e−|β|

2 /2

,

and then

exp(β1 )| exp(β2 ) = e|β1 |

2 /2

0|D(−β1 )D(β2 )|0 e|β2 |

= e|β1 |

2 /2

eiIm



β1∗ β2



e|β2 |

2 /2

2 /2

0|D(β2 − β1 )|0

β1∗ β2

=e

The Bargmann vectors may be computed in the q-representation. We use the factorization D(β) = e−ixy/4 eiyQ/2 e−ixP/2 , where x = 2Reβ and y = 2Imβ. Now, in the q-representation ∂ −x ∂q

eiyQ/2 ψ(q) = eiyq/2 ψ(q) and e−ixP/2 ψ(q) = e

ψ(q) = ψ(q − x)

so that

q| exp(β) = e|β|

2 /2

e−ixy/4 eiyq/2 (2π )−1/4 e−(q−x) 1 2 +βq− 12 β 2

≡ (2π )−1/4 e− 4 q

2 /4

.

The set of Bargmann vectors $\{|\exp(\beta_j)\rangle : j = 1, \dots, n\}$, for distinct complex numbers $\beta_j$, is linearly independent. In general, the set of all Bargmann vectors is overcomplete, and we have the following resolution of identity.

Proposition 3.2.6 The Bargmann vector $|\exp(\beta)\rangle$ may be expanded in the number basis as

$$|\exp(\beta)\rangle = \sum_{n \ge 0} \frac{\beta^n}{\sqrt{n!}}\, |n\rangle.$$

3.2 The Canonical Commutation Relations

We see again

exp(α)| exp(β) =



65

exp(α)|n n| exp(β)

n≥0

=

 (α ∗ β)n n≥0

n!



= eα β .

Lemma 3.2.7 The exponential vectors give the following resolution of identity ' dβdβ ∗ −ββ ∗ |β β| = 1. (3.17) e π Proof

First of all, note that | exp(β) exp(β)| = =

 β n (β ∗ )m |n m| √ n! m! n,m≥0 

rn+m iθ(n−m) e |n m| √ n! m! n,m≥0

where β = r eiθ . The element of area in polar coordinates is rdr dθ , and performing the θ -integration, one finds that the terms n = m vanish. This leaves ' ∞ ' 2n dβdβ ∗ −ββ ∗ 2  r | exp(β) exp(β)| = 2rdr e−r e |n n| π n! C 0 n≥0  |n n|, = n≥0

using the substitution t = r2 and the identity n! = ness relation for number states gives the result.

∞ 0

e−t tn dt. The complete-

Lemma 3.2.8 The number operator generates unitaries eitN and these have the following action on Bargmann vectors: eit! | exp(β) = | exp(eit β) . Proof

(3.18)

Note that

eiθ! | exp(β) = eiθ!

 βn  (eiθ β)n |n = | exp(eiθ β) . √ |n = √ n! n! n≥0 n≥0

For a one-dimensional harmonic oscillator model for a particle with mass m kg with harmonic frequency ω s−1 , the Hamiltonian operator is ˆ = H

1 2 1 pˆ + mω2 qˆ 2 , 2m 2

(3.19)

66

Quantum Probability

where the canonical position and momentum operators satisfy the standard  Heisenberg commutation relations qˆ , pˆ = ih. ¯ In this problem, a natural length scale is set by + h¯ σ = . (3.20) 2mω It is then convenient to introduce dimensionless operators b=

σ 1 qˆ + i pˆ , h¯ 2σ

σ 1 qˆ − i pˆ , (3.21) 2σ h¯   and these satisfy the commutation relations b, b∗ = 1. (Essentially the quadratures are Q = σ1 q and P = σh¯ p.) The Hamiltonian may be written in terms of the corresponding b and b∗ as



1 1 ∗ ˆ =ω !+ . H = hω b b+ ¯ 2 2 b∗ =

3.2.4 Expectations The vacuum state expectation is defined by E0 [X] = 0|X|0 . For instance, we average D(z) to obtain ( ∗ ∗ ) 2 E0 ezb −z b = e−|z| /2 and setting z = it and −s, for t, s real, leads to ( )   2 2 E0 eitQ = e−t /2 , E0 eisP = e−s /2 . So both quadratures have standard Gaussian distribution. A normalized Bargmann vector is called a coherent state, and we write |coh(β) = e−|β|

2 /2

| exp(β) .

The coherent state with amplitude exp(β) likewise defines an expectation Eβ [X] = e−|β| exp(β)|X| exp(β) . 2

The vacuum is of course the special case exp(β) = 0, but we may relate the general coherent state to the vacuum via | exp(β) = e|β|

2 /2

D(β)|0 ,

3.2 The Canonical Commutation Relations

67

so that   Eβ [X] = 0|D(β)∗ XD(β)|0 = E0 D(β)∗ XD(β) . In particular, ( ∗ ∗ ) ( ) ∗ ∗ Eβ ezb −z b = E0 D(β)∗ ezb −z b D(β) ( ) ∗ ∗ = E0 ez(b+β) −z (b+β) ≡ e−|z|

2 /2+zβ ∗ −z∗ β

.

From this we deduce that the quadratures Q and P are Gaussian with unit variance, but with displaced means 2Reβ and 2Imβ, respectively. The observable ! has a Poisson distribution in the coherent state:   2 Eβ eit! = e−|β| exp(β)|eit! | exp(β) = e−|β| exp(β)| exp(eit β) 2

= e|β|

2 (eit −1)

.

(3.22)

Alternatively, we may say that the observable D(β)∗ !D(β) = (b + β)∗ (b + β) = ! + β ∗ b + β ∗ b + |β|2

(3.23)

has a Poisson distribution in the vacuum state.

3.2.5 Squeezing Let us introduce the operators   b2 ,

 ∗  (b∗ )2 .

(3.24)

We find that [b,  ∗ ] = 2b∗ ,

[, b∗ ] = 2b

(3.25)

and [,  ∗ ] = 4! + 2, [!,  ∗ ] = 2 ∗ , [, !] = 2.

(3.26)

ˆ and  ∗ is a In particular, the set consisting of linear combinations of I, !,  Lie algebra with commutator as bracket.

68

Quantum Probability

For complex ε, we define the squeezing operator by   1 ∗ 1 ∗ S (ε)  exp ε − ε  . 2 2

(3.27)

This is a unitary family and we note that S (ε)−1 = S (ε)∗ = S (−ε) . Lemma 3.2.9 Let ε have the polar form reiθ , then S (ε)∗ bS (ε) = cosh (r) b + sinh (r) eiθ b∗ Proof

(3.28)

Let b (u) = S (uε)∗ bS (uε) for real u, then d b (u) = εb (u)∗ , du 2

d 2 and so du 2 b (u) = r b (u). This is a simple second-order ODE with operatord valued initial conditions b (0) = b and du b (u) |u=0 = εb∗ yielding the solution (3.28).

The transformation b → cosh (r) b + sinh (r) eiθ b∗ preserves the canonical commutation relations and is referred to as a Bogoliubov transformation. Lemma 3.2.10 Let ε have the polar form reiθ , then the squeezing operator may be placed in the following Wick ordered form − 1 ∗ , (3.29) S (ε) = ζ  (cosh r)−!+ 2 ζ ∗

 where ζ = exp 12 eiθ tanh r . Proof We shall map the Lie algebra generated by I, !,  and  ∗ to the Lie algebra of Pauli matrices as follows: j (I) = I and   1 1 0 1 j(!) = σz − = , 2 2 0 −3   0 0 j() = 2iσ− = , 2i 0   0 2i j( ∗ ) = 2iσ+ = . 0 0 The map j preserves the Lie brackets since [ j(), j( ∗ )] = 4 j(!) + 2,   j(!), j( ∗ ) = 2 j( ∗ ),   j(), j(!) = 2 j().

3.3 Wick Ordering

69

In other words, j is a Lie algebra homomorphism. We note that j()∗ = j( ∗ ) so that j is not a *-map, but this does not affect our computation. We see that   1 1 ˆ εj( ∗ ) − ε∗ j() j (S (ε)) = exp 2 2   0 ireiθ = exp −ire−iθ 0   1 sinh r eiθ cosh r 2 . = − 12 sinh r e−iθ cosh r ∗

Alternatively, let us try and write S (ε) = es et eu! ev , then     1 u 0 0 0 2 exp j (S (ε)) = e exp exp 0 0 0 − 32 u -    , 1u 0 1 0 1 2it e2 = es − 32 u 2iv 1 0 1 0 e  2u  3 e − 4tv 2it . = es− 2 u 2iv 1 

s

0 2it

2iv 0



Comparing entries gives 3

es− 2 u = cosh r, e2u − 4tv = 1, 3 3 1 1 ves− 2 u = − sinh r e−iθ tes− 2 u = sinh r eiθ , 2 2 with unique solution es =

√ 1 , cosh r

eu =

tanh r e−iθ . This yields the stated form.

1 cosh r ,

t =

1 2

tanh r eiθ and v = − 12

3.3 Wick Ordering Let f (β, β ∗ ) be a polynomial function in complex variables β and β ∗ . The Wick (or normal) ordered operator corresponding to P with β replaced by b and β ∗ by b∗ is the operator : f (b, b∗ ): where we make this replacement in each term putting all b∗ to the left and b’s to the right. For instance, f (β, β ∗ ) = β 2 β ∗ + 4β ∗2 β leads to : f (b, b∗ ): = b∗ b2 + 4b∗2 b, while : (b + b∗ )2 : = b2 + 2b∗ b + b∗2 .

70

Quantum Probability

We also have

 n

b∗m bn−m , :Q : = m m n

: !n: = :(b∗ b)n: = b∗n bn ,

: D(β): = : eβb

∗ −β ∗ b





: = eβb e−β b .

The Wick ordering process is linear insofar as   : c1 f1 (b, b∗ ) + c2 f2 (b, b∗ ) : ≡ c1 : f1 (b, b∗ ) : +c2 : f2 (b, b∗ ) : .

(3.30)

Proposition 3.3.1 We have : Qn: = Hn (Q),

Qn = (−1)n Hn (−Q):,

and in particular : (b + b∗ )n: = Hn (b + b∗ ). Proof

For t real, we have using Proposition 3.2.4, ∗



: etQ: = : et(b+b ): = etb etb = e−t

2 /2



etb+tb = etQ−t

2 /2

.

(3.31)

Expanding in powers of t and using the generating function relation for Hermite polynomials gives the first result. Rearranging the preceding identity as 2 : etQ+t /2: = etQ gives the second result. Proposition 3.3.2 For z ∈ C, set N = (b + z)∗ (b + z) = ! + zb∗ + z∗ b + |z|2 , then  n : N m: and : N n: = N n . (3.32) Nn = m m Proof

On the one hand, we have

exp(α)|eitN | exp(β) = exp(α)|D(z)∗ eitN D(z)| exp(β) = e−zα

∗ −z∗ β−|z|2 ∗



exp(α + z)|eitN | exp(β + z)

= e−zα −z β−|z| exp(α + z)| exp(eit β + eit z)   ∗ it ∗ ≡ exp (α + z) (β + z)(e − 1) + α β . 2

While on the other

exp(α)| : eisN : | exp(β) = eis(α+z)

∗ (β+z)+α ∗ β

.

The two expressions agree if is + 1 = eit . Since the Bargmann vectors were arbitrary, we have eitN = : e(e

it −1)N

:,

(3.33)

3.3 Wick Ordering

71

which should be compared with the generating functions for the Poisson moments/Stirling numbers of the second kind (2.8). Expanding in t yields  n m equation (3.33) as the identity N n = m m : N :. We may rearrange the N uN N n : e : ≡ (1 + u) from which we deduce that : N : = n! n = N n . These identities hold true for z = 0 so we have, for instance,  n b∗m bm . !n = m m

(3.34)

3.3.1 Quantization Let F be a function of the complex phase plane variable β = 12 (x + iy), which we can write either as F(β, β ∗ ) or f (x, y). The phase space Fourier transform is defined by ' dβ2 dβ2∗ ∗ F˜ β1 , β1∗ = e2iIm[β1 β2 ] F(β2 , β2∗ ) π or equivalently f˜ (x1 , y1 ) =

'

i

e 2 (x1 y2 −y1 x2 ) f (x2 , y2 )

dx2 dy2 . 4π

The inverse transform remarkably is given by ' dβ dβ ∗ ∗ ∗ ˜ 2 , β2∗ ) 2 2 F β1 , β1 = e2iIm[β1 β2 ] F(β π where the usual sign change is accounted for by the symplectic area. To see this, we note that ' i dx2 dy2 e 2 (x1 y2 −y1 x2 ) f˜ (x2 , y2 ) 4π ' ' i dx2 dy2 dx3 dy3 [(x −x )y −(y −y )x = e 2 1 3 2 1 3 2 ] f (x3 , y3 ) 4π 4π ' = δ(x1 − x3 )δ(y1 − y3 )f (x3 , y3 ) dx3 dy3 = f (x1 , y1 ). The Weyl quantization f (Q, P) of a function f = f (x, y) is the operator defined by ' ' i dxdy dβdβ ∗ ˜ fˆ = e 2 (Qy−Px) f˜ (x, y) ≡ D(β)F(β, β ∗) . (3.35) 4π π

72

Quantum Probability

The Weyl quantization has the property that  m Pn−m , + bP)n = am bn−m Q (aQ m n Pn−m consists of all which implies that the Weyl quantization Q symmetrically order the operators, e.g.

n m

ways to

1 2 3  2 P3 = [Q P + QPQP2 + QP2 QP + QP3 Q + PQPQP Q 10 +PQ2 P2 + P2 Q2 P + PQP2 Q + P2 QPQ + P3 Q2 ]. It is not difficult to show that in the q-representation

' i q1 + q2 , y dy

q1 |f (Q, P)|q2 = e 2 (q1 −q2 )y f 2 and inversely

' f (x, y) =

We note that

i 1 1 (Q, P)|x + q dq. e 2 qy x − q|f 2 2

) ' ( dxdy ∗   . tr f (Q, P) g(Q, P) = f (x, y)∗ g(x, y) 4π

We shall now show that a similar principle can be applied to Wick ordering.   Proposition 3.3.3 F (β, β ∗ ) = Eβ : F(b, b∗ ): . Proof

This follows immediately from the fact that

exp(β)| : F(b, b∗ ): | exp(β) = F(β, β ∗ ) exp(β)| exp(β) = e|β| F(β, β ∗ ). 2

Lemma 3.3.4 The Wick order form is given by ' dβdβ ∗ ∗ ∗ ˜ . β ∗) : F(b, b∗ ): = eβb e−β b F(β, π This is almost the same as (3.35) except that D(β) = eβb ∗ ∗ by eβb e−β b in the integrand!

(3.36) ∗ −β ∗ b

is replaced

Taking coherent state expectations Eα of the right-hand side leads to ' dβdβ ∗ ∗ ∗ ˜ β ∗) eβα e−β α F(β, ≡ F(α, α ∗ ), π   which we identify with Eα :F(b, b∗ ): through Proposition 3.3.3.

Proof

3.3 Wick Ordering

73

We also see that   '   dβdβ ∗ −ββ ∗ |exp(β) exp(β)| e tr : F(b, b∗ ): = tr : F(b, b∗ ) : π ' ∗ dβdβ −ββ ∗

exp(β)| : F(b, b∗ ): |exp(β) = e π ' dβdβ ∗ F β, β ∗ . = π Proposition 3.3.5 The projection onto a coherent state |coh(α) is Pα = 2 e−|α| | exp(α) exp(α)|. We have Pα = : Fα (b, b∗ ), where Fα (β, β ∗ ) = e−|β−α| . 2

Proof

Again from Proposition 3.3.3, we will have Fα (β, β ∗ ) = Eβ [Pα ] = e−|β|

2 −|α|2

| α|β |2 = e−(β

∗ −α ∗ )(α−β)

.

The special case α = 0 yields the identification : e−!: = |0 0|, but we may expand  (−1)n  (−1)n : !n: = !(! − 1) · · · (! − n + 1) : e−!: = n! n! n n

 n ! = = 0! . (−1) m n Interpreting 00 = 1 shows that this surprising expression is indeed the correct one for projection onto the vacuum. Corollary 3.3.6 For l, m = 0, 1, 2, . . . , we obtain the following representation of rank-one operators:  (−1)n |l m| = (b∗ )n+l bn+m . n! n≥0

4 Quantum Fields

The aim of this chapter is to introduce standard combinatorial objects in a logical way to field theory. At the outset, we employ the shorthand notation of Guichardet for symmetric functions. Once this is done, we exploit the notion of combinatorial species (next chapter) as a mechanism for isolating the features, and to develop the powerful approach for characterizing and calculating moment generating functions.

4.1 Green’s Functions Let X be a Lusin space with a nonatomic measure, which we denote by dx. Formally a random field φ over the space X is determined by its moments1 G(x1 , . . . , xn ) = E[φ(x1 ) . . . φ(xn )], which we refer to as n-point Green’s functions. Without loss of generality, we shall assume that the fields are real scalar. They must be completely symmetric in all arguments: G(xσ (1) , . . . , xσ (n) ) = G(x1 , . . . , xn ) for any permutation σ of the indices. We may use the diagram in Figure 4.1 to denote a particular Green’s function. It is useful to assume that we have random variables ' φ(x)J(x)dx Q[J] = X 1 To avoid unnecessary complications, we just take it for granted that E is an expectation (=state)

on a sample space  consisting of all realizations of the fields. The exact mathematical realization of  as a measure space, as well as the construction of a probability measure equivalent to the expectation, is of course another matter; however, we wish to focus on algebraic properties in this chapter.

74

4.1 Green’s Functions

75

Figure 4.1 The diagram representing the Green’s function G(x1 , . . . , xn ).

Figure 4.2 The series expansion appearing in (4.1) is drawn in terms of diagrams. The terms J(x) are represented by a star at position x, while  the general term G(x1 , . . . , xn ) J(x1 ) . . . J(xn ) dx1 . . . dxn is denoted by the n-point Green’s function with each leg terminated by a star.

for J in some suitable class 𝒥 of test functions. In this case, we introduce the moment generating functions Z_G[J] ≜ E[e^{Q[J]}] and have formally

  Z_G[J] ≡ ∑_{n≥0} (1/n!) ∫ G(x_1, …, x_n) J(x_1) ⋯ J(x_n) dx_1 ⋯ dx_n.  (4.1)

This may be described diagrammatically as in Figure 4.2. We now make the observation that, because the order of the variables in a completely symmetric function does not matter, we may write G(x1 , . . . , xn ) ≡ G(X)

(4.2)

whenever these arguments are distinct, and X is the set {x_1, …, x_n}. Further, as X is assumed to be a measurable Lusin space with nonatomic measure dx and we are assuming that G is measurable, we do not actually need to know the values of the Green’s functions in situations where we have coincidences, that is, two or more points equal. We may formulate this as follows:

(The No Coincidence Principle) If X is a Lusin space with nonatomic measure dx, then any sequence of functions (G_n)_{n=0}^∞, with G_n measurable on X^n, is equivalent to a measurable function G : Power(X) → ℂ.


Note that the measurability on Power(X) is the one induced from X. Note also that we include a value G0 = G (∅).

4.1.1 Guichardet Shorthand Notation

We can also introduce a measure dX on Power(X) as follows (Guichardet, 1970):

  ∫ F(X) dX = ∑_{n≥0} (1/n!) ∫ F(x_1, …, x_n) dx_1 ⋯ dx_n.

A further shorthand is the following: let f be measurable on X; then, for X a finite subset of X, we set

  f^X ≜ ∏_{x∈X} f(x).  (4.3)

The moment generating function (4.1) for a sequence G of Green’s functions can be written in shorthand notation as

  Z_G[J] = ∫ G(X) J^X dX,  (4.4)

for suitable test functions J ∈ 𝒥. We shall interpret (4.4) as saying that a family G of Green’s functions has generating functional Z_G. Conversely, we say that a functional F = F[J] is analytic if it possesses a representation F ≡ Z_G for some measurable G.

Lemma 4.1.1  Define the function exp(f) on the power set of X by exp(f) : X ↦ f^X. Then

  Z_{exp(f)}[J] = exp( ∫ f(x) J(x) dx ).

Proof

In this case, we have

  Z_G[J] = ∫ f^X J^X dX
         = ∑_{n≥0} (1/n!) ∫ f(x_1) ⋯ f(x_n) J(x_1) ⋯ J(x_n) dx_1 ⋯ dx_n
         = ∑_{n≥0} (1/n!) ( ∫ f(x) J(x) dx )^n,

and summing the exponential series gives the result.


4.1.2 The Wick Product

Suppose that we have a pair F and G of Green’s function sequences. Is there a sequence of Green’s functions, say F ⋆ G, such that Z_F[J] Z_G[J] = Z_{F⋆G}[J]? The Wick product of functions on Power(X) is defined to be

  (F ⋆ G)(X) ≜ ∑_{X_1+X_2=X} F(X_1) G(X_2).  (4.5)

The sum is over all decompositions of the set X into ordered pairs (X_1, X_2) of disjoint subsets whose union is X.

Proposition 4.1.2  We have

  Z_{F⋆G} = Z_F Z_G,  (4.6)

where F ⋆ G is the Wick product.

Proof

Writing this out, we see that

  ∫∫ F(Y) G(Z) J^{Y+Z} dY dZ = ∫ (F ⋆ G)(X) J^X dX,

or

  ∑_{n_1,n_2≥0} (1/(n_1! n_2!)) ∫ F(y_1, …, y_{n_1}) G(z_1, …, z_{n_2}) J(y_1) ⋯ J(y_{n_1}) J(z_1) ⋯ J(z_{n_2}) dy_1 ⋯ dy_{n_1} dz_1 ⋯ dz_{n_2}
    = ∑_{n≥0} (1/n!) ∫ (F ⋆ G)(x_1, …, x_n) J(x_1) ⋯ J(x_n) dx_1 ⋯ dx_n,

and comparing powers of J we get (F ⋆ G)(X) = ∑_{Y+Z=X} F(Y) G(Z). Here the sum is over all possible ways to decompose the set X into two disjoint subsets.

As an exercise, try to show that the Wick product of exponential vectors is

  exp(f) ⋆ exp(g) ≡ exp(f + g).  (4.7)

Also try to show that the Wick product is a symmetric associative product with the general form

  (G_1 ⋆ ⋯ ⋆ G_n)(X) = ∑_{X_1+⋯+X_n=X} G_1(X_1) ⋯ G_n(X_n),

where the sum is now over all decompositions of X into n ordered disjoint subsets.
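On a finite ground set, the ordered-decomposition sum in (4.5) is easy to realize directly, with functions on Power(X) represented as functions of Python frozensets. The following sketch (the helper names are ours, not the text's) checks the exponential-vector identity exp(f) ⋆ exp(g) = exp(f + g) of (4.7) numerically.

```python
from itertools import combinations

def subsets(X):
    """All subsets of the frozenset X; the pairs (S, X - S) then run
    over every ordered decomposition X1 + X2 = X."""
    xs = sorted(X)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def wick(F, G):
    """Wick product (F * G)(X) = sum_{X1+X2=X} F(X1) G(X2), eq. (4.5)."""
    return lambda X: sum(F(S) * G(X - S) for S in subsets(X))

def exp_vec(f):
    """exp(f): X -> f^X = prod_{x in X} f(x), with value 1 on the empty set."""
    def e(X):
        p = 1.0
        for x in X:
            p *= f[x]
        return p
    return e
```

On X = {1, 2, 3} the sum over the eight subsets reproduces ∏_x (f(x) + g(x)); the symmetry of the product is the invariance of the sum under S ↦ X \ S.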


Some care is needed when the Green’s functions are the same. Denoting the nth Wick power of F by F^{⋆n} = F ⋆ ⋯ ⋆ F (n times), we have

  F^{⋆n}(X) = ∑_{X_1+⋯+X_n=X} F(X_1) ⋯ F(X_n)
            = n! ∑^{unordered}_{X_1+⋯+X_n=X} F(X_1) ⋯ F(X_n),

where the last sum is over all ways to decompose the set X into nonempty subsets {X_1, …, X_n}, this time unordered!

We digress slightly to prove the following important result.

Lemma 4.1.3 (The ∫–∑ Lemma)  For F a measurable function on Power(X)^{×p}, we have the identity

  ∫ F(X_1, …, X_p) dX_1 ⋯ dX_p ≡ ∫ ∑_{X_1+⋯+X_p=X} F(X_1, …, X_p) dX.  (4.8)

Proof  If we write both of these expressions out longhand, then the left-hand side picks up the factors 1/(n_1! ⋯ n_p!), where #X_k = n_k. On the right-hand side, we get 1/n!, where #X = n. Both expressions are multiple integrals with respect to either dX_1 ⋯ dX_p or dX with X_1 + ⋯ + X_p = X; however, on the right-hand side, we obtain an additional multinomial factor n!/(n_1! ⋯ n_p!), giving the number of decompositions of X with n_k elements in the kth subset. This accounts precisely for the combinatorial factors, so both sides are equal.

4.1.3 The Composition Formula

Let us consider a family of Green’s functions F with F(∅) = 0. We now wish to find the Green’s functions H(X) such that Z_H[J] = h(Z_F[J]), where h is an analytic function of complex numbers. If we have the Maclaurin series expansion h(z) = ∑_n (1/n!) h_n z^n, then

  H ≡ ∑_n (1/n!) h_n F^{⋆n},


and so

  H(X) = ∑_n h_n ∑^{unordered}_{X_1+⋯+X_n=X} F(X_1) ⋯ F(X_n).

As F(∅) = 0, we effectively have the sum over all {X_1, …, X_n} consisting of nonempty disjoint subsets of X whose union is X – in other words, partitions! – for all possible n. Let π = {X_1, …, X_n} be a partition of X; then we write F^π ≜ F(X_1) ⋯ F(X_n). (The union of two disjoint subsets A and B is denoted as A + B.) So we have

  ∑^{unordered}_{X_1+⋯+X_n=X} F(X_1) ⋯ F(X_n) = ∑_{π∈Part_n(X)} F^π.

The number of blocks making up the partition π is denoted as N(π). We now arrive at the following result.

Theorem 4.1.4  Let F be a measurable function on Power(X) with F(∅) = 0, and let h be a function with Maclaurin series h(z) = ∑_n (1/n!) h_n z^n. Then the equation Z_H[J] = h(Z_F[J]) generates the Green’s functions

  H(X) = ∑_{π∈Part(X)} h_{N(π)} F^π.  (4.9)
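The passage from ordered Wick powers to partitions can be checked by brute force on a small ground set. The sketch below (illustrative names, not from the text) evaluates H(X) once as ∑_n (h_n/n!) F^{⋆n}(X) and once by the partition formula (4.9), for an F with F(∅) = 0.

```python
from itertools import combinations
from math import factorial

def subsets(X):
    xs = sorted(X)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def partitions(X):
    """All partitions of the set X into nonempty blocks."""
    X = set(X)
    if not X:
        yield []
        return
    x = min(X)
    rest = X - {x}
    for S in subsets(rest):              # the block containing x is {x} + S
        for p in partitions(rest - S):
            yield [frozenset({x}) | S] + p

def wick_power(F, X, n):
    """F^{*n}(X): sum over ordered decompositions X1 + ... + Xn = X."""
    if n == 0:
        return 1 if not X else 0
    return sum(F(S) * wick_power(F, X - S, n - 1) for S in subsets(X))

def compose(h, F, X):
    """H(X) = sum_{pi in Part(X)} h_{N(pi)} F^pi, eq. (4.9)."""
    total = 0
    for p in partitions(X):
        term = h[len(p)]
        for B in p:
            term *= F(B)
        total += term
    return total
```

Because F vanishes on the empty set, the ordered sums only pick up decompositions into nonempty (hence distinct) blocks, and the factor n! in F^{⋆n} exactly cancels the 1/n! of the Maclaurin coefficients.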

4.1.4 Green’s Functions: Cumulants

In many practical applications, one wishes to characterize a probability distribution through its cumulants. Let G be a system of Green’s functions. The cumulant Green’s functions K are defined by their generating function W[J] = Z_K[J], where

  W[J] = ln Z_G[J].  (4.10)

We may use diagrams to describe the ordinary and cumulant Green’s functions, as in Figure 4.3. The series expansion for (4.10) can then be represented in diagram form as in Figure 4.4.

Theorem 4.1.5  The ordinary Green’s functions can be expressed in terms of the cumulant Green’s functions according to the rule

  G(X) = ∑_{π∈Part(X)} K^π.  (4.11)



Figure 4.3 The diagram on the left (black center) represents an ordinary Green’s function G123...n , and the one on the right (gray center) represents a cumulant Green’s function K123...n .

Figure 4.4 The series expansion for W[J] = ∫ K(X) J^X dX. Note that K(∅) = 0, so there is no constant term in the expansion.

Conversely, we have

  K(X) = ∑_{π∈Part(X)} μ(π) G^π,  (4.12)

where μ(π) = (−1)^{N(π)−1} (N(π) − 1)!.

Proof  For the first part, we simply use the preceding theorem with h_n = 1, so that h(t) = e^t, and take F = K and H = G. Note that K(∅) ≡ 0. To invert the relationship, we now set F to be

  G̃(X) = G(X) for X ≠ ∅, and G̃(∅) = 0,

and H = K. In this case, Z_K = ln Z_G = ln(1 + Z_{G̃}), since G(∅) = 1. We set h(t) = ln(1 + t), and so h_n = (−1)^{n−1} (n − 1)! in this case.


Writing G_123 as shorthand for G(X) with X = {x_1, x_2, x_3}, and so on, we have

  G_1 = K_1,
  G_12 = K_12 + K_1 K_2,
  G_123 = K_123 + K_12 K_3 + K_23 K_1 + K_31 K_2 + K_1 K_2 K_3,
  ⋮

and inversely

  K_1 = G_1,
  K_12 = G_12 − G_1 G_2,
  K_123 = G_123 − G_12 G_3 − G_23 G_1 − G_31 G_2 + 2 G_1 G_2 G_3,
  ⋮

Let us examine the 4th-order Green’s function expanded in terms of its cumulants:

  G_1234 = K_1234
         + K_1 K_234 + K_2 K_134 + K_3 K_124 + K_4 K_123
         + K_12 K_34 + K_13 K_24 + K_14 K_23
         + K_1 K_2 K_34 + K_1 K_3 K_24 + K_1 K_4 K_23 + K_2 K_3 K_14 + K_2 K_4 K_13 + K_3 K_4 K_12
         + K_1 K_2 K_3 K_4.

This may be represented as the sum over the diagrams in Figure 4.5. In general, an n-point Green’s function G_{12…n} will be a sum of B_n products of cumulant Green’s functions, where B_n (the nth Bell number; see subsection 1.3.2) counts the number of ways to partition a set of n indices into nonempty subsets. More specifically, the Stirling numbers of the second kind {n m} count the number of

This may be represented as the sum over the following diagrams; see Figure 4.5. In general, an n-point Green’s function G12...n will be a sum of Bn cumulant Green’s functions, where Bn (the nth Bell number; see subsection 1.3.2) counts the number of ways to partition a set of n indices into nonempty subsets. More n specifically, the Stirling numbers of the second kind, m count the number of 1

2

3

4

1

2

1

3

4

3

1

2

1

3

4

1

2

3

4

3

2

1

4

2

3

1

4

3

1

2

1

2

1

2

1

2

4

3

4

3

4

3

4

3

4

2

1

2

4

3

2

1

4

3

2

1

4

3

2

4

Figure 4.5 Diagrammatic expansion of G1234 into cumulants


ways to partition n items into exactly m subsets (blocks); see subsection 1.3.2: in the sum, there will be exactly {n m} terms that are a product of m cumulants. For the preceding case n = 4, we have {4 1} = 1, {4 2} = 7, {4 3} = 6, {4 4} = 1, and B_4 = 15.
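These counts are quick to confirm by brute force. The following sketch (the helper names are ours) enumerates all partitions of a 4-element index set and tallies them by number of blocks.

```python
from itertools import combinations

def partitions(X):
    """All partitions of the set X into nonempty blocks."""
    X = set(X)
    if not X:
        yield []
        return
    x = min(X)
    rest = X - {x}
    blocks = [frozenset(c) for r in range(len(rest) + 1)
              for c in combinations(sorted(rest), r)]
    for S in blocks:                     # the block containing x is {x} + S
        for p in partitions(rest - S):
            yield [frozenset({x}) | S] + p

def counts_by_blocks(n):
    """Number of partitions of {1..n} having exactly m blocks, as a dict."""
    out = {}
    for p in partitions(range(1, n + 1)):
        out[len(p)] = out.get(len(p), 0) + 1
    return out
```

For n = 4 this returns {1: 1, 2: 7, 3: 6, 4: 1}, summing to the Bell number B₄ = 15 — exactly the 15 terms in the expansion of G_1234.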

4.1.5 Calculus for Fields

If a generating function Z_G[J] is Fréchet differentiable about J = 0 to all orders, then we may work out the components G(X) according to

  G(X) = ( ∏_{x∈X} δ/δJ(x) ) Z_G[J] |_{J=0}.  (4.13)

A useful shorthand is to introduce the multiple Fréchet derivative

  δ/δJ^X = ∏_{x∈X} δ/δJ(x),

along with the derivative δZ_G[J]/δJ defined by

  δZ_G[J]/δJ : X ↦ δZ_G[J]/δJ^X.

In particular,

  δZ_G[J]/δJ |_{J=0} = G.

Proposition 4.1.6  We have δZ_G[J]/δJ = ∫ G(· + Y) J^Y dY, that is,

  δZ_G[J]/δJ^X = ∫ G(X + Y) J^Y dY.

Proof  To see this, note that

  δZ_G[J]/δJ(x) = δ/δJ(x) ∑_{n=0}^∞ (1/n!) ∫ G_n(x_1, …, x_n) J(x_1) ⋯ J(x_n) dx_1 ⋯ dx_n
                = ∑_{n=1}^∞ (1/(n−1)!) ∫ G_n(x, x_2, …, x_n) J(x_2) ⋯ J(x_n) dx_2 ⋯ dx_n
                = ∫ G({x} + Y) J^Y dY.

Therefore, more generally, δZ_G[J]/δJ = ∫ G(· + Y) J^Y dY.


Proposition 4.1.7 (The Leibniz Rule for Fields)  For analytic functionals U and V, we have

  δ(U[J] V[J])/δJ = (δU[J]/δJ) ⋆ (δV[J]/δJ).

To see this, set U = Z_F and V = Z_G; then we are required to show that

  δ(Z_F[J] Z_G[J])/δJ^X = ∑_{X_1+X_2=X} (δZ_F[J]/δJ^{X_1}) (δZ_G[J]/δJ^{X_2}).

The proof follows by elementary induction. For several terms, we just find multiple Wick products.

We note that if W = W[J] is a given functional, then

  δh(W)/δJ(x_1) = h′(W) δW/δJ(x_1),
  δ²h(W)/(δJ(x_1) δJ(x_2)) = h″(W) (δW/δJ(x_1)) (δW/δJ(x_2)) + h′(W) δ²W/(δJ(x_1) δJ(x_2)),
  ⋮

The pattern is easy to spot, and we establish it in the next lemma.

Lemma 4.1.8 (The Chain Rule for Fields)  Let W = W[J] possess Fréchet derivatives to all orders and let h be smooth; then

  δ h(W[J]) / δJ^X = ∑_{π∈Part(X)} h^{(N(π))}(W[J]) (δW/δJ)^π,

where h^{(n)}(t) is the nth derivative of h at t.

Proof  This is easily seen by induction. As δh(W)/δJ(x) = h′(W) δW/δJ(x), the identity is trivially true for n = 1. Now assume that it is true for n, and let |X| = n; then

  δ/δJ(x) [ δh(W)/δJ^X ] = δ/δJ(x) ∑_{π∈Part(X)} h^{(N(π))}(W[J]) (δW/δJ)^π
    = ∑_{π∈Part(X)} h^{(N(π)+1)}(W[J]) (δW/δJ(x)) (δW/δJ)^π
    + ∑_{π∈Part(X)} h^{(N(π))}(W[J]) δ/δJ(x) (δW/δJ)^π.

However, the first term on the right-hand side is a sum over all partitions of X + {x} having x occurring as a singleton, while the second term, when differentiated with respect to J(x), will be a sum over all partitions of X + {x} having x in some part containing at least one element of X. Thus we may write the preceding as

  δh(W)/δJ^{X+{x}} = ∑_{π∈Part(X+{x})} h^{(N(π))}(W[J]) (δW/δJ)^π.

The identity then follows by induction.

The chain rule is in fact a generalization of the Faà di Bruno formula. We note that for h(t) = e^t, we get the equation

  δ e^{W[J]} / δJ^X = e^{W[J]} ∑_{π∈Part(X)} (δW/δJ)^π.  (4.14)

4.1.6 Zero-Dimensional Fields

We now take our space X to consist of a single point denoted as ∗. In this case, our “fields” reduce to a single random variable φ, the field at ∗. The Green’s functions G_n are then the moments μ_n = E[φ^n], and the generating functions Z_G[t] are just the exponential generating series g(t) = ∑_{n≥0} (1/n!) G_n t^n. (Technically we have abandoned our no coincidence rule, but this is not a problem here.) The chain rule formula then becomes

  d^n/dt^n h(g(t)) = ∑_{π∈Part_n} h^{(N(π))}(g(t)) ∏_{A∈π} g^{(#A)}(t),

where h^{(n)} and g^{(n)} are the nth derivatives of h and g. The sum is over partitions of n objects. We could rewrite this in terms of occupation numbers of partitions:

  d^n/dt^n h(g(t)) = ∑_{n∈ℕ₊^ℕ : E(n)=n} ν_Part(n) h^{(N(n))}(g(t)) ∏_{k=1}^∞ ( g^{(k)}(t) )^{n_k}
                  = ∑_{n∈ℕ₊^ℕ : E(n)=n} ( n! / ( (1!)^{n_1} (2!)^{n_2} ⋯ n_1! n_2! ⋯ ) ) h^{(N(n))}(g(t)) ∏_{k=1}^∞ ( g^{(k)}(t) )^{n_k}.

This, of course, is just the Faà di Bruno formula. It may also be expressed using Bell polynomials as

  d^n/dt^n h(g(t)) = ∑_{m=1}^n h^{(m)}(g(t)) Bell(n, m; g^{(1)}(t), g^{(2)}(t), …).


The relationship between moments and cumulants is now simply

  μ_n = ∑_{π∈Part_n} ∏_{A∈π} κ_{#A} ≡ ∑_{m=1}^n Bell(n, m; κ_1, κ_2, …),

which likewise recovers the earlier results.
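For a single random variable the partition sum can be evaluated directly. The sketch below (our own helper names) computes μ_n = ∑_{π∈Part_n} ∏_{A∈π} κ_{#A} from a table of cumulants. With all κ_k = 1 (a Poisson variable of unit rate) the moments come out as Bell numbers, while for a centered Gaussian (only κ₂ = σ² nonzero) only pair partitions survive, giving μ₄ = 3σ⁴.

```python
from itertools import combinations

def partitions(X):
    """All partitions of the set X into nonempty blocks."""
    X = set(X)
    if not X:
        yield []
        return
    x = min(X)
    rest = X - {x}
    blocks = [frozenset(c) for r in range(len(rest) + 1)
              for c in combinations(sorted(rest), r)]
    for S in blocks:
        for p in partitions(rest - S):
            yield [frozenset({x}) | S] + p

def moment(n, kappa):
    """mu_n = sum over partitions pi of prod_{A in pi} kappa_{#A};
    kappa is a dict of cumulants, missing orders count as zero."""
    total = 0
    for p in partitions(range(n)):
        term = 1
        for A in p:
            term *= kappa.get(len(A), 0)
        total += term
    return total
```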

4.1.7 Nonsymmetric Green’s Functions

We will later encounter Green’s functions of the form G(x_1, …, x_n) = ⟨Φ(x_1) ⋯ Φ(x_n)⟩, where ⟨·⟩ is a quantum expectation (state) and the fields Φ(x) are operators that need not commute among themselves. In such cases, we have to deal with Green’s functions where the order of the arguments is important. To this end, we denote by X = (x_1, …, x_n) the sequence and write accordingly G = G(X). Despite this lack of complete symmetry, many of the ideas already encountered still apply. For instance, the generating function Z_G may still be defined as in (4.1), though it only contains information about the symmetrized Green’s functions

  (1/n!) ∑_{σ∈S_n} G(x_{σ(1)}, …, x_{σ(n)}),

and this is all we would obtain when differentiating the generating function. We define a subsequence of X = (x_1, …, x_n) specifically to be a subsequence (x_{i(1)}, …, x_{i(p)}) that respects the original order, that is, i(1) < i(2) < ⋯ < i(p). We include the possibility of an empty sequence (p = 0) and of X itself (p = n). Two subsequences of X are disjoint if they have no common element. In this way, we may define a Wick product of nonsymmetric Green’s functions as

  (F ⋆ G)(X) ≜ ∑_{X_1+X_2=X} F(X_1) G(X_2),  (4.15)

where now the sum is over all pairs of disjoint subsequences of X whose union is X. A partition of X is understood as a set of nonempty pairwise disjoint subsequences of X whose union is X. Their enumeration is exactly the same as that for sets. This is a consequence of the fact that, having fixed the sequence X, we


only deal with subsequences that respect this order, and so these are in one-to-one correspondence with subsets of the set X = {x_1, …, x_n}. We may therefore define cumulants of nonsymmetric Green’s functions in the obvious way:

  G(X) = ∑ K(X_1) ⋯ K(X_m),  (4.16)

where we now have a sum over all partitions {X_1, …, X_m} of X with 1 ≤ m ≤ n. This is almost identical to the situation in Theorem 4.1.5, and the corresponding inverse relation likewise holds. Indeed, the expansions in (4.11) and (4.12) hold, with the only caveat that the moments are all order dependent, e.g. G_123 is shorthand for G(x_1, x_2, x_3), and so on; but this is not an issue, as all terms in the development are displayed respecting this order.

4.2 A First Look at Boson Fock Space When describing several quantum particles, it is important to take into account whether they are identical or not. The wave function of n particles will be a square-integrable function ψn (x1 , . . . , xn ) with the modulus-squared ρn (x1 , . . . , xn ) = |ψn (x1 , . . . , xn )|2 interpreted as the probability density to find the particles at the position (x1 , . . . , xn ). If the particles are indistinguishable, then quantum theory imposes the symmetry ρn (xσ (1) , . . . , xσ (n) ) = ρn (x1 , . . . , xn ) for every permutation σ . That is, the probability density is completely symmetric under interchange of the particle labels. In Figure 4.6, as the identical particles are truly indistinguishable, only the locations are relevant. In other

Figure 4.6 Quantum particles at positions x1 , . . . , x5 .


words, the Gibbs 1/n! correction introduced in classical statistical physics is a physical feature in the quantum case! While there may be several mathematical schemes that lead to the densities ρ_n being completely symmetric, only two mechanisms are encountered in Nature:

• Bosons: the wavefunctions are themselves completely symmetric: ψ_n(x_{σ(1)}, …, x_{σ(n)}) = ψ_n(x_1, …, x_n);
• Fermions: the wavefunctions are completely antisymmetric: ψ_n(x_{σ(1)}, …, x_{σ(n)}) = (−1)^σ ψ_n(x_1, …, x_n).

Here (−1)^σ denotes the sign of the permutation σ.

More generally, we may have a state with an indefinite number of particles. That is, we have a sequence of wave functions (ψ_n)_{n∈ℕ}, with the complex values ψ_n(x_1, …, x_n) being the probability amplitude for n particles. This includes the case n = 0, where ψ_0 is a complex scalar. We then have

  p_n = ∫ |ψ_n(x_1, …, x_n)|² dx_1 ⋯ dx_n

as the probability that there will be exactly n particles, and the state normalization is ∑_{n=0}^∞ p_n = 1. In the Boson case, we obtain a function on the power set of X by introducing

  Ψ : Power(X) → ℂ : {x_1, …, x_n} ↦ √(n!) ψ_n(x_1, …, x_n).  (4.17)

Normalization then may be written in Guichardet shorthand as

  ∫ |Ψ(X)|² dX = 1.

We have gone from the Hilbert space L²(X, dx) for a single particle to dealing with an indefinite number of indistinguishable Boson particles with state Ψ ∈ L²(Power(X), dX), which is the Boson Fock space over L²(X, dx). We in fact have a mapping

  exp : L²(X, dx) → L²(Power(X), dX) : f ↦ exp(f).  (4.18)

This mapping is analytic and, as we shall prove in the next chapter, will map from a dense subset of L²(X, dx) to a dense subset of L²(Power(X), dX). Moreover, we have


  ⟨exp(f) | exp(g)⟩ = e^{⟨f|g⟩}.

If we consider again the degenerate situation of a zero-dimensional configuration space X consisting of a single point *, then L2 (∗) ≡ C. We recover the situation of the previous chapter with the Bose Fock space becoming the Hilbert space for the canonical commutation relations (CCR), and exp(β) being the Bargmann state with test “function” β.

5 Combinatorial Species

At this juncture, it is appropriate to mention a relatively recent branch of combinatorics that offers a powerful way to arrive at enumerations, and which is closely related to some of the ideas introduced in the previous chapters. This is the theory of combinatorial species. We will outline its basic features in this chapter and rederive some of our basic enumeration problems (permutations, partitions, hierarchies, etc.) in this setting with relative (indeed astonishing) ease. Several of these basic species – notably partitions (Part) and permutations (Perm) – have been encountered already, and some of the key constructions involved have already appeared when we were discussing the Wick product and the composition formula.

Let us consider a bijection φ on the configuration space X. This will induce a map φ between sets according to

  φ : A = {x_1, …, x_n} ↦ A′ = {φ(x_1), …, φ(x_n)},

which in turn induces a map between partitions, that is,

  φ : π = {A, B, C, …} ↦ {A′, B′, C′, …};

see Figure 5.1. The key feature of a bijection is that the induced maps preserve cardinality. From a combinatorial point of view, nothing much has actually gone on. If we are interested in combinatorial structures such as partitions on a set of n items, then all we really need to know is that there really are n items to work with. The bijection φ leaves this invariant. We could focus on the number N_n(Part) of partitions of a set of n items, which we have seen is given by the Bell number B_n. We may then introduce the exponential generating function (EGF) for partitions, which we take to be

  Part(z) ≜ ∑_{n≥0} (1/n!) N_n(Part) z^n ≡ e^{e^z − 1}.

In a similar fashion, we could consider any permutation σ of the elements of a subset X = {x_1, …, x_n}. This leads to a permutation φ(σ), which is


Figure 5.1 The combinatorial features of a partition of a set of items into subsets {A, B, C} remain unchanged under the action of a bijection φ of the items.

the permutation on the image set φ(X) taking φ(x_k) to φ(x_{σ(k)}). As there are N_n(Perm) = n! permutations on a set of n items, we get the EGF for permutations to be Perm(z) = 1/(1 − z).

The idea can be extended to any combinatorial structure S that is built over a system of n items, where n is allowed to take on values in ℕ₊. We could set S(X) to be the set of those structures built on items forming a given finite set X. (We have assumed implicitly that the items are labeled!) The EGF for S is then defined to be

  S(z) ≜ ∑_{n≥0} (1/n!) N_n(S) z^n.

As we have seen, the combinatorial features of S(X) should only depend on the set X up to bijections. (In fact, we have been far too restrictive in imposing that the φ’s come about as maps from a fixed space X to itself – all we really need is the bijective property.) The induced action of a bijection φ on the combinatorial structures is called a transport of the structure, and the transportation properties of structures are the key to the definition of combinatorial species introduced by Joyal (1981). A bijection φ : X → X is just a permutation of the vertices, which we can view alternatively as a passive relabeling of the vertices. The concept of structure should be independent of the labeling used, however.

A species of structures is a rule S that associates to each finite set X a finite set S(X), and to each bijection φ : X → Y a function S[φ] : S(X) → S(Y), such that

• S[id_X] = id_{S(X)};
• whenever we have bijections φ : X → Y and ψ : Y → Z, then S[ψ ∘ φ] = S[ψ] ∘ S[φ].

• Whenever we have bijections X → Y → Z, then S [ψ ◦ϕ] = S [ψ]◦S [ϕ].


An element s ∈ S(X) is called an S-structure on X, and the map S[φ] is called the transport of S-structures along φ. The EGF of the species S is then

  S(z) ≜ ∑_{n≥0} (1/n!) N(S, n) z^n,

where N(S, n) gives the number of S-structures over a set of n elements. Some more examples are tabulated in the following:

                  S       N(S, n)    S(z)
  Permutations    Perm    n!         (1 − z)^{−1}
  Linear orders   Lin     n!         (1 − z)^{−1}
  Cycles          Cyc     (n − 1)!   −ln(1 − z)
  Sets            Set     1          e^z
  Power sets      Power   2^n        e^{2z}

The species Perm, for instance, gives the permutations over finite sets. We have n! permutations on a set of n elements. A linear order on a set X is a sequence of all the elements with no repetition. There is a one-to-one correspondence between the permutations and the linear orders over a given set; however, the most natural way to construct a bijection between the two is to fix a linear order, say (x1 , . . . , xn ), and any other linear order (xσ (1) , . . . , xσ (n) ) corresponds to the permutation: xi → xσ (i) . Let φ be a bijection on X, and therefore an element of Perm(X), then Perm(φ) transports permutations on X to themselves. Similarly, Lin(φ) will transport linear orders on X to themselves, but not in the same way. Permutations and linear orders are different species despite having the same enumeration. A cycle is a permutation that acts transitively on X, that is, the orbit {σ n (x) : n = 0, 1, 2, . . .} of any element x ∈ X is all of X. The number of cycles on a set of n elements is (n − 1)!. The set species is given by Set(X) = {X} and is (at first sight) rather trivial. It associates the one set {X} with any set X. The power set of X, written as Power(X), is the set of all subsets of X as before.

5.1 Operations on Species

Let S and R be species; then their sum S + R is defined by saying that an (S + R)-structure on a finite set X is either an S-structure or an R-structure. That is, (S + R)(X) is the disjoint union of S(X) and R(X). We may define S + S by artificially attaching a color to each of the species, so that we


could have, for instance, S_red + S_blue. We may use several colors for multiple sums of the same basic species. The EGF of a sum of species is then just the sum of the respective series: (S + R)(z) = S(z) + R(z).

Let S be a species and n ≥ 0 an integer. We define S_n by

  S_n(X) = S(X) if |X| = n, and S_n(X) = ∅ otherwise.

In particular, every species may be (canonically) decomposed as

  S = ∑_{n≥0} S_n,

and we may also define

  S_+ = ∑_{n≥1} S_n,   S_even = ∑_{n even} S_n,   S_odd = ∑_{n odd} S_n.

Let S and R be species; then their product S ⋆ R is defined by saying that an (S ⋆ R)-structure on a finite set X is an S-structure on some subset Y of X together with an R-structure on the complement X \ Y. As the notation suggests, this is just the Wick product in a different guise; indeed, we have

  (S ⋆ R)(X) ≜ ∑_{X_1+X_2=X} S(X_1) R(X_2).

The EGF of a product of species is then naturally defined as the product of the respective series (just as in subsection 4.1.2): (S ⋆ R)(z) = S(z) R(z).

Just as the notion of product of species abstracts the Wick product, so too can we abstract the concept of composition of species from that of composition of moment generating functions. Let F and G be species with G[∅] = ∅. We define the composition F ∘ G to be the species

  (F ∘ G)(X) ≜ ∑_{π∈Part(X)} F[π] ∏_{Y∈π} G[Y].  (5.1)


Theorem 5.1.1 Let F and G be two species of structures and suppose that G [∅] = ∅. Then the series associated to the species F ◦ G has EGF F ◦ G (z) = F (G (z)).

5.1.1 Examples of Species

Derangements

As an example, the number of derangements (permutations without fixed points) of n objects is denoted d_n. In a classic problem of combinatorial probability, p(n) = d_n/n! is the probability that if n men leave their top hats in the cloak room before an opera, and the hats are then returned at random, no one gets his original hat back. We now show how to derive this using species. Let Der be the species of derangements; then Perm = Set ⋆ Der, which basically says that a permutation consists of fixed points (a set) and cycles of length 2 or more (a derangement). We then get Der(z) =

Perm(z)/Set(z) = e^{−z}/(1 − z),

from which we read off N_n(Der) = d_n to be

  d_n = n! ∑_{k=0}^n (−1)^k / k!.

Therefore, p(n) = ∑_{k=0}^n (−1)^k / k!, which is approximately e^{−1} = 0.36788…

Sets

We specify an element of the power set of X by giving simultaneously the subset itself and its complement in X. Therefore, Power = Set ⋆ Set, and so Power(z) = Set(z)².

Partitions

From the simple observation that a partition is a set of nonempty disjoint sets, we get the following characterization of partitions as a species: a partition is a set of nonempty subsets. The species of partitions therefore satisfies the relation Part = Set ∘ Set_+. We see that

  Part(z) = exp(e^z − 1) ≡ ∑_n (1/n!) B_n z^n.
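The coefficient extraction from Der(z) = e^{−z}/(1 − z) can be cross-checked against a direct count of fixed-point-free permutations; the sketch below uses our own function names.

```python
from itertools import permutations
from math import factorial

def derangements_egf(n):
    """d_n = n! [z^n] e^{-z}/(1-z) = n! sum_{k=0}^n (-1)^k / k!."""
    return round(factorial(n) * sum((-1) ** k / factorial(k) for k in range(n + 1)))

def derangements_brute(n):
    """Count permutations of {0..n-1} with no fixed point."""
    return sum(1 for p in permutations(range(n))
               if all(p[i] != i for i in range(n)))
```

Convolving e^z (the fixed points) with Der(z) recovers n! = ∑_k C(n, k) d_{n−k}, which is the species identity Perm = Set ⋆ Der in coefficient form.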


Pairs

The species with pairings as the structure will be denoted as Pair. These are sets of subsets of size 2, and so Pair = Set ∘ Set_2. It follows immediately that

  Pair(z) = exp( z²/2 ),

from which we read off that the number of ways to pair off n elements is

  N_n(Pair) = (2k)! / (2^k k!) for n = 2k, and N_n(Pair) = 0 for n odd.

Permutations

Likewise, a permutation is a set of disjoint cycles, so we similarly have Perm = Set ∘ Cycle. We deduce that Perm(z) = exp{Cycle(z)}, and so we recover Cycle(z) = ln(1 − z)^{−1}.
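A quick consistency check of the pairing count (the helper names are ours): enumerate perfect matchings recursively — the least element must be paired with one of the remaining n − 1 — and compare with (2k)!/(2^k k!).

```python
from math import factorial

def count_pairings(n):
    """Number of ways to partition an n-set into blocks of size 2:
    pair the first element with any of the n-1 others, then recurse."""
    if n == 0:
        return 1
    if n % 2 == 1:
        return 0
    return (n - 1) * count_pairings(n - 2)

def pairings_formula(n):
    """N_n(Pair) = (2k)!/(2^k k!) for n = 2k, else 0."""
    if n % 2 == 1:
        return 0
    k = n // 2
    return factorial(2 * k) // (2 ** k * factorial(k))
```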

5.2 Graphs

Let G denote the species of graphs. Note that

  N_n(G) = 2^{n(n−1)/2},

since if we have a set X of size n giving the possible vertices, there will be (n choose 2) possible edges that may or may not be present in the graph. The EGF lacks a closed form; moreover, the coefficients 2^{n(n−1)/2}/n! grow so fast that the series has radius of convergence zero, and it is to be understood as a formal power series. We consider the species of connected graphs, denoted by Gcon.

Proposition 5.2.1  The species of connected graphs Gcon generates all graphs by the identity G = Set ∘ Gcon. In particular, Gcon(z) = ln G(z).

Proof  Every graph can be considered as a set of connected subgraphs, and so as species we have G = Set ∘ Gcon. For the EGFs, we then have G(z) = exp Gcon(z).

From this point of view, we may define the species of connected structures in a species S by the relation S = Set ∘ Scon.


5.3 Weighted Species

A weight w on a species S is a family of functions w : S(X) → ℂ, one for each finite set X. We shall be interested in weights that are preserved under transport; that is, for any bijection φ : X → Y, we have w(φS) = w(S) for every S ∈ S(X). In other words, the weight attached to a structure is independent of the underlying labeling. A pair (S, w) consisting of a species and a weight preserved under transport is termed a weighted species. We may then define a weighted EGF to be

  S^(w)(z) ≜ ∑_{n=0}^∞ (1/n!) ∑_{S∈S_n} w(S) z^n.

For the uniform weight w(S) = 1, we recover the usual EGF. Most of what we have done up to now carries over to weighted species. The sum of (S, w) and (R, v) is (S + R, u), where u(S) = w(S) for S ∈ S(X) and u(S) = v(S) for S ∈ R(X). The product of (S, w) and (R, v) is (S ⋆ R, u), where a typical product structure (S, R) with S ∈ S(X_1) and R ∈ R(X_2) has the weight u(S, R) = w(S) v(R). The composition of (S, w) and (R, v) is (S ∘ R, u), where a typical structure (S, R_1, …, R_n) with π = {X_1, …, X_n}, S ∈ S(π), and R_i ∈ R(X_i) has the weight u(S, R_1, …, R_n) = w(S) v(R_1) ⋯ v(R_n). The rules for weighted EGFs are then

  (S + R)^u(z) = S^w(z) + R^v(z),
  (S ⋆ R)^u(z) = S^w(z) R^v(z),
  (S ∘ R)^u(z) = S^w( R^v(z) ),

with the appropriate choice of u in each case.

5.3.1 Enumerating Permutations

A weight on the species of permutations is given by w(σ) = λ^{N(σ)},


where λ > 0 and N(σ) counts the number of cycles in the permutation σ. As permutations are just sets of cycles, we have (Perm, w) = (Set, 1) ∘ (Cycle, λ), where we attach weight λ to each cycle. Therefore,

  Perm^w(z) = exp{−λ ln(1 − z)} = 1/(1 − z)^λ.

At this stage, however, we recognize the moment generating function for the Gamma distribution! Therefore,

  Perm^w(z) = ∑_{n≥0} (1/n!) μ_n^Perm(λ) z^n,

where the μ_n^Perm(λ) are the Gamma distribution moments

  μ_n^Perm(λ) = λ(λ + 1) ⋯ (λ + n − 1) ≡ λ^{n̄}.

We may expand the rising factorial power λ^{n̄} as¹

  λ^{n̄} ≡ ∑_m [n m] λ^m.
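The identity Perm^w(z) = (1 − z)^{−λ} can be probed directly for small n (the function names below are ours): sum λ^{N(σ)} over all permutations of n letters and compare with the rising factorial λ^{n̄}.

```python
from itertools import permutations

def cycle_count(p):
    """Number of cycles N(sigma) of a permutation given as a tuple p,
    where sigma maps i to p[i]."""
    seen, cycles = set(), 0
    for i in range(len(p)):
        if i not in seen:
            cycles += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = p[j]
    return cycles

def weighted_count(n, lam):
    """Sum over sigma in S_n of lam^{N(sigma)}: n! times the z^n EGF coefficient."""
    return sum(lam ** cycle_count(p) for p in permutations(range(n)))

def rising(lam, n):
    """The rising factorial lam (lam+1) ... (lam+n-1)."""
    out = 1
    for k in range(n):
        out *= lam + k
    return out
```

Grouping the sum by N(σ) = m reads off the Stirling numbers of the first kind as the coefficients of λ^m, which is exactly the footnoted expansion.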

5.3.2 Enumerating Partitions

A weight on the partitions is given by w(π) = λ^{N(π)}, where λ > 0 and N(π) is the number of blocks making up the partition π. We then have (Part, w) = (Set, 1) ∘ (Set_+, λ). Therefore,

  Part^w(z) = e^{λ(e^z − 1)}.

We write this as a power series in z:

  Part^w(z) = ∑_{n≥0} (1/n!) μ_n^Part(λ) z^n,

¹ We now see why the Stirling numbers of the first kind [n m] (see subsection 1.3.2) count the number of permutations of a set of n elements having exactly m cycles. Indeed, we have

  ∑_{n=0}^∞ (1/n!) ( ∑_m [n m] λ^m ) z^n = 1/(1 − z)^λ.


but this time we recognize the moment generating function for the Poisson distribution! We now have that²

  μ_n^Part(λ) = ∑_m {n m} λ^m.

5.3.3 Enumerating Hierarchies

Proposition 5.4.1 (Enumerating Hierarchies)  The EGF for hierarchies, denoted Hier(z) = ∑_n (1/n!) h_n z^n, satisfies

  e^{Hier(z)} = 2 Hier(z) − z + 1.

Proof  A hierarchy is either a single item, or a tree with a root from which grow at least two subhierarchies:

  Hier = Set_1 + Set_{≥2}(Hier),

or

  Hier(z) = z + ( e^{Hier(z)} − 1 − Hier(z) ).

Rearranging gives the result.
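The functional equation pins down the numbers h_n. A small power-series sketch (our own helper names, exact rational arithmetic) iterates Hier ← z + (e^{Hier} − 1 − Hier), which fixes one further coefficient per pass, and recovers the total-partition counts 1, 1, 4, 26, 236.

```python
from fractions import Fraction
from math import factorial

def series_exp(a, N):
    """Truncated exp of a power series a (ordinary coefficients, a[0] = 0),
    via b' = a' b, i.e. n b_n = sum_{k=1}^n k a_k b_{n-k}."""
    b = [Fraction(0)] * (N + 1)
    b[0] = Fraction(1)
    for n in range(1, N + 1):
        b[n] = sum(k * a[k] * b[n - k] for k in range(1, n + 1)) / n
    return b

def hierarchy_numbers(N):
    """EGF coefficients h_1..h_N of Hier(z), from Hier = z + (e^Hier - 1 - Hier).
    The right side at order n depends only on lower orders, so each pass
    fixes at least one more coefficient."""
    a = [Fraction(0)] * (N + 1)
    for _ in range(N + 1):
        e = series_exp(a, N)
        new = [Fraction(0)] * (N + 1)
        new[1] = Fraction(1)                  # the Set_1 term contributes z
        for n in range(2, N + 1):
            new[n] = e[n] - a[n]              # [z^n](e^Hier - 1 - Hier)
        a = new
    return [int(a[n] * factorial(n)) for n in range(1, N + 1)]
```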

5.4 Differentiation of Species

Let S be a species; then define the species derivative S′ by S′(X) = S(X + {x∗}), where x∗ is a new element outside of X. We have N(S′, n) = N(S, n + 1), which means that the EGFs satisfy

  S′(z) = (d/dz) S(z).

Properties of the differentiation operation are

• (S + R)′ = S′ + R′;
• (S ⋆ R)′ = S′ ⋆ R + S ⋆ R′;
• (S ∘ R)′ = (S′ ∘ R) ⋆ R′.

² And so we see once more that the Stirling numbers of the second kind {n m} count the number of ways to partition a set of n elements into exactly m blocks. We have the identity

  ∑_{n=0}^∞ (1/n!) ( ∑_m {n m} λ^m ) z^n = e^{λ(e^z − 1)}.


As an example, let us consider Cycle′. A cycle with a distinguished element can be understood as a linearly ordered list of the remaining elements of the cycle, starting with the one next to the distinguished element; therefore Cycle′ = Lin. We could argue that Lin = Set_0 + Set_1 ⋆ Lin, and so Lin(z) = 1 + z Lin(z), or Lin(z) = (1 − z)^{−1}, which we have already seen! Then Cycle(z) = −ln(1 − z), by integrating and noting that Cycle(0) = 0. We also have Lin′ = Lin ⋆ Lin, and we could deduce Lin(z) from this differential equation with the condition Lin(0) = 1. Note that the derivative of the species Tree gives the species Forest. Finally, we remark that differentiation carries over immediately to weighted species, since the derivative inherits the same weight.

6 Combinatorial Aspects of Quantum Fields: Feynman Diagrams

We now apply the Guichardet shorthand notation more extensively in field theory, in particular as a complement to the usual diagrammatic approach. We will deal with algebraic features, relegating the rigorous analysis to a later stage. Essentially, what we are going to do is use the Guichardet notation as an alternative to Feynman diagrams. (For other applications of combinatorial species to describing Feynman diagrams, see Faris, 2009/11.) To be specific, what we will study in this section is the combinatorics behind Euclidean quantum field theory or, equivalently, the Schwinger functions. These will be properly introduced in the next chapter, but for the time being we content ourselves with a rudimentary introduction.

6.1 Basic Concepts

Let X be a finite set; then a classical field over X is a function on X taking real values: we will call such a function ϕ = {ϕ_x : x ∈ X} a field realization. The space of all field realizations will be denoted Φ, and here it is R^{#X}. A real-valued function on Φ may be called a functional, and we suppose the existence of a functional S[·], called the action, such that

    Ξ = ∫_Φ exp{S[ϕ]} Dϕ < ∞,

where Dϕ = Π_{x∈X} dϕ_x denotes the standard Lebesgue measure. We then arrive at a well-defined probability measure P on Φ determined by

    dP(ϕ) = (1/Ξ) e^{S[ϕ]} Dϕ.    (6.1)

The expectation of a (typically nonlinear) functional F = F[·] of the field is the integral over Φ:

    ⟨F[φ]⟩ = ∫_Φ F[ϕ] dP(ϕ).    (6.2)


Here we adopt the convention that φ is a random variable, with specific realizations denoted by ϕ. The label x ∈ X is assumed to give all relevant information, which may be position on a finite lattice, spin, or such. So long as X is finite, the mathematical treatment is straightforward. Otherwise, we find ourselves having to resort to infinite-dimensional analysis. We may introduce a dual space J of fields J = {J^x : x ∈ X}, which are referred to as the source fields. Our convention will be that the source fields carry a "contravariant" index, while the field, and its realizations, carry a "covariant" index. The duality between fields and sources will be written as ⟨ϕ, J⟩ = Σ_{x∈X} ϕ_x J^x. In statistical mechanics, we would then typically consider a sequence (X_n)_n of finite subsets of a fixed infinite lattice, with each X_n contained within X_{n+1} and ∪_{n≥1} X_n giving the entire lattice. This bulk limit is studied, for instance, as the thermodynamic limit. The general situation that we are really interested in, however, is when X is continuous. If we want X = R^d, then we should take J to be the space of Schwartz functions on R^d and Φ to be the tempered distributions. (The field realizations are more singular than the sources!)

    J = S(R^d) ⊂ L²(R^d) ⊂ Φ = S′(R^d).    (6.3)

Here, the duality is denoted by ⟨ϕ, J⟩ = ∫_X ϕ_x J^x dx. In keeping with the idea that sources carry a contravariant index and fields a covariant one, we introduce an Einstein summation convention that repeated indices are to be summed/integrated over. So, for example,

    ⟨ϕ, J⟩ ≡ ϕ_x J^x.    (6.4)

The appropriate way to consider randomizing the field in the infinite-dimensional case will then be to consider probability measures on the Borel sets of Φ, that is, on the σ-algebra generated by the weak topology on the space of test functions. We shall nevertheless assume that we may fix a probability measure P on Φ and compute expectations E[F[φ]] = ∫_Φ F[ϕ] dP(ϕ) for suitable functionals F. (Here we denote the random variable corresponding to the field by φ.) In particular, we shall assume that the expectation

    Z[J] = E[e^{⟨φ,J⟩}] ≡ ∫_Φ e^{⟨ϕ,J⟩} dP(ϕ)    (6.5)

exists. This will act as a moment generating functional for the measure P, and we shall work formally in the following, effectively assuming the existence of moments to all orders and leaving the analytic considerations until later. The Green's functions are G(x₁, ..., x_n) = E[φ_{x₁} ··· φ_{x_n}], and we may write this as

    G_X = E[φ_X] = (δZ[J]/δJ^X)|_{J=0},    (6.6)

where for X = {x₁, ..., x_n} we write G(x₁, ..., x_n) as G_X (a completely symmetric tensor of covariant rank #X = n) and introduce the shorthand

    φ_X = Π_{x∈X} φ_x    (6.7)

as well as

    J^X = Π_{x∈X} J^x,    δ^{#X}/δJ^X = Π_{x∈X} δ/δJ^x.    (6.8)

Note that we need only the set X – the ordering of the elements is irrelevant! We of course have the formal expression δ(J^y)/δJ^x = δ_x^y. Similarly, we shall write δ/δϕ_X = Π_{x∈X} δ/δϕ_x. A multilinear map T : ×ⁿJ → C is called a tensor of covariant rank n, and it will be determined by the components T_{x₁...x_n} such that T(J_(1), ..., J_(n)) = T_{x₁...x_n} J_(1)^{x₁} ··· J_(n)^{x_n}. Likewise, we refer to a multilinear map from ×ⁿΦ to the complex numbers as a tensor of contravariant rank n. We shall say that a functional F = F[J] is analytic in J if it admits a series expansion F[J] = Σ_{n≥0} (1/n!) f_{x₁...x_n} J^{x₁} ··· J^{x_n}, where f_{x₁...x_n} are the components of a completely symmetric covariant tensor and, as usual, the repeated dummy indices are summed/integrated over. A more compact notation is to write the series expansion in terms of the Guichardet notation¹ as

    F[J] = ∫ f_X J^X dX.

Note that (δ/δJ^{y₁}) ··· (δ/δJ^{y_m}) F[J] = Σ_{n≥0} (1/n!) f_{y₁...y_m x₁...x_n} J^{x₁} ··· J^{x_n}, and this now reads as

    (δ^{#Y}/δJ^Y) ∫ f_X J^X dX = ∫ f_{Y+X} J^X dX.

We recall the Leibniz rule

    (δ/δJ^X)(F₁ ··· F_m) = Σ_{X₁+···+X_m=X} (δF₁/δJ^{X₁}) ··· (δF_m/δJ^{X_m}),

and the exponential formula (δ/δJ^X) e^W = e^W Σ_{π∈Part(X)} Π_{A∈π} (δW/δJ^A) from subsection 4.1.4. Evaluating at J = 0 recovers G_X = Σ_{π∈Part(X)} Π_{A∈π} K_A, as in (4.11).

1 We could extend the Einstein summation to Guichardet notation and write, for instance, f_X J^X, where the repeated index X now implies the integral-sum ∫ f_X J^X dX over the power set. While attractive, we avoid this notation as it is perhaps a shorthand too far at this stage!
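The moment-cumulant relation G_X = Σ_{π∈Part(X)} Π_{A∈π} K_A is purely combinatorial and can be checked directly. The sketch below (Python; an illustration, not from the book) enumerates set partitions and rebuilds a fourth moment from prescribed cumulants.

```python
from math import prod

def partitions(xs):
    """Yield all set partitions of the list xs as lists of blocks."""
    if not xs:
        yield []
        return
    first, rest = xs[0], xs[1:]
    for pi in partitions(rest):
        # insert `first` into each existing block, or open a new singleton block
        for i in range(len(pi)):
            yield pi[:i] + [[first] + pi[i]] + pi[i + 1:]
        yield [[first]] + pi

def moment(xs, K):
    """G_X = sum over pi in Part(X) of prod_{A in pi} K_A."""
    return sum(prod(K(block) for block in pi) for pi in partitions(xs))

# univariate toy data: the cumulant of a block depends only on its size
kappa = {1: 0.5, 2: 2.0, 3: 1.0, 4: 0.25}
K = lambda block: kappa[len(block)]

m4 = moment([1, 2, 3, 4], K)
# classical moment-cumulant formula for the fourth moment
k1, k2, k3, k4 = kappa[1], kappa[2], kappa[3], kappa[4]
assert abs(m4 - (k4 + 4*k3*k1 + 3*k2**2 + 6*k2*k1**2 + k1**4)) < 1e-12
print("G_{x1 x2 x3 x4} =", m4)
```

The 15 set partitions of a four-element set reproduce the familiar closed-form moment-cumulant expansion.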


6.1.1 Presence of Sources

Let J be a source field. Given a probability measure P₀, we may introduce a modified probability measure P_J, absolutely continuous with respect to P₀, and having Radon–Nikodym derivative

    (dP_J/dP₀)(ϕ) = (1/Z₀[J]) exp⟨ϕ, J⟩,

where Z₀[J] = E₀[e^{⟨φ,J⟩}] is the moment generating function for P₀. The corresponding state is referred to as the state modified by the presence of a source field J, and we denote its expectation by E_J. Evidently, we just recover the reference measure P₀ when we put J = 0. The Laplace transform of the modified state will be Z_J[J′] = ⟨exp⟨φ, J′⟩⟩_J, and it is readily seen that this reduces to

    Z_J[J′] = Z₀[J + J′] / Z₀[J].

In particular, the cumulants are obtained through W_J[·] = W₀[J + ·] − W₀[J], and we find

    K_X^J = (δW_J[J′]/δJ′^X)|_{J′=0} = (δW₀[J′ + J]/δJ′^X)|_{J′=0}.

It is, however, considerably simpler to treat J as a free parameter and just consider P as being the parametrized family {P_J : J ∈ J}. In these terms, we have

    K_X^J = δW₀[J]/δJ^X = ∫ K_{X+Y} J^Y dY,    (6.9)

where the K_X are the zero-source cumulants. We point out that it is likewise more convenient to write the moments in the presence of a field J as

    G_X^J = (1/Z[J]) E₀[φ_X e^{⟨φ,J⟩}] = (1/Z[J]) δZ[J]/δJ^X.

The mean field in the presence of the source, φ̄[J], is defined to be φ̄_x[J] = ⟨φ_x⟩_J and is given by the expression²

    φ̄_x[J] = δW₀[J]/δJ^x = ∫ K_{x+Y} J^Y dY
            = Σ_{n≥0} (1/n!) E₀[φ_x φ_{x₁} ··· φ_{x_n}] J^{x₁} ··· J^{x_n}    (6.10)

and, of course, reduces to E₀[φ_x] when J = 0.

2 Summation convention in place!
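As a concrete illustration of W_J[·] = W₀[J + ·] − W₀[J] (a toy check, not from the book): for a single Poisson(λ) variable, W₀(t) = λ(eᵗ − 1), so the source-modified state has W_J(t) = λe^J(eᵗ − 1); that is, it is again Poisson, with intensity λe^J. The sketch below reweights a truncated Poisson law by the exponential density e^{kJ}/Z₀[J] and compares the modified mean with λe^J.

```python
from math import exp, factorial

lam, J = 1.5, 0.7
N = 80  # truncation level; the neglected tail is astronomically small here

p0 = [exp(-lam) * lam**k / factorial(k) for k in range(N)]
# modified measure: dP_J/dP_0 (k) = e^{kJ} / Z_0[J]
w = [p0[k] * exp(k * J) for k in range(N)]
Z0J = sum(w)
pJ = [wk / Z0J for wk in w]

mean_J = sum(k * pJ[k] for k in range(N))
assert abs(mean_J - lam * exp(J)) < 1e-9  # mean of Poisson(lam * e^J)
print(mean_J, lam * exp(J))
```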


6.2 Functional Integrals

We shall in the following make the assumption that we may write the probability measure in the functional integral form

    P[dϕ] = (1/Ξ) e^{S[ϕ]} Dϕ

for some analytic functional S, called the action. The normalization is then Ξ = ∫_Φ e^{S[ϕ]} Dϕ, and we have the moment generator

    Z[J] = (1/Ξ) ∫ e^{⟨ϕ,J⟩ + S[ϕ]} Dϕ.

The definition is of course problematic, but we justify it by fixing a Gaussian reference measure as a functional integral.

6.2.1 Gaussian States

We construct a Gaussian state explicitly in the finite-dimensional case where #X < ∞ by the following argument. Let g⁻¹ be a linear, symmetric operator on Φ with well-defined inverse g. We shall write g^{xy} for the components of g⁻¹ and g_{xy} for the components of g. That is, the equation g⁻¹ϕ = J, or g^{xy}ϕ_y = J^x, will have the unique solution ϕ = gJ, or ϕ_x = g_{xy}J^y. We shall assume that g is positive definite and so can be used as a metric. A Gaussian measure is then given by

    P_g(dϕ) = (1/√((2π)^{#X} det g)) exp(−½ g^{xy} ϕ_x ϕ_y) Π_{x∈X} dϕ_x,    (6.11)

which we may say is determined from the quadratic action given by

    S_g[ϕ] = −½ g^{xy} ϕ_x ϕ_y ≡ −½ ⟨ϕ|g⁻¹|ϕ⟩.    (6.12)

The moment generating function is then given by

    Z_g[J] = exp(½ g_{xy} J^x J^y) ≡ e^{½⟨J|g|J⟩}.    (6.13)

In the infinite-dimensional case, we may use (6.13) as the definition of the measure:

    P_g(dϕ) = e^{−½⟨ϕ|g⁻¹|ϕ⟩} Dϕ.

The measure is completely characterized by the fact that the only nonvanishing cumulant is K_{x,y} = g_{xy}. We describe this using the diagram in Figure 6.1.


Figure 6.1 The two-point function gxy for a Gaussian distribution.


Figure 6.2 The fourth-order moments are G_{x₁x₂x₃x₄} = g_{x₁x₂}g_{x₃x₄} + g_{x₁x₄}g_{x₂x₃} + g_{x₁x₃}g_{x₂x₄}.

If we now use (4.11) to construct the Green's functions, we see that all odd moments vanish while

    E_g[φ_{x(1)} ··· φ_{x(2k)}] = Σ_{Pair(2k)} g_{x(p₁)x(q₁)} ··· g_{x(p_k)x(q_k)},    (6.14)

where the sum is over all pair partitions of {1, ..., 2k}. The right-hand side will of course consist of (2k)!/(2^k k!) terms. To this end, we introduce some notation. Let P ∈ Pair(X) be a given pair partition of a subset X, say P = {(x_{p(1)}, x_{q(1)}), ..., (x_{p(k)}, x_{q(k)})}; then we write g_P = g_{x_{p(1)}x_{q(1)}} ··· g_{x_{p(k)}x_{q(k)}}, in which case the Gaussian moments are

    G_X^g ≡ Σ_{P∈Pair(X)} g_P.    (6.15)

For instance, the 2k-th-order moments are then a sum of (2k)!/(2^k k!) terms, and we describe in diagrams the fourth-order term in Figure 6.2.
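Formula (6.15) is Isserlis' (Wick's) theorem and is easy to verify in finite dimensions. The sketch below (Python; illustrative data, not from the book) enumerates pair partitions, checks the count (2k)!/(2^k k!), and evaluates a fourth-order Gaussian moment from a given covariance g.

```python
from math import factorial

def pair_partitions(xs):
    """Yield all pair partitions of the list xs (len(xs) must be even)."""
    if not xs:
        yield []
        return
    a, rest = xs[0], xs[1:]
    for i, b in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pair_partitions(remaining):
            yield [(a, b)] + tail

# count check: |Pair(2k)| = (2k)! / (2^k k!)
for k in (1, 2, 3, 4):
    n = len(list(pair_partitions(list(range(2 * k)))))
    assert n == factorial(2 * k) // (2**k * factorial(k))

# fourth moment of a centred Gaussian with covariance g (labels 0..3)
g = [[2.0, 0.5, 0.1, 0.0],
     [0.5, 1.0, 0.3, 0.2],
     [0.1, 0.3, 1.5, 0.4],
     [0.0, 0.2, 0.4, 1.0]]
G = sum(g[p[0]][p[1]] * g[q[0]][q[1]]
        for p, q in pair_partitions([0, 1, 2, 3]))
# explicit Isserlis expansion: g01 g23 + g02 g13 + g03 g12
assert abs(G - (g[0][1]*g[2][3] + g[0][2]*g[1][3] + g[0][3]*g[1][2])) < 1e-12
print("G_{x0 x1 x2 x3} =", G)
```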

6.2.2 General States

Suppose we are given a reference Gaussian probability measure P_g, and suppose that V[·] is some analytic functional on Φ, say V[ϕ] = Σ_{n≥0} (1/n!) v^{y₁...y_n} ϕ_{y₁} ··· ϕ_{y_n}, or more compactly,

    V[ϕ] = ∫ v^X ϕ_X dX.    (6.16)

A probability measure P, absolutely continuous with respect to P_g, is then prescribed by taking its Radon–Nikodym derivative to be


    (dP/dP_g)(ϕ) = (1/Ξ) exp{V[ϕ]},    (6.17)

provided, of course, that the normalization

    Ξ ≡ E_g[e^{V[φ]}] < ∞.

Formally, we have

    P[dϕ] = (1/Ξ) e^{S[ϕ]} Dϕ,

where the action is

    S[ϕ] = S_g[ϕ] + V[ϕ] = −½⟨ϕ|g⁻¹|ϕ⟩ + ∫ v^X ϕ_X dX.

Likewise, Ξ ≡ ∫ e^{S[ϕ]} Dϕ.

6.2.3 Feynman Diagrams

Lemma 6.2.1 Let E[·] be an expectation with corresponding Green's functions G_X = E[φ_X]. For analytic functionals A[φ] = ∫ A^X φ_X dX, B[φ] = ∫ B^X φ_X dX, and so on, we have the formula

    E[A[φ] B[φ] ··· Z[φ]] = ∫ (A ⋆ B ⋆ ··· ⋆ Z)^X G_X dX.    (6.18)

Proof The expectation in (6.18) reads as

    ∫ A^{X_a} B^{X_b} ··· Z^{X_z} E[φ_{X_a} φ_{X_b} ··· φ_{X_z}] dX_a dX_b ··· dX_z
      = ∫ A^{X_a} B^{X_b} ··· Z^{X_z} G_{X_a+X_b+···+X_z} dX_a dX_b ··· dX_z
      ≡ ∫ (Σ_{X_a+X_b+···+X_z=X} A^{X_a} B^{X_b} ··· Z^{X_z}) G_X dX,

where we use the ⋆ lemma, Lemma 4.1.3.

Corollary 6.2.2 Given an analytic functional V[φ] = ∫ v^X φ_X dX, where we assume that V[0] = 0, then

    E[e^{V[φ]}] = ∫ (Σ_{π∈Part(X)} v^π) G_X dX,    (6.19)

where v^π ≡ Π_{A∈π} v^A.


Proof We have v^∅ = 0, and so we use (6.18) to get

    E[V[φ]^n] = ∫ (v^{⋆n})^X G_X dX ≡ n! ∫ (Σ_{π∈Part_n(X)} v^π) G_X dX.

The relation (6.19) then follows by summing the exponential series.

The expression (6.19) applies to a general state. If we wish to specialize to a Gaussian state E_g, then we get the following expression.

Theorem 6.2.3 Let Ξ = E_g[e^{V[φ]}], as in (6.17); then

    Ξ ≡ ∫ Σ_{π∈Part(X)} Σ_{P∈Pair(X)} v^π g_P dX.    (6.20)

The proof is then just a simple substitution of the explicit form (6.15) for the Gaussian moments into (6.19). To understand this expression, let us look at a typical term appearing on the right-hand side. Let us fix a set X, say X = {x₁, ..., x₁₀}. There must be an even number of elements; otherwise, the contribution vanishes! We fix a partition π = {A, B, C} of X, say A = {x₁, x₂, x₃, x₄}, B = {x₅, x₆, x₇}, and C = {x₈, x₉, x₁₀}, and a pair partition P consisting of the pairs (x₁, x₂), (x₃, x₅), (x₄, x₆), (x₇, x₉), (x₈, x₁₀). The contribution is then

    v^{x₁x₂x₃x₄} v^{x₅x₆x₇} v^{x₈x₉x₁₀} g_{x₁x₂} g_{x₃x₅} g_{x₄x₆} g_{x₇x₉} g_{x₈x₁₀},

where we have an implied integration over repeated dummy indices. This can be described diagrammatically as follows (see Figure 6.3): for each of the

Figure 6.3 A Feynman diagram; see text.


elements of the partition π, we draw a vertex with the corresponding number of legs – in this case, a four-vertex for v^A = v^{x₁x₂x₃x₄} and a pair of three-vertices for v^B = v^{x₅x₆x₇}, v^C = v^{x₈x₉x₁₀}. We then connect the vertices pairwise according to the partition P, with each connected pair (x_p, x_q) picking up a factor g_{x_p x_q}; the final diagram should have all lines contracted. Note that the diagram at the end is presented in a simple form (the labels being redundant) and corresponds to a scalar. We sum over all diagrams that can be generated this way. However, there may be several choices that lead to topologically identical diagrams. More generally, we have the following.

Theorem 6.2.4 Let P[dϕ] = (1/Ξ) e^{S[ϕ]} Dϕ, with

    S[ϕ] = −½⟨ϕ|g⁻¹|ϕ⟩ + ∫ v^X ϕ_X dX;    (6.21)

then the moments of the state are given by

    G_X ≡ (1/Ξ) ∫ Σ_{π∈Part(Y)} Σ_{P∈Pair(X+Y)} v^π g_P dY.    (6.22)

Here the rules are as follows: choose all possible subsets Y, all possible partitions π of Y, and all possible pair partitions P of X + Y; draw an m-vertex for each part of π of size m, label all the edges at each vertex by the corresponding elements of Y, and connect up all elements of X + Y according to the pair partition. We integrate over all Y's, and sum over all π's and P's.

The formula (6.22) is not yet a perturbation expansion in powers of the potential v, since the normalization factor Ξ⁻¹ is also a functional of v. To obtain such an expansion, we introduce the notion of a linked Feynman diagram. A connected Feynman diagram in the expansion of the integral in (6.22) is linked to the (external) set X if it contains at least one element x ∈ X. Given two disjoint sets X and Y with #Y = n, we choose a pair (π, P) ∈ Part(Y) × Pair(X + Y). Then ∫ v^π g_P dy₁ ··· dy_n is a product of Feynman integrals, each integral being depicted by a connected Feynman diagram.³ We say that (π, P) is linked to X if all these integrals contain at least one external argument x ∈ X. In the pictorial description, that means all connected Feynman diagrams obtained from v^π g_P dy₁ ··· dy_n are linked to X. For an analytic formulation of this property, it is convenient to define the function

    1_X(π, P) ≡ 1 if (π, P) is linked to X, and 0 otherwise.    (6.23)

3 The product may include factors such as g_{x₁x₂} without y-integration (0-dim. integration).


In general, a pair (π, P) ∈ Part(Y) × Pair(X + Y) is not linked to X. But it is possible to split the set Y into two subsets Y = Y₁ + Y₂ such that the following conditions are satisfied:

(i) The function v^π g_P is factorized into v^π g_P = v^{π₁} g_{P₁} v^{π₂} g_{P₂} with π₁ ∈ Part(Y₁), P₁ ∈ Pair(Y₁) and π₂ ∈ Part(Y₂), P₂ ∈ Pair(X + Y₂).
(ii) All factors of v^{π₂} g_{P₂} are linked to X.

That means v^π g_P can be separated into a factor without a link and into a factor that is depicted by linked diagrams. This observation leads to the following

Lemma 6.2.5 The integrand in (6.22) has the representation

    Σ_{π∈Part(Y)} Σ_{P∈Pair(X+Y)} v^π g_P = Σ_{Y₁+Y₂=Y} f(Y₁) g(Y₂, X),    (6.24)

with the functions

    f(Y₁) = Σ_{π∈Part(Y₁)} Σ_{P∈Pair(Y₁)} v^π g_P,
    g(Y₂, X) = Σ_{π∈Part(Y₂)} Σ_{P∈Pair(Y₂+X)} 1_X(π, P) v^π g_P.

The integral in (6.22) is then evaluated with the ⋆ lemma:

    ∫ Σ_{π∈Part(Y)} Σ_{P∈Pair(X+Y)} v^π g_P dY = (∫ f(Y₁) dY₁)(∫ g(Y₂, X) dY₂)
        = Ξ ∫ Σ_{π∈Part(Y₂)} Σ_{P∈Pair(Y₂+X)} 1_X(π, P) v^π g_P dY₂,

where the first factor is Ξ by (6.20). Inserting this result into (6.22), the normalization Ξ drops out, and as the final result we obtain the linked cluster expansion for the moments

    G_X = ∫ Σ_{π∈Part(Y)} Σ_{P∈Pair(Y+X)} 1_X(π, P) v^π g_P dY,    (6.25)

which contains only terms linked to X.
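The indicator 1_X(π, P) of (6.23) is a connectivity test and can be implemented directly. In the sketch below (Python; the names and test data are illustrative, not from the book) the vertices of a diagram are the parts of π together with the external labels in X, the pairs of P are its edges, and (π, P) is linked to X when every connected component reaches an external label.

```python
def is_linked(X, pi, P):
    """1_X(pi, P): True iff every connected component of the diagram built
    from the parts of pi (a partition of Y) and the pairing P of X + Y
    contains at least one external label from X."""
    # map every field label to its vertex: its part in pi, or itself if external
    vertex = {}
    for part in pi:
        for y in part:
            vertex[y] = frozenset(part)
    for x in X:
        vertex[x] = x

    parent = {v: v for v in set(vertex.values())}  # union-find over vertices
    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for a, b in P:                       # each propagator joins two vertices
        parent[find(vertex[a])] = find(vertex[b])

    comps = {}
    for v in parent:
        comps.setdefault(find(v), []).append(v)
    return all(any(v in X for v in comp) for comp in comps.values())

X = ["x1", "x2"]
pi = [["y1", "y2"]]                      # a single two-vertex on Y
assert is_linked(X, pi, [("x1", "y1"), ("x2", "y2")])        # linked
assert not is_linked(X, pi, [("x1", "x2"), ("y1", "y2")])    # vacuum factor
print("indicator behaves as expected")
```

In the second test the pair (y₁, y₂) closes a vacuum bubble with no external leg, so the configuration is not linked, exactly the contribution that cancels against Ξ in (6.25).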


6.2.4 The Dyson–Schwinger Equation

Theorem 6.2.6 (Wick Expansion) We have the moment generating function Z[J] for the measure P[dϕ] = (1/Ξ) e^{S[ϕ]} Dϕ given by

    Z[J] = (1/Ξ) E_g[e^{⟨φ,J⟩+V[φ]}] = (1/Ξ) exp{V[δ/δJ]} Z_g[J].    (6.26)

Here we understand

    exp{V[δ/δJ]} ≡ exp{∫ dX v^X δ^{#X}/δJ^X},

and the proof rests on the identity, for suitable analytic functionals F:

    F[δ/δJ] Z_g[J] = F[δ/δJ] E_g[e^{⟨φ,J⟩}] = E_g[F[φ] e^{⟨φ,J⟩}].

We now derive a functional differential equation for the generating function.

Lemma 6.2.7 The Gaussian generating functional Z_g satisfies the differential equations

    (F_g^x[δ/δJ] + J^x) Z_g[J] = 0,    (6.27)

where F_g^x[ϕ] = δS_g[ϕ]/δϕ_x ≡ −g^{xy} ϕ_y.

Proof Explicitly, we have Z_g = exp(½ g_{xy} J^x J^y), so that δZ_g/δJ^x = g_{xy} J^y Z_g, which can be rearranged as

    (−g^{xy} δ/δJ^y + J^x) Z_g = 0.

Lemma 6.2.8 For the moment generating function Z[J] for the measure P[dϕ] = (1/Ξ) e^{S[ϕ]} Dϕ, we have

    (F_I^x[δ/δJ] − g^{xy} δ/δJ^y + J^x) Z[J] = 0,    (6.28)

where F_I^x[ϕ] = δV[ϕ]/δϕ_x.

Proof We observe that from the Wick expansion identity (6.26) we have

    J^x Z[J] = (1/Ξ) J^x exp{V[δ/δJ]} Z_g[J]


and using the commutation identity

    [J^x, exp{V[δ/δJ]}] = −F_I^x[δ/δJ] exp{V[δ/δJ]},

we find

    J^x Z[J] = (1/Ξ) exp{V[δ/δJ]} J^x Z_g[J] − (1/Ξ) F_I^x[δ/δJ] exp{V[δ/δJ]} Z_g[J]
             = g^{xy} (δ/δJ^y) Z[J] − F_I^x[δ/δJ] Z[J],

which gives the result.

Putting these two lemmas together, we obtain the following result.

Theorem 6.2.9 (The Dyson–Schwinger Equation) The generating functional Z[J] for a probability measure P[dϕ] = (1/Ξ) e^{S[ϕ]} Dϕ satisfies the differential equation

    (F^x[δ/δJ] + J^x) Z[J] = 0,    (6.29)

where F^x[ϕ] = δS[ϕ]/δϕ_x = −g^{xy} ϕ_y + F_I^x[ϕ].

Corollary 6.2.10 Under the conditions of the preceding theorem, if the perturbation is V[ϕ] = ∫ v^X ϕ_X dX, then the ordinary moments of P satisfy the algebraic equations

    G_{X+x} = Σ_{x′∈X} g_{xx′} G_{X−x′} + g_{xy} ∫ dY v^{y+Y} G_{X+Y}.    (6.30)

Proof We have F^x = −g^{xy} ϕ_y + ∫ v^{x+Y} ϕ_Y dY. The Dyson–Schwinger equation then becomes

    −g^{xy} (δZ/δJ^y) + ∫ dY v^{x+Y} (δ^{#Y}Z/δJ^Y) + J^x Z = 0,

and we consider applying the further differentiation δ^{#X}/δJ^X to obtain

    −g^{xy} G_{y+X} + ∫ dY v^{x+Y} G_{X+Y} + (δ^{#X}/δJ^X)(J^x Z)|_{J=0} = 0.

The result follows from setting J = 0, since the last term is then Σ_{x′∈X} δ^x_{x′} G_{X−x′}, and contracting with the metric g yields (6.30). The Dyson–Schwinger equation (6.30) can be expressed in diagrammatic form as in Figure 6.4.
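In zero dimensions (a single field variable, so #X counts powers of x) the corollary reduces to an integration-by-parts identity ⟨x^{m+1}⟩ = g(m⟨x^{m−1}⟩ + ⟨V′(x) x^m⟩). The sketch below (Python; a toy quartic action S(x) = −x²/(2g) + v x⁴ with v < 0 is assumed, not taken from the book) checks this numerically.

```python
import numpy as np

g, v = 1.0, -0.1
xs = np.linspace(-8.0, 8.0, 200_001)
dx = xs[1] - xs[0]
w = np.exp(-xs**2 / (2 * g) + v * xs**4)   # unnormalized density e^{S(x)}

def trap(f):
    # explicit trapezoid rule (kept inline to avoid numpy API differences)
    return (0.5 * (f[0] + f[-1]) + f[1:-1].sum()) * dx

Z = trap(w)
mom = lambda m: trap(xs**m * w) / Z        # <x^m>

# Dyson-Schwinger (6.30) in 0 dimensions, with V'(x) = 4 v x^3:
#   <x^{m+1}> = g * ( m <x^{m-1}> + 4 v <x^{m+3}> )
for m in (1, 2, 3, 4):
    lhs = mom(m + 1)
    rhs = g * (m * mom(m - 1) + 4 * v * mom(m + 3))
    assert abs(lhs - rhs) < 1e-8
print("<x^2> =", mom(2))
```

Note that with v = 0 the recursion closes downward and regenerates the Gaussian moments, while for v ≠ 0 it couples each moment to higher ones, as remarked below.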


Figure 6.4 The Dyson–Schwinger equation (6.30) in diagrammatic form.

We may write (6.30) in index notation as

    E[φ_x φ_{x₁} ··· φ_{x_m}] = Σ_{i=1}^m g_{xx_i} E[φ_{x₁} ··· φ̂_{x_i} ··· φ_{x_m}]
        + g_{xy} Σ_{n≥0} (1/n!) v^{yy₁...y_n} E[φ_{y₁} ··· φ_{y_n} φ_{x₁} ··· φ_{x_m}],

where the hat indicates an omission. This hierarchy of equations for the Green's functions is equivalent to the Dyson–Schwinger equation. We remark that the first term on the right-hand side of (6.30) contains the moments E[φ_{X−x′}], which are of order two smaller than the left-hand side E[φ_{X+x}]. The second term on the right-hand side of (6.30) contains moments of higher order, and so we generally cannot use this equation recursively. In the Gaussian case, we have E_g[φ_{X+x}] = Σ_{x′∈X} g_{xx′} E_g[φ_{X−x′}], from which we can deduce (6.14) by just knowing the first and second moments, E_g[φ_x] = 0 and E_g[φ_x φ_y] = g_{xy}.

The Dyson–Schwinger Equation for Cumulants

The Dyson–Schwinger equations may alternatively be stated for W. They are

    J^x − g^{xy} (δW/δJ^y) + ∫ dX v^{x+X} Σ_{π∈Part(X)} Π_{A∈π} (δW/δJ^A) = 0.    (6.31)

The Dyson–Schwinger equation for cumulants may be rearranged as

    δW/δJ^x = g_{xy} J^y + g_{xy} ∫ dX v^{y+X} Σ_{π∈Part(X)} Π_{A∈π} (δW/δJ^A),

and the diagrammatic form is


Figure 6.5 The Dyson–Schwinger equation based on δW/δJ^x.

Figure 6.6 The Dyson–Schwinger equation for the second cumulant.

presented in Figure 6.5. We group together topologically identical diagrams and include the combinatorial coefficients. By taking a further derivative δ/δJ^y of the equation (6.31) and evaluating at J = 0, we obtain the diagrammatic expression for the second cumulant in Figure 6.6.

A useful way to see this is to note that we may define an operator D by

    D^x = (1/Z[J]) (δ/δJ^x)(Z[J] ·) ≡ δ/δJ^x + δW/δJ^x,

and so the Dyson–Schwinger equation (6.29), (F^x[δ/δJ] + J^x) Z[J] = 0, becomes {F^x[D] + J^x} 1 = 0, that is,

    {F^x[δ/δJ + δW[J]/δJ] + J^x} 1 = 0.    (6.32)


6.3 Tree Expansions

We have introduced moment generating functions formally structured as

    Z[J] = (1/Ξ) ∫ e^{⟨ϕ,J⟩ + S[ϕ]} Dϕ.

Following a well-worn principle that we do not try to justify here, one might reasonably expect that the greatest contribution to the integral comes from the field ψ for which the exponent ⟨ϕ, J⟩ + S[ϕ] is stationary; that is, we have the approximation

    Z[J] ≈ e^{⟨ψ,J⟩ + S[ψ]},

where (δ/δϕ_x)(⟨ϕ, J⟩ + S[ϕ])|_{ϕ=ψ} = 0. We make the following assumptions: the stationary solution ψ = ψ[J] exists and is unique for each fixed J. It will automatically satisfy the identity

    J^x + (δS[ϕ]/δϕ_x)|_{ϕ=ψ[J]} = 0,

that is,

    J^x − g^{xy} ψ_y + ∫ v^{x+X} ψ_X dX = 0,

or rearranging gives⁴

    ψ_x = J_x + ∫ v_x^X ψ_X dX.    (6.33)

We may rewrite (6.33) as ψ = gJ + f(ψ), where f(ψ)_x is the rather involved expression ∫ v_x^X ψ_X dX. However, we may in principle iterate to get

    ψ[J] = gJ + f(gJ + f(gJ + f(gJ + ···))),

or in more detail

    ψ_x = J_x + ∫ dX₀ v_x^{X₀} Π_{x₁∈X₀} ( J_{x₁} + ∫ dX₁ v_{x₁}^{X₁} Π_{x₂∈X₁} ( J_{x₂} + ··· ) )
        = J_x + ∫ dX₀ v_x^{X₀} J_{X₀} + ∫ dX₀ dX₁ v_x^{X₀} Σ_{x₁∈X₀} v_{x₁}^{X₁} J_{X₁} + ··· .

4 Here we lower contravariant indices using the metric g, i.e., J_x = g_{xy} J^y = ∫ g_{xy} J^{(y)} dy and v_x^X = g_{xy} v^{y+X}.


The preceding expression admits a remarkable diagrammatic interpretation. We use the following diagrams for the tensors ψx and Jx :

For simplicity, we assume that v^X is nonzero only if #X = 3, 4 – that is, φ³ + φ⁴ theory. Then the equation (6.33) reads as

    ψ_x = J_x + (1/2!) v_x^{y₁y₂} ψ_{y₁} ψ_{y₂} + (1/3!) v_x^{y₁y₂y₃} ψ_{y₁} ψ_{y₂} ψ_{y₃},    (6.34)

and this may be depicted by the diagram shown in Figure 6.7. The process of iteration then leads to the following expansion in terms of the diagrams shown in Figure 6.8. The general term here is a phylogenetic tree. In the expansion in Figure 6.8, we have lumped topologically equivalent trees together; hence the combinatorial factors appear. However, the sum here is over all phylogenetic trees – or equivalently all hierarchies! – which is why it is sometimes known as a tree expansion. In φ³ + φ⁴ theory, each node of the tree will have either two or three branches.

Figure 6.7 Use of diagrams to depict the equation (6.34).

Figure 6.8 A tree expansion of the field ψx in φ 3 + φ 4 theory.
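In a zero-dimensional toy model the iteration can be carried out literally. The sketch below (Python; the couplings v2, v3 are illustrative values, not from the book) solves ψ = J + (v2/2)ψ² + (v3/6)ψ³ by the same fixed-point iteration ψ ← J + f(ψ), then compares the answer against the lowest tree-expansion terms in J.

```python
J, v2, v3 = 0.1, 0.3, 0.2

f = lambda psi: v2 * psi**2 / 2 + v3 * psi**3 / 6

psi = 0.0
for _ in range(200):          # psi = J + f(J + f(J + ...))
    psi = J + f(psi)
assert abs(psi - (J + f(psi))) < 1e-14   # fixed point reached

# lowest tree diagrams: psi ~ J + (v2/2) J^2 + [(v3/6) + (v2^2/2)] J^3 + ...
tree = J + v2 * J**2 / 2 + v3 * J**3 / 6 + v2**2 * J**3 / 2
assert abs(psi - tree) < 1e-4            # agreement to third order in J
print(psi, tree)
```

The J³ coefficient already shows the two tree topologies of that order: the single three-vertex and the cascade of two two-vertices.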


We may write the expansion in terms of hierarchies

    ψ_x = ∫ dX Σ_{H∈Hier(X)} v_x(H) J^X,    (6.35)

where v_x(H) is the weight attached to a given hierarchy H. The weight is easily calculated by drawing out the tree diagram: each node B with branches to parts A₁, A₂, ..., A_m contributes a factor v_{x_B}^{x_{A₁}...x_{A_m}}, where the various labels x_B, x_{A₁}, ..., x_{A_m} are dummy variables in X. Apart from the root (x_B = x) and the leaves, all these variables are contracted over in the product of such factors (the contractions corresponding to the branches between nodes!).

6.4 One-Particle Irreducibility In this section, we derive algebraic rules for decomposing the cumulant (connected) Green’s functions into further basic diagrams. These are known as the one-particle irreducible (1PI) diagrams, as they have the property that cutting a line corresponding to a propagator g does not split the diagram into disconnected parts; see, for instance, Rivers (1987). This decomposition has been very important in relation to later renormalization methods.

6.4.1 Legendre Transforms

Suppose that W = W[J] is a convex analytic functional. That is, W is a real-valued analytic functional with the property that

    W[tJ₁ + (1 − t)J₂] ≤ t W[J₁] + (1 − t) W[J₂]    (6.36)

for all 0 < t < 1 and J₁, J₂ ∈ J. The Legendre transform (more exactly, the Legendre–Fenchel transform) of W is then defined by

    Γ[ϕ] = inf_{J∈J} {W[J] − ⟨ϕ, J⟩}.    (6.37)

Γ[ϕ] will then be a concave (i.e., −Γ is convex) analytic functional in ϕ, and we may invert the formula as follows:

    W[J] = sup_{ϕ∈Φ} {Γ[ϕ] + ⟨ϕ, J⟩}.    (6.38)

If the functional W is taken to be strictly convex, that is, if we have strict inequality in (6.36), then the infimum is attained at a unique source J̄ = J̄[ϕ] for each fixed ϕ, and so

    Γ[ϕ] ≡ W[J̄[ϕ]] − ⟨ϕ, J̄[ϕ]⟩.

Moreover, we may invert J̄ : Φ → J to get a mapping φ̄ : J → Φ, and for fixed J the supremum is attained at φ̄[J], thus giving W[J] = Γ̄[J] + ⟨φ̄[J], J⟩, where we take

    Γ̄[J] ≡ Γ[φ̄[J]].    (6.39)

The extremal conditions are then

    J̄^x ≡ −δΓ/δϕ_x,    φ̄_x ≡ δW/δJ^x.    (6.40)

Let W″[J] be the symmetric tensor with entries

    W″_{xy}[J] = δ²W[J]/δJ^x δJ^y ≡ δφ̄_x[J]/δJ^y.

This will be positive definite – it will be interpreted in the following as the covariance of the field in the presence of the source J. Likewise, if we let Γ″[ϕ] be the linear operator with entries δ²Γ[ϕ]/δϕ_x δϕ_y, then we have

    Γ″^{xy}[ϕ] = δ²Γ[ϕ]/δϕ_x δϕ_y = −δJ̄^y[ϕ]/δϕ_x,

and so we conclude that W″[J] and −Γ̄″[J] = −Γ″[φ̄[J]] will be inverses of each other. In other words,

    (δ²W[J]/δJ^x δJ^y)(δ²Γ/δϕ_y δϕ_z)|_{ϕ=φ̄[J]} = −δ_x^z.    (6.41)

Lemma 6.4.1 Let F : Φ → R be a functional and define F̄ : J → R by F̄[J] ≡ F[φ̄(J)]; then

    δF̄[J]/δJ^x = (δ²W[J]/δJ^x δJ^y) (δF/δϕ_y)|_{φ̄[J]}.

Proof This is just the chain rule, as δφ̄_y[J]/δJ^x = δ²W[J]/δJ^x δJ^y.

117

Let us introduce the tensor coefficients X [J] 

δ δϕX

¯ φ[J]

.

(6.42)

Note that x,y [J] are the components of ¯  [J]. Lemma 6.4.2 We have the following recurrence relation δW = δJ X+y

 π ∈Part< (X)

δ2W p+Zπ δJ y δJ p





δW δJ A+zA

A∈π

 ,

(6.43)

where, again, each zA is a dummy variable associated with each component part A and Zπ = {zA : A ∈ π }. Proof Taking rule, we find

δ δJ X

of (6.41) and using the multiderivative form of the Leibniz

  δ2W δ {y,z}  0= X δJ δJ x δJ y  δW δ = {y,z} δJ X1 +x+y δJ X2 X1 +X2 =X

=

 X1 +X2 =X



δW δJ X1 +x+y

 

y+z+Zπ

π ∈Part(X2 )



A∈π

δW δJ A+zA

 .

δW {y,z} , which will Now the X1 = X, X2 = ∅ term in the last sum yields δJ X+x+y be the highest-order derivative appearing in the expression. We take this over to 2 the left-hand side, and then multiply both sides by − δJδ x W δJ z , which is the inverse of {y,z} . Finally, we get the expression

δW δJ X+x+y



=



X1 +X2 =X π ∈Part(X2 )



×

 A∈π

δW δJ A+zA

δ2W δW y p X δJ δJ δJ 1 +x+q



{p,q}+Zπ .

We note that the sum is over all X1 + x (where X1 ⊂ X, but not X1 = X), and partitions of X2 can be reconsidered as a sum over all partitions of X + x, excepting the coarsest one. Let us do this and set X  = X + x and Zπ = Zπ + q; dropping the primes then yields (6.43).


6.4.2 Field–Source Relations

We now suppose that W in the preceding is the cumulant generating functional, that it is convex, and that the Legendre transform functional Γ admits an analytic expansion of the type

    Γ[ϕ] = Σ_{n≥0} (1/n!) Γ^{y₁...y_n} ϕ_{y₁} ··· ϕ_{y_n} = ∫ Γ^X ϕ_X dX    (6.44)

for constant coefficients Γ^X – these are given a diagrammatic representation in Figure 6.9. In this case, φ̄_x = δW/δJ^x will give the mean field (6.10) in the presence of source J. Inversely, the source field J̄ will then be given by

    J̄^x[ϕ] = −δΓ/δϕ_x = −∫ Γ^{x+X} ϕ_X dX.    (6.45)

In longhand, J̄^x[ϕ] = −Σ_{n≥0} (1/n!) Γ^{xy₁...y_n} ϕ_{y₁} ··· ϕ_{y_n}.

Gaussian States

For the Gaussian state, we have W_g[J] = ½ g_{xy} J^x J^y, and so here φ̄_x = g_{xy} J^y. The inverse map is then J̄^x[ϕ] = g^{xy} ϕ_y, and so we obtain

    Γ_g[ϕ] = −½ g^{xy} ϕ_x ϕ_y.

Note that Γ_g ≡ S_g.

General States

For a more general state, we will have Γ = Γ_g + Γ_I, where

    Γ_I[ϕ] = Γ^x ϕ_x + ½ π^{xy} ϕ_x ϕ_y + Σ_{n≥3} (1/n!) Γ^{x₁...x_n} ϕ_{x₁} ··· ϕ_{x_n},

Figure 6.9 The coefficients Γ^X appearing in (6.44) are completely symmetric contravariant tensors, and we use this diagram to represent them – their contravariant nature is expressed by the fact that they have connection points rather than legs.


that is, Γ^{xy} = −g^{xy} + π^{xy}. It follows that

    J̄^x[ϕ] = −Γ^x + (g^{xy} − π^{xy}) ϕ_y − ½ Γ^{xyz} ϕ_y ϕ_z − ···    (6.46)

and (substituting ϕ = φ̄) we may rearrange this to get

    φ̄_x = g_{xy} (J^y + Γ^y + π^{yz} φ̄_z + ½ Γ^{yzw} φ̄_z φ̄_w + ···).    (6.47)

Without loss of generality, we may take Γ^x = 0, as we could otherwise always absorb this term into J^x. With this choice, we have Γ[ϕ = 0] = 0 and φ̄[J = 0] = 0.

We note that φ̄ appears on the right-hand side of (6.47) in a generally nonlinear manner: let us rewrite this as φ̄ = gJ + f(φ̄), where f_x(ϕ) = g_{xy}(π^{yz} ϕ_z + ½ Γ^{yzw} ϕ_z ϕ_w + ···). We may reiterate (6.47) to get an expansion

    φ̄ = gJ + f(gJ + f(gJ + f(gJ + ···))),

and we know that this expansion should be resummed to give the series expansion in terms of J as in (6.10).

Self-Energy

Now Γ″^{xy}[ϕ] = −g^{xy} + Γ_I″^{xy}[ϕ], or Γ″[ϕ] = −(1 − Γ_I″[ϕ] g) g⁻¹, so we conclude that

    W″[J] = −Γ″[φ̄[J]]⁻¹ = g (1 − Γ̄_I″[J] g)⁻¹ = (g⁻¹ − Γ̄_I″[J])⁻¹,    (6.48)

where Γ̄_I″^{xy}[J] ≡ π^{xy} + Σ_{n≥1} (1/n!) Γ^{xyz₁...z_n} φ̄_{z₁}[J] ··· φ̄_{z_n}[J]. This relation may alternatively be written as the series

    W″[J] = g + g Γ̄_I″[J] g + g Γ̄_I″[J] g Γ̄_I″[J] g + ···    (6.49)

In particular, we note that W″[J = 0] = K is the covariance matrix while Γ̄_I″[J = 0] ≡ π, and we obtain the series expansion

    K = g + gπg + gπgπg + ···    (6.50)
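The resummation (6.50) is just the operator geometric series for K = (g⁻¹ − π)⁻¹, and it can be checked on small matrices. The sketch below (Python with numpy; the particular g and π are arbitrary test data, not from the book) compares partial sums of g + gπg + ··· against the closed form.

```python
import numpy as np

g = np.array([[1.0, 0.2], [0.2, 0.8]])     # "free" covariance
pi = np.array([[0.3, 0.1], [0.1, 0.2]])    # self-energy at J = 0

K_exact = np.linalg.inv(np.linalg.inv(g) - pi)

# partial sums of K = g + g pi g + g pi g pi g + ...
K, term = g.copy(), g.copy()
for _ in range(60):
    term = term @ pi @ g
    K = K + term

assert np.allclose(K, K_exact, atol=1e-12)
print(K_exact)
```

The series converges whenever the spectral radius of πg is below one; the self-energy thus packages the geometric resummation of the two-point insertions.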

We now wish to determine a formula relating the cumulants in the presence of the source J to the tensor coefficients Γ^X[J] defined in (6.42).

Theorem 6.4.3 We have the following recurrence relation

    K^J_{X+y} = Σ_{π∈Part_<(X)} K^J_{yp} Γ^{p+Z_π}[J] Π_{A∈π} K^J_{A+z_A},    (6.51)


Figure 6.10 Third-order terms.

where, again, each z_A is a dummy variable associated with each component part A and Z_π = {z_A : A ∈ π}. This, of course, just follows straight from (6.43). The crucial thing about (6.51) is that the right-hand side contains lower-order cumulants and so can be used recursively. Let us iterate once:

    K^J_{X+y} = K^J_{yp} Σ_{π∈Part_<(X)} Γ^{p+Z_π}[J] Π_{A∈π} K^J_{A+z_A},

where each factor expands in turn as

    K^J_{A+z_A} = K^J_{z_A q} Σ_{π′∈Part_<(A)} Γ^{q+Z_{π′}}[J] Π_{B∈π′} K^J_{B+z_B}.

What happens is that each part A of the first partition gets properly divided up into subparts, and we continue until we eventually break down X into its singleton parts. However, this is just a top-down description of a hierarchy on X. Proceeding in this way, we should obtain a sum over all hierarchies of X. At the root of the tree, we have the factor K^J_{yp}; and for each node/part A appearing anywhere in the sequence, labeled by dummy index z_A, say, we will break it into a proper partition π′ ∈ Part_<(A) with multiplicative factor K^J_{qz_A} Σ_{π′∈Part_<(A)} Γ^{q+Z_{π′}}[J], where Z_{π′} will be the set of labels for each part B ∈ π′.

If we set X = {x₁, x₂}, then we have only one hierarchy to sum over and we find K^J_{yx₁x₂} = K^J_{yp} K^J_{x₁z₁} K^J_{x₂z₂} Γ^{pz₁z₂}[J]; see Figure 6.10 for a graphical representation. The four-vertex term K^J_{xyzw} is likewise developed in Figure 6.11.

It is useful to use the bottom-up description to give the general result. First we introduce some new coefficients defined by

    Υ^Y_{x₁...x_n}[J] ≡ K^J_{x₁z₁} ··· K^J_{x_nz_n} Γ^{{z₁,...,z_n}+Y}[J],


Figure 6.11 Fourth-order terms.

with the exceptional case Υ_{xy} ≜ K^J_{xy}. Then we find that

$$K^J_{X+y} = \sum_{H=\{\pi^{(1)},\dots,\pi^{(m)}\}\in\mathcal{H}(X)} \Upsilon_{y+Z_{\pi^{(m)}}}[J] \prod_{A^{(m)}\in\pi^{(m)}} \Upsilon^{z_{A^{(m)}}}_{Z_{A^{(m-1)}}}[J] \prod_{A^{(m-1)}\in\pi^{(m-1)}} \Upsilon^{z_{A^{(m-1)}}}_{Z_{A^{(m-2)}}}[J] \cdots \prod_{A^{(2)}\in\pi^{(2)}} \Upsilon^{z_{A^{(2)}}}_{Z_{A^{(1)}}}[J]. \qquad (6.52)$$

For instance, we have the following expansions for the lowest cumulants:

$$K^J_{y x_1 x_2} = \Upsilon_{y x_1 x_2},$$

$$K^J_{y x_1 x_2 x_3} = \Upsilon_{y x_1 x_2 x_3} + \left( \Upsilon^{r}_{y x_1} \Upsilon_{r x_2 x_3} + \cdots \right),$$

$$K^J_{y x_1 x_2 x_3 x_4} = \Upsilon_{y x_1 x_2 x_3 x_4} + \left( \Upsilon^{r}_{y x_1} \Upsilon_{r x_2 x_3 x_4} + \cdots \right) + \left( \Upsilon^{r}_{y x_1 x_2} \Upsilon_{r x_3 x_4} + \cdots \right) + \left( \Upsilon^{r}_{y x_1} \Upsilon^{q}_{r x_2} \Upsilon_{q x_3 x_4} + \cdots \right) + \left( \Upsilon^{rq}_{y} \Upsilon_{r x_1 x_2} \Upsilon_{q x_3 x_4} + \cdots \right).$$

The terms in round brackets involve permutations of the x_j indices, leading to distinct terms. Thus, there are 1 + 3 = 4 terms making up the right-hand side for the fourth-order cumulant: the first term in round brackets corresponds to the hierarchy {{{x₁}}, {{x₂}, {x₃}}}, and there are three such second-order hierarchies. There are 1 + 4 + 6 + 12 + 3 = 26 terms making up the right-hand side for the fifth-order cumulant.

7 Entropy, Large Deviations, and Legendre Transforms

In this chapter, we go back and consider the mathematical definition of entropy as a measure of statistical uncertainty as introduced by Shannon. We relate this to large deviation theory – a generalization of Laplace asymptotics for sequences of probability distributions – and see how the cumulant moments and rate functions (entropic distance) of limit probability distributions are related by a Legendre–Fenchel transform. Indeed, there is an analogy between entropic functions and the one-particle irreducible generating functions. We describe briefly the Freidlin–Wentzell theory as an example of large deviations for stochastic processes.

7.1 Entropy and Information

7.1.1 Shannon Entropy

We begin by trying to quantify the amount of surprise associated with the occurrence of a particular probabilistic outcome or, more generally, event. Intuitively, the surprise of an event should depend only on the probability of its occurrence, so we seek a function s : [0, 1] → [0, ∞), p ↦ s(p), with s(p) measuring our surprise at seeing an event of probability p occur. If an event is certain to occur, then there is no surprise if it happens, so we require that s(1) = 0. This should be the least surprising event! It is convenient to define the unlikeliness of the event to be 1/p, that is, inversely proportional to the probability p. Another desirable feature is that the greater the unlikeliness, the more we are surprised, and this translates to s being monotone decreasing: s(p) > s(q) whenever p < q. We would also desire that events


that have similar levels of unlikeliness lead to levels of surprise that are also close: mathematically, s must be a continuous function. This still leaves the exact form of s wide open, but we impose one more condition coming from probability theory. If two events are independent, then the surprise at seeing both occur is no more and no less than the sum of the surprises of each. That is, s(pq) = s(p) + s(q). This restricts us to the logarithms:

$$s(p) \triangleq -\log p. \qquad (7.1)$$

The only freedom we have left is to choose the base of the logarithm, and we will fix base 2. In the sequel, log will always be understood as log₂. The surprise is therefore the logarithm of the unlikeliness, or equivalently p = 2^{−s}. Now let us consider a probability vector π = (p₁, ..., pₙ). The vector π belongs to the simplex Δₙ = {(p₁, ..., pₙ) : p_k ≥ 0, ∑_k p_k = 1}. We may think of a discrete random variable X, which is allowed to take on at most n values, say x₁, ..., xₙ, with probabilities p_k = Pr{X = x_k}: in this way π gives its probability distribution. If we measure X, then our surprise at finding the value x_k should be −log p_k. The average surprise is then

$$h(X) = -\sum_k p_k \log p_k. \qquad (7.2)$$

We adopt the standard convention that −x log x takes the limit value

$$-\lim_{x\to 0^+} x\log x = 0$$

for x = 0 by continuity. In other words, while events that have extremely high levels of unlikeliness lead to a high level of surprise, they make a negligible contribution to h(X) due to their low probability of occurring. This is a consequence of the fact that the surprise grows only logarithmically. It turns out that h(X) plays an important role in information theory. This emerged from the work of Shannon on coding theory; recognizing the same mathematical form as the entropy of an ensemble in statistical mechanics – Boltzmann's H-function (1.3) – Shannon used the term entropy for h(X). Strictly speaking, the entropy h is a function of the probability distribution of X, and not of X itself. In other words, h(X) depends only on the probabilities for X to take on its various values – it does not matter what these values are, so long as they are distinct, let alone what probability space we choose to represent X on.
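Since log is fixed to base 2 and 0·log 0 is taken to be 0, the average surprise (7.2) is easy to compute directly. A minimal sketch in Python (the helper name `shannon_entropy` is ours, not from the text):

```python
import math

def shannon_entropy(probs, base=2.0):
    """Average surprise h(X) = -sum_k p_k log p_k, with the 0 log 0 := 0 convention."""
    return -sum(p * math.log(p, base) for p in probs if p > 0.0)

# A fair coin carries exactly one bit of entropy; a biased coin carries less,
# and a certain event carries none.
h_fair = shannon_entropy([0.5, 0.5])
h_biased = shannon_entropy([0.9, 0.1])
h_certain = shannon_entropy([1.0])
```

Skipping terms with p_k = 0 in the sum is exactly the continuity convention above.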


Relative Entropy

Let π and π′ be probability vectors in Δₙ; then we define the relative entropy of π′ with respect to π as

$$D(\pi'\|\pi) \triangleq \sum_{k=1}^n p'_k \log\frac{p'_k}{p_k}.$$

It is easy to see that D(π′||π) ≥ 0, with equality if and only if π′ = π. (This is Gibbs' inequality, and we will prove it more generally later.) We note that the relative entropy can be written as

$$D(\pi'\|\pi) = \sum_{k=1}^n p'_k\left(-\log p_k + \log p'_k\right) = \mathbb{E}'\left[S - S'\right],$$

where S is the surprise of π (i.e., the random vector with values s_k = −log p_k), S′ is the corresponding surprise of π′, and E′ denotes the average with respect to π′. This can be further broken down as

$$D(\pi'\|\pi) = \mathbb{E}'[S] - h(\pi').$$
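Gibbs' inequality can be illustrated numerically; a small sketch with our own helper name, using the same 0·log 0 = 0 convention (and returning +∞ when π′ assigns mass where π does not):

```python
import math

def relative_entropy(p_prime, p, base=2.0):
    """D(pi'||pi) = sum_k p'_k log(p'_k / p_k), with +inf when pi' is not
    absolutely continuous with respect to pi."""
    d = 0.0
    for q, r in zip(p_prime, p):
        if q > 0.0:
            if r == 0.0:
                return math.inf
            d += q * math.log(q / r, base)
    return d

pi = [0.5, 0.5]
pi_prime = [0.9, 0.1]
# Gibbs' inequality: D >= 0, with zero exactly on the diagonal pi' = pi.
```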

7.1.2 Differential Entropy

We generalize the Shannon entropy to continuous random variables X possessing a probability distribution function ρ_X. For definiteness, we fix on Rⁿ-valued random vectors X and define the differential entropy to be

$$H(X) \triangleq -\int \rho_X \ln \rho_X.$$

Again the notation associates the entropy with the continuous random variable X as a shorthand for the actual dependence, which is on the distribution ρ_X. However, there are a number of subtle differences from the discrete case. First, there is the fact that the probability density ρ_X sets a scale that is in turn inherited by the entropy. The integral is −∫_{Rⁿ} ρ_X(x) ln ρ_X(x) dx, and this assumes that we have already replaced any physical units required to describe the variables with dimensionless ones. If X were originally a distance parameter measured in meters, then ρ_X would have units m⁻ⁿ, and so the logarithm would not make sense. Second, unlike the discrete case, where the probabilities p_k must lie in the range 0 to 1, the values ρ_X(x) need only be nonnegative. We therefore typically encounter the function y = −x ln x for values x > 1 as well, leading to possibly negative terms appearing in the integrand; see Figure 7.1. Technically, the integral may be restricted to the support of the probability density if desired.


Figure 7.1 The graph of y = −x ln x.

We note that the entropy may be considered as the expectation H(X) ≜ −E[ln ρ_X(X)]. Indeed, −ln ρ_X may be called the surprise function of the random variable. One issue that presents itself straight away is the question of physical units for the differential entropy. The discrete entropy is naturally dimensionless, as it involves probabilities; however, H deals with probability densities. Even after making X dimensionless, we could apply an affine linear rescaling X → AX + c with A invertible. Here we find

$$H(AX + c) = H(X) + \ln|\det A|.$$

We see that the differential entropy is translation invariant, but is only determined up to an additive constant dependent on the choice of units used to render the random variable dimensionless.

Entropy of Gaussian Variables

Let X be a Gaussian random variable (n = 1) with mean μ and standard deviation σ. Then

$$\rho_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2\sigma^2}(x-\mu)^2}$$

and

$$H_{\mathrm{Gaussian}}(X) = -\mathbb{E}\left[-\frac{1}{2\sigma^2}(X-\mu)^2 - \ln\left(\sqrt{2\pi}\,\sigma\right)\right] = \frac{1}{2} + \frac{1}{2}\ln(2\pi) + \ln\sigma,$$

or σ = (1/√(2πe)) e^{H_Gaussian(X)}. Perhaps not surprisingly, the entropy increases as the uncertainty σ does. The result generalizes to the case of an n-dimensional Gaussian with covariance matrix C:

$$H_{\mathrm{Gaussian}}(X) = \frac{n}{2} + \frac{n}{2}\ln(2\pi) + \frac{1}{2}\ln|C|.$$


Rearranging gives

$$|C|^{1/n} = \frac{1}{2\pi e}\, e^{\frac{2}{n} H_{\mathrm{Gaussian}}(X)}.$$
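The closed form for the Gaussian entropy is easy to confirm against a direct numerical integration of −ρ ln ρ. An illustrative sketch (function names and grid parameters are our own choices, natural logarithm throughout):

```python
import math

def gaussian_entropy_numeric(mu, sigma, half_width=10.0, n=100000):
    """Midpoint-rule integration of -rho ln rho for a 1-d Gaussian."""
    a, b = mu - half_width * sigma, mu + half_width * sigma
    dx = (b - a) / n
    h = 0.0
    for k in range(n):
        x = a + (k + 0.5) * dx
        rho = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        if rho > 0.0:
            h -= rho * math.log(rho) * dx
    return h

def gaussian_entropy_exact(sigma):
    """H = 1/2 + (1/2) ln(2 pi) + ln sigma."""
    return 0.5 + 0.5 * math.log(2 * math.pi) + math.log(sigma)
```

Note that the numerical value is independent of μ, in line with the translation invariance observed above.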

Relative Entropy

Let ρ₁ and ρ₂ be probability density functions on Rⁿ; then the relative entropy of ρ₁ with respect to ρ₂ is defined to be

$$D(\rho_1\|\rho_2) \triangleq \int \rho_1 \ln\frac{\rho_1}{\rho_2}.$$

This is also known as the Kullback–Leibler divergence, the entropy of discrimination, and the information distance. More exactly, let P₁ and P₂ be probability measures on Rⁿ; then we define their relative entropy as ∫ ln(dP₁/dP₂) dP₁ if P₁ is absolutely continuous with respect to P₂ (with dP₁/dP₂ being the Radon–Nikodym derivative), but take it as +∞ otherwise. We have the following property.

Proposition 7.1.1 (Gibbs' Inequality) The relative entropy satisfies D(ρ₁||ρ₂) ≥ 0. Furthermore, D(ρ₁||ρ₂) = 0 if and only if ρ₁ = ρ₂.

Proof We may write D(ρ₁||ρ₂) = ∫_A ρ₁ φ(y), where y = ρ₂/ρ₁, φ(y) = −ln y, and A is the support of ρ₁. As φ is convex, we may apply Jensen's inequality to get ∫ρ₁ φ(y) ≥ φ(∫ρ₁ y); but ∫ρ₁ y = ∫_A ρ₂ ≤ 1, and since φ is decreasing with φ(1) = 0, the right-hand side is nonnegative. As φ is strictly convex, we get equality only if y ≡ 1 almost everywhere.

As a measure of the distance from ρ₂ to ρ₁, D(ρ₁||ρ₂) has the desirable property of being positive and vanishing only if ρ₁ and ρ₂ are the same. It is not a metric, however, since it is not symmetric! Let ρ₁ and ρ₂ be n-dimensional Gaussian densities with mean vectors μ_k and covariance matrices C_k, respectively, for k = 1, 2. Then

$$D(\rho_1\|\rho_2) = \frac{1}{2}(\mu_1-\mu_2)^{\mathsf{T}} C_2^{-1} (\mu_1-\mu_2) + \frac{1}{2}\mathrm{tr}\left(C_1 C_2^{-1}\right) - \frac{1}{2}\ln|C_1 C_2^{-1}| - \frac{n}{2}.$$
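The Gaussian relative entropy formula can be checked numerically in the n = 1 case (an illustrative sketch with our own helper names; natural logarithm):

```python
import math

def gauss_kl_exact(mu1, s1, mu2, s2):
    """n = 1 specialization of the Gaussian relative entropy formula above."""
    r = (s1 / s2) ** 2
    return 0.5 * ((mu1 - mu2) ** 2 / s2 ** 2 + r - math.log(r) - 1.0)

def gauss_kl_numeric(mu1, s1, mu2, s2, half_width=12.0, n=100000):
    """Midpoint-rule integration of rho1 * ln(rho1 / rho2)."""
    a, b = mu1 - half_width * s1, mu1 + half_width * s1
    dx = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * dx
        # log-densities of the two Gaussians
        lr1 = -0.5 * ((x - mu1) / s1) ** 2 - math.log(s1 * math.sqrt(2 * math.pi))
        lr2 = -0.5 * ((x - mu2) / s2) ** 2 - math.log(s2 * math.sqrt(2 * math.pi))
        total += math.exp(lr1) * (lr1 - lr2) * dx
    return total
```

The asymmetry remarked on above is visible directly: swapping the two densities changes the value.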

7.2 Law of Large Numbers and Large Deviations

Let X be a random variable, say taking values in R^d, and suppose that it has a well-defined cumulant generating function


$$W_X(t) = \ln \int e^{t\cdot x}\, K_X[dx]$$

for t ∈ R^d, where K_X is its probability distribution. (Note that t·x is shorthand here for ∑_{j=1}^d t_j x_j.) The law of large numbers is the principle that if we generate independent copies of X, say random variables X₁, X₂, ..., then their sample means X̄ₙ = (1/n)∑_{k=1}^n X_k ought to have a distribution that is increasingly concentrated about the mean value μ = ∫ x K_X[dx] as n becomes large. Indeed, we see that

$$W_{\bar X_n}(t) = \ln \mathbb{E}\left[e^{\frac{1}{n} t\cdot(X_1+\cdots+X_n)}\right] = n \ln \mathbb{E}\left[e^{\frac{1}{n} t\cdot X}\right] = n\, W_X\!\left(\frac{t}{n}\right),$$

and assuming that the cumulant generating function is differentiable about t = 0, we get lim_{n→∞} W_{X̄ₙ}(t) = t·∇W_X(0) = t·μ. Alternatively, we may say that the probability distribution Kₙ of X̄ₙ converges to δ_μ. On the one hand, we have the conclusion that

$$\lim_{n\to\infty} \Pr\{\bar X_n \in A\} = 1$$

for any (say open) set A ⊂ R^d containing the mean μ. On the other, we would get that Pr{X̄ₙ ∈ A} should vanish if A does not contain the mean μ. Clearly the recording of a value for the sample mean X̄ₙ away from μ, known as a large deviation, is an increasingly rare event for large n, and the next question is how to quantify how small these probabilities tend to get. We first look at a concrete example.

As an example, we take X to be a Bernoulli variable: taking value 1 with probability p and 0 with probability 1 − p. This has the cumulant generating function

$$W_X(t) = \ln\left(1 - p + e^t p\right)$$

and, of course, the mean value μ = p. For integer m between 0 and n, we have

$$\Pr\left\{\bar X_n = \frac{m}{n}\right\} = \binom{n}{m} p^m (1-p)^{n-m}.$$

We now assume that both n and m are large with m/n = q ∈ (0, 1). Using Stirling's approximation n! ≈ √(2πn) e^{−n} nⁿ, we find that

$$\Pr\left\{\bar X_n = \frac{m}{n}\right\} \approx e^{-n I(q) + O(\ln n)},$$

where

$$I(q) = q \ln\frac{q}{p} + (1-q)\ln\frac{1-q}{1-p}.$$
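The Stirling estimate can be tested directly: using exact binomial log-probabilities via `lgamma`, the quantity −(1/n) ln Pr{X̄ₙ = q} approaches I(q) as n grows. A sketch (helper names are ours):

```python
import math

def rate(q, p):
    """Bernoulli rate function I(q) = q ln(q/p) + (1-q) ln((1-q)/(1-p))."""
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))

def log_prob_mean(n, m, p):
    """Exact ln Pr{Xbar_n = m/n} = ln C(n,m) + m ln p + (n-m) ln(1-p)."""
    return (math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
            + m * math.log(p) + (n - m) * math.log(1 - p))

p, q = 0.3, 0.6
# -(1/n) ln Pr{Xbar_n = q} should converge to I(q), up to O(ln n / n).
estimates = {n: -log_prob_mean(n, int(q * n), p) / n for n in (100, 1000, 10000)}
```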


We see that the probability of getting a large deviation (that is, a mean value q different from p from a sample of size n) is exponentially small. Another way of saying this is that our surprise at recording a sample average q different from p grows roughly proportionally to the sample size n – specifically, the surprise is to leading order nI(q). In particular,

$$\lim_{n\to\infty} -\frac{1}{n}\ln \Pr\left\{\bar X_n = q\right\} = I(q).$$

The result can actually be significantly strengthened. Not only does the probability for X̄ₙ to equal q decay exponentially with rate I(q), but so too does the probability for X̄ₙ to be equal to q or further away! The function I(q) is called the rate function and is sketched later in this chapter. The main features are that it is convex with a minimum of zero at q = p.

There are a number of things going on in this example, but let's pull out some key points. The probability of a collection of large deviation events is a sum over small probabilities that are decaying exponentially fast with the sample size n. Mathematically, the event with the least rate of decay is the one that should dominate – it may be exponentially small, but the others are vanishing faster! Put another way, the decay of a collection of large deviation events with sample size n is determined by the least surprising of these events. The rate function takes a specific form in this example: in fact, it is just the relative entropy D(π_q||π_p) of the Bernoulli probability vector π_q = (q, 1 − q) with respect to π_p = (p, 1 − p). Evidently, the rate function ought to be determined by the common distribution K_X of the random variables that we're sampling, and indeed it does depend on the parameter p that fixes the Bernoulli distribution in this case. However, we remark that the rate function and the cumulant generating function are Legendre transforms of each other:

$$I(q) = \sup_t\{tq - W_X(t)\}.$$

In this case, the maximizer t*(q) is seen to be ln((q/p)·((1−p)/(1−q))), and one readily sees that I(q) = t*(q) q − W_X(t*(q)). Conversely, the rate function determines the distribution, as we can invert to get W_X(t) = sup_q{tq − I(q)}.

Going beyond the Bernoulli case, it turns out that the straightforward generalization holds for the distribution of the sample mean for random variables on R, and this is known as Cramér's Theorem. Effectively, we have the law of large numbers, which is telling us that the distribution Kₙ of the average, X̄ₙ, of a sample of size n converges to the degenerate distribution δ_μ. The large deviation principle tells us a bit more, roughly that Kₙ[dx] behaves asymptotically as e^{−nI(x)} "dx," where "dx" is a proxy for some background measure that we don't care about. Here I(x) is called the rate function, and it should be strictly


positive except at x = μ, where it vanishes. This is indeed very rough, but gives the basic intuition needed. In order to state these precisely, it is convenient to make some formal definitions.
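The Legendre–Fenchel relation I(q) = sup_t{tq − W_X(t)} can be evaluated by brute force and compared with the closed form just derived. A crude grid-search sketch (our own names, grid bounds, and step count):

```python
import math

def cgf(t, p):
    """Bernoulli cumulant generating function W_X(t) = ln(1 - p + p e^t)."""
    return math.log(1 - p + p * math.exp(t))

def legendre(q, p, t_lo=-20.0, t_hi=20.0, steps=200000):
    """sup_t {t q - W_X(t)} by grid search; tq - W_X(t) is concave in t."""
    best = -math.inf
    for k in range(steps + 1):
        t = t_lo + (t_hi - t_lo) * k / steps
        best = max(best, t * q - cgf(t, p))
    return best

def rate_exact(q, p):
    """Closed-form rate function I(q)."""
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))
```

Since the objective is concave, a fine grid already lands extremely close to the supremum; at q = p the supremum is attained at t = 0 with value zero.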

7.2.1 Large Deviation Property

The following is a mathematical definition. Let (Kₙ)ₙ be a sequence of probability measures on a Polish space X, and I : X → [0, ∞] a function with compact level sets, that is, {x ∈ X : I(x) ≤ c} is compact for each c ≥ 0. We say that (Kₙ)ₙ satisfies a large deviation property with rate function I if for every open set A

$$\limsup_{n\to\infty} -\frac{1}{n}\ln K_n[A] \le \inf_{x\in A} I(x),$$

and for every closed subset C, we have

$$\liminf_{n\to\infty} -\frac{1}{n}\ln K_n[C] \ge \inf_{x\in C} I(x).$$

This means that if A is any Borel subset, then

$$\inf_{x\in\bar A} I(x) \le \liminf_{n\to\infty} -\frac{1}{n}\ln K_n[A] \le \limsup_{n\to\infty} -\frac{1}{n}\ln K_n[A] \le \inf_{x\in A^o} I(x),$$

where A^o is the interior of A and Ā is its closure. In particular, if the large deviation property holds, then for any set with inf_{x∈Ā} I(x) = inf_{x∈A^o} I(x), we have that the following limit exists:

$$\lim_{n\to\infty} -\frac{1}{n}\ln K_n[A] = \inf_{x\in A} I(x).$$

We say that a pair of sequences (aₙ)ₙ and (bₙ)ₙ are asymptotically logarithmically equivalent, written aₙ ≍ bₙ, if

$$\lim_{n\to\infty} \frac{1}{n}\ln a_n = \lim_{n\to\infty} \frac{1}{n}\ln b_n.$$

If a sequence of measures (Kₙ)ₙ satisfies a large deviation property with rate function I, then we denote this by Kₙ[dx] ≍ e^{−nI(x)} "dx". Let us stress that the large deviation property is a mathematical definition. The measures (Kₙ)ₙ are not necessarily supposed to arise as distributions of sample mean variables, though they usually relate to some form of distributions for rare events. The following result is useful at this level of generality.


Theorem 7.2.1 (Gärtner–Ellis) Let (Kₙ)ₙ be a sequence of probability measures on X ≡ R^d. Suppose that the limit

$$W(t) = \lim_{n\to\infty} \frac{1}{n}\ln \int e^{n t\cdot x}\, K_n[dx]$$

exists for each t ∈ R^d and defines a differentiable function; then (Kₙ)ₙ satisfies a large deviation property with rate function I given by

$$I(x) = \sup_t\{t\cdot x - W(t)\}.$$

The result may be justified on the following grounds: if (Kₙ)ₙ did satisfy a large deviation property, then

$$\int e^{n t\cdot x}\, K_n[dx] \asymp \int e^{n[t\cdot x - I(x)]}\, \text{“}dx\text{”} \asymp e^{n \sup_x\{t\cdot x - I(x)\}},$$

where in the last part one argues that it is the largest rate that dominates; this would imply that W(t) ≡ sup_x{t·x − I(x)}, which may be inverted to get I(x) = sup_t{t·x − W(t)} provided W(·) is differentiable. A related result is Varadhan's Theorem.

Theorem 7.2.2 (Varadhan) Let (Kₙ)ₙ be a sequence of probability measures on a Polish space X, and I : X → [0, ∞] a rate function with compact level sets. Then the large deviation property is equivalent to the condition that

$$\lim_{n\to\infty} \frac{1}{n}\ln \int e^{n\Phi(x)}\, K_n[dx] = \sup_x\{\Phi(x) - I(x)\}$$

for every continuous bounded real function Φ. The intuition justifying this is similar to before:

$$\int e^{n\Phi(x)}\, K_n[dx] \asymp \int e^{n\Phi(x)} e^{-nI(x)}\, \text{“}dx\text{”} \asymp e^{n \sup_x\{\Phi(x) - I(x)\}},$$

where again we use the rule of thumb that the largest exponential rate dominates. Note that if we take X ≡ R^d in Varadhan's Theorem, then we cannot automatically take Φ(x) to be t·x, as this is not bounded. The following two results are very useful in practical applications.

Theorem 7.2.3 (The Contraction Principle) Let φ : X → X′ be continuous, and consider the image probabilities K′ₙ = Kₙ ∘ φ⁻¹.


Then if (Kₙ)ₙ satisfies a large deviation principle with rate function I, it follows that (K′ₙ)ₙ satisfies a large deviation principle with rate function I′ given by

$$I'(x') = \inf\{I(x) : \varphi(x) = x'\}.$$

Theorem 7.2.4 (Deformation Principle) Let Φ be a bounded continuous function, and define the probabilities

$$\tilde K_n[dx] = \frac{1}{\Xi_n}\, e^{n\Phi(x)}\, K_n[dx],$$

where the normalization is Ξₙ = ∫ e^{nΦ(x)} Kₙ[dx]. If (Kₙ)ₙ satisfies a large deviation property with rate function I, then (K̃ₙ)ₙ satisfies a large deviation property with rate function

$$\tilde I(x) = I(x) - \Phi(x) - \inf\{I - \Phi\}.$$

7.2.2 Large Deviation Results

As we have mentioned, the definition of the large deviation property applies to more general situations than sampling a random variable. However, we now return to this situation and state the classical result, which is now a corollary to the Gärtner–Ellis Theorem.

Theorem 7.2.5 (Cramér) Let X be an R^d-valued random variable with distribution K_X, with cumulant generating function W_X(t) = ln ∫ e^{t·x} K_X[dx] well defined for all t ∈ R^d, and let Kₙ be the distribution of the sample mean of n independent copies of X. Then (Kₙ)ₙ satisfies a large deviation property with the rate function

$$I(x) = \sup_t\{t\cdot x - W_X(t)\}.$$

Here we arrive at W_X as the scaled cumulant function limit W, since

$$\frac{1}{n}\ln \int e^{n t\cdot x}\, K_n[dx] = \frac{1}{n}\ln \mathbb{E}\left[e^{t\cdot(X_1+\cdots+X_n)}\right] = \ln \mathbb{E}\left[e^{t\cdot X}\right] = W_X(t).$$

We note that W_X is automatically analytic, so the differentiability condition comes for free.

Empirical Measures

Next let X be a Polish space and M(X) be the space of Radon probability measures on X, equipped with the topology of weak convergence. (In fact, M(X) can be turned into a Polish space with the Wasserstein metric.) Given an i.i.d. sequence X₁, X₂, ..., we may define the empirical measures

$$M_n(\omega; dx) = \frac{1}{n}\sum_{k=1}^n \delta_{X_k(\omega)}[dx].$$

So we may think of Mₙ(·, dx) as a random measure, that is, a random variable taking values in M(X). A sequence of probability measures on M(X) is then given by

$$P_n[A] = \Pr\{M_n(\cdot, dx) \in A\}.$$

Intuitively, one expects that the empirical measures should converge in some sense, as n tends to infinity, to the common distribution K_X of the terms in the i.i.d. sequence. This can be elegantly formulated as a large deviation property.

Theorem 7.2.6 (Sanov) The family (Pₙ)ₙ of distributions for the empirical measures obtained by sampling a random variable X satisfies a large deviation property with the rate function

$$I(K') = D(K'\|K_X).$$

To complete the picture, we should remark that the relative entropy has a natural variational expression:

$$D(K'\|K_X) = \sup_f\left\{\int_X f\, dK' - W_X(f)\right\},$$

where

$$W_X(f) = \ln \int_X e^{f(x)}\, K_X[dx].$$

The supremum is taken over all real Borel-measurable functions on X (and we can even restrict to bounded continuous functions!). Actually, Sanov's Theorem is a higher-level version of Cramér's Theorem, and we may deduce the latter from the former using the contraction principle with the map φ : M(X) → R^d,

$$\varphi(K) \triangleq \int_X x\, K[dx].$$

Proof of Sanov's Theorem When X Is Finite

It is instructive to look at Sanov's Theorem in the case where X is finite, say #X = d. In this case, we may introduce random variables Nₙ(ω, x) for each x ∈ X that count the number of times x occurs in the sequence (X₁(ω), ..., Xₙ(ω)). Indeed, Nₙ(·, x) has a

Proof of Sanov’s Theorem When X Is Finite It is instructive to look at Sanov’s Theorem in the case where X is finite, say $X = d. In this case, we may introduce random variables Nn (ω, x) for each x ∈ X that count the number of times x occurs in the sequence (X1 (ω) , . . . , Xn (ω)). Indeed, Nn (·, x) has a Bin(n, p (x)) distribution where p (x) = Pr {X = x}. The empirical measure simplifies to

7.3 Large Deviations and Stochastic Processes

Mn (·, dx) =



133

pn (·, x) δx [dx] ,

x∈X

where pn (ω, x) = Nn (ω,x) . n We may now view Mn as an Rd valued random variable – constrained to be a probability vector. Specifically, we may think of Mn as the random vector (pn (·, x))x∈X . In this context, we can apply G¨artner–Ellis, and here the scaled cumulant function W (t) is   (  (  ) 1 )  1 n x t(x)pn (·,x) t(x)Nn (·,x) t(x) x ln E e = ln E e ≡ e p (x) n n x where we use the fact that the occupation numbers {Nn (·, x)}x∈X have a multinomial distribution Bin (n, π ) where π = {p (x)}x∈X . The G¨artner–Ellis Theorem then states that the distributions of the Mn satisfy a large deviation property with the rate function       t(x) t (x) p (x) − e p (x) I π = sup t



x

x



where π  ≡ p (x) x∈X . The supremum is attained, and we have the conditions (from differentiating wrt. t (x) for fixed x) 0 = p (x) −

et(x) p (x) eW(t)

or t (x) = ln

p (x) + W (t) . p (x)

Substituting back in gives   p (x) ≡ D π  ||π . I π = p (x) ln p (x) x
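In the finite case, Sanov's Theorem can also be observed directly from multinomial probabilities: −(1/n) ln Pr of observing empirical frequencies π′ tends to D(π′||π) as n grows. A sketch (helper names are ours; natural logarithm):

```python
import math

def log_multinomial_prob(counts, probs):
    """ln Pr of the occupation numbers under Mult(n, probs)."""
    n = sum(counts)
    lp = math.lgamma(n + 1)
    for c, p in zip(counts, probs):
        lp += c * math.log(p) - math.lgamma(c + 1)
    return lp

def d_kl(q, p):
    """D(pi'||pi) = sum_x p'(x) ln(p'(x)/p(x))."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0.0)

p = [0.2, 0.3, 0.5]       # sampling distribution pi
q = [0.4, 0.4, 0.2]       # observed empirical frequencies pi'
n = 20000
counts = [int(qi * n) for qi in q]
empirical_rate = -log_multinomial_prob(counts, p) / n   # approaches D(pi'||pi)
```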

7.3 Large Deviations and Stochastic Processes

7.3.1 Path Integrals and the Wiener Process

Recall that the Wiener process {W(t) : t ≥ 0} is a family of real-valued random variables parameterized by time t ≥ 0, such that

$$\mathbb{E}\left[e^{\int_0^\infty f(t)\, dW(t)}\right] = e^{\frac{1}{2}\int_0^\infty f(t)^2\, dt} \qquad (7.3)$$


for any real-valued square-integrable function f. The process is a Markov process, and indeed, the probability density associated with it taking values q₁, ..., qₙ at times t₁ < ··· < tₙ is

$$\rho_n(q_1, t_1, \dots, q_n, t_n) = \left(\prod_{k=2}^n T(q_k, t_k \mid q_{k-1}, t_{k-1})\right) \rho(q_1, t_1), \qquad (7.4)$$

where the initial probability density is ρ(q, t) = (1/√(2πt)) e^{−q²/2t} and (for t > t′) the transition mechanism is

$$T(q, t \mid q', t') = \frac{1}{\sqrt{2\pi(t - t')}}\, e^{-\frac{(q - q')^2}{2(t - t')}}. \qquad (7.5)$$

From the probability densities ρₙ, we can use Kolmogorov's Reconstruction Theorem to construct a probability space (Ω_W, F_W, P_W) for the process. In particular, Wiener was able to give a more explicit construction where we take the sample space to be the set of all continuous paths parameterized by time t ≥ 0 (trajectories) starting at the origin,

$$\Omega_W = C_0[0, \infty), \qquad (7.6)$$

such that if q = {q(t) : t ≥ 0} is an outcome, then W(t) takes on the value q(t). The sigma-algebra, F_W, in this case is the one generated by the cylinder sets: that is, the sets of all trajectories that are required at a finite number of certain times to be in certain fixed interval regions. We refer to this as the canonical Wiener process, and from now on take it as the default probability space (Ω_W, F_W, P_W) for the Wiener process. We have

$$\mathbb{E}\left[e^{\int_0^\infty f(t)\, dW(t)}\right] = \int_{C_0[0,\infty)} e^{\int_0^\infty f(t)\, dq(t)}\, P_W[dq]. \qquad (7.7)$$

However, it is sometimes convenient to think of a path integration, or functional integral, formulation of Wiener averages. Let t > 0 and choose 0 ≤ t₁ < ··· < tₙ ≤ t for fixed n. Then we have

$$\mathbb{E}\left[F_t[W(t_1), \dots, W(t_n)]\right] = \int F_t[q_1, \dots, q_n]\, N_n\, e^{-\frac{1}{2}\sum_{k=1}^n \frac{(q_k - q_{k-1})^2}{t_k - t_{k-1}}}\, dq_1 \cdots dq_n, \qquad (7.8)$$

with q₀ = 0 and Nₙ = ∏_{k=1}^n (2π(t_k − t_{k−1}))^{−1/2}. Taking a continuum limit n → ∞ with max(t_k − t_{k−1}) → 0, we may formally consider the limit functional F_t[W] of the Wiener process over the time interval [0, t] with expectation

$$\mathbb{E}\left[F_t[W]\right] = \int_{C_0[0,t]} F_t[q]\, e^{-\frac{1}{2}\int_0^t \dot q(\tau)^2\, d\tau}\, Dq, \qquad (7.9)$$


where Dq is the formal limit of Nₙ dq₁···dqₙ, and ∫₀^t q̇(τ)² dτ the limit of ∑_{k=1}^n (q_k − q_{k−1})²/(t_k − t_{k−1}). This would then be normalized (setting F_t ≡ 1) by

$$\int_{C_0[0,t]} e^{-\frac{1}{2}\int_0^t \dot q(\tau)^2\, d\tau}\, Dq = 1. \qquad (7.10)$$

While this is appealing, sadly neither of the limits makes sense separately: a well-known theorem of André Weil shows that there is no translationally invariant measure on the trajectories, so Dq is meaningless; a theorem of Wiener shows that while the noncontinuous trajectories form a set of P_W-measure zero, so too do the differentiable trajectories, and so ∫₀^t q̇(τ)² dτ is a well-defined integral for almost no trajectory. Nevertheless, the combination

$$P_t[dq] \equiv e^{-\frac{1}{2}\int_0^t \dot q(\tau)^2\, d\tau}\, Dq \qquad (7.11)$$

can be used formally to great effect. For instance, we have

$$\mathbb{E}\left[e^{\int_0^\infty f(t)\, dW(t)}\right] = \int_{C_0[0,t]} e^{\int_0^t f(\tau)\, dq(\tau)}\, e^{-\frac{1}{2}\int_0^t \dot q(\tau)^2\, d\tau}\, Dq = e^{\frac{1}{2}\int_0^t f(\tau)^2\, d\tau} \int_{C_0[0,t]} e^{-\frac{1}{2}\int_0^t [\dot q(\tau) - f(\tau)]^2\, d\tau}\, Dq = e^{\frac{1}{2}\int_0^t f(\tau)^2\, d\tau},$$

where we translate q(t) → q(t) + ∫₀^t f(τ) dτ. Now let us state a simple result about the rescaled Wiener process.

Theorem 7.3.1 (Schilder) Let {W(t)}_{0≤t≤T} be the canonical Wiener process on C₀[0, T], and let Kₙ be the law of (1/√n) W(·). Then the family {Kₙ}ₙ satisfies a large deviation property with the rate function

$$I(q) = \frac{1}{2}\int_0^T |\dot q(t)|^2\, dt, \qquad (7.12)$$

with I(q) = ∞ if the path is not absolutely continuous. A proof of this may be found in Dembo and Zeitouni (1998); however, we can see the germ of the idea by observing that

$$K_n[dq] \asymp e^{-\frac{n}{2}\int_0^T \dot q(\tau)^2\, d\tau}\, Dq. \qquad (7.13)$$
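Identity (7.3) itself can be checked by crude Monte Carlo over time-discretized paths, approximating ∫ f dW by an Itô sum (an illustrative sketch only; the discretization, sample sizes, and helper names are our choices):

```python
import math
import random

random.seed(7)

def mc_exponential_moment(f, T=1.0, steps=100, paths=10000):
    """Monte Carlo estimate of E[exp(int_0^T f(t) dW(t))] via Ito sums:
    each increment dW ~ N(0, dt), independently over subintervals."""
    dt = T / steps
    sd = math.sqrt(dt)
    total = 0.0
    for _ in range(paths):
        s = 0.0
        for k in range(steps):
            s += f(k * dt) * random.gauss(0.0, sd)
        total += math.exp(s)
    return total / paths

# For f(t) = cos t on [0, 1], the right-hand side of (7.3) is
# exp((1/2) * int_0^1 cos^2 t dt) = exp((1/2)(1/2 + sin(2)/4)).
exact = math.exp(0.5 * (0.5 + math.sin(2.0) / 4.0))
```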


Indeed, by Gärtner–Ellis, we have

$$W(f) = \lim_{n\to\infty}\frac{1}{n}\ln \int_{C_0[0,T]} e^{n\int_0^T f(t)\dot q(t)\, dt}\, K_n[dq] = \lim_{n\to\infty}\frac{1}{n}\ln \int_{C_0[0,T]} e^{\sqrt{n}\int_0^T f(t)\, dq(t)}\, P_W[dq] = \frac{1}{2}\int_0^T |f(t)|^2\, dt,$$

with the Legendre–Fenchel transform

$$I(q) = \sup_f\left\{\int_0^T f(t)\dot q(t)\, dt - W(f)\right\} = \frac{1}{2}\int_0^T |\dot q(t)|^2\, dt - \frac{1}{2}\inf_f \int_0^T |f(t) - \dot q(t)|^2\, dt = \frac{1}{2}\int_0^T |\dot q(t)|^2\, dt.$$

7.3.2 Large Deviation for Diffusions

A diffusion process in Rⁿ driven by m independent Wiener processes {W_α(t)} is described in component form by

$$dX^i(t) = v^i(X(t))\, dt + \sum_{\alpha=1}^m b^i_\alpha(X(t))\, dW_\alpha(t), \qquad (7.14)$$

(i = 1, ..., n), with initial condition X(0) = x₀ ∈ Rⁿ. Here the differentials are understood in the Itō sense, that is, dX(t) means X(t + dt) − X(t) for positive increments dt. We therefore have that E[dX^i(t)] = E[v^i(X(t))] dt, so v is the average velocity vector field. To work out a path integral formulation, let us consider the 1-d case dX(t) = v(X(t)) dt + b(X(t)) dW(t). Roughly speaking, for t₂ > t₁ with t₂ − t₁ small, the variable [X(t₂) − X(t₁) − v(X(t₁))(t₂ − t₁)]/b(X(t₁)) is approximately a centered Gaussian of variance t₂ − t₁, and so the transition probability is

$$T(x_2, t_2 \mid x_1, t_1) \approx \frac{1}{\sqrt{2\pi(t_2 - t_1)}\, b(x_1)}\, e^{-\frac{[x_2 - x_1 - v(x_1)(t_2 - t_1)]^2}{2(t_2 - t_1)\, b(x_1)^2}} \times \left(1 + \nabla v(x_1)(t_2 - t_1)\right)^{-1/2}, \qquad (7.15)$$

the final term arising as a Jacobian. We can guess the path integral form at this stage. The associated probability measure on the space of sample trajectories, C_{x₀}[0, t] in the multidimensional case, is

$$P_{X,t}(dx) = e^{-\int_0^t L_X(x, \dot x)\, d\tau}\, Dx, \qquad (7.16)$$

where the action functional now comes from the associated Lagrangian

$$L_X(x, \dot x) = \frac{1}{2}\left[\dot x - v(x)\right]^{\mathsf{T}}\, \Sigma_{XX}^{-1}(x)\, \left[\dot x - v(x)\right] + \frac{1}{2}\nabla\cdot v(x), \qquad (7.17)$$

where Σ_XX(x) is the n × n diffusion matrix with entries ∑_{α=1}^m b^i_α(x) b^j_α(x). We assume that the diffusion matrix is invertible at each x ∈ Rⁿ.

Theorem 7.3.2 (Freidlin–Wentzell) Let {Xₙ(t)}_{0≤t≤T} be the Itō diffusion process satisfying the stochastic differential equation (SDE)

$$dX_n(t) = v(X_n(t))\, dt + \frac{1}{\sqrt{n}}\sum_\alpha b_\alpha(X_n(t))\, dW_\alpha(t) \qquad (7.18)$$

with Xₙ(0) = 0, where the W_α(·) are independent canonical Wiener processes and the drift vector field v(·) is uniformly Lipschitz continuous. We make the assumption that Σ_XX is strictly positive definite (so that the generator of the diffusion is elliptic). Then the family {Kₙ}ₙ given by Kₙ = P ∘ Xₙ⁻¹ satisfies a large deviation property with the rate function

$$I(q) = \frac{1}{2}\int_0^T \left[\dot q - v(q)\right]^{\mathsf{T}}\, \Sigma_{XX}^{-1}(q)\, \left[\dot q - v(q)\right] dt, \qquad (7.19)$$

with I(q) = ∞ if the path is not in the Sobolev space H¹([0, T]; R^d) of R^d-valued L² functions on [0, T] with L² derivative.

A proof of this may be found in Dembo and Zeitouni (1998) and Freidlin and Wentzell (1998), but it is not too difficult to motivate the result formally by the same type of argument given after Schilder's Theorem. There is an intriguing connection between the rate functions, I, appearing here and the one-particle irreducible generating functions, Γ, considered in the previous chapter. Both occur as Legendre–Fenchel transforms of the cumulants of various random variables, processes, or fields. This connection between probabilistic distributions and quantum field theory goes back to Jona-Lasinio (1983). Subsequently, a least effective action principle for fluctuations in terms of stochastic processes was developed by Eyink (1996) along similar lines.
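A small-noise diffusion of the type (7.18) is easy to simulate with the Euler–Maruyama scheme: as the noise scale shrinks, paths concentrate on the deterministic flow ẋ = v(x), which is the zero of the rate function (7.19). A sketch with our own (hypothetical) choice of drift v(x) = −x and constant diffusion coefficient:

```python
import math
import random

random.seed(1)

def euler_maruyama(v, b, x0, T=1.0, steps=500, eps=1.0):
    """One Euler-Maruyama sample path endpoint of dX = v(X)dt + eps*b(X)dW."""
    dt = T / steps
    sd = math.sqrt(dt)
    x = x0
    for _ in range(steps):
        x += v(x) * dt + eps * b(x) * random.gauss(0.0, sd)
    return x

v = lambda x: -x        # illustrative drift (Ornstein-Uhlenbeck type)
b = lambda x: 1.0       # illustrative constant diffusion coefficient
x0 = 1.0

# Noiseless flow xdot = -x started at x0 = 1 reaches e^{-1} at T = 1;
# for small eps (i.e. large n, eps = 1/sqrt(n)) endpoints cluster around it.
deterministic = x0 * math.exp(-1.0)
```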

8 Introduction to Fock Spaces

Fock spaces were introduced by Vladimir Fock as the appropriate Hilbert space setting for quantum fields corresponding to Bosonic quanta. They have since emerged as a rich source of mathematical investigation, relevant to a large number of areas of mathematics and deserving of study in their own right. In this chapter, we set up the basic definitions for the Full (that is, distinguishable quanta), Boson, and Fermion cases.

8.1 Hilbert Spaces

8.1.1 Notations and General Statements

Linear spaces are spaces over the field of complex numbers C unless stated differently. Hilbert spaces are denoted as H or h, with elements f, g, .... The positive definite inner product (f, g) ∈ H × H ↦ ⟨f | g⟩ ∈ C is linear in g and antilinear in f, and satisfies the hermitean symmetry

$$\langle g \mid f\rangle = \langle f \mid g\rangle^*. \qquad (8.1)$$

The inner product determines a norm on H according to

$$\|f\| = \sqrt{\langle f \mid f\rangle} \ge 0. \qquad (8.2)$$

Indeed, we note that ‖f‖ = 0 if and only if f = 0. The inner product may in fact be recovered from the norm by means of the polarization identity

$$4\,\langle f \mid g\rangle = \|f + g\|^2 - \|f - g\|^2 - i\,\|f + ig\|^2 + i\,\|f - ig\|^2. \qquad (8.3)$$
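The polarization identity (8.3) can be verified numerically for the standard inner product on Cⁿ, which follows the same convention (linear in the second argument, antilinear in the first). A sketch with our own helper names:

```python
def inner(f, g):
    """Standard inner product on C^n, antilinear in f, linear in g."""
    return sum(x.conjugate() * y for x, y in zip(f, g))

def norm_sq(f):
    """Squared norm <f|f> (real and nonnegative)."""
    return inner(f, f).real

def polarize(f, g):
    """Recover 4<f|g> from norms alone, as in (8.3)."""
    i = 1j
    return (norm_sq([x + y for x, y in zip(f, g)])
            - norm_sq([x - y for x, y in zip(f, g)])
            - i * norm_sq([x + i * y for x, y in zip(f, g)])
            + i * norm_sq([x - i * y for x, y in zip(f, g)]))
```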

All Hilbert spaces considered here are separable, that is, we shall always assume the existence of a finite or countable orthonormal basis. The completeness axiom for Hilbert spaces is the requirement that every sequence of vectors that is Cauchy in the norm must converge in the norm to


an element of the Hilbert space. In general, an infinite dimensional linear space E with a positive definite inner product need not be complete; in such cases we refer to the space as an inner product space, or pre-Hilbert space. However, one can always define a (unique) completion Ẽ of E, which is a Hilbert space; see e.g. Reed and Simon (1972).

Hilbert spaces can emerge in a more general setting. Let E be a linear space with a positive hermitean form f, g ∈ E → B(f, g) ∈ ℂ, i.e. a form that satisfies B(f, g) = B(g, f)* and B(f, f) ≥ 0 for all f, g ∈ E. The space E₀ ≔ {f : B(f, f) = 0} may be nontrivial. Then B(f, g) induces a positive definite inner product on the quotient space E/E₀, and the completion of this quotient space is a Hilbert space. The space E is called a semi-Hilbert space. See e.g. Maurin (1968).¹

The continuous linear operators A, B, . . . on an inner product space E form the linear space L(E), and a norm on this space is given by the operator norm

‖A‖ = sup_{‖f‖=1} ‖Af‖ = sup_{‖f‖=‖g‖=1} |⟨f | Ag⟩| = sup_{‖f‖=‖g‖=1} Re ⟨f | Ag⟩.   (8.4)
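For matrices, the operator norm (8.4) is realized by the largest singular value. A small numerical illustration (my own sketch, not part of the text) compares it with the supremum over sampled unit vectors; each sample gives a lower bound, so the sampled maximum never exceeds the singular-value norm.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))

# Operator norm of a matrix = largest singular value
op_norm = np.linalg.norm(A, ord=2)

# sup over ||f|| = 1 of ||A f||, approximated by random unit vectors
samples = []
for _ in range(2000):
    f = rng.normal(size=5) + 1j * rng.normal(size=5)
    f /= np.linalg.norm(f)
    samples.append(np.linalg.norm(A @ f))
assert max(samples) <= op_norm + 1e-12
```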

If the space E is completed to the Hilbert space H, the operator A can be extended to a continuous operator on H with the norm (8.4).

A mapping f ∈ H → f% ∈ H is called a conjugation if it has the properties

f → f% is antilinear, i.e. (αf)% = α* f% for α ∈ ℂ and f ∈ H,
f → f% is isometric, i.e. ‖f%‖ = ‖f‖ for f ∈ H,
f → f% is involutive, i.e. f%% = f for f ∈ H.   (8.5)

Using the polarization identity, we derive from the isometry of this antilinear mapping

⟨f% | g%⟩ = ⟨f | g⟩*.   (8.6)

Hence any conjugation is antiunitary.

Proposition 8.1.1 If f → f% is a conjugation, then

(f | g) ≔ ⟨f% | g⟩   (8.7)

is a bilinear symmetric nondegenerate form on H.

¹ In this reference, the semi-Hilbert space is called a pre-Hilbert space. In most of the literature – and in the present book – the notion "pre-Hilbert space" is reserved for spaces with a positive definite form.

Introduction to Fock Spaces

Proof The form is obviously bilinear. The identities (8.1) and (8.6) yield

(f | g) = ⟨f% | g⟩ = ⟨f | g%⟩* = ⟨g% | f⟩ = (g | f),   (8.8)

so (8.7) is symmetric. Assume (g | f) = 0 for all f ∈ H; then (g | f) = ⟨g% | f⟩ = 0 for all f ∈ H. Take f = g%, then ⟨g% | g%⟩ = 0 and consequently g% = 0. Since the involution is isometric, that is only possible for g = 0.

The subset H_R ≔ {f ∈ H | f = f%} is a linear space over the field ℝ, and it is a real Hilbert space with the inner product of H. Any f ∈ H can be written as a sum f = f₁ + if₂ with vectors f₁, f₂ ∈ H_R, namely f₁ = (f + f%)/2 and f₂ = −i(f − f%)/2. Any orthonormal (ON) basis {e_k : k ∈ K} of the real space H_R is an ON basis of the complex space H.

8.1.2 Examples

Take the L²-space L²(ℝⁿ) with functions ℝⁿ ∋ x → φ(x) ∈ ℂ; then a simple conjugation is the usual complex conjugation of functions,

φ(x) → φ%(x) ≔ φ(x)*.   (8.9)

The functions that are "real" with respect to this involution, i.e. φ% = φ, are just the functions with values in ℝ. However, the Fourier transform (Fφ)(k) = f(k) maps the involution (8.9) onto the involution

f(k) → (f%)(k) = f(−k)*   (8.10)

for functions f(k) ∈ L²(ℝⁿ). In this context, "real" now means f(k) = f(−k)*, that is, (F⁻¹f)(x) = φ(x) is then a real-valued function.
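The involution (8.10) can be observed with a discrete Fourier transform: for a real-valued φ, the transform satisfies f(k) = f(−k)*, with indices taken modulo the grid size. A minimal illustration with NumPy's FFT (my own sketch, not from the book):

```python
import numpy as np

rng = np.random.default_rng(2)
phi = rng.normal(size=64)          # a real-valued function, phi% = phi

f = np.fft.fft(phi)                # discrete Fourier transform of phi
# "real" in momentum space: f(k) = f(-k)*, indices mod the grid size
f_reflected_conj = np.conj(np.roll(f[::-1], 1))  # maps f(k) -> f(-k)*
assert np.allclose(f, f_reflected_conj)
```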

8.2 Tensor Spaces

There is an extensive literature on tensor algebras; we refer only to Bourbaki (1974), Greub (1978), and the appendix in Dieudonné (1972).

8.2.1 The Tensor Product

General Definitions Let φ : E₁ × E₂ × · · · × Eₙ → E be an n-linear mapping of the pre-Hilbert spaces E_k, k = 1, 2, . . . , n, into the pre-Hilbert space E with the following properties:


(i) If f_k, g_k ∈ E_k, k = 1, 2, . . . , n, then we may define the inner product of f = φ(f₁, . . . , fₙ) ∈ E and g = φ(g₁, . . . , gₙ) ∈ E to be

⟨f | g⟩ = ∏_{k=1}^{n} ⟨f_k | g_k⟩.   (8.11)

(ii) The linear hull of {φ(f₁, . . . , fₙ) | f₁ ∈ E₁, . . . , fₙ ∈ Eₙ} is dense in E.

This mapping is called the tensor product of the spaces E_k. The usual notations are E = E₁⊗E₂⊗ · · · ⊗Eₙ and f = f₁⊗ · · · ⊗fₙ. Those elements of E that have the representation f = f₁⊗ · · · ⊗fₙ are called decomposable. The tensor space E is the linear hull of all decomposable tensors. The tensor product is associative:

E₁⊗E₂⊗E₃ = (E₁⊗E₂)⊗E₃ = E₁⊗(E₂⊗E₃).   (8.12)

A trivial example is where all the factor spaces are one-dimensional, that is, E₁ = · · · = Eₙ = ℂ, in which case ℂ⊗ℂ⊗ · · · ⊗ℂ = ℂ. An immediate consequence of the definition is that if the vectors e_{kj} ∈ E_k, j ∈ J_k ⊂ ℕ, form an orthonormal basis of the space E_k, k = 1, 2, . . . , n, then the tensors

e_{1j₁}⊗e_{2j₂}⊗ · · · ⊗e_{njₙ} ∈ E, (j₁, . . . , jₙ) ∈ J₁ × · · · × Jₙ   (8.13)

form an orthonormal basis of E. Hence any element F ∈ E can be represented as

F = Σ_{(j₁,...,jₙ)∈J₁×···×Jₙ} c(j₁, . . . , jₙ) e_{1j₁}⊗e_{2j₂}⊗ · · · ⊗e_{njₙ}   (8.14)

with coefficients c(j₁, . . . , jₙ) ∈ ℂ such that

‖F‖² = Σ_{(j₁,...,jₙ)∈J₁×···×Jₙ} |c(j₁, . . . , jₙ)|².   (8.15)
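Equations (8.13)–(8.15) can be made concrete for finite-dimensional factors, where the tensor product of basis vectors is realized by `np.kron`. The following sketch (my own, not from the book) checks orthonormality of the product basis and the Parseval identity (8.15):

```python
import numpy as np

rng = np.random.default_rng(3)
# Orthonormal bases of E1 = C^2 and E2 = C^3 (columns of unitary matrices)
U1 = np.linalg.qr(rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))[0]
U2 = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))[0]

# Product basis e_{1j} (x) e_{2k} of E1 (x) E2 = C^6, realized by np.kron
basis = [np.kron(U1[:, j], U2[:, k]) for j in range(2) for k in range(3)]
gram = np.array([[np.vdot(b, c) for c in basis] for b in basis])
assert np.allclose(gram, np.eye(6))            # (8.13): orthonormal basis

F = rng.normal(size=6) + 1j * rng.normal(size=6)
c = np.array([np.vdot(b, F) for b in basis])   # coefficients as in (8.14)
assert np.allclose(sum(cj * b for cj, b in zip(c, basis)), F)
assert np.allclose(np.sum(np.abs(c) ** 2), np.vdot(F, F).real)  # (8.15)
```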

Our interest will be in the situation where the factors are Hilbert spaces H₁ and H₂. The algebraic tensor product H₁⊗H₂ is the linear hull of the decomposable tensors. We shall more often work with its completion, which is again a Hilbert space and which we denote by H₁⊗̄H₂. For example, let H₁ = L²(M₁, dμ₁) and H₂ = L²(M₂, dμ₂) be two L²-spaces. Then the tensor space H₁⊗̄H₂ can be identified with L²(M₁ × M₂, dμ₁ × dμ₂), and the bilinear product is the numerical product

f(x₁) ∈ L²(M₁, dμ₁), g(x₂) ∈ L²(M₂, dμ₂) → h(x₁, x₂) = f(x₁)g(x₂) ∈ L²(M₁ × M₂, dμ₁ × dμ₂).   (8.16)


The inner product obviously factorizes according to (8.11). If {φ_m(x₁)} is an orthonormal basis of L²(M₁, dμ₁) and {ψ_n(x₂)} an orthonormal basis of L²(M₂, dμ₂), then {φ_m(x₁)ψ_n(x₂)} is an orthonormal basis of L²(M₁ × M₂, dμ₁ × dμ₂); see Reed and Simon (1972, Sect. II.4).

8.2.2 The Tensor Algebra

Let H be a Hilbert or pre-Hilbert space with dimension dim H ≥ 1. The linear hull of all decomposable tensors of degree n of this Hilbert space is denoted by H^{⊗n} = H⊗H⊗ · · · ⊗H with n factors, n = 1, 2, . . .. We also take H^{⊗0} to be the one-dimensional space ℂ. The tensor algebra over a (pre-)Hilbert space H is the set A(H) of all sequences F = (F₀, F₁, . . .) with F_m ∈ H^{⊗m}, m = 0, 1, 2, . . ., and Fₙ = 0 if n > N for some N ∈ ℕ. We now show step by step that this is indeed an algebra. First, A(H) becomes a vector space with the following definitions: if F = (F₀, F₁, . . .) and G = (G₀, G₁, . . .) are elements of A(H), their linear combination H = αF + βG for α, β ∈ ℂ is defined in the obvious way as

H = (H₀, H₁, . . .) ∈ A(H), with Hₙ = αFₙ + βGₙ ∈ H^{⊗n}.   (8.17)

An inner product on the tensor algebra may be defined as follows: for F, G ∈ A(H), we set

⟨F | G⟩ = Σ_{n=0}^{∞} ⟨Fₙ | Gₙ⟩ₙ,   (8.18)

where ⟨Fₙ | Gₙ⟩ₙ is the inner product of H^{⊗n}. The norm is given as usual by

‖F‖ = √⟨F | F⟩.   (8.19)

With these definitions, A(H) is a pre-Hilbert space. If we identify Fₙ ∈ H^{⊗n} with the corresponding sequence (0, . . . , 0, Fₙ, 0, . . .) ∈ A(H), then the spaces H^{⊗n} are orthogonal subspaces of A(H). As an alternative to writing F = (F₀, F₁, . . .) ∈ A(H), we shall write in the sequel

F = ⊕_{n=0}^{∞} Fₙ.   (8.20)

From (8.12) follows the simple identity H^{⊗m}⊗H^{⊗n} = H^{⊗(m+n)} for the tensor product of the Hilbert spaces H^{⊗m} and H^{⊗n}. It is straightforward to derive the identity

⟨F⊗G | F′⊗G′⟩ = ⟨F | F′⟩ ⟨G | G′⟩   (8.21)

for tensors F, F′ ∈ H^{⊗m} and G, G′ ∈ H^{⊗n}.


With the additional rule λ⊗Fₙ = Fₙ⊗λ = λFₙ for λ ∈ ℂ = H^{⊗0} and Fₙ ∈ H^{⊗n}, n ≥ 0, the tensor product can be extended to an associative product on the space A(H). For F = ⊕_{n=0}^{∞} Fₙ and G = ⊕_{n=0}^{∞} Gₙ with Fₙ, Gₙ ∈ H^{⊗n}, we define H = F⊗G by

H = ⊕_{n=0}^{∞} Hₙ with Hₙ = Σ_{k=0}^{n} F_k⊗G_{n−k}.   (8.22)

Since the sum terminates, there is no problem of convergence. With this definition, it is clear that the linear space A(H) becomes an algebra, and the unit of this algebra is 1 ∈ ℂ = H^{⊗0} ⊂ A(H).
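For a finite-dimensional H, the graded product (8.22) can be sketched directly: an element of the tensor algebra becomes a finite list of coefficient arrays indexed by degree, and F⊗G is a Cauchy convolution of the two lists, with the degree-wise products realized by `np.kron`. The representation below is my own, not the book's:

```python
import numpy as np

d = 2  # H = C^d; the degree-n component lives in C^(d**n), flattened via np.kron

def tensor_mul(F, G):
    """Graded product (8.22): H_n = sum_{k=0}^{n} F_k (x) G_{n-k}."""
    H = [np.zeros(d ** n, dtype=complex) for n in range(len(F) + len(G) - 1)]
    for k, Fk in enumerate(F):
        for l, Gl in enumerate(G):
            H[k + l] += np.kron(Fk, Gl)
    return H

f = np.array([1.0, 2.0]); g = np.array([0.5, -1.0])
F = [np.array([1.0]), f]    # F = 1 + f  (degree-0 part is a scalar, C^(d**0) = C)
G = [np.array([3.0]), g]    # G = 3 + g
H = tensor_mul(F, G)
assert np.allclose(H[0], 3.0)             # scalar part: 1 * 3
assert np.allclose(H[1], 3.0 * f + g)     # degree 1: 3f + g
assert np.allclose(H[2], np.kron(f, g))   # degree 2: f (x) g
```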

8.2.3 The Fock Space

The completion of the tensor space H^{⊗n} will be denoted H^{⊗̄n}, and the completion of the tensor algebra A(H) is the direct orthogonal sum

Γ(H) = ⊕_{n=0}^{∞} H^{⊗̄n}.   (8.23)

This space consists of all (finite and infinite) series

F = ⊕_{n=0}^{∞} Fₙ with Fₙ ∈ H^{⊗̄n} and Σ_{n=0}^{∞} ‖Fₙ‖² = ‖F‖² < ∞,   (8.24)

and is referred to as the Fock space. From (8.21) follows the norm identity

‖F_m⊗Gₙ‖ = ‖F_m‖ ‖Gₙ‖   (8.25)

for F_m ∈ H^{⊗m} and Gₙ ∈ H^{⊗n}. This implies that the tensor product of H^{⊗m} and H^{⊗n} is in fact continuous and may therefore be extended to the Hilbert spaces H^{⊗̄m} and H^{⊗̄n} by continuity. For arbitrary tensors F_m ∈ H^{⊗̄m} and Gₙ ∈ H^{⊗̄n}, we then obtain F_m⊗Gₙ ∈ H^{⊗̄(m+n)} with the norm identity (8.25). The tensor product is therefore defined on the algebraic sum of the completed spaces

Γ_fin(H) = ⊕_{n≥0} H^{⊗̄n},   (8.26)

and Γ_fin(H) is an algebra. Unfortunately, the tensor product (8.22) is not continuous in the norm (8.19) on A(H), and it has no continuous extension onto the whole Fock space Γ(H).²

² It is, however, possible to define other Hilbert norms such that the tensor product is continuous on the whole Fock space. See Kupsch and Smolyanov (2000).


As an example, take the Hilbert space H = L²(ℝ, dμ), where dμ is a positive measure on ℝ. Then the tensors in H^{⊗̄n}, n ∈ ℕ, are functions fₙ(x₁, . . . , xₙ) on ℝⁿ that are square integrable with respect to the product measure dⁿμ = dμ × · · · × dμ (n factors), i.e. fₙ ∈ L²(ℝⁿ, dⁿμ). An element of the Fock space Γ(H) is a sequence f = (f₀, f₁, . . .) of tensors f₀ ∈ ℂ and fₙ ∈ L²(ℝⁿ, dⁿμ) with a finite norm

‖f‖² = |f₀|² + Σ_{n=1}^{∞} ∫_{ℝⁿ} |fₙ(x₁, . . . , xₙ)|² dⁿμ.   (8.27)

8.3 Symmetric Tensors

8.3.1 The Fock Space of Symmetric Tensors

With Sₙ, we denote the set of all n! permutations of the numbers {1, 2, . . . , n}. For each σ ∈ Sₙ, the mapping f₁⊗f₂⊗ · · · ⊗fₙ ∈ H^{⊗n} → f_{σ(1)}⊗f_{σ(2)}⊗ · · · ⊗f_{σ(n)} ∈ H^{⊗n} is linear in each factor (in fact, it is unitary!). The symmetrization prescription

P_+^{(n)} f₁⊗f₂⊗ · · · ⊗fₙ ≔ (1/n!) Σ_{σ∈Sₙ} f_{σ(1)}⊗f_{σ(2)}⊗ · · · ⊗f_{σ(n)}   (8.28)

defines a linear operator on H^{⊗n}, n ≥ 2. (For n = 0, 1, we just take P_+^{(n)} to be the identity.) If τ ∈ Sₙ is a fixed element of Sₙ, the set {στ | σ ∈ Sₙ} is again Sₙ: the set of all permutations of {1, . . . , n} is the same as the set of all permutations of {τ(1), . . . , τ(n)} for any permutation τ. The operator (8.28) therefore satisfies

P_+^{(n)} f₁⊗f₂⊗ · · · ⊗fₙ = P_+^{(n)} f_{τ(1)}⊗f_{τ(2)}⊗ · · · ⊗f_{τ(n)}   (8.29)

for any permutation τ ∈ Sₙ.

Exercise Prove the identities

P_+^{(n)} (P_+^{(n)} f₁⊗f₂⊗ · · · ⊗fₙ) = P_+^{(n)} f₁⊗f₂⊗ · · · ⊗fₙ   (8.30)

and

⟨P_+^{(n)} f₁⊗f₂⊗ · · · ⊗fₙ | g₁⊗g₂⊗ · · · ⊗gₙ⟩ = ⟨f₁⊗f₂⊗ · · · ⊗fₙ | P_+^{(n)} g₁⊗g₂⊗ · · · ⊗gₙ⟩   (8.31)

for arbitrary f_k, g_k ∈ H, k = 1, . . . , n.
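The identities (8.30) and (8.31) of the exercise can also be checked numerically: representing a tensor of degree n as an n-index array, P_+^{(n)} becomes an average over axis permutations. This is my own sketch, not a solution given in the book.

```python
import itertools
import numpy as np

def symmetrize(T):
    """P_+^{(n)} acting on an n-index array, as in (8.28)."""
    perms = list(itertools.permutations(range(T.ndim)))
    return sum(np.transpose(T, p) for p in perms) / len(perms)

rng = np.random.default_rng(4)
fs = [rng.normal(size=3) + 1j * rng.normal(size=3) for _ in range(3)]
T = np.einsum('i,j,k->ijk', *fs)     # f1 (x) f2 (x) f3 as a 3-index array

S = symmetrize(T)
# (8.30): P_+ is idempotent
assert np.allclose(symmetrize(S), S)
# (8.31): P_+ is self-adjoint, <P_+ T | U> = <T | P_+ U>
U = rng.normal(size=(3, 3, 3)) + 1j * rng.normal(size=(3, 3, 3))
assert np.allclose(np.vdot(S, U), np.vdot(T, symmetrize(U)))
```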


By linearity, these identities imply the operator identities

(P_+^{(n)})² = P_+^{(n)} and (P_+^{(n)})* = P_+^{(n)}   (8.32)

on H^{⊗n}. Hence P_+^{(n)} is a continuous³ projection operator on H^{⊗n} and can be extended to a continuous orthogonal projection operator on H^{⊗̄n}. The image

Γ_n^+(H) ≔ P_+^{(n)} H^{⊗̄n} ⊂ H^{⊗̄n}   (8.33)

is a closed subspace of H^{⊗̄n}, and the elements F ∈ Γ_n^+(H) are the symmetric tensors of degree n.

With P_+^{(0)} λ ≔ λ ∈ ℂ = H^{⊗0} and P_+^{(1)} f ≔ f ∈ H = H^{⊗1}, the operators P_+^{(n)} are defined on all the spaces H^{⊗̄n}, n = 0, 1, 2, . . .. We now define a projection operator P_+ on the Fock space Γ(H) by

F = ⊕_{n=0}^{∞} Fₙ → P_+F ≔ ⊕_{n=0}^{∞} P_+^{(n)} Fₙ if Fₙ ∈ H^{⊗̄n}.   (8.34)

The restriction of this operator to H^{⊗̄n} agrees with P_+^{(n)}. The closed linear subspace

Γ^+(H) ≔ P_+ Γ(H) = ⊕_{n=0}^{∞} Γ_n^+(H)   (8.35)

is the Fock space of symmetric tensors or the Boson Fock space.

As an example, take the Hilbert space H = L²(ℝ, dμ) as in the example of Section 8.2.3. Then the tensors in Γ_n^+(H), n ∈ ℕ, are functions fₙ⁺(x₁, . . . , xₙ) in L²(ℝⁿ, dⁿμ) that are symmetric with respect to the exchange of the variables x_j ↔ x_k. An element of the Fock space Γ^+(H) is a sequence f = (f₀, f₁, . . .) of tensors f₀ ∈ ℂ and symmetric functions fₙ ∈ L²(ℝⁿ, dⁿμ) with a finite norm (8.27).

8.3.2 The Algebra of Symmetric Tensors

The algebraic subspace of symmetric tensors of degree n is denoted by A_n^+(H), that is, A_n^+(H) = P_+^{(n)} H^{⊗n} = A(H) ∩ Γ_n^+(H) ⊂ H^{⊗n}. The tensor product (8.22) can be defined for arbitrary tensors in the space

A^+(H) ≔ P_+ A(H) = ⊕_{n≥0} A_n^+(H) = A(H) ∩ Γ^+(H).   (8.36)

³ The continuity follows from the observation that ‖PₛF‖² = ⟨PₛF | PₛF⟩ = ⟨Pₛ²F | F⟩ = ⟨PₛF | F⟩ ≤ ‖PₛF‖ ‖F‖. Hence ‖PₛF‖ ≤ ‖F‖ for all F ∈ H^{⊗n}.


Since F⊗G is in general not an element of A^+(H) for arbitrary F, G ∈ A^+(H), we must define

F, G ∈ A^+(H) → P_+(F⊗G) ∈ A^+(H)   (8.37)

to get a bilinear product within A^+(H), which – as a consequence of (8.29) – is symmetric:

P_+(F⊗G) ≡ P_+(G⊗F).   (8.38)

Exercise Derive the identity

P_+(P_+F ⊗ P_+G) = P_+(F⊗G)   (8.39)

for tensors F ∈ H^{⊗m} and G ∈ H^{⊗n} and arbitrary numbers m, n ∈ ℕ.

As a consequence of (8.39), the product (8.37) is associative. The standard definition of the symmetric tensor product is F ∨ G, defined by

F ∈ A_m^+, G ∈ A_n^+ → F ∨ G = √((m + n)!/(m! n!)) P_+(F⊗G) ∈ A_{m+n}^+.   (8.40)

There is a good reason for the combinatorial coefficient, which we now explain. In principle, we could take any sequence of nonzero complex numbers (c(n))ₙ and then define a product for symmetric tensors of degree m and n by

F ∈ A_m^+, G ∈ A_n^+ → F ◦ G = (c(m + n)/(c(m)c(n))) P_+(F⊗G) ∈ A_{m+n}^+,   (8.41)

and extend this definition by linearity to a product on A^+(H). Let us first prove the following general result.

Proposition 8.3.1 The bilinear extension of the product (8.41) is an associative symmetric product on A^+(H).

Proof We give an indication of the proof. Take F_m ∈ A_m^+(H), G_n ∈ A_n^+(H), and H_k ∈ A_k^+(H); then

(F_m ◦ G_n) ◦ H_k = (c(m + n + k)/(c(m + n)c(k))) (c(m + n)/(c(m)c(n))) P_+(F_m⊗G_n⊗H_k)   (8.42)
= (c(m + n + k)/(c(m)c(n)c(k))) P_+(F_m⊗G_n⊗H_k)   (8.43)

and

F_m ◦ (G_n ◦ H_k) = (c(m + n + k)/(c(m)c(n + k))) (c(n + k)/(c(n)c(k))) P_+(F_m⊗G_n⊗H_k)   (8.44)
= (c(m + n + k)/(c(m)c(n)c(k))) P_+(F_m⊗G_n⊗H_k).   (8.45)


Hence the product is associative for tensors of fixed degree. The statement for general tensors follows by linearity. The symmetry follows from (8.38).

With each of these definitions, the product (8.41) is symmetric and A^+(H) is an associative algebra that is generated by vectors f ∈ H. Our specification c(n) = √(n!), however, leads to the following exceptional property relating to orthogonal subspaces.

Lemma 8.3.2 If H₁ and H₂ are two orthogonal subspaces of H, then the inner product of tensors F₁ ∨ F₂ and G₁ ∨ G₂ with F_k, G_k ∈ A^+(H_k), k = 1, 2, factorizes into

⟨F₁ ∨ F₂ | G₁ ∨ G₂⟩ = ⟨F₁ | G₁⟩ ⟨F₂ | G₂⟩.   (8.46)

Proof The proof is first given for tensors of fixed degree F_{km} ∈ A_m^+(H_k), G_{kn} ∈ A_n^+(H_k). The inner product (8.18) of these tensors is

⟨F_{1m} ∨ F_{2n} | G_{1p} ∨ G_{2q}⟩ = √((m + n)!/(m! n!)) √((p + q)!/(p! q!)) ⟨F_{1m}⊗F_{2n} | P_+(G_{1p}⊗G_{2q})⟩.   (8.47)

Thereby, we have used the definition (8.40) of the product and the properties P_+* = P_+ = P_+² of the projection operator. Since F_{1m} ∨ F_{2n} ∈ A_{m+n}^+(H) and G_{1p} ∨ G_{2q} ∈ A_{p+q}^+(H), the inner product vanishes unless m + n = p + q. Due to linearity in the right factor, it is sufficient to continue the calculation with decomposable tensors G_{1p} = P_+^{(p)} g₁⊗ · · · ⊗g_p, g_j ∈ H₁, and G_{2q} = P_+^{(q)} g_{p+1}⊗ · · · ⊗g_{p+q}, g_{p+j} ∈ H₂. Then we can use the explicit formula (8.28) for the projection operators. The factorization (8.21) yields that the inner product (8.47) vanishes unless both conditions m = p and n = q are satisfied. Moreover, we obtain the identities

⟨F_{1m} ∨ F_{2n} | G_{1m} ∨ G_{2n}⟩
= ((m + n)!/(m! n!)) ⟨F_{1m}⊗F_{2n} | P_+^{(m+n)} (g₁⊗ · · · ⊗g_{m+n})⟩
= (1/(m! n!)) Σ_{σ∈S_{m+n}} ⟨F_{1m}⊗F_{2n} | g_{σ(1)}⊗ · · · ⊗g_{σ(m+n)}⟩
= (1/(m! n!)) Σ_{σ∈S_m} ⟨F_{1m} | g_{σ(1)}⊗ · · · ⊗g_{σ(m)}⟩ × Σ_{τ∈S_n} ⟨F_{2n} | g_{m+τ(1)}⊗ · · · ⊗g_{m+τ(n)}⟩

= ⟨F_{1m} | P_+^{(m)} (g₁⊗ · · · ⊗g_m)⟩ ⟨F_{2n} | P_+^{(n)} (g_{m+1}⊗ · · · ⊗g_{m+n})⟩
= ⟨F_{1m} | G_{1m}⟩ ⟨F_{2n} | G_{2n}⟩.   (8.48)

Hence (8.46) is true for tensors of fixed degree.

For general elements of A^+(H_k), we have F_k = ⊕_m F_{km} and G_k = ⊕_n G_{kn} with F_{km} ∈ A_m^+(H_k) and G_{kn} ∈ A_n^+(H_k), and the left side of (8.46) is

⟨F₁ ∨ F₂ | G₁ ∨ G₂⟩ = Σ_{m,n,p,q} ⟨F_{1m} ∨ F_{2n} | G_{1p} ∨ G_{2q}⟩
= Σ_{m,n} ⟨F_{1m} ∨ F_{2n} | G_{1m} ∨ G_{2n}⟩
= Σ_{m,n} ⟨F_{1m} | G_{1m}⟩ ⟨F_{2n} | G_{2n}⟩.   (8.49)

But this result agrees with

⟨F₁ | G₁⟩ ⟨F₂ | G₂⟩ = (Σ_m ⟨F_{1m} | G_{1m}⟩) (Σ_n ⟨F_{2n} | G_{2n}⟩),

and (8.46) is true for all tensors.

If we start with the general definition (8.41) of the product, the result (8.48) can only be derived if

c(m + n)/(c(m)c(n)) = √((m + n)!/(m! n!))

is satisfied. The product (8.40) is therefore uniquely determined by the factorization property.

From (8.40), we obtain 1 ∨ F = F ∨ 1 = F for all F ∈ A^+(H). Hence A^+(H) is an algebra with a unit, and the element 1 ∈ ℂ ⊂ A^+(H) is the unit of the algebra. This unit is often called the vacuum vector and written as Ω or |0⟩.

From the definition (8.40) and the norm identity (8.25), we obtain

‖F_m ∨ G_n‖ ≤ √((m + n)!/(m! n!)) ‖F_m‖ ‖G_n‖ ≤ 2^{(m+n)/2} ‖F_m‖ ‖G_n‖   (8.50)

for F_m ∈ A_m^+(H) and G_n ∈ A_n^+(H). Hence the tensor product of A_m^+(H) and A_n^+(H) can be extended to the Hilbert spaces Γ_m^+(H) and Γ_n^+(H) by continuity. For arbitrary tensors F_m ∈ Γ_m^+(H) and G_n ∈ Γ_n^+(H), we then obtain a tensor F_m ∨ G_n ∈ Γ_{m+n}^+(H) with the norm estimate (8.50). The space Γ_fin^+(H) = Γ^+(H) ∩ Γ_fin(H) is therefore an algebra. But the symmetric tensor product is not continuous on A^+(H), and it has no continuous extension onto the whole Boson Fock space Γ^+(H).
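The factorization (8.46) that singles out c(n) = √(n!) can be tested numerically for a small example. The sketch below (all helper names are my own, not the book's) implements the product (8.40) on coefficient arrays and takes H₁, H₂ to be orthogonal coordinate subspaces of ℂ⁴:

```python
import itertools, math
import numpy as np

def sym(T):
    """Symmetrizer P_+ on an n-index array."""
    perms = list(itertools.permutations(range(T.ndim)))
    return sum(np.transpose(T, p) for p in perms) / len(perms)

def vee(F, G):
    """F v G = sqrt((m+n)!/(m! n!)) P_+ (F (x) G), definition (8.40)."""
    m, n = F.ndim, G.ndim
    c = math.sqrt(math.factorial(m + n) / (math.factorial(m) * math.factorial(n)))
    return c * sym(np.multiply.outer(F, G))

e = np.eye(4)          # H = C^4; H1 = span(e0, e1), H2 = span(e2, e3)
F1 = vee(e[0], e[0] + 2 * e[1]); G1 = vee(e[1], e[0] - e[1])   # in A+(H1)
F2 = vee(e[2], e[3]);            G2 = vee(e[2] + e[3], e[3])   # in A+(H2)

lhs = np.vdot(vee(F1, F2), vee(G1, G2))
rhs = np.vdot(F1, G1) * np.vdot(F2, G2)
assert np.allclose(lhs, rhs)     # factorization property (8.46)
```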


We now derive some useful results for decomposable tensors. The tensors f₁ ∨ · · · ∨ f_m = √(m!) P_+ f₁⊗ · · · ⊗f_m and g₁ ∨ · · · ∨ g_n = √(n!) P_+ g₁⊗ · · · ⊗g_n with f_j, g_k ∈ H have, for m = n, the inner product

⟨f₁ ∨ · · · ∨ f_m | g₁ ∨ · · · ∨ g_m⟩ = m! ⟨f₁⊗ · · · ⊗f_m | P_+ g₁⊗ · · · ⊗g_m⟩
= Σ_{σ∈S_m} ⟨f₁⊗ · · · ⊗f_m | g_{σ(1)}⊗ · · · ⊗g_{σ(m)}⟩
= Σ_{σ∈S_m} ∏_{j=1}^{m} ⟨f_j | g_{σ(j)}⟩.

Hence the general rule is

⟨f₁ ∨ · · · ∨ f_m | g₁ ∨ · · · ∨ g_n⟩ = 0 if m ≠ n, and = per(⟨f_j | g_k⟩) if m = n,   (8.51)

where we have used the permanent of a matrix A = (α_{jk})_{j,k=1,...,n},

per A ≔ Σ_{σ∈S_n} ∏_{j=1}^{n} α_{jσ(j)}.   (8.52)

A simple consequence of (8.51) is

‖f^{∨n}‖² = n! ‖f‖^{2n}.   (8.53)
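The permanent rule (8.51) and the norm formula (8.53) lend themselves to a direct numerical check. The helper `sym_prod` below (my own sketch, not from the book) realizes f₁ ∨ · · · ∨ fₙ = √(n!) P_+(f₁⊗ · · · ⊗fₙ) as an n-index array:

```python
import itertools, math
import numpy as np

def permanent(A):
    """per A = sum over sigma of prod_j A[j, sigma(j)], definition (8.52)."""
    n = A.shape[0]
    return sum(np.prod([A[j, s[j]] for j in range(n)])
               for s in itertools.permutations(range(n)))

def sym_prod(vectors):
    """f1 v ... v fn = sqrt(n!) P_+ (f1 (x) ... (x) fn) as an n-index array."""
    T = vectors[0]
    for v in vectors[1:]:
        T = np.multiply.outer(T, v)
    perms = list(itertools.permutations(range(T.ndim)))
    return math.sqrt(len(perms)) * sum(np.transpose(T, p) for p in perms) / len(perms)

rng = np.random.default_rng(5)
fs = [rng.normal(size=3) + 1j * rng.normal(size=3) for _ in range(3)]
gs = [rng.normal(size=3) + 1j * rng.normal(size=3) for _ in range(3)]

G = np.array([[np.vdot(f, g) for g in gs] for f in fs])   # matrix <f_j | g_k>
assert np.allclose(np.vdot(sym_prod(fs), sym_prod(gs)), permanent(G))   # (8.51)
assert np.isclose(np.vdot(sym_prod([fs[0]] * 3), sym_prod([fs[0]] * 3)).real,
                  math.factorial(3) * np.vdot(fs[0], fs[0]).real ** 3)  # (8.53)
```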

We now return to the original occurrence of Fock space in Chapter 4 as the Guichardet space L²(Power(X), dX) over a one-particle space L²(X, dx). In fact, we have the straightforward identification

Γ^+(L²(X, dx)) ≅ L²(Power(X), dX),

which is given by the unitary map

F = ⊕_n Fₙ ∈ Γ^+(L²(X, dx)) → F̃ ∈ L²(Power(X), dX),

where

F̃({x₁, . . . , xₙ}) = √(n!) Fₙ(x₁, . . . , xₙ).

As the reader may suspect, the √(n!) factors play a combinatorial role. Indeed, the unitary map carries the symmetric tensor product into the product

F ∨ G → F̃ ⋆ G̃, (F̃ ⋆ G̃)(X) = Σ_{X₁+X₂=X} F̃(X₁) G̃(X₂).

The previous lemma can then be alternatively proved as follows: assume that F_k, G_k ∈ A^+(H_k) for k = 1, 2 with H₁ and H₂ orthogonal subspaces; then


⟨F̃₁ ⋆ F̃₂ | G̃₁ ⋆ G̃₂⟩ = ∫ (Σ_{X₁+X₂=X} F̃₁(X₁)* F̃₂(X₂)*) (Σ_{Y₁+Y₂=X} G̃₁(Y₁) G̃₂(Y₂)) dX
= ∫ Σ_{X₁+X₂=X} F̃₁(X₁)* G̃₁(X₁) F̃₂(X₂)* G̃₂(X₂) dX,

where we used the orthogonality of H₁ and H₂ to show that the two decompositions of X had to coincide. We can then use Lemma 4.1.3 to show that the last term equals

∫∫ F̃₁(X₁)* G̃₁(X₁) F̃₂(X₂)* G̃₂(X₂) dX₁ dX₂ = ⟨F̃₁ | G̃₁⟩ ⟨F̃₂ | G̃₂⟩.

If D is a linear subset of H, we write A^+(D) for the subalgebra of A^+(H) that is generated by vectors in D (i.e., A^+(D) is spanned by symmetric tensor products of vectors in D).

For a more precise estimate of the symmetric tensor product, we introduce the following family of Hilbert norms for a tensor F = ⊕_{p=0}^{∞} F_p with F_p ∈ Γ_p^+(H):

‖F‖_{(α)}² = Σ_{p=0}^{∞} (p!)^α ‖F_p‖_p², α ≥ 0.   (8.54)

The completion of Γ_fin^+(H) with the norm ‖F‖_{(α)} is called the Fock scale Γ_{(α)}^+(H). These spaces have the inclusions Γ_fin^+(H) ⊂ Γ_{(α)}^+(H) ⊂ Γ_{(β)}^+(H) ⊂ Γ^+(H) if α ≥ β ≥ 0, with Γ_{(0)}^+(H) = Γ^+(H). The symmetric tensor product on Γ_fin^+(H) can be extended to a continuous mapping from Γ_{(α)}^+(H) × Γ_{(α)}^+(H) into Γ_{(β)}^+(H) if α > β; see appendix A of Kupsch and Smolyanov (1998) or Kupsch and Smolyanov (2000). Hence the linear space Γ_>^+(H) = ∪_{α>0} Γ_{(α)}^+(H) ⊂ Γ^+(H) is an algebra with the symmetric tensor product.

8.3.3 Basis Systems

Let e_k, k ∈ K ⊂ ℕ₊, be an ON basis of H.⁴ Let n = (n₁, n₂, . . .) ∈ (ℕ₊)^K be an occupation number sequence with N(n) = Σ_{k∈K} n_k. If N(n) = n, then we may define a symmetric tensor of degree n by

F₊(n) ≔ e₁ ∨ · · · ∨ e₁ ∨ e₂ ∨ · · · ∨ e₂ ∨ · · · (n₁ factors e₁, n₂ factors e₂, . . .)
= e₁^{∨n₁} ∨ e₂^{∨n₂} ∨ · · · .   (8.55)

⁴ If dim H = ∞, then K is the set ℕ. If dim H = K < ∞, the index set is {1, 2, . . . , K}.


Only a finite number of occupation numbers, at most n, can be larger than zero. A factor with n_k = 0 is, of course, understood as e_k^{∨0} = 1. From (8.51), we obtain that the tensors F₊(m₁, m₂, . . .) and F₊(n₁, n₂, . . .) are orthogonal if the sequences (m₁, m₂, . . .) and (n₁, n₂, . . .) differ. The norm is ‖F₊(n₁, n₂, . . .)‖² = ∏_{k∈K} n_k!; see (8.53).

Since the linear span of the tensors e_{k₁} ∨ · · · ∨ e_{kₙ} with (k₁, . . . , kₙ) ∈ K^n is dense in Γ_n^+(H), and any tensor e_{k₁} ∨ · · · ∨ e_{kₙ} coincides with one of the tensors F₊(n₁, n₂, . . .), we see that the tensors {F₊(n₁, n₂, . . .) | n_k ∈ ℕ₊, Σ_{k∈K} n_k = n} form an orthogonal basis of Γ_n^+(H). An ON basis of Γ_n^+(H) is therefore given by

e₊(n) ≔ (∏_{k∈K} n_k!)^{−1/2} e₁^{∨n₁} ∨ e₂^{∨n₂} ∨ · · ·   (8.56)

with the occupation numbers satisfying N(n) = n. The whole Fock space Γ^+(H) = ⊕_{n=0}^{∞} Γ_n^+(H) has the ON basis e₊(n₁, n₂, . . .) with n_k ∈ ℕ₊ and Σ_{k∈K} n_k < ∞. That is, the index set of this basis is the set of all terminating sequences n = (n₁, n₂, . . .) ∈ (ℕ₊)^K. The ON basis for the Boson Fock space constructed in this manner is labeled by the (occupation numbers of) bags drawn from K.

If 1 ≤ dim H = K < ∞, the tensor spaces Γ_n^+(H) = A_n^+(H) have the finite dimension dim A_n^+(H) = C(K + n − 1, n), n ≥ 0. But the Fock space always has infinite dimension.
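The dimension count dim A_n^+(H) = C(K + n − 1, n) is exactly the number of occupation sequences with N(n) = n, which can be verified by brute-force enumeration (a small sketch of my own, not from the book):

```python
import itertools
from math import comb

K, n = 4, 3   # dim H = K, tensor degree n

# Occupation sequences (n1, ..., nK) with sum = n label the ON basis (8.56)
occ = [m for m in itertools.product(range(n + 1), repeat=K) if sum(m) == n]
assert len(occ) == comb(K + n - 1, n)   # dim A_n^+(H) = C(K + n - 1, n)
```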

8.4 Antisymmetric Tensors

8.4.1 Fock Space of Antisymmetric Tensors

For a permutation σ ∈ Sₙ, we define the number [σ] by [σ] = 0 if σ is an even permutation and [σ] = 1 for odd permutations. The linear mapping

P_−^{(n)} f₁⊗ · · · ⊗fₙ ≔ (1/n!) Σ_{σ∈Sₙ} (−1)^{[σ]} f_{σ(1)}⊗f_{σ(2)}⊗ · · · ⊗f_{σ(n)}   (8.57)

with n ∈ ℕ defines a linear operator on the tensor space H^{⊗n}, which is antisymmetric under the exchange of any two of the vectors f_k. More precisely, since (−1)^{[τσ]} = (−1)^{[τ]} (−1)^{[σ]} for τ, σ ∈ Sₙ, we obtain

P_−^{(n)} f₁⊗f₂⊗ · · · ⊗fₙ = (−1)^{[τ]} P_−^{(n)} f_{τ(1)}⊗f_{τ(2)}⊗ · · · ⊗f_{τ(n)}   (8.58)

for any permutation τ ∈ Sₙ.


Exercise Prove the identities

P_−^{(n)} (P_−^{(n)} f₁⊗f₂⊗ · · · ⊗fₙ) = P_−^{(n)} f₁⊗f₂⊗ · · · ⊗fₙ   (8.59)

and

⟨P_−^{(n)} f₁⊗f₂⊗ · · · ⊗fₙ | g₁⊗g₂⊗ · · · ⊗gₙ⟩ = ⟨f₁⊗f₂⊗ · · · ⊗fₙ | P_−^{(n)} g₁⊗g₂⊗ · · · ⊗gₙ⟩   (8.60)

for arbitrary f_k, g_k ∈ H, k = 1, . . . , n. By linearity, these identities imply the operator identities

(P_−^{(n)})² = P_−^{(n)} and (P_−^{(n)})* = P_−^{(n)}   (8.61)

n− (H)  P− H⊗n ⊂ H⊗n (n)

(8.62)

is a closed subspace of H⊗n , and the elements F ∈ n− (H) are the antisymmetric tensors of degree n ∈ N. (0) (1) With P− λ  λ ∈ C = 0− (H) and P− f  f ∈ H = 1− (H), we now define a projection operator P− on the Fock space (H) by F=

∞ 7 n=0

Fn → P− F 

∞ 7

with Fn ∈ H⊗n

(n)

P− Fn ,

(8.63)

n=0

The restriction of this operator to H⊗n agrees with P− . The closed linear subspace (n)

 − (H)  P− (H) =

∞ 7

P− H⊗n (n)

(8.64)

n=0

is the Fock space of antisymmetric tensors or the Fermion Fock space. As an example, take the Hilbert space H = L2 (R, dμ) as in the Fock example. Then the tensors in n− (H), n ∈ N, are functions fn− (x1 , . . . , xn ) ∈ L2 (Rn , dn μ) that are antisymmetric with respect to the exchange of the variables xj ↔ xk . An element of the Fock space  + (H) is a sequence f = {f0 , f1 , . . .} of tensors f0 ∈ C and antisymmetric functions fn ∈ L2 (Rn , dn μ) with a finite norm (8.27).

8.4 Antisymmetric Tensors

153

8.4.2 The Grassmann Algebra

The algebraic subspace of antisymmetric tensors of degree n is denoted by A_n^−(H), i.e. A_n^−(H) = P_−^{(n)} H^{⊗n} = A(H) ∩ Γ_n^−(H) ⊂ H^{⊗n}. The tensor product (8.22) can be defined for arbitrary tensors of the space

A^−(H) ≔ P_− A(H) = ⊕_{n≥0} A_n^−(H) = A(H) ∩ Γ^−(H),   (8.65)

and the bilinear mapping

F, G ∈ A^−(H) → P_−(F⊗G) ∈ A^−(H)   (8.66)

is a product within A^−(H). The antisymmetrization operator satisfies the identity

P_−(P_−F ⊗ P_−G) = P_−(F⊗G)   (8.67)

for tensors F ∈ H^{⊗m} and G ∈ H^{⊗n} and m, n ∈ ℕ. For the proof of this statement, see e.g. Dieudonné (1972). As a consequence of (8.67), the product (8.66) is associative.

Similar to the symmetric case, one can define many associative products on A^−(H), but there is exactly one – called the exterior product or Grassmann product F ∧ G – that has the desired factorization property. This exterior product is defined by

F ∈ A_m^−, G ∈ A_n^− → F ∧ G = √((m + n)!/(m! n!)) P_−(F⊗G) ∈ A_{m+n}^−   (8.68)

for tensors of fixed degree m, n = 0, 1, . . ., and it is extended to a product on A^−(H) by linearity. The proof of the associativity follows as in Proposition 8.3.1 for the symmetric tensor algebra. Moreover, the product (8.68) satisfies the factorization property given in equation (8.69).

Lemma 8.4.1 If H₁ and H₂ are two orthogonal subspaces of H, then the inner product of the tensors F₁ ∧ F₂ and G₁ ∧ G₂ with F_k, G_k ∈ A^−(H_k), k = 1, 2, factorizes into

⟨F₁ ∧ F₂ | G₁ ∧ G₂⟩ = ⟨F₁ | G₁⟩ ⟨F₂ | G₂⟩.   (8.69)

This identity can be derived with essentially the same arguments as given in the proof of Lemma 8.3.2, but one can also use identities for determinants; see the appendix in Dieudonné (1972).

The space A^−(H) with the product (8.68) is called the Grassmann algebra. Its unit is the vector 1 ∈ ℂ = A₀^−(H). As a consequence of (8.58), the exterior product satisfies the rule

F ∧ G = (−1)^{mn} G ∧ F   (8.70)


if F ∈ A_m^− and G ∈ A_n^−. For vectors in H, this product is antisymmetric, f ∧ g = −g ∧ f, and all powers f ∧ f, f ∧ f ∧ f, and so on, of a vector f ∈ H vanish.

From the definition (8.68) and the norm identity (8.25), we obtain

‖F_m ∧ G_n‖ ≤ √((m + n)!/(m! n!)) ‖F_m‖ ‖G_n‖   (8.71)

for F_m ∈ A_m^−(H) and G_n ∈ A_n^−(H). Hence the tensor product of A_m^−(H) and A_n^−(H) is continuous, and it can be extended to the Hilbert spaces Γ_m^−(H) and Γ_n^−(H) by continuity. For arbitrary tensors F_m ∈ Γ_m^−(H) and G_n ∈ Γ_n^−(H), we then obtain F_m ∧ G_n ∈ Γ_{m+n}^−(H) with the norm estimate (8.71). But the exterior product is not continuous on A^−(H), and it has no continuous extension onto the whole Fock space Γ^−(H).

As in the case of symmetric tensors, we derive some useful results for decomposable antisymmetric tensors. The tensors f₁ ∧ · · · ∧ f_m = √(m!) P_− f₁⊗ · · · ⊗f_m and g₁ ∧ · · · ∧ g_n = √(n!) P_− g₁⊗ · · · ⊗g_n with f_j, g_k ∈ H have the inner product ⟨f₁ ∧ · · · ∧ f_m | g₁ ∧ · · · ∧ g_n⟩ = 0 if m ≠ n and

⟨f₁ ∧ · · · ∧ fₙ | g₁ ∧ · · · ∧ gₙ⟩ = n! ⟨f₁⊗ · · · ⊗fₙ | P_− g₁⊗ · · · ⊗gₙ⟩
= Σ_{σ∈Sₙ} (−1)^{[σ]} ⟨f₁⊗ · · · ⊗fₙ | g_{σ(1)}⊗ · · · ⊗g_{σ(n)}⟩
= Σ_{σ∈Sₙ} (−1)^{[σ]} ∏_{j=1}^{n} ⟨f_j | g_{σ(j)}⟩.

Hence the general rule is

⟨f₁ ∧ · · · ∧ f_m | g₁ ∧ · · · ∧ g_n⟩ = 0 if m ≠ n, and = det(⟨f_j | g_k⟩) if m = n.   (8.72)

As in the case of the symmetric tensor algebra, there are other normalizations of the inner product and of the exterior product. But these definitions should finally lead to (8.72).
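The determinant rule (8.72), and the vanishing of powers f ∧ f, can be checked numerically with an explicit antisymmetrizer (a sketch with my own helper names, not from the book):

```python
import itertools, math
import numpy as np

def sign(p):
    """Sign of a permutation, computed from its number of inversions."""
    inv = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inv % 2 else 1

def wedge(vectors):
    """f1 ^ ... ^ fn = sqrt(n!) P_- (f1 (x) ... (x) fn) as an n-index array."""
    T = vectors[0]
    for v in vectors[1:]:
        T = np.multiply.outer(T, v)
    perms = list(itertools.permutations(range(T.ndim)))
    return sum(sign(p) * np.transpose(T, p) for p in perms) / math.sqrt(len(perms))

rng = np.random.default_rng(6)
fs = [rng.normal(size=3) + 1j * rng.normal(size=3) for _ in range(3)]
gs = [rng.normal(size=3) + 1j * rng.normal(size=3) for _ in range(3)]

G = np.array([[np.vdot(f, g) for g in gs] for f in fs])
# (8.72): <f1 ^ f2 ^ f3 | g1 ^ g2 ^ g3> = det <f_j | g_k>
assert np.allclose(np.vdot(wedge(fs), wedge(gs)), np.linalg.det(G))
# powers f ^ f vanish
assert np.allclose(wedge([fs[0], fs[0]]), 0)
```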

8.4.3 Basis Systems

If e_k, k ∈ K ⊂ ℕ, is an ON basis of H, the nonvanishing exterior products of n basis vectors are the tensors ±e_{k₁} ∧ e_{k₂} ∧ · · · ∧ e_{kₙ}, where the indices k₁, k₂, . . . , kₙ are n different numbers. If J ⊂ K is a subset of |J| = n elements {k₁, k₂, . . . , kₙ}, one can uniquely order these numbers, k₁ < · · · < kₙ. Then we use the notation e_J ≔ e_{k₁} ∧ e_{k₂} ∧ · · · ∧ e_{kₙ}. As a consequence of (8.72), these tensors are normalized, ‖e_J‖ = 1, and ⟨e_J | e_{J′}⟩ = 0 if J ≠ J′. Hence the tensors {e_J | J ⊂ K, |J| = n} form an ON basis of Γ_n^−(H). With the notation


e_∅ = 1 ∈ ℂ for the basis vector of Γ₀^−(H), the Fock space of antisymmetric tensors Γ^−(H) = ⊕_{n=0}^{∞} Γ_n^−(H) has the basis {e_J | J ⊂ K, |J| < ∞}. The index set of this basis is the powerset Power(K), i.e. the set of all finite subsets of K. If dim H = d < ∞, the set K = {1, 2, . . . , d} has d elements, and the set of all subsets J ⊂ K with |J| = n has C(d, n) elements. The space of antisymmetric tensors of degree n therefore has the dimension dim Γ_n^−(H) = C(d, n) if 0 ≤ n ≤ d, and dim Γ_n^−(H) = 0 if n > d. Hence the Fock space has the finite dimension dim Γ^−(H) = 2^d.

For many calculations, it is convenient to use a notation based on occupation numbers, as in the case of symmetric tensors. With the definitions e_k^{∧0} = 1, e_k^{∧1} = e_k, we can write the basis e_J in the form

e₋(n₁, n₂, . . .) ≔ e₁^{∧n₁} ∧ e₂^{∧n₂} ∧ · · · ,

(8.73)

where the antisymmetry restricts the occupation numbers to

n_k ∈ ℕ₋ ≔ {0, 1}.   (8.74)

In this notation, the basis of Γ_n^−(H) is

e₋(n₁, n₂, . . .) with (n₁, n₂, . . .) ∈ (ℕ₋)^∞, Σ_{k∈K} n_k = n.   (8.75)
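The subset labeling of the Fermion basis described above, and the resulting dimensions C(d, n) and 2^d, can be confirmed by enumeration (a small sketch of my own, not from the book):

```python
import itertools
from math import comb

d = 5   # dim H = d
# Fermion basis vectors e_J are labeled by the subsets J of {1, ..., d}
subsets = [J for r in range(d + 1) for J in itertools.combinations(range(d), r)]
assert len(subsets) == 2 ** d                                  # dim of Fock space
assert sum(1 for J in subsets if len(J) == 2) == comb(d, 2)    # degree-2 subspace
```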

The basis of the Fock space Γ^−(H) is given by (8.73) with the set of all terminating sequences (n₁, n₂, . . .) ∈ (ℕ₋)^K as the index set.

The norm estimate (8.71) implies that the exterior product is well defined on the linear span of the Hilbert spaces Γ_n^−(H), n ≥ 0. Hence the space Γ_fin^−(H) = Γ^−(H) ∩ Γ_fin(H) is an algebra with the exterior product. If dim H is infinite, the algebra Γ_fin^−(H) is strictly larger than A^−(H).

The ON basis for the Fermion Fock space may alternatively be labeled by the finite subsets of K, as each state e_k may occur at most once. The Fermion Fock space is finite dimensional whenever its one-particle space is finite dimensional. However, when the one-particle space H is infinite dimensional, the Boson and Fermion Fock spaces over H are both infinite dimensional, and so are isomorphic as Hilbert spaces. Indeed, one may construct unitary maps between the Boson and Fermion Fock spaces in this case.

As an example, we take the Fock spaces Γ^+(H) and Γ^−(H) over H = L²(ℝ, dμ), where dμ is a nonatomic measure on ℝ. The tensors in Γ_n^±(H), n ∈ ℕ, are functions fₙ^±(x₁, . . . , xₙ) ∈ L²(ℝⁿ, dⁿμ) that are symmetric/antisymmetric in the variables (x₁, . . . , xₙ) ∈ ℝⁿ; see the corresponding examples for the symmetric and antisymmetric Fock spaces. Then Γ₀^+(H) ≅ Γ₀^−(H) ≅ ℂ, and the tensor spaces Γ_n^±(H), n ∈ ℕ, are isomorphic


with the simple identification fₙ⁺ ≅ fₙ⁻ if fₙ⁺(x₁, . . . , xₙ) = fₙ⁻(x₁, . . . , xₙ) for x₁ < x₂ < · · · < xₙ. The diagonals x_j = x_k do not count, since the measure is nonatomic. These isometric isomorphisms imply an isometric isomorphism between the Fock spaces Γ^+(H) and Γ^−(H). In both cases of symmetric and antisymmetric tensors, the norm (8.27) can be written as

‖f‖² = |f₀|² + Σ_{n=1}^{∞} n! ∫_{x₁<···<xₙ} |fₙ(x₁, . . . , xₙ)|² dⁿμ.   (8.76)

The exponential vectors are obviously elements of the spaces Γ_{(α)}^+(H), 0 ≤ α < 1, and of the algebra Γ_>^+(H) = ∪_{α>0} Γ_{(α)}^+(H) ⊂ Γ^+(H). The inner product of two exponential vectors follows from (9.64) by polarization:

⟨exp(f) | exp(g)⟩ = e^{⟨f|g⟩}.   (9.66)

The normalized exponential vectors are usually denoted as coherent states:

coh(f) = e^{−‖f‖²/2} exp(f).   (9.67)

The series (9.65) satisfies the functional relation of an exponential:

exp(f) ∨ exp(g) = exp(f + g).   (9.68)
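Using ⟨f^{∨n} | g^{∨n}⟩ = n! ⟨f|g⟩ⁿ, the inner product formula (9.66) can be verified numerically from a truncated series for exp(f) = Σₙ f^{∨n}/n!. This sketch reuses the symmetrizer idea of Chapter 8; the truncation order N = 6 and the helper names are my own choices:

```python
import itertools, math
import numpy as np

def sym_prod(vectors):
    """f1 v ... v fn = sqrt(n!) P_+ (f1 (x) ... (x) fn) as an n-index array."""
    T = vectors[0]
    for v in vectors[1:]:
        T = np.multiply.outer(T, v)
    perms = list(itertools.permutations(range(T.ndim)))
    return math.sqrt(len(perms)) * sum(np.transpose(T, p) for p in perms) / len(perms)

rng = np.random.default_rng(7)
f = 0.25 * (rng.normal(size=3) + 1j * rng.normal(size=3))
g = 0.25 * (rng.normal(size=3) + 1j * rng.normal(size=3))

# exp(f) = sum_n f^{v n} / n!; each degree contributes <f^{v n}|g^{v n}> / (n!)^2
N = 6
series = 1.0 + sum(np.vdot(sym_prod([f] * n), sym_prod([g] * n)) / math.factorial(n) ** 2
                   for n in range(1, N + 1))
assert np.isclose(series, np.exp(np.vdot(f, g)))   # (9.66)
```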

Hence the linear span of the exponential vectors

A_coh^+(H) ≔ Lin{exp(f) | f ∈ H}   (9.69)

is an algebra with the product ∨. The following inclusions are obvious:

A_coh^+(H) ⊂ Γ_{(α)}^+(H) ⊂ Γ_>^+(H) ⊂ Γ^+(H), 0 < α < 1,

where the spaces Γ_{(α)}^+(H) are Fock scales as introduced in Section 8.3.2.

Another direct consequence of the series expansion (9.65) is a simple relation for the operators (9.13) and (9.19). For operators S, M defined on the set D ⊂ H, the identities

Γ(S) exp(f) = exp(Sf)   (9.70)

and

dΓ(M) exp(f) = (Mf) ∨ exp(f)   (9.71)

are true for f ∈ D.

The function h ∈ H → exp(h) has an absolutely convergent power series expansion at h = 0; it is therefore infinitely differentiable at h = 0. As a consequence of (9.68), the mapping h ∈ H → exp(f + h) has an absolutely convergent power series expansion at any point f ∈ H. The function (9.65) is therefore an analytic function on the Hilbert space H; see Hille and Phillips (1957, Secs. 3.16–3.18). The estimate exp(f + h) − exp(f) = exp(f) ∨ (exp(h) − 1) = h ∨ exp(f) + O(‖h‖²) implies that the derivative of exp(f) at f in direction h ∈ H is exp(f) ∨ h = h ∨ exp(f).

Lemma 9.2.1 Let D be a dense linear subset of H; then the linear span of the exponential vectors {exp(f) : f ∈ D} is a dense subset of Γ^+(H).


Operators and Fields on the Boson Fock Space

Proof  We first assume that D = H and prove the theorem for this case. Let T be the completion of the linear span of {exp(f) : f ∈ H}. The function

Cⁿ ∋ (α₁, …, α_n) ↦ Ψ_n(α₁, …, α_n) ≔ exp(α₁f₁ + ··· + α_n f_n) ∈ T ⊂ Γ+(H),

with vectors f_j ∈ H, j ∈ {1, …, n}, n ∈ N, is analytic, and all its derivatives are elements of T. The derivative of Ψ_n(α₁, …, α_n) at α₁ = ··· = α_n = 0 is

∂ⁿ/∂αⁿ Ψ_n(α₁, …, α_n) = f₁ ∨ ··· ∨ f_n ∈ A+_n(H).    (9.72)

Thereby the symbol ∂ⁿ/∂αⁿ means ∂ⁿ/(∂α₁∂α₂···∂α_n). The tensor exp(0) = 1 and the derivatives (9.72) with n = 1, 2, … span the algebra A+(H). Hence T coincides with Γ+(H). Now assume that D is a dense subset of H. The function exp(f) is continuous in the norm topologies of the spaces H and Γ+(H). Therefore, the completion of the linear span of {exp(f) : f ∈ D} coincides with the completion of the linear span of {exp(f) : f ∈ H}. Hence the lemma is true.

Assume the Hilbert space H has an involution (8.5); then H_R = {f = f* : f ∈ H} is the Hilbert space of real elements. In the proof of Lemma 9.2.1, we can choose the vectors f_j as elements of H_R. The C-linear span of all tensors f₁ ∨ ··· ∨ f_n with f_j ∈ H_R spans A+_n(H). Hence we obtain the following corollary.

Corollary 9.2.2  The linear span of the exponential vectors with real test functions, {exp(f) : f ∈ H_R}, is a dense subset of Γ+(H).

As a consequence of Lemma 9.2.1, a tensor F ∈ Γ+(H) is uniquely determined by the function H ∋ h ↦ ⟨exp(h) | F⟩ ∈ C.

Corollary 9.2.3  A linear operator K on Γ+(H), which is defined on a linear domain including the exponential vectors, is uniquely determined by the images K exp(f), f ∈ H, of the exponential vectors, or by the matrix elements ⟨exp(g) | K exp(f)⟩ ∈ C for f, g ∈ H.

The domain of definition of the creation and annihilation operators (9.27) and (9.28) includes the exponential vectors. We obtain

A*(h) exp(f) = h ∨ exp(f) = ∂/∂λ exp(f + λh) |_{λ=0},    (9.73)

where the last identity is valid for real and for complex λ. The relation

⟨exp(g) | A(h) exp(f)⟩ = ∂/∂λ ⟨exp(g + λh) | exp(f)⟩ |_{λ=0} = e^{⟨g|f⟩} ⟨h | f⟩

implies the identity

A(h) exp(f) = ⟨h | f⟩ exp(f).    (9.74)

9.2 Exponential Vectors and Weyl Operators


By a little algebra, one derives that the canonical commutation relations (9.34) are valid on the linear span of the exponential vectors. If S is a unitary operator on H, then the relations (9.70) and (9.73) imply the identities

Γ(S⁻¹)A*(h)Γ(S) = A*(S⁻¹h),  Γ(S⁻¹)A(h)Γ(S) = A(S⁻¹h).    (9.75)

9.2.2 Weyl Displacement Operators

We now extend the notion of Weyl unitaries to Fock space. The Weyl displacement operators D(f), f ∈ H, are first defined on the linear span of the exponential vectors by

D(f) exp(h) ≔ e^{−⟨f|h⟩ − ‖f‖²/2} exp(f + h).    (9.76)

For f = 0, we obtain D(0) = I. The calculation of the product D(f)D(g),

D(f)D(g) exp(h)
 = exp(−⟨f + g | h⟩ − ½‖f + g‖² + ½⟨g | f⟩ − ½⟨f | g⟩) exp(f + g + h)
 = exp(½(⟨g | f⟩ − ⟨f | g⟩)) D(f + g) exp(h),

yields first that D(f)D(−f) = D(−f)D(f) = I. Hence the operators (9.76) are invertible with D⁻¹(f) = D(−f). Other consequences are the Weyl relations

D(f)D(g) = e^{ω(f,g)} D(f + g),    (9.77)

with the R-bilinear antisymmetric form ω(f, g) = −½⟨f | g⟩ + ½⟨g | f⟩ = −i Im⟨f | g⟩. The matrix element between exponential vectors follows from (9.66) as

⟨exp(g) | D(f) exp(h)⟩ = exp(⟨g | f + h⟩ − ⟨f | h⟩ − ½‖f‖²).    (9.78)

A special case of this identity is the vacuum expectation

⟨Ω | D(f)Ω⟩ = exp(−½‖f‖²).

Since the calculation of ⟨D(−f) exp(g) | exp(h)⟩ yields again the result (9.78), we obtain (D(h))* = D(−h) = D⁻¹(h) on the linear span of the exponential vectors, which is dense in Γ+(H). The Weyl operators can therefore be extended to unitary operators on the Fock space Γ+(H).
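As a finite-dimensional sanity check of the Weyl relation (9.77), one can truncate a single Bose mode at N levels and compare D(f)D(g) against e^{ω(f,g)}D(f + g) numerically. This is an illustrative sketch, not part of the book's text; the truncation size and function names are our own choices, and agreement is only expected on low Fock levels for small displacements.

```python
import numpy as np

N = 60  # truncation dimension (our choice; large enough for small f, g)
a = np.diag(np.sqrt(np.arange(1.0, N)), 1)   # annihilator on C^N
adag = a.conj().T

def displacement(alpha):
    """Truncated exp(alpha a† - conj(alpha) a), via eigendecomposition."""
    K = alpha * adag - np.conj(alpha) * a    # anti-Hermitian generator
    w, V = np.linalg.eigh(1j * K)            # iK is Hermitian
    return (V * np.exp(-1j * w)) @ V.conj().T

f, g = 0.3 + 0.2j, -0.1 + 0.4j
omega = -1j * np.imag(np.conj(f) * g)        # ω(f,g) = -i Im⟨f|g⟩, one mode
lhs = displacement(f) @ displacement(g)
rhs = np.exp(omega) * displacement(f + g)
# agreement on the low-lying corner block, up to truncation effects
assert np.max(np.abs((lhs - rhs)[:12, :12])) < 1e-8
```

The truncated generator is still anti-Hermitian, so the truncated displacement operators remain exactly unitary; only the Weyl relation itself acquires a (here negligible) truncation error.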


The Weyl relations imply that for fixed h ∈ H the operators R ∋ λ ↦ D(λh) form a one-parameter unitary group. The generator of this group follows from (9.76) as

d/dλ D(λf) exp(h) |_{λ=0} = −⟨f | h⟩ exp(h) + f ∨ exp(h) = (−A(f) + A*(f)) exp(h).

Hence the Weyl operator is given by

D(f) = exp(A*(f) − A(f))    (9.79)

on the entire Fock space Γ+(H). The operator (9.79) is the extension of the displacement operator defined in Section 3.2.1 to arbitrary finite and infinite dimensions. The exponential series exp(A*(h)) and exp(A(h)), h ∈ H, are continuous mappings from the spaces Γ+_(α)(H) into Γ+_(β)(H) if 1 > α > β ≥ 0. The identity

D(h) = exp(A*(h)) exp(−A(h)) e^{−½‖h‖²}

can be easily verified on the linear span of the exponential vectors. It is therefore also true on Γ+_>(H).

If U is a unitary operator on H, then (9.75) and (9.79) imply the identity

Γ(U*)D(h)Γ(U) = D(U*h).    (9.80)

The relations (9.73) and (9.76) imply

D(h)A*(f)D(−h) exp(g)
 = e^{⟨h|g⟩ − ½‖h‖²} d/dλ D(h) exp(g − h + λf) |_{λ=0}    (9.81)
 = e^{⟨h|g⟩ − ½‖h‖²} d/dλ e^{−⟨h|g−h+λf⟩ − ½‖h‖²} exp(g + λf) |_{λ=0}    (9.82)
 = −⟨h | f⟩ exp(g) + f ∨ exp(g).    (9.83)

Hence D(h)A*(f)D⁻¹(h) = A*(f) − ⟨h | f⟩I follows for all h ∈ H and f ∈ H, and, taking adjoints, we have derived the identities

D(h)A*(f)D⁻¹(h) = A*(f) − ⟨h | f⟩I,  D(h)A(f)D⁻¹(h) = A(f) − ⟨f | h⟩I.    (9.84)

There is an alternative version of the Weyl operator that uses the self-adjoint Segal field operator (9.36), Φ̂(h) = A*(h) + A(h):

W(h) ≔ exp(iΦ̂(h)) = D(ih).    (9.85)

For subsequent use, we note the Weyl relations (9.77) for the operators W(h):

W(f)W(g) = e^{−i Im⟨f|g⟩} W(f + g).    (9.86)

The relations (9.84) are equivalent to

W(h)Φ̂(f)W⁻¹(h) = Φ̂(f) − 2 Im⟨h | f⟩ I.    (9.87)


If W(h), h ∈ H, are unitary operators on Γ+(H) that satisfy the Weyl relations (9.86), then R ∋ λ ↦ W(λh) is a one-parameter group W(λh) = exp(iλΦ̂(h)). The self-adjoint generators Φ̂(h) are R-linear in h and fulfill the canonical commutation relations (9.38). This statement follows from (9.86),

W(λh)W(γf)W(−λh) = exp(−λγ(⟨h | f⟩ − ⟨f | h⟩)) W(γf),

by differentiation with respect to the real parameters γ and λ. A canonical structure is therefore uniquely defined by the Weyl operators.

9.2.3 Free Dynamics of Weyl Operators

Let H₁ be the positive one-particle Hamiltonian on H. The Hamiltonian on the Fock space Γ+(H) is H = dΓ(H₁), and the dynamics on Γ+(H) is induced by the unitary group U(t) = exp(−iHt) = Γ(U₁(t)) with U₁(t) = exp(−iH₁t). As a consequence of (9.70), the time evolution of a coherent state is

U(t) exp(f) = exp(U₁(t)f).    (9.88)

The time evolution of a Weyl operator follows from (9.80):

U*(t)D(h)U(t) = D(U₁*(t)h).    (9.89)

The skew-symmetric form Im⟨f | g⟩ = Im⟨U₁*(t)f | U₁*(t)g⟩ of the Weyl relations (9.77) does not depend on the time, and the Weyl relations are invariant under the free dynamics. This result agrees with the invariance of the canonical commutation relations derived in Section 9.1.4.

9.3 Distributions of Boson Fields

9.3.1 Gaussian Fields

We recall the definition of the Segal operator field Φ̂(f) = A*(f) + A(f) from (9.36). In the vacuum state, we have the matrix elements

⟨Ω | Φ̂(f_n) ··· Φ̂(f₁)Ω⟩ = Σ_ε ⟨Ω | A^{ε(n)}(f_n) ··· A^{ε(1)}(f₁)Ω⟩,

where f_n, …, f₁ are given one-particle vectors. Here ε(k) denotes the presence or absence of * on A(f_k), and we sum over the 2ⁿ possibilities ε = (ε(n), …, ε(1)).

Note that not all sequences ε contribute. If we set s(k) = −1 when the kth operator is an annihilator, and s(k) = +1 when it is a creator, then the partial sums Σ_{k=1}^m s(k) must never go negative (1 ≤ m ≤ n), and Σ_{k=1}^n s(k) must be


Figure 9.1 Creation and annihilation vertices.

zero. Such sequences are called Catalan sequences, and these are the only ones that give a nonzero contribution.

We give a simple argument to compute ⟨Ω | A^{ε(n)}(f_n) ··· A^{ε(1)}(f₁)Ω⟩: every time we encounter an expression … A(f_i)A*(f_j) …, we replace it with … ⟨f_i | f_j⟩ + A*(f_j)A(f_i) …; the term ⟨f_i | f_j⟩ is a scalar and can be brought outside the expectation, leaving a product with two fewer fields to average. Ultimately, we must pair up every creator with an annihilator; otherwise, we get an expression that averages to zero due to (9.31). Therefore, only the even moments are nonzero, and we obtain the identity

Σ_ε ⟨Ω | A^{ε(2n)}(f_{2n}) ··· A^{ε(1)}(f₁)Ω⟩ = Σ_{Pair(2n)} Π_{k=1}^n ⟨f_{p_k} | f_{q_k}⟩.    (9.90)

Here (p_k, q_k)_{k=1}^n ∈ Pair(2n) is a pair partition: the p_k correspond to annihilators and the q_k to creators, so we must have p_k > q_k for each k; the ordering of the pairs is unimportant, so for definiteness we take q_n > ··· > q₂ > q₁.

We may picture this as follows: for each i ∈ {1, 2, …, 2n}, we have a vertex; with A*(f_i) we associate a creator vertex with weight f_i, and with A(f_i) an annihilator vertex with weight f_i. A matched creation/annihilation pair (p_k, q_k) is called a contraction over the creator vertex q_k and annihilator vertex p_k; it corresponds to a multiplicative factor ⟨f_{p_k} | f_{q_k}⟩ and is shown pictorially as a single line (Figure 9.1). We then consider a sum over all possible diagrams. As an illustrative example, the fourth-order expression is

⟨Ω | Φ̂(f₄)Φ̂(f₃)Φ̂(f₂)Φ̂(f₁)Ω⟩ = ⟨f₄|f₃⟩⟨f₂|f₁⟩ + ⟨f₄|f₂⟩⟨f₃|f₁⟩ + ⟨f₄|f₁⟩⟨f₃|f₂⟩.    (9.91)

Here we have |Pair(4)| = 4!/(2² 2!) = 3, and the three pair partitions, {{4,3},{2,1}}, {{4,2},{3,1}}, and {{4,1},{3,2}}, may be sketched as in Figure 9.2.
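The pair partitions entering (9.90)–(9.91) are easy to enumerate by recursion, and the count |Pair(2n)| = (2n)!/(2ⁿn!) can be checked directly. A small sketch (the function name is ours, not the book's):

```python
from math import factorial

def pair_partitions(elems):
    """All ways to split a list of even length into unordered pairs."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pair_partitions(remaining):
            yield [(first, partner)] + tail

for n in range(1, 5):
    count = sum(1 for _ in pair_partitions(list(range(2 * n))))
    assert count == factorial(2 * n) // (2 ** n * factorial(n))  # (2n-1)!!

# the three diagrams of the fourth-order example (9.91)
three = list(pair_partitions([4, 3, 2, 1]))
assert len(three) == 3
```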


Figure 9.2 The diagrams corresponding to fourth-order moments.

Setting all the f_k equal to a fixed test function f, we obtain

⟨Ω | Φ̂(f)^k Ω⟩ = ⟨Ω | [A*(f) + A(f)]^k Ω⟩ = ‖f‖^k |Pair(k)|.

We then see that the observable Φ̂(f) has a mean-zero Gaussian distribution of variance ‖f‖² in the Fock vacuum state. We could have obtained this more directly by noting from (9.85) that

⟨Ω | e^{itΦ̂(f)}Ω⟩ = ⟨Ω | D(itf)Ω⟩ = e^{−½‖f‖²t²}.

However, we have also obtained a more general moment formula by our current calculation. The result of the calculation can be summarized in the formulas

⟨Ω | Φ̂(f_n) ··· Φ̂(f₁)Ω⟩ = 0,  n odd,

⟨Ω | Φ̂(f_{2m}) ··· Φ̂(f₁)Ω⟩ = Σ_{Pair(2m)} Π_{k=1}^m ⟨Ω | Φ̂(f_{p_k})Φ̂(f_{q_k})Ω⟩,  n = 2m even.    (9.92)

The summation extends over pair partitions (p_k, q_k)_{k=1}^m ∈ Pair(2m) with p_k > q_k and q_m > ··· > q₂ > q₁. This structure of the vacuum expectation values characterizes a Gaussian field. It will appear again in Section 11.1, where the free relativistic Boson field is investigated.

9.3.2 Poissonian Fields

Now let us introduce the field observables

N(f, g) = (A(f) + 1)*(A(g) + 1) = A*(f)A(g) + A(g) + A*(f) + 1 ≡ Σ_{α,β∈{0,1}} [A*(f)]^α [A(g)]^β,

where we employ the convention that [x]⁰ = 1 and [x]¹ = x for any algebraic variable x. We now wish to study matrix elements of the form

⟨Ω | N(f_n, g_n) ··· N(f₁, g₁)Ω⟩.


Figure 9.3 Typical Poissonian diagram; see the text.

By expanding each term, we may write this as a sum over all expressions of the type

⟨Ω | A*(f_n)^{α(n)} A(g_n)^{β(n)} ··· A*(f₁)^{α(1)} A(g₁)^{β(1)} Ω⟩,

taken over all α, β ∈ {0, 1}ⁿ. This time, in the diagrammatic description, we have n vertices, with each vertex being one of four possible types: A*A, A*, A, 1.

A typical situation is depicted in Figure 9.3. Evidently we must again join up all creation and annihilation operators into pairs. However, now we get creation, multiple scattering, and annihilation as the rule; otherwise, we have a stand-alone constant term of unity at a vertex. In Figure 9.3, we can think of a particle being created at vertex i(1), then scattered at i(2), i(3), i(4) successively before being annihilated at i(5). (This component has been highlighted using thick lines.) Each such component corresponds to a unique part, here {i(5), i(4), i(3), i(2), i(1)}, having two or more elements; singletons may also occur, and these are just the constant-term vertices. Therefore, every such diagram corresponds uniquely to a partition of {1, …, n}. Once this observation is made, it is easy to see that

Σ_{α,β∈{0,1}ⁿ} ⟨Ω | A*(f_n)^{α(n)} A(g_n)^{β(n)} ··· A*(f₁)^{α(1)} A(g₁)^{β(1)} Ω⟩
 = Σ_{π∈Part(n)} Π_{{i(k)>···>i(2)>i(1)}∈π} ⟨g_{i(k)}|f_{i(k−1)}⟩ ··· ⟨g_{i(3)}|f_{i(2)}⟩⟨g_{i(2)}|f_{i(1)}⟩.    (9.93)

If we now take all the f_k and g_k equal to a fixed f, then we arrive at⁴

⟨Ω | N(f, f)ⁿ Ω⟩ = Σ_{m=0}^n S(n, m) ‖f‖^{2(n−m)},

where S(n, m), the number of partitions of {1, …, n} into m parts, is a Stirling number of the second kind.

⁴ Note that a part of size k contributes ‖f‖^{2(k−1)}, so a partition of n vertices into m blocks of sizes k₁, …, k_m contributes ‖f‖^{2(k₁+···+k_m−m)} = ‖f‖^{2(n−m)}.


It therefore follows that, for each normalized vector φ ∈ h and λ > 0, the observable

M(φ, λ) = λ N(φ/√λ, φ/√λ) = (A*(φ) + √λ)(A(φ) + √λ)    (9.94)

has a Poisson distribution of intensity λ in the Fock vacuum state.

Again, we can arrive at this from an alternative point of view. We see that

D(f)*A*(φ)A(ψ)D(f) = [D(f)*A(φ)D(f)]*D(f)*A(ψ)D(f)
 = A*(φ)A(ψ) + ⟨ψ|f⟩A*(φ) + ⟨f|φ⟩A(ψ) + ⟨f|φ⟩⟨ψ|f⟩.

So we set ψ = φ with ‖φ‖ = 1 and f = √λ φ to get

D(f)*A*(φ)A(φ)D(f) = A*(φ)A(φ) + √λ A*(φ) + √λ A(φ) + λ ≡ M(φ, λ),

and so

⟨Ω | e^{itM(φ,λ)}Ω⟩ = ⟨D(f)Ω | e^{itA*(φ)A(φ)} D(f)Ω⟩
 = e^{−‖f‖²} ⟨exp(f) | Γ(e^{it|φ⟩⟨φ|}) exp(f)⟩
 = e^{−‖f‖²} ⟨exp(f) | exp(f + ⟨φ|f⟩(e^{it} − 1)φ)⟩
 = exp{λ(e^{it} − 1)}.
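The partition expansion behind this section can be tested directly: the nth moment of a Poisson(λ) variable equals the sum of λ^{#parts} over all set partitions of {1, …, n} (the Touchard polynomial). A small sketch, with our own function names, comparing the combinatorial sum against the moments of the Poisson distribution:

```python
from math import exp, factorial

def set_partitions(elems):
    """All partitions of a list into nonempty parts (Bell-number many)."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for tail in set_partitions(rest):
        yield [[first]] + tail                     # first in its own part
        for i in range(len(tail)):                 # or joined to a part
            yield tail[:i] + [[first] + tail[i]] + tail[i + 1:]

def poisson_moment(n, lam, terms=100):
    """E[X^n] for X ~ Poisson(lam), by truncating the defining series."""
    return sum(exp(-lam) * lam**k / factorial(k) * k**n for k in range(terms))

lam = 1.7
for n in range(1, 6):
    via_partitions = sum(lam ** len(p) for p in set_partitions(list(range(n))))
    assert abs(via_partitions - poisson_moment(n, lam)) < 1e-8
```

Setting λ = ‖f‖² recovers the moment formula for ⟨Ω | N(f, f)ⁿ Ω⟩ above, with λ^{#parts} playing the role of ‖f‖^{2(n−m)} up to the overall normalization.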

9.3.3 Exponentially Distributed Fields

Is there a similar interpretation for Stirling numbers of the first kind as well? Here we should be dealing with cycles within permutations rather than parts in a partition. Consider the following representation of a cycle (i(1), i(2), …, i(6)).

To make the sense of the cycle clear, we are forced to use arrows, and therefore we have two types of lines. In any such diagrammatic representation of a permutation, we may encounter any of the following five possible types of vertices:


An uncontracted (constant) vertex indicates a fixed point of the permutation. Let us consider the one-dimensional case first. The preceding suggests that we should use two independent (that is, commuting) Bose variables, say b₁, b₂. Let us set b ≔ b₁ ⊗ 1₂ + 1₁ ⊗ b₂*. Then we see that b* will actually commute with b, and Wick ordering b*b gives

b*b = b₁*b₁ ⊗ 1₂ + 1₁ ⊗ b₂*b₂ + b₁ ⊗ b₂ + b₁* ⊗ b₂* + 1₁ ⊗ 1₂,

and here we see the five vertex terms we need. Let Ω₁ and Ω₂ be the vacuum states for b₁ and b₂, respectively, and let Ω = Ω₁ ⊗ Ω₂ be the joint vacuum state. We wish to show that N = b*b has a Gamma distribution of unit power (an exponential distribution!) in this state.

First of all, let q = b* + b, so that q = q₁ ⊗ 1₂ + 1₁ ⊗ q₂, where q_k = b_k + b_k*. Now each q_k has a standard Gaussian distribution in the corresponding vacuum state (⟨Ω_k | e^{tq_k} Ω_k⟩ = e^{t²/2}), and so

⟨Ω | e^{tq} Ω⟩ = ⟨Ω₁ | e^{tq₁} Ω₁⟩⟨Ω₂ | e^{tq₂} Ω₂⟩ = e^{t²}.

Therefore, ⟨Ω | q^{2n} Ω⟩ = 2ⁿ |Pair(2n)| = (2n)!/n!. However, q^{2n} = Σ_m (2n choose m)(b*)^m b^{2n−m} (remember that b* and b commute!), and so ⟨Ω | q^{2n} Ω⟩ ≡ (2n choose n)⟨Ω | (b*)ⁿbⁿ Ω⟩ = (2n choose n)⟨Ω | (b*b)ⁿ Ω⟩. Therefore,

⟨Ω | (b*b)ⁿ Ω⟩ = n!,

and so, for t < 1,

⟨Ω | exp{tb*b} Ω⟩ = 1/(1 − t).

Therefore, N = b*b has an exponential distribution in the joint vacuum state.

The generalization of this result to Bosonic fields over a Hilbert space h is straightforward enough.

Theorem 9.2.4  Let h be a separable Hilbert space and let j be a conjugation on h. Define operator fields B(·) on the double Fock space Γ(h) ⊗ Γ(h) by

B(f) = B₁(f) ⊗ 1₂ + 1₁ ⊗ B₂*(jf),    (9.95)


where the B_i(·) are the usual Bosonic fields on the factors Γ(h), i = 1, 2. Then [B(f), B*(g)] = 0 for all f, g ∈ h, and if N(f, g) ≔ B*(f)B(g), then we have the following expectations in the joint Fock vacuum state Ω = Ω₁ ⊗ Ω₂:

⟨Ω | N(f_n, g_n) ··· N(f₁, g₁)Ω⟩ = ⟨B(f_n) ··· B(f₁)Ω | B(g_n) ··· B(g₁)Ω⟩ = perm⟨jf_j | jg_k⟩ = perm⟨g_j | f_k⟩,    (9.96)

where perm is the permanent (8.52). The proof should be obvious at this point, so we omit it. The sum is over all permutations σ ∈ S_n, and each permutation is decomposed into its cycles; the product is then over all cycles (i(1), i(2), …, i(k)) making up a particular permutation. We note that the representation corresponds to a type of infinite-dimensional limit of the double-Fock representation for thermal states.
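Specializing (9.96) to f_k = g_k = f with ‖f‖ = 1 makes every Gram-matrix entry equal to 1, so the permanent just counts permutations: ⟨(b*b)ⁿ⟩ = perm(J_n) = n!, matching the moments E[Xⁿ] = n! of a unit-rate exponential variable. A brute-force sketch (the function name is ours):

```python
from itertools import permutations
from math import factorial

def perm(M):
    """Permanent of a square matrix, by its defining permutation sum."""
    n = len(M)
    total = 0
    for s in permutations(range(n)):
        p = 1
        for i in range(n):
            p *= M[i][s[i]]
        total += p
    return total

# All-ones Gram matrix: every cycle contributes a factor 1, so the
# permanent counts the permutations, giving the exponential moments n!.
for n in range(1, 6):
    assert perm([[1] * n for _ in range(n)]) == factorial(n)
```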

9.4 Thermal Fields

We now come to a well-known trick for representing thermal states of Bose systems. For definiteness, let h = L²(R³) be the space of wave functions in the momentum representation; then the thermal state ⟨·⟩_{β,μ} is characterized as the Gaussian (sometimes referred to as quasifree) mean-zero state with

⟨A*(f)A(f)⟩_{β,μ} = ∫ |f(k)|² n_{β,μ}(k) d³k,  where n_{β,μ}(k) = (e^{β[E(k)−μ]} − 1)⁻¹.

(We have the physical interpretation of β as inverse temperature, μ as chemical potential, and E(k) as the energy spectrum function.) Let us denote by n_{β,μ} the operation of pointwise multiplication by the function n_{β,μ}(·) on h, that is, (n_{β,μ}f)(k) = n_{β,μ}(k)f(k). We check that

⟨[A*(f) + A(f)]²⟩_{β,μ} = ⟨f | C_{β,μ} f⟩,  with C_{β,μ} = 2n_{β,μ} + 1 ≡ coth(β[E − μ]/2).

Theorem 9.4.1 (Araki–Woods)  Let h be a separable Hilbert space and j a conjugation on h. Define operator fields B(·) on Γ(h) ⊗ Γ(h) by

B(f) = B₁(√(n+1) f) ⊗ 1₂ + 1₁ ⊗ B₂*(j√n f),    (9.97)

where the B_i(·) are the usual Bosonic fields on the factors Γ(h), i = 1, 2, and n is a positive operator on h. Then the fields satisfy the canonical commutation


relations (9.34), and their moments in the joint Fock vacuum state Ω = Ω₁ ⊗ Ω₂ are precisely the same as for the thermal state when we take n = n_{β,μ}.

To see this, note that

⟨Ω | e^{i[A*(f)+A(f)]} Ω⟩ = ⟨Ω₁ | e^{i[B₁*(√(n+1)f)+B₁(√(n+1)f)]} Ω₁⟩⟨Ω₂ | e^{i[B₂*(j√n f)+B₂(j√n f)]} Ω₂⟩
 = exp(−½⟨f | (n+1)f⟩) exp(−½⟨f | nf⟩)
 = exp(−½⟨f | Cf⟩),

where C = 2n + 1 is the covariance of the state. The first factor describes the so-called spontaneous emissions and absorptions, while the second factor describes the so-called stimulated emissions and absorptions. In the zero-temperature limit, β → ∞, we find that n → 0, and so the second-factor fields (stimulated emission and absorption) become negligible. We remark that for this state ⟨A*(f)A(f)⟩ = ⟨f | nf⟩ ≥ 0, and so n must be a positive operator. In particular, we must therefore have the constraint

C ≥ 1.    (9.98)

The representation (9.97) was introduced by Araki and Woods (1963). It has become an important ingredient of quantum field theory at finite temperature; see, e.g., Chap. 2 of the review article by Landsman and Weert (1987).
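The factorization above is easy to verify numerically at the level of covariances. The following sketch (our own sample dispersion and test function; μ = 0 for simplicity) checks C = 2n + 1 = coth(βE/2), the splitting of the exponent into spontaneous and stimulated parts, and the constraint (9.98):

```python
import numpy as np

beta = 2.0
E = np.linspace(0.5, 3.0, 50)            # sample dispersion E(k) > 0 (ours)
f = np.exp(-E)                            # some test function values |f(k)|
n = 1.0 / np.expm1(beta * E)              # Bose occupation, mu = 0
C = 2 * n + 1
assert np.allclose(C, 1 / np.tanh(beta * E / 2))      # C = coth(beta E / 2)

# exp(-1/2 <f,(n+1)f>) * exp(-1/2 <f,nf>) = exp(-1/2 <f,Cf>)
lhs = np.exp(-0.5 * np.sum((n + 1) * f**2)) * np.exp(-0.5 * np.sum(n * f**2))
rhs = np.exp(-0.5 * np.sum(C * f**2))
assert np.isclose(lhs, rhs)
assert np.all(C >= 1)                     # the constraint (9.98)
```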

9.5 q-deformed Commutation Relations

So far, we have presented the two staple examples from quantum field theory – Bose and Fermi fields. However, there has been much interest in other possibilities. Let us postulate canonical q-deformed commutation relations of the form

A(f)A(g)* = q A(g)*A(f) + ⟨f|g⟩.    (9.99)

The cases q = ±1 are the Bose and Fermi cases. The case q = 0 is of particular importance and leads to what is now known as free statistics. While there are no fundamental particles obeying free statistics, it turns out that these models capture noncommutative central limit results that are of enormous importance to random matrix theory. In particular, Γ⁰(H) will be the Fock space introduced in Section 8.2.3. We refer the reader to the lecture notes of Voiculescu et al. (1975) and of Nica and Speicher (2006). Roughly speaking, for −1 < q < 1 it is possible to construct a Hilbert space Γ^q(H) of decomposable tensors on which we may define concrete operators


satisfying (9.99) for −1 ≤ q ≤ 1; see Bozejko et al. (1997) and Saitoh and Yoshida (2000b). We define a sesquilinear form

⟨f₁ ∘ ··· ∘ f_n | g₁ ∘ ··· ∘ g_m⟩_q ≔ δ_{n,m} Σ_{σ∈Perm(n)} q^{i(σ)} ⟨f₁|g_{σ(1)}⟩ ··· ⟨f_n|g_{σ(n)}⟩,

where i(σ) ≔ #{(j, k) : 1 ≤ j < k ≤ n, σ(j) > σ(k)} is the number of inversions of the permutation σ, and take Γ^q(H) to be the completion of the finite number vectors with respect to the corresponding norm. The creator is defined by

A(g)* f₁ ∘ ··· ∘ f_n = g ∘ f₁ ∘ ··· ∘ f_n,    (9.100)

while the annihilator A(g) is the following generalization of (9.30):

A(g) f₁ ∘ ··· ∘ f_n = Σ_{j=1}^n q^{j−1} ⟨g | f_j⟩ f₁ ∘ ··· ∘ f_{j−1} ∘ f_{j+1} ∘ ··· ∘ f_n.    (9.101)

In addition, we have the vacuum vector Ω, and here we have

A(g)*Ω = g;  A(g)Ω = 0.    (9.102)

The operators A(g) and A(g)* are then bounded for −1 ≤ q < 1, with norm ‖g‖ max{1, (1 − q)^{−1/2}}.
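Evaluating the q-inner product on f₁ = ··· = f_n = g₁ = ··· = g_n = f with ‖f‖ = 1 gives Σ_σ q^{i(σ)} = [n]_q! = Π_{k=1}^n (1 + q + ··· + q^{k−1}), a standard identity for the inversion statistic. A small check (an illustration with our own function names, not from the book):

```python
from itertools import permutations

def inversions(sigma):
    """Number of pairs j < k with sigma[j] > sigma[k]."""
    n = len(sigma)
    return sum(1 for j in range(n) for k in range(j + 1, n)
               if sigma[j] > sigma[k])

def q_factorial(n, q):
    """[n]_q! = product of [k]_q = 1 + q + ... + q^{k-1}."""
    out = 1.0
    for k in range(1, n + 1):
        out *= sum(q ** i for i in range(k))
    return out

q = 0.5
for n in range(1, 6):
    total = sum(q ** inversions(s) for s in permutations(range(n)))
    assert abs(total - q_factorial(n, q)) < 1e-12
```

At q = 1 this reduces to n! (Bose norms of symmetric tensors); at q = 0 only the identity permutation survives, giving norm 1 (free statistics).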

9.5.1 q-deformed Gaussian Distribution

One may then use (9.99) repeatedly to get a q-deformed Wick ordering rule. We will consider ⟨Ω | A^{ε(2n)}(f_{2n}) ··· A^{ε(1)}(f₁)Ω⟩ for a Catalan sequence ε. To this end, let us express π ∈ Pair(2n) as a collection of pairs (p_k, q_k)_{k=1}^n with the p_k corresponding to annihilators and the q_k to creators, so we must have p_k > q_k for each k, and fix the ordering of the pairs as q_n > ··· > q₂ > q₁. The number of crossings associated with the pair partition is defined to be

N_c(π) ≔ #{(j, k) : p_k > p_j > q_k > q_j}.    (9.103)

The reason for referring to these as crossings is immediate from the situation in Figure 9.4. Note that we have focused on just the jth and kth pairs for clarity; in general there will be other pairs (omitted). One finds a remarkable generalization of equation (9.90):

Σ_ε ⟨Ω | A^{ε(2n)}(f_{2n}) ··· A^{ε(1)}(f₁)Ω⟩ = Σ_{π∈Pair(2n)} q^{N_c(π)} Π_{k=1}^n ⟨f_{p_k} | f_{q_k}⟩.    (9.104)
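The crossing statistic in (9.103)–(9.104) interpolates between familiar counts: at q = 1 every pairing contributes, giving (2n−1)!!, while at q = 0 only the noncrossing pairings survive, and these are counted by the Catalan numbers. A brute-force sketch (function names are ours):

```python
from math import comb, factorial

def pairings(elems):
    """All pair partitions of a list of even length."""
    if not elems:
        yield []
        return
    a, rest = elems[0], elems[1:]
    for i, b in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(a, b)] + tail

def crossings(pairs):
    """Number of interleaving ('crossing') pairs of contractions."""
    c = 0
    norm = [tuple(sorted(p)) for p in pairs]
    for (a1, b1) in norm:
        for (a2, b2) in norm:
            if a1 < a2 < b1 < b2:
                c += 1
    return c

for n in range(1, 5):
    ms = [crossings(p) for p in pairings(list(range(2 * n)))]
    assert len(ms) == factorial(2 * n) // (2 ** n * factorial(n))    # q = 1
    assert sum(1 for m in ms if m == 0) == comb(2 * n, n) // (n + 1)  # q = 0
```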


Figure 9.4 A crossing pair of partitions.

From this, we obtain the generating function for the distribution of the q-deformed Segal operator field Φ̂(f) = A*(f) + A(f):

⟨Ω | e^{itΦ̂(f)}Ω⟩ = Σ_{n=0}^∞ ((it)^{2n} ‖f‖^{2n}/(2n)!) Σ_{π∈Pair(2n)} q^{N_c(π)}.    (9.105)

Let the distribution be P, so that ⟨Ω | e^{itΦ̂(f)}Ω⟩ = ∫_{−∞}^∞ e^{itx} P[dx]; then one may obtain a continued fraction expression; see Bozejko and Yoshida (2006):

∫_{−∞}^∞ 1/(z−x) P[dx] = [0]_q / (z − [1]_q / (z − [2]_q / (z − [3]_q / (z − ···)))),    (9.106)

where [n]_q = (1 − qⁿ)/(1 − q) = 1 + q + ··· + q^{n−1}, with the convention [0]_q = 1.

In the case q = 0, we have the simple identity for H(z) = ∫_{−∞}^∞ 1/(z−x) P[dx] that H(z) = 1/(z − H(z)), which leads to H(z) = ½(z ± √(z² − 4)), and one may calculate the Hilbert transform at the boundary to get P[dx] ≡ ρ(x) dx, where

ρ(x) ≡ (1/π) Im H(x − i0⁺) = (1/2π)√(4 − x²) for |x| ≤ 2, and 0 for |x| > 2.    (9.107)

This is the well-known Wigner semicircle law. It turns out that for −1 < q < 1, the distributions are always absolutely continuous and compactly supported. For q = 1, we have, of course, the standard Gaussian. For the Fermi case, q = −1, we have that [n]_{−1} is one for n odd and zero for n > 0 even, leading to the terminating continued fraction H(z) = 1/(z − 1/z), which is easily seen to correspond to the discrete distribution ½δ₁[dx] + ½δ_{−1}[dx]. (This is, of course, a fair coin – but also corresponds to what we termed the standard Fermionic Gaussian distribution, for reasons that ought to be by now apparent.) In general, for −1 < q < 1, the density has been shown by Bozejko et al. (1997) to take the form


ρ(x) = (√(1−q)/π) sin θ Π_{n=1}^∞ (1 − qⁿ)|1 − e^{2iθ}qⁿ|²  for |x| ≤ 2/√(1−q), and ρ(x) = 0 otherwise,

where θ ∈ [0, π] is defined by x = (2/√(1−q)) cos θ.

9.5.2 q-deformed Poisson Distribution

We may define a number operator by

N = (A(f)* + √λ)(A(f) + √λ),

and we similarly find that

⟨Ω | e^{itN}Ω⟩ = Σ_{n=0}^∞ (‖f‖ⁿ/n!) Σ_{π∈Part(n)} q^{N_c(π)} (it)^{N(π)},

where again we have a modification of the Bose formula involving the deformation parameter q raised to the power of the crossing number for partitions.

Remarkably, there is a closed formula for the probability distribution due to Saitoh and Yoshida (2000a). Taking ‖f‖ = 1, this will be

P[dx] = ρ(x) dx + Σ_{n=0}^{N_max} p_n δ_{ν_n}[dx],

where ρ(x) is the absolutely continuous part, supported where 4λ/(1−q) − (x − λ − 1/(1−q))² ≥ 0, with density

ρ(x) = ((1−q)/(2πx)) √(4λ/(1−q) − (x − λ − 1/(1−q))²)
  × Π_{n=1}^∞ [(λ + qⁿ/(1−q))² − (1−q)qⁿ(x − λ − 1/(1−q))²] / [(1 − q^{2n})(qⁿ(x − λ − 1/(1−q)) + λ + qⁿ/(1−q))²],

and the p_n, ν_n describe at most finitely many atoms.

product formula is true:

ϕ_{F∨G}(z) = ϕ_F(z) · ϕ_G(z).    (10.2)

Proof  The product F ∨ G is defined for tensors in the algebra Γ+_>(H), which includes the exponential vectors. We choose F = exp(f) and G = exp(g). Then

⟨exp(h) | exp(f) ∨ exp(g)⟩ = ⟨exp(h) | exp(f + g)⟩ = e^{⟨h|f+g⟩} = e^{⟨h|f⟩}e^{⟨h|g⟩} = ⟨exp(h) | exp(f)⟩⟨exp(h) | exp(g)⟩


L2 -Representations of the Boson Fock Space

follows, and (10.2) is true. By linearity, this result can be extended to tensors F and G in the linear span of the exponential vectors. Since this set is dense in Γ+(H) and the inner product is continuous, the result is true for all tensors for which the product F ∨ G is defined.

Hence the mapping F ↦ ϕ_F(z) is an isomorphism between the tensor algebra A+(H) and the multiplicative algebra of antiholomorphic polynomials ϕ_F(z), F ∈ A+(H), with unit ϕ_Ω(z) ≡ 1. We now define an L²-inner product for these antiholomorphic functions. Let dυ be the canonical Gaussian promeasure on (the underlying real space of) H. This measure has the Fourier–Laplace transform

∫_H e^{⟨u|z⟩ + ⟨z|v⟩} dυ = e^{⟨u|v⟩},  with u, v ∈ H.    (10.3)

Proposition 10.1.2  If F and G are tensors in A+_coh(H), then the following identity is true:

∫ ϕ_F(z)* ϕ_G(z) dυ = ⟨F | G⟩.    (10.4)

Proof  The functions ϕ_F(z) with F ∈ A+_coh(H) are cylinder functions, and they are measurable with respect to dυ. If F and G are exponential vectors, the identity (10.4) is a consequence of (9.66) and (10.3). The extension to F, G ∈ A+_coh(H) follows by (anti)linear continuation.

A special case of (10.4) is ∫ |ϕ_F(z)|² dυ = ‖F‖². Hence (10.1) is an isometric mapping from A+_coh(H) into the pre-Hilbert space spanned by the antiholomorphic exponential functions, which are square integrable with respect to dυ. The completion of this pre-Hilbert space is denoted by L²_a(H, dυ). By continuity, the mapping (10.1) can be extended to an isometric isomorphism between Γ+(H) and L²_a(H, dυ). This representation of the Fock space by antiholomorphic functions is called the Bargmann–Fock or complex wave representation (Bargmann, 1961; Segal, 1962). In the literature, one often assumes that H has a conjugation z ↦ z*. Then one can use the holomorphic functions ϕ_F(z*) for the representation.

If H is finite dimensional, H ≅ Cⁿ, the promeasure is the Gaussian measure dυ = π^{−n} e^{−‖z‖²} dⁿx dⁿy on Cⁿ, where x ∈ Rⁿ and y ∈ Rⁿ are the real and the imaginary part of z = x + iy ∈ Cⁿ. If H is infinite dimensional, one can extend the promeasure dυ to a σ-additive measure on a space H_>, which is strictly larger than H. The functions ϕ_F(z), z ∈ H_>, are then square integrable with respect to this measure for


all F ∈ Γ+(H), but only a dense subset of these functions is continuous and antiholomorphic in the variable z ∈ H_>.
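The isometry (10.4) can be illustrated in one complex dimension: with dυ = π⁻¹e^{−|z|²} dx dy, the monomials z̄ⁿ satisfy ⟨z̄ⁿ, z̄ᵐ⟩ = n! δ_{nm}, matching the Fock-space norms of n-particle vectors. A Gauss–Hermite quadrature sketch (the quadrature order and names are our choices):

```python
import numpy as np
from math import factorial, pi

nodes, weights = np.polynomial.hermite.hermgauss(40)  # for ∫ e^{-t^2} f(t) dt
X, Y = np.meshgrid(nodes, nodes)
W = np.outer(weights, weights) / pi   # weights for dυ = e^{-|z|^2}/π dx dy
Z = X + 1j * Y

def inner(n, m):
    """⟨z̄^n, z̄^m⟩ = ∫ z^n z̄^m e^{-|z|^2}/π dx dy (exact for small n, m)."""
    return np.sum(W * Z**n * np.conj(Z)**m)

for n in range(5):
    for m in range(5):
        expected = factorial(n) if n == m else 0.0
        assert abs(inner(n, m) - expected) < 1e-8
```

The quadrature of order 40 integrates polynomials of degree up to 79 exactly, so the check holds to rounding error for these low degrees.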

10.2 Wiener Product and Wiener–Segal Representation

Another important L²-representation of the Boson Fock space is the Wiener–Segal representation or real wave representation, which was introduced by Wiener (1930) and Segal (1956). A stochastic version of this isomorphism – first derived by Itô (1951) – is presented in the next section.

Let H be a Hilbert space with conjugation H ∋ f ↦ f* ∈ H. Then

B(f, g) ≔ ⟨f* | g⟩ ∈ C    (10.5)

is a bilinear nondegenerate symmetric form with modulus |B(f, g)| ≤ ‖f‖‖g‖. On the linear span of the exponential vectors A+_coh(H), we define a linear invertible mapping ϒ by

ϒ exp(f) ≔ exp(f − 2⁻¹B(f, f)1) = e^{−½B(f,f)} exp(f).    (10.6)

Since exp(f) and exp(f − 2⁻¹B(f, f)1) are analytic in f, we can derive the action of ϒ on the algebra A+(H) by power counting. This calculation begins with

ϒ1 = 1,  ϒf = f,  ϒ(f ∨ f) = f ∨ f − B(f, f)1.

A closer investigation of this mapping is given in Kupsch and Smolyanov (1998), where it is called Wick ordering or normal ordering (of tensors). The mapping (10.6) can be extended to a continuous linear mapping from the Hilbert space Γ+_(α)(H) into the space Γ+_(β)(H) if α > β ≥ 0; see Appendix A of Kupsch and Smolyanov (1998).
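In the one-dimensional real wave picture with B(f, f) = 1, the Wick ordering ϒ sends the nth power of the field variable to the probabilists' Hermite polynomial He_n, which obeys He_{n+1}(x) = x·He_n(x) − n·He_{n−1}(x). A sketch working with coefficient lists (our own helper, lowest degree first):

```python
def hermite_he(n):
    """Coefficient list of He_n (probabilists'), lowest degree first."""
    h_prev, h = [1], [0, 1]          # He_0 = 1, He_1 = x
    if n == 0:
        return h_prev
    for k in range(1, n):
        nxt = [0] + h                # multiply He_k by x
        for i, c in enumerate(h_prev):
            nxt[i] -= k * c          # subtract k * He_{k-1}
        h_prev, h = h, nxt
    return h

assert hermite_he(2) == [-1, 0, 1]        # :x^2: = x^2 - 1
assert hermite_he(4) == [3, 0, -6, 0, 1]  # :x^4: = x^4 - 6x^2 + 3
```

The coefficients reproduce ϒ(f ∨ f) = f ∨ f − B(f, f)1 at second order, and the higher orders follow the same normal-ordering pattern.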

(10.7)

This product is symmetric and associative on A+ coh (H). Both sides of (10.7) are analytic functions of f and g. Calculating the linear parts in f and g, we obtain the bilinear mapping H × H , ( f , g) → f . g  f ∨ g − B( f , g) ∈ A+ (H),

(10.8)

which can be extended to an associative and commutative product F . G on A+ (H) called the Wiener product. With . F = F . = F, this algebra has the unit element = 1∈A+ 0 (H). Using the mapping (10.6), the formula (10.7) can be written as

192

L2 -Representations of the Boson Fock Space F . G  ϒ −1 (ϒF ∨ ϒG) ,

(10.9)

with F = exp( f ) and G = exp(g). By linear extension to F, G ∈ A+ coh (H), (H). With contithis formula defines a bilinear symmetric product on A+ coh nuity arguments, this product can be extended to a product on the spaces + (H), α > 0. (α) From Corollary 9.2.2, we know that a tensor F ∈  + (H) is uniquely determined by the function HR , x → exp(x) | F ∈ C. Given F ∈ A+ coh (H) ⊂  + (H), we define the function ψF (x)  exp(x) | ϒF .

(10.10)

Thereby the tensor F = 1 is represented by the function ψF (x) ≡ 1. Proposition 10.2.1 For tensors F and G in A+ coh (H), the following product formula is true: ψF.G (x) = ψF (x) · ψG (x). Proof

(10.11)

The formula (10.11) follows from (10.2) and (10.9).

The functions (10.10) with F ∈ A+ coh (H) are cylinder functions on HR , and they are measurable with respect to the canonical Gaussian pro-measure dμ(x) on HR . This pro-measure has the Fourier–Laplace transform ' e x| f dμ(x) = eB( f , f )/2 , f ∈ H. (10.12) HR

Lemma 10.2.2 If F and G are tensors in A+ coh (H), then the following identity is true ' (10.13) ψF (x)∗ ψG (x)dμ(x) = F | G . Proof It is sufficient to prove the identity (10.13) for F = exp( f ) and G = exp(g) with ψF (z) = exp(x) | ϒ exp( f ) = e−B( f , f )/2 e x| f and ψG (z) = e−B(g,g)/2 e x|g . Then the integral in (10.13) is calculated as   ∗ exp − 12 B( f % , f % ) − 12 B(g, g) e x| f +g dμ(x) (10.13)

=

(10.5)

exp B( f ∗ , g) = exp f | g ,

and the identity (10.13) is true for exponential vectors. The extension to tensors in A+ coh (H) follows by (anti)linearity.  A special case of (10.13) is |ψF (x)|2 dμ(x) = *F*2 . Hence F → ψF (x) = exp(x) | ϒF

(10.14)

10.3 It¯o–Fock Isomorphism

193

+ is an isometric mapping from A+ coh (H) ⊂  (H) into a pre-Hilbert space of functions ψF on HR , which are square integrable with respect to dμ. The completion of this pre-Hilbert space is denoted by L2 (HR , dμ). By continuity, the mapping (10.14) can be extended to an isometric isomorphism between  + (H) and L2 (HR , dμ). The representation of the Fock space is called the Wiener–Segal representation or real wave representation. If H = Cn is finite dimensional, thepro-measure dμ(x) is the Gaussian measure dμ(x) =  1 − n2 (2π ) exp − 2 *x*2 dn x on HR = Rn . In this case, one can absorb the   n weight function (2π )− 2 exp − 12 *x*2 into the wave functions ψ, and one

obtains the standard Schr¨odinger representation on L2 (Rn , dn x).

10.3 It¯o–Fock Isomorphism If the one-particle Hilbert space is an L2 -space with measure given by the covariance of a martingale, the Wiener–Segal isomorphism can be constructed by stochastic integrals, a method first given by It¯o in the case of the canonical Wiener process (Itˆo, 1951). Let Xt be a martingale with canonical probability space ( X , FX , P) and let hX = L2 ( X , FX , P). We consider the function F (t) = E Xt2 which then defines a monotone increasing function. We shall understand dF (t) to be the Stieltjes integrator in the following. It turns out that our considerations so far allow us to construct a natural isomorphism between hX and the Fock space + L2 R+ , dF (see, e.g., Parthasarathy, 1992). For f ∈ L2 R+ , dF , we define the random variable ' X (f) =

f (s) dXs [0,∞)

and a process Xt ( f ) = X 1[0,t] f .  (n) Lemma 10.3.1 Let Xt ( f ) = n (t) dXtn ( f ) . . . dXt1 ( f ), then n ' t∧s ( ) 1 (n) f (u) g (u) dF (u) δn,m . E Xt ( f ) Xs(m) (g) = n! 0 Proof

For simplicity, we ignore the intensities. Let (n) Xt

' =

n (t)

dXtn . . . dXt1 .


L2 -Representations of the Boson Fock Space

Then we have $\mathbb E\big[X_t^{(n)} X_s^{(0)}\big] = \mathbb E\big[X_t^{(n)}\big] = 0$ whenever $n > 0$. Next suppose that $n$ and $m$ are positive integers; then
\[
\mathbb E\big[X_t^{(n)} X_s^{(m)}\big]
= \mathbb E\Big[\int_0^t dX_u\, X_{u-}^{(n-1)} \int_0^s dX_v\, X_{v-}^{(m-1)}\Big]
= \int_0^{t\wedge s} dF(u)\, \mathbb E\big[X_u^{(n-1)} X_u^{(m-1)}\big],
\]
and we may iterate until we reduce at least one of the orders to zero. We then have
\[
\mathbb E\big[X_t^{(n)} X_s^{(m)}\big]
= \delta_{n,m} \int_0^{t\wedge s} dF(u_n) \int_0^{u_n} dF(u_{n-1}) \cdots \int_0^{u_2} dF(u_1)
= \frac{1}{n!}\, F(t\wedge s)^n\, \delta_{n,m}.
\]
The proof with the intensities from $L^2(\mathbb R_+, dF)$ included is a straightforward generalization. $\square$

Theorem 10.3.2  The Hilbert space $\mathfrak h_X = L^2(\Omega_X, \mathcal F_X, P)$ and the Fock space $\Gamma^+\big(L^2(\mathbb R_+, dF)\big)$ are naturally isomorphic.

Proof  Consider the map into the exponential vectors given by
\[
\tilde e^{X_t(f)} \equiv \sum_{n \ge 0} X_t^{(n)}(f) \;\longmapsto\; \exp\big(1_{[0,t]} f\big) = \sum_{n \ge 0} \frac{1}{n!}\,\big(1_{[0,t]} f\big)^{\otimes n}
\]
for each $f \in L^2(\mathbb R_+, dF)$, and denote the $t \to \infty$ limit as $\tilde e^{X(f)}$. We know that the exponential vectors are dense in Fock space, and in a similar way the exponential processes $\tilde e^{X_t(f)}$ are dense in $\mathfrak h_X$. The map may then be extended to one between the two Hilbert spaces. Unitarity follows from the observation that
\[
\mathbb E\big[\tilde e^{X(f)\,*}\, \tilde e^{X(g)}\big] = e^{\int_{[0,\infty)} f^* g\, dF},
\]
which is an immediate consequence of the previous lemma. $\square$

Now for $F \in \mathfrak h_X = L^2(\Omega_X, \mathcal F_X, P)$, we will have the so-called chaotic expansion
\[
F = \sum_n \int_{\Delta_n} \tilde F(t_n, \dots, t_1)\, dX_{t_n} \cdots dX_{t_1}.
\tag{10.15}
\]
Following our ideas from Chapter 4, we introduce the shorthand notation (with $\fint$ the sum-integral over the finite subsets $T \subset \mathbb R_+$)
\[
F = \fint_{\operatorname{Power}(\mathbb R_+)} \tilde F(T)\, dX_T.
\tag{10.16}
\]
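A small Monte Carlo sketch of our own (an Euler discretization, not from the text) illustrates Lemma 10.3.1 for the Wiener case $F(t) = t$: the iterated integrals are built by the recursion $X^{(k)} \leftarrow X^{(k)} + X^{(k-1)}\,dW$ with left-limit integrands, and their second moments approach $\delta_{nm}\,t^n/n!$.

```python
import numpy as np

# simulate the first two iterated Wiener integrals on [0, t] for many paths
rng = np.random.default_rng(0)
n_paths, n_steps, t = 100_000, 200, 1.0
dW = rng.normal(0.0, np.sqrt(t / n_steps), size=(n_paths, n_steps))

X = [np.ones(n_paths), np.zeros(n_paths), np.zeros(n_paths)]  # X^(0), X^(1), X^(2)
for j in range(n_steps):
    for k in (2, 1):                 # descending, so X[k-1] is the left limit
        X[k] = X[k] + X[k - 1] * dW[:, j]

m11 = np.mean(X[1] * X[1])   # expect F(t) = 1
m22 = np.mean(X[2] * X[2])   # expect F(t)^2 / 2! = 0.5
m12 = np.mean(X[1] * X[2])   # mixed orders: expect 0
```

The tolerances below account for Monte Carlo noise and the $O(1/K)$ discretization bias of the simplex sums.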


For instance, the diagonal-free exponentials are
\[
\tilde e^{\int f(t)\, dX_t} = \fint_{\operatorname{Power}(\mathbb R_+)} f^T\, dX_T.
\]

10.3.1 Stochastic Convolutions

The stochastic Wick convolution $\diamond_X$ is defined by
\[
FG \equiv \fint_{\operatorname{Power}(\mathbb R_+)} \big[\tilde F \diamond_X \tilde G\big](T)\, dX_T,
\tag{10.17}
\]
whenever $F = \fint \tilde F(T)\, dX_T$ and $G = \fint \tilde G(T)\, dX_T$. Under favorable circumstances, we may deduce a closed form for the convolution.

The Wiener Wick Convolution

Starting from the exponential $\tilde e^{\int f(t)\,dW_t} = e^{\int f(t)\,dW_t - \frac12 \int f(t)^2\,dt}$, we have
\begin{align*}
\tilde e^{\int f(t)\,dW_t}\, \tilde e^{\int g(t)\,dW_t}
&= e^{\int f(t)g(t)\,dt}\, \tilde e^{\int [f(t)+g(t)]\,dW_t} \\
&= \fint_{\operatorname{Power}(\mathbb R_+)} f^S g^S\, dS \;\fint_{\operatorname{Power}(\mathbb R_+)} [f+g]^T\, dW_T \\
&= \fint_{\operatorname{Power}(\mathbb R_+)} dW_T \,\fint_{\operatorname{Power}(\mathbb R_+)} dS \sum_{T_1 + T_2 = T} f^{T_1 + S}\, g^{T_2 + S}.
\end{align*}
As the exponential vectors are dense, we generalize to
\[
\big[\tilde F \diamond_W \tilde G\big](T) = \fint_{\operatorname{Power}(\mathbb R_+)} dS \sum_{T_1 + T_2 = T} \tilde F(T_1 + S)\, \tilde G(T_2 + S).
\]
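The kernel formula has an exact discrete counterpart that can be verified by brute force (our own finite sketch, not from the text): replace $\mathbb R_+$ by a finite grid with cell width $dt$, let $h^T = \prod_{t\in T} h(t)$, and let the sum-integral become $\sum_S dt^{|S|}(\cdot)^S$ over subsets. For exponentials the convolution then factorizes as $(f+g)^T \prod_{s\notin T}(1 + f_s g_s\, dt)$, the discrete version of $e^{\int fg\,dt}$.

```python
import itertools

dt = 0.1
f = [0.7, -0.3, 0.5, 0.2, -0.4]
g = [0.2, 0.6, -0.1, 0.4, 0.3]
grid = tuple(range(5))

def subsets(s):
    return itertools.chain.from_iterable(
        itertools.combinations(s, r) for r in range(len(s) + 1))

def prod(h, T):
    out = 1.0
    for t in T:
        out *= h[t]
    return out

def wiener_wick(T):
    # [f~ <>_W g~](T) = sum_S dt^{|S|} sum_{T1+T2=T} f^{T1+S} g^{T2+S},  S in T^c
    comp = tuple(t for t in grid if t not in T)
    total = 0.0
    for S in subsets(comp):
        for T1 in subsets(T):
            T2 = tuple(t for t in T if t not in T1)
            total += dt ** len(S) * prod(f, T1 + S) * prod(g, T2 + S)
    return total

def closed(T):
    # (f+g)^T times the discrete exponential prod_{s not in T} (1 + f_s g_s dt)
    fg_exp = prod([1 + dt * f[t] * g[t] for t in grid],
                  tuple(t for t in grid if t not in T))
    return prod([f[t] + g[t] for t in grid], T) * fg_exp

max_err = max(abs(wiener_wick(T) - closed(T)) for T in subsets(grid))
```

Refining the grid recovers the continuum statement, since $\prod(1 + f g\, dt) \to e^{\int fg\,dt}$.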

The Poisson Wick Convolution

This time we have $\tilde e^{\int f(t)\,dN_t} = \exp\big(\int \ln(1+f)\, dN\big)$, and so
\begin{align*}
\tilde e^{\int f(t)\,dN_t}\, \tilde e^{\int g(t)\,dN_t}
&= \exp\Big(\int \ln(1 + f + g + fg)\, dN\Big)
= \tilde e^{\int [f(t)+g(t)+f(t)g(t)]\,dN_t} \\
&= \fint_{\operatorname{Power}(\mathbb R_+)} (f + g + fg)^T\, dN_T,
\end{align*}
but we may write
\[
(f + g + fg)^T = \sum_{T_1+T_2+T_3=T} f^{T_1}\, g^{T_3}\, (fg)^{T_2}
= \sum_{T_1+T_2+T_3=T} f^{T_1+T_2}\, g^{T_3+T_2}.
\tag{10.18}
\]


Again we generalize to get
\[
\big[\tilde F \diamond_N \tilde G\big](T) = \sum_{T_1+T_2+T_3=T} \tilde F(T_1 + T_2)\, \tilde G(T_3 + T_2).
\tag{10.19}
\]
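The combinatorial identity (10.18) underlying (10.19) can be verified by brute force over all ordered splittings of a finite set (a check of our own): each point of $T$ independently contributes $f$, $fg$, or $g$, so the sum factorizes into $\prod_{t\in T}(f_t + g_t + f_t g_t)$.

```python
import itertools

f = [0.7, -0.3, 0.5, 0.2]
g = [0.2, 0.6, -0.1, 0.4]
T = (0, 1, 2, 3)

def prod(h, S):
    out = 1.0
    for t in S:
        out *= h[t]
    return out

# right-hand side of (10.18): sum over ordered splittings T = T1 + T2 + T3
# of f^{T1+T2} g^{T3+T2} (T2 collects the points carrying the product fg)
rhs = 0.0
for labels in itertools.product((1, 2, 3), repeat=len(T)):
    T1 = tuple(t for t, l in zip(T, labels) if l == 1)
    T2 = tuple(t for t, l in zip(T, labels) if l == 2)
    T3 = tuple(t for t, l in zip(T, labels) if l == 3)
    rhs += prod(f, T1 + T2) * prod(g, T3 + T2)

# left-hand side: (f + g + fg)^T
lhs = prod([f[t] + g[t] + f[t] * g[t] for t in range(len(f))], T)
err = abs(lhs - rhs)
```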

10.3.2 Wiener–Itō–Segal Isomorphism

Let us specify the case where $(\Omega_W, \mathcal F_W, P_W)$ is the probability space of the canonical Wiener process with $\Omega_W = C_0[0,\infty)$. Here we obtain the Wiener–Itō–Segal isomorphism
\[
L^2(\Omega_W, \mathcal F_W, P_W) \cong \Gamma\big(L^2(\mathbb R_+, dt)\big).
\tag{10.20}
\]

Let $f \in L^2(\mathbb R_+, dt)$ with real and imaginary parts $\operatorname{Re} f$ and $\operatorname{Im} f$; then we have $W(f) = W(\operatorname{Re} f) + i\,W(\operatorname{Im} f)$, that is, $W(\cdot)$ is complex linear in its argument. We recall that the Wiener exponential is given by
\[
\tilde e^{W(f)} = e^{W(f) - \frac12 \int_0^\infty f^2}.
\]
Note that this is analytic in $f$, and in particular the integral term is $\int_0^\infty f^2$ and not $\int_0^\infty |f|^2$.

We may introduce some notation at this stage. For $x \in C_0[0,\infty)$ a Wiener trajectory, that is, $x = x(t)$ a continuous path in $t \ge 0$ with $x(0) = 0$, we set
\[
\langle x \mid \exp(f)\rangle \triangleq \tilde e^{W(f)}(x),
\tag{10.21}
\]
which of course corresponds to the identification of $\tilde e^{W(f)}$ with the exponential vector $\exp(f)$ behind the isomorphism.

The set of operators $\{Q(f) : f \in S\}$ on the Fock space will be commutative if and only if $\operatorname{Im}\langle f \mid g\rangle = 0$ for all $f, g \in S$. We fix $S = L^2_{\mathbb R}(\mathbb R_+, dt)$, the real-valued $L^2$-functions.

Lemma 10.3.3  In the preceding setting of the Wiener–Itō–Segal representation of the Fock space $\Gamma\big(L^2(\mathbb R_+, dt)\big)$ by $L^2(\Omega_W, \mathcal F_W, P_W)$, the operators $Q(g)$ with test function $g \in L^2_{\mathbb R}(\mathbb R_+, dt)$ correspond to multiplication by the random variable $W(g) = \int_0^\infty g(t)\, dW_t$.

Proof  Starting with the identity $e^{iQ(g)} = D(ig)$, we note that for $g \in L^2_{\mathbb R}(\mathbb R_+, dt)$ we have

\begin{align*}
\big\langle x \mid e^{iQ(g)} \exp(f)\big\rangle
&= e^{-\frac12 \int g^2 + i \int gf}\, \big\langle x \mid \exp(f + ig)\big\rangle \\
&= e^{-\frac12 \int g^2 + i \int gf}\, \tilde e^{W(f+ig)}(x) \\
&= e^{-\frac12 \int g^2 + i \int gf}\, e^{W(f) + iW(g) - \frac12 \int (f+ig)^2}(x) \\
&= e^{iW(g)}(x)\, \big\langle x \mid \exp(f)\big\rangle,
\end{align*}
using $\int (f+ig)^2 = \int f^2 - \int g^2 + 2i \int fg$. $\square$

The identification (10.21) can be rephrased by the following resolution of identity:
\[
I = \int_{C_0[0,\infty)} |x\rangle\langle x|\; P_W[dx].
\tag{10.22}
\]

Technically, we should give the rigged Hilbert space underlying this resolution. The particular Gelfand triple is similar in construction to the Schwartz spaces; it is due to Hida (2008) and forms the basis of the mathematical theory known as white noise analysis.

Theorem 10.3.4 (Cameron–Martin–Girsanov)  Let $\Phi[x]$ denote a functional of trajectories $x \in C_0[0,\infty)$. For $h \in L^2_{\mathbb R}(\mathbb R_+, dt)$, we define the function $H(t) = \int_0^t h(s)\, ds$; then
\[
\mathbb E\big[\Phi[x + H]\big] = \mathbb E\big[\Phi[x]\, \tilde e^{W(h)}(x)\big],
\]
where $\tilde e^{W(h)}(x) = e^{\int_0^\infty h\,dx - \frac12 \int_0^\infty h^2}$.

Proof  This is actually easy to prove on the Fock space, where the shift by the function $H$ is implemented by the displacement operator. We note that $\mathbb E\big[\Phi[x+H]\big] = \langle \Omega \mid \Phi[Q + H]\,\Omega\rangle$, where we now consider a functional of the operator-valued process $Q_t + H_t = Q_t + \langle h \mid 1_{[0,t]}\rangle \equiv D\big(\tfrac h2\big)^{*}\, Q_t\, D\big(\tfrac h2\big)$. Therefore,
\begin{align*}
\big\langle \Omega \mid \Phi[Q+H]\,\Omega\big\rangle
&= \Big\langle \Omega \,\Big|\, D\big(\tfrac h2\big)^{*}\, \Phi[Q]\, D\big(\tfrac h2\big)\,\Omega\Big\rangle \\
&= e^{-\frac14 \|h\|^2}\, \Big\langle \exp\big(\tfrac h2\big) \,\Big|\, \Phi[Q]\, \exp\big(\tfrac h2\big)\Big\rangle \\
&= e^{-\frac14 \int h^2} \int \big|\big\langle x \mid \exp\big(\tfrac h2\big)\big\rangle\big|^2\, \Phi[Q(x)]\; P_W[dx] \\
&= \int e^{W(h)(x) - \frac12 \int h^2}\, \Phi[Q(x)]\; P_W[dx]
= \mathbb E\big[\Phi[x]\, \tilde e^{W(h)}(x)\big]. \qquad\square
\end{align*}
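A Monte Carlo sanity check of our own (illustrative choices, not from the text): take $h \equiv 0.5$ on $[0,1]$ and $\Phi[x] = e^{x(1)}$. Then $W(h) = 0.5\,W_1$ needs no path discretization, and both sides of the theorem equal $e^{H(1) + 1/2}$ exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(0.0, 1.0, 1_000_000)   # W_1 of the canonical Wiener process
c = 0.5                                # h = c on [0,1], so H(1) = c

lhs = np.mean(np.exp(W1 + c))                    # E[Phi[x + H]]
radon_nikodym = np.exp(c * W1 - 0.5 * c ** 2)    # e~^{W(h)} evaluated on the path
rhs = np.mean(np.exp(W1) * radon_nikodym)        # E[Phi[x] e~^{W(h)}(x)]
exact = float(np.exp(c + 0.5))                   # the common exact value
```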

11 Local Fields on the Boson Fock Space: Free Fields

The theory of general quantum fields was first presented by Heisenberg and Pauli (1929, 1930) on the basis of the canonical formalism of fields. Free quantum fields are obtained in this way, and a successful perturbation theory has been developed for interacting quantum fields using the canonical formalism. But as shown by Haag – see Haag's theorem in Streater and Wightman (1964) – the interaction picture does not exist for quantum field theories that are invariant under spacetime translations and rotations. Therefore, relativistic quantum field theories with interaction need a foundation beyond the canonical formalism. It is the aim of this chapter to emphasize an alternative approach to relativistic quantum field theory that is based on the representation theory of the inhomogeneous Lorentz group (Poincaré group) and on locality, without reference to canonical equal-time commutators. In Section 11.1, the Hilbert space of a scalar (spin zero) particle is given by an irreducible unitary (ray) representation of the Poincaré group. The Fock space of this Hilbert space carries the $n$-particle states. The free field $\Phi(x)$, with $x$ an element of the Minkowski space, is constructed as a linear combination of the corresponding creation and annihilation operators in such a way that it transforms covariantly under Poincaré transformations. The additional essential demand is locality, which means that two fields $\Phi(x)$ and $\Phi(y)$ commute if $x$ and $y$ are spacelike separated. The free field is then determined up to unessential phase factors. Such an approach is presented for any spin in Weinberg (1964, 1995). As additional literature, we refer to Itzykson and Zuber (1980), to Kastler (1961, chap. V), and to Reed and Simon (1975, sec. X.7). Since the canonical formalism is often used in the literature, we present canonical fields and their equal-time commutators in Section 11.2. This approach is applied in Chapter 12.



11.1 The Free Scalar Field

11.1.1 Lorentz and Poincaré Group

The four-dimensional Minkowski spacetime is denoted by $\mathbb M$. The elements of $\mathbb M$ are the vectors $x = (x^0, \mathbf x) = (x^0, x^1, x^2, x^3) \in \mathbb M$. Here we have $x^0 = t \in \mathbb R$, $\mathbf x \in \mathbb R^3$, and use units with $\hbar = c = 1$. The structure of $\mathbb M$ is given by the indefinite bilinear form
\[
\mathbb M \ni x, y \longmapsto x \cdot y = x^0 y^0 - \mathbf x \cdot \mathbf y \in \mathbb R.
\tag{11.1}
\]
The forward light cone is $V_0^+ = \{x \mid x^2 \equiv x \cdot x = 0,\ x^0 \ge 0\}$, and a light ray starting at $x = 0$ and propagating forward in time can reach points on $V_0^+$. The convex cone $V^+ = \{x \mid x^2 > 0,\ x^0 > 0\}$ is called the (open) forward cone, and $V^- = -V^+$ is the backward cone; see Figure 11.1. Their closures are $\overline{V^+}$ and $\overline{V^-}$.

The spacelike hypersurfaces $V_{i\lambda} = \{x \mid x^2 = -\lambda^2\}$, $\lambda > 0$, are connected, whereas the timelike hypersurfaces $V_m = \{x \mid x^2 = m^2\}$, $m > 0$, split into the disconnected hyperboloids $V_m^+ = \{x \mid x^2 = m^2,\ x^0 > 0\}$ and $V_m^- = -V_m^+$; see Figure 11.2.

Lorentz transformations $\Lambda$ are linear transformations of the Minkowski space that satisfy
\[
(\Lambda x)\cdot(\Lambda y) = x \cdot y \quad \text{for all } x, y \in \mathbb M.
\tag{11.2}
\]

The set of all these transformations forms a group, the Lorentz group, which is denoted by $\mathcal L$. The determinant of a Lorentz transformation is either $\det \Lambda = +1$ or $\det \Lambda = -1$. A transformation $\Lambda \in \mathcal L$ with $\det \Lambda = 1$ is called a proper Lorentz transformation, and the corresponding subset of $\mathcal L$ is $\mathcal L_+$. A Lorentz transformation is called orthochronous if
\[
\Lambda V^+ = V^+;
\tag{11.3}
\]
the corresponding subset of $\mathcal L$ is $\mathcal L^\uparrow$. The set $\mathcal L$ and its subsets $\mathcal L^\uparrow$, $\mathcal L_+$, and $\mathcal L_+^\uparrow = \mathcal L^\uparrow \cap \mathcal L_+$ are groups. The proper orthochronous Lorentz group $\mathcal L_+^\uparrow$ is the connectivity component of the identity. It acts transitively on the hyperboloids $V_m^+$, $V_m^-$, and $V_{i\lambda}$, $m > 0$, $\lambda > 0$.

Figure 11.1  The open future and past cones $V^\pm$ in Minkowski spacetime $\mathbb M$.

Figure 11.2  The spacelike hyperboloids $V_m^\pm$ (cross-section in the $(x^0, x^1)$ plane).

The Minkowski space can be extended to the complex Minkowski space $\mathbb M_{\mathbb C} \simeq \mathbb C^4$. The complex linear transformations $\Lambda$, which preserve the $\mathbb C$-bilinear form $z \cdot w = z^0 w^0 - \sum_{j=1}^3 z^j w^j$ and which have the determinant $\det \Lambda = 1$, form the complex Lorentz group $\mathcal L_{\mathbb C}$.

The symmetry group of relativistic theories is the inhomogeneous Lorentz group or Poincaré group $\mathcal P_+^\uparrow = \mathbb M \times \mathcal L_+^\uparrow$ of all translations $y \in \mathbb M$ and all proper orthochronous Lorentz transformations $\Lambda \in \mathcal L_+^\uparrow$. This group is the connectivity component of the identity. The full Poincaré group $\mathcal P = \mathbb M \times \mathcal L$ acts on the Minkowski space as $\mathbb M \ni x \mapsto \Lambda x + y \in \mathbb M$. This definition implies the product rule
\[
(\mathbb M \times \mathcal L) \times (\mathbb M \times \mathcal L) \to \mathbb M \times \mathcal L :
\big((y, \Lambda), (y', \Lambda')\big) \mapsto (y, \Lambda) \circ (y', \Lambda') = (y + \Lambda y',\ \Lambda\Lambda').
\tag{11.4}
\]

In the following, we will, at the slight risk of overkill, introduce the notation $\widetilde{\mathbb M}$ for the set of four-momenta $(k^0, \mathbf k)$, where $k^0$ is the energy variable and $\mathbf k$ is the three-momentum. The set of all three-momenta will be denoted as $\widetilde{\mathbb R}^3$. We do this to distinguish spacetime and energy-momentum coordinates; however, the same Minkowski metric applies to $\widetilde{\mathbb M}$, along with the associated Lorentz and Poincaré group actions. We will use the same notation for hyperboloids in both cases.
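The defining relations (11.2) and (11.4) are easy to verify numerically; the following sketch (our own, with an arbitrary boost and rotation) checks that $\Lambda^{\mathsf T}\eta\Lambda = \eta$ and that acting twice agrees with acting by the composed pair.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])     # Minkowski metric, signature (+,-,-,-)

def boost_z(chi):                          # proper orthochronous boost along x^3
    L = np.eye(4)
    L[0, 0] = L[3, 3] = np.cosh(chi)
    L[0, 3] = L[3, 0] = np.sinh(chi)
    return L

def rot_z(a):                              # spatial rotation in the (x^1, x^2) plane
    L = np.eye(4)
    L[1, 1] = L[2, 2] = np.cos(a)
    L[1, 2], L[2, 1] = -np.sin(a), np.sin(a)
    return L

def act(y, L, x):                          # (y, Lambda): x -> Lambda x + y
    return L @ x + y

def compose(p, q):                         # (y,L) o (y',L') = (y + L y', L L'), eq. (11.4)
    (y, L), (yp, Lp) = p, q
    return (y + L @ yp, L @ Lp)

L1, L2 = boost_z(0.7), rot_z(1.2)
y1 = np.array([1.0, 0.5, -2.0, 3.0])
y2 = np.array([-0.3, 2.0, 1.0, 0.0])
x = np.array([0.4, -1.0, 0.7, 2.5])

metric_err = np.max(np.abs(L1.T @ eta @ L1 - eta))          # (11.2)
seq = act(y1, L1, act(y2, L2, x))                           # act twice
comp = act(*compose((y1, L1), (y2, L2)), x)                 # act by the product
comp_err = np.max(np.abs(seq - comp))
```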


11.1.2 The Klein–Gordon Equation The Klein–Gordon (KG) equation is the differential equation    + m2 φ(x) = 0

(11.5)

with the parameter m > 0. The symbol  stands for the Lorentz invariant second-order differential operator

3 1 ∂2 ∂ 2  ∂2 − = 2 2 − /. = 0 2 ∂x c ∂t ∂xj j=1 Going over to the Fourier representation ' 3 −ik·x 4 ˜ φ(x) = (2π )− 2 φ(k)e d k, the differential equation (11.5) is replaced by the algebraic equation   ˜ = 0. −k2 + m2 φ(k) ˜ The solutions φ(k) have support in the double hyperboloid Vm = Vm+ ∪ Vm− in 6 Specifically, we have M.

  6 : k2 = m2 , k0 > 0 = k ∈ M 6 : k0 = ω(k) , (11.6) Vm+ = k ∈ M where ω(k) =

* k2 + m2 ,

˜ = f (k)δ(k2 − m2 ) with while Vm− = −Vm+ , and they can be factorized as φ(k) 2 functions Vm , k → f (k) ∈ C. The measure δ(k − m2 )d4 k is concentrated on Vm and is invariant under Lorentz transformations. The functions f (k) are taken as elements of the Hilbert space Hm with the inner product '

f1 | f2 m = f1 (k)∗ f2 (k) δ(k2 − m2 )d4 k. (11.7) With these functions, the Fourier representation of the solutions of the KGequation is given by ' 3 (11.8) φ(x) = (2π )− 2 f (k)e−ik·x δ(k2 − m2 )d4 k. The linear space  of functions (11.8) with f ∈ Hm is denoted as Km . The Hilbert ± = f ∈ H : supp f ⊂ V ± are isomorphic orthogonal subspaces spaces Hm m m of Hm . The restriction of the measure δ(k2 − m2 )d4 k to the hyperboloid Vm+ is   dμm (k) = δ+ k2 − m2 d4 k = (2ω(k))−1 d3 k, (11.9)

202

Local Fields on the Boson Fock Space: Free Fields

+ is therefore with δ+ (k2 − m2 )  (k0 )δ k2 − m2 . The Hilbert space Hm 3 d k ± under the transformation (11.8) R3 , 2ω(k) ). The image of Hm isomorphic to L2 (6 ± is called Km . We define on Km an inner product (φ1 | φ2 )m with the property (φ1 | φ2 )m = f1 | f2 m ,

(11.10)

if the functions φj , j = 1, 2, are the Fourier transforms of fj . Standard Fourier theorems imply the identities

' ∂ ∂φ1 (x)∗ φ1 (x)∗ 0 φ2 (x) − φ (x) d3 x (φ1 | φ2 )m = i 2 0 0 ∂x ∂x x =t + if f1 ∈ Hm , f2 ∈ H m , (11.11)

' ∗ ∂ (x) ∂φ 1 φ1 (x)∗ 0 φ2 (x) − φ2 (x) d3 x (φ1 | φ2 )m = −i ∂x ∂x0 x0 =t − if f1 ∈ Hm , f2 ∈ H m . (11.12) As a consequence of (11.10), the sesquilinear form (11.11) is the positive + . A real solution φ(x) of the KG-equation definite inner product of the space Km has the Fourier representation (11.8) with a function f , which is real with respect to the antilinear involution f (k) → f + (k) = f (−k)∗ .

(11.13)

This involution is an antiunitary operator within Hm . The inner product (11.8) is invariant under the transformations f (k) → eik·y f (!−1 k)

(11.14)

for all elements (y, !) of the full Poincar´e group P. The equivalent transformation on Km follows from the Fourier transform (11.8) as φ(x) → φy,! (x)  φ(!(x − y)).

(11.15)

If the group P is restricted to the orthochronous group P ↑ , the transformation + onto H+ and H− onto H− . Consequently, (11.15) maps K+ (11.14) maps Hm m m m m + − −. onto Km and Km onto Km If f (k) is a sufficiently decreasing function, for example, if |k|3 f (k) ∈ Hm , . / / 3 . then eik·x | f m is defined, and (2π )− 2 eik·x | f m = φ(x) is the Fourier integral ± is obtained by (11.8). The projection of φ onto Km ; < (11.10) 3 φ ± (x) = (2π )− 2 (±k0 )eik·x | f (k) = ± ± m (y − x) | φ(y) m

(11.16)

11.1 The Free Scalar Field

with the generalized functions ' ' −3 −ik·x −3 + (x) = (2π ) dμ (k) = (2π ) e m m

R3

203

ei

kx−ω(k)x0

d3 k 2ω(k) (11.17)

and + + ∗ − m (x) = −m (−x) = −m (x) .

(11.18)

± The ± m (x) are invariant under orthochronous Lorentz transformations, m (!x) ± ↑ = m (x), ! ∈ L . The integral (11.17) is calculated in Section 11.1.5 with the result  √  2 + i0x0 m K −x 1 m + (11.19) √ m (x) = (2π )2 −x2 + i0x0 if m > 0. The function K1 (s) is the modified Bessel function of the second kind. The sum − + m (x) = i + m (x) + m (x) = −2 Im m (x) ' (11.20) = i (2π )−3 e−ik·x ε(k0 )δ(k2 − m2 )d4 k

is a real L↑ invariant and antisymmetric distribution, (−x) = −(x). The generalized function (x) has therefore a support restricted to the closed backward and forward cones V + ∪ V − . The integral representations (11.11) and (11.12) for the inner products (11.16) lead to the following formula for φ(x) = φ + (x) + φ − (x):

' ∂ ∂(x − y) (x − y) 0 φ(y) − φ(y) d3 y. (11.21) φ(x) = ∂y ∂y0 y0 =t0 This formula solves the initial value problem. The KG-equation is a secondorder differential equation in the time variable. The initial condition at time t0 , say t0 = 0, is given by the function φ(0, x) = ϕ(x) and its derivative ∂ φ(x0 , x) |x0 =0 = ψ(x). The knowledge of these functions is equivalent to ∂x0 the knowledge of f (k) on both branches of Vm . The formula (11.21) determines φ(x) at any time x0 ∈ R. Moreover, it proves that solutions of the KG-equation propagate causally. Starting at t = 0 from an initial state localized in a region G ⊂ R3 , that is, supp ϕ ⊂ G and supp ψ ⊂ G, the solution φ(x) propagates for t > 0 into the causal shadow G + V + of G. But if the class of solutions is restricted to functions with positive frequencies, there is no causal localization:   + , then for arbitrary t < t the space region supp φ(t , x) ∪ if 0 =  φ(x) ∈ K 1 2 1 x   suppx φ(t2 , x) is unbounded. The proof of this statement follows from the Fourier representation

204

Local Fields on the Boson Fock Space: Free Fields

3

φ(t, x) = (2π )− 2

'

f (ω(k), k) e−iω(k)t+ikx

d3 k . 2ω(k)

(11.22)

If φ(t, x) has compact support for t = t1 and t = t2 , then both the functions g(k)  ω−1 f (ω, k) exp (−it1 ω) and g(k) exp (−i(t2 − t1 )ω) with ω = √ k2 + m2 would then be entire analytic functions in k ∈ C3 . But that is not possible.

11.1.3 The One-Particle Hilbert Space Momentum Representation There are straightforward methods to derive the theory of relativistic fields from unitary representations of the inhomogeneous Lorentz group – see, e.g., Bogoliubov et al. (1990) – or from the underlying canonical structure – see, e.g., Baez et al. (1992). Here we start in a less fundamental way from the classical theory of a relativistic free particle with mass m and spin zero. The L2 -space over the classical momentum space of this particle is then chosen as the one-particle Hilbert space. For a relativistic particle of mass, m > 0 the four-momentum p = (E, p) ∈ 6 is concentrated on the hyperboloid Vm+ . The associated Lorentz invariant M +, measure on the Minkowski space is (11.9). As one-particle Hilbert space Hm we choose the space L2 (Vm+ , dμm ) of functions Vm+ , k → f (k) ∈ C that are square integrable with respect to this measure. In Section 11.1.2, this space was + . The inner product of H+ is called Hm m '

f | g m = f (k)∗ g(k)dμm (k). (11.23) + Identifying the function f (k), k ∈ Vm+ , with f (ω(k), k), k ∈ 6 R3 , the space Hm 3 d k is isomorphic to L2 (6 R3 , 2ω(k) ). Many calculations of this section are true for all masses m ≥ 0, and we give some results including m = 0. But the quantum field theory for m = 0 d3 k R3 , 2ω(k) ) ⊂ particles needs some additional work related to the fact that L2 (6 2 3 3 6 L (R , d k) is true for m > 0 but not for m = 0. Therefore, we restrict to the case m > 0 in general. + carries a unitary representation of the proper The Hilbert space Hm ↑ orthochronous Poincar´e group P+ that can be read off from the invariance transformations (11.14) of the KG-equation ↑

(U1 (y, !)f ) (k) = eik·y f (!−1 k), (y, !) ∈ P+ .

(11.24)

11.1 The Free Scalar Field

205

It is straightforward to prove that these transformations are unitary and that ↑ they satisfy the identities for a representation of the group P+ : U(0, I) = id, U(y, !)U(y , ! ) = U(y + !y , !! ).

(11.25)

The 4-momentum operators P(1) = (H1 , P1 ) are the generators of the space and time translations  ∂  −iky e f (k) = kf (k), (11.26) (P1 f ) (k) = i y=0 ∂y ∂  ik0 t  e f = ω(k)f (k). (11.27) (H1 f ) (k) = −i t=0 ∂t The spectrum of the 4-momentum operator P(1) = (H1 , P1 ) is concentrated on the hyperboloid Vm+ . The time evolution of a vector f (k) is f (k, t) = (U1 (t)f ) (k) = f (k)e−iω(k)t . There are two possibilities to extend the representation (11.24) to a representation of the orthochronous group P ↑ , either by the rule (11.24) applied to transformations in P ↑ , or by (U1 (y, !)f ) (k) = (det !) eik·y f (!−1 k). In the first case, the particles are called scalar particles, and in the second case pseudoscalar particles. Spacetime Functions

3  then ψ(x) = (2π )− 2 f (k)e−ik·x dμm (k) is a If f (k) is a function in solution of the KG equation with positive frequencies. The unitary representation of the proper orthochronous Poincar´e group (11.24), cf. (11.14), leads to ψ(x) → ψy,! (x) = ψ(!(x − y)). This is the behavior of classical fields on the Minkowski space; and it is a justification to take the variable x as point in the configuration Minkowski space. But, as already stated at the end of Section 11.1.2, there is no localization of the wave function in a finite region of space (or spacetime) consistent with causality. Hence causal propagation has only a vague meaning for one-particle states. These problems have been investigated in great detail by Hegerfeldt (1974) and by Hegerfeldt and Ruijsenaars (1980). We come back to locality and causality in the context of quantum fields in the Sections 11.1.4 and 12.2.

+, Hm

206

Local Fields on the Boson Fock Space: Free Fields

There is another possibility to specify one-particle vectors by functions of the spacetime variables. Let ϕ(x) be a function in S(M). Then the Fourier transform1 ' √ − 32 ϕ(k) ˆ = 2π F4 [ϕ] (k) = (2π ) ϕ(x)eik·x d4 x (11.28) M

6 and the restriction to the mass hyperboloid is a function ϕ(k) ˆ ∈ S(M), + . The ϕ(k) ˆ |k∈Vm+ determines a unique vector in the one-particle space Hm hermitean form ' / . ϕˆ2 | ϕˆ1 M ϕˆ2 (k)∗ ϕˆ1 (k)δ+ (k2 − m2 ) d4 k (11.29) 6 = 6 The form is an extension of the inner product (11.23) to functions in S(M). (11.29) is positive hermitean with the highly nontrivial null-space   6 = ϕˆ ∈ S(M) 6 : ϕ(k) (11.30) ˆ = 0 if k ∈ Vm+ . S0 (M) 6 6 is a pre-Hilbert space with the inner product The quotient space S(M)/S 0 (M) 6 is isomorphic to the 6 induced by (11.29), and the completion of S(M)/S 0 (M) + . The mapping one-particle Hilbert space Hm √ + ˆ |k∈Vm+ ∈ Hm (11.31) J : ϕ → ϕˆ = 2π F4 [ϕ] → f (k) = ϕ(k) + . This mapping is continuous with is a linear operator from S(M) into Hm + , and the image space JS(M) respect to the topologies of S(M) and of Hm + . The Fourier relations imply the identity is a dense linear subspace of Hm

Jϕ | Jψ m = ϕ * ψ m , where

''

ϕ2 * ϕ1 m 

4 4 ϕ2 (x)∗ + m (x − y) ϕ1 (y) d xd y

(11.32)

(11.33)

M×M

is a degenerate positive hermitean form on S(M) with the null-space S0 (M) = 6 = kerJ. The integral kernel + F4−1 S0 (M) m (x − y) is the generalized function (11.17). The space S(M) is a semi-Hilbert space with the form (11.33). Since + , the functions ϕ ∈ S(M) label in a nonunique {Jϕ : ϕ ∈ S(M)} is dense in Hm way a dense set of one-particle vectors. The functions ϕ(x) ∈ S(M) carry the continuous representation ϕ(x) → ϕy,! (x) = ϕ(!(x − y)), (y, !) ∈ P ↑ ,

1 The additional factor

√ 2π is convenient for the subsequent calculations.

11.1 The Free Scalar Field

207

of the orthochronous Poincar´e group. The Fourier transform ϕ(k) ˆ behaves like (11.14), and the function f = Jϕ transforms according to the unitary representation (11.24) of the one-particle vectors. So far, the mappings (11.31) have been defined for functions ϕ ∈ S(M). But it is possible to extend the domain of this mapping to more general spaces. Lemma 11.1.1 The mappings (11.31) can be extended to the following spaces: (i) The Hilbert space L2 (M, 1 + (x0 )2 d4 x). (ii) The space of generalized functions ϕ(x) = δ(x0 − t)χ (x) localized at a fixed time t with χ (x) ∈ L2 (R3 , d3 x). The topology of this space is the L2 -topology of the factor χ (x). The mapping (11.31) is a continuous linear operator from either of these spaces + . The identity (11.32) remains valid for these extensions. into Hm Proof We first take functions ϕ(x) ∈ L2 (M, 1 + (x0 )2 d4 x), which have the finite norm '   2 |ϕ(x)|2 1 + (x0 )2 d4 x. *ϕ*L = (11.34) 6 M

√ an element of the Sobolev space with The Fourier transform ϕˆ = 2π F4 [ϕ] is

2 9 92  2 ˆ d4 k. Then the one-particle norm of norm 9ϕˆ 9S  M ϕ(k) + ∂k∂ 0 ϕ(k) ˆ Jϕ is well defined and can be estimated by ' 2 2 *Jϕ* = ϕˆ (k) δ+ (k2 − m2 )d4 k M

' ' ∞ d3 k 0 ∗ ∂ dk ϕˆ (k) ϕˆ (k) = 2Re ∂k0 R3 2ω(k) ω(k) 9 1 9 9ϕˆ 92 = π *ϕ*2 . ≤ L S 2m m For the second case, we choose singular functions ϕ(x) = δ(x0 − t)χ (x), which 2 3 3 are localized √ at a fixed time t with a function 0 χ (x) ∈ L (R , d 2x). 3The3 Fourier ˜ with χ˜ ∈ L (R , d k). The transform 2π F4 [ϕ] is ϕˆ (k) = exp i k t χ(k) one-particle norm of Jϕ is then estimated by the norm of χ ' ' d3 k 2 2 2 2 4 |ϕ˜ (k)|2 *Jϕ* = ϕˆ (k) δ+ (k − m )d k = 2ω(k) M R3 ' ' |χ˜ (k)|2 d3 k = (2m)−1 |χ (x)|2 d3 x. ≤ (2m)−1 R3

R3

The Fourier identity Jϕ | Jψ = ϕ * ψ remains valid for functions in the spaces 1 and 2.

208

Local Fields on the Boson Fock Space: Free Fields

In the case of functions ϕ(x) = δ(x0 − t)χ (x), the integral (11.33) becomes the positive definite hermitean form '' 3 3 χ1 (x)∗ + qm (χ1 | χ1 ) = m (0, x − y)χ2 (y)d xd y, R3 ×R3

defined on L2 (R3 , d3 x) with the locally integrable kernel function ' 3 m −3 ikx d k (11.19) + (0, x) = (2π ) e = (2π )−2 K1 (m |x|) . m |x| 2ω(k) R3 √ 2 3 3 The closure of L (R , d x) with respect to the norm q (χ |χ ) is the space   1 d3 k = − + m2 4 L2 (R3 , d3 x), which is the Fourier transF3−1 L2 (R3 , 2ω(k)) d3 k +. )∼ form of L2 (R3 , 2ω(k) = Hm

Generalized Vectors A short introduction to generalized vectors was given in Section 9.1.3. The d3 k + = L2 (V + , dμ ) ' L2 (6 R3 , 2ω(k) ) can be extended to genHilbert space Hm m m 

eralized functions in S (R3 ). The simplest examples are vectors with fixed 6 momentum p ∈ M k ∈ Vm+ . (11.35) p ∈ Vm+ , ψp (k) = 2ω(p) δ 3 (k − p) ,  / . The inner product ψp | f m = ψp (k) f (k)dμm (k) = f (p) is well defined for + . The mapping all continuous functions f ∈ Hm ' + S(Vm+ ) , f → f (p)ψp (k) dμm (k) = f (k) ∈ Hm + . The vectors (11.35) are (formally) norcan be extended to functions in Hm malized to ' (11.36) ψp (k) ψp (k) dμm (k) = 2ω(p) δ 3 p − p .

The distribution (11.35) is the kernel of the L↑ -invariant identity operator, and −1 ψp has the simple covariance property ψ!p (k) = ψp ! k for ! ∈ L↑ . A candidate for a state at position x ∈ R3 at time x0 = t is 3

φx (k) = (2π )− 2 eik·x ,

k ∈ Vm+ ,

x ∈ M.

(11.37)

The inner product (11.23) of φx with sufficiently decreasing function f ∈ L2 (Vm+ , dμm ) coincides with the Fourier transform ' 3 (11.29)

φx | f m = (2π )− 2 e−ik·x f (k)dμm (k) = φ(x).

11.1 The Free Scalar Field

209

+ . Given a function ϕ in S(M), the This result can be extended to all f ∈ Hm  3  integral ϕ(x)φx (k) d4 x = (2π )− 2 eik·x ϕ(x)d4 x = f (k) yields the image Jϕ of the mapping (11.31). The functions (11.37) are formally normalized to ' ' 1 (11.17) e−ik·(x−y) dμm (k) = + φx (k)∗ φy (k)dμm (k) = m (x − y). (2π )3

Since + m (0, x) is not a localized distribution, this result indicates again that there is no reasonable localization in position for a relativistic one-particle state.

11.1.4 Fock Space and Field Operators Creation and Annihilation Operators + ) and the corresponding creation and annihilation The Fock space  + (Hm + − + , can be constructed as in Chapter 9. The operators A (f ) and A (f ) , f ∈ Hm + ), n = vacuum vector is denoted as . A concrete realization of n+ (Hm

1, 2, . . ., is given by functions F(k1 , . . . , kn ) which are symmetric in the variables kj ∈ Vm+ , j = 1, . . . , n, and which are square integrable with respect to the product measure (dμm )n . The creation/annihilation operators satisfy the commutation relations (9.34)     + A (f ), A+ (g) = A− (f ), A− (g) = 0, '   − (11.23) f (k)∗ g(k)dμm (k) I. (11.38) A (f ), A+ (g) = f | g I = The representation (11.14) of the Poincar´e group can be extended to the representation U(y, !) =  (U1 (y, !)) on the Fock space. The generator of the translations is the total 4-momentum P (11.39) P = d P(1) . The point spectrum of P consists of the eigenvalue zero with the vacuum as eigenvector, and the continuous spectrum includes the hyperboloid Vm+ (the + generated by the hyperspectrum of P(1) ) and the closed convex set Conv V2m + boloid V2m . For all values of the mass m ≥ 0, the spectrum of P is therefore a subset of the closed forward light cone Spec P ⊂ V + .

(11.40)

The energy component of (11.39) is the many-particle Hamilton operator H = d(H1 ).

(11.41)

210

Local Fields on the Boson Fock Space: Free Fields

The time evolution is given by the unitary group t → U(t) = exp (−iHt) parametrized by t ∈ R, and the dynamics of the creation and annihilation operators is, as shown in (9.61),   U ∗ (t)A± (f (k)) U(t) = A± f (k)eiω(k)t) . (11.42) Following Section 9.1.4, the behavior of the creation/annihilation operators under Poincar´e transformations (y, !) ∈ M × L↑ is seen to be   U(y, !)A± (f )U ∗ (y, !) = A± (U1 (y, !)f ) = A± eik·y f (!−1 k) . (11.43) The creation/annihilation operators of the singular states (11.35)   ± 2ω(p) δ 3 (k − p) A± p A

(11.44)

+ ). We obtain the wellare operator-valued generalized functions on  + (Hm ± + defined operators A (f ), f ∈ Hm , by the “integration” ' ' − A+ (f ) = f (k)A+ dμ (k), A (f ) = f (k)∗ A− m k k dμm (k). (11.45)

The operators (11.44) satisfy the formal commutation relations ) ( ) ( + + − A− k , Ap = Ak , Ap = 0, ( ) + A− = 2ω(p) δ 3 p − p I, , A p k

(11.46)

which after the integration (11.45) lead to (11.38). More explanations are given in Section 9.1.3. Covariant Field Operators Given a function ϕ(x) ∈ S(M), then the mapping (11.31) determines the oneparticle vectors f = Jϕ and f + = Jϕ ∗ . We define field operators (∓) (ϕ) as the creation/annihilation operators (−) (ϕ) = A+ (f ) and (+) (ϕ) = A− (f + ).

(11.47)

The reason for the interchange of the superscripts ± will be explained later. + The relation (9.28) leads to the identity on Hm  ∗ (+) (ϕ) = (−) (ϕ ∗ ) . (11.48) If ϕ(x) is a real function, its Fourier transform satisfies the identity ϕ(k) ˆ = ∗ , and the functions f and f + coincide, f + (k) = f (k) ∈ H+ . ϕ(−k) ˆ m

11.1 The Free Scalar Field

211

The rules (11.24) and (11.43) imply that (∓) (ϕ) are covariant field operators under Poincar´e transformations U(y, !)(∓) (ϕ)U ∗ (y, !) = (∓) (ϕy,! ).

(11.49)

Both mappings ϕ → (−) (ϕ) and ϕ → (+) (ϕ) are linear (and continuous in suitable topologies). We can therefore write ' ' (−) (ϕ) = (−) (x)ϕ(x) d4 x = A+ k f (k)dμm (k), ' ' (+) (ϕ) = (+) (x)ϕ(x) d4 x = A− (11.50) k f (−k)dμm (k), with an operator-valued generalized functions (∓) (x). The Fourier integral (11.29) indicates that the operators (∓) (x) are the creation/annihilation operators of the singular vector (11.37) '   3 3 ∓ik·x dμm (k). (11.51) (∓) (x) = A± (2π )− 2 eik·x = (2π )− 2 A± k e The creation operator (−) (x) has only negative frequencies, and the annihilation operator (+) (x) has only positive frequencies. The superscripts ∓ refer to these properties. The relations (11.49) imply the transformation rules for these singular field operators U(y, !)(∓) (x)U ∗ (y, !) = (∓) (!x + y).

(11.52)

Both the operators (±) (x) have the same covariant transformation property under Poincar´e transformations. As the 4-dimensional Fourier transforms of the fields (∓) (x) are concentrated on Vm+ and Vm− = −Vm+ , these fields are solutions of the Klein–Gordon equation2    + m2 (∓) (x) = 0. (11.53) The vacuum expectations of the product of two of these fields are ; < ; < | (±) (ϕ), (±) (ψ) = | (−) (ϕ), (+) (ψ) = 0, ; < . / . / | (+) (ϕ), (−) (ψ) = | A− (f + )A+ (g)) = f + | g m , with the latter expression equal to = ϕ ∗ * ψ m , and we have the one-particle vectors f + = Jϕ ∗ and g = Jψ. The commutation relations of the fields follow as 2 The precise meaning of this equation is that

functions ϕ(x) ∈ S(M).



  (∓) (x)  + m2 ϕ(x)d4 x = 0 is true for all test

212

Local Fields on the Boson Fock Space: Free Fields ( ) ( ) (−) (ϕ), (−) (ψ) = (+) (ϕ), (+) (ψ) = 0, ( ) . / (+) (ϕ), (−) (ψ) = ϕ ∗ * ψ m , ( ) . / (−) (ϕ), (+) (ψ) = − ψ ∗ * ϕ m

(11.54)

with the bilinear form, as shown in (11.33), '' . ∗ / 4 4 ϕ *ψm= ϕ(x) + m (x − y) ψ(y)d xd y. M×M

These identities are consequences ofthe relations (11.33), (11.38), and (11.47). The commutators (±) (x), (±) (y) follow from (11.54) by extracting the test functions ϕ and ψ ) ( ) ( (−) (x), (−) (y) = (+) (x), (+) (y) = 0, ( ) (+) (x), (−) (y) = + m (x − y), ( ) (−) (x), (+) (y) = − (11.55) m (x − y) with the distributions (11.17) and (11.18). Local Fields Let ϕ(x) be a test function in S(M). We introduce the field operator (ϕ)  (−) (ϕ) + (+) (ϕ) = A+ (Jϕ) + A− (Jϕ ∗ ),

(11.56)

where the operator J has been defined in (11.31). The mapping ϕ → (ϕ) is linear, and we can write ' (11.57) (ϕ) = (x)ϕ(x)d4 x with the operator-valued generalized function     3 3 (x) = (−) (x) + (+) (x) = A+ (2π )− 2 eik·x + A− (2π )− 2 eik·x . (11.58) As a consequence of (11.48) and (11.31), the adjoint operator of (ϕ) is ((ϕ))∗ = (ϕ ∗ ). Choosing a real test function ϕ, we have f = f + , and the operator (ϕ) is self-adjoint and agrees with the Segal field operator (9.36) ˆ ) = A+ (f ) + A− (f ). The operator-valued generalized function (x) is (f usually also denoted as self-adjoint field operator. With a positive weight function ϕ, the operator (ϕ) can be interpreted as a field strength measured over a spacetime region with efficiency ϕ(x).

11.1 The Free Scalar Field


The operator (11.31) J maps S(M) onto a dense linear subset of H_m⁺. Therefore, the linear span of the vectors Φ(ϕ)Ω = Φ⁽⁻⁾(ϕ)Ω with ϕ ∈ S(M) is dense in H_m⁺. Since any ϕ ∈ S(M) can be written as ϕ = ϕ₁ + iϕ₂ with real functions ϕ₁, ϕ₂ ∈ S(M), the ℂ-linear span of the vectors Φ(ϕ)Ω with real ϕ ∈ S(M) is also dense in H_m⁺.
The commutation rules (11.54) imply

    [Φ(ϕ), Φ(ψ)] = −i ∫∫ ϕ(x) Δ_m(x − y) ψ(y) d⁴x d⁴y · I    (11.59)

with the generalized function (11.20) Δ_m(x). The distribution Δ_m(x) is invariant under orthochronous Lorentz transformations, Δ_m(Λx) = Δ_m(x), Λ ∈ L↑, and it is antisymmetric, Δ_m(−x) = −Δ_m(x), x ∈ M. Such a distribution always vanishes at spacelike distances:

    Δ_m(x) = 0  if x² = x·x < 0.    (11.60)

Hence Φ(ϕ) and Φ(ψ) commute if the supports of ϕ and ψ are spacelike separated. The most important properties of the singular operator (11.58) are the following:

(i) It is covariant under inhomogeneous Lorentz transformations (y, Λ) ∈ M × L↑, as shown in (11.52):

    U(y, Λ) Φ(ϕ) U*(y, Λ) = Φ(ϕ_{y,Λ}),    (11.61)
    U(y, Λ) Φ(x) U*(y, Λ) = Φ(Λx + y).    (11.62)

(ii) It is a local field operator:

    [Φ(x), Φ(y)] = 0  if (x − y)² < 0.    (11.63)

(iii) It is a self-adjoint field operator (in the sense defined previously).
(iv) It is a solution of the Klein–Gordon equation:

    (□ + m²) Φ(x) = 0.    (11.64)

Locality is an essential property of a relativistic field operator. It guarantees that the field strength and other local observables – such as the energy density – at spacelike distances are causally independent. The time evolution of the field operators Φ(ϕ) is given by U*(t) Φ(ϕ) U(t) = e^(iH₁t) Φ(ϕ) e^(−iH₁t) = Φ(ϕ_t) with ϕ_t(x) = ϕ(x⁰ − t, x), or U*(t) Φ(x) U(t) = Φ(x⁰ + t, x).



For real ϕ ∈ S(M), the operator (11.56) Φ(ϕ) is self-adjoint, and it agrees with the Segal field operator (9.36) Φ̂(f), that is,

    Φ(ϕ) ≡ Φ̂(f).

Then the exponential exp(iΦ(ϕ)) is well defined, and the Weyl relations introduced in Section 9.2.2 can be transferred to the local fields

    exp(iΦ(ϕ)) exp(iΦ(ψ)) = e^(iδ(ϕ,ψ)) exp(iΦ(ϕ + ψ))    (11.65)

with the antisymmetric form

    δ(ϕ, ψ) = ½ ∫∫_{M×M} ϕ(x) Δ_m(x − y) ψ(y) d⁴x d⁴y.    (11.66)

To derive the identity (11.65) from (9.86), we use the formula (11.33) for the inner product and the relation (11.20). The displacement relation (9.87) can be written in terms of the local fields as

    e^(iΦ(ψ)) Φ(ϕ) e^(−iΦ(ψ)) = Φ(ϕ) − ∫∫_{M×M} ϕ(x) Δ_m(x − y) ψ(y) d⁴x d⁴y · I.    (11.67)

The real functions ϕ and ψ can be taken from the space S(M), but see the following subsection for singular arguments. Extracting the test function ϕ, we obtain

    e^(iΦ(ψ)) Φ(x) e^(−iΦ(ψ)) = Φ(x) − ∫_M Δ_m(x − y) ψ(y) d⁴y · I.    (11.68)

Singular Distributions

So far, we have assumed that the test functions ϕ are elements of the space S(M). But following the arguments of Section 11.1.3, the field operators (11.56) Φ(ϕ) are still defined if ϕ has the form ϕ(x) = δ(x⁰ − t)χ(x) with χ ∈ L²(ℝ³, d³x). Likewise, Φ(ϕ) is also meaningful for ϕ an element of the space L²(M, (1 + (x⁰)²) d⁴x). The formulas (11.65) and (11.67) have a well-defined meaning for real test functions from these spaces.
The factorization ϕ(x) = δ(x⁰ − t)χ(x) leads to field operators Φ(t, χ) at fixed times. If χ ∈ L²(ℝ³, d³x), the Fourier transform χ̃ is an element of L²(ℝ³, d³k), and χ̃(k) and ω(k)χ̃(k) are elements of the one-particle space L²(ℝ³, d³k/(2ω(k))). The operator Φ(t, χ) has the representation

    Φ(t, χ) = ∫ Φ(t, x) χ(x) d³x = A⁺(e^(iω(k)t) χ̃(k)) + A⁻(e^(iω(k)t) (χ̃(−k))*).    (11.69)



Moreover, the derivative Φ̇(t, χ) = (∂/∂t) Φ(t, χ) is also well defined. If χ(x) is taken as a test function in S(ℝ³), the operator Φ(t, χ) is infinitely often differentiable with respect to t. The time evolution of the operators Φ(t, χ) is given by U*(t) Φ(s, χ) U(t) = e^(iH₁t) Φ(s, χ) e^(−iH₁t) = Φ(s + t, χ).

11.1.5 Two-Point Functions and Propagators

Wightman Functions

The vacuum expectation of the local field operator Φ(x) vanishes, so that we have ⟨Ω | Φ(x) Ω⟩ = 0. The vacuum expectation of the product of two fields is not trivial, however, and is known as the two-point Wightman function,

    w₂(x, y) ≔ ⟨Ω | Φ(x) Φ(y) Ω⟩.    (11.70)

Proposition 11.1.2 The two-point Wightman function is related to the commutator (11.55) by

    w₂(x, y) = Δ_m⁺(x − y)    (11.71)

and is a distribution depending on the difference variable x − y. Moreover, the two-point Wightman function has the invariance property

    w₂(Λx + a, Λy + a) = w₂(x, y),  a ∈ M,  Λ ∈ L↑.    (11.72)

Proof The identity (11.71) is readily established as follows:
The function w(ξ) is first evaluated at the points ξ = (−is, 0) with s > 0, such that

    w(ξ) = ½ (2π)⁻³ g(s)    (11.77)

with

    g(s) = ∫_{ℝ³} e^(−s√(k²+m²)) d³k/√(k² + m²)
         = 4π ∫₀^∞ e^(−s√(κ²+m²)) κ² dκ/√(κ² + m²) = 4π m² ∫₁^∞ e^(−tsm) √(t² − 1) dt
         = 4π m s⁻¹ K₁(ms)  if m > 0,
         = 4π s⁻²  if m = 0.    (11.78)


Thereby

    K₁(s) = s ∫₁^∞ e^(−st) √(t² − 1) dt = s e^(−s) ∫₀^∞ e^(−su) √(u(u + 2)) du    (11.79)

is the modified Bessel function of the second kind. The integrals in (11.79) are holomorphic for s ∈ ℂ with Re s > 0. By rotating the variable u in the range −π/2 < arg u < π/2, we can enlarge the domain of holomorphy to the complex plane with a cut along the negative real axis. Moreover, by rotating the variable u through the cut within the range −π < arg u < π, an analytic continuation to the domain {s = |s| e^(iα) | s ≠ 0, −3π/2 < α < 3π/2} is possible. The function K₁(s) has the representation, as shown in Magnus et al. (1966, sec. 3.2),

    K₁(s) = s⁻¹ + (γ + log(s/2)) I₁(s) − s/4 + r₃(s).    (11.80)

In this formula we have Euler's constant γ, r₃(s) is an entire analytic function with a zero of third order at s = 0, and I₁(s) is the modified Bessel function of the first kind:

    I₁(s) = −iJ₁(is) = Σ_{n=0}^∞ (1/(n!(n+1)!)) (s/2)^(2n+1) = s/2 + s³/16 + ···.    (11.81)

In the neighborhood of s = 0, we obtain

    K₁(s) = s⁻¹ + s R(s)    (11.82)

with R(s) = ½ log(s/2) + a locally bounded function. For large values of |s| and −3π/2 < arg s < 3π/2, the asymptotic behavior of K₁(s) is, as shown in Magnus et al. (1966, sec. 3.14.1),

    K₁(s) = √(π/(2s)) e^(−s) (1 + (3/8) s⁻¹ + O(s⁻²)).    (11.83)

The function (11.78) has therefore the representation

    g(s) = 4π m K₁(ms)/s = 4π/s² + 4π m² R(ms),    (11.84)

where the last term vanishes for m = 0. With ξ = (−is, 0) ∈ ℂ⁴ such that s = √(−ξ²), the function w(ξ) is

    w(ξ) = (2π)⁻² (m/√(−ξ²)) K₁(m√(−ξ²))    (11.85)

if m > 0, and w(ξ) = (2π)⁻² (−ξ²)⁻¹ if m = 0. Due to the Lorentz invariance (11.75) and analyticity, this result is true for all ξ ∈ M + iV⁻ ⊂ M_ℂ. In this
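The integral representation (11.79) and the limiting forms (11.82) and (11.83) can be checked numerically. The following sketch (not from the book) evaluates K₁ by Simpson's rule from (11.79), after the substitution u = v² that removes the square-root singularity at u = 0; the function name k1 and all tolerances are choices made here for illustration.

```python
import math

def k1(s, n=4000):
    """K1(s) via (11.79): K1(s) = s e^{-s} * Int_0^inf e^{-su} sqrt(u(u+2)) du.
    With u = v^2 the integrand 2 e^{-s v^2} v^2 sqrt(v^2 + 2) is smooth,
    so a plain Simpson rule on [0, vmax] suffices."""
    vmax = math.sqrt(40.0 / s)          # e^{-s v^2} is negligible beyond this
    h = vmax / n
    tot = 0.0
    for i in range(n + 1):
        v = i * h
        f = math.exp(-s * v * v) * v * v * math.sqrt(v * v + 2.0)
        tot += f * (1.0 if i in (0, n) else (4.0 if i % 2 else 2.0))
    return 2.0 * s * math.exp(-s) * tot * h / 3.0

# Small-argument behaviour (11.82): K1(s) = 1/s + s R(s), so s*K1(s) -> 1.
assert abs(k1(0.01) * 0.01 - 1.0) < 1e-2

# Large-argument asymptotics (11.83): K1(s) ~ sqrt(pi/(2s)) e^{-s} (1 + 3/(8s)).
s = 20.0
asym = math.sqrt(math.pi / (2 * s)) * math.exp(-s) * (1 + 3 / (8 * s))
assert abs(k1(s) / asym - 1.0) < 1e-3
```

Both regimes agree with the quadrature to well within the stated tolerances, which is the content of (11.82) and (11.83).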



domain, the sign of the square root satisfies Re √(−ξ²) > 0. In the limit ξ = x − i(ε, 0), ε → +0, we obtain the value of w on the Minkowski space³

    Δ_m⁺(x) = w(x⁰ − i0, x) = (2π)⁻² (m/√(−x² + i0x⁰)) K₁(m√(−x² + i0x⁰))    (11.86)
             = (2π)⁻² (−x² + i0x⁰)⁻¹ + (2π)⁻² m² R(m√(−x² + i0x⁰)).    (11.87)

The function √(−x² + i0x⁰) is the limit of √(−ξ²) for Im ξ⁰ → −0, that is,

    √(−x² + i0x⁰) = √(−x²)   if x² < 0,
                  = −i√(x²)  if x² > 0, x⁰ > 0,
                  = +i√(x²)  if x² > 0, x⁰ < 0.

The explicit result (11.85) shows that the domain of analyticity of w(ξ) is much larger than the tubular domain T = M + iV⁻. The function (11.85) is holomorphic for ξ ∈ M_ℂ if ξ ≠ 0 and −π < arg(−ξ²) < π (with Re √(−ξ²) > 0). This domain includes the domain of spacelike vectors of the Minkowski space and the euclidean points ξ_E = (ix⁴, x¹, x², x³) ≠ 0 with x = (x¹, x², x³, x⁴) ∈ E = ℝ⁴. At the euclidean points, the function w has the values

    w(ξ_E) = (2π)⁻² (m/|x|) K₁(m|x|),    (11.88)

with the euclidean length |x| = √(Σ_{μ=1}^4 (x^μ)²). As a consequence of this result, the two-point Wightman function w₂(x₁, x₂) at spacetime points x₁, x₂ ∈ M is the boundary value of a function that is analytic in both variables ξ₁ and ξ₂. This function has an analytic continuation to the euclidean points ξ_j^E = (ix_j⁴, x_j¹, x_j², x_j³), j = 1, 2, with x_j = (x_j¹, x_j², x_j³, x_j⁴) ∈ E and x₁ ≠ x₂,

    w₂(ξ₁^E, ξ₂^E) = (2π)⁻² (m/|x₁ − x₂|) K₁(m|x₁ − x₂|).    (11.89)

Here |x₁ − x₂| = √(Σ_{μ=1}^4 (x₁^μ − x₂^μ)²) is the euclidean distance.
The distribution w(x) has no defined value at x = 0, and the vacuum expectation ⟨Ω | Φ(x) Φ(y) Ω⟩ = Δ_m⁺(x − y) is not defined at coinciding points x = y.

³ A general theory of Lorentz invariant distributions has been developed by Methée (1954); see section 3.3 of Bogoliubov et al. (1990). For the connection between boundary values of holomorphic functions and distributions, see Bremermann and Durand (1961) and chapter 2 of Streater and Wightman (1964).



Hence the product of fields Φ(x)Φ(y) cannot be defined at coinciding points x = y. But for the free field, it turns out that the only ill-defined term comes from the vacuum expectation. The normal product or Wick product of the field operators Φ(x) and Φ(y) is defined as

    :Φ(x)Φ(y): ≔ Φ(x)Φ(y) − w₂(x, y) I.    (11.90)

At the algebraic level, this is really just the usual process of expanding terms and then summarily moving all the creation operators to the left of all annihilation operators, ignoring the commutation relations. Indeed, the identity

    Φ(x)Φ(y) − w₂(x, y) I = Φ(x)Φ(y) − [Φ⁽⁺⁾(x), Φ⁽⁻⁾(y)]

implies this representation of the normal product by "normal ordering":

    :Φ(x)Φ(y): = Φ⁽⁺⁾(x)Φ⁽⁺⁾(y) + Φ⁽⁻⁾(y)Φ⁽⁺⁾(x) + Φ⁽⁻⁾(x)Φ⁽⁺⁾(y) + Φ⁽⁻⁾(x)Φ⁽⁻⁾(y).    (11.91)

The normal product transforms like

    U(a, Λ) :Φ(x)Φ(y): U*(a, Λ) = :Φ(Λx + a)Φ(Λy + a):    (11.92)

under orthochronous Poincaré transformations (a, Λ) ∈ M × L↑. This behavior follows from the transformation rules (11.62) and (11.72). Another important property of the normal product is its commutativity

    :Φ(x)Φ(y): = :Φ(y)Φ(x): .    (11.93)

This relation is an immediate consequence of (11.91), since creation operators commute with creation operators and annihilation operators commute with annihilation operators. This normal product is defined as an operator-valued distribution also at coincidence points:

    :Φ²(x): ≔ :Φ(x)Φ(x): .    (11.94)

This singular operator is called the (second) Wick power of Φ(x). It is straightforward to prove that the inner product ⟨F | :Φ²(x): G⟩ is a continuous function of x if F and G are tensors in a suitable linear subset D of the Fock space as defined in Section 9.1.3. But it takes more effort to derive that ∫ ϕ(x) :Φ²(x): d⁴x is a well-defined operator on D for all test functions ϕ ∈ S(M); cf. Wightman and Garding (1964).

Commutator Function

The commutator of the field operators Φ(x) and Φ(y) is a c-number, which can be calculated from its vacuum expectation


    −i Δ_m(x − y) = ⟨Ω | [Φ(x), Φ(y)] Ω⟩ = Δ_m⁺(x − y) − Δ_m⁺(y − x).    (11.95)

Taking into account the identity between Bessel functions

    is (R(is) − R(−is)) = K₁(is) + K₁(−is) = −π J₁(s),  s > 0,

the distribution Δ_m(x) = i(Δ_m⁺(x) − Δ_m⁺(−x)) follows from (11.87) and (11.80) as

    Δ_m(x) = (1/(2π)) sign(x⁰) δ(x²) − (m/(4π)) sign(x⁰) θ(x²) J₁(m√(x²))/√(x²).    (11.96)

The Fourier representation

    Δ_m(x) = i (2π)⁻³ ∫ e^(−ik·x) sign(k⁰) δ(k² − m²) d⁴k    (11.97)

is an immediate consequence of (11.73). The generalized function Δ_m(x) is Lorentz invariant, Δ_m(Λx) = Δ_m(x), Λ ∈ L↑, and it is also antisymmetric, Δ_m(−x) = −Δ_m(x). It therefore vanishes for spacelike x, as can also be seen from the representation (11.96).
We now investigate Δ_m(x) = Δ_m(t, x) as a distribution in x with parameter t ∈ ℝ. Choosing a test function ϕ(x) ∈ S(ℝ³), the most singular term (1/(2π)) sign(x⁰) δ(x²) gives the contribution

    ∫ sign(t) δ(t² − x²) ϕ(x) d³x = (t²/(2t)) ∫_{|x|=|t|} ϕ(x) dΩ = 2π t ϕ(0) + O(t³)    (11.98)

if t ≠ 0. The remaining contributions are of the order ∫_{|x|≤|t|} d³x ∼ |t|³ for small |t|. Hence Δ_m(t, x) is a solution of the Klein–Gordon equation

    (□ + m²) Δ_m(x) = 0    (11.99)

with the initial conditions

    Δ_m(t = 0, x) = 0,  (∂/∂t) Δ_m(t = 0, x) = δ³(x).    (11.100)

The second derivative (∂²/∂t²) Δ_m(t = 0, x) vanishes at t = 0 as a consequence of (11.98). Performing the k⁰ integration in (11.97), we obtain the useful Fourier representation

    Δ_m(x) = (2π)⁻³ ∫_{ℝ³} (sin(ω(k)t)/ω(k)) e^(ik·x) d³k.    (11.101)



Retarded Propagator

The retarded propagator or retarded Green's function Δ_m^ret(x) is the fundamental solution of the Klein–Gordon equation

    (□ + m²) Δ_m^ret(x) = δ⁴(x)    (11.102)

that vanishes for t < 0. Its Fourier representation is

    Δ_m^ret(x) = (2π)⁻⁴ ∫ e^(−ik·x) (m² − k² − i0k⁰)⁻¹ d⁴k.    (11.103)

The distribution (m² − k² − i0k⁰)⁻¹ is the boundary value of the Lorentz invariant analytic function (m² − w²)⁻¹ with vectors w = k + iu in ℂ⁴, k ∈ ℝ⁴, u ∈ V⁺, for u → 0. This function is holomorphic for Im k⁰ > 0. Hence the distribution (11.103) vanishes for t < 0. The identity

    (m² − k²) (m² − k² − i0k⁰)⁻¹ = 1

implies the relation (11.102). The Green's function (11.103) is a well-defined Lorentz invariant distribution, Δ_m^ret(Λx) = Δ_m^ret(x), Λ ∈ L↑, with a support in the closed forward cone V̄⁺. The advanced propagator

    Δ_m^adv(x) = Δ_m^ret(−t, x) = Δ_m^ret(−x)    (11.104)

is another solution of the differential equation (11.102). It vanishes for t > 0. The relation Δ_m^ret(x) − Δ_m^ret(−x) = Δ_m(x) follows from the identity

    (m² − k² − i0k⁰)⁻¹ − (m² − k² + i0k⁰)⁻¹ = 2πi sign(k⁰) δ(k² − m²).

Then the support restrictions of Δ_m^ret(x) and Δ_m^adv(x) lead to the identification

    Δ_m^ret(x) = θ(x⁰) Δ_m(x) = iθ(x⁰) Δ_m⁺(x) − iθ(x⁰) Δ_m⁺(−x).    (11.105)

The last identity follows from (11.95).

Feynman Propagator and the Two-Point τ-Function

The causal Green's function or Feynman propagator Δ_m^c(x) is yet another solution of the differential equation (11.102). It is defined as

    Δ_m^c(x) ≔ Δ_m^ret(x) + i Δ_m⁺(−x) = i Δ_m⁺(x) if x⁰ > 0,  and  i Δ_m⁺(−x) if x⁰ < 0.    (11.106)



The second identification can be written as

    Δ_m^c(x) = iθ(x⁰) Δ_m⁺(x) + iθ(−x⁰) Δ_m⁺(−x);

it follows from (11.105). Its Fourier transform is

    Δ̃_m^c(k) = ∫ Δ_m^c(x) e^(ik·x) d⁴x = (m² − k² − i0k⁰)⁻¹ + 2πi θ(−k⁰) δ(k² − m²) = (m² − k² − i0)⁻¹.    (11.107)

The generalized functions Δ̃_m^c(k) and Δ_m^c(x) are invariant under Lorentz transformations of the full group L. The propagator −iΔ_m^c(x) and the distribution Δ_m⁺(x) are boundary values of the analytic function w(ξ); they coincide for spacelike x, a region that lies inside the analyticity domain, and for positive timelike x; but they are different boundary values for negative timelike x:

    −iΔ_m^c(x) = Δ_m⁺(x) = w(x) = w(−x)   if x² < 0,
    −iΔ_m^c(x) = Δ_m⁺(x) = w(x⁰ − i0, x)  if x² > 0, x⁰ > 0,
    −iΔ_m^c(x) = w(x⁰ + i0, x)            if x² > 0, x⁰ < 0,
    Δ_m⁺(x) = w(x⁰ − i0, x)               if x² > 0, x⁰ < 0.    (11.108)

In analogy to (11.71), we define the two-point τ-function as

    τ(x₁, x₂) ≔ −iΔ_m^c(x₁ − x₂) = θ(x₁⁰ − x₂⁰) w₂(x₁, x₂) + θ(x₂⁰ − x₁⁰) w₂(x₂, x₁).    (11.109)

The second identity follows from (11.106). The two-point τ-function is a well-defined generalized function in S′(M × M). The τ-function is symmetric in its arguments x₁ and x₂, and – as a consequence of the Lorentz invariance of Δ_m^c(x) – it has the invariance property

    τ(Λx₁ + a, Λx₂ + a) = τ(x₁, x₂),  a ∈ M,  Λ ∈ L,    (11.110)

under Poincaré transformations. This τ-function is related to the time-ordered product of field operators. The time-ordered product or T-product of the field operators Φ(x₁) and Φ(x₂) is formally introduced as

    T Φ(x₁)Φ(x₂) = θ(x₁⁰ − x₂⁰) Φ(x₁)Φ(x₂) + θ(x₂⁰ − x₁⁰) Φ(x₂)Φ(x₁).    (11.111)



The right side of (11.111) does not depend on the Lorentz system, since the operators Φ(x₁) and Φ(x₂) commute for spacelike separation of the arguments. But it is only a formal definition, since the multiplication of the generalized operator Φ(x₁)Φ(x₂) with the step function θ(x₁⁰ − x₂⁰) is so far only a formal device. As a consequence of (11.109), the vacuum expectation of (11.111) can be identified with the τ-function

    τ(x₁, x₂) = ⟨Ω | T Φ(x₁)Φ(x₂) Ω⟩,    (11.112)

which is well defined. To proceed, we use the identity (11.90). The normal product :Φ(x₁)Φ(x₂): is symmetric in the variables x₁ and x₂. The time-ordered prescription therefore does not affect this product, T(:Φ(x₁)Φ(x₂):) = :Φ(x₁)Φ(x₂):. The identities (11.90) and (11.109) then imply that (11.111) can be identified with the following well-defined product. The time-ordered product of the field operators Φ(x₁) and Φ(x₂) is defined as

    T Φ(x₁)Φ(x₂) ≔ τ(x₁, x₂) I + :Φ(x₁)Φ(x₂): .    (11.113)

The product (11.113) is an operator-valued generalized function of the variables x₁ and x₂. It is symmetric in x₁ and x₂, and it transforms under orthochronous Poincaré transformations as

    U(a, Λ) (T Φ(x)Φ(y)) U*(a, Λ) = T Φ(Λx + a)Φ(Λy + a).    (11.114)

This relation follows from (11.92) and (11.110).

Euclidean Propagator and Schwinger Function

The function (11.74) w(ξ) can be analytically continued to the euclidean points ξ_E = (ix⁴, x¹, x², x³) with x = (x¹, x², x³, x⁴) ∈ E = ℝ⁴, x ≠ 0. The euclidean propagator

    Δ_m^E(x) ≔ w(ix⁴, x) = (2π)⁻² (m/|x|) K₁(m|x|)    (11.115)

is a function of the euclidean length |x| = √(Σ_{μ=1}^4 (x^μ)²) of the vector x ∈ E = ℝ⁴; see (11.88). From Section 11.1.5, we know that w(ix⁴, x) is an analytic continuation of the Feynman propagator w(x) = −iΔ_m^c(x), x spacelike. Starting from the Feynman propagator (11.107) in the momentum representation, the energy variable k⁰ can be continued to the imaginary axis, k⁰ → ik⁴ with k⁴ ∈ ℝ. This continuation is the Wick rotation, k⁰ → e^(iα) k⁰, 0 ≤ α ≤ π/2, within the analyticity domain of (m² − k² − i0)⁻¹. Then the indefinite form k² = (k⁰)² − (k¹)² − (k²)² − (k³)² becomes the negative definite form −(k⁴)² − (k¹)² − (k²)² − (k³)² = −|k|², k = (k¹, k², k³, k⁴) ∈ ℝ⁴. The euclidean propagator is therefore defined in the momentum variables as


    Δ̃_m^E(k) = (|k|² + m²)⁻¹.    (11.116)

The functions (11.115) and (11.116) are locally integrable and decreasing for large arguments (including the case m = 0). They are well-defined generalized functions in S′(ℝ⁴), which are related by the euclidean Fourier transform

    Δ_m^E(x) = (2π)⁻⁴ ∫_E (|k|² + m²)⁻¹ exp(−i kx) d⁴k    (11.117)

(see, e.g., Gelfand and Shilov, 1964). Thereby kx is the euclidean inner product kx = Σ_{μ=1}^4 k^μ x^μ. As already stated in Section 11.1.5, the Wightman function w₂(x, y) = w(x − y) has an analytic continuation to w₂(ix⁴, x, iy⁴, y). The resulting function

    S₂(x, y) = w₂(ix⁴, x, iy⁴, y) = Δ_m^E(x − y)    (11.118)

is called the two-point Schwinger function.
The causal propagator Δ_m^c(x) has been introduced into quantum field theory by Stueckelberg and Rivier (1949) and by Feynman (1949). It is a basic ingredient of the standard perturbation theory. The transition from the Minkowski time variable to a variable on the imaginary axis, and the corresponding transition of the energy variable by analytic continuation ("rotation"), was introduced by Wick (1954) as a method to simplify calculations. A systematic investigation of euclidean Green's functions started with the work of Schwinger (1958), Nakano (1959), and Symanzik (1966).
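The pair (11.115)/(11.116) can be probed numerically: away from x = 0, the position-space propagator must satisfy (−Δ + m²) Δ_m^E = 0, where the 4-dimensional radial Laplacian is f″ + (3/r) f′. The following sketch (not from the book) checks this with finite differences; K₁ is computed from the integral (11.79) with a stdlib-only quadrature, and the overall constant (2π)⁻² is dropped since the equation is linear.

```python
import math

def k1(s, n=4000):
    # K1 via (11.79) with the substitution u = v^2 (Simpson's rule).
    vmax = math.sqrt(40.0 / s)
    h = vmax / n
    tot = 0.0
    for i in range(n + 1):
        v = i * h
        f = math.exp(-s * v * v) * v * v * math.sqrt(v * v + 2.0)
        tot += f * (1.0 if i in (0, n) else (4.0 if i % 2 else 2.0))
    return 2.0 * s * math.exp(-s) * tot * h / 3.0

def E(r, m=1.0):
    """Euclidean propagator (11.115) up to the constant (2*pi)**-2."""
    return m * k1(m * r) / r

# (−Δ + m²) E = 0 for r > 0, with radial Laplacian f'' + (3/r) f' in 4 dims.
m, r, h = 1.0, 1.0, 0.02
d1 = (E(r + h, m) - E(r - h, m)) / (2 * h)
d2 = (E(r + h, m) - 2 * E(r, m) + E(r - h, m)) / (h * h)
residual = -(d2 + 3.0 * d1 / r) + m * m * E(r, m)
assert abs(residual) < 0.05 * E(r, m)
```

The residual is small compared to m²E itself, consistent with Δ_m^E being the Green's function of −Δ + m² concentrated at the origin.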

11.1.6 N-Point Functions and Normal Product

Wightman Functions

The vacuum expectation

    wₙ(x₁, ..., xₙ) = ⟨Ω | Φ(x₁) ··· Φ(xₙ) Ω⟩    (11.119)

of the product of n field operators is called the n-point Wightman function. The field operators Φ(x_j) = Φ⁽⁻⁾(x_j) + Φ⁽⁺⁾(x_j) are sums of creation and annihilation operators, and the expectation (11.119) can be calculated as done in Section 9.3.1. The result yields again the Gaussian combinatorics of moments,

    w_{2k}(x₁, ..., x_{2k}) = Σ_{σ∈S<(2k)} w₂(x_{σ(1)}, x_{σ(2)}) ··· w₂(x_{σ(2k−1)}, x_{σ(2k)}),    (11.120)

with the odd terms vanishing. Recall that the two-point Wightman function is w₂(x, y) = Δ_m⁺(x − y). Here S< is the set of all permutations σ of the index



set {1, ..., 2k} that respect the order relations σ(1) < σ(3) < ··· < σ(2k − 1) and σ(2j − 1) < σ(2j), j = 1, ..., k.
The Wightman functions are in general not invariant under an interchange of their arguments. Let X = (x₁, ..., xₙ) be a sequence of spacetime points; then we introduce the following notation for an ordered product of field operators,

    Φ(X) = Φ(x₁) ··· Φ(xₙ),

and the Wightman function is then defined as

    w(X) = ⟨Ω | Φ(X) Ω⟩.

This is entirely consistent with (11.119). The representation (11.120) and the analyticity properties of the two-point Wightman function together imply that the generalized function w(x₁, ..., xₙ) is the boundary value of an analytic function w(ξ₁, ..., ξₙ). The domain of analyticity includes the region (ξ₁, ..., ξₙ) ∈ ℂ^(4n) with the condition Re √(−(ξ_j − ξ_k)²) > 0, 1 ≤ j < k ≤ n.

Normal (Wick) Product of Field Operators

Given a sequence X = (x₁, ..., xₙ), we would like to obtain a simple way to describe the normal product or Wick product :Φ(X):. Evidently this should be the generalization of the identity (11.91), and therefore the same as taking Φ(x₁) ··· Φ(xₙ), writing each term as Φ(x_k) = Φ⁽⁻⁾(x_k) + Φ⁽⁺⁾(x_k), expanding, and then moving all creators to the left of all annihilators, ignoring the commutation relations. This procedure is called normal ordering or Wick ordering.
For the given sequence X = (x₁, ..., xₙ), we shall understand the equation X₁ + X₂ = X to mean a way of splitting X into a pair of disjoint subsequences, respecting the same order of terms; see Section 4.1.7. The null subsequence is allowed. Given a sequence X = (x₁, ..., xₙ), we define the Wick product, :Φ(x₁) ··· Φ(xₙ):, to be

    :Φ(X): ≔ Σ_{X₁+X₂=X} Φ⁽⁻⁾(X₁) Φ⁽⁺⁾(X₂).    (11.121)

The Wick product (11.121) is an associative product, generated from the product (11.90) of two fields. The Wick product of field operators is commutative. The covariance properties (11.49) of Φ⁽±⁾ imply the covariance of the normal product

    U(y, Λ) :Φ(x₁) ··· Φ(xₙ): U*(y, Λ) = :Φ(Λx₁ + y) ··· Φ(Λxₙ + y): .    (11.122)
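The permutations in S<(2k) are in bijection with the pairings of {1, ..., 2k}, of which there are (2k − 1)!! = 1·3·5···(2k − 1). A short enumeration (a sketch, not from the book; the two-point function w2 below is an invented numerical stand-in) confirms the count and spells out the Gaussian sum (11.120) for 2k = 4.

```python
def pairings(indices):
    """All pairings of an even-length tuple of indices. Each pairing is a
    list of pairs (i, j) with i < j, sorted by first element -- exactly the
    data selected by a permutation in S<(2k) in (11.120)."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for pos in range((len(rest))):
        for tail in pairings(rest[:pos] + rest[pos + 1:]):
            yield [(first, rest[pos])] + tail

# (2k-1)!! pairings of 2k points: 1, 3, 15, 105, ...
counts = [sum(1 for _ in pairings(tuple(range(2 * k)))) for k in range(1, 5)]
assert counts == [1, 3, 15, 105]

# Gaussian moment (11.120) for 2k = 4 with a toy stand-in for w2(x_i, x_j):
w2 = lambda i, j: 1.0 / (1 + i + 2 * j)
w4 = sum(w2(*p) * w2(*q) for p, q in pairings((0, 1, 2, 3)))
expected = w2(0, 1) * w2(2, 3) + w2(0, 2) * w2(1, 3) + w2(0, 3) * w2(1, 2)
assert abs(w4 - expected) < 1e-12
```

The three terms in `expected` are precisely the three elements of S<(4) written out by hand.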

The Wick product (11.121) is an associative product, generated from the product (11.90) of two fields. The Wick product of field operators is commutative. The covariance properties (11.49) of (±) imply the covariance of the normal product U(y, !) :(x1 ) . . . (xn ): U ∗ (y, !) = :(!x1 + y) . . . (!xn + y): . (11.122)



For n = 1, we obtain :Φ(x): ≔ Φ(x), and for n = 2 the identity (11.121) yields the normal product (11.91) of Section 11.1.5. More generally, the definition (11.121) leads to the following.

Proposition 11.1.4 The sequentially ordered and Wick-ordered products of field operators are related by

    Φ(X) = Σ_{X₁+X₂=X} w(X₁) :Φ(X₂):,    (11.123)

    :Φ(X): = Σ_{X₁+X₂=X} (−1)^(#X₁/2) w(X₁) Φ(X₂).    (11.124)

Proof This is easily established by induction. These relations are true for n = 1 and n = 2. For larger n, they can be derived with the help of recurrence relations. The product :Φ(X): Φ(x_{n+1}) may be calculated using (11.121), and it is straightforward to derive the recurrence relation

    :Φ(X): Φ(x_{n+1}) = Σ_{m=1}^{n} w₂(x_m, x_{n+1}) :Φ(X/x_m): + :Φ(X + x_{n+1}): .

The notation here is that X/x_m is the sequence (x₁, ..., xₙ) with x_m removed, and X + x_{n+1} means (x₁, ..., xₙ, x_{n+1}). The identities (11.123) and (11.124), which together constitute the Möbius inversion formula for sequences, are consequences of this relation.
A crucially important remark is that the Wick product is symmetric under interchange of its arguments. Given X = (x₁, ..., xₙ), we could take the corresponding set of elements X = {x₁, ..., xₙ}, and since :Φ(X): does not depend on the order of the elements of the sequence, we could think of it as a function of the set X instead. That is, we may reconsider the Wick product as

    :Φ(X): ≡ : Π_{x∈X} Φ(x): .

In light of this, we may interpret equation (11.121) as a standard Wick product

    :Φ: ≡ Φ⁽⁻⁾ ⋆ Φ⁽⁺⁾,    (11.125)

where Φ⁽±⁾(X) ≔ Π_{x∈X} Φ⁽±⁾(x). Note that the objects appearing in (11.125) are operators and that their order, with the creators Φ⁽⁻⁾ to the left of the annihilators Φ⁽⁺⁾, is essential.
The Wick product was introduced by Wick (1950). In this publication, Wick also derived the identities (11.123) and (11.124) (first Wick theorem) and the subsequent relations (11.132) with the T-product (second Wick theorem).



Time-Ordered Product and τ-Functions

The time-ordered product or T-product of the fields Φ(x₁), ..., Φ(xₙ) is formally defined as

    T Φ(x₁) ··· Φ(xₙ) = Σ_{σ∈S(n)} θ(x⁰_{σ(1)} > ··· > x⁰_{σ(n)}) Φ(x_{σ(1)}) ··· Φ(x_{σ(n)}).    (11.126)

The step function θ(s₁ > ··· > sₙ) with (s₁, ..., sₙ) ∈ ℝⁿ is defined as

    θ(s₁ > ··· > sₙ) = 1 if s₁ > ··· > sₙ, and 0 otherwise.

The sum in (11.126) extends over all permutations σ ∈ S(n) of the index set {1, ..., n}. By formal arguments, one obtains the following properties, which should be satisfied by the final definition:

    T Φ(x₁) ··· Φ(xₙ) = Φ(x₁) ··· Φ(xₙ) if x₁⁰ > ··· > xₙ⁰,    (11.127)

    T Φ(x₁) ··· Φ(xₙ) = T Φ(x_{σ(1)}) ··· Φ(x_{σ(n)}),  σ ∈ S(n),    (11.128)

    U(y, Λ) T Φ(x₁) ··· Φ(xₙ) U*(y, Λ) = T Φ(Λx₁ + y) ··· Φ(Λxₙ + y).    (11.129)

In the last identity, the Poincaré transformations used were assumed to be orthochronous, that is, (y, Λ) ∈ M × L↑. We note that the time-ordered product may be alternatively defined as T Φ(x₁) ··· Φ(xₙ) = Φ(x_{π(1)}) ··· Φ(x_{π(n)}), where π is the permutation that leads to the time ordering x⁰_{π(1)} > ··· > x⁰_{π(n)}. The vacuum expectation

    τₙ(x₁, ..., xₙ) = ⟨Ω | T Φ(x₁) ··· Φ(xₙ) Ω⟩    (11.130)

is called the n-point τ-function. Following (11.128), the τ-function is totally symmetric in the variables x₁, ..., xₙ. It is therefore determined by its values modulo the simultaneous points. We note that we have τₙ(x₁, ..., xₙ) ≡ wₙ(x_{π(1)}, ..., x_{π(n)}), where π is the permutation putting the sequence (x₁, ..., xₙ) into chronological order – unfortunately, π depends on the sequence! For definiteness, let us assume that x₁⁰ > ··· > xₙ⁰; then (11.130) coincides with the Wightman function (11.119), which in turn has the explicit representation (11.120) as a sum over two-point Wightman functions. The corresponding sum over two-point τ-functions



    τ_{2m}(x₁, ..., x_{2m}) = Σ_{σ∈S<(2m)} τ₂(x_{σ(1)}, x_{σ(2)}) ··· τ₂(x_{σ(2m−1)}, x_{σ(2m)})    (11.131)

is totally symmetric and coincides with (11.120) whenever x₁⁰ > ··· > xₙ⁰. Hence the identities (11.131) give a well-defined meaning to the vacuum expectation of (11.126). For given X = (x₁, ..., xₙ), the time-ordered product T Φ(X) will only depend on the set X = {x₁, ..., xₙ}, since any permutation of the original sequence will lead to the same chronological reordering. It therefore makes sense to define

    T Φ(X) = T Π_{x∈X} Φ(x).
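The recipe "sort the arguments chronologically, then take the Gaussian pairing sum" can be sketched numerically. The helper names and the complex toy function w2 below are invented stand-ins (a genuine two-point Wightman function is a distribution, not a function); the point is only that total symmetry (11.128) becomes manifest once the reordering is built in.

```python
import cmath
from itertools import permutations

def pairings(idx):
    # pairings of an even-length index tuple, pairs ordered by position
    if not idx:
        yield []
        return
    first, rest = idx[0], idx[1:]
    for p in range(len(rest)):
        for tail in pairings(rest[:p] + rest[p + 1:]):
            yield [(first, rest[p])] + tail

def w2(s, t):
    # toy stand-in for the (non-symmetric) two-point Wightman function
    return cmath.exp(-1j * (s - t))

def tau(times):
    """n-point tau-function of time arguments: reorder chronologically
    (latest first), then apply the Gaussian pairing sum (11.131)."""
    ts = tuple(sorted(times, reverse=True))
    total = 0j
    for pairing in pairings(tuple(range(len(ts)))):
        term = 1 + 0j
        for i, j in pairing:
            term *= w2(ts[i], ts[j])
        total += term
    return total

# tau_2 equals w2 evaluated with the later time first:
assert abs(tau((1.0, 2.0)) - w2(2.0, 1.0)) < 1e-12

# total symmetry (11.128): tau_4 is independent of the argument order
base = (3.0, 1.5, 0.7, -2.0)
vals = [tau(perm) for perm in permutations(base)]
assert all(abs(v - vals[0]) < 1e-12 for v in vals)
```

Because the permutation π of the text is recomputed inside `tau`, every ordering of the same four times yields the same value, which is exactly what distinguishes the τ-functions from the Wightman functions.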

Likewise, the τ-functions are symmetric and we may write τ(X) ≡ ⟨Ω | T Φ(X) Ω⟩. We immediately obtain the Gaussian decomposition as

    τ(X) ≡ Σ_{P∈Pair(X)} τ_P,

where we use the same notation as in (6.15). We can now give a precise meaning to the formal sum (11.126). If x₁⁰ > ··· > xₙ⁰, the T-product (11.126) coincides with the product (11.127), which can be expressed by Wightman functions and normal products via (11.123):

    T Φ(x₁) ··· Φ(xₙ) = Σ_{X₁+X₂=X} w(X₁) :Φ(X₂): .

Now, as the subsequence X₁ respects the correct chronological ordering, we may replace the Wightman function w(X₁) by τ(X₁), leading to the identification

    T Φ(x₁) ··· Φ(xₙ) ≔ Σ_{X₁+X₂=X} τ(X₁) :Φ(X₂): .    (11.132)

The main result is that (11.132) now yields a well-defined operator-valued generalized function that obviously satisfies (11.127) and (11.128). The identity (11.129) follows from the invariance τ₂(Λx₁ + y, Λx₂ + y) = τ₂(x₁, x₂) of the two-point τ-function and the covariance (11.122) of the normal product. For an arbitrary finite set X = {x₁, ..., xₙ}, we may use the two previous lemmas to write (11.132) as T Φ(X) = Σ_{X₁+X₂=X} τ(X₁) :Φ(X₂):, where the



decomposition is into disjoint subsets. This may be written in the more compact notation of (4.5):

    T Φ ≡ τ ⋆ :Φ: .    (11.133)

The time-ordered exponential of Φ[J] = ∫ J(x) Φ(x) dx is defined to be

    T e^(Φ[J]) = Σₙ (1/n!) T Φ[J]ⁿ = Σₙ (1/n!) ∫ J(x₁) ··· J(xₙ) T Φ(x₁) ··· Φ(xₙ) dx₁ ··· dxₙ,

or, in Guichardet notation,

    T e^(Φ[J]) = ∫ J^X T Φ(X) dX.

Setting Δₙ = {(x₁, ..., xₙ) : t(x₁) > ··· > t(xₙ)}, we see that the time-ordered exponential is

    T e^(Φ[J]) = Σₙ ∫_{Δₙ} J(x₁) ··· J(xₙ) T Φ(x₁) ··· Φ(xₙ) dx₁ ··· dxₙ.

Now, we saw in (11.133) that T Φ = τ ⋆ :Φ:, and by virtue of the Wick product we have that T e^(Φ[J]) = Z_{TΦ} = Z_τ[J] Z_{:Φ:}[J]. For the free particle case, the τ-functions are Gaussian, so that Z_τ[J] ≡ exp(½ ∫∫ J(x) τ(x, y) J(y) dx dy), while we introduce the normal-ordered exponential

    Z_{:Φ:}[J] = ∫ J^X :Φ(X): dX.

We also have (11.125), :Φ: ≡ Φ⁽⁻⁾ ⋆ Φ⁽⁺⁾, so that Z_{:Φ:}[J] = Z_{Φ⁽⁻⁾}[J] Z_{Φ⁽⁺⁾}[J]. Note that these generating functions do not commute! Therefore, we have in the free particle case

    T e^(Φ[J]) = exp(½ ∫∫ J(x) τ(x, y) J(y) dx dy) e^(Φ⁽⁻⁾[J]) e^(Φ⁽⁺⁾[J]),

which converts the time-ordered exponential into Wick-ordered form.
In perturbation theory, one needs a generalization of the T-product to time ordering of Wick products. Since the Wick product is commutative, we have

    T :Φ(X): ≡ :Φ(X): .
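The structure of this Wick-ordering formula – a Gaussian prefactor times a product with creators to the left – can be illustrated in the simplest possible setting, a single mode with a|n⟩ = √n|n−1⟩ and a†|n⟩ = √(n+1)|n+1⟩. The analogue of the formula is ⟨0| e^(α(a+a†)) |0⟩ = e^(α²/2) ⟨0| e^(αa†) e^(αa) |0⟩ = e^(α²/2). The following sketch (not from the book; all names and the truncation size are choices made here) verifies this on a truncated number basis with a Taylor series, pure stdlib.

```python
import math

DIM = 60  # truncated number-state basis |0>, ..., |DIM-1>

def apply_field(alpha, vec):
    """Apply alpha*(a + a^dagger) to a state in the number basis."""
    out = [0.0] * DIM
    for n, c in enumerate(vec):
        if c == 0.0:
            continue
        if n + 1 < DIM:
            out[n + 1] += alpha * math.sqrt(n + 1) * c   # creation part
        if n > 0:
            out[n - 1] += alpha * math.sqrt(n) * c       # annihilation part
    return out

def exp_field_on_vacuum(alpha, terms=80):
    """Taylor series of exp(alpha*(a + a^dagger)) applied to |0>."""
    vec = [0.0] * DIM
    vec[0] = 1.0
    acc, cur = vec[:], vec[:]
    for n in range(1, terms):
        cur = [c / n for c in apply_field(alpha, cur)]
        acc = [x + y for x, y in zip(acc, cur)]
    return acc

alpha = 0.5
vacuum_expectation = exp_field_on_vacuum(alpha)[0]
# Wick-ordered form: <0| e^{alpha a^dag} e^{alpha a} |0> = 1, so the
# Gaussian prefactor exp(alpha^2/2) is the whole vacuum expectation.
assert abs(vacuum_expectation - math.exp(alpha**2 / 2)) < 1e-10
```

Equivalently, the even moments ⟨0|(α(a+a†))^(2k)|0⟩ = α^(2k)(2k−1)!! resum to exp(α²/2), the one-mode shadow of Z_τ[J].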



More generally, let X₁, ..., X_m be disjoint finite subsets of M. Then the time-ordered product T(:Φ(X₁): ··· :Φ(X_m):) is again calculated with formula (11.132), but the τ-function may only contain contractions τ₂(x, y), where x and y belong to different sets X_k.

Schwinger Functions

The Wightman functions have an analytic continuation to the euclidean points (ξ₁, ..., ξₙ) = (ix₁⁰, x₁, ..., ixₙ⁰, xₙ), and the n-point Schwinger function with arguments (x₁, ..., xₙ) ∈ Eⁿ = ℝ^(4n) is defined as

    Sₙ(x₁, ..., xₙ) = wₙ(ix₁⁰, x₁, ..., ixₙ⁰, xₙ).    (11.134)

The two-point Schwinger function S₂(x, y) = Δ_m^E(x − y) is calculated in Section 11.1.5. Identities (11.120) imply the following explicit formula for the Schwinger functions, with odd terms vanishing:

    S_{2m}(x₁, ..., x_{2m}) = Σ_{σ∈S<(2m)} S₂(x_{σ(1)}, x_{σ(2)}) ··· S₂(x_{σ(2m−1)}, x_{σ(2m)}).    (11.135)
    T Φ(x₁) ··· Φ(xₙ) = Σ_{σ∈Sₙ} θ(x⁰_{σ(1)} > ··· > x⁰_{σ(n)}) Φ(x_{σ(1)}) ··· Φ(x_{σ(n)}),    (12.15)

where the sum extends over all permutations σ of the index set {1, ..., n}. Mathematically well-defined objects, which correspond to these formal products, might not exist in a Wightman quantum field theory. But it is possible to postulate the existence of such products as operator-valued generalized functions without contradiction to the Wightman axioms. By formal manipulation of expressions of the type (12.15), one obtains a list of properties these objects should have. These properties are then taken as defining assumptions ("axioms"); see, for example, Eckmann and Epstein (1979) or the extensive presentation in chapter 13 of Bogoliubov et al. (1990). We only mention two properties, which we already know from the free field: the T-product

12.1 Interacting Neutral Scalar Fields


of arbitrary Wightman fields should satisfy the symmetry (11.128) and the covariance (11.129) under proper orthochronous Poincaré transformations. The vacuum expectation of the T-product

    τₙ(x₁, ..., xₙ) = ⟨Ω | T Φ(x₁) ··· Φ(xₙ) Ω⟩    (12.16)

is a generalized function in S′(Mⁿ), and it is called the n-point τ-function. It is invariant under exchange of its spacetime arguments, and it is invariant under proper orthochronous Poincaré transformations, τₙ(Λx₁ + y, ..., Λxₙ + y) = τₙ(x₁, ..., xₙ), y ∈ M, Λ ∈ L₊↑. The τ-functions have a particular role in quantum field theory. All scattering matrix elements can be calculated from the τ-functions, and the standard perturbation theory has been developed for these functions; see, for example, Bogoliubov and Shirkov (1959, 1983), Itzykson and Zuber (1980), and Weinberg (1995).

Euclidean Green's Functions (Schwinger Functions)

The spectral condition and Lorentz invariance allow one to derive analyticity domains for the Wightman functions wₙ(x₁, ..., xₙ) that include the euclidean points (ξ₁, ..., ξₙ) ∈ M_ℂⁿ with ξ_j = (ix_j⁴, x_j), where x_j = (x_j¹, x_j², x_j³, x_j⁴) ∈ E = ℝ⁴, j = 1, ..., n, are vectors in the euclidean ℝ⁴. Thereby, we have to exclude the "exceptional points" with x_j = x_k for j ≠ k. The n-point euclidean Green's functions or Schwinger functions are defined on the space

    E≠ⁿ = {(x₁, ..., xₙ) ∈ Eⁿ : x_j ≠ x_k for j ≠ k}    (12.17)

as

    s(x₁, ..., xₙ) = w(ξ₁, ..., ξₙ).    (12.18)

The Schwinger functions are generalized functions in S  (En= ), but it might not be possible to extend them to generalized functions in S  (En ). They are symmetric functions of their arguments, and they are invariant under euclidean transformations s(Rx1 + y, . . . , Rxn + y) = s(x1 , . . . , xn ),

R ∈ O+ (E),

y ∈ E.

Osterwalder and Schrader (1973, 1975) have derived a list of axioms for the Schwinger functions that follow from the axioms for the Wightman functions, and – assuming these axioms – the Wightman theory can be reconstructed. A comprehensive reference is chapter 9 in Bogoliubov et al. (1990); see also Zinoviev (1995). One can therefore obtain a Wightman field theory from Schwinger functions. But the Osterwalder–Schrader axioms are difficult to


verify, and one usually starts from a more restrictive set of basic assumptions. The n-point Schwinger functions are taken as generalized functions in $S'(E^n)$. Eckmann and Epstein (1979) have shown that this assumption is equivalent to the existence of well-defined T-products for the Wightman field theory. All successful constructions start from an additional, very effective assumption: the Schwinger functions are taken as moments of a probability measure. The probabilistic approach has so far been successful for field theories on a two- or a three-dimensional euclidean space. For higher spacetime dimensions, perturbation expansions and (formal) identities can be derived. The calculations are to some extent easier than those of the perturbation theory for τ-functions, since the singularity structure of euclidean Green's functions is simpler.

Connected Green's Functions

Let $G_n(X) = G_n(x_1,\dots,x_n)$, $n \in \mathbb N$, be a family of Green's functions of quantum field theory (Wightman functions, τ-functions, or Schwinger functions). The symbol $X = (x_1,\dots,x_n)$ is a sequence of vectors $x_j$ in $\mathbb M$ or $E$, respectively. Then connected Green's functions are defined by $K_1(x) = G_1(x)$ and the recursion
$$K_n(X) = G_n(X) - \sum_{\substack{X_1 + \cdots + X_p = X \\ p \geq 2}} K(X_1)\cdots K(X_p), \tag{12.19}$$

if $n \geq 2$. Here the summation is over all partitions of the sequence $X = (x_1,\dots,x_n)$, as detailed in Section 4.1.7. These Green's functions were introduced by Haag (1958) for the Wightman functions and are called truncated Wightman functions. This terminology persists in most books of mathematical physics, but standard books of quantum field theory, such as Itzykson and Zuber (1980), use the term connected Green's functions. At any rate, they are exactly the cumulant Green's functions already introduced in Section 4.1.7. In the case of free fields, the only nonvanishing connected Green's function is $K_2(x_1,x_2) = G_2(x_1,x_2)$. Any interaction has to show up in the connected Green's functions! Knowing the connected Green's functions, one obtains the ordinary ones from (4.16), namely $G(X) = \sum_{X_1+\cdots+X_m=X} K(X_1)\cdots K(X_m)$. If the Green's functions are symmetric in their arguments – as in the case of τ-functions or Schwinger functions – the connected Green's functions are exactly the cumulant Green's functions of Section 4.1.4, and the techniques introduced there, in particular the use of generating functionals, apply.
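The recursion (12.19) is easy to check numerically for a single scalar random variable, where the Green's functions reduce to moments and cumulants. The Python sketch below (illustrative helper names, not from the text) sums over set partitions; for a Gaussian distribution – the free-field analogue – all cumulants beyond the second vanish, mirroring the statement that only $K_2$ survives for free fields.

```python
from itertools import combinations

def partitions(seq):
    """Enumerate all set partitions of the list `seq` (blocks as tuples)."""
    if not seq:
        yield []
        return
    first, rest = seq[0], seq[1:]
    for k in range(len(rest) + 1):
        for combo in combinations(rest, k):
            block = (first,) + combo
            remaining = [x for x in rest if x not in combo]
            for rest_partition in partitions(remaining):
                yield [block] + rest_partition

def cumulants_from_moments(G):
    """G[n] is the n-th moment (G[0] = 1, unused).  Returns K with K[n]
    the n-th cumulant, via the recursion (12.19):
    K_n = G_n - sum over partitions into p >= 2 blocks of prod_i K_{|block_i|}."""
    K = [0.0] * len(G)
    for n in range(1, len(G)):
        correction = 0.0
        for p in partitions(list(range(n))):
            if len(p) >= 2:
                term = 1.0
                for block in p:
                    term *= K[len(block)]
                correction += term
        K[n] = G[n] - correction
    return K
```

Feeding in Gaussian moments (mean μ, variance σ²) returns $K_1 = \mu$, $K_2 = \sigma^2$ and $K_n = 0$ for $n \geq 3$.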

12.1.3 Perturbation Expansions and Constructive Approaches

So far, the only example of a quantum field on four-dimensional Minkowski space that satisfies all requirements is the free field. But there is an involved


theory of perturbation expansions for the Green's functions of interacting fields. Here we give only a rough outline, restricted to the formal expansion without regularization. A modern approach to renormalization of a self-interacting scalar field theory in four dimensions can be found in Gallavotti (1985). Constructions beyond perturbation theory have been successful for field theories in spacetime dimensions less than 4; see, for example, the extensive monograph Glimm and Jaffe (1987) and the recent review article Summers (2012).

Perturbation Expansions for τ-Functions

The field equation for a self-interacting scalar field $\Phi(x)$ of mass m is
$$\big(\Box + m^2\big)\,\Phi(x) = j(x). \tag{12.20}$$
Thereby, the current density $j(x)$ is determined by the interaction Lagrangian $L_{int}(x)$. For a polynomial interaction $L_{int}(x) = -\lambda\,{:}P(\Phi(x)){:}$ with a real polynomial $P(s)$ of the variable $s \in \mathbb R$, the source term is $j(x) = -\lambda\,{:}P'(\Phi(x)){:}$ with $P'(s) = \frac{d}{ds}P(s)$. The powers of the field operator $\Phi(x)$ are understood as Wick powers ${:}\Phi^n(x){:}$. In standard quantum field theory, one assumes that the field $\Phi(x)$ approaches free incoming/outgoing fields $\Phi^{in/out}(x)$ for $t \to \mp\infty$ (LSZ asymptotic condition; Lehmann et al., 1955, 1957). This assumption can actually be derived from the Wightman axioms and the assumption that retarded products of field operators exist (Hepp, 1965; Steinmann, 1968). With the incoming asymptotic condition, the differential equation (12.20) can be written as the integral equation (Yang and Feldman, 1950)
$$\Phi(x) = \Phi^{in}(x) + \int \Delta^{ret}_m(x-y)\,j(y)\,d^4y, \tag{12.21}$$
with the retarded propagator (11.103). In this case, a closed formula for the τ-functions of the interacting field can be derived on a formal level: the Gell-Mann and Low series (see Gell-Mann and Low, 1951, and, e.g., sec. 6-1-1 of Itzykson and Zuber, 1980),
$$\tau(x_1,\dots,x_n) = N^{-1} \sum_{p=0}^{\infty} \frac{i^p}{p!} \int d^4y_1 \cdots d^4y_p\; \langle \Omega \mid T\,\Phi^{in}(x_1)\cdots\Phi^{in}(x_n)\, L_{int}(y_1)\cdots L_{int}(y_p)\,\Omega\rangle, \tag{12.22}$$
where N is the normalization
$$N = \sum_{p=0}^{\infty} \frac{i^p}{p!} \int d^4y_1 \cdots d^4y_p\; \langle \Omega \mid T\, L_{int}(y_1)\cdots L_{int}(y_p)\,\Omega\rangle.$$


In these formulas, the Lagrangian density $L_{int}(y)$ is understood as a functional of the incoming free field $\Phi^{in}$. The series in (12.22) can be formally summed to the exponential $\exp\big(i\int d^4y\, L_{int}(y)\big)$. The vacuum expectations in (12.22) are polynomials of integrals over products of two-point τ-functions of the field $\Phi^{in}$. The euclidean version of this perturbation expansion and its formulation with Feynman diagrams is given in Section 6.2.3. The Gell-Mann and Low series (12.22) corresponds to equation (6.22) in that section. Studying the combinatorics of all terms in (12.22), one sees that the denominator N cancels those factors in the numerator which do not include an external $x_j$ argument. The ratio (12.22) can therefore be written as the linked cluster expansion:
$$\tau(x_1,\dots,x_n) = \sum_{p=0}^{\infty} \frac{i^p}{p!} \int d^4y_1 \cdots d^4y_p\; \langle \Omega \mid T\,\Phi^{in}(x_1)\cdots\Phi^{in}(x_n)\, L_{int}(y_1)\cdots L_{int}(y_p)\,\Omega\rangle_L. \tag{12.23}$$
The subscript L means that in the evaluation of $\langle \Omega \mid \dots \Omega\rangle$, only those products of two-point functions are admitted which are "linked" with one or more external $x_j$-arguments. The linked cluster expansion is derived in Section 6.2.3 for the euclidean (probabilistic) perturbation theory; see equation (6.25). The Gell-Mann and Low series can be used to define a generating functional for the n-point functions. With a test function $J(x)$ in $S(\mathbb M)$, which can be interpreted as an external classical current, we define the functional
$$Z(J) = \frac{\langle \Omega \mid T \exp\big( i \int d^4y\, [L_{int}(y) + J(y)\Phi^{in}(y)] \big)\,\Omega\rangle}{\langle \Omega \mid T \exp\big( i \int d^4y\, L_{int}(y)\big)\,\Omega\rangle}. \tag{12.24}$$
Thereby the exponential in (12.24) is understood as a formal power series. Then (12.22) is equivalent to
$$\tau(x_1,\dots,x_n) = \frac{\delta^n Z(J)}{\delta J(x_1)\cdots\delta J(x_n)}\bigg|_{J=0}. \tag{12.25}$$

The τ-functions (12.23) are generalized functions, and they have to be smeared with test functions in the external $x_j$ variables. But even after these integrations with smooth and rapidly decreasing functions, severe problems with the $d^4y$ integrals remain. Most of these integrals do not exist, for two reasons: (i) products of two-point functions with arguments at the same point $y_k$ occur, and these local singularities are not integrable (ultraviolet divergence); (ii) the integration over Minkowski space leads to divergences at large distances (infrared divergence).
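The structure of the ratio (12.24) and the derivatives (12.25) can be caricatured in a zero-dimensional toy model, where the functional integral collapses to an ordinary integral over one variable φ and T-ordering is trivial. The sketch below (an illustration under that drastic simplification, not the text's construction) checks the first-order result $\tau_2 \approx 1 - 12\lambda$ for $P(\phi) = \phi^4$, which follows from the Gaussian (Wick) moments $\langle\phi^4\rangle = 3$, $\langle\phi^6\rangle = 15$.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def tau(n, lam):
    """n-point 'tau-function' of the 0-dimensional phi^4 toy model:
    <phi^n exp(-lam phi^4)> / <exp(-lam phi^4)> w.r.t. a unit Gaussian.
    The denominator plays the role of the normalization N in (12.22)."""
    w = lambda p: math.exp(-0.5 * p * p - lam * p ** 4)
    num = simpson(lambda p: p ** n * w(p), -10.0, 10.0)
    den = simpson(w, -10.0, 10.0)
    return num / den
```

At first order in λ, $\tau_2 = \langle\phi^2\rangle - \lambda\big(\langle\phi^6\rangle - \langle\phi^2\rangle\langle\phi^4\rangle\big) + O(\lambda^2) = 1 - 12\lambda + O(\lambda^2)$; the cancellation of disconnected pieces by the denominator is the 0-dimensional shadow of the linked cluster expansion.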


The main work of perturbation theory is therefore the renormalization of these singularities.

Euclidean Approaches

We give only a few details for a euclidean quantum field theory in $d \geq 2$ dimensions. The euclidean field is denoted by $\phi(x)$, $x \in E = \mathbb R^d$. The free euclidean field is a Gaussian field with covariance
$$E_m(x-y) = (2\pi)^{-d} \int_{E^d} \big(|k|^2 + m^2\big)^{-1} \exp\big(-i k(x-y)\big)\, d^dk.$$
Let $d\mu_{free}(\phi)$ be the corresponding Gaussian measure on $S'(E)$. Assume a polynomial interaction $\lambda P(\phi)$, where $P(t) = \sum_{j=0}^{2n} a_j t^j$, $a_{2n} > 0$, is a polynomial bounded from below, and $\lambda$ is a coupling constant. To define an interaction Lagrangian, the polynomial $P(\phi)$ is substituted by ${:}P(\phi){:}$, where the double dots indicate normal ordering with respect to the Gaussian measure $d\mu_{free}$. Moreover, the integration is restricted to a finite volume of E by introducing a cutoff function $g(x)$ that has compact support and satisfies $0 \leq g \leq 1$. Then a formal prescription to calculate the n-point Schwinger functions is
$$s(x_1,\dots,x_n) = \int \phi(x_1)\cdots\phi(x_n)\, d\mu_\lambda(\phi) \tag{12.26}$$
with the measure
$$d\mu_\lambda(\phi) = \lim_{g\to 1}\; N(\lambda;g)^{-1} \exp\Big( -\lambda \int_E {:}P(\phi(x)){:}\, g(x)\, d^dx \Big)\, d\mu_{free}(\phi), \tag{12.27}$$
where $N(\lambda;g) = \int \exp\big( -\lambda \int_E {:}P(\phi(x)){:}\, g(x)\, d^dx \big)\, d\mu_{free}(\phi)$ gives the probability normalization. But for $d \geq 4$, it is not yet clear whether such a probability measure exists. Nevertheless, these relations can be used to derive perturbation expansions and identities within perturbation theory. The perturbation expansion derived from (12.27) yields Schwinger functions which are the analytic continuations of the τ-functions obtained from (12.23). An important advantage of the euclidean approach is that the methods introduced in Chapter 6 can be used. In the case of $d = 2$ dimensions, the identities (12.26) and (12.27) have a well-defined mathematical meaning. For sufficiently small coupling $\lambda \geq 0$, the limit in (12.27) is a probability measure on $S'(E)$, and the integration (12.26) leads to Schwinger functions which satisfy the OS axioms. A complete presentation of this approach with the reconstruction of an interacting Wightman


quantum field theory in two spacetime dimensions is given in Glimm and Jaffe (1987). In the case of $d = 3$ dimensions, Feldman and Osterwalder (1976) succeeded in constructing the probability measure $d\mu_\lambda(\phi)$ for a $\phi^4$ interaction. In this case too, the moments (12.26) satisfy the OS axioms.
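On a finite periodic lattice, the free covariance $E_m$ has an exact analogue in which the momentum integral becomes a finite Fourier sum. The sketch below (an illustration, not part of the text) builds the one-dimensional lattice covariance and verifies that it is the Green's function of the lattice operator $-\Delta + m^2$, the discrete counterpart of $(|k|^2 + m^2)^{-1}$.

```python
import cmath

def lattice_covariance(N, m2):
    """Free covariance E(d) on a periodic 1-d lattice with N sites: the
    finite analogue of (2 pi)^{-d} int (|k|^2 + m^2)^{-1} e^{-ik(x-y)} d^d k,
    with |k|^2 replaced by the lattice symbol 2 - 2 cos(2 pi n / N)."""
    E = []
    for d in range(N):
        total = 0.0
        for n in range(N):
            k = 2.0 * cmath.pi * n / N
            total += (cmath.exp(-1j * k * d) / (m2 + 2.0 - 2.0 * cmath.cos(k))).real
        E.append(total / N)
    return E

def greens_residuals(E, m2):
    """Residuals of (-Laplacian + m^2) E = delta_0 on the lattice."""
    N = len(E)
    out = []
    for x in range(N):
        lap = E[(x + 1) % N] - 2.0 * E[x] + E[(x - 1) % N]
        out.append(-lap + m2 * E[x] - (1.0 if x == 0 else 0.0))
    return out
```

The resulting covariance is symmetric, positive, and decaying, as required of the two-point function of a massive free euclidean field.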

12.2 Interaction with a Classical Current

The simplest interaction of a scalar particle is the interaction with a classical current. Though this model is quite simple, it allows an interesting study of causality within local quantum field theory. The field equation is, as in the classical equation (11.136),
$$\big(\Box + m^2\big)\,\Phi(x) = j(x)\, I \tag{12.28}$$
with a real function $j(x)$. If $j(x) = j(\mathbf x)$ does not depend on time, the model is called the van Hove model, first investigated by van Hove (1952). Further references are Cook (1961), Schweber (1961: sect. 12a), and Emch (1972: sect. 1.1.e). For an investigation of the equation (12.28) with a time-dependent current $j(x) = j(t,\mathbf x)$, we take functions that are square integrable,
$$\int_{\mathbb M} d^4x\, |j(x)|^2 < \infty, \tag{12.29}$$
and that have compact support $G \subset \mathbb M$,
$$j(x) = 0 \quad \text{if } x \notin G \subset [0,T] \times B_R, \tag{12.30}$$
with a finite time $T > 0$ and a ball $B_R \subset \mathbb R^3$ of finite radius R.¹ The restrictions (12.29) and (12.30) allow the interpretation of $j(t,\mathbf x)$ as a time-dependent one-particle vector. More precisely, the partial Fourier transform
$$\tilde j(t) = \tilde j(t,\mathbf k) = (2\pi)^{-\frac32} \int j(t,\mathbf x)\, e^{-i\mathbf k\mathbf x}\, d^3x = \tilde j(t,-\mathbf k)^* \tag{12.31}$$
is for almost all t a vector in $H_m^+$, and the norm $\|\tilde j(t)\|$ is an integrable function of time:
$$\int_{\mathbb R} \|\tilde j(t)\|\, dt \leq \sqrt{\frac{T}{2m}}\, \Big( \int_{\mathbb M} d^4x\, |j(x)|^2 \Big)^{\frac12} < \infty. \tag{12.32}$$
¹ The calculations of this section can be performed with currents that have a finite norm (11.34).

But the restriction to currents with a compact support is convenient for the discussion of causality.


The external current breaks the translation invariance. Haag’s Theorem is therefore not applicable, and the interaction picture exists.

12.2.1 Solution of the Field Equation

Classical Fields

In the case of classical field equations, the inhomogeneous Klein–Gordon (KG) equation $\big(\Box + m^2\big)\,\phi(x) = j(x)$ has the solution $\phi(x) = \phi^{in}(x) + \vartheta(x)$ with
$$\vartheta(x) := \int_{\mathbb M} \Delta^{ret}_m(x-y)\, j(y)\, d^4y. \tag{12.33}$$
Thereby $\phi^{in}(x)$ is a solution of the homogeneous KG equation, and the field $\vartheta(x)$ is generated by the current. The convolution kernel $\Delta^{ret}_m(x)$ is the retarded Green's function (11.101) of the Klein–Gordon equation. The support of $\Delta^{ret}_m(x)$ is restricted to the forward cone. The field $\vartheta(x) = \int \Delta^{ret}_m(x-y)\,j(y)\,d^4y$ therefore propagates into the causal shadow of the support of the current. If j has the compact support (12.30), the field $\vartheta(x)$ coincides for $x^0 > T$ with
$$\vartheta_\infty(x) := \int_{\mathbb M} \Delta_m(x-y)\, j(y)\, d^4y, \tag{12.34}$$
which is a solution of the homogeneous Klein–Gordon equation.

Quantum Fields

The quantum field $\Phi(x)$ is a solution of the operator differential equation
$$\big(\Box + m^2\big)\,\Phi(x) = j(x)\, I. \tag{12.35}$$
For $t \to -\infty$, the field $\Phi(x)$ has to converge to the incoming free field $\Phi^{in}(x)$ according to the LSZ condition. The solution of (12.35) which satisfies this asymptotic condition is
$$\Phi(x) = \Phi^{in}(x) + \int_{\mathbb M} d^4y\, \Delta^{ret}_m(x-y)\, j(y)\, I = \Phi^{in}(x) + \vartheta(x)\, I. \tag{12.36}$$
The field $\Phi(x)$ obviously satisfies the commutation relation (11.63), which vanishes for spacelike separation of the arguments; hence (12.36) is a local field operator. Moreover, the field $\Phi(x)$ and the momentum field
$$\Pi(x) = \partial_0\Phi(x) = \Pi^{in}(x) + \partial_0\vartheta(x)\, I \tag{12.37}$$


are a canonical pair, which satisfies the canonical equal-time commutation relations (11.141) and (11.142).

The operators (12.36) and (12.37) are field operators in the Heisenberg picture. Thereby all operators and vectors are defined in the Fock space of the free field $\Phi^{in}$. The one-particle Hilbert space of this field is $H_m^{+,in} \cong L^2\big(\mathbb R^3, \frac{d^3k}{2\omega(k)}\big)$, and the Hilbert space of the full theory is the Fock space $\Gamma^+ = \Gamma^+(H^{+,in})$. To obtain the Schrödinger picture, we have to calculate the unitary evolution operators $U(t_2,t_1)$, $t_{1,2} \in \mathbb R$, which have the properties
$$\begin{aligned} U(t_3,t_2)\,U(t_2,t_1) &= U(t_3,t_1), \qquad U(t_1,t_1) = I, \\ U^*(t_2,t_1) &= U^{-1}(t_2,t_1) = U(t_1,t_2), \\ U^{-1}(t_2,t_1)\,\Phi(t_1,\mathbf x)\,U(t_2,t_1) &= \Phi(t_2,\mathbf x), \\ U^{-1}(t_2,t_1)\,\Pi(t_1,\mathbf x)\,U(t_2,t_1) &= \Pi(t_2,\mathbf x), \end{aligned} \tag{12.38}$$
for arbitrary values of the time parameters $t_j \in \mathbb R$, $j = 1, 2, 3$. For this purpose, we use the smeared field operator $\tilde\Phi_t(f) = \int f(-\mathbf k)\,\hat\Phi(\mathbf k,t)\, d^3k$ from (11.153) with smooth test functions $f \in S(\mathbb R^3) \cap H^+_{m,R}$. The field equation (12.35) is then given by
$$\frac{\partial^2}{\partial t^2}\,\tilde\Phi_t(f) = -\tilde\Phi_t\big(\hat M^2 f\big) + 2\,\langle \tilde j(t) \mid \hat M f\rangle.$$
Thereby, the operator $\hat M f(\mathbf k) = \sqrt{m^2 + \mathbf k^2}\, f(\mathbf k)$ is the Hamilton operator of a free particle. The inhomogeneous term is
$$2\,\langle \tilde j(t) \mid \hat M f\rangle = \int \tilde j(t,\mathbf k)\, f(\mathbf k)\, d^3k = \int j(t,\mathbf x)\, \chi(\mathbf x)\, d^3x$$
with the test function $\chi = F_3^{-1} f$. The retarded Green's function (11.101) is the integral kernel of the bounded operator
$$G(t) = \begin{cases} \hat M^{-1}\sin\big(\hat M t\big) & \text{if } t > 0, \\ 0 & \text{if } t < 0. \end{cases}$$
The derivative $\frac{d}{dt}G(t)$ is a bounded operator, and $\mathbb R \ni t \mapsto \frac{d}{dt}G(t)f \in H_m^{+,in}$ is a norm-continuous vector function for all $f \in H_m^{+,in}$. Following (12.36), the solution of the field equation is
$$\tilde\Phi_t(f) = \tilde\Phi^{in}_t(f) + 2\int_{-\infty}^{t} \big\langle G(t-t')\,\tilde j(t') \,\big|\, \hat M f\big\rangle\, dt' = \tilde\Phi^{in}_t(f) + 2\int_{-\infty}^{t} \big\langle \sin\big(\hat M(t-t')\big)\,\tilde j(t') \,\big|\, f\big\rangle\, dt'. \tag{12.39}$$
The estimate (12.32) implies that the integral $\int_{-\infty}^{t} \sin\big(\hat M(t-s)\big)\,\tilde j(s)\, ds$ is a well-defined vector in $H_m^{+,in}$, which is a differentiable function of t. The canonical momentum field follows by differentiation as

$$\tilde\Pi_t(f) = \tilde\Pi^{in}_t(f) + 2\int_{-\infty}^{t} \big\langle \cos\big(\hat M(t-t')\big)\,\tilde j(t') \,\big|\, \hat M f\big\rangle\, dt'. \tag{12.40}$$
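Mode by mode, (12.39) says that each momentum mode of the field is a driven harmonic oscillator solved with the retarded kernel $\omega^{-1}\sin(\omega t)$, the scalar version of $G(t) = \hat M^{-1}\sin(\hat M t)$. A numerical sketch (illustrative only; function names are not from the text):

```python
import math

def retarded_solution(t, j, omega, n=2000):
    """x(t) = int_0^t sin(omega (t - s)) / omega * j(s) ds for a source j
    supported in s >= 0: the single-mode analogue of applying
    G(t) = M^{-1} sin(M t) (zero for t < 0) to the current."""
    if t <= 0.0:
        return 0.0          # causality: nothing before the source acts
    h = t / n
    vals = [math.sin(omega * (t - i * h)) / omega * j(i * h) for i in range(n + 1)]
    total = vals[0] + vals[-1]          # composite Simpson rule (n even)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * vals[i]
    return total * h / 3.0
```

For ω = 1 and a constant source switched on at $s = 0$, the retarded solution is $x(t) = 1 - \cos t$, and for $j(s) = s$ it is $x(t) = t - \sin t$; both vanish identically before the source acts.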

Comparing (12.39) and (12.40) with (11.163), we observe that the operator $\tilde\Phi_t(f)$ can be obtained from $\tilde\Phi^{in}_t(f)$ by a Weyl displacement. The vector
$$g(t) := \int_{-\infty}^{t} \exp\big(i\hat M t'\big)\,\tilde j(t')\, dt' \in H_m^{+,in} \tag{12.41}$$
is a continuous function of $t \in \mathbb R$, and the following identities are valid:
$$\begin{aligned} e^{-i\hat M t}\,g - e^{i\hat M t}\,\bar g &= -2i \int_{-\infty}^{t} dt'\, \sin\big(\hat M(t-t')\big)\,\tilde j(t'), \\ e^{-i\hat M t}\,g + e^{i\hat M t}\,\bar g &= 2 \int_{-\infty}^{t} dt'\, \cos\big(\hat M(t-t')\big)\,\tilde j(t'). \end{aligned}$$
Hence we have derived the relation
$$W_t^{-1}\,\tilde\Phi_t(f) = \tilde\Phi^{in}_t(f)\,W_t^{-1} \tag{12.42}$$
with the Weyl operators
$$W_t = W\big(g(t)\big), \qquad t \in \mathbb R. \tag{12.43}$$
The incoming field has the time evolution $\tilde\Phi^{in}_t(f) = U_{in}(t)^{-1}\,\tilde\Phi^{in}_0(f)\,U_{in}(t)$ with $U_{in}(t) = \Gamma\big(e^{-i\hat M t}\big)$. The evolution operators (12.38) are therefore given by the unitary operators
$$U(t_2,t_1) = W_{t_1}\, U_{in}(t_2 - t_1)\, W_{t_2}^{-1}. \tag{12.44}$$

12.2.2 S-Matrix

The assumptions (12.29) and (12.30) imply the identities
$$\Phi(x) = \begin{cases} \Phi^{in}(x) & \text{if } x^0 \leq 0, \\ \Phi^{out}(x) & \text{if } x^0 \geq T, \end{cases} \tag{12.45}$$
with the incoming free field $\Phi^{in}(x)$ and the outgoing free field
$$\Phi^{out}(x) = \Phi^{in}(x) + \vartheta_\infty(x)\, I. \tag{12.46}$$
The vector (12.41) has the properties $g(t) = 0$ if $t \leq 0$, and $g(t) = g_\infty$ if $t \geq T$, with
$$g_\infty = \int_{\mathbb R} \exp\big(i\hat M t'\big)\,\tilde j(t')\, dt'. \tag{12.47}$$


The vector (12.47) coincides with the one-particle vector derived from $j(x)$ by the mapping J of (11.31). The Weyl operator $W(g_\infty)$ is therefore the operator
$$W(g_\infty) = \exp\big( i\,\Phi^{in}(j) \big). \tag{12.48}$$
The transition $\Phi^{in} \to \Phi^{out}$ can be achieved by the transform $\Phi^{out}(x) = \exp\big(i\,\Phi^{in}(j)\big)\, \Phi^{in}(x)\, \exp\big(-i\,\Phi^{in}(j)\big)$ of the in-field. In scattering theory, the unitary operator which connects the in-field with the out-field is called the S-matrix. In this notation, we write
$$\Phi^{out}(x) = S^{-1}\, \Phi^{in}(x)\, S \qquad \text{with} \qquad S = \exp\big( -i\,\Phi^{in}(j) \big). \tag{12.49}$$

This result also follows easily from the displacement relation (11.68) of the local field and the identities (12.32) and (12.46).

12.2.3 Causality in the Schrödinger Picture

We now discuss the Schrödinger picture under the restrictions (12.29) and (12.30) on the source. We assume that the initial state at time $t \leq 0$ is the vacuum state $\Omega \in \Gamma^+$. The time evolution is given by the unitary operators (12.44), and the observables are built up from the field $\Phi(t=0,\mathbf x)$ and its derivatives at time $t = 0$. This state remains unchanged until the current is switched on, $U(t,0)\,\Omega = U_{in}(t)\,\Omega = \Omega$ if $t \leq 0$. For $t \geq 0$, the classical current generates the coherent state
$$\Psi(t) = U(t,0)\,\Omega = U_{in}(t)\,W\big(-g(t)\big)\,\Omega = W\big(-e^{-i\hat M t} g(t)\big)\,\Omega.$$
If $t \geq T$, this vector becomes $\Psi(t) = U_{in}(t)\,S\,\Omega$ with the S-matrix (12.49). The coherent state $S\Omega = \exp\big(-i\,\Phi^{in}(j)\big)\Omega$ is the normalized exponential state (9.67), $N(h) = \exp\big(h - \frac12\|h\|^2\big)\,\Omega$, of the one-particle vector $h = -i\,J(j) = -i g_\infty \in H_m^{+,in}$. This state propagates with the free evolution
$$\Psi(t) = U_{in}(t)\, N(-i g_\infty) = N\big(-i e^{-i\hat M t} g_\infty\big), \qquad t \geq T. \tag{12.50}$$
The particle content of a coherent state is given in Section 9.2.1. Here we discuss the causality of the propagation. In the Heisenberg picture, the contribution of the current to the field (12.36) obviously propagates into the causal shadow of the current. But for the coherent state (12.50), a causal propagation is not obvious, since the one-particle vector $e^{-i\hat M t} g_\infty \in H_m^{+,in}$ has no reasonable localization in Minkowski spacetime, as discussed in Section 11.1.3. To resolve this apparent contradiction, we have to specify the observables related to a measurement within a finite domain $O \subset \mathbb R^3$. In quantum field theory, such local observables are integrals of the type $\int_O d^3x\, \sigma(\mathbf x)\, A(\mathbf x)$,


where $\sigma(\mathbf x) \geq 0$ is a weight function (efficiency) and $A(\mathbf x)$ is a Wick polynomial constructed from the local field $\Phi(t=0,\mathbf x)$ and its derivatives at $t = 0$. Take, for example, the field strength density with operator $\Phi(0,\mathbf x) = \Phi^{in}(0,\mathbf x)$. The expectation of the field strength density in the state $\Psi(t)$ is
$$\langle \Psi(t) \mid \Phi(0,\mathbf x)\,\Psi(t)\rangle = \langle \Omega \mid U^*(t,0)\,\Phi^{in}(0,\mathbf x)\,U(t,0)\,\Omega\rangle \overset{(12.36)}{=} \langle \Omega \mid \Phi(t,\mathbf x)\,\Omega\rangle = \vartheta(t,\mathbf x). \tag{12.51}$$
The expectation of more general local observables – such as the energy density – can be calculated by the same method, and yields again the result obtained from the classical field $\vartheta(x)$. Hence effects of the created particles can only be observed within the causal shadow of the current. The noncausal parts of the one-particle wave function are not observable in the local quantum field theory.

13 Quantum Stochastic Calculus

In this chapter, we describe the quantum stochastic calculus developed by Hudson and Parthasarathy (1984). This is a branch of mathematics involving integration with respect to Fock space processes of creation, annihilation, and number, which generalizes the usual Itô theory. The motivations came from the desire to have a noncommutative theory of probability, and unitary dilations of quantum dynamical semigroups. However, the theory effectively starts with the observations that Gaussian and Poissonian processes can naturally be constructed in a suitable Fock space, and that the Itô correction ought to be a manifestation of Wick ordering. As the title of the early paper Hudson and Streater (1981) suggests, the Itô calculus should be just the ordinary calculus with Wick ordering. (See the account by Hudson, 2012, on the development of the theory.) We will initiate this with a description of the Maassen kernel calculus, Maassen (1987), and its extension by Meyer (1993).

13.1 Operators on Guichardet Fock Space

We begin by displaying the creation, annihilation, and number operators in a more explicit form when the Fock space is a Guichardet Fock space of the form $L^2(\mathrm{Power}(X), dX)$, which we have previously encountered. For ease of notation, we shall use the shorthand $\mathcal P := \mathrm{Power}(X)$ for this chapter.

Lemma 13.1.1 The creation and annihilation operators on Guichardet Fock space may be defined by their actions on suitable $\Psi \in L^2(\mathcal P, dX)$:
$$\big(A(g)\,\Psi\big)[X] \equiv \int dx\; g^*(x)\, \Psi[X + x], \qquad \big(A^*(g)\,\Psi\big)[X] \equiv \sum_{x \in X} g(x)\, \Psi[X - x]. \tag{13.1}$$

The number operator is $(N\Psi)[X] \equiv \#X\; \Psi[X]$.

Proof Since $\exp(f)[X + x] = f(x)\,\exp(f)[X]$, we immediately obtain that exponential vectors are the eigenvectors for annihilation:
$$A(g)\,\exp(f) = \Big( \int g^*(x)\, f(x)\, dx \Big)\, \exp(f).$$
We see that $A^*(g)$ and $A(g)$ are indeed adjoint, since by (13.4),
$$\langle \Phi \mid A(g)\Psi\rangle = \int dX \int dx\; \Phi^*[X]\, g^*(x)\, \Psi[X + x] = \int dZ \sum_{z \in Z} \Phi^*[Z - z]\, g^*(z)\, \Psi[Z] = \langle A^*(g)\Phi \mid \Psi\rangle.$$
Likewise, the commutation relations are apparent from looking at
$$\big(A(g)\,A^*(h)\,\Psi\big)[X] = \int dx\; g^*(x) \sum_{y \in X + x} h(y)\, \Psi[X + x - y].$$
The $y = x$ term in the sum yields $\int dx\, g^*(x)\, h(x)\, \Psi[X]$, while the remainder is readily seen to give $\big(A^*(h)\,A(g)\,\Psi\big)[X]$. Therefore, we obtain the CCR
$$\big[ A(g), A^*(h) \big] = \int g^*(x)\, h(x)\, dx. \tag{13.2}$$
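A discrete toy version of the Guichardet picture – the base space replaced by finitely many cells of width EPS, integrals by weighted sums over cells, Fock vectors by functions on subsets – reproduces the adjointness of $A(g)$ and $A^*(g)$ exactly, and the CCR (13.2) exactly on the vacuum. All names below are illustrative, not from the text.

```python
from itertools import chain, combinations

SITES = list(range(6))   # discretized base space: 6 cells
EPS = 0.5                # cell width: dx -> EPS, weighted sums emulate integrals

BASIS = [frozenset(c) for c in chain.from_iterable(
    combinations(SITES, r) for r in range(len(SITES) + 1))]

def annihilate(g, psi):
    # (A(g) psi)[X] = sum_{x not in X} conj(g(x)) psi[X + x] * EPS
    return {X: sum(g[x].conjugate() * psi[X | {x}] * EPS
                   for x in SITES if x not in X) for X in BASIS}

def create(g, psi):
    # (A*(g) psi)[X] = sum_{x in X} g(x) psi[X - x]
    return {X: sum(g[x] * psi[X - {x}] for x in X) for X in BASIS}

def dot(phi, psi):
    # <phi | psi> = sum_X conj(phi[X]) psi[X] * EPS^{|X|}
    return sum(phi[X].conjugate() * psi[X] * EPS ** len(X) for X in BASIS)
```

With the measure weight $\mathrm{EPS}^{|X|}$ in the inner product, the change-of-variables step $X + x \to Z$ of the proof above becomes a finite reindexing, so adjointness holds to rounding error; on the vacuum, $[A(g), A^*(h)]\,\Omega = \langle g \mid h\rangle\,\Omega$ holds with $\langle g \mid h\rangle = \sum_x \bar g(x) h(x)\,\mathrm{EPS}$.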

13.1.1 Operator Densities

We may tentatively introduce an annihilator density $a_x$ so that formally
$$\big(a_x \Psi\big)[X] = \Psi[X + x],$$
and so that the annihilator may be written as
$$A(g) = \int_X g^*(x)\, a_x\, dx.$$
Likewise, we should define the creator density as
$$\big(a^*_x \Psi\big)[X] = \sum_{y \in X} \delta(x - y)\, \Psi[X - y],$$
so that
$$A^*(g) = \int_X g(x)\, a^*_x\, dx.$$
The annihilator density is, at least, defined almost everywhere; however, the creator density is clearly singular. We also have formally that the number operator is $N = \int_X a^*_x a_x\, dx$. More generally, for a finite set X, we set
$$a_X = \prod_{x \in X} a_x \qquad \text{and} \qquad a^*_X = \prod_{x \in X} a^*_x.$$
We obtain the elegant rule
$$\big(a_X \Psi\big)[Y] = \Psi[X + Y].$$
However, as the reader will quickly surmise, the creation analogue is a more ponderous expression involving multiple delta functions. Technically speaking, these are not operators on the Fock space; however, they are convenient objects to consider, and their formal manipulation reproduces the correct formulas for bona fide operators defined in the next subsection, which would otherwise be pure drudgery. The densities formally obey the singular canonical commutation relations
$$\big[ a_x, a^*_y \big] = \delta(x - y).$$

Wick Ordering

It is convenient to give a rule for putting expressions like $a_X a^*_Y$ to Wick order. We first of all introduce the notion of a δ-function between sets: if $X = \{x_1,\dots,x_p\}$ and $Y = \{y_1,\dots,y_q\}$, then we set
$$\delta(X,Y) = \delta_{p,q} \sum_{\sigma \in S_p} \delta\big(x_1 - y_{\sigma(1)}\big)\cdots\delta\big(x_p - y_{\sigma(p)}\big).$$

Lemma 13.1.2 (Wick Ordering)
$$a_X\, a^*_Y = \sum_{\substack{X_1 + X_2 = X \\ Y_1 + Y_2 = Y}} \delta(X_2, Y_2)\; a^*_{Y_1}\, a_{X_1}.$$
Proof This is the result of repeated use of the commutation relations. Let us write $a_X a^*_Y = a_{x_p}\cdots a_{x_1}\, a^*_{y_1}\cdots a^*_{y_q}$, where $X = \{x_1,\dots,x_p\}$ and $Y = \{y_1,\dots,y_q\}$, and proceed to move all the creator symbols to the left, starting with $a^*_{y_1}$, then $a^*_{y_2}$, and so on, using the relations $a_x a^*_y = a^*_y a_x + \delta(x - y)$. The final result will be

$$a_X\, a^*_Y = \sum_{n=0}^{\min\{p,q\}} \sum_{\substack{1 \leq j(1) < \cdots < j(n) \leq p \\ \text{distinct } l(1),\dots,l(n)}} \delta\big(x_{j(1)} - y_{l(1)}\big)\cdots\delta\big(x_{j(n)} - y_{l(n)}\big)\; a^*_{Y_1}\, a_{X_1},$$
where $X_1$ and $Y_1$ consist of the unpaired elements; this is precisely the statement of the lemma. □

… if $t(x) > t(y)$, the integral will be
$$\int_{t(x) > t(y)} dx\, dy\; [a^*_x]^\alpha K_x\, [a^*_y]^\mu L_y\, [a_y]^\nu\, [a_x]^\beta = \int_X dx\; [a^*_x]^\alpha K_x\, \Lambda_{t(x)}(L)\, [a_x]^\beta,$$

since $a^*_x$ will now commute with the earlier $L_y$. Likewise, if $t(x) < t(y)$, then $a^*_y$ will commute with the earlier $K_x$, and so … For $t > 0$, we have the decomposition $H = H_{[0,t]} \otimes H_{[t,\infty)}$ with $H_{[0,t]} = h_0 \otimes F_{[0,t]}$ and $H_{[t,\infty)} = F_{[t,\infty)}$.
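For a single mode, every δ in Lemma 13.1.2 contributes a factor 1, and the case $p = q = 2$ reads $a^2 a^{*2} = a^{*2}a^2 + 4\,a^*a + 2$: one identity term, four single contractions, and $2 = |S_2|$ double contractions. This can be checked with truncated harmonic-oscillator matrices (illustrative sketch; truncation is exact away from the top of the basis):

```python
import math

D = 12  # truncation dimension for one bosonic mode

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(D)) for j in range(D)]
            for i in range(D)]

# annihilator a|n> = sqrt(n) |n-1>; the creator is its transpose
a = [[math.sqrt(j) if i == j - 1 else 0.0 for j in range(D)] for i in range(D)]
adag = [[a[j][i] for j in range(D)] for i in range(D)]

lhs = matmul(matmul(a, a), matmul(adag, adag))       # a a a* a*
normal = matmul(matmul(adag, adag), matmul(a, a))    # a*^2 a^2 (Wick ordered)
number = matmul(adag, a)                             # a* a
```

On the number state $|k\rangle$ both sides act as the scalar $(k+1)(k+2) = k(k-1) + 4k + 2$.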




Let $D_0$ be a fixed linear manifold in F. A family $X = (X_t)_{t \geq 0}$ of operators on H, with domain contained in $D_0 \otimes \exp(A)$, is said to be a quantum stochastic process based on $(D_0, A)$. If $X_t$ has trivial action on the factor $H_{[t,\infty)}$ of the continuous tensor decomposition at time t, for each $t > 0$, then we say that the process is adapted. If there exists an increasing sequence $\{t_n\}_{n=0}^\infty$ with $t_0 = 0$ and $t_n \uparrow \infty$ such that $X_t$ takes a constant (operator) value $y_n$ on each interval $t_n \leq t < t_{n+1}$, then the process is simple; in addition, it will be adapted if the $y_n$ act trivially on the future decomposition factor $H_{[t_n,\infty)}$, for each n. If the map $t \mapsto X_t\, u \otimes \exp(f)$ is strongly continuous for all $u \in D_0$, $f \in A$ and $t > 0$, the process is said to be continuous, while if the map is strongly measurable with $\int_{T_1}^{T_2} \| X_s\, u \otimes \exp(f) \|^2\, ds < \infty$ for all $u \in D_0$, $f \in A$, then X is said to be locally square-integrable over $[T_1, T_2]$.

We now show that the Wick integral $X_t = \int_0^t [a^*_s]^\alpha\, X_{\alpha\beta}(s)\, [a_s]^\beta$, written equivalently as $X_t = \int_0^t X_{\alpha\beta}(s)\, dA^{\alpha\beta}_s$, where the $X_{\alpha\beta}(\cdot)$ are four locally square-integrable, adapted processes based on $(D_0, A)$, can be obtained as a limit of finite sum approximations similar to the usual Itô construction. We begin by fixing an appropriate notion of convergence: let $X(\cdot)$ and $X_n(\cdot)$, for each $n = 1, 2, \dots$, be locally square-integrable processes based on $(D_0, A)$; then we say that the $X_n(\cdot)$ converge to $X(\cdot)$ over $[T_1, T_2]$ if
$$\lim_{n\to\infty} \int_{T_1}^{T_2} \big\| \big( X(t) - X_n(t) \big)\, u \otimes \exp(f) \big\|^2\, dt = 0$$
for all $u \in D_0$ and $f \in A$. We also say that the $X_n(\cdot)$ converge to $X(\cdot)$ locally if we have this convergence for every finite subinterval $[T_1, T_2]$. Here the strategy is similar to how ordinary stochastic integrals of Itô type are constructed. We may smooth a locally square-integrable process to get a continuous process in a manner that preserves adaptedness, and conversely always construct an approximating sequence of continuous processes to any locally square-integrable process. In turn, we may discretize any continuous process to obtain a simple process, again maintaining adaptedness, and conversely approximate any adapted continuous process by a sequence of simple ones. The proof of this is given in Hudson and Parthasarathy (1984) as Proposition 3.2, but is almost identical to the standard construction – see for instance Øksendal (1992) – and we omit it. For simple adapted processes $X_{\alpha\beta}(\cdot)$, we define their quantum stochastic integral to be
$$\int_{T_1}^{T_2} X_{\alpha\beta}(s)\, dA^{\alpha\beta}_s = \sum_{j=0}^{n-1} X_{\alpha\beta}(t_j)\, \big( A^{\alpha\beta}_{t_{j+1}} - A^{\alpha\beta}_{t_j} \big), \tag{13.23}$$


where $T_1 = t_0 < t_1 < \cdots < t_n = T_2$ is a partition of $[T_1, T_2]$ that includes all discontinuity points of each of the four processes. Let us introduce the notation
$$\bar h^{\,t_2}_{t_1} = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} h(t)\, dt$$
for the time average of a function over an interval $[t_1, t_2]$.

Lemma 13.5.1 Let $X_{\alpha\beta}(\cdot)$ and $Y_{\alpha\beta}(\cdot)$ be adapted simple processes based on $(D_0, A)$, and let $X(t) = \int_0^t X_{\alpha\beta}(s)\, dA^{\alpha\beta}_s$, $Y(t) = \int_0^t Y_{\mu\nu}(s)\, dA^{\mu\nu}_s$; then, for all $u, v \in D_0$ and $f, g \in A$,
$$\Big\langle u \otimes \exp(f) \,\Big|\, \int_{T_1}^{T_2} X_{\alpha\beta}(s)\, dA^{\alpha\beta}_s\; v \otimes \exp(g) \Big\rangle = \int_{T_1}^{T_2} ds\, [f^*(s)]^\alpha\, \big\langle u \otimes \exp(f) \,\big|\, X_{\alpha\beta}(s)\, v \otimes \exp(g) \big\rangle\, [g(s)]^\beta \tag{13.24}$$
and
$$\begin{aligned} \big\langle X(t)\, u \otimes \exp(f) \,\big|\, Y(t)\, v \otimes \exp(g) \big\rangle = \int_0^t ds\, [f^*(s)]^\alpha \Big\{ & \big\langle X_{\alpha\beta}(s)\, u \otimes \exp(f) \,\big|\, Y(s)\, v \otimes \exp(g) \big\rangle \\ {} + {} & \big\langle X(s)\, u \otimes \exp(f) \,\big|\, Y_{\alpha\beta}(s)\, v \otimes \exp(g) \big\rangle \\ {} + {} & \big\langle X_{1\alpha}(s)\, u \otimes \exp(f) \,\big|\, Y_{1\beta}(s)\, v \otimes \exp(g) \big\rangle \Big\}\, [g(s)]^\beta. \end{aligned} \tag{13.25}$$

Proof Substituting the expression (13.23) into the left-hand side of (13.24), we see that it equals
$$\sum_j \delta t_j\, \big[ \bar{f^*}^{\,t_{j+1}}_{t_j} \big]^\alpha\, \big\langle u \otimes \exp(f) \,\big|\, X_{\alpha\beta}(t_j)\, v \otimes \exp(g) \big\rangle\, \big[ \bar g^{\,t_{j+1}}_{t_j} \big]^\beta,$$
where we set $\delta t_j = t_{j+1} - t_j$. This can be rewritten as the right-hand side of (13.24) by virtue of the piecewise continuous nature of the process. Next we take a partition of $[T_1, T_2] = [0, t]$ that includes all discontinuity points of each of $X_{\alpha\beta}$ and $Y_{\alpha\beta}$. The right-hand side of (13.25) can then be written as
$$\sum_j \delta t_j\, \big[ \bar{f^*}^{\,t_{j+1}}_{t_j} \big]^\alpha \Big\{ \big\langle X_{\alpha\beta}(t_j)\, u \otimes \exp(f) \,\big|\, Y(t_j)\, v \otimes \exp(g) \big\rangle + \big\langle X(t_j)\, u \otimes \exp(f) \,\big|\, Y_{\alpha\beta}(t_j)\, v \otimes \exp(g) \big\rangle + \big\langle X_{1\alpha}(t_j)\, u \otimes \exp(f) \,\big|\, Y_{1\beta}(t_j)\, v \otimes \exp(g) \big\rangle \Big\}\, \big[ \bar g^{\,t_{j+1}}_{t_j} \big]^\beta,$$
and similarly we recover the required integral form by inspection.
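The definition (13.23) has an elementary commutative caricature: a single real integrator A(t) and a scalar simple integrand, for which the quantum stochastic integral reduces to a finite sum evaluated at left endpoints of the partition cells. Illustrative sketch (names are not from the text):

```python
def simple_integral(X, A, mesh):
    """sum_j X(t_j) (A(t_{j+1}) - A(t_j)): the scalar analogue of (13.23),
    with the simple integrand evaluated at the left endpoint of each cell."""
    return sum(X(mesh[j]) * (A(mesh[j + 1]) - A(mesh[j]))
               for j in range(len(mesh) - 1))
```

For a constant integrand the sum telescopes to $A(T_2) - A(T_1)$, and on refining the mesh the left-endpoint sums converge for smooth data, which is the commutative shadow of the approximation argument of this section.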


These are of course expressions we have already seen before when formally manipulating Wick integrals. The following lemma helps us to show that the formulas (13.24) and (13.25) hold true also for adapted locally square-integrable processes.

Lemma 13.5.2 Suppose that A consists of locally bounded functions; then for $T_1 \leq t \leq T_2$, $u \in D_0$ and $f \in A$, we have
$$\| X(t)\, u \otimes \exp(f) \|^2 \leq 6\, C^2 \int_{T_1}^{T_2} dr\, e^{T_2 - r} \sum_{\alpha,\beta} \| X_{\alpha\beta}(r)\, u \otimes \exp(f) \|^2, \tag{13.26}$$
where $C = C(f, T_1, T_2) = \max_{p=0,2} \sup_{T_1 \leq t \leq T_2} |f(t)|^{p/2}$.

Proof Formula (13.25) allows us to deduce that
$$\| X(t)\, u \otimes \exp(f) \|^2 = \int_{T_1}^t ds\, \Big( [f^*(s)]^\alpha [f(s)]^\beta\, 2\,\mathrm{Re}\, \big\langle X(s)\, u \otimes \exp(f) \,\big|\, X_{\alpha\beta}(s)\, u \otimes \exp(f) \big\rangle + \big\| \big( X_{11}(s) f(s) + X_{10}(s) \big)\, u \otimes \exp(f) \big\|^2 \Big).$$
The Hilbert space inequalities $2|\langle \xi \mid \eta\rangle| \leq \|\xi\|^2 + \|\eta\|^2$ and $\big\| \sum_{j=1}^n \xi_j \big\|^2 \leq n \sum_{j=1}^n \|\xi_j\|^2$ then imply
$$\| X(t)\, u \otimes \exp(f) \|^2 \leq \int_{T_1}^t ds\, \Big( \| X(s)\, u \otimes \exp(f) \|^2 + 6\, C(f,T_1,T_2)^2 \sum_{\alpha,\beta} \| X_{\alpha\beta}(s)\, u \otimes \exp(f) \|^2 \Big).$$
By monotonicity, we may differentiate with respect to t to get
$$\frac{d}{dt} \| X(t)\, u \otimes \exp(f) \|^2 \leq \| X(t)\, u \otimes \exp(f) \|^2 + 6\, C(f,T_1,T_2)^2 \sum_{\alpha,\beta} \| X_{\alpha\beta}(t)\, u \otimes \exp(f) \|^2.$$
The result follows from applying the integrating factor $e^{-t}$ to both sides and integrating up.

Whenever the $X_{\alpha\beta}(\cdot)$ are adapted locally square-integrable processes based on $(D_0, A)$, with A consisting of locally bounded functions, then there exists an approximating family $X^{(n)}_{\alpha\beta}(\cdot)$ of adapted simple processes based on $(D_0, A)$, and


as a corollary to the previous lemma we have that $X^{(n)}(t)\, u \otimes \exp(f) = \int_0^t X^{(n)}_{\alpha\beta}(s)\, dA^{\alpha\beta}_s\; u \otimes \exp(f)$ converges in F, for every $u \in D_0$ and $f \in A$. (Apply the lemma to $X^{(n)} - X^{(m)}$!) The limit vector will be independent of the approximating sequence and can be denoted as $\int_0^t X_{\alpha\beta}(s)\, dA^{\alpha\beta}_s\; u \otimes \exp(f)$. This serves as the definition of the quantum stochastic integral $X(t) = \int_0^t X_{\alpha\beta}(s)\, dA^{\alpha\beta}_s$ on the domain $D_0 \otimes A$. By construction, $X(\cdot)$ will be adapted, and the formulas (13.24), (13.25), and (13.26) apply to such integrals.

13.5.1 QSDEs

Let $G_{\alpha\beta}$ be four operators on the initial space $h_0$ with common invariant domain D. We wish to show the existence and uniqueness of solutions to the quantum stochastic differential equation (QSDE)
$$dU_t = G_{\alpha\beta}\, U_t\, dA^{\alpha\beta}_t, \qquad U_0 = 1.$$
This can equivalently be considered as the integro-differential equation
$$U_t = 1 + \int_0^t G_{\alpha\beta}\, U_s\, dA^{\alpha\beta}_s,$$
and the standard technique for dealing with such problems is Picard iteration. We define a sequence $\{ U^{(n)} : n = 0, 1, 2, \dots \}$ of processes by setting $U^{(0)} = 1$ and then
$$U^{(n)}_t = 1 + \int_0^t G_{\alpha\beta}\, U^{(n-1)}_s\, dA^{\alpha\beta}_s.$$
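The Picard scheme behaves exactly as in the classical case: for the scalar ODE $dU = G\,U\,dt$ the n-th correction is $G^n\,\mathrm{vol}(\Delta_n(t)) = G^n t^n / n!$, which is the factorial decay behind the estimate of Lemma 13.5.3 below. Illustrative sketch:

```python
import math

def picard(G, T, n_iter, n_grid=2000):
    """Picard iteration for the scalar ODE dU = G U dt, U(0) = 1, i.e.
    U^{(n)}(t) = 1 + G int_0^t U^{(n-1)}(s) ds (trapezoidal quadrature).
    The n-th correction is K^{(n)}(t) = G^n vol(Simplex_n(t)) = G^n t^n / n!."""
    h = T / n_grid
    U = [1.0] * (n_grid + 1)
    for _ in range(n_iter):
        integral = 0.0
        new_U = [1.0]
        for i in range(n_grid):
            integral += 0.5 * (U[i] + U[i + 1]) * h
            new_U.append(1.0 + G * integral)
        U = new_U
    return U[-1]
```

The iterates converge to $e^{GT}$, and consecutive differences reproduce the simplex volumes $G^n T^n / n!$.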

We may write $U^{(n)}_t = \sum_{m=0}^{n} K^{(m)}_t$, where
$$K^{(n)}_t \equiv G_{\alpha_n\beta_n} \cdots G_{\alpha_1\beta_1} \otimes \int_{\Delta_n(t)} dA^{\alpha_n\beta_n}_{t_n} \cdots dA^{\alpha_1\beta_1}_{t_1}$$
(implied summation over $4^n$ terms!), and we recall that $\Delta_n(t)$ is the simplex $t > t_n > \cdots > t_1 > 0$. In the following, we will again take A to consist of locally bounded functions.

Lemma 13.5.3 (Hudson–Parthasarathy) For fixed $u \in D$, let us set $G(u,n) = \max_{\alpha_n,\beta_n,\dots,\alpha_1,\beta_1} \| G_{\alpha_n\beta_n} \cdots G_{\alpha_1\beta_1} u \|$. Let $f \in A$ and $T > 0$; then we have, for each $t \in [0,T]$,
$$\big\| K^{(n)}_t\, u \otimes \exp(f) \big\|^2 \leq e^{T + \|f\|^2}\, G(u,n)^2\, \frac{1}{n!} \big( 24\, T\, C(f,0,T)^2 \big)^n.$$

Proof Let us set $V^{(n)}_t = \sum_{\alpha_n,\beta_n,\dots,\alpha_1,\beta_1} \int_{\Delta_n(t)} dA^{\alpha_n\beta_n}_{t_n} \cdots dA^{\alpha_1\beta_1}_{t_1}$; then
$$\big\| K^{(n)}_t\, u \otimes \exp(f) \big\| \leq G(u,n)\, \big\| V^{(n)}_t \exp(f) \big\|.$$
Now $V^{(n)}_t = \int_0^t V^{(n-1)}_s\, dN_s$, where $N_t = A^{11}_t + A^{10}_t + A^{01}_t + A^{00}_t$, and from the previous lemma we find that
$$\big\| V^{(n)}_t \exp(f) \big\|^2 \leq 6\, C(f,0,T)^2 \int_0^t e^{s-t}\, 4\, \big\| V^{(n-1)}_s \exp(f) \big\|^2\, ds,$$
and so, by induction, we get the required estimate
$$\big\| K^{(n)}_t\, u \otimes \exp(f) \big\|^2 \leq \big( 24\, C(f,0,T)^2 \big)^n\, G(u,n)^2\, e^T\, \| \exp(f) \|^2\, \mathrm{vol}\big( \Delta_n(T) \big).$$
(Note that $V^{(n)}_t \equiv N^n_t$, although we did not make use of this fact.)

From the lemma, we see that
$$\int_0^T dt\, \big\| U^{(n)}_t\, u \otimes \exp(f) \big\|^2 \leq n \sum_{m=0}^{n} \int_0^T dt\, \big\| K^{(m)}_t\, u \otimes \exp(f) \big\|^2 < \infty,$$
and so $U^{(n)}$ is locally square-integrable. By construction, it is also an adapted process. At this stage, we should give a domain for the $U^{(n)}$. Let us set
$$D_0 = \Big\{ u \in D : \sum_n \frac{1}{\sqrt{n!}}\, G(u,n)\, \tau^n < \infty, \text{ for all } \tau > 0 \Big\}.$$
Now $\big\| K^{(n)}_t\, u \otimes \exp(f) \big\| \leq e^{(T + \|f\|^2)/2}\, G(u,n)\, \frac{1}{\sqrt{n!}}\, \big( \sqrt{24\,T}\, C(f,0,T) \big)^n$, and so
$$\sup_{0 \leq t \leq T} \sum_n \big\| K^{(n)}_t\, u \otimes \exp(f) \big\| < \infty,$$
and so we see that $U^{(n)}_t\, u \otimes \exp(f) = \sum_{m=0}^{n} K^{(m)}_t\, u \otimes \exp(f)$ converges as $n \to \infty$ in a manner that is uniform in t. The limit can be denoted $U_t\, u \otimes \exp(f)$ and clearly yields a quantum stochastic process U based on $(D_0, A)$. The uniformity of the convergence, coupled with the estimates in the lemma, allows us to take strong limits on $D_0 \otimes A$ to obtain
$$U_t = \operatorname*{s-lim}_{n\to\infty} U^{(n)}_t = 1 + \int_0^t G_{\alpha\beta}\; \operatorname*{s-lim}_{n\to\infty} U^{(n-1)}_s\, dA^{\alpha\beta}_s = 1 + \int_0^t G_{\alpha\beta}\, U_s\, dA^{\alpha\beta}_s,$$

and so $U$ satisfies the QSDE and is consequently continuous. Of course, it is also locally square-integrable.

To establish uniqueness of the solution, we note that if $U$ and $U'$ are both solutions, then $W = U - U'$ satisfies $W_t = \int_0^t G_{\alpha\beta}\, W_s\, dA^{\alpha\beta}_s$. Comparing with the relation $K^{(n)}_t = \int_0^t G_{\alpha\beta}\, K^{(n-1)}_s\, dA^{\alpha\beta}_s$ in the previous lemma, we likewise arrive at the estimate
\[ \| W_t\, u \otimes \exp(f) \|^2 \le e^{T + \|f\|^2}\, G(u,n)^2\, \frac{1}{n!} \big( 24\,T\, C(f,0,T)^2 \big)^n \]
for $t \in [0,T]$, and so, taking $n \to \infty$, we find that $U$ and $U'$ agree on $\mathcal{D}_0 \otimes \mathcal{A}$.
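The convergence mechanism is driven by the factorial decay of the simplex volume, $\mathrm{vol}(\Delta_n(T)) = T^n/n!$. A toy pure-Python sketch of this (our illustration, not the operator-valued construction: we replace the noise differentials by $dt$, so that the Picard scheme collapses to the exponential series):

```python
# Toy Picard iteration for dU = G U dt, U_0 = 1 (scalar stand-in for the
# QSDE).  The n-th correction K^(n)(t) is the integral of G^n over the
# simplex Delta_n(t), i.e. G^n t^n / n!, mirroring vol(Delta_n(T)) = T^n/n!.
from math import exp, factorial

def picard(G, t, n_iter):
    """U^(n)(t) = 1 + int_0^t G U^(n-1)(s) ds, evaluated in closed form."""
    return sum((G * t) ** m / factorial(m) for m in range(n_iter + 1))

G, t = 0.7, 2.0
errors = [abs(picard(G, t, n) - exp(G * t)) for n in range(12)]
# successive iterates approach the fixed point e^{Gt} monotonically
assert all(errors[k + 1] < errors[k] for k in range(11))
```

The same factorial estimate is what makes the operator-valued series $\sum_m K^{(m)}_t u \otimes \exp(f)$ absolutely summable on the domain $\mathcal{D}_0$.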

13.5.2 Unitarity

We now seek conditions on the coefficients $G_{\alpha\beta}$ such that the process $U$ will be unitary. For simplicity, we take them to be bounded, and so we can set $\mathcal{D} = \mathfrak{h}$. The isometry condition $U_t^* U_t = 1$ implies that
\[ 0 = d\big( U_t^* U_t \big) = (dU_t^*)\, U_t + U_t^*\, (dU_t) + (dU_t^*)(dU_t) = U_t^* \big( G^*_{\beta\alpha} + G_{\alpha\beta} + G^*_{1\alpha} G_{1\beta} \big) U_t\, dA^{\alpha\beta}_t, \]
while the co-isometry condition $U_t U_t^* = 1$ implies that
\[ 0 = d\big( U_t U_t^* \big) = (dU_t)\, U_t^* + U_t\, (dU_t^*) + (dU_t)(dU_t^*) = \big( G_{\alpha\beta} + G^*_{\beta\alpha} + G_{1\alpha} G^*_{1\beta} \big)\, dA^{\alpha\beta}_t, \]
leading us to the conditions
\[ G^*_{\beta\alpha} + G_{\alpha\beta} + G^*_{1\alpha} G_{1\beta} = 0 = G_{\alpha\beta} + G^*_{\beta\alpha} + G_{1\alpha} G^*_{1\beta}. \]
The general solution to these equations is
\[ G_{11} = S - 1, \qquad G_{10} = L, \qquad G_{01} = -S L^*, \qquad G_{00} = -\tfrac{1}{2} L L^* - iH, \]
where $S$ is unitary, $L$ is bounded but otherwise arbitrary, and $H$ is self-adjoint. For $X$ a bounded operator on $\mathfrak{h}$, we set $J_t(X) = U_t^* (X \otimes 1) U_t$, and from the quantum Itô formula we find
\[ dJ_t(X) = J_t\big( \mathcal{L}_{\alpha\beta}(X) \big)\, dA^{\alpha\beta}_t, \]

where we introduce the bounded linear maps
\[ \mathcal{L}_{\alpha\beta}(X) = X G^*_{\beta\alpha} + G_{\alpha\beta} X + G^*_{1\alpha} X G_{1\beta}. \]
These maps each have the algebraic property that $\mathcal{L}_{\alpha\beta}(1) = 0$, which just follows from the unitarity requirement on the $G_{\alpha\beta}$. We see that
\[ \| U_t\, u \otimes \exp(f) \| \le e^{(T + \|f\|^2)/2}\, \sum_n \big( \sqrt{24\,T}\, C(f,0,T) \big)^n \max_{\alpha\beta} \| G_{\alpha\beta} \|^n\, \frac{\|u\|}{\sqrt{n!}} \]
for $t \in [0,T]$. Therefore, for fixed $f, g \in \mathcal{A}$, we have a well-defined bounded operator $R_t$ acting on the initial space such that
\[ \langle u | R_t\, v \rangle = \langle U_t\, u \otimes \exp(f) \,|\, U_t\, v \otimes \exp(g) \rangle \]
for all $u, v \in \mathfrak{h}$. $R$ will be locally bounded, and we see that (with the convention $f^0 \equiv 1$, $f^1 = f$)
\[ \frac{dR_t}{dt} = f^\alpha(t)^*\, \mathcal{L}_{\alpha\beta}(1)\, g^\beta(t) = 0. \]
We see that $R_t$ is the constant $R_0 = \langle \exp(f) | \exp(g) \rangle$. Therefore,
\[ \langle U_t\, u \otimes \exp(f) \,|\, U_t\, v \otimes \exp(g) \rangle = \langle u \otimes \exp(f) \,|\, v \otimes \exp(g) \rangle. \]
It follows that $U_t$ is an isometry.
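The algebraic content of the unitarity conditions is easy to check in the commutative shadow of the problem. A pure-Python sketch (our illustration: complex scalars stand in for the operators $S$, $L$, $H$, so operator ordering plays no role):

```python
# Scalar sanity check: G11 = S-1, G10 = L, G01 = -S*conj(L),
# G00 = -|L|^2/2 - i*H satisfy the isometry relations
#   conj(G[b,a]) + G[a,b] + conj(G[1,a]) * G[1,b] = 0   for all a, b,
# hence also L_{ab}(1) = 0 for the Lindblad-type maps above.
import cmath

S = cmath.exp(1j * 0.8)    # unitary scalar, |S| = 1
L = 0.3 - 0.5j             # arbitrary
H = 1.7                    # real, standing in for a self-adjoint operator

G = {(1, 1): S - 1,
     (1, 0): L,
     (0, 1): -S * L.conjugate(),
     (0, 0): -0.5 * L * L.conjugate() - 1j * H}

for a in (0, 1):
    for b in (0, 1):
        lhs = G[(b, a)].conjugate() + G[(a, b)] \
              + G[(1, a)].conjugate() * G[(1, b)]
        assert abs(lhs) < 1e-12
```

In the genuinely noncommutative case the same cancellations occur, but the order of the factors $G^*_{1\alpha}$ and $G_{1\beta}$ must of course be respected.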

13.6 Quantum Stratonovich Integrals

Given a pair of quantum stochastic integral processes determined by $\dot X = [a^*]^\alpha X_{\alpha\beta} [a]^\beta$ and $\dot Y = [a^*]^\alpha Y_{\alpha\beta} [a]^\beta$, their quadratic covariation is given by
\[ [\![X,Y]\!]_t = \int_0^t dX\, dY \equiv \int_0^t [a^*]^\alpha\, X_{\alpha1} Y_{1\beta}\, [a]^\beta. \]
As we are dealing with noncommutative processes, the covariation will generally be nonsymmetric. The quantum Itô rule may then be written in the form
\[ d(XY) = (dX)\, Y_- + X_-\, (dY) + d[\![X,Y]\!]. \]
The analogue of the Stratonovich integral is to define $\int X\, \delta Y$ as the limit of finite sum approximations using a midpoint rule, first shown by Chebotarev (1997) (see also Gough, 1997): $\sum_k X_{t_k^*} \big( Y_{t_{k+1}} - Y_{t_k} \big)$, with $t_k^* = \tfrac{1}{2}(t_{k+1} + t_k)$. This turns out to be the same as
\[ \int X\, \delta Y \equiv \int X\, (dY) + \frac{1}{2}\, [\![X,Y]\!], \]

or in differential language
\[ X\, (\delta Y) = X\, (dY) + \frac{1}{2}\, dX\, dY. \]
Similarly, we have $\int (\delta X)\, Y \equiv \int (dX)\, Y + \frac{1}{2}\, [\![X,Y]\!]$ and
\[ (\delta X)\, Y = (dX)\, Y + \frac{1}{2}\, dX\, dY. \]
The quantum Itô rule then implies the Stratonovich rule
\[ d(XY) \equiv (\delta X)\, Y_- + X_-\, (\delta Y), \tag{13.27} \]
which restores the Leibniz form. Remarkably, the Stratonovich differentials take the form
\[ X\, (\delta Y) \equiv \Big( X + \frac{1}{2}\, dX \Big)\, dY \qquad \text{and} \qquad (\delta X)\, Y \equiv dX\, \Big( Y + \frac{1}{2}\, dY \Big). \]
As such, the Itô correction (covariation) is shared equally between the two differentials. When we originally derived the Wick product formula, we used the singular commutation relations first and separated off the correction, then split the remainder into two time-ordered integrals.
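The midpoint rule has an exact discrete counterpart: if the midpoint value $X_{t_k^*}$ is approximated by the average of the endpoint values, the identity "Stratonovich sum = Itô sum + half covariation" holds at the level of finite sums, before any limit is taken. A quick pure-Python check (the sample paths are arbitrary):

```python
# Discrete, commutative toy model of the midpoint rule:
#   sum (x_k + x_{k+1})/2 * dy_k  ==  sum x_k * dy_k + (1/2) sum dx_k * dy_k
# exactly, for any two sequences x, y.
import random

random.seed(0)
x = [random.gauss(0, 1) for _ in range(1000)]
y = [random.gauss(0, 1) for _ in range(1000)]

strat = sum((x[k] + x[k + 1]) / 2 * (y[k + 1] - y[k]) for k in range(999))
ito = sum(x[k] * (y[k + 1] - y[k]) for k in range(999))
covar = sum((x[k + 1] - x[k]) * (y[k + 1] - y[k]) for k in range(999))

assert abs(strat - (ito + covar / 2)) < 1e-9
```

In the quantum setting the covariation term is the nonsymmetric $[\![X,Y]\!]$, so the order of the factors matters, but the bookkeeping is the same.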

13.7 The Quantum White Noise Formulation

Let us repeat, for clarity, the arguments in the setting of quantum stochastic integral processes. We have $X_t Y_t = \int_0^t ds \int_0^t dr\, \dot X_s \dot Y_r$ with
\[ \dot X_s \dot Y_r = [a^*_s]^\alpha\, X_{\alpha\beta}(s)\, [a_s]^\beta\, [a^*_r]^\mu\, Y_{\mu\nu}(r)\, [a_r]^\nu. \tag{13.28} \]
We know that $X_t Y_t \equiv \int_0^t (\delta X)\, Y + \int_0^t X\, (\delta Y)$, while at the same time we would expect, from a reasonable definition of fluxions, to have
\[ X_t Y_t = \int_0^t \dot X\, Y + \int_0^t X\, \dot Y. \tag{13.29} \]
Let us therefore do the obvious thing and define
\[ \int_0^t \dot X_s\, Y_s\, ds \triangleq \int_0^t (\delta X_s)\, Y_s, \qquad \int_0^t X_s\, \dot Y_s\, ds \triangleq \int_0^t X_s\, (\delta Y_s), \tag{13.30} \]
so that we have
\[ \dot X_t\, Y_t \equiv [a^*_t]^\alpha\, X_{\alpha\beta}(t)\, Y(t)\, [a_t]^\beta + \frac{1}{2}\, [a^*_t]^\alpha\, X_{\alpha1}(t)\, Y_{1\beta}(t)\, [a_t]^\beta. \tag{13.31} \]

However, in a reasonable definition of fluxions, we might expect that
\[ \dot X_t\, Y_t \equiv [a^*_t]^\alpha\, X_{\alpha\beta}(t)\, [a_t]^\beta\, Y(t). \tag{13.32} \]
Evidently the equations (13.31) and (13.32) will be consistent if
\[ [a_t]^\beta\, Y(t) = Y(t)\, [a_t]^\beta + \frac{1}{2}\, \hat\delta^{\beta\mu}\, Y_{\mu\nu}(t)\, [a_t]^\nu, \]
that is,
\[ [a_t, Y(t)] = \frac{1}{2}\, Y_{1\nu}(t)\, [a_t]^\nu. \tag{13.33} \]
We arrived at this equation by asking for some reasonable algebraic rules for the fluxions. So far, nothing is rigorous; however, we have been able at least to express otherwise undefined objects in terms of well-defined Wick-ordered ones. We may justify (13.33) by the following formal manipulations:
\[ \big[ a_t, Y(t^-) \big] = \Big[ a_t, \int_0^{t^-} \dot Y_s\, ds \Big] = \int_0^{t^-} ds\, \big[ a_t, \dot Y_s \big] = \int_0^{t^-} ds\, \delta(t - s)\, Y_{1\nu}(s)\, [a_s]^\nu. \]
Here we have dropped the term $[a_t, Y_{\mu\nu}(s)]$, since this should vanish for $t > s$ as the integrands are adapted by assumption. We get the answer we want if we adopt the rule that $\int_0^{t^-} ds\, \delta(t-s)\, f(s) \equiv \frac{1}{2} f(t^-)$. A similar set of manipulations would suggest that $[X_t, a^*_t] = \frac{1}{2}\, [a^*_t]^\alpha\, X_{\alpha1}(t)$, and so
\[ X_t\, \dot Y_t = X_t\, [a^*_t]^\alpha\, Y_{\alpha\beta}(t)\, [a_t]^\beta = [a^*_t]^\alpha\, X_t\, Y_{\alpha\beta}(t)\, [a_t]^\beta + \frac{1}{2}\, [a^*_t]^\alpha\, X_{\alpha1}(t)\, Y_{1\beta}(t)\, [a_t]^\beta, \]
and therefore
\[ \dot X_t\, Y_t + X_t\, \dot Y_t = [a^*_t]^\alpha\, X_{\alpha\beta}(t)\, Y(t)\, [a_t]^\beta + [a^*_t]^\alpha\, X_t\, Y_{\alpha\beta}(t)\, [a_t]^\beta + [a^*_t]^\alpha\, X_{\alpha1}(t)\, Y_{1\beta}(t)\, [a_t]^\beta, \]
which is precisely the quantum Itô product rule we desire.

So far, we have used the symbols $a_t$ and $a^*_t$ purely as a notation for terms under the integral sign in quantum stochastic integrals. However, this is motivated by the following: whenever we multiply these integrals together, we encounter these operator densities out of normal (Wick) order, and the upshot

of the preceding is that in the process of putting these to normal order, we end up with the Itô correction! One ends up with a technique that extends to multiple integrals, where one can essentially ignore the diagonal terms provided one gives a rule for what to do when one encounters a delta function supported at the boundary of a simplex. A straightforward way of tackling this is to take the commutation relations to be (see Accardi et al., 2002; Gough, 1997)
\[ [a_t, a^*_s] = \delta_*(t - s), \tag{13.34} \]
where we have
\[ \delta_*(t - s) = \frac{1}{2}\, \delta_+(t - s) + \frac{1}{2}\, \delta_-(t - s), \tag{13.35} \]
with the rules that
\[ \int f(s)\, \delta_\pm(t - s)\, ds = f(t^\pm). \tag{13.36} \]
This brings us back to our introductory comments. A fully rigorous theory, known as quantum white noise analysis, has been developed by Accardi et al. (2000); it is a natural generalization of the Hida theory of (classical) white noise, as in Obata (1994). We should mention that an extension to higher-order processes has been given by Accardi et al. (1996); see also Accardi et al. (2002).

13.8 Quantum Stochastic Exponentials

As in the classical case, we are faced with several possible mechanisms to define a stochastic exponential of a quantum stochastic integral process. Filling the role of the diagonal-free exponential is the Itô time-ordered exponential. The quantum Itô time-ordered exponential is denoted $U_t = \mathrm{T}_I e^{X_t}$ and defined as the solution to the QSDE
\[ dU_t = (dX_t)\, U_t, \qquad U_0 = I. \]
This admits the series expansion
\[ \mathrm{T}_I e^{X_t} = \sum_n \int_{\Delta_n(t)} dX_{t_n} \cdots dX_{t_1}. \]
Note that the increments do not necessarily commute – even for different times!

An alternative is the Holevo time-ordered exponential introduced in Holevo (1992). The Holevo time-ordered exponential is denoted $V_t = \mathrm{T}_H e^{Y_t}$ and defined by
\[ dV_t = \big( e^{dY_t} - 1 \big)\, V_t, \qquad V_0 = I. \]
If the increments commuted, we could reconstitute this as $\mathrm{T}_H e^{Y_t} = e^{Y_t}$, but this would be an exceptional situation. The time-ordered exponential is to be understood as the "Trotterization"
\[ \mathrm{T}_H e^{Y_t} = \lim_{n\to\infty}\, e^{Y(t_n) - Y(t_{n-1})} \cdots e^{Y(t_1) - Y(t_0)}, \]
where $t = t_n > t_{n-1} > \cdots > t_0 = 0$ is a grid with $\max_k (t_k - t_{k-1}) \to 0$ in the limit. Holevo (1992, 1996) establishes convergence on the domain of exponential vectors for stochastic integrals $Y_t = \int_0^t Y_{\alpha\beta}\, dA^{\alpha\beta}$ under the condition that the coefficients are ultrastrongly admissible, that is, there is a sequence $\{Y^{(n)}_{\alpha\beta}\}$ of adapted simple processes such that $|||Y^{(n)}|||_t < \infty$ and $|||Y - Y^{(n)}|||_t \to 0$, where
\[ |||Y|||_t \triangleq \operatorname*{ess\,sup}_{0 \le s \le t}\, \sup_{i,j}\, \| Y_{ij}(s) \| + \sqrt{ \sum_i \int_0^t \| Y_{i0}(s) \|^2\, ds } + \sqrt{ \sum_j \int_0^t \| Y_{0j}(s) \|^2\, ds } + \int_0^t \| Y_{00}(s) \|\, ds. \]

In particular, the number of noise channels may be countable. The Holevo time-ordered exponential arises naturally as the continuous limit of discrete-time open quantum dynamics; see, for instance, Attal and Pautrat (2006) and Gough (2004).

To determine the relationship between the two, we need to work out the integrals $X^{[n]}_t \equiv \int_0^t (dX)^n$.

Lemma 13.8.1 Let $X$ be a quantum stochastic integral determined by $\dot X = [a^*_t]^\alpha X_{\alpha\beta}(t) [a_t]^\beta$; then (for $n \ge 2$)
\[ \dot X^{[n]}_t \equiv [a^*_t]^\alpha\, X_{\alpha1}(t)\, X_{11}(t)^{n-2}\, X_{1\beta}(t)\, [a_t]^\beta. \tag{13.37} \]
Let $f$ be an analytic function, say $f(z) = \sum_{n \ge 0} \frac{1}{n!} f_n z^n$; then the $k$-th decapitated version of $f$ is the analytic function
\[ f_k(z) \triangleq \frac{ f(z) - \sum_{n=0}^{k-1} \frac{1}{n!} f_n z^n }{ z^k } = \sum_{n \ge 0} \frac{ f_{n+k} }{ (n+k)! }\, z^n. \]

Proposition 13.8.2 We have the identification $\mathrm{T}_I e^{X_t} = \mathrm{T}_H e^{Y_t}$ with the quantum stochastic integrals related by
\[ X_{\alpha\beta} = Y_{\alpha\beta} + Y_{\alpha1}\, \exp_2(Y_{11})\, Y_{1\beta}, \qquad Y_{\alpha\beta} = X_{\alpha\beta} + X_{\alpha1}\, \varphi_2(X_{11})\, X_{1\beta}, \]
where $\varphi(z) = \ln(1 + z)$. Note that $\exp_2(z) = \frac{e^z - 1 - z}{z^2}$ and $\varphi_2(z) = \frac{\ln(1+z) - z}{z^2}$.

The result is immediate from the Itô rule, and the proposition is a consequence of the relations $X_t = \sum_{n \ge 1} \frac{1}{n!} Y^{[n]}_t$ and $Y_t = \sum_{n \ge 1} \frac{(-1)^{n+1}}{n} X^{[n]}_t$, which we encountered in Theorem 2.5.1 and which also hold in the noncommutative case. In particular, note that

\[ X_{11} = e^{Y_{11}} - I, \qquad X_{10} = \exp_1(Y_{11})\, Y_{10}, \qquad X_{01} = Y_{01}\, \exp_1(Y_{11}), \qquad X_{00} = Y_{00} + Y_{01}\, \exp_2(Y_{11})\, Y_{10}, \]
\[ Y_{11} = \ln(I + X_{11}), \qquad Y_{10} = \varphi_1(X_{11})\, X_{10}, \qquad Y_{01} = X_{01}\, \varphi_1(X_{11}), \qquad Y_{00} = X_{00} + X_{01}\, \varphi_2(X_{11})\, X_{10}. \]
Yet another possibility is afforded by the Stratonovich time-ordered exponential. We define $W_t = \mathrm{T}_S e^{Z_t}$ as the solution to the QSDE
\[ dW_t = (\delta Z_t)\, W_t, \qquad W_0 = I. \]

This is the same formal definition as the time-ordered Stratonovich exponential (2.27), though the specification that the Stratonovich increment appears on the left is now a crucial aspect. This admits the series expansion
\[ \mathrm{T}_S e^{Z_t} = \sum_n \int_{\Delta_n(t)} \delta Z_{t_n} \cdots \delta Z_{t_1}. \]
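The coefficient maps of Proposition 13.8.2 are easy to verify numerically in the scalar (commutative) case, where the decapitated functions reduce to ordinary functions of one variable. A pure-Python sketch (the coefficient values are arbitrary test data):

```python
# Scalar check of Proposition 13.8.2: with exp2(z) = (e^z - 1 - z)/z^2 and
# phi2(z) = (ln(1+z) - z)/z^2, the maps
#   X_ab = Y_ab + Y_a1 * exp2(Y11) * Y_1b
#   Y_ab = X_ab + X_a1 * phi2(X11) * X_1b
# are mutually inverse; in particular X11 = e^{Y11} - 1.
from math import exp, log

def exp2(z): return (exp(z) - 1 - z) / z**2
def phi2(z): return (log(1 + z) - z) / z**2

Y = {(1, 1): 0.6, (1, 0): -0.4, (0, 1): 0.9, (0, 0): 0.2}
X = {(a, b): Y[(a, b)] + Y[(a, 1)] * exp2(Y[(1, 1)]) * Y[(1, b)]
     for a in (0, 1) for b in (0, 1)}
Y_back = {(a, b): X[(a, b)] + X[(a, 1)] * phi2(X[(1, 1)]) * X[(1, b)]
          for a in (0, 1) for b in (0, 1)}

for k in Y:
    assert abs(Y_back[k] - Y[k]) < 1e-12
assert abs(X[(1, 1)] - (exp(Y[(1, 1)]) - 1)) < 1e-12
```

In the operator case the same formulas hold with $\exp_2$ and $\varphi_2$ evaluated by functional calculus, but the left/right placement of $X_{\alpha1}$ and $X_{1\beta}$ must be kept.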

Proposition 13.8.3 We have $\mathrm{T}_S e^{Z_t} = \mathrm{T}_I e^{X_t}$, where
\[ X_{\alpha\beta} = Z_{\alpha\beta} + \frac{1}{2}\, Z_{\alpha1} \Big( I - \frac{1}{2} Z_{11} \Big)^{-1} Z_{1\beta}, \qquad Z_{\alpha\beta} = X_{\alpha\beta} - \frac{1}{2}\, X_{\alpha1} \Big( I + \frac{1}{2} X_{11} \Big)^{-1} X_{1\beta}. \]

Proof As in Proposition 2.6.1, the consistency condition is $dX = dZ + \frac{1}{2}\, dZ\, dX$; however, this time the order of the product is important. Comparing coefficients of the fundamental noise processes shows that
\[ X_{\alpha\beta} = Z_{\alpha\beta} + \frac{1}{2}\, Z_{\alpha1}\, X_{1\beta}, \]
and in particular $X_{11} = \big( I - \tfrac{1}{2} Z_{11} \big)^{-1} Z_{11}$ and $Z_{11} = \big( I + \tfrac{1}{2} X_{11} \big)^{-1} X_{11}$. The other terms are readily worked out through simple algebra.
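In the scalar case one can check directly that the closed-form coefficients solve the consistency relation $X_{\alpha\beta} = Z_{\alpha\beta} + \frac{1}{2} Z_{\alpha1} X_{1\beta}$ from the proof, and that the two maps are mutually inverse. A pure-Python sketch (arbitrary test values; operator ordering disappears in the scalar case):

```python
# Scalar check of the Stratonovich <-> Ito coefficient maps:
#   X_ab = Z_ab + (1/2) Z_a1 Z_1b / (1 - Z11/2)
# solves X_ab = Z_ab + (1/2) Z_a1 X_1b, and
#   Z_ab = X_ab - (1/2) X_a1 X_1b / (1 + X11/2)
# recovers Z.
Z = {(1, 1): 0.4, (1, 0): -0.7, (0, 1): 1.3, (0, 0): 0.25}

X = {(a, b): Z[(a, b)] + 0.5 * Z[(a, 1)] * Z[(1, b)] / (1 - Z[(1, 1)] / 2)
     for a in (0, 1) for b in (0, 1)}

for a in (0, 1):                       # consistency relation from the proof
    for b in (0, 1):
        assert abs(X[(a, b)] - (Z[(a, b)] + 0.5 * Z[(a, 1)] * X[(1, b)])) < 1e-12

Z_back = {(a, b): X[(a, b)] - 0.5 * X[(a, 1)] * X[(1, b)] / (1 + X[(1, 1)] / 2)
          for a in (0, 1) for b in (0, 1)}
for k in Z:
    assert abs(Z_back[k] - Z[k]) < 1e-12
```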

13.9 The Belavkin–Holevo Representation

The Hudson–Parthasarathy theory leads to a quantum Itô calculus where we have the product rule
\[ d(X_t Y_t) = dX_t\, Y_t + X_t\, dY_t + dX_t\, dY_t. \tag{13.38} \]
Here the Itô correction may be thought of as being due to Wick ordering. It is also possible to give a representation, akin to the matrix representation of the Heisenberg group, where the correction is accounted for by having the ordinary product but with higher-dimensional matrices. This was given by Belavkin (1988) and Holevo (1989).

13.9.1 Matrix Algebra Notation

Let $\mathcal{A}$ be a fixed algebra and let $\mathcal{A}^{n\times m}$ denote the set of $n \times m$ arrays with entries in $\mathcal{A}$. Given $X \in \mathcal{A}^{n\times m}$ and $Y \in \mathcal{A}^{m\times r}$, we say that the pair $(X, Y)$ is composable and define their product $XY \in \mathcal{A}^{n\times r}$ to be the usual matrix product. Taking $n \ge 1$ to be a fixed dimension, we consider arrays of the form
\[ X_{00} \in \mathcal{A}^{1\times1} \equiv \mathcal{A}, \qquad X_{0\bullet} = [X_{01}, \dots, X_{0n}] \in \mathcal{A}^{1\times n}, \]
\[ X_{\bullet0} = \begin{bmatrix} X_{10} \\ \vdots \\ X_{n0} \end{bmatrix} \in \mathcal{A}^{n\times1}, \qquad X_{\bullet\bullet} = \begin{bmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & & \vdots \\ X_{n1} & \cdots & X_{nn} \end{bmatrix} \in \mathcal{A}^{n\times n}. \]
We may assemble these components into the following square matrices:
\[ \mathsf{X} = \begin{bmatrix} X_{00} & X_{0\bullet} \\ X_{\bullet0} & X_{\bullet\bullet} \end{bmatrix} \in \mathcal{A}^{(1+n)\times(1+n)} \tag{13.39} \]
and
\[ \mathbb{X} = \begin{bmatrix} 0 & X_{0\bullet} & X_{00} \\ 0 & X_{\bullet\bullet} & X_{\bullet0} \\ 0 & 0 & 0 \end{bmatrix} \in \mathcal{A}^{(1+n+1)\times(1+n+1)}. \tag{13.40} \]

We shall write $\mathsf{X} \longleftrightarrow \mathbb{X}$ whenever $\mathsf{X}$ and $\mathbb{X}$ take the forms (13.39) and (13.40), respectively, with the same component entries $\{X_{\alpha\beta}\}$. Important special cases are the following:
\[ \mathsf{P} \triangleq \begin{bmatrix} 0 & 0 \\ 0 & I_n \end{bmatrix}, \qquad \mathbb{P} \triangleq \begin{bmatrix} 0 & 0 & 0 \\ 0 & I_n & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad \mathbb{I} \triangleq \begin{bmatrix} 1 & 0 & 0 \\ 0 & I_n & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad \mathbb{J} \triangleq \begin{bmatrix} 0 & 0 & 1 \\ 0 & I_n & 0 \\ 1 & 0 & 0 \end{bmatrix}, \]
where $I_n$ is the $n \times n$ identity matrix.

Lemma 13.9.1 We have the following identifications:
\[ \mathsf{X}^\dagger \longleftrightarrow \mathbb{X}^\flat \triangleq \mathbb{J}\, \mathbb{X}^\dagger\, \mathbb{J}, \qquad \mathsf{X}\mathsf{P}\mathsf{Y} \longleftrightarrow \mathbb{X}\mathbb{Y}, \qquad \mathsf{X}\mathsf{Y} \longleftrightarrow \mathbb{X}\mathbb{J}\mathbb{Y}. \]
It is instructive to check these relations. To begin with, we easily see that
\[ \mathbb{X}^\flat = \begin{bmatrix} 0 & (X_{\bullet0})^\dagger & (X_{00})^\dagger \\ 0 & (X_{\bullet\bullet})^\dagger & (X_{0\bullet})^\dagger \\ 0 & 0 & 0 \end{bmatrix}, \]
which is the required expression. We shall refer to $\mathbb{X}^\flat$ as the twisted involution of $\mathbb{X}$, and we indeed have the properties $(\mathbb{X}^\flat)^\flat = \mathbb{X}$ and $(\alpha\mathbb{X} + \beta\mathbb{Y})^\flat = \bar\alpha\, \mathbb{X}^\flat + \bar\beta\, \mathbb{Y}^\flat$. Next we compare
\[ \mathsf{X}\mathsf{P}\mathsf{Y} = \begin{bmatrix} X_{0\bullet} Y_{\bullet0} & X_{0\bullet} Y_{\bullet\bullet} \\ X_{\bullet\bullet} Y_{\bullet0} & X_{\bullet\bullet} Y_{\bullet\bullet} \end{bmatrix} \]
with
\[ \mathbb{X}\mathbb{Y} = \begin{bmatrix} 0 & X_{0\bullet} & X_{00} \\ 0 & X_{\bullet\bullet} & X_{\bullet0} \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & Y_{0\bullet} & Y_{00} \\ 0 & Y_{\bullet\bullet} & Y_{\bullet0} \\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & X_{0\bullet} Y_{\bullet\bullet} & X_{0\bullet} Y_{\bullet0} \\ 0 & X_{\bullet\bullet} Y_{\bullet\bullet} & X_{\bullet\bullet} Y_{\bullet0} \\ 0 & 0 & 0 \end{bmatrix}. \tag{13.41} \]
The last relation we leave as an exercise.
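The $n = 1$ case of the lemma can be checked numerically with complex scalars standing in for the algebra elements. A pure-Python sketch (the coefficient values are arbitrary):

```python
# Check of Lemma 13.9.1 for n = 1: the ordinary product of the embedded
# (1+1+1)x(1+1+1) matrices reproduces the Ito-correction coefficients
# X_{a1} Y_{1b}, and J X^dagger J reproduces the adjoint coefficients.
def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def embed(c):
    """Coefficients {(a, b): X_ab} -> Belavkin-Holevo matrix (13.40), n = 1."""
    return [[0, c[(0, 1)], c[(0, 0)]],
            [0, c[(1, 1)], c[(1, 0)]],
            [0, 0, 0]]

X = {(0, 0): 1 + 2j, (0, 1): -1j, (1, 0): 0.5 + 0j, (1, 1): 2 - 1j}
Y = {(0, 0): 3 + 0j, (0, 1): 1 + 1j, (1, 0): -2j, (1, 1): 0.25 + 0j}

corr = {(a, b): X[(a, 1)] * Y[(1, b)] for a in (0, 1) for b in (0, 1)}
assert mm(embed(X), embed(Y)) == embed(corr)      # XY <-> Ito correction XPY

J = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
dag = [[embed(X)[j][i].conjugate() for j in range(3)] for i in range(3)]
adj = {(a, b): X[(b, a)].conjugate() for a in (0, 1) for b in (0, 1)}
assert mm(J, mm(dag, J)) == embed(adj)            # twisted involution
```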

It is convenient to introduce the following vector notations on $\mathcal{A}^{1+n+1}$:
\[ \phi = \begin{bmatrix} \phi_{-1} \\ \phi_1 \\ \vdots \\ \phi_n \\ \phi_0 \end{bmatrix}, \qquad \phi^\flat = \phi^\dagger\, \mathbb{J} = \big[ \phi_0^\dagger \,\big|\, \phi_1^\dagger, \dots, \phi_n^\dagger \,\big|\, \phi_{-1}^\dagger \big]. \]

13.9.2 Quantum Stochastic Integrals

Take $\{X_{\alpha\beta}(t) : t \ge 0\}$ to be a family of adapted quantum stochastic processes, with quantum stochastic integral $X_t = \int_0^t X_{\alpha\beta}(s)\, dA^{\alpha\beta}_s$, where the differentials are understood in the Itô sense. The coefficients $\{X_{\alpha\beta}(t)\}$ may be assembled into a matrix $\mathsf{X}_t$, as in the preceding, which we call the Itô matrix for the process, and also into a matrix $\mathbb{X}_t$, which we shall refer to as the Belavkin–Holevo matrix for the process.

The Itô matrix for a product $X_t Y_t$ of quantum Itô integrals will then have entries $X_{\alpha\beta}\, Y + X\, Y_{\alpha\beta} + X_{\alpha k} Y_{k\beta}$ and is therefore given by $\mathsf{X}\, Y + X\, \mathsf{Y} + \mathsf{X}\mathsf{P}\mathsf{Y}$. We now see the importance of Lemma 13.9.1: the Itô correction is described by the Itô matrix $\mathsf{X}\mathsf{P}\mathsf{Y}$, but this is equivalent to just the ordinary product $\mathbb{X}\mathbb{Y}$ of the Belavkin–Holevo matrices. In particular, the Belavkin–Holevo matrix for the product $X_t Y_t$ can be written as
\[ \mathbb{X}\, Y + X\, \mathbb{Y} + \mathbb{X}\mathbb{Y} = (X\, \mathbb{I} + \mathbb{X})(Y\, \mathbb{I} + \mathbb{Y}) - XY\, \mathbb{I}. \tag{13.42} \]

(13.42)

We recover the fluxion differentials from either the It¯o or Belavkin–Holevo matrices via the formulas dX = a† Xa = a% Xa dt with ⎡ ⎢ ⎢ a=⎢ ⎣

a0 a1 .. . an

⎤ ⎥ ⎥ ⎥, ⎦



0 a1 .. .

⎢ ⎢ ⎢ a=⎢ ⎢ ⎣ an a0

⎤ ⎥ ⎥ ⎥ ⎥. ⎥ ⎦

If we wish to use the Itô differential notation, then we must use the slightly more cumbersome matrix formula
\[ dX = \mathrm{tr}\big( \mathbb{X}\, d\widetilde{\mathbb{A}} \big), \]
where (with $\top$ denoting the usual transpose for arrays)
\[ d\widetilde{\mathbb{A}} \triangleq \begin{bmatrix} 0 & 0 & 0 \\ (dA^{0\bullet})^\top & (dA^{\bullet\bullet})^\top & 0 \\ dA^{00} & (dA^{\bullet0})^\top & 0 \end{bmatrix}. \]
Note here that the matrix $\mathbb{X}\, d\widetilde{\mathbb{A}}$ then takes the form
\[ \begin{bmatrix} \# & \# & 0 \\ \# & \# & 0 \\ 0 & 0 & 0 \end{bmatrix}, \]
where $\#$ indicates the nonzero entries, and it is easy to see that $\mathrm{tr}\big( \mathbb{X}\, d\widetilde{\mathbb{A}} \big)$ takes the form
\[ X_{0j}\, dA^{0j} + X_{00}\, dA^{00} + X_{ij}\, dA^{ij} + X_{i0}\, dA^{i0}. \]
Putting these observations together, we get the following lemma.

Lemma 13.9.2 Let $X_t$ and $Y_t$ be quantum stochastic integrals; then the quantum Itô product rule may be written as²
\[ \frac{d}{dt}(X_t Y_t) = a^\flat(t)\, \big[ (X_t\, \mathbb{I} + \mathbb{X}_t)(Y_t\, \mathbb{I} + \mathbb{Y}_t) - (X_t Y_t)\, \mathbb{I} \big]\, a(t). \tag{13.43} \]
The multiple version of the quantum Itô product rule is
\[ d(X_t Y_t \cdots Z_t) = (X_t + dX_t)(Y_t + dY_t) \cdots (Z_t + dZ_t) - (X_t Y_t \cdots Z_t), \]
which follows from the product rule by basic induction. Putting the multiple product rule into Itô matrix notation involves some rather unwieldy expressions. In contrast, this is handled very efficiently in terms of Belavkin–Holevo matrices, as the lemma generalizes immediately.

Corollary 13.9.3 Let $X_t, Y_t, \dots, Z_t$ be quantum stochastic integrals; then the multiple quantum Itô product rule may be written as³
\[ \frac{d}{dt}(X_t Y_t \cdots Z_t) = a^\flat(t)\, \big[ (X_t\, \mathbb{I} + \mathbb{X}_t)(Y_t\, \mathbb{I} + \mathbb{Y}_t) \cdots (Z_t\, \mathbb{I} + \mathbb{Z}_t) - (X_t Y_t \cdots Z_t)\, \mathbb{I} \big]\, a(t). \]
Another straightforward corollary of the lemma is the differential rule for functions of quantum integral processes.

² Alternatively, $d(X_t Y_t) = \mathrm{tr}\big( [(X_t\, \mathbb{I} + \mathbb{X}_t)(Y_t\, \mathbb{I} + \mathbb{Y}_t) - (X_t Y_t)\, \mathbb{I}]\, d\widetilde{\mathbb{A}}_t \big)$.
³ Or as $d(X_t Y_t \cdots Z_t) = \mathrm{tr}\big( [(X_t\, \mathbb{I} + \mathbb{X}_t)(Y_t\, \mathbb{I} + \mathbb{Y}_t) \cdots (Z_t\, \mathbb{I} + \mathbb{Z}_t) - (X_t Y_t \cdots Z_t)\, \mathbb{I}]\, d\widetilde{\mathbb{A}}_t \big)$.

Corollary 13.9.4 Let $X_t$ be a quantum stochastic integral process and let $f$ be analytic; then the process $f(X_t)$ has differential
\[ \frac{d}{dt} f(X_t) = a^\flat(t)\, \big[ f(X_t\, \mathbb{I} + \mathbb{X}_t) - f(X_t)\, \mathbb{I} \big]\, a(t). \tag{13.44} \]
To appreciate just how compact the expression (13.44) is, we give the explicit form of the differential when $f(x) = \sum_n f_n x^n$:
\[ df(X_t) = (f_{0\bullet})_j\, dA^{0j} + f_{00}\, dA^{00} + (f_{\bullet\bullet})_{ij}\, dA^{ij} + (f_{\bullet0})_i\, dA^{i0}, \]
where
\[ f_{0\bullet} = \sum_n f_n \sum_{\substack{p+q=n-1 \\ p,q \ge 0}} X^p\, X_{0\bullet}\, (X I_d + X_{\bullet\bullet})^q, \]
\[ f_{00} = \sum_n f_n \sum_{\substack{p+q=n-1 \\ p,q \ge 0}} X^p\, X_{00}\, X^q + \sum_n f_n \sum_{\substack{p+q+r=n-2 \\ p,q,r \ge 0}} X^p\, X_{0\bullet}\, (X I_d + X_{\bullet\bullet})^q\, X_{\bullet0}\, X^r, \]
\[ f_{\bullet\bullet} = f(X I_d + X_{\bullet\bullet}) - f(X I_d), \]
\[ f_{\bullet0} = \sum_n f_n \sum_{\substack{p+q=n-1 \\ p,q \ge 0}} (X I_d + X_{\bullet\bullet})^p\, X_{\bullet0}\, X^q. \]
This leads us to the second key observation about the Belavkin–Holevo matrix notation: the differential rule for quantum processes follows immediately from the Poisson–Itô differential rule.

The expected value of $X_t = \int_0^t X_{\alpha\beta}(s)\, dA^{\alpha\beta}_s$ in the Fock vacuum state is given by $\mathbb{E}_0[X_t] = \int_0^t X_{00}(s)\, ds$, as each of the $A^{\alpha\beta}$ are martingales for this state, with the exception of time. Let us denote the 00-coordinate map for Belavkin–Holevo matrices by $\rho_0$: that is, in terms of our previous notation, $\rho_0(\mathbb{X}) = X_{00}$. We need to be able to pull out the top-right component of the matrix, and the mapping $\rho_0$ that achieves this is $\rho_0(\mathbb{X}) \equiv \epsilon^\flat\, \mathbb{X}\, \epsilon$, where we introduce the vector
\[ \epsilon \triangleq \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}. \]


13.9.3 Quantum Itô Algebras

It is of interest to consider abstract spaces of Itô algebras. Let us think of $\mathbb{X}$ as an "infinitesimal generator" of a quantum stochastic process $X_t = \Phi_t(\mathbb{X})$. We then wish to consider the algebraic properties that would be desired for the set $\mathfrak{a}$ of these generators, as well as the possible representations for such sets. We begin by detailing some well-known examples.

Wiener–Itô Algebra

We consider the Wiener–Itô SDE $dX_t = v(X_t)\, dt + \sigma(X_t)\, dW_t$, where $W_t$ is a Wiener process. In this case, the $d = 1$ representation can be used. The Wiener process may be described as $A^{10}_t + A^{01}_t$, and we have the Belavkin–Holevo matrix $\mathbb{X} = \sigma\mathbb{W} + v\mathbb{T}$, where we introduce the matrices
\[ \mathbb{W} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \qquad \mathbb{T} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
The vector space $\mathfrak{w}$ spanned by $\{\mathbb{W}, \mathbb{T}\}$ is closed under matrix multiplication and has the product table
\[ \begin{array}{c|cc} \times & \mathbb{W} & \mathbb{T} \\ \hline \mathbb{W} & \mathbb{T} & 0 \\ \mathbb{T} & 0 & 0 \end{array} \]
making $\mathfrak{w}$ a matrix algebra. We readily see that, for $f$ analytic,
\[ f(X\mathbb{I} + \sigma\mathbb{W} + v\mathbb{T}) = f(X)\, \mathbb{I} + \sigma f'(X)\, \mathbb{W} + \Big( v f'(X) + \frac{1}{2}\sigma^2 f''(X) \Big)\, \mathbb{T}. \]
From this we deduce the Wiener–Itô differential formula
\[ df(X_t) = \sigma f'(X)\, dW_t + \Big( v f'(X) + \frac{1}{2}\sigma^2 f''(X) \Big)\, dt. \]

Poisson–Itô Algebra

Likewise, we consider the Poisson–Itô SDE $dX_t = v(X_t)\, dt + \sigma(X_t)\, dN_t$, where $N_t$ is a Poisson process, here described as $A^{11}_t + A^{10}_t + A^{01}_t + A^{00}_t$. This time, we work with the Belavkin–Holevo matrix $\mathbb{X} = \sigma\mathbb{N} + v\mathbb{T}$, where we introduce the new matrix
\[ \mathbb{N} = \begin{bmatrix} 0 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}. \]


The matrix algebra $\mathfrak{n}$ generated by $\{\mathbb{N}, \mathbb{T}\}$ again closes, and we have the product table
\[ \begin{array}{c|cc} \times & \mathbb{N} & \mathbb{T} \\ \hline \mathbb{N} & \mathbb{N} & 0 \\ \mathbb{T} & 0 & 0 \end{array} \]
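Both product tables, and the Wiener functional-calculus formula quoted above, can be verified directly for the $3 \times 3$ canonical matrices. A pure-Python sketch, assuming nothing beyond the matrix definitions themselves:

```python
# Check of the Wiener-Ito and Poisson-Ito product tables, plus the Wiener
# functional-calculus formula for the polynomial f(x) = x^3 (f' = 3x^2,
# f'' = 6x).
def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

W = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
T = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
N = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
Z = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

assert mm(W, W) == T and mm(W, T) == Z and mm(T, W) == Z and mm(T, T) == Z
assert mm(N, N) == N and mm(N, T) == Z and mm(T, N) == Z

# f(x I + s W + v T) = f(x) I + s f'(x) W + (v f'(x) + s^2 f''(x)/2) T
x, s, v = 2.0, 0.5, 0.3
M = [[(x if i == j else 0.0) + s * W[i][j] + v * T[i][j]
      for j in range(3)] for i in range(3)]
cube = mm(M, mm(M, M))
expect = [[(x**3 if i == j else 0.0) + s * 3 * x**2 * W[i][j]
           + (v * 3 * x**2 + 0.5 * s**2 * 6 * x) * T[i][j]
           for j in range(3)] for i in range(3)]
assert all(abs(cube[i][j] - expect[i][j]) < 1e-12
           for i in range(3) for j in range(3))
```

The nilpotency $\mathbb{T}^2 = 0$, $\mathbb{W}\mathbb{T} = \mathbb{T}\mathbb{W} = 0$ is exactly what truncates the Taylor expansion at second order and produces the familiar $\frac{1}{2}\sigma^2 f''$ Itô term.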

and, for f analytic, we find

\[ f(X\mathbb{I} + \sigma\mathbb{N} + v\mathbb{T}) = f(X)\, \mathbb{I} + \big( f(X + \sigma) - f(X) \big)\, \mathbb{N} + v f'(X)\, \mathbb{T}. \]

This implies the Poisson–Itô differential formula
\[ df(X_t) = \big( f(X + \sigma) - f(X) \big)\, dN_t + v f'(X)\, dt. \]

Hudson–Parthasarathy Algebra

The matrices $\mathbb{W}$ and $\mathbb{N}$ do not commute, however, and by inspection one finds that the algebra generated by $\{\mathbb{T}, \mathbb{W}, \mathbb{N}\}$ is actually a four-dimensional algebra spanned by the elements $\mathbb{T}$, $\mathbb{A}$, $\mathbb{A}^\flat$, and $\mathbb{S}$, where
\[ \mathbb{A} \triangleq \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad \mathbb{A}^\flat \triangleq \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \qquad \mathbb{S} \triangleq \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
In particular, $\mathbb{W} = \mathbb{A} + \mathbb{A}^\flat$ and $\mathbb{N} = \mathbb{S} + \mathbb{A} + \mathbb{A}^\flat + \mathbb{T}$. This algebra is called the Hudson–Parthasarathy algebra and is denoted $\mathfrak{hp}$.

General Theory

Let us denote the space of matrices
\[ \begin{bmatrix} 0 & X_{0\bullet} & X_{00} \\ 0 & X_{\bullet\bullet} & X_{\bullet0} \\ 0 & 0 & 0 \end{bmatrix}, \]
considered as a subset of $\mathcal{A}^{(1+n+1)\times(1+n+1)}$, by $\mathfrak{hp}(n)$. This becomes a nonunital $\flat$-algebra when we specify the twisted involution $\flat$. We may extend it to a unital algebra by adding the identity $\mathbb{I}$. Examples of elements are the following matrices, determined from their Itô coefficients $X_{\alpha\beta}$: for fixed $1 \le i \le d$,

\[ \mathbb{T}:\ X_{\alpha\beta} = \delta_{\alpha0}\delta_{\beta0}; \qquad \mathbb{A}_i:\ X_{\alpha\beta} = \delta_{\alpha0}\delta_{\beta i}; \qquad \mathbb{S}_{ij}:\ X_{\alpha\beta} = \delta_{\alpha i}\delta_{\beta j}, \]

along with $\mathbb{W}_i = \mathbb{A}_i + \mathbb{A}^\flat_i$ and $\mathbb{N}_i = \mathbb{S}_{ii} + \mathbb{A}_i + \mathbb{A}^\flat_i + \mathbb{T}$. These elements are all self-adjoint with respect to the twisted involution. We consider the following matrix $*$-subalgebras of $\mathfrak{hp}(n)$:


t = span {T} wi = span {T, Wi } ni = span {T, Ni }   qdi = span Ai , A%i , T   hpi = span Sii , Ai , A%i , T We shall refer to this matrix representation as the canonical representation of the It¯o algebra hp (n) and its subalgebras. Here t is the Newtonian algebra for a deterministic time variable. wi is the Wiener–It¯o algebra for the ith channel and ni is the Poisson–It¯o algebra for the channel. Next we have the quantum diffusion algebra for the channel qdi , and we write qd (n) for the full algebra of quantum diffusions for all n channels. The latter consists of all matrices of the ⎤ ⎡ 0 X0 X00 form X = ⎣ 0 0 X0 ⎦ and we have the property that qd (n) .qd (n) = 0 0 0 t, that is, XY is a multiple of T whenever X, Y ∈ qd (n). Finally hpi is the Hudson–Parthasarathy algebra for the ith channel, and it is interesting to note that it is the algebra generated by the elements {Ni , Wi , T}. αβ In general, we have an identification dXt = Xαβ (t) dAt ≡ d!t (X) between a quantum stochastic integral and its Belavkin–Holevo matrix process {Xs : s ≥ 0}. We can write out the basic axiomatic requirements for an abstract It¯o algebra a for stochastic integrals with expectation E: αd!t (X) + βd!t (Y) = d!t (αX + βY) d!t (X) .d!t (Y) = d!t (XY) d!t (X)† = d!t (X% ) d {!t (X) !t (Y)} = (!t (X) I + X) (!t (Y) I + Y) − !t (X) !t (Y) I There exists a unique self-adjoint T ∈ a such that TX = XT = 0 for all X ∈ a. • There exists a linear functional ρ on a such that ' t ρ (Xs ) ds. E [!t (X)] =

• • • • •

0 % In particular, a must 2 be a %-algebra. The element T is self-adjoint (T = T ) but also niplotent T = 0 , and this means that we cannot have a representation of a in terms of matrices with the ordinary involution of Hermitean conjugation: hence the necessity of a twisted involution. The functional ρ is said to be faithful on a if the only element, Z, such that ρ (Z) = ρ (XZ) = ρ (ZY) = ρ (XZY) = 0 for all X, Y ∈ a is Z = 0. Under conditions of

290

Quantum Stochastic Calculus

faithfulness, it can be shown that any finitely generated It¯o algebra can be canonically represented.

13.9.4 Evolutions and Dynamical Flows Hudson and Parthasarathy show that the QSDE dU = a† GUa U0 = 1 dt has a unique solution for a given constant Belavkin–Holevo matrix G on coefficients on B (h0 ), the bounded operators on h0 . Necessary and sufficient conditions for unitarity are then given by (I + G) (I + G)% = I = (I + G)% (I + G) , that is, I + G is (twisted) unitary on B (h0 )(1+n+1)×(1+n+1) . The next result shows how we may arrive at the Holevo time-ordered exponential coefficients and the Stratonivich coefficients. Proposition 13.9.5 Let H and E be (twisted) Hermitean matrices, that is, H = H% and E = E% , then either prescription I + G = e−iH , or I + G =

I − 2i E I + 2i E

determines a (twisted) unitary. This is relatively easy to see, as the first expression is evidently unitary while the second is a Cayley transform. Conversely, given a twisted unitary I + G, we refer to H = i ln (I + G) as the generator matrix for the Holevo time-ordered −1  G as the exponential (logarithm may not be unique!) and E = −i I − 2i G Stratonovich generator matrix. We remark that for quantum diffusions, these notions coincide since, if X ∈ qd (n), then X2 ≡ ξX T for some operator ξX , and we have that e−iX =

I − 2i X I+

i 2X

1 = I − iX− ξX T. 2

The expressions generally differ, however, when scattering is involved. Let us explain briefly the origin of the Stratonovich generator. We may define the scalar product of a process Xt , having Belavkin–Holevo matrix Xt , with a second Belavkin–Holevo matrix Yt to be

1 Xt ◦ Yt  Xt I + Xt Yt 2

13.9 The Belavkin–Holevo Representation

291

  and similarly Xt ◦ Yt  Xt Yt I + 12 Yt , then d dYt dXt ◦ Yt + Xt ◦ , (Xt Yt ) = dt dt dt % t where dX dt ◦ Yt = a (t) (Xt ◦ Yt ) a (t), and so on. Formally, we have the Wick ordering rules

a% (t) (Xt ◦ Yt ) a (t) = a% (t) Xt a (t) ◦ Yt , a% (t) (Xt ◦ Yt ) a (t) = Xt ◦ a% (t) Yt a (t) . For the QSDEs dUt dUt = −ia% (t) Ea (t) ◦ Ut and = a% (t) GUt a (t) dt dt to be equivalent, we need the consistency condition G = −iE − 2i EG, and this is precisely the relation introduced in the preceding.

14 Quantum Stochastic Limits

In this chapter, we look at the problem of quantum stochastic processes as approximations to physical models. Here, in contrast to the usual approximations for open systems that restrict their attention to just the reduced model, we are interested in a limit that captures both the system and its environment, with the latter taking a limit form of the Fock space of the Hudson– Parthasarathy quantum stochastic processes. The earliest formulation of this is by Accardi et al. (1989), which was formulated as a weak coupling limit resulting in a quantum diffusive evolution (Accardi et al., 1990). Subsequently, this was extended to include low-density limits that involved the Poissonian processes in the limit. For a detailed account, we refer to the book on the Quantum Stochastic Limit (Accardi et al., 2002). In this chapter, we formulate a general problem leading to a mixed Gaussian–Poissonian limit (Gough, 2005).

14.1 A Quantum Wong Zakai Theorem We will consider limit problems of the following type. First fix an initial space hS describing the Hilbert space of a quantum mechanical systems, and Bose Fock space FR to model the reservoir (the environment of the system). On the joint space hS ⊗ FR , we consider the time-dependent dynamics due to a unitary (λ) family Ut , where t ≥ 0 is time and λ > 0 is a parameter, with the associated Schr¨odinger equation ∂ (λ) (λ) U = −i ϒt (λ) Ut , ∂t t

(14.1)

where the time-dependent Hamiltonian ϒt (λ) is assumed to take the form (λ)

ϒt

− + − = E11 ⊗ a+ t (λ)at (λ) + E10 ⊗ at (λ) + E01 ⊗ at (λ) + E00 ⊗ 1 α − β = Eαβ ⊗ [a+ t (λ)] [at (λ)] .

292

(14.2)

14.1 A Quantum Wong Zakai Theorem

293

Here the operators a± t (λ) are creation (+) and annihilation (−) operators on the Bose Fock space FR . Our main assumption is that the reservoir is in its vacuum state, which we denote by , and that the creation/annihilation operators in the Hamiltonian (14.2) satisfy commutation relations of the form   − (14.3) at (λ) , a+ s (λ) = Cλ (t − s) . We shall then assume that the two-point function will converge to a singular delta-function as λ → 0. For definiteness, we assume that there exists a continuous integrable function C with the properties that ' ∞ ' 0 1 C (t) dt = = C (t) dt 2 −∞ 0 and so that Cλ (t) ≡



t 1 C . λ2 λ2

(14.4)



 t (λ) (λ) as λ → 0, Our interest will be in the limit of Ut = T exp −i 0 ds ϒs and we would like to realize the limit as a quantum stochastic process. For each λ > 0 fixed, we define the collective creator ' ∞ f (t) at (λ) dt A+ ( f , λ)  0

for square-integrable functions f , and similarly A− ( f , λ) = A+ ( f , λ)∗ . Likewise, we define the collective Weyl displacement operator as   D ( f , λ)  exp A+ ( f , λ) − A− ( f , λ) . (14.5) (λ)

We aim to study the limit of matrix elements of Ut between collective exponential vectors of the type D (f , λ) and show that they agree with those of a unitary quantum stochastic process. Let us note the following facts that show this is reasonable. Lemma 14.1.1 Let f1 , f2 ∈ L2 (R+ , dt), then   lim A− ( f1 , λ) , A+ ( f2 , λ) =

λ→0

'



f1 (t)∗ f2 (t) dt.

0

Proof

For finite λ > 0, we have ' ∞ '   − dt1 A ( f1 , λ) , A+ ( f2 , λ) = 0



dt2 f1 (t1 )∗ Cλ (t1 − t2 ) f2 (t2 ) ,

0

and the result follows from the limiting delta-function assumption for the twopoint function.

294

Quantum Stochastic Limits

Corollary 14.1.2 Let f1 , . . . , fn ∈ L2 (R+ , dt) for some integer n > 0, then lim |D ( f1 , λ) . . . D ( fn , λ) = |D (f1 ) . . . D ( fn ) ,

λ→0

where the right-hand side is taken on the Fock space + L2 (R+ , dt) – we denote the vacuum again by . The corollary readily follows from the lemma by virtue of the Gaussian moment generating function formulas. We therefore have a quantum probabilistic limit theorem that says that the ± A ( f , λ) are converging in law to limit fields A∗ ( f ) and A ( f ). The hope then is that limits of the form (λ)

lim φ1 ⊗ D( f1 , λ) | Ut

λ→0

φ2 ⊗ D( f2 , λ)

(14.6)

will exist and take the form φ1 ⊗D( f1 ) | Ut φ2 ⊗D( f2 ) for arbitrary vectors φj ∈ h and fj ∈ L2 (R + , dt) and where Ut is a well-defined quantum stochastic process on h ⊗ + L2 (R+ , dt) . We will also try to establish a similar result for the Heisenberg evolution Jt(λ) (X) = Ut(λ)† (X ⊗ 1R )Ut(λ) , for fixed bounded observables X ∈ B(hS ). (λ)

The formal Dyson series development of Ut involves the multiple time integrals1 ' ∞  (λ) n Ut = (−i) dsn . . . ds1 ϒsn (λ) . . . ϒs1 (λ). (14.7) n (t)

n=0

14.1.1 The Dyson Series Expansion of Ut(λ) Substituting the Dyson series (14.7) into (14.6), we obtain a series expansion: (λ)

φ1 ⊗ D( f1 , λ) | Ut ' ∞  = (−i)n n=0

φ2 ⊗ D( f2 , λ) dsn . . . ds1

n (t)

× φ1 ⊗ D( f1 , λ) | ϒsn (λ) . . . ϒs1 (λ) φ2 ⊗ D( f2 , λ) .

(14.8)

Lemma 14.1.3 We have the identity

φ1 ⊗ D( f1 , λ) | ϒsn (λ) . . . ϒs1 (λ) φ2 ⊗ D( f2 , λ) = φ1 ⊗ | ϒ˜ sn (λ) . . . ϒ˜ s1 (λ) φ2 ⊗ D( f1 , λ) |D( f2 , λ) , (14.9) 1 We recall our notation: for σ ∈ S , we have the simplex σ (t)  {(s , . . . , s ) : t > s n n 1 σ (n) n > · · · > sσ (1) > 0}, and n (t) in (5.1) is the simplex corresponding to the identity permutation.

14.1 A Quantum Wong Zakai Theorem

295

α − β ϒ˜ s (λ) = E˜ αβ (t, λ) ⊗ [a+ t (λ)] [at (λ)] ,

(14.10)

where ϒ˜ s (λ) is

where E˜ 00 (t, λ) = E00 + E10 f1 (t, λ) + E01 f2∗ (t, λ) + E11 f1 (t, λ)f2∗ (t, λ); E˜ 10 (t, λ) = E10 + f2∗ (t, λ)E11 ; E˜ 01 (t, λ) = E01 + f1 (t, λ)E11 ; E˜ 11 (t, λ) = E11 ; with

'



fj (t, λ) =

(14.11)

fj (u)Cλ (t − u) du.

(14.12)

0

Proof On the left-hand side, we have the exponential vector
\[
D(f_2,\lambda)\,\Omega = e^{-\frac12\iint f_2^*(s)\,C_\lambda(s-u)\,f_2(u)\,ds\,du}\;e^{A^+(f_2,\lambda)}\,\Omega.
\]
If we try to move the operator $e^{A^+(f_2,\lambda)}$ to the far left, commuting past the operators $\Upsilon_{s_n}(\lambda)\cdots\Upsilon_{s_1}(\lambda)$, we find that we effectively implement the replacement of the annihilators as
\[
a_t^-(\lambda)\;\to\;a_t^-(\lambda)+f_2^*(t,\lambda), \qquad (14.13)
\]
with the creators being unchanged. Similarly, doing the same procedure with $D(f_1,\lambda)$ we implement the change $a_t^+(\lambda)\to a_t^+(\lambda)+f_1(t,\lambda)$, with the annihilators being unaffected. The overall result is to replace each Hamiltonian term $\Upsilon_s^{(\lambda)}$ with $\tilde\Upsilon_s(\lambda)=E_{\alpha\beta}\otimes[a_s^+(\lambda)+f_1(s,\lambda)]^{\alpha}[a_s^-(\lambda)+f_2^*(s,\lambda)]^{\beta}$, which is then rearranged to give (14.10).

Up to the factor $(-i)^n\langle D(f_1,\lambda)|D(f_2,\lambda)\rangle$, we see that the $n$-th term in the Dyson series expansion (14.8) of the matrix element is
\[
\int_{\Delta_n(t)} ds_n\cdots ds_1\,\big\langle \phi_1\,\big|\,\tilde E_{\alpha_n\beta_n}(s_n,\lambda)\cdots\tilde E_{\alpha_1\beta_1}(s_1,\lambda)\,\phi_2\big\rangle
\times\big\langle \Omega\,\big|\,[a_{s_n}^+(\lambda)]^{\alpha_n}[a_{s_n}^-(\lambda)]^{\beta_n}\cdots[a_{s_1}^+(\lambda)]^{\alpha_1}[a_{s_1}^-(\lambda)]^{\beta_1}\,\Omega\big\rangle. \qquad (14.14)
\]

The vacuum expectation in (14.14) can be computed using (9.93). Let us recall the identity
\[
\sum_{\alpha,\beta\in\{0,1\}^n}\big\langle \Omega\,\big|\,[A^+(f_n)]^{\alpha(n)}[A^-(g_n)]^{\beta(n)}\cdots[A^+(f_1)]^{\alpha(1)}[A^-(g_1)]^{\beta(1)}\,\Omega\big\rangle
= \sum_{\pi\in\mathrm{Part}(n)}\ \prod_{\{i(1),\dots,i(k)\}\in\pi}\langle g_{i(k)}|f_{i(k-1)}\rangle\cdots\langle g_{i(3)}|f_{i(2)}\rangle\,\langle g_{i(2)}|f_{i(1)}\rangle \qquad (14.15)
\]


Quantum Stochastic Limits

for $f_1,g_1,\dots,f_n,g_n$ in the one-particle space, where we take the various sets (parts of the partition) $\{i(1),\dots,i(k)\}\in\pi$ to be ordered so that $i(1)<i(2)<\cdots<i(k)$, and if the set is a singleton it is given the factor of unity.

We again resort to a diagrammatic convention in order to describe the Dyson series expansion into sums of integrals of products of two-point functions. There is a one-to-one correspondence between the diagrams appearing in the $n$-th term of the Dyson series and the set of partitions of the $n$ vertices. The diagram pictured in Figure 9.3 would contribute a weight of
\[
(-i)^{17}\int_{\Delta_{17}(t)} \tilde E_{01}(t_{17},\lambda)\,\tilde E_{00}(t_{16},\lambda)\cdots\tilde E_{10}(t_1,\lambda)\;C_\lambda(t_{17}-t_{11})\cdots C_\lambda(t_2-t_1)
\]
to the series.
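The correspondence above is with the set partitions of $\{1,\dots,n\}$, exactly the index set of the sum in (14.15). As a quick combinatorial check (our illustration, not from the text), the following sketch enumerates these partitions recursively and confirms that their number is the Bell number $B_n$.

```python
from math import comb

def set_partitions(items):
    # generate all set partitions of a list of distinct labels
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for i in range(len(partition)):
            # place `first` into an existing part ...
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new singleton part
        yield [[first]] + partition

def bell(n):
    # Bell numbers via the recurrence B_{m+1} = sum_k C(m, k) B_k
    B = [1]
    for m in range(n):
        B.append(sum(comb(m, k) * B[k] for k in range(m + 1)))
    return B[n]

for n in range(1, 7):
    assert len(list(set_partitions(list(range(1, n + 1))))) == bell(n)
```

This is the count quoted below for the number of diagrams contributing at the $n$-th level of the Dyson series.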

14.1.2 Principal Terms (in the Dyson Series)

A standard technique in perturbative quantum field theory and quantum statistical mechanics is to develop a series expansion and argue on physical grounds that certain “principal terms” will exceed the other terms in order of magnitude (Abrikosov et al., 1963). Often it is possible to resum the principal terms to obtain a useful representation of the dominant behavior. Mathematically, the problem comes down to showing that the remaining terms are negligible in the limiting physical regime being considered.

Let us consider a typical diagram. We shall assume that within the diagram there are $n_1$ singleton vertices, $n_2$ contraction pairs, $n_3$ contraction triples, and so on. This yields a set of occupation numbers $\mathbf n=(n_j)$, and a diagram has a total of $E(\mathbf n)=\sum_j j\,n_j$ vertices, which are partitioned into $N(\mathbf n)=\sum_j n_j$ connected subdiagrams. We see that the total number of diagrams contributing to the $n$-th level of the Dyson series will be given by the Bell number $B_n$.

The resulting terms can be split into two types: type I will survive the $\lambda\to0$ limit; type II will not. They are distinguished as follows:

Type I: Terms involving contractions of time-consecutive annihilator/creator pairs only. (That is, under the time-ordered integral in (14.14), an annihilator $a^-_{s_{j+1}}(\lambda)$ must be contracted with the creator $a^+_{s_j}(\lambda)$.)

Type II: All other cases.

The terminology used here is due to Accardi et al. (1990).

Let $n$ be a positive integer and $m\in\{0,\dots,n-1\}$. Let $\{(p_j,q_j)\}_{j=1}^m$ be contraction pairs over indices $\{1,\dots,n\}$ such that if $P=\{p_1,\dots,p_m\}$ and


$Q=\{q_1,\dots,q_m\}$, then $P$ and $Q$ are both nondegenerate subsets of size $m$, and we require that $p_j<q_j$ for each $j$ and that $Q$ be ordered so that $q_1<\cdots<q_m$. We understand that $(p_j,q_j)_{j=1}^m$ is type I if $q_j=p_j+1$ for each $j$, and type II otherwise. The following result is an extension of Lemma 4.2 in Accardi et al. (1990).

Lemma 14.1.4 Let $(p_j,q_j)_{j=1}^m$ be a set of $m$ contraction pairs over the set of indices $\{1,\dots,n\}$; then
\[
\int_{\Delta_n(t)} ds_1\cdots ds_n\,\prod_{j=1}^m C_\lambda\big(s(p_j)-s(q_j)\big) \;\le\; \frac{1}{2^m}\,\frac{t^{n-m}}{(n-m)!}. \qquad (14.16)
\]

Moreover, as $\lambda\to0$,
\[
\int_{\Delta_n(t)} ds_1\cdots ds_n\,\prod_{j=1}^m C_\lambda\big(s(p_j)-s(q_j)\big) \;\to\;
\begin{cases}
\dfrac{1}{2^m}\,\dfrac{t^{n-m}}{(n-m)!}, & \text{type I};\\[6pt]
0, & \text{type II}.
\end{cases} \qquad (14.17)
\]
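The dichotomy in (14.17) can be checked numerically in the lowest nontrivial cases. In the sketch below (ours; the specific correlation $C_\lambda(u)=\lambda^{-2}C(u/\lambda^2)$ with $C(v)=\tfrac12 e^{-|v|}$, normalized so $\int_{\mathbb R}C=1$, is an assumed example), the time-consecutive pair $(1,2)$ over $\Delta_2(t)$ tends to $t/2$, while the type II pair $(1,3)$ over $\Delta_3(t)$ tends to zero.

```python
import math

def F(x):
    # F(x) = \int_0^x C(v) dv for C(v) = e^{-|v|}/2
    return 0.5 * (1.0 - math.exp(-x)) if x >= 0 else -0.5 * (1.0 - math.exp(x))

def type_I(lam, t, N=4000):
    # \int_{Delta_2(t)} C_lam(s1 - s2) ds2 ds1 = \int_0^t F(s1/lam^2) ds1, midpoint rule
    h = t / N
    return h * sum(F((i + 0.5) * h / lam**2) for i in range(N))

def type_II(lam, t, N=400):
    # \int_{Delta_3(t)} C_lam(s1 - s3) ds3 ds2 ds1; the inner s3-integral
    # evaluates to F(s1/lam^2) - F((s1 - s2)/lam^2)
    h = t / N
    total = 0.0
    for i in range(N):
        s1 = (i + 0.5) * h
        for j in range(i):
            s2 = (j + 0.5) * h
            total += (F(s1 / lam**2) - F((s1 - s2) / lam**2)) * h * h
    return total

assert abs(type_I(0.05, 1.0) - 0.5) < 0.01   # type I -> t/2 = (1/2) t^{n-m}/(n-m)!
assert type_II(0.05, 1.0) < 0.02             # type II -> 0
```

The type II integral is $O(\lambda^2)$ here: the contraction window of width $\lambda^2$ must reach past an intermediate time variable, which costs a vanishing volume factor.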

Proof Let $q=q_1$ and set $t(q)=[s(p)-s(q)]/\lambda^2$; then
\[
\int_{\Delta_n(t)} ds_1\cdots ds_n\,\prod_{j=1}^m \big\langle a^-_{s(p_j)}(\lambda)\,a^+_{s(q_j)}(\lambda)\big\rangle
= \int_0^t ds(1)\cdots\int_0^{s(q-2)} ds(q-1)\int_{[s(p)-s(q-1)]/\lambda^2}^{s(p)/\lambda^2} dt(q)\int_0^{s(p)-\lambda^2 t(q)} ds(q+1)\cdots\int_0^{s(n-1)} ds(n)\;C\big(t(q)\big)\,\prod_{j=2}^m C_\lambda\big(s(p_j)-s(q_j)\big).
\]
However, we have that $s(p)-\lambda^2 t(q) < s(q-1)$, and so we obtain the bound
\[
\int_0^t ds(1)\cdots\int_0^{s(q-2)} ds(q-1)\int_{-\infty}^{\infty} dt(q)\int_0^{s(p)-\lambda^2 t(q)} ds(q+1)\cdots\int_0^{s(n-1)} ds(n)\;C\big(t(q)\big)\,\prod_{j=2}^m C_\lambda\big(s(p_j)-s(q_j)\big).
\]

And so, working inductively, we obtain (14.16).

Suppose now that the pairs are of type I; then $p=q-1$, and so the lower limit of the $t(q)$-integral is zero. Consequently, we encounter the sequence of integrals
\[
\int_0^{s(q-2)} ds(q-1)\int_0^{s(q-1)/\lambda^2} dt(q)\int_0^{s(q-1)-\lambda^2 t(q)} ds(q+1)\;C\big(t(q)\big)\;\cdots.
\]
This occurs for each $q$-variable, and so we recognize the limit as stated in (14.17) for type I terms.


For type II pairs, on the other hand, let $j=\min\{k: p_k<q_k-1\}$; setting $q=q_j$, we encounter the sequence of integrals
\[
\int_0^{s(q-2)} ds(q-1)\int_{[s(p)-s(q-1)]/\lambda^2}^{s(p)/\lambda^2} dt(q)\int_0^{s(q-1)} ds(q+1)\;C\big(t(q)\big)\;\cdots.
\]
But now, with respect to the variables $s(1),\dots,s(p),\dots,s(q-1)$, we have that, since $s(p)\neq s(q-1)$ almost everywhere, the lower limit $[s(p)-s(q-1)]/\lambda^2$ of the $t(q)$-integral almost always diverges, and so, as $t\mapsto C(t)$ is continuous and integrable, we have the dominated convergence of the whole term to zero.

Clearly type II terms do not contribute to the $n$-th term in the series expansion in the limit. However, we must establish a uniform bound for all these terms when the sum over all terms is considered. We do this in the next section. Before proceeding, let us remark that expression (14.14) is bounded by
\[
C_{\alpha_n\beta_n}\cdots C_{\alpha_1\beta_1}\,\|\phi_1\|\,\|\phi_2\|
\int_{\Delta_n(t)} ds_n\cdots ds_1\,\Big|\big\langle \Omega\,\big|\,[a_{s_n}^+(\lambda)]^{\alpha_n}[a_{s_n}^-(\lambda)]^{\beta_n}\cdots[a_{s_1}^+(\lambda)]^{\alpha_1}[a_{s_1}^-(\lambda)]^{\beta_1}\,\Omega\big\rangle\Big|, \qquad (14.18)
\]
where
\begin{align*}
C_{11}&=\|E_{11}\|; & C_{10}&=\|E_{10}\|+\|E_{11}\|c_2;\\
C_{01}&=\|E_{01}\|+\|E_{11}\|c_1; & C_{00}&=\|E_{00}\|+\|E_{10}\|c_1+\|E_{01}\|c_2+\|E_{11}\|c_1c_2, \qquad (14.19)
\end{align*}

and $c_j=\int_{-\infty}^{\infty} du\,\big|C(u)\,f_j(u)\big|$. We will make the assumption that $\tfrac12 C_{11}<1$ and that $C=\max\{C_{11},C_{10},C_{01},C_{00}\}<\infty$.

We need to do some preliminary estimation. We employ the occupation numbers introduced in Section 1.6.1. The number of times that we will have $(\alpha,\beta)=(1,1)$ in a particular term will be $\sum_{j\ge2}(j-2)n_j$ (that is, singletons and pairs have none, triples have one, quadruples have two, etc.), and this equals $E(\mathbf n)-2N(\mathbf n)+n_1$. Therefore, we shall have
\[
C_{\alpha_n\beta_n}\cdots C_{\alpha_1\beta_1} \;\le\; C_{11}^{\,E(\mathbf n)-2N(\mathbf n)+n_1}\,C^{\,2N(\mathbf n)-n_1}. \qquad (14.20)
\]
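The counting step behind (14.20) can be checked mechanically. The sketch below (ours, mirroring the notation of the text) computes occupation numbers from a partition and verifies $\sum_{j\ge2}(j-2)n_j = E(\mathbf n)-2N(\mathbf n)+n_1$ on several occupation sequences.

```python
from collections import Counter

def occupation_numbers(partition):
    # n_j = number of parts (j-fold contractions) of size j
    return Counter(len(part) for part in partition)

def E(n):
    # total number of vertices: sum_j j n_j
    return sum(j * nj for j, nj in n.items())

def N(n):
    # number of parts, i.e. connected subdiagrams: sum_j n_j
    return sum(n.values())

def count_11(n):
    # number of (alpha, beta) = (1, 1) factors: singletons and pairs give
    # none, triples one, quadruples two, and so on
    return sum((j - 2) * nj for j, nj in n.items() if j >= 2)

# two singletons, one pair, one triple: E = 7 vertices, N = 4 parts
pi = [[1], [4], [2, 6], [3, 5, 7]]
n = occupation_numbers(pi)
assert dict(n) == {1: 2, 2: 1, 3: 1}
assert E(n) == 7 and N(n) == 4

for n in [{1: 3}, {2: 2}, {1: 1, 2: 1, 3: 2}, {1: 2, 4: 1}, {2: 1, 5: 2}]:
    assert count_11(n) == E(n) - 2 * N(n) + n.get(1, 0)
```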

14.1.3 Generalized Pulé Inequalities

We shall denote by $\mathrm{Part}(\mathbf n)$ the set of all partitions having the same occupation number sequence $\mathbf n$. Given a partition $\pi\in\mathrm{Part}(\mathbf n)$, we use the convention


$q(j,k,r)$ to label the $r$-th element of the $k$-th $j$-tuple. A simple example of a partition in $\mathrm{Part}(\mathbf n)$ is given by selecting, in order from $\{1,2,\dots,E(\mathbf n)\}$, first of all $n_1$ singletons, then $n_2$ pairs, then $n_3$ triples, and so on. The labeling for this particular partition will be denoted as $\bar q(\cdot,\cdot,\cdot)$, and explicitly we have
\[
\bar q(j,k,r) = \sum_{l<j} l\,n_l + (k-1)j + r. \qquad (14.21)
\]
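As a consistency check of (14.21) (our sketch, not from the text), the canonical labeling $\bar q(j,k,r)$ should run bijectively through $\{1,\dots,E(\mathbf n)\}$ when the tuples are listed in order of increasing size.

```python
def qbar(j, k, r, n):
    # index of the r-th element of the k-th j-tuple in the canonical
    # partition: n_1 singletons first, then n_2 pairs, then n_3 triples, ...
    return sum(l * n.get(l, 0) for l in range(1, j)) + (k - 1) * j + r

n = {1: 2, 2: 1, 3: 2}   # E(n) = 2*1 + 1*2 + 2*3 = 10
labels = [qbar(j, k, r, n)
          for j in sorted(n)
          for k in range(1, n[j] + 1)
          for r in range(1, j + 1)]
assert labels == list(range(1, 11))   # bijective onto {1, ..., E(n)}
```

The offset $\sum_{l<j} l\,n_l$ skips all smaller tuples, and $(k-1)j$ skips the preceding $j$-tuples, so the labels exhaust $\{1,\dots,E(\mathbf n)\}$ without gaps.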