100 Years of Fundamental Theoretical Physics in the Palm of Your Hand: Integrated Technical Treatment [1 ed.] 3030510808, 9783030510800

https://www.springer.com/gp/book/9783030510800 This book aims to integrate, in a pedagogical and technical manner, wit


English Pages 574 [520] Year 2020


Table of contents :
Preface
Contents
About the Author
Abbreviations, Notations and Data
Units and Data
Masses
References
1 Introduction—An Overview and a Road Map
References
2 Sorting Out Selective Measurements: QM Re-invents Complex Numbers, Matrices, Operators and Inner-Product Spaces
References
3 Interference and Measurements
References
4 Transformation Theory and Emergence of ħ in the Formalism
References
5 Quantum Dynamics, Construction of Hamiltonians and Decay of Quantum Systems
References
6 Stability of the H-Atom in Configuration Space
Reference
7 Do Atomic Electrons Fall to the Center of Multi-electron Atoms?
References
8 How Quantum Mechanics Forces Matter in Bulk to Occupy Such a Large Volume and Prevents It from Collapsing Around Us
References
9 Schrödinger's Cat and Quantum Decoherence
References
10 Quantum Teleportation
References
11 Lorentz Frames and Minkowski Spacetime
References
12 The Celebrated Lorentz Group: Underlying Transformations Derived
Reference
13 Physics in Minkowski Spacetime: Applications
References
14 First Unified Field Theory: Maxwell's Equations and Lorentz Covariance
15 QM Meets Relativity and Birth of QFT: Fields and Particles
References
16 The Five Types of Fields You Meet in Quantum Field Theory
References
17 Gauge Fields
References
18 Particles & Symmetries
References
19 Strange, Charm, Bottom and the Top Quarks: How They Came About?
References
20 C, P, CP and T Violations in Weak Interactions
References
21 Lagrangians: Varying Action Integrals in QFT
Reference
22 Quantum Dynamics: The Functional Differential Formalism of QFT
References
23 Quantization of Gauge Theories and Constraints: Functional Differential Formalism
References
24 Functional (Path) Integral Formalism as a Functional Fourier Transform of the Functional Differential One
References
25 Transition Amplitudes and the Meaning of Virtual Particles
References
26 Radiative Corrections
References
27 Anomalous Magnetic Moment of the Electron and the Lamb Shift
References
28 How the Fine-Structure Changes from ≃ 1/137 to ≃ 1/128 at High Energies
References
29 Bell's Test, Entanglement and the Strict Predictions of QFT
References
30 What Is Renormalization Theory and Why Is It a Criterion in Developing and Describing the Fundamental Interactions in Nature?
References
31 Renormalization Group of an Abelian Gauge Theory: QED
References
32 Renormalization Group of a Non-Abelian Gauge Theory: QCD and Asymptotic Freedom
References
33 Birth of the Electroweak Theory
References
34 The Higgs Field and the God Particle
References
35 The Electroweak Model and the Significant Role of the Higgs Field
References
36 Spin and Statistics and the CPT Theorem
References
37 Grand Unified Field Theories and the Role of Inflation of the Universe
References
38 Supersymmetry and Superspace
References
39 The Very Nature of a SUSY Theory: Super-Poincaré Algebra, Energy Spectrum and Supermultiplets
Reference
40 Superfields
References
41 SUSY Field Theories: Maxwellian, Yang–Mills, Spin0-Spin1/2 Interactions
References
42 SUSY and the Standard Model: Couplings Unification
References
43 String Theory
References
44 Bosonic Strings
References
45 Compactification, D Branes and Mass Generation
References
46 Superstrings
References
47 Vertices, Interactions and Scattering
References
48 String Theory Re-Invents the Yang-Mills Field Theory
Reference
49 Gravity and Spacetime
References
50 Spacetime Curvature
51 Field Equations of GR
References
52 Transfer of Gravitational Energy, Polarization Aspects and Gravitational Waves
References
53 Review of Planetary Motion in Newtonian Theory of Gravitation in Brief
54 Schwarzschild Metric: An Elementary Treatment
References
55 Conservation Laws and GR Treatment of the Schwarzschild Metric
56 Perihelion Precession of Planets Orbits and the Schwinger-Binet Equation
References
57 Light Deflection in GR and Gravitational Lensing
References
58 Retardation of Light Signals in GR
References
59 Local Measurement of the Speed of Light in the Schwarzschild Geometry and Its Constancy
60 Black Holes Physics
References
61 Fall into a Black Hole
62 Black Holes, White Holes and Wormholes
References
63 Geometrodynamics of the Wormhole
References
64 Spinning Black Holes: Basic Properties
References
65 Spinning Black Holes: Rotational Aspects and the Ergoregion
References
66 Spinning BHs: Applications and Energy and Angular Momentum Aspects
References
67 Frame Dragging and Geodetic Effect Derived: Two More Tests of GR
References
68 Entropy, Thermodynamics of a BH and Hawking Radiation
References
69 String Theory Re-invents Einstein's General Theory of Relativity
Reference
70 Accommodating Spinors in Spacetime and Emergence of Supergravity
Reference
71 Quantization of Geometry and Loop Quantum Gravity
References
72 Perturbative Quantum Gravity
References
73 Our Homogeneous and Isotropic Universe on Very Large Scales
References
74 Three Dimensional Surfaces of Constant Curvatures
75 Robertson–Walker Metric and Friedmann Universes
References
76 Expansion of the Universe and the Hubble Law
References
77 Elementary Derivation of the Friedmann Equations and Their Modifications
78 Evolution and Fate of the Universe
Reference
79 Age of the Universe and How Big Is It?
References
80 Distances in Cosmology
References
81 Statistical Thermodynamics in Brief
82 Black Body Radiation: The Planck Law
83 Statistical Distributions
84 Neutrinos and Ultra-Relativistic Particles
Reference
85 Big Bang Nucleosynthesis
Reference
86 Cosmic Microwave Background Radiation and Photon Decoupling
References
87 The Big Bang Description Needs Some Refinement
88 Inflation
References
89 Inflation Solves Three Key Problems of the Big Bang Description and GUTs
90 Field Theory Formulation of Inflation: A Time Dependent Vacuum and a Scalar Field
Reference
91 Power Spectrum, Geometry of the Universe and Parameters of Cosmology
References
92 Dark Matter and Dark Energy
References
93 Another Look at Our Universe
References
Index
Index


E. B. Manoukian

100 Years of Fundamental Theoretical Physics in the Palm of Your Hand: Integrated Technical Treatment


E. B. Manoukian The Institute for Fundamental Study Naresuan University Phitsanulok, Thailand

ISBN 978-3-030-51080-0 ISBN 978-3-030-51081-7 (eBook)
https://doi.org/10.1007/978-3-030-51081-7

© Springer Nature Switzerland AG 2020 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface

My aim in this book is to integrate, in a pedagogical and technical manner, with detailed derivations, most, if not all, of the modern fundamental theoretical physics developments in the past 100 years or so consisting of foundational aspects of:

– Quantum physics and Stability Problems in the Quantum World.
– Minkowski Spacetime Physics.
– Particles Classifications and Underlying Symmetries and Symmetry Violations.
– Quantum Field Theory of Particles Interactions.
– Higgs Field Physics.
– Supersymmetry: A Theory with Mathematical Beauty.
– Bosonic and Superstrings.
– Spacetime, Gravity and Supergravity.
– General Relativity and its Predictions, Including Frame Dragging.
– Intricacies of Black Holes Physics.
– Perturbative and Non-Perturbative Quantum Gravity.
– Intricacies of Modern Cosmology, Including Inflation and Power Spectrum,

thus covering fundamental physics from minute regions of spacetime to the edge of the Universe. If you are in the process of learning or lecturing on any of the above, irrespective of your specialty, then this textbook is for you. It is a graduate level book.¹ Graduate students, particularly, will benefit much from this detailed textbook as it essentially covers an entire typical good graduate program in theoretical physics in an integrated, inviting manner. As such, it may be used in all the courses based on one or more of the above descriptions. And by its very nature, I expect that it would become a perfect constant companion of the student.

The book emphasizes approaches and developments which are reasonably well understood and for which satisfactory theoretical descriptions have been given. Having realized that some physicists whose work does not involve relativity seem to know very little about special relativity, I had to introduce aspects of Minkowski spacetime physics. I have as well covered enough material in statistical thermodynamics and statistical distributions for pedagogical reasons and for completeness as they are much needed in cosmology.

When I was a graduate student, as well as when I was new in this business, I often wished there was a rather technical textbook of modern developments in fundamental theoretical physics, with derivations not just in words and hand-waving arguments as are often presented, covering its wide spectrum, so that I could master the technicalities of the corresponding exciting developments and, in turn, derive explicitly the exciting results in all the different areas of modern fundamental theoretical physics from the very small to cosmological distances. With over-specialization in one specific area, students, and perhaps many physicists, would find it difficult to keep up with all the exciting developments going on in physics and are even less familiar with their underlying technicalities. For example, they might have heard that the Universe is 13.8 billion years old, but have no idea how this number is actually computed. They might know, or have heard, that the Higgs field gives mass² to particles, without really realizing that the underlying theory had to develop that way and that such a field was needed to do just that, in order to have a consistent description of the electroweak interaction which unifies the electromagnetic and the weak interactions. In Quantum Electrodynamics (QED), one starts with a massive electron in the theory, and no Higgs field is needed to generate its mass. Why? They might have heard or read, or possibly not be aware at all, that only about 7 years ago a general relativity prediction was experimentally confirmed: that a rotating massive sphere (the Earth, for example) acts on the spin of a gyroscope in a way similar to how a rotating charged sphere acts on the spin of a particle, without having a clue how to compute the underlying physical quantity in question. They might have heard, or not at all, that, unlike for photons, there is a relative angle of 45° between graviton plane polarization states rather than 90°, having no clue how to derive this theoretically. They have undoubtedly heard of the large empty space within atoms, for example by reading the amusing article by Paul Sen³ at the BBC, in which he emphasizes this large empty space that matter occupies. He states that if you were to take out all of the empty space in atoms, and then compress all of the atoms so they were physically touching, all of the human beings on the planet would be about the size of an apple. Yet they may not be aware that one can describe this large extent of matter quantitatively. They might have heard of the inflation of the Universe, in which the latter was expected to have expanded gigantically and exponentially, without knowing why this was expected to have happened and how old the Universe was expected to be at that time. These are just a few examples.

It is precisely for these reasons that I became involved with this project, which is certainly needed to provide good training for our students. As graduate students, as well as instructors and researchers, with different specializations, have no time to master all the fields given above, this unique textbook will be of value not only to graduate students, as mentioned above, but to instructors and professionals as well, who are interested in presentations, intricacies and derivations of the many aspects of modern fundamental theoretical physics irrespective of their specialties.

To help the student in particular, I have provided sufficient details in the derivations in the book, and as such it would be useful for readers with different backgrounds. To make it easier for the reader to go through a chapter, some of the more difficult results or longer derivations are conveniently displayed in boxes. I have tried to make each chapter as self-contained as possible, with the necessary prerequisite chapter numbers displayed below the title of a given chapter if the need arises. The book is divided most conveniently, in general, into short chapters so that the reader may master each topic easily and progress through the book at his or her convenient rate.

The chapters are not randomly introduced and there is a smooth transition as we move from one part of the book to the next one as spelled out in the Table of Contents. I start with low energy quantum physics, study its development and consider some of its important consequences. In order to describe the dynamics of the overwhelming number of quantum particles observed in nature at high energies, satisfying relativistic constraints, the need arises to introduce Minkowski spacetime concepts, and interesting applications in it are subsequently carried out. Such particles are then properly described in Minkowski spacetime. Their classifications and their underlying symmetries, together with their violations, are then studied at length. Quantum Field Theories (QFT) are afterwards developed to consistently describe their interactions with the exception, at that stage, because of physical reasons and associated technical reasons, of the gravitational one. Moreover, the concept of the Higgs field is simultaneously introduced and further generalizations dealing with supersymmetry and the string theory⁴ approach to high energy physics are considered. Special attention is given to describing mathematically, not just in words, how the Higgs field permeates spacetime, and to studying its particle excitations as well as the decay process of the Higgs boson excitation in a way directly related to experimental studies. The need then arises to consider gravitation, in particular, in the realm of experiments. To deal with the gravitational interaction, a curved spacetime description, generalizing the Minkowski one and based on a principle, referred to as the principle of equivalence, leads to the General Theory of Relativity (GR) as a theory of gravitation, and several pertinent modern applications are given, such as Frame Dragging and the Geodetic Effect, as well as in Black Hole physics. Quantum aspects of gravitation as well as of supergravity are then considered by accommodating half-odd integer spins in spacetime. GR and the overwhelming number of particles introduced in earlier chapters allow one finally to deal with the intricacies of modern cosmology and study theoretically, i.e., technically, the evolution, the age, the geometry and the fate of the Universe.

The prerequisites for the book are introductory courses in quantum mechanics and electrodynamics, as well as the basics of mathematical techniques to which most students in undergraduate physics programs in the world are introduced.

In his famous lectures⁵ on the quantum field theory of fundamental processes, the first statement the legendary Richard Feynman makes, the very first one, is that the lectures cover all of physics. One quickly understands what Feynman meant by covering all of physics. The role of fundamental theoretical physics is to describe the basic interactions of Nature and QFT, par excellence, is supposed to do just that and provide a unified description of all these interactions. With this in mind, and with what has been covered here by incorporating, in the process, modern developments as well, I can certainly be justified in saying that the “present textbook covers all of fundamental theoretical physics”. I believe every serious graduate student in theoretical physics should be familiar with the content of this book. The present textbook is geared to this end.

Without further ado, let us find out what the Universe has been telling us for the past 100 years or so.

Phitsanulok, Thailand
E. B. Manoukian

¹ Some of the chapters which are not too technical may be suitable for advanced undergraduates in their final year.
² This was even announced on CNN and BBC.
³ See P. Sen. You can’t see the atom. Published by BBC News: 23 July, 2007, News Front Page. UK (2007).
⁴ For those who make arbitrary comments on string theory, it is worth mentioning that in this theory: 1. all particles, including the elusive graviton, emerge from the theory and are not, a priori, put into the theory by hand; 2. the theory re-invents the Yang-Mills field theory on which our present modern gauge theories are based; 3. the theory re-invents GR on which gravitational tests and the description of modern cosmology rely. These three facts alone are reason enough to study the theory of strings.
⁵ See R. P. Feynman. The theory of fundamental processes. The Benjamin/Cummings Publishing Co., Menlo Park, California. 6th Printing (1982), p. 1.

Acknowledgements

I am indebted to many of my former graduate students, who are now established physicists in their own right, particularly to Chaiyapoj Muthaporn, Siri Sirininlakul, Tukkamon Vijaktanawudhi (aka Kanchana Limboonsong), Prasopchai Viriyasrisuwattana, Seckson Sukhasena, Ponrad Srisawad, Songvudhi Chimchinda and Moragote Buddhakala, who, through their many questions, several discussions and/or collaborations, have been very helpful to me in writing this book. Although I typed the entire manuscript myself, and drew the figures as well, Chaiyapoj Muthaporn prepared the LaTeX input files. Without his constant help this work would never have been completed. I applaud him, thank him and will always remember how helpful he was. I would also like to thank Burin Gumjudpai, Seckson Sukhasena, Suchittra Sa-Nguansin, and Jiraphorn Chomdaeng of the Institute for Fundamental Study of Naresuan University for encouragement and support, as well as Ahpisit Ungkitchanukit and Chai-Hok Eab from Chulalongkorn University. Several chapters of the book were written at the Institute for Fundamental Study, and I could have found no more congenial atmosphere for working on this pleasant project than the one existing at the Institute.

Over the years, I have benefited much from the lectures and writings of Julian Schwinger, who had a unique way of doing physics with clarity and much elegance and developed most powerful formalisms for carrying out computations. He certainly had one of the greatest minds in theoretical physics of our time. As always, my thanks go to Steven Weinberg, the late Abdus Salam, Raymond Streater and Eberhard Zeidler for the keen interest they have shown in my work on renormalization theory over the years.

It was a pleasure to work with the wonderful editorial team of Angela Lahee (Berlin), Mieke van der Fluit (Dordrecht), Maria Bellantone (London), Hisako Niko (Dordrecht) and, at the production stage, Annelies Kersbergen (Dordrecht), all of Springer, who made the publication of this book possible. My special thanks go to Mieke for her infinite patience, dedication and for her constant encouragement. Over about fifteen years or so I have exchanged, on and off, more emails with Mieke than with anybody else on the globe in my entire life. She is a remarkably unique person, and I wish her a very happy retirement.

This project would not have been completed without the patience, encouragement and understanding of my wife Tuenjai. To my parents, who are both gone, this work is affectionately dedicated.


Contents

1 Introduction—An Overview and a Road Map
References
2 Sorting Out Selective Measurements: QM Re-invents Complex Numbers, Matrices, Operators and Inner-Product Spaces
References
3 Interference and Measurements
References
4 Transformation Theory and Emergence of ħ in the Formalism
References
5 Quantum Dynamics, Construction of Hamiltonians and Decay of Quantum Systems
References
6 Stability of the H-Atom in Configuration Space
Reference
7 Do Atomic Electrons Fall to the Center of Multi-electron Atoms?
References
8 How Quantum Mechanics Forces Matter in Bulk to Occupy Such a Large Volume and Prevents It from Collapsing Around Us
References
9 Schrödinger's Cat and Quantum Decoherence
References
10 Quantum Teleportation
References
11 Lorentz Frames and Minkowski Spacetime
References
12 The Celebrated Lorentz Group: Underlying Transformations Derived
Reference
13 Physics in Minkowski Spacetime: Applications
References
14 First Unified Field Theory: Maxwell's Equations and Lorentz Covariance
15 QM Meets Relativity and Birth of QFT: Fields and Particles
References


16 The Five Types of Fields You Meet in Quantum Field Theory
References
17 Gauge Fields
References
18 Particles & Symmetries
References
19 Strange, Charm, Bottom and the Top Quarks: How They Came About?
References
20 C, P, CP and T Violations in Weak Interactions
References
21 Lagrangians: Varying Action Integrals in QFT
Reference
22 Quantum Dynamics: The Functional Differential Formalism of QFT
References
23 Quantization of Gauge Theories and Constraints: Functional Differential Formalism
References
24 Functional (Path) Integral Formalism as a Functional Fourier Transform of the Functional Differential One
References
25 Transition Amplitudes and the Meaning of Virtual Particles
References
26 Radiative Corrections
References
27 Anomalous Magnetic Moment of the Electron and the Lamb Shift
References
28 How the Fine-Structure Changes from ≃ 1/137 to ≃ 1/128 at High Energies
References
29 Bell's Test, Entanglement and the Strict Predictions of QFT
References
30 What Is Renormalization Theory and Why Is It a Criterion in Developing and Describing the Fundamental Interactions in Nature?
References
31 Renormalization Group of an Abelian Gauge Theory: QED
References
32 Renormalization Group of a Non-Abelian Gauge Theory: QCD and Asymptotic Freedom
References
33 Birth of the Electroweak Theory
References
34 The Higgs Field and the God Particle
References


35 The Electroweak Model and the Significant Role of the Higgs Field
References
36 Spin and Statistics and the CPT Theorem
References
37 Grand Unified Field Theories and the Role of Inflation of the Universe
References
38 Supersymmetry and Superspace
References
39 The Very Nature of a SUSY Theory: Super-Poincaré Algebra, Energy Spectrum and Supermultiplets
Reference
40 Superfields
References
41 SUSY Field Theories: Maxwellian, Yang–Mills, Spin0-Spin1/2 Interactions
References
42 SUSY and the Standard Model: Couplings Unification
References
43 String Theory
References
44 Bosonic Strings
References
45 Compactification, D Branes and Mass Generation
References
46 Superstrings
References
47 Vertices, Interactions and Scattering
References
48 String Theory Re-Invents the Yang-Mills Field Theory
Reference
49 Gravity and Spacetime
References
50 Spacetime Curvature
51 Field Equations of GR
References
52 Transfer of Gravitational Energy, Polarization Aspects and Gravitational Waves
References
53 Review of Planetary Motion in Newtonian Theory of Gravitation in Brief
54 Schwarzschild Metric: An Elementary Treatment
References


55 Conservation Laws and GR Treatment of the Schwarzschild Metric
56 Perihelion Precession of Planets Orbits and the Schwinger-Binet Equation
References
57 Light Deflection in GR and Gravitational Lensing
References
58 Retardation of Light Signals in GR
References
59 Local Measurement of the Speed of Light in the Schwarzschild Geometry and Its Constancy
60 Black Holes Physics
References
61 Fall into a Black Hole
62 Black Holes, White Holes and Wormholes
References
63 Geometrodynamics of the Wormhole
References
64 Spinning Black Holes: Basic Properties
References
65 Spinning Black Holes: Rotational Aspects and the Ergoregion
References
66 Spinning BHs: Applications and Energy and Angular Momentum Aspects
References
67 Frame Dragging and Geodetic Effect Derived: Two More Tests of GR
References
68 Entropy, Thermodynamics of a BH and Hawking Radiation
References
69 String Theory Re-invents Einstein’s General Theory of Relativity
Reference
70 Accommodating Spinors in Spacetime and Emergence of Supergravity
Reference
71 Quantization of Geometry and Loop Quantum Gravity
References
72 Perturbative Quantum Gravity
References
73 Our Homogeneous and Isotropic Universe on Very Large Scales
References
74 Three Dimensional Surfaces of Constant Curvatures


75 Robertson–Walker Metric and Friedmann Universes
References
76 Expansion of the Universe and the Hubble Law
References
77 Elementary Derivation of the Friedmann Equations and Their Modifications
78 Evolution and Fate of the Universe
Reference
79 Age of the Universe and How Big Is It?
References
80 Distances in Cosmology
References
81 Statistical Thermodynamics in Brief
82 Black Body Radiation: The Planck Law
83 Statistical Distributions
84 Neutrinos and Ultra-Relativistic Particles
Reference
85 Big Bang Nucleosynthesis
Reference
86 Cosmic Microwave Background Radiation and Photon Decoupling
References
87 The Big Bang Description Needs Some Refinement
88 Inflation
References
89 Inflation Solves Three Key Problems of the Big Bang Description and GUTs
90 Field Theory Formulation of Inflation: A Time Dependent Vacuum and a Scalar Field
Reference
91 Power Spectrum, Geometry of the Universe and Parameters of Cosmology
References
92 Dark Matter and Dark Energy
References
93 Another Look at Our Universe
References
Index

About the Author

Prof. Dr. E. B. Manoukian is Emeritus Professor of Physics at the “Institute for Fundamental Study” of Naresuan University in Thailand. He carried out graduate studies at McGill University and University of Toronto obtaining the M.Sc. and Ph.D. degrees, respectively, in 1968 and 1971. He held research positions at the “Theoretical Physics Institute” of the University of Alberta, the “Dublin Institute for Advanced Studies”, and the “Centre de Recherches Mathématiques Appliquées” of the University of Montreal. He held professorships at the “Royal Military College of Canada” of the Department of National Defence, as well as at the “School of Physics” of Suranaree University of Technology and Naresuan University, in Thailand. He is the author of over 190 research papers in most aspects of Theoretical Physics, as well as of 8 books, 5 of which, including this one, are by Springer. The following books by the author: “Renormalization” (Academic Press), “Quantum Theory: A Wide Spectrum” (Springer), “Quantum Field Theory I: Foundations and Abelian and Non-Abelian Gauge Theories” (Springer), “Quantum Field Theory II: Introductions to Quantum Gravity, Supersymmetry and String Theory” (Springer), should be of interest to readers of the present one.


Abbreviations, Notations and Data

AU              Astronomical unit
BB              Black Body
BH              Black Hole
c               Speed of light
CMB             Cosmic Microwave Background
CPT             Combined Charge, Parity, Time reversal transformations
EW              Electroweak
EP              Equivalence Principle of renormalization theory
G, G_F          Newton gravitational constant, Fermi weak interaction constant
GR              General Relativity
GUT             Grand Unified Theory
IF              Inertial Frame
J               Joule
K, k_B          Kelvin degree, Boltzmann constant
ℓ_P, t_P, E_P   Planck length, Planck time, Planck energy
LHC             Large Hadron Collider
ly              light-year
LQG             Loop Quantum Gravity
pc              parsec
PDG             Particle Data Group
Pdg             Particle data group
PE, EEP         Principle of Equivalence, Einstein’s Equivalence Principle
QED, QCD        Quantum Electrodynamics, Quantum Chromodynamics
QFT, HEP        Quantum Field Theory, High Energy Physics
SR              Special Relativity
WH              White Hole
yr              year

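Units and Data

• For units and data see the compilations of the “Particle Data Group”: Tanabashi et al. [1], and, e.g., Olive et al. [2]. The following (some obviously approximate) numerical values, however, should be noted:

$1\ \mathrm{MeV} = 10^{6}\ \mathrm{eV}$; $1\ \mathrm{GeV} = 10^{3}\ \mathrm{MeV}$; $10^{3}\ \mathrm{GeV} = 1\ \mathrm{TeV}$; $1\ \mathrm{erg} = 10^{-7}\ \mathrm{J}$; $1\ \mathrm{J} = 6.242 \times 10^{9}\ \mathrm{GeV}$

$\hbar = 1.0546 \times 10^{-34}\ \mathrm{m^{2}\,kg/s} = 1.0546 \times 10^{-34}\ \mathrm{J\,s} = 6.58212 \times 10^{-22}\ \mathrm{MeV\,s}$

$\hbar c = 197.33\ \mathrm{MeV\,fm} = 1.973 \times 10^{-16}\ \mathrm{GeV\,m}$; $1\ \mathrm{GeV}/c^{2} = 1.782662 \times 10^{-27}\ \mathrm{kg}$

$c = 2.99792458 \times 10^{10}\ \mathrm{cm/s}$ (exact); $k_B = 8.617 \times 10^{-14}\ \mathrm{GeV/K} = 1.381 \times 10^{-23}\ \mathrm{J/K}$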


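$\ell_P = 1.616 \times 10^{-33}\ \mathrm{cm}$; $\mathrm{fm} = 10^{-13}\ \mathrm{cm}$; $\mathrm{AU} = 1.496 \times 10^{11}\ \mathrm{m}$; $\mathrm{ly} = 9.460730 \times 10^{15}\ \mathrm{m}$

$\mathrm{pc} = 3.086 \times 10^{13}\ \mathrm{km}$; $t_P = 5.390 \times 10^{-44}\ \mathrm{s}$; Julian year $= 365.25$ days; $\mathrm{Gyr} = 10^{9}$ years

$E_P = 1.221 \times 10^{19}\ \mathrm{GeV}$; $G = 6.674 \times 10^{-11}\ \mathrm{m^{3}/(kg\,s^{2})} = 6.709 \times 10^{-39}\ \hbar c^{5}/\mathrm{GeV}^{2}$

$H_0$ (present-day Hubble constant) $= h/(9.777752\ \mathrm{Gyr})$, where $h$ is the Hubble expansion rate scaling factor

$1/H_0$ (Hubble time) $= 9.777752\ \mathrm{Gyr}/h$; $c/H_0$ (Hubble length) $= 1.372 \times 10^{26}\ \mathrm{m}$, for $h \simeq 0.674$

$3H_0^{2}/8\pi G$ (critical density of the Universe) $= 1.878 \times 10^{-26}\, h^{2}\ \mathrm{kg/m^{3}}$

$G_F = 1.166 \times 10^{-5}\ \hbar^{3} c^{3}/\mathrm{GeV}^{2}$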

$\sigma_{\mathrm{Thom}}$ (Thomson cross section) $= (2/3)(4\pi r_c^{2}) = 6.6525 \times 10^{-25}\ \mathrm{cm}^{2}$

$r_c$ (classical radius of the electron) $= 2.817940 \times 10^{-15}\ \mathrm{m}$

Energy-temperature conversion: $1\ \mathrm{GeV} = 1.1605 \times 10^{13}\ \mathrm{K}$

Fine structure constant $\alpha = 1/137.04$ at $Q^{2} = 0$, and $\approx 1/128$ at $Q^{2} \approx M_Z^{2}$

Weak-mixing angle $\theta_W$: $\sin^{2}\theta_W \approx 0.232$ at $Q^{2} \approx M_Z^{2}$

Strong coupling constant $\alpha_s \approx 0.119$ at $Q^{2} \approx M_Z^{2}$

• The notation $h$ for the Hubble scaling factor in the Hubble constant above is a standard one and should not be confused with the Planck one. It is also dimensionless.
• $\ell_P = \sqrt{\hbar G/c^{3}}$, $t_P = \sqrt{\hbar G/c^{5}}$, $E_P = c^{2}\sqrt{\hbar c/G}$.
• Latin indices $i, j, k, \ldots$ are generally taken to run over 1, 2, 3, while the Greek indices $\mu, \nu, \ldots$ run over 0, 1, 2, 3 in 4D. Variations do occur when there are many different types of indices to be used, and the meanings should be evident from the presentations.
• The Minkowski metric $\eta_{\mu\nu}$ is defined by $[\eta_{\mu\nu}] = \mathrm{diag}[-1, 1, 1, 1] = [\eta^{\mu\nu}]$ in 4D.
• The gamma matrices satisfy the anti-commutation relations $\{\gamma^{\mu}, \gamma^{\nu}\} = -2\eta^{\mu\nu}$. Their properties are tabulated in Box 16.8, and various representations of the gamma matrices are given in Box 16.5 in Chap. 16.
• The charge conjugation matrix is defined by $C = i\gamma^{2}\gamma^{0}$.
• $\bar{\psi} = \psi^{\dagger}\gamma^{0}$, $\bar{u} = u^{\dagger}\gamma^{0}$, $\bar{v} = v^{\dagger}\gamma^{0}$. The Hermitian conjugate of a matrix $M$ is denoted by $M^{\dagger}$, while its complex conjugate is denoted by $M^{*}$.
• The step function is denoted by $\Theta(x)$, which is equal to 1 for $x > 0$, and 0 for $x < 0$.
• In Chaps. 15–48, we set $\hbar$ and the speed of light $c$ equal to one, as is customary in particles & fields studies.
• The symbol $\varepsilon$ is used in dimensional regularization, while $\epsilon$ is used in defining the boundary condition in the denominator of a propagator $(Q^{2} + m^{2} - i\epsilon)$ and should not be confused with $\varepsilon$ used in dimensional regularization. We may also use either one when dealing with an infinitesimal quantity in general, with $\epsilon$ more frequently, and this should be self-evident from the underlying context.
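As a quick consistency check on the numbers above, the Planck scales and the Hubble quantities can be recomputed from the quoted values of $\hbar$, $c$ and $G$. The following is only a minimal sketch: the variable names are our own, the inputs are the rounded values from the list, and agreement is therefore expected only to about three significant figures.

```python
import math

# SI inputs, rounded as in the data list above.
HBAR = 1.0546e-34                    # J s
C = 2.99792458e8                     # m/s (exact)
G = 6.674e-11                        # m^3 / (kg s^2)
GEV_PER_JOULE = 6.242e9              # from 1 J = 6.242e9 GeV
GYR_IN_S = 1e9 * 365.25 * 86400.0    # Julian year = 365.25 days

# Planck scales from l_P = sqrt(hbar G / c^3), t_P = sqrt(hbar G / c^5),
# E_P = c^2 sqrt(hbar c / G).
planck_length_cm = math.sqrt(HBAR * G / C**3) * 100.0
planck_time_s = math.sqrt(HBAR * G / C**5)
planck_energy_gev = C**2 * math.sqrt(HBAR * C / G) * GEV_PER_JOULE

# Hubble quantities for H0 = h / (9.777752 Gyr).
h = 0.674
hubble_time_s = 9.777752 * GYR_IN_S / h
hubble_length_m = C * hubble_time_s

# Critical density 3 H0^2 / (8 pi G), divided by h^2 (i.e. evaluated at h = 1).
rho_crit_over_h2 = 3.0 / (8.0 * math.pi * G * (9.777752 * GYR_IN_S) ** 2)

print(f"l_P       = {planck_length_cm:.3e} cm")
print(f"t_P       = {planck_time_s:.3e} s")
print(f"E_P       = {planck_energy_gev:.3e} GeV")
print(f"c/H0      = {hubble_length_m:.3e} m (h = {h})")
print(f"rho_c/h^2 = {rho_crit_over_h2:.3e} kg/m^3")
```

Each printed value can be compared directly with the corresponding entry in the list; small differences in the last digit come from the rounding of the inputs.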

Masses

$M_p = 938.3\ \mathrm{MeV}/c^{2}$; $M_n = 939.6\ \mathrm{MeV}/c^{2}$; $M_W = 80.4\ \mathrm{GeV}/c^{2}$; $M_Z = 91.2\ \mathrm{GeV}/c^{2}$

$m_e = 0.511\ \mathrm{MeV}/c^{2}$; $m_{\mu} = 105.66\ \mathrm{MeV}/c^{2}$; $m_{\tau} = 1777\ \mathrm{MeV}/c^{2}$

Mass of the neutral Higgs $H^{0} \approx 125.10\ \mathrm{GeV}/c^{2}$. For approximate mass values of some of the quarks, see Table 28.1. For a more precise range of values, see the PDG compilations.


References

1. Tanabashi, M., et al. (2018). Particle data group. Physical Review D, 98, 030001.
2. Olive, K. A., et al. (2014). Particle data group. Chinese Physics C, 38, 090001.


1

Introduction—An Overview and a Road Map

It is rather unfortunate that, with over-specialization, students, and perhaps many physicists, even established ones, find it difficult to keep up with all the exciting developments going on in fundamental modern theoretical physics, and are even less familiar with their underlying technicalities. It is quite common that a student who is in the process of specializing in one area of physics, or even a physicist working in a particular area, knows very little or almost nothing about other areas of physics. They might know, or have heard, for example, that the Higgs field gives mass¹ to particles in the electroweak theory, which unifies the electromagnetic and weak interactions, without realizing that in describing quantum electrodynamics no Higgs field was ever introduced to give mass to the electron. They might know, or have just heard, of the inflation of the Universe as a stage in its evolution in which it is expected to have undergone a gigantic exponential expansion, without knowing why inflation was needed in describing the evolution of the Universe in the first place, and knowing even less at which stage this enormous expansion is expected to have occurred. They might have heard that particles decaying through the so-called weak interaction, violating some symmetry, give rise to detectable tracks before they decay, while a particle in a strongly interacting decay process hardly leaves any detectable tracks. They might know or have heard of the large space occupied by matter in bulk but have no idea how to account for this in a quantitative manner. I was even surprised to realize that some physicists whose work does not require the use of relativistic physics knew very little about special relativity.
When I was a graduate student, as well as when I was new in this business, I often wished there was a rather technical book on modern developments in fundamental theoretical physics, with derivations given not just in words and hand-waving arguments, as is often done, and covering the wide spectrum of the subject. Such a book would have allowed me to master the technicalities of the corresponding exciting developments and, in turn, to derive explicitly the exciting results in all the different areas of modern fundamental theoretical physics, from the very small to cosmological distances. With this in mind, my main reason for writing this book was to integrate, in a pedagogical and technical manner, with detailed derivations, most, if not all, of the modern fundamental theoretical physics developments of the past 100 years or so, consisting of foundational aspects of:

– Quantum physics and Stability Problems in the Quantum World.
– Minkowski Spacetime Physics.
– Particles Classifications and Underlying Symmetries and Symmetry Violations.
– Quantum Field Theory of Particles Interactions.
– Higgs Field Physics.
– Supersymmetry: A Theory with Mathematical Beauty.
– Bosonic and Superstrings.
– Spacetime, Gravity and Supergravity.
– General Relativity and its Predictions, Including Frame Dragging.
– Intricacies of Black Holes Physics.
– Perturbative and Non-Perturbative Quantum Gravity.
– Intricacies of Modern Cosmology, Including Inflation and Power Spectrum,

thus covering its wide spectrum in all the different areas of modern fundamental theoretical physics from minute regions of spacetime to the edge of the Universe.

¹ This was even announced on CNN and BBC.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_1


As mentioned in our Preface, the very first statement that the legendary Richard Feynman makes in his famous lectures on quantum field theory of fundamental processes is that the lectures cover all of physics.² One quickly understands what Feynman meant by covering all of physics. The role of fundamental physics is to describe the basic interactions of Nature, and QFT, par excellence, is supposed to do just that and provide a unified description of all these interactions. With this in mind, and with what is being covered here, incorporating modern developments as well in the process, I am certainly justified in saying that the present book covers all of fundamental theoretical physics. The above topics essentially cover an entire good graduate program in theoretical physics, presented in an integrated manner. This unique book will thus be of value not only to graduate students, but to instructors and professionals as well, who are interested in presentations, intricacies and derivations of the many aspects of modern fundamental theoretical physics, irrespective of their specialties.

We start with low energy quantum physics, study its development and consider some of its important consequences. In order to describe the myriad of high energy quantum particles observed in Nature, the need arises to introduce the Minkowski space concepts of special relativity. Once such an arena of relativistic particles is introduced, one may then classify these particles, study their underlying symmetries, and describe their dynamics by introducing, most appropriately, quantum fields. Quantum field theories are afterwards developed to consistently describe their interactions; the successful theories are those involving the electromagnetic, weak and strong interactions.
Unfortunately, the methods available at this stage in quantum field theory, and introduced here for these interactions, do not apply directly to the gravitational interaction in the quantum world. To understand this, the need arises to study the gravitational interaction, from which Einstein’s very successful general relativity (GR) theory emerges; in the process, its applications, including to black hole physics, are considered. Quantum aspects of gravitation are then studied non-perturbatively, and a perturbative treatment of GR, in the light of the other interactions introduced earlier, is carried out as well. In the final stage, after particles have been introduced and classified, and gravitational theory has been thoroughly studied, we deal with the intricacies of modern cosmology and study, technically, the evolution, the age, the geometry and the fate of the Universe.

In the remaining part of this chapter, I give an overall view of the entire subject matter covered in the book. The present chapter involves no derivations. All the derivations, and additional references, are given in the subsequent chapters. At this stage the impatient reader, or the advanced one, may proceed directly to Chap. 2 and go on, as each chapter contains introductory material before the underlying technicalities are worked out. More generally, I would advise the reader to go through the remaining part of this introductory chapter as many times as necessary, while simultaneously studying the various chapters. The remaining part of this chapter will give you a pretty good idea about the entire subject matter treated in the book.
In developing the formalism of QM, and the underlying rules for computations, I have made use of the elegant, clear and incisive approach of direct analysis of various selective measurements, pioneered by the legendary Julian Schwinger,³ which has its roots in Dirac’s abstract presentation⁴ in terms of projection operators, and which provides tremendous insight into the physics behind the formalism. In particular, we will learn, through the formalism, how QM re-invents complex numbers, matrices, operators and inner-product spaces, as well as the important, inescapable, concept of probability. We will see how so-called interference terms appear, in general, in computing probabilities; they come from cross terms between the alternative intermediate states through which a system may go from an initial state to a final one. The inherited principle of linear superposition, which necessarily involves adding the amplitudes for the system to go through such alternative states, rather than adding the corresponding probabilities for each alternative, emerges in computing the probability of transition in the process. We will see how such interference terms may, for example, be destroyed by a specific measurement which yields one particular such alternative state, and how the concept of conditional probability, familiar to probabilists, naturally emerges.

The appearance of interference terms in probabilities associated with measurements becomes a serious problem in the Schrödinger’s cat “experiment”,⁵ in which a live cat is in a vessel which is coupled by a lethal device to a radioactive substance. If the latter decays, this triggers the device to release a deadly gas and the cat dies. On the other hand, if no decay occurs, the cat lives. With the radioactive decay law obeying the probabilistic rules of quantum physics, a decay may or may not occur in the vessel within a given time, specified by the “experimentalist”, which depends on the half-life of the radioactive substance.
The underlying theory of this leads to interference between alive and dead cat states, and unless one looks into the vessel the cat is in a superposition of alive and dead states, which is not a meaningful classical concept. On the other hand, one may argue that a quantum system is not in isolation but, in particular, interacts with the environment,⁶ which may, in turn, cause the destruction of such interference terms in conformity with classical concepts.

² Feynman [42], page 1.
³ See Schwinger [128, 129]. For an extensive analysis of this formalism in terms of selective measurements, including various applications, see Chaps. 1 & 8 of Manoukian [82].
⁴ Dirac [22].
⁵ See Schrödinger [117].

Transformation theory, such as describing displacements in space and time in the quantum world, is developed, and quantum generators are introduced in the process. We will see how the fundamental constant ħ appears in the formalism: requiring that the dimensionalities of these generators coincide with those of their classical counterparts leads to a uniform scaling of the generators emerging in the formalism by this fundamental constant. A key result is that symmetry transformations in the quantum world are carried out by a special class of operators referred to as unitary or anti-unitary. Quantum dynamics is developed, and Hamiltonians are then constructed for various physical systems from first principles. This is followed by a study of the decay of quantum systems and of the uncertainty of a system remaining in the same state during a given time, defined through the concept of survival probability, from which the so-called energy-time uncertainty arises. This is exemplified by the decay of a quantum system from one energy level, of finite width, to a lower one.

The collapse of the hydrogen atom, based on classical arguments in configuration space, is fully resolved in QM by showing rigorously that the probability that the electron falls to the center of the atom is zero.
This is also shown for multi-electron atoms, where the probability of some or all electrons falling to the center of the atom is zero, a result which has been established rigorously only recently.⁷

One of the greatest problems that quantum mechanics has addressed over the years is why matter in bulk does not collapse around us, which classical theory fails to describe satisfactorily, and why it occupies such a large volume, with a large empty space within it. The problem of matter in bulk undoubtedly turned out to be one of the most interesting problems that QM has resolved, and if one prepares a list of the most important problems that quantum theory has addressed over the years, this subject will undoubtedly be on it. To this end, Paul Ehrenfest,⁸ in a 1931 address concerning the award of the Lorentz medal to Wolfgang Pauli, emphasized the large volume occupied by matter in bulk, stating: “We take a piece of metal, or a stone. When we think about it, we are astonished that this quantity of matter should occupy so large a volume”. He went on to state that the Pauli exclusion principle is the reason: “Only the Pauli Principle, no two electrons in the same state.” This statement always stayed in my mind, and an actual demonstration quantifying the large volume occupied by matter remained a challenge for me for years. The role that QM plays together with the exclusion principle is of interest not only in physics but also in chemistry, as exemplified in describing the Periodic Table of Elements, as well as in the sciences in general. It is amusing to read an article by Paul Sen [130] at the BBC, in which he emphasizes once more the large volume that matter occupies. He states that if you were to take out all of the empty space in atoms, and then compress all of the atoms so they were physically touching, all of the human beings on the planet would be about the size of an apple.
Invoking the Pauli exclusion principle in the theoretical analysis which proves that matter does not collapse around us turned out to be not only sufficient but also necessary for the problem at hand,⁹ as will be discussed further below. On the other hand, if the Pauli exclusion principle is not invoked, it is interesting to quote Freeman Dyson,¹⁰ who states: “[such] matter in bulk would collapse into a condensed high-density phase. The assembly of any two macroscopic objects would release energy comparable to that of an atomic bomb.... Matter without the exclusion principle is unstable.” In the translated version of the book by Shin’ichirō Tomonaga on spin,¹¹ one reads in the Preface: “The existence of spin, and the statistics associated with it, is the most subtle and ingenious design of Nature - without it the whole universe would collapse.”

I have carefully considered the large volume occupied by matter with one of my former students.¹² The analysis leads to the inescapable fact that, in order to have a non-vanishing probability of having the electrons of matter in bulk within a sphere of radius $R$, the corresponding volume $v_R$ necessarily grows not any slower than the first power of the number $N$ of electrons for $N \to \infty$. That is, the radius $R$ necessarily grows not any slower than $N^{1/3}$ for $N \to \infty$. No wonder matter occupies so large a volume! The drastic difference between matter, with the exclusion principle, and “bosonic matter”, i.e., matter for which the Pauli exclusion principle is not invoked, with Coulomb interactions, is that the ground-state energy for the latter behaves as $E_N \sim -N^{\alpha}$, with $\alpha > 1$,¹³ where $(N + N)$ denotes the number of negatively charged particles plus an equal number of positively charged particles.

⁶ Zurek [159].
⁷ See Manoukian [89].
⁸ Klein [65], Dyson and Lenard [24, 68].
⁹ Lieb and Thirring [69].
¹⁰ Dyson [23].
¹¹ Tomonaga [143].
¹² See Manoukian and Sirininlakul [81].
¹³ See: Dyson [23]; Dyson and Lenard [24]; Lenard and Dyson [68]; Lieb [70]; Manoukian and Muthaporn [78]; Manoukian and Sirininlakul [79].
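The scaling statements in this part of the overview can be made concrete with a few lines of arithmetic. The sketch below is an illustration under stated assumptions, not the rigorous analysis cited in the footnotes: the function names, the unit prefactors `r_unit` and `e_unit`, and the sample exponent 1.4 are all hypothetical choices, picked only to contrast $\alpha = 1$ with some $\alpha > 1$ for a ground-state energy of the form $E_N = -N^{\alpha}$.

```python
def radius_lower_bound(n_electrons, r_unit=1.0):
    """Scaling R ~ N^(1/3); r_unit is a hypothetical length prefactor."""
    return r_unit * n_electrons ** (1.0 / 3.0)


def energy_released(n, alpha, e_unit=1.0):
    """Energy freed when two (N + N) systems merge into one (2N + 2N) system.

    Assumes E_N = -e_unit * N**alpha, so the result equals
    e_unit * ((2N)**alpha - 2 * N**alpha).
    """
    energy_two_separate = 2.0 * (-e_unit * n ** alpha)
    energy_merged = -e_unit * (2.0 * n) ** alpha
    return energy_two_separate - energy_merged


n = 1.0e23
# alpha = 1: merging two systems releases nothing.
print(energy_released(n, 1.0))
# alpha > 1 (1.4 is an arbitrary illustrative value): the release grows like N**alpha.
print(energy_released(n, 1.4))
# The radius grows like N^(1/3): eight times the electrons, twice the radius.
print(radius_lower_bound(8.0 * n) / radius_lower_bound(n))
```

With $\alpha = 1$ the energy is additive and nothing is gained by merging, whereas any $\alpha > 1$ gives a release of $(2^{\alpha} - 2)N^{\alpha}$ in the chosen units, which is enormous for $N \sim 10^{23}$.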


This behavior for “bosonic matter” is unlike that of matter, with the exclusion principle, for which¹⁴ $\alpha = 1$, which establishes the stability of matter by a quantum mechanical description, i.e., it shows why matter does not collapse around us. A power law behavior with $\alpha > 1$ implies instability, as the formation of a single system consisting of $(2N + 2N)$ particles is favored over two separate systems, brought together, each consisting of $(N + N)$ particles, and the energy released upon collapse of the two systems into one, being proportional to $[(2N)^{\alpha} - 2(N)^{\alpha}]$, will be overwhelmingly large¹⁵ for realistic large $N$, e.g., $N \sim 10^{23}$.

Finally, the underlying theory of quantum teleportation is developed, which involves the transfer of a quantum state of a particle to another distant particle without traversing the distance between them. In particular, a recent intriguing experiment by J.-G. Ren et al. [103] involved the transfer of the polarization properties of a photon to another photon at large distances of the order of 1400 km. The underlying theory of the so-called Bell’s test, which involves studying properties acquired by widely separated particles produced in a scattering process, is developed and applied after developing the intricacies of quantum field theory, in which transition amplitudes are unambiguously evaluated and are in conformity with relativity.

It eventually becomes necessary to increase the energy under consideration and, in turn, the need arises to introduce Minkowski spacetime concepts, and the underlying theory of special relativity,¹⁶ to describe high-energy fundamental processes involving the myriad of particles observed in Nature in the microscopic world, moving with speeds comparable to the speed of light; such processes may involve the creation of other particles, and an exchange between energy and matter may take place.
To this end, the concept of an inertial frame (IF) is defined as a frame in which a particle which is initially at rest remains at rest, and a particle which is initially in uniform motion continues in uniform motion, unless acted upon by an external force. The principle of relativity is then introduced, asserting the equivalence of any two inertial frames, moving with uniform velocities relative to each other, in describing physical laws, including that the speed of light in vacuum is the same in all inertial frames.¹⁷ Spacetime consists of all possible events. An event, in an IF in which a coordinate system is set up, is labeled by coordinates $(t, \mathbf{r})$ stating when and where the event has occurred in the given coordinate system. Different inertial frames give different labelings of the same event in spacetime. They are related by so-called Lorentz transformations (LT), and by some generalizations to be studied later, which in turn give the transformation rule between physical laws described in the coordinate systems set up in any two such IFs; the laws are simply relabeled. Vectors and tensors are introduced in the process, and their transformation rules under LT are worked out. Detailed consideration is given to physics in Minkowski spacetime, and various applications are worked out. Most importantly, the consistency of the set of Maxwell’s equations, describing the very first unified theory, of electricity, magnetism and optics, is established, and the explicit solution of this set is derived, for a given current distribution, in terms of the so-called vector potential, from which the electric and magnetic fields may be defined. With Minkowski space introduced as the arena of particles and their collisions, we may develop their interactions consistently with relativity, which allows the creation of particles in a collision process and an exchange between matter and energy.
An appropriate description of such physical processes, in which a variable number of particles may be created or destroyed in the quantum world, is provided by the very rich concept of a quantum field. The theory which emerges from extending quantum physics to the relativistic regime is called quantum field theory (QFT).¹⁸ A field, with given specified properties, as a function of space and time variables, may give rise to the creation of a single particle or of several particles, with various probabilities which must add up to one by conservation of probability. Accordingly, a probability is assigned to the creation of a single particle, which in turn gives rise to a wavefunction renormalization constant associated with a single particle. That is, a wavefunction renormalization constant emerges in a theory as a consequence of QM. This point is certainly not sufficiently emphasized in the literature. A given field as a function of space, at a given fixed time, plays the role of the degrees of freedom, labeled by the space variable, in terms of which Lagrangians may be set up and field equations may be derived as one does in classical mechanics. These equations eventually lead to the description of the fundamental processes observed in Nature based on specific interacting theories of fields. The successful such theories we have at present for particle interactions are: quantum electrodynamics (QED),¹⁹ describing the interactions of charged particles and photons, quantum chromodynamics (QCD),²⁰ describing the strong interaction of quarks and so-called gluons, and the electroweak theory (EW),²¹ which unifies the electromagnetic and the so-called weak interaction, with the latter involving decay processes. The theory combining the EW theory and QCD is referred to as the standard model. One of the remarkable properties of QCD, as a theory of strong interactions, is that its coupling parameter becomes small at high energies, and as a result reliable perturbative computations may be carried out at high energies.²² This welcome property is referred to as asymptotic freedom.²³ A grand unified field theory (GUT), on the other hand, unifies electromagnetic, weak and strong interactions, and emerges at very high energies, such as $\sim 10^{15} - 10^{16}$ GeV, at which the underlying couplings become equal, that is, the theory involves a single coupling parameter instead of three. Below such energies the different couplings become separately identified, breaking the GUT symmetry of equal couplings. The photon and the gluons are referred to as gauge fields and mediate these basic interactions. A basic constraint imposed by relativity is that the fields in a Lagrangian density and their derivatives are to be multiplied at the same, i.e., coincident, spacetime points, to avoid problems associated with action at a distance and its inherent infinite propagation speed of signals. This equivalently corresponds to measurements made at absolute zero distances, which are physically unattainable.

14 This key result is established rigorously in the monumental paper by Lieb and Thirring [69].
15 For the collapsing stage of “bosonic matter” see Manoukian, Muthaporn and Sirininlakul [83].
16 Einstein [26]. See also Einstein et al. [32].
17 For relevant experiments, see Waddoups, Edwards and Merrill [147]; Babcock and Bergman [8]. See also Beckmann and Mandics [10]. For a test of the independence of the speed of light of the speed of its source, in the microscopic domain of elementary particles, see Alväger et al. [6]. For a macroscopic test at cosmological distances in binary star systems, see Brecher [15]. See also, e.g., Ragulsky [102], for the constancy of the speed of light in all directions. For the independence of the speed of light of its frequency, see, e.g., Schaefer [113].
18 A fairly detailed account of many aspects of QFT, including the historical development of the subject from its birth in 1926 up to the present day, is given in the introductory chapter of my book, Manoukian [87], and would certainly be most valuable for readers at all levels.
The upshot of this is that the coupling parameters of the fields in the Lagrangian densities, describing interactions, are necessarily expressed at infinite energy and are referred to as unrenormalized parameters. It is certainly not justifiable to think that our theories may be extended all the way to absolute zero distances or to infinite energies. When computing physical processes and physical quantities, we have to eliminate these parameters in favor of physically measurable parameters corresponding to the energies at which experiments are being carried out. This process is what one calls renormalization. It is important to realize that the elimination of unrenormalized parameters in favor of physically measurable parameters is necessary in order to confront theory with experiments, as experiments are carried out at “finite energies”. The equations which relate the various couplings in a theory at different energies are referred to as renormalization group equations. The process of renormalization may be consistently carried out for the above mentioned interacting theories, which as such are referred to as renormalizable theories. It is important to realize that in developing and describing the fundamental interactions in Nature, such as the ones just discussed, renormalizability is a criterion a theory must satisfy: its underlying initial couplings and mass parameters in the Lagrangian densities may be eliminated in favor of corresponding parameters at finite energies, and one may unambiguously compute transition amplitudes to any order in a perturbative setting. This shows the key role the theory of renormalization plays in fundamental physics. Historically, Abdus Salam,²⁴ in 1951, was the first to develop and sketch a very general theory of renormalization. Salam’s ingenious formalism was brought to a mathematically rigorous form and the underlying theory was proved by the author²⁵ much later.
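How a renormalization group equation relates a coupling at different energies can be illustrated with the strong coupling. The sketch below uses the standard textbook one-loop solution for the running of $\alpha_s$, which is not taken from the passage above. The reference values $\alpha_s = 0.118$ at 91.19 GeV and the fixed number of quark flavors $n_f = 5$ are assumptions made for illustration, and flavor thresholds are ignored.

```python
import math


def alpha_s_one_loop(q_gev, alpha_ref=0.118, mu_gev=91.19, n_f=5):
    """One-loop running strong coupling at energy q_gev (in GeV).

    Solves d(alpha)/d(ln Q^2) = -beta0 * alpha^2 / (4 pi) with
    beta0 = 11 - 2 n_f / 3, starting from alpha_ref at the scale mu_gev.
    All default values are illustrative assumptions.
    """
    beta0 = 11.0 - 2.0 * n_f / 3.0
    log_term = math.log(q_gev**2 / mu_gev**2)
    return alpha_ref / (1.0 + alpha_ref * beta0 * log_term / (4.0 * math.pi))


for q in (91.19, 1.0e3, 1.0e4):
    print(f"Q = {q:8.1f} GeV   alpha_s = {alpha_s_one_loop(q):.4f}")
```

With these inputs the coupling decreases as the energy grows, which is the behavior referred to as asymptotic freedom; the same kind of equation, applied to all three couplings, underlies the discussion of unification at GUT energies.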
Shortly after Salam’s work, Nikolai Bogoliubov and Ostap Parasiuk (BP)²⁶ also developed a very general renormalization scheme, and the underlying theory was proved later by Klaus Hepp and Wolfhart Zimmermann; the result is popularly known as the BPHZ scheme.²⁷ The equivalence of the two schemes, originating from Salam’s work and from BP, was also proved by the author,²⁸ and this unifying theorem of renormalization is now also referred to as the equivalence principle (EP)²⁹ of renormalization. The intricate details showing how unrenormalized parameters are eliminated in favor of renormalized, i.e., physical, parameters were also established by the author.³⁰ The renormalization scheme originating from Salam’s work turned out to be much simpler than the BP one, and after I had established the equivalence of both schemes, Raymond Streater stated regarding this: “It is the end of a long chapter in the history of physics” and that “physicists found Salam’s [method] easier than the BPH[Z] one”.³¹ Unlike QED, the EW theory requires that all the particles involved in the theory are initially massless and acquire masses by the interaction of their corresponding fields with a scalar field, referred to as the Higgs field, through a process referred to as the Higgs mechanism. The corresponding particle associated with the Higgs field, referred to as the Higgs boson, has been recently observed. You will learn, through a theoretical description, how the signal identifying the Higgs boson is established experimentally.³² The technical details of the Higgs mechanism will be studied at length in this book. In particular, you will learn why no Higgs field is introduced in QED as in the EW theory. You will learn why the Higgs boson is massive. You will learn why the photon does not acquire a mass, which would otherwise be in contradiction with relativity and the constancy of the speed of light. You will learn why one does not worry if quarks are not observed, while the observation of the Higgs boson was critical. Regarding the Higgs mechanism, in its early developments, the legendary Victor Weisskopf, who had been the Director of CERN in the 1960s, did not seem to be impressed by this particular way of generating masses. In his CERN publications,³³ he once remarked that “this is an awkward way to explain masses and that he believes that Nature should be more inventive, but experiments may prove him wrong”. But the Higgs mechanism as we know it now works rather well, and the electroweak theory is in good agreement with experiments but open, of course, to modifications like almost any other theory.

19 Schwinger [123, 126, 127], Feynman [41], Tomonaga [142].
20 Tanabashi [138], Gross [51], Wilczek [153], Politzer [100].
21 Glashow [49], Weinberg [149, 150], Salam [112].
22 This was established by Gross, Wilczek and Politzer, see their Nobel lectures [51, 153, 100], respectively. See also the early work of Vanyashin and Terentyev [145].
23 An interesting application of this property is in explaining the observations of the so-called “deep inelastic experiment” (see, e.g., Friedman and Kendall [47]), which give support to the idea that a nucleon, such as the proton, contains point-like constituents, identified with quarks and gluons, which are collectively called partons, a term coined by Feynman [40]. For a systematic theoretical analysis of deep inelastic scattering, see Manoukian [87], pp. 429–456.
24 See Salam [110, 111].
25 Manoukian [76]. See also references therein.
26 See Bogoliubov and Parasiuk [13].
27 See Hepp [57], Zimmermann [158].
28 Manoukian [76].
29 Zeidler [157]. See also Figueroa and Gracia-Bondia [43] regarding EP.
30 Manoukian [75, 76].
At earlier stages, Julian Schwinger was the first to come up with the idea that such a method may generate masses for fermions³⁴ and advanced as well the idea of the unification of electromagnetic and weak interactions³⁵ involving vector bosons.³⁶ The above theories will be discussed in detail, and their renormalization group equations will be studied and used to describe the unification of the underlying interactions at high energies in a GUT description. We will learn that a GUT predicts the decay of the proton, and the relevance of this to the so-called inflation of the Universe, in which the Universe is expected to have expanded enormously, will be considered. All the fields encountered in present QFT will be studied. Particle properties and their classifications will be considered in detail, and symmetry violations in weak interaction processes will be worked out; these are associated with space reflection (parity) of a coordinate system through the origin, time reversal of a given process, and charge conjugation, in which a particle is replaced by its anti-particle. Special attention will be given to see how the different quarks came about in the description of the composite nature of some particles which may participate in the strong interactions. Neutrino masses, and the theoretical description of related experiments implying their non-zero masses and their mass differences, will be spelled out.
Computational methods for transition amplitudes of fundamental processes will be described by both the functional-integral formalism, also known as the Feynman path integral formalism or simply as the path integral formalism, pioneered by Richard Feynman,³⁷ and the functional differential formalism, also referred to as the Schwinger action principle, the quantum dynamical principle, or simply the action principle, developed by Julian Schwinger.³⁸ The so-called Bell’s inequality and the Bell test³⁹ will be analyzed appropriately in a quantum field theory setting,⁴⁰ in which computations of transition probabilities of basic processes may be unambiguously carried out and are in conformity with relativity. Supersymmetry⁴¹ will be considered in much detail.⁴² In a supersymmetric theory, to every known particle one associates a particle of opposite statistics (fermion ↔ boson) and of the same mass, referred to as a sparticle, and the underlying theory is symmetric under corresponding transformations, referred to as supersymmetry transformations. Since the observed fermions and bosons have different masses, this symmetry must be broken. One of the advantages that a supersymmetric theory based on the standard model has over its non-supersymmetric counterpart is that the underlying couplings unify at a higher energy ($\sim 10^{16}$ GeV), which gives one hope that one may be able to unify these three interactions with the gravitational one, which is expected to be relevant in particle interactions at the Planck energy $\sqrt{\hbar c^{5}/G} \simeq 1.221 \times 10^{19}$ GeV, or even less, where $G$ is the Newton gravitational constant, $c$ is the speed of light, and $\hbar$ is the Planck constant (divided by $2\pi$). String theory⁴³ is then developed, where the theory is described in terms of strings (and other extended objects). As a string moves in spacetime it generates a two-dimensional surface referred to as a worldsheet, and string theory is a QFT which operates on this two-dimensional worldsheet, with remarkable consequences. In particular, particles that are needed to describe the dynamics of elementary particles, such as the graviton, arise naturally in the mass spectra of oscillating strings and are not, a priori, assumed to exist or put in by hand in the underlying theories. The emergence of the graviton from string theory gives hope for the development of a consistent quantum theory of gravitation. In particular we learn that string theory re-invents the so-called Yang–Mills field theory,⁴⁴ on which the EW theory, QCD and GUTs are based, as well as Einstein’s theory of general relativity (GR). Unlike the successful field theories discussed above, we do not have at present an acceptable quantum theory of gravitation.

31 See Streater [137]. For the historical development of renormalization see the latter as well as Zeidler [157]. For other rather very specialized work, but of importance, on renormalization theory see Becchi, Rouet and Stora [9]; Veltman [146]; ’t Hooft [140]; see also references therein.
32 For the experiments concerning the observation of the Higgs boson signal, see Aad et al. [1] and Chatrchyan et al. [17]. See also the Nobel Lectures of Englert [33] and Higgs [58], and references therein.
33 Weisskopf [151], on page 7, 11th line from below, in “Growing up with field theory, and recent trends in particle physics”. The 1979 Bernard Gregory Lectures at CERN, 29 pages. CERN: Geneva.
34 See Martin and Glashow [90], p. 16. See also Johnson [63], p. 96.
35 Schwinger [121].
36 The legendary Victor Weisskopf in [152], p. 17, states that Schwinger was the first to suggest that the weak interactions should be interpreted as transmitted by boson fields, and that his original idea initiated an impressive development that culminated in the unification of electromagnetic and weak interactions.
37 Feynman [37–39, 41]. See also Dirac [21], Dewitt [18, 19].
38 Schwinger [120, 122, 124, 125, 127]. See also Johnson [62]; Lam [67]; Manoukian [77, 87].
39 See, e.g., Bell [11].
40 See, e.g., Manoukian and Yongram [80, 156].
41 Supersymmetry was introduced by Gol’fand and Likhtman [50].
42 A fairly detailed account of supersymmetry may be found in my book Manoukian [88].
Before going into the quantum nature of gravitation, the need necessarily arises to study gravitation itself in order to understand this problem and to see, in particular, how Einstein’s theory of gravitation, also referred to as general relativity (GR),⁴⁵ emerges. We consider this theory in all seriousness as the standard one because, as a classical theory, it has been remarkably successful and quite impressive in meeting the challenge of experiments over the years, for which Newton’s theory is not adequate. Although the gravitational coupling is much weaker than the other couplings, namely the electromagnetic, the weak and the strong ones, at presently available energies, one would hope that the strengths of these other interactions will be comparable to the strength of gravitation at very high energies of the order of the Planck energy $\sqrt{\hbar c^{5}/G} \simeq 1.221 \times 10^{19}$ GeV, or perhaps even at a lower energy, at which quantum gravity will be significant and a unified quantum field theory description of all the interactions in Nature, including gravitation, may emerge. Unlike all the other interactions we have described above, the gravitational one is a universal interaction experienced by all particles, massive or massless, due to their energy content. It plays a prominent role in describing Nature at large distances in our solar system and in describing the fate of the Universe, not to mention the unique role it plays right here on Earth in our everyday lives.⁴⁶ A central role in the development of GR by Einstein was played by the so-called Einstein Equivalence Principle (EEP). The principle states that: “At any point in spacetime one may set up locally a Lorentz frame such that all the (non-gravitational) laws of physics take on their special relativistic forms at the point in question”.
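The numerical value of the Planck energy quoted above can be checked directly from $\sqrt{\hbar c^{5}/G}$. The following is a small sketch; the constants are the CODATA 2018 values, which are an input assumed here and not given in the text, and the function name `planck_energy_gev` is hypothetical.

```python
import math

HBAR = 1.054571817e-34           # reduced Planck constant, J s (CODATA 2018)
C = 299_792_458.0                # speed of light in vacuum, m/s
G = 6.67430e-11                  # Newton gravitational constant, m^3 kg^-1 s^-2
JOULE_PER_GEV = 1.602176634e-10  # 1 GeV expressed in joules


def planck_energy_gev():
    """Return sqrt(hbar c^5 / G), converted from joules to GeV."""
    return math.sqrt(HBAR * C**5 / G) / JOULE_PER_GEV


print(f"Planck energy = {planck_energy_gev():.3e} GeV")
```

With these inputs the result agrees with the quoted $1.221 \times 10^{19}$ GeV to the digits shown, some three orders of magnitude above the unification scale of $\sim 10^{16}$ GeV mentioned for the other three interactions.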
For example, in generalizing Maxwell’s equations in the presence of gravitation, at any given point in spacetime one may set up a Lorentz frame such that, at the point in question, Maxwell’s equations are given by their special relativistic forms, which we will encounter in Chap. 14.⁴⁷ In order to develop the GR theory, we also need the Principle of General Covariance. This means that physical laws must take the same mathematical form in all coordinate systems, and hence that they are to be expressed in terms of vectors and tensors. Accordingly, physical laws expressed in different coordinate systems must take the same form, with just their variables, components, ... simply relabeled. This, in turn, requires that when one introduces the derivative of a vector, one must make sure that it defines a tensor, i.e., that it transforms as a tensor under coordinate transformations. The EEP and the principle of general covariance allow us to define covariant derivatives of vectors and of tensors, in general. A metric $g_{\mu\nu}$ may be introduced, in analogy to the Minkowski metric $\eta_{\mu\nu}$, which lowers the indices of vectors and tensors, and similarly one introduces the object $g^{\lambda\mu}$ as the inverse of $g_{\mu\nu}$: $g^{\lambda\mu} g_{\mu\nu} = \delta^{\lambda}{}_{\nu}$, with the latter equal to one if $\lambda = \nu$ and zero otherwise. One may then introduce a curvature of spacetime in analogy to the curvature of the surface of a sphere, with the latter surface representing a two-dimensional space. By making contact with Newton’s theory of gravitation, which is also reviewed in this book, Einstein’s field equations are readily derived. In particular, we will learn that with gravity one associates a curvature to describe the underlying geometry of spacetime, and in this sense gravity and curvature become simply interchangeable words for the same thing. We will see in Chap. 51 that the energy-momentum tensor, encompassing the energy and momentum properties of matter, becomes the source of Einstein’s field equations in the same way that the electromagnetic current, which encompasses the charges and their motions, is the source of Maxwell’s equations, as we will encounter in Chap. 14. We then carry out various applications. In particular, we consider gravitational waves, which have finally been observed recently.⁴⁸ The gravitational waves, as ripples in space, were detected by the Laser Interferometer Gravitational-Wave Observatory (LIGO); they were due to the merger of two black holes as they were 1.3 billion years ago, that is, at a distance from which light takes 1.3 billion years to reach the Earth. This detection happened 100 years after the prediction of gravitational waves by Einstein himself.⁴⁹ We have paid special attention to polarization aspects of the gravitational field which, unfortunately, have not been sufficiently emphasized in the literature, and are expected to become important in future studies and experiments. We then carry out applications to the perihelion precession of planets around the Sun, as well as to the deflection of light by gravitation, and study gravitational lensing, in which several images of a given star may be observed.⁵⁰ Moreover, we consider the application to the slowing down of a signal, as a time delay in the round-trip travel time of radar signals sent from the Earth and reflected back off other planets, in the gravitational field of the Sun.⁵¹ We consider as well, in much detail, an application to a beautiful experiment in which, in analogy to the electromagnetic effect of a rotating electrically charged body, which generates magnetism and acts on the spin of a particle, one considers the effect of the rotation of the Earth on a gyroscope in orbit around the Earth and studies the precession of its spin, originally set in a given direction. This effect cannot be accounted for by Newton’s theory of gravitation. Because of its analogy with the electromagnetic case, it is referred to as a gravitomagnetic effect as well as a frame-dragging effect.

43 A fairly detailed account of string theory may be found in my book Manoukian [88]. All the massless field excitations in the so-called bosonic string and superstring types have been investigated in Manoukian [84–86].
44 See Yang and Mills [155], Shaw [134].
45 Einstein [27, 28] [a translation of which may be found in the books: Lorentz et al. [73, 74]].
46 The reader need not have any background in GR, as everything will be derived from scratch in coming chapters.
47 For some tests of the equivalence principle see these useful references: Misner et al. [94], Sect. 38.6; Turyshev [144]; Will [154]; Di Casola et al. [20]; Arai et al. [7].
It was predicted as early as 1918 by two Austrian physicists, Hans Thirring and Josef Lense.⁵² The suggestion of putting a gyroscope in orbit around the Earth and using the precession of its spin to investigate the nature of the Earth’s gravitational field came from George Pugh⁵³ and Leonard Schiff⁵⁴ around 1960. This effect, and another one associated with the warping of space, referred to as the geodesic effect, which will also be discussed in detail in Chap. 67, have been measured accurately only recently.⁵⁵ That is, it took 51 years, from 1960 to 2011, until these two GR predictions were successfully tested experimentally. A systematic detailed treatment is given to Black Hole (BH) physics. A BH is a region of space into which matter has collapsed and out of which not even light may escape, hence the term “black”. Its inner region is bounded by a surface, referred to as the event horizon, which acts as a one-way surface for light, going in but not coming out, due to the powerful gravitational attraction within the horizon. The general relativistic result is that a BH is formed when a massive body of mass $M$ contracts to a radius of about, or less than, $2GM/c^{2}$, where $G$ is Newton’s gravitational constant and $c$ is the speed of light. Hence no particles or signals, not even light, can escape from the region inside a BH. The “devouring” of matter by a BH occurs only when matter reaches the horizon. Matter, however, may escape at distances large in comparison to this critical radius. The term “Black Hole” appeared explicitly in print in an article by Ann Ewing [36] in 1964. In spite of this, the term has been attributed to John Wheeler in 1967.⁵⁶ Earlier terms used for BHs included, e.g., “Dark Stars”: as early as the 1780s, Reverend John Mitchell,⁵⁷ a British natural philosopher, predicted the existence of “dark stars”.
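To get a feeling for the critical radius $2GM/c^{2}$ quoted above, the following sketch evaluates it for one solar mass. The value used for the solar mass and the CODATA value of $G$ are inputs assumed here and not given in the text, and the function name `schwarzschild_radius_m` is hypothetical.

```python
G = 6.67430e-11      # Newton gravitational constant, m^3 kg^-1 s^-2 (CODATA 2018)
C = 299_792_458.0    # speed of light in vacuum, m/s
M_SUN = 1.98847e30   # solar mass in kg (assumed reference value)


def schwarzschild_radius_m(mass_kg):
    """Return the critical radius 2 G M / c^2 in meters for a body of the given mass."""
    return 2.0 * G * mass_kg / C**2


radius_km = schwarzschild_radius_m(M_SUN) / 1.0e3
print(f"Critical radius for one solar mass: {radius_km:.2f} km")  # roughly 2.95 km
```

The critical radius is proportional to the mass, so a body of ten solar masses would have to contract to a radius of roughly 30 km.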
As mentioned above, gravitational waves have now been detected, as a consequence of the dynamics of the deformation of space around two merging black holes about $1.3 \times 10^{22}$ km from Earth,⁵⁸ and this also represented the first observation of a black hole merger. Direct images of a black hole and its vicinity have also been reported.⁵⁹ The mathematical description of BHs really began when the German physicist and astronomer Karl Schwarzschild⁶⁰ found his celebrated solution of the Einstein field equations almost immediately after Einstein formulated his GR theory in 1915/1916.⁶¹ This description arose from a singularity in (a zero in the denominator of) his metric $g_{\mu\nu}$ at the critical radius mentioned above, referred to as the Schwarzschild radius. The clearest impression I personally got of how Einstein himself felt about a BH came from seminars given on GR by the legendary Cornelius Lanczos in the School of Theoretical Physics at the Dublin Institute for Advanced Studies (DIAS)⁶² in the early 70s, while I was a research scholar there and Lanczos was about 80 years old.⁶³ In his thirties, Lanczos worked as an assistant to Einstein. Concerning BHs, he used to refer to Einstein stating that: “Even God does mistakes, he divided by zero”. We study the fall into a BH, and we consider in much detail static BHs, the so-called Schwarzschild Black Holes, as well as rotating⁶⁴ BHs, the so-called Kerr Black Holes. We also study the concept of a White Hole (WH), which is a consistent solution following from the mathematics of GR. In contrast to a BH, a WH “spews out” matter which, in turn, cannot get back in. We consider and study, in turn, the geometrodynamics of an Einstein–Rosen bridge,⁶⁵ also known as a wormhole, which connects a black hole and a white hole. The term “wormhole” seems to have been coined by Wheeler in the mid 1950s. The importance and the possibility of the existence of WHs was particularly emphasized by the Russian cosmologist Igor Novikov in the early 1960s.⁶⁶ We will study as well how energy may be extracted from a rotating BH by a process referred to as the Penrose process.⁶⁷ We consider the merger of two BHs and estimate the energy released, using in the process a theorem by Hawking which will also be established in the text.

48 Abbott et al. [2, 3]; Castelvecchi and Witze [16].
49 Einstein [29, 30].
50 For a survey of solar system tests, see Will [154]. For recent experiments on light deflection, see also Fomalont and Kopeikin [44]; Kopeikin and Fomalont [66]; Titov and Girdiuk [141].
51 See Shapiro [131, 132], Bertotti et al. [12].
52 Thirring and Lense [139]. It is reprinted in Ruffini and Sigismondi [109], and a translated version is given in Mashhoon, Hehl and Theiss [91].
53 Pugh [101].
54 Schiff [114, 115].
55 Everitt et al. [34, 35].
56 For a historical account of the term “Black Hole” see, e.g., Siegfried [133].
57 Mitchell [95].
58 See, e.g., Abbott et al. [2, 3].
59 See Akiyama et al. [4], Roelofs et al. [108]. See also Bouman et al. [14].
60 Schwarzschild [118, 119].
We also examine the thermodynamics of BHs and their quantum nature related to their evaporation by emission of radiation, studied in detail by Stephen Hawking.⁶⁸ We will see how string theory re-invents GR.⁶⁹ A supersymmetric extension of GR is also carried out, giving rise to supergravity. Finally, quantum aspects of gravitation are considered. First, we develop the theory of loop quantum gravity (LQG), which is a non-perturbative, background-independent formulation of quantum gravity in which space itself emerges from the theory, and which leads to a quantization of geometry with a smallest attainable value of an area of the order of the Planck length squared, $\sim 10^{-66}$ cm$^{2}$, giving rise to an interesting granular structure of space. This beautiful formulation of quantum gravity offers an intriguing description of space generated by so-called loops, which are introduced in the process, with nothing else in between. We finally consider GR in the light of the successful quantum field theories of interactions mentioned above, consisting of QED, QCD, EW and GUT, in a perturbative setting, to demonstrate the difficulty one encounters in introducing a consistent quantum gravity to any order of perturbation theory. Having been introduced to a myriad of quantum particles and their dynamics at high energies, and having developed Einstein’s very successful theory of gravitation (GR), we apply the underlying physics to study cosmology in the remaining part of the book. Cosmology is concerned with the structure and evolution of the Universe. The Universe contains structures on a wide range of scales, such as planets and stars, which make up parts of galaxies, clusters of bound galaxies, and even super-clusters. In the late 1990s, sufficient data had accumulated to infer that the Universe is not only expanding but that its expansion is also accelerating,⁷⁰ instead of slowing down as one would expect from the attractive nature of gravity.
In cosmology, one assumes that we do not occupy a privileged place in the Universe,⁷¹ that is, our place is in no way special and the Universe looks the same to every observer anywhere in it if local matter fluctuations are averaged out, an assumption referred to as the cosmological principle. As a leading approximation to the mathematical description of the Universe, it is assumed that, on very large scales, the Universe is homogeneous and isotropic, i.e., there is no preferred direction in space. The assumption that the Universe is smooth on large scales supports the cosmological principle, that there is no special point in the Universe. The assumption of isotropy and homogeneity of the Universe leads to the expression of a simple metric, which was derived by the American physicist and mathematician Howard Robertson and the British mathematician Arthur Walker in the mid 1930s,⁷² and by the Russian physicist Alexander Friedmann⁷³ in the early 1920s, concerning the curvature of space and the corresponding types of universes. This metric, as well as the resulting Friedmann equations which describe the possible evolutionary processes of the Universe, will be derived in the text. Recent data⁷⁴ support the idea that space is essentially flat. This, however, does not mean that spacetime is flat, as will become clear from the consideration of isotropy, which is the statement that there is no preferred direction in space, and from the observation that the Universe is expanding. We will see analytically that the present data support the results that the present age of the Universe is 13.8 billion years, and that the visible Universe, beyond which we cannot see at present, is roughly 94 billion lyr across, where 1 lyr = $9.460730 \times 10^{12}$ km. A key result in the entire subject of cosmology is the Hubble law,⁷⁵ which leads to the following observation. First we note that the separation distance between any pair of galaxies divided by their relative velocity gives the time it took for the pair to reach their present separation as the Universe expands. According to Hubble’s law this time is the same for any pair of galaxies. That is, they must all have been together some time in the remote past. This is the so-called Big Bang description of our Universe, according to which space itself was created and further expanded, with objects being much closer in the past, and then more space was created, pushing galaxies and other super-structures apart, as the Universe expanded.

61 Einstein [27, 28] [a translation of which may be found in the books: Lorentz et al. [73, 74]].
62 When it all began, in the early 40s, the very first two members of the School of Theoretical Physics were Erwin Schrödinger and Walter Heitler.
63 It is unfortunate that he had just submitted a paper for publication and the proofs of the accepted paper arrived just a few days after his death, while I was still at DIAS. This, in some sense, should encourage all readers not to give up physics at an early age.
64 See Kerr [64].
65 Einstein and Rosen [31].
66 See Frolov and Novikov [48].
67 Penrose [97].
68 See Hawking [56].
69 If you are a general relativist, it is certainly worth studying string theory.
70 See, e.g., Perlmutter [99], Schmidt [116], Riess [104], who were awarded the Nobel Prize.
71 A helpful analogy for visualizing that there is no center of the Universe and that no place in it is privileged is to compare space with the surface of an expanding balloon, as suggested by Arthur Eddington [25]. See also Hoyle [60]. Consider galaxies as points marked on the surface of a balloon. As the balloon inflates, the distances between the dots increase in the same way as the distances between the galaxies. All galaxies on the surface of the balloon are equivalent and none is special. One may then visualize the surface of the balloon as representing our three-dimensional space.
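The statement that, by the Hubble law, the separation divided by the relative velocity is the same for every pair of galaxies can be checked with a few lines of arithmetic. The sketch below assumes the law in the form $v = H_0 d$ with an illustrative value $H_0 = 70$ km/s per Mpc; this number, the unit conversions, and the function names are assumptions introduced here and are not taken from the text.

```python
KM_PER_MPC = 3.0857e19       # kilometers in one megaparsec
SECONDS_PER_YEAR = 3.15576e7 # seconds in one Julian year
H0 = 70.0                    # assumed Hubble parameter, km/s per Mpc


def recession_speed_km_s(distance_mpc, h0=H0):
    """Hubble law: relative speed of two galaxies separated by distance_mpc."""
    return h0 * distance_mpc


def separation_time_gyr(distance_mpc, h0=H0):
    """Separation divided by relative speed, in billions of years."""
    speed = recession_speed_km_s(distance_mpc, h0)
    seconds = distance_mpc * KM_PER_MPC / speed
    return seconds / SECONDS_PER_YEAR / 1.0e9


for d in (10.0, 100.0, 1000.0):
    print(f"d = {d:7.1f} Mpc   d / v = {separation_time_gyr(d):.2f} billion years")
```

The same time, about 14 billion years for the assumed $H_0$, comes out for every separation. This is only the simple estimate $1/H_0$; the age of 13.8 billion years quoted above follows from the full Friedmann equations, which take into account how the expansion rate changed over time.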
A piece of evidence for the Big Bang is the so-called (Big Bang) nucleosynthesis, in which the lightest elements, essentially hydrogen and helium-4, were produced out of subatomic particles during the first few minutes after the Big Bang as the Universe cooled down, in accord with current abundances of 76% of p and 24% of ⁴He. In the mid 1960s, Arno Penzias and Robert Wilson76 of Bell Laboratories discovered that a remnant of radiation exists today in the Universe with an absolute temperature of the order of 3 K; even earlier, in 1941, Andrew McKellar, a Canadian astronomer, had found evidence of the existence of such radiation.77 We will see that a Big Bang description shows that such radiation was produced when the Universe was about 380,000 years old and at a temperature of 3000 K. Moreover, as the Universe expanded, the wavelength of light was stretched, giving rise to the background radiation observed at present, referred to as the Cosmic Microwave Background (CMB) radiation, of about 3 K. When the Universe was much hotter than 3000 K, it was too hot for electrons to be bound into atoms. The mean free path of photons, that is, the average distance that a photon travels between successive scatterings by charged particles, was small compared to the distance a photon would travel during the characteristic expansion time of the Universe for that period, and the photons were coupled to the electrons and protons (photons do not discriminate between negative and positive charges: they interact with both). As it got cooler, electrons and protons slowed down and combined to form neutral atoms. When neutral atoms were predominantly formed, the photons became mere spectators: their chances of meeting charged particles diminished drastically, they decoupled, moved out of equilibrium with matter, and have streamed freely through space ever since.
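The stretching of wavelengths between 3000 K and about 3 K can be put into numbers. The following is a minimal sketch, assuming that the radiation keeps a black-body spectrum whose temperature falls in inverse proportion to the stretch of the wavelengths, and using Wien's displacement law (not invoked in the text) to locate the peak wavelength; the function names are hypothetical and the present temperature is rounded to 3 K as in the text.

```python
WIEN_B = 2.897771955e-3  # Wien displacement constant, in metre * kelvin

def stretch_factor(t_emitted, t_observed):
    """Factor by which wavelengths grew, assuming T is proportional to 1/wavelength."""
    return t_emitted / t_observed

def peak_wavelength(temperature):
    """Peak wavelength (metres) of a black body at the given temperature (kelvin)."""
    return WIEN_B / temperature

T_DECOUPLING = 3000.0  # kelvin, temperature quoted in the text
T_TODAY = 3.0          # kelvin, rounded present CMB temperature

stretch = stretch_factor(T_DECOUPLING, T_TODAY)
peak_then = peak_wavelength(T_DECOUPLING)
peak_now = peak_wavelength(T_TODAY)

print(f"stretch factor: {stretch:.0f}")
print(f"peak wavelength then: {peak_then * 1e9:.0f} nm")
print(f"peak wavelength now:  {peak_now * 1e3:.2f} mm")
```

Under these assumptions the peak moves from just under one micrometre, in the near infrared, to just under one millimetre, which is why the remnant radiation is observed today in the microwave range.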
These decoupled photons form a background of radiation filling space, referred to as the cosmic microwave background radiation, mentioned above, which is observed today. The visible Universe seems today to be the same in all directions around us on very large scales. The temperature variation δT/T we see across the sky is quite small, δT/T ≈ 10⁻⁵, and the temperature is nearly uniform for an observer at rest relative to the CMB radiation. Special attention will be given to this important property of the Universe in the book. The present, largely accepted, description of the Universe is the so-called “Inflationary Big Bang” one. Inflation was proposed in the early 1980s,78 with subsequent developments, as a transient process assumed to have occurred in the very early stages of the Universe, during which it experienced a gigantic exponential expansion. This is expected to have happened, for reasons that will follow, when the breakdown of GUT symmetry took place. One such reason, as will be discussed in Chap. 37, is that in the GUT era processes are generated which violate baryon number conservation.79 Once matter-antimatter asymmetry arose in the GUT era, such a gigantic rapid expansion would have widely separated the produced particles, preventing the reversal of such reactions and thus maintaining any baryon asymmetry as soon as it emerged; this asymmetry was then kept fixed from then on. This, 72 Robertson

[105–107]; Walker [148]. 73 Friedmann [45, 46]. 74 Tanabashi [138], Olive [96]. 75 Hubble [61], see also Mayall [92] for extensive references to Hubble’s work on our expanding Universe. 76 Penzias and Wilson [98]. 77 McKellar [93]. 78 See Starobinsky, sometimes referred to as the father of inflation, [135, 136]; Guth [52–54]; Guth and Kaiser [55]; Linde [71, 72]; Albrecht and Steinhardt [5]. See also Hoyle and Narlikar [59]. 79 Baryon numbers ±1 are assigned to baryons/antibaryons, and 0 to mesons and leptons.


in turn, may have generated the present-day excess of matter over antimatter in the Universe. This process is referred to as baryogenesis. A simple everyday analogy clarifies further the role of inflation in preventing the reversed reactions mentioned above. Suppose two office co-workers who were romantically involved split up. If they continue working in the same office, or even living in the same city, there is a chance that they might get back together in the future. To reduce the chances of this (reversing the initial process), you may send one of them to Siberia in Russia and the other to Fairbanks in Alaska. That is, you create a large space between them. Another reason inflation is expected to have occurred is that such a gigantic rapid expansion would have diluted the concentration of so-called monopoles,80 possible entities produced in the GUT era, consistent with the experimental fact that no monopoles have been observed. Inflation also solves the so-called Horizon Problem, which concerns the uniformity of the present temperature observed in regions in opposite directions of the sky: according to the standard Big Bang description, i.e., without invoking inflation, the emitting regions were too far apart to be able to exchange information in conformity with relativity, and yet they are seen to be in identical states.81 Another problem that inflation solves is the so-called Flatness Problem: according to the Big Bang description without inflation, even an almost spatially flat Universe at the early stages of its evolution would have produced a drastic deviation from flatness at the present stage, in contradiction with present observations.
Finally, we discuss the role of dark matter and dark energy in the above formalism and we re-consider general aspects of our Universe.

References
1. Aad, G., et al. (2012). Observation of a new particle in the search for the standard model Higgs boson with the ATLAS detector at LHC. Physics Letters B, 716, 1–29.
2. Abbott, B. P., et al. (2016). Observation of gravitational waves from a binary black hole merger. Physical Review Letters, 116, 061102.
3. Abbott, B. P., et al. (2016). Astrophysical implications of the binary black hole merger GW150914. Astrophysical Journal Letters, 818, L22.
4. Akiyama, K., et al. (2019). First M87 event horizon telescope results. I. The shadow of the supermassive black hole. The Astrophysical Journal Letters, 875, L1, 1–17.
5. Albrecht, A., & Steinhardt, P. J. (1982). Cosmology for grand unified theories with radiatively induced symmetry breaking. Physical Review Letters, 48, 1220–1223.
6. Alväger, T., et al. (1964). Measuring the velocity of light emitted by fast sources, using accelerated particles. Physics Letters, 12, 260–262.
7. Arai, S., Nitta, D., & Tashiro, H. (2016). Test of the Einstein equivalence principle with spectral distortions in the cosmic microwave background. Physical Review D, 94, 124048 (6 pp.).
8. Babcock, G. C., & Bergman, T. G. (1964). Determination of the constancy of the speed of light. Journal of the Optical Society of America, 54, 147–151.
9. Becchi, C., Rouet, A., & Stora, R. (1976). Renormalization of gauge theories. Annals of Physics, 98, 287–321.
10. Beckmann, P., & Mandics, P. (1965). Test of the constancy of the velocity of electromagnetic radiation in high vacuum. Radio Science Journal of Research, 69D, 623–628.
11. Bell, J. S. (2004). Speakable and unspeakable in quantum mechanics (2nd ed.). Cambridge: Cambridge University Press.
12. Bertotti, B., Iess, L., & Tortora, P. (2003). A test of general relativity using radio links with the Cassini spacecraft. Nature, 425, 374–376.
13. Bogoliubov, N. N., & Parasiuk, O. S. (1958). On the multiplication of propagators in quantum field theory. Acta Physica Mathematica, 97, 227–266. (Original German Title: Über die multiplikation der kausalfunctionen in der quantentheorie der felder.)
14. Bouman, K., et al. (2016). Computational imaging for VLBI image reconstruction. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 913–922).
15. Brecher, K. (1977). Is the speed of light independent of the source? Physical Review Letters, 39, 1051–1054. [Erratum: ibid., 1236.]
16. Castelvecchi, D., & Witze, A. (2016). Einstein’s gravitational waves found at last. Nature News, 2016, 19361.
17. Chatrchyan, S., et al. (2012). Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC. Physics Letters B, 716, 30–61.
18. DeWitt, B. (1964). Theory for radiative corrections for non-abelian gauge fields. Physical Review Letters, 12, 742–746.
19. DeWitt, B. (1967). Quantum theory of gravity. II. The manifestly covariant theory. Physical Review, 162, 1195–1239.
20. Di Casola, E., Liberati, S., & Sonego, S. (2015). Nonequivalence of equivalence principles. American Journal of Physics, 83, 39–49.
21. Dirac, P. A. M. (1933). The Lagrangian in quantum mechanics. Physikalische Zeitschrift der Sowjetunion, 3, 64–72.
22. Dirac, P. A. M. (2012). The principles of quantum mechanics (2012 ed.). La Vergne, Tennessee: www.bnpublishing.net.
23. Dyson, F. J. (1967). Ground-state energy of a finite system of charged particles. Journal of Mathematical Physics (NY), 8, 1538–1545.
24. Dyson, F. J., & Lenard, A. (1967). Stability of matter. I. Journal of Mathematical Physics (NY), 8, 423–434.
25. Eddington, A. (1988). The expanding universe. Cambridge: Cambridge University Press. [First published in 1933.]
80 A

monopole may carry an isolated magnetic pole, a north pole or a south pole, without a compensating south or north pole, respectively. 81 Once a colleague (Eduard Prugovečki from the University of Toronto) made the amusing remark: “How could they speak the same language if they have never met?”.

26. Einstein, A. (1905). Zur elektrodynamik bewegter Körper. Annalen der Physik, 17, 891–921.
27. Einstein, A. (1915). Die feldgleichungen der gravitation. Preussischen Akademie der Wissenschaften zu Berlin, Sitzungsberichte, 844–847.
28. Einstein, A. (1916). Die grundlage der allgemeinen relativitätstheorie. Annalen der Physik, 49, 769–822.
29. Einstein, A. (1916). Näherungsweise Integration der feldgleichungen der gravitation. Preussischen Akademie der Wissenschaften Berlin, Sitzungsberichte, 688–696.
30. Einstein, A. (1918). Über gravitationswellen. Preussischen Akademie der Wissenschaften Berlin, Sitzungsberichte, 154–167.
31. Einstein, A., & Rosen, N. (1935). The particle problem in the general theory of relativity. Physical Review, 48, 73–77.
32. Einstein, A., Lorentz, H. A., Weyl, H., & Minkowski, H. (1952). The principle of relativity. New York: Dover Publications.
33. Englert, F. (2014). The BEH mechanism and its scalar boson. Reviews of Modern Physics, 86, 843–850.
34. Everitt, C. W. F., et al. (2011). Gravity probe B: Final results of a space experiment to test general relativity. Physical Review Letters, 106, 22110.
35. Everitt, C. W. F., et al. (2015). The gravity probe B test of general relativity. Classical and Quantum Gravity, 32, 224001 (29 pp.).
36. Ewing, A. E. (1964). “Black holes” in space. Science News Letter for January 18, 1964, issue.
37. Feynman, R. P. (1948). Space-time approach to non-relativistic quantum mechanics. Reviews of Modern Physics, 20, 367–387.
38. Feynman, R. P. (1963). Quantum theory of gravitation. Acta Physica Polonica, 24, 697–722.
39. Feynman, R. P., & Hibbs, A. R. (1965). Quantum mechanics and path integrals. New York: McGraw-Hill.
40. Feynman, R. P. (1969). The behavior of hadron collisions at extreme energies. In Proceedings of the 3rd Topical Conference on High Energy Collisions. Stony Brook, New York: Gordon & Breach.
41. Feynman, R. P. (1972). The development of the space-time view of quantum electrodynamics. In Nobel Lectures, Physics 1963–1970. 11 Dec 1965. Amsterdam: Elsevier.
42. Feynman, R. P. (1982). The theory of fundamental processes (p. 1). Menlo Park, California: The Benjamin/Cummings Publishing Co., 6th Printing.
43. Figuora, H., & Gracia-Bondia, J. M. (2004). The uses of Connes and Kreimer’s algebraic formulation of renormalization. International Journal of Modern Physics A, 19, 2739–2754. arXiv:hep-th/0301015v2.
44. Fomalont, E. B., & Kopeikin, S. M. (2003). The measurement of the light deflection from Jupiter: Experimental results. Astrophysical Journal, 598, 704–711.
45. Friedmann, A. (1922). Über die krümmung des raumes. Zeitschrift für Physik, 10, 377–386. (English translation in: Friedman, A. (1999). On the curvature of space. General Relativity and Gravitation, 31, 1991–2000.)
46. Friedmann, A. (1924). Über die möglichkeit einer welt mit konstanter negativer krümmung des raumes. Zeitschrift für Physik, 21, 326–332. (English translation in: Friedmann, A. (1999). On the possibility of a world with constant negative curvature of space. General Relativity and Gravitation, 31, 2001–2008.)
47. Friedman, J. I., & Kendall, W. H. (1972). Deep inelastic electron scattering. Annual Review of Nuclear and Particle Science, 22, 203–254.
48. Frolov, V. P., & Novikov, I. D. (1998). Black hole physics: Basic concepts and new developments. Dordrecht: Kluwer.
49. Glashow, S. L. (1980). Towards a unified theory: Threads in a tapestry. Reviews of Modern Physics, 52, 539–543.
50. Gol’fand, A., & Likhtman, E. P. (1971). Extension of the Poincaré group generators and violation of P invariance. JETP Letters, 13, 323–326.
51. Gross, D. J. (2005). The discovery of asymptotic freedom and the emergence of QCD. Reviews of Modern Physics, 77, 837–849.
52. Guth, A. (1981). Inflationary universe: A possible solution to the horizon and flatness problems. Physical Review D, 23, 347–356.
53. Guth, A. (1984). The inflationary universe. Scientific American, 1, 34–60.
54. Guth, A. H. (1997). The inflationary universe: The quest for a new theory of cosmic origins. New York: Basic Books.
55. Guth, A., & Kaiser, D. I. (2005). Inflationary cosmology: Exploring the Universe from the smallest to the largest scales. Science, 307, 884–890.
56. Hawking, S. W. (1975). Particle creation by black holes. Communications in Mathematical Physics, 43, 199–220.
57. Hepp, K. (1966). Proof of the Bogoliubov-Parasiuk theorem of renormalization. Communications in Mathematical Physics, 2, 301–326.
58. Higgs, P. W. (2014). Evading the Goldstone theorem. Reviews of Modern Physics, 86, 851–853.
59. Hoyle, F., & Narlikar, J. V. (1964). On the avoidance of singularities in C-field cosmology. Proceedings of the Royal Society A, 278, 465–478.
60. Hoyle, F. (1968). The nature of the universe. Gretna, Louisiana: Pelican. (Reprint ed.).
61. Hubble, R. (1929). A relation between distance and radial velocity among extra-galactic nebulae. Proceedings of the National Academy of Sciences of the United States of America, 15, 168–173.
62. Johnson, K. (1968). 9th Latin American School of Physics, Santiago de Chile. In K. Johnson & I. Saavedra (Eds.), Solid state physics, and particle physics. New York: W. A. Benjamin.
63. Johnson, K. (1996). Julian Schwinger - Personal recollections. In Y. Jack Ng (Ed.), Julian Schwinger - The physicist, the teacher, and the man (p. 96). Singapore: World Scientific.
64. Kerr, R. P. (1963). Gravitational field of a spinning mass as an example of algebraically special metrics. Physical Review Letters, 11, 237–238.
65. Klein, M. J. (Ed.). (1959). Paul Ehrenfest: Collected scientific papers. Amsterdam: North-Holland.
66. Kopeikin, S. M., & Fomalont, E. B. (2007). Gravimagnetism, causality, and aberration of gravity in the gravitational light-ray deflection experiments. General Relativity and Gravitation, 39, 1583–1624.
67. Lam, C. S. (1965). Feynman rules and Feynman integrals for systems with higher-spin fields. Nuovo Cimento, 38, 1755–1765.
68. Lenard, A., & Dyson, F. J. (1968). Stability of matter. II. Journal of Mathematical Physics (NY), 9, 698–709.
69. Lieb, E. H., & Thirring, W. E. (1975). Bound for the kinetic energy of fermions which proves the stability of matter. Physical Review Letters, 35, 687–689. [Errata (1975), 35, 1116(E).]
70. Lieb, E. H. (1979). The N^{5/3} law for bosons. Physics Letters, 70A, 71–73.
71. Linde, A. D. (1982). A new inflationary universe scenario: A possible solution of the horizon, flatness, homogeneity, isotropy and primordial monopole problems. Physics Letters B, 108, 389–393.
72. Linde, A. D. (1985). Initial conditions for inflation. Physics Letters B, 162, 281–286.
73. Lorentz, H. A., Einstein, A., Minkowski, H., Weyl, H., & Sommerfeld, A. (1952). The principle of relativity. New York: Dover Publications.
74. Lorentz, H. A., Einstein, A., Minkowski, H., Weyl, H., & Sommerfeld, A. (1953). The meaning of relativity (5th ed.). Princeton: Princeton University Press.
75. Manoukian, E. B. (1979). Subtractions vs counterterms. Nuovo Cimento, 53A, 345–358.

76. Manoukian, E. B. (1983). Renormalization. New York: Academic.
77. Manoukian, E. B. (1986). Action principle and quantization of gauge fields. Physical Review D, 34, 3739–3749.
78. Manoukian, E. B., & Muthaporn, C. (2002). The collapse of “bosonic matter”. Progress of Theoretical Physics, 107, 927–939.
79. Manoukian, E. B., & Sirininlakul, S. (2004). Rigorous lower bounds for the ground state energy of matter. Physics Letters, 332, 54–59. [Errata (2004). 337A, 496(E).]
80. Manoukian, E. B., & Yongram, N. (2004). Speed dependent polarization correlations in QED and entanglement. European Physical Journal D, 31, 137–143.
81. Manoukian, E. B., & Sirininlakul, S. (2005). High density limit and inflation of matter. Physical Review Letters, 95(190402), 1–3.
82. Manoukian, E. B. (2006). Quantum theory: A wide spectrum. Dordrecht: Springer.
83. Manoukian, E. B., Muthaporn, C., & Sirininlakul, S. (2006). Collapsing stage of “bosonic matter”. Physics Letters, 352A, 488–490.
84. Manoukian, E. B. (2012). All the fundamental massless bosonic fields in bosonic string theory. Fortschritte der Physik, 60, 329–336.
85. Manoukian, E. B. (2012). All the fundamental bosonic massless fields in superstring theory. Fortschritte der Physik, 60, 337–344.
86. Manoukian, E. B. (2012). All the fundamental massless fermion fields in superstring theory: A rigorous analysis. Journal of Modern Physics, 3, 1027–1030.
87. Manoukian, E. B. (2016). Quantum field theory I: Foundations and abelian and non-abelian gauge theories. Switzerland: Springer.
88. Manoukian, E. B. (2016). Quantum field theory II: Introductions to quantum gravity, supersymmetry and string theory. Switzerland: Springer.
89. Manoukian, E. B. (2016). Do atomic electrons fall to the center of multi-electron atoms? Modern Physics Letters B, 30, 1650082 [8 p.].
90. Martin, P. C., & Glashow, S. L. (2008). Julian Schwinger 1918–1994: A biographical memoir. National Academy of Sciences. Washington, DC, Copyright 2008, p. 16.
91. Mashoon, B., Hehl, F. W., & Theiss, D. S. (1984). On the gravitational effects of rotating masses: The Thirring-Lense papers. General Relativity and Gravitation, 16, 711–750.
92. Mayall, N. U. (1970). Edwin Powell Hubble: A biographical memoir. National Academy of Sciences. Washington, D.C.
93. McKellar, A. (1941). Molecular lines from the lowest states of diatomic molecules composed of atoms probably present in interstellar space. Publications of the Dominion Astrophysical Observatory, Vancouver, B.C., Canada, 7, 251–272.
94. Misner, C. W., Thorne, K. S., & Wheeler, J. A. (1971). Gravitation. New York: W. H. Freeman and Company.
95. Mitchell, J. (1784). On the means of discovering the distance, magnitude, &c. of the fixed stars, in consequence of the diminution of the velocity of their light, in case such a diminution should be found to take place in any of them, and such other data should be procured from observations, as would be farther necessary for that purpose. Philosophical Transactions of the Royal Society, 74, 35–57.
96. Olive, K. A., et al. (2014). Particle data group. Chinese Physics C, 38, 090001.
97. Penrose, R. (1969). Gravitational collapse: The role of general relativity. Rivista del Nuovo Cimento, Numero Speziale, I, 252–276.
98. Penzias, A. A., & Wilson, R. W. (1965). A measurement of excess antenna temperature at 4080 Mc/s. Astrophysical Journal Letters, 142, 419–421.
99. Perlmutter, S. (2012). Measuring the acceleration of the cosmic expansion using supernovae. Reviews of Modern Physics, 84, 1127–1149.
100. Politzer, H. D. (2005). The dilemma of attribution. Reviews of Modern Physics, 77, 851–856.
101. Pugh, G. E. (1959). Proposal for a satellite test of the Coriolis prediction of general relativity. WSEG Research Memorandum No. 11, Weapons System Evaluation Group. The Pentagon, Washington, D.C. 25 (12 November 1959).
102. Ragulsky, V. V. (1997). Determination of light velocity dependence on direction of propagation. Physics Letters A, 235, 125–128.
103. Ri, J.-G., et al. (2017). Ground-to-satellite quantum teleportation. Nature, 549, 70–73.
104. Riess, A. G. (2012). My path to the accelerating universe. Reviews of Modern Physics, 84, 1165–1175.
105. Robertson, H. P. (1935). Kinematics and world structure. Astrophysical Journal, 82, 284–301.
106. Robertson, H. P. (1936). Kinematics and world structure II. Astrophysical Journal, 83, 187–201.
107. Robertson, H. P. (1936). Kinematics and world structure III. Astrophysical Journal, 83, 257–271.
108. Roelofs, F., et al. (2019). Simulations of imaging the event horizon of Sagittarius A* from space. Astronomy & Astrophysics, 625, A124 (19 p.).
109. Ruffini, R. J., & Sigismondi, B. C. (2003). Non-linear gravitodynamics: The Lense-Thirring effect (pp. 349–388). Singapore: World Scientific.
110. Salam, A. (1951). Divergent integrals in renormalizable field theories. Physical Review, 84, 426–431.
111. Salam, A. (1951). Overlapping divergences and the S-matrix. Physical Review, 82, 217–227.
112. Salam, A. (1980). Grand unification and fundamental forces. Reviews of Modern Physics, 52, 353–355.
113. Schaefer, B. E. (1999). Severe limits on variations of the speed of light with frequency. Physical Review Letters, 82, 4964–4966.
114. Schiff, L. I. (1960). Possible new experimental test of general relativity theory. Physical Review Letters, 4, 215–218.
115. Schiff, L. I. (1960). Motion of a gyroscope according to Einstein’s theory of gravitation. Proceedings of the National Academy of Sciences, 46, 871–882.
116. Schmidt, B. P. (2012). Acceleration expansion of the Universe through observations of distant supernovae. Reviews of Modern Physics, 84, 1151–1163.
117. Schrödinger, E. (1935). The present situation in quantum mechanics. Naturwissenschaften, 23, 807–812. (English translation: Trimmer, J. D. (1980). Proceedings of the American Philosophical Society, 124, 323–338.)
118. Schwarzschild, K. (1916). Über das gravitationsfeld eines massenpunktes nach der Einsteinschen theorie. Sitzungsberichte der Königlich-Preussischen Akademie der Wissenschaften, Berlin Kl, 189–196.
119. Schwarzschild, K. (1916). Über das gravitationsfeld einer kugel aus inkompressibler flüssigkeit nach der Einsteinschen theorie. Sitzungsberichte der Königlich-Preussischen Akademie der Wissenschaften, Berlin Kl, 424–434.
120. Schwinger, J. (1951). On the Green’s functions of quantized fields. I. Proceedings of the National Academy of Sciences, USA, 37, 452–455; (1953).
121. Schwinger, J. (1957). A theory of fundamental interactions. Annals of Physics (NY), 2, 407–434.
122. Schwinger, J. (1953). The theory of quantized fields. II, III. Physical Review, 91, 713–728, 728–740.
123. Schwinger, J. (Ed.). (1958). Selected papers on quantum electrodynamics. New York: Dover.
124. Schwinger, J. (1960). Unitary transformations and the action principle. Proceedings of the National Academy of Sciences, USA, 46, 883–897.
125. Schwinger, J. (1962). Exterior algebra and the action principle I. Proceedings of the National Academy of Sciences, USA, 58, 603–611.


126. Schwinger, J. (1972). Relativistic quantum electrodynamics. In Nobel Lectures, Physics 1963–1970, 11 Dec 1965. Amsterdam: Elsevier.
127. Schwinger, J. (1973). A report on quantum electrodynamics. In L. Mehra (Ed.), The physicist’s conception of nature. Dordrecht-Holland: D. Reidel Publishing Company.
128. Schwinger, J. (1991). Quantum kinematics and dynamics. Redwood City: Addison-Wesley.
129. Schwinger, J. (2001). Quantum mechanics: Symbolism of atomic measurements. Berlin: Springer.
130. Sen, P. (2007). You can’t see the atom. Published by BBC News: 23 July, 2007, News Front Page. UK.
131. Shapiro, I. I. (1964). Fourth test of general relativity. Physical Review Letters, 13, 789–791.
132. Shapiro, I. I., et al. (1971). Fourth test of general relativity: New radar result. Physical Review Letters, 26, 1132–1135.
133. Siegfried, T. (2013). 50 years later, it’s hard to say who named black holes. Science News for December 23, 2013, issue.
134. Shaw, R. (1955). The problem of particle types and other contributions to the theory of elementary particles. Ph.D. thesis, Cambridge University.
135. Starobinsky, A. A. (1980). A new type of isotropic cosmological models without singularity. Physics Letters B, 91, 99–102.
136. Starobinsky, A. A. (1982). Dynamics of phase transition in the new inflationary universe scenario and generation of perturbations. Physics Letters B, 117, 175–178.
137. Streater, R. F. (1985). Review of renormalization by E. B. Manoukian. Bulletin of London Mathematical Society, 17, 509–510.
138. Tanabashi, M., et al. (2018). Particle data group. Physical Review D, 98, 010001.
139. Thirring, H., & Lense, J. (1918). Über die wirkung rotierender ferner massen in der Einsteinschen gravitationstheorie. Physikalische Zeitschrift, 19, 156–163.
140. ’t Hooft, G. (2000). A confrontation with infinity. Reviews of Modern Physics, 72, 333–339.
141. Titov, O. A., & Girdiuk, A. A. (2015). The deflection of light induced by the Sun’s gravitational field and measured with geodesics VLBI. Proceedings of the Journées 2014 Meeting, Z. Malkin and N. Capitaine (Eds.), Pulkovo Observatory, St. Petersburg, Russia.
142. Tomonaga, S. (1972). Development of quantum electrodynamics: Personal recollection. In Nobel Lectures, Physics 1963–1970, 6 May 1966. Amsterdam: Elsevier.
143. Tomonaga, S. (Translator T. Oka). (1997). The story of spin. Chicago: University of Chicago Press.
144. Turyshev, S. G. (2008). Experimental tests of general relativity: Recent progress and future directions. Annual Review of Nuclear Science, 58, 207–248.
145. Vanyashin, V. S., & Terentyev, M. V. (1965). The vacuum polarization of a charged vector field. Soviet Physics JETP, 21, 375–380. (Original Russian version appeared in Zhurnal Experimental’noi i Teoreticheskoi Fiziki, 48, 565–569 (1965).)
146. Veltman, M. J. G. (2000). From weak interactions to gravitation. Reviews of Modern Physics, 72, 341–349.
147. Waddoups, R. O., Edwards, W. F., & Merrill, J. J. (1965). Experimental investigation of the second postulate of special relativity. Journal of the Optical Society of America, 55, 142–143.
148. Walker, A. G. (1937). On Milne’s theory of world-structure. Proceedings of the London Mathematical Society, Series 2, 42, 90–127.
149. Weinberg, S. (1980). Conceptual foundations of the unified theory of weak and electromagnetic interactions. Reviews of Modern Physics, 52, 515–523.
150. Weinberg, S. (1996). The quantum theory of fields II: Modern Applications. Cambridge: Cambridge University Press.
151. Weisskopf, V. F. (1979). Personal impressions of recent results in particle physics. CERN Ref. Th 2732, on page 7, 11th line from below. In “Growing up with field theory, and recent trends in particle physics”. The 1979 Bernard Gregory lectures at CERN (29 p.). Geneva: CERN.
152. Weisskopf, V. F. (1980). Growing up with field theory, and recent trends in particle physics. “The 1979 Bernard Gregory lectures at CERN” (29 p.). Geneva: CERN.
153. Wilczek, F. (2005). Asymptotic freedom: From paradox to paradigm. Reviews of Modern Physics, 77, 857–870.
154. Will, C. M. (2014). The confrontation between general relativity and experiment. Living Reviews in Relativity, 17, 4–117.
155. Yang, C. N., & Mills, R. L. (1954). Conservation of isotopic spin and isotopic gauge invariance. Physical Review, 96, 191–195.
156. Yongram, N., & Manoukian, E. B. (2013). Quantum field theory analysis of polarization correlations, entanglement and Bell’s inequality: Explicit processes. Fortschritte der Physik, 61, 668–684.
157. Zeidler, E. (2009). Quantum field theory II: Quantum electrodynamics (pp. 972–975). Berlin: Springer.
158. Zimmermann, W. (1969). Convergence of Bogoliubov’s method of renormalization in momentum space. Communications in Mathematical Physics, 15, 208–234.
159. Zurek, W. H. (1991). Decoherence and the transition from quantum to classical. Physics Today, 44, 36–44.

2

Sorting Out Selective Measurements: QM Re-invents Complex Numbers, Matrices, Operators and Inner-Product Spaces

In this chapter we learn how the formalism of quantum mechanics (QM) may be developed by considering filtering processes such as in the Stern–Gerlach experiment, in which a beam of particles of, say, spin 1/2 is split into two beams with spin orientations in opposite directions. By setting up a series of such apparatuses, we will see how QM re-invents complex numbers, matrices, operators and inner-product spaces, not to mention the concept of probabilities. From the possible values that a physical quantity under consideration may take on, one may, through a filtering process, select a special range of its values or select some of its particular values for further investigation. Such a process is referred to as a selective measurement. As mentioned above, for example, a beam of particles of spin 1/2 initially prepared with spin along the positive $z$-axis may be fed into a Stern–Gerlach apparatus which splits the beam into two components, with spins along the positive and negative directions of, say, a $\bar{z}$-axis making an angle $\theta$ with the $z$-axis, and one may select, say, only the components along the positive $\bar{z}$-axis by blocking the $-\bar{z}$ components: [Figure: axes $z$, $x$ and $\bar{z}$, $\bar{x}$, with the $\bar{z}$-axis at angle $\theta$ to the $z$-axis; beams labeled $+z$, $+\bar{z}$ and $-\bar{z}$.] We consider first the measurement of a physical quantity $A$ (also called an observable) which may take on values from a finite set of discrete real numbers $\{a, a', a'', \ldots\}$. Generalizations to physical quantities which may take on an infinite number of discrete values, or values from a continuous set, such as the position of a particle, will be dealt with later. In general, the measurement of another physical quantity $B$ may destroy the value assigned in a previous measurement of the physical quantity $A$, and the two quantities then cannot be measured simultaneously. In such a case $A$ and $B$ are said to be incompatible. Otherwise they are said to be compatible observables.
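How an intermediate measurement destroys a previously assigned value can be illustrated numerically for the spin 1/2 beam described above. The following is a minimal sketch, assuming the standard spin 1/2 result, not yet derived at this point of the text, that the state with spin up along an axis tilted by θ from the z-axis in the x–z plane has components (cos θ/2, sin θ/2) in the ±z basis, and that the fraction transmitted by a filter is the squared overlap of the two states; the function names are hypothetical.

```python
import math

def spin_up_along(theta):
    """Spin-1/2 'up' state along an axis tilted by theta from z (x-z plane).

    Assumed standard form: components (cos(theta/2), sin(theta/2)) in the
    (+z, -z) basis. The components are real for this choice of axis.
    """
    return (math.cos(theta / 2.0), math.sin(theta / 2.0))

def transmitted_fraction(state_in, state_filter):
    """Fraction of a beam in state_in passed by a filter selecting state_filter.

    Computed as the squared overlap; no complex conjugation is needed
    because both states have real components here.
    """
    amplitude = sum(f * s for f, s in zip(state_filter, state_in))
    return abs(amplitude) ** 2

theta = math.pi / 2            # z-bar axis perpendicular to the z-axis
up_z = spin_up_along(0.0)      # beam prepared with spin along +z
up_zbar = spin_up_along(theta)

# Repeating the +z selection directly: every system is transmitted.
direct = transmitted_fraction(up_z, up_z)

# Inserting a +z-bar filter in between: the +z assignment is disturbed.
through = (transmitted_fraction(up_z, up_zbar)
           * transmitted_fraction(up_zbar, up_z))

print(f"+z then +z:             {direct:.2f}")
print(f"+z, then +z-bar, then +z: {through:.2f}")
```

For perpendicular axes only a quarter of the beam survives the three filters, whereas without the intermediate filter the whole beam passes: spin components along different axes behave as incompatible observables in the sense defined above.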
For example, measuring the position of a particle after having determined its kinetic energy may cause uncontrollable changes in the particle’s kinetic energy. A subsequent measurement of the kinetic energy will then, in general, give a value different from the one initially measured. Accordingly, the position and kinetic energy of a particle are incompatible observables. To obtain the optimum information about a system one needs to introduce a complete set of compatible observables, say, $\{A_1, \ldots, A_k\}$. By this it is meant that any observable not belonging to this set and which is not a function of these

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_2


observables is incompatible with at least one of them. To simplify the notation, we will denote such a complete set of compatible observables $\{A_1, \ldots, A_k\}$ simply by $A$. Each of the values in $\{a, a', a'', \ldots\}$ given above will then, in general, stand for a $k$-tuplet of real numbers.

Through a filtering process, as in the Stern–Gerlach experiment, an ensemble of identical systems may be prepared to take on the value $a$, and is then said to be in state $a$, in a selective measurement of $A$. Corresponding to this selective measurement, we may introduce the symbol
$$\Lambda(a) = |a\rangle\langle a|, \tag{2.1}$$
to denote the operation which selects and transmits only those systems in state $a$. We may in turn consider successive operations of the selective measurement of the same observable $A$, defined by
$$\Lambda(a')\,\Lambda(a) = \Lambda(a')\,\langle a'|a\rangle, \tag{2.2}$$
where $\langle a'|a\rangle = \delta(a', a)$ is the numerical factor
$$\delta(a', a) = \begin{cases} 1, & \text{for } a' = a,\\ 0, & \text{for } a' \neq a,\end{cases} \tag{2.3}$$
with $\Lambda(a)\,0 = 0$ standing for the operation which accepts no system whatsoever. For $a' = a$, the second selective measurement, symbolized by $\Lambda(a')$, simply accepts and transmits 100% ($\delta(a', a) = 1$) of the systems prepared by the first selective measurement, symbolized by $\Lambda(a)$. One is naturally led to introduce the identity operation
$$1 = \sum_a |a\rangle\langle a|, \tag{2.4}$$

which simply accepts and transmits all systems, with no discrimination, in any of the states $a$ corresponding to all the values taken by $A$ (i.e., by the complete set of compatible observables $\{A_1, \ldots, A_k\}$).

Through two filtering processes of two consecutive sets of incompatible observables $A$ and $B$, $\Lambda(a)$ followed by $\Lambda(b)$, one may generate a $|b\rangle\langle a|$ type of operation which initially prepares systems in state $a$ and then, through another filtering process, transmits a sub-ensemble of systems in state $b$. Since only a fraction of the systems in state $a$ are expected to be finally transmitted through the $B$-filter, the operation $\Lambda(b)\Lambda(a)$, in analogy to (2.2), may be defined by
$$\Lambda(b)\,\Lambda(a) = |b\rangle\langle a|\,\langle b|a\rangle, \tag{2.5}$$
where $\langle b|a\rangle$ is a numerical factor introduced as a measure of the fraction of systems prepared initially in state $a$ that are finally transmitted through the $B$-filter in state $b$. For example, the following set of two Stern–Gerlach apparatuses

[Figure: a $z$-oriented Stern–Gerlach apparatus followed by a $\bar z$-oriented one; a beam with spin along $+z$ enters the second apparatus and a fraction emerges along $-\bar z$.]

represents an apparatus into which a particle is fed with spin along the $+z$-axis. As before, the $\bar z$-axis makes an angle $\theta$ with the $z$-axis. The apparatus may be represented by the successive selective measurement symbols
$$\Lambda(-\bar z)\,\Lambda(+z) = |-\bar z\rangle\langle +z|\;\langle -\bar z|+z\rangle, \tag{2.6}$$

where, in general, the numerical factor $\langle -\bar z|+z\rangle$ is non-zero. In the next chapter (Chap. 3, Box 3.1), we will learn how to compute the numerical factor $\langle -\bar z|+z\rangle$; it is explicitly given by $-\sin(\theta/2)$ which is, of course, non-zero in general. More elaborate successive selective measurements may be considered, e.g., of three sets of observables $A$, $B$, $C$, establishing in the process the following associative law of the measurement symbols:
$$\Lambda(c)\big[\Lambda(b)\Lambda(a)\big] = \Lambda(c)\,|b\rangle\langle a|\,\langle b|a\rangle = |c\rangle\langle a|\,\langle c|b\rangle\langle b|a\rangle = \big[\Lambda(c)\Lambda(b)\big]\Lambda(a). \tag{2.7}$$
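As a numerical illustration only, the product rule (2.5) and the associative law (2.7) can be checked by representing each measurement symbol as a 2×2 matrix. The sketch below assumes real spin-1/2 states along axes tilted from the $z$-axis, which anticipates the vector representation developed later in this chapter and in Box 3.1. The helper names (`ket`, `lam`, `product_rules_hold`) are hypothetical and not part of the text.

```python
import math

def ket(theta):
    """Real spin-1/2 state with spin 'up' along an axis tilted by theta from z (assumed form)."""
    return (math.cos(theta / 2), math.sin(theta / 2))

def inner(u, v):
    """Numerical factor <u|v> for real two-component states."""
    return u[0] * v[0] + u[1] * v[1]

def outer(u, v, scale=1.0):
    """Matrix representing scale * |u><v|."""
    return [[scale * u[i] * v[j] for j in range(2)] for i in range(2)]

def lam(s):
    """Measurement symbol Lambda(s) = |s><s|, Eq. (2.1)."""
    return outer(s, s)

def matmul(m, n):
    """Product m n; the right-hand factor acts first."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(m, n, tol=1e-12):
    """Entry-wise comparison of two 2x2 matrices."""
    return all(abs(m[i][j] - n[i][j]) < tol for i in range(2) for j in range(2))

def product_rules_hold(a, b, c):
    """Check Eq. (2.5) and the associative law, Eq. (2.7), for three states."""
    rule_2_5 = close(matmul(lam(b), lam(a)), outer(b, a, inner(b, a)))
    left = matmul(lam(c), matmul(lam(b), lam(a)))
    right = matmul(matmul(lam(c), lam(b)), lam(a))
    closed_form = outer(c, a, inner(c, b) * inner(b, a))
    return rule_2_5 and close(left, right) and close(left, closed_form)

print(product_rules_hold(ket(0.0), ket(0.7), ket(2.1)))
```

The three angles are arbitrary; any other choice should pass the same check, since the identities do not depend on the particular states.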

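The physical significance associated with the measurement symbols arises in the following manner. First note that inserting the identity operation associated with the observables $A$ in the second equality below gives rise to the following chain of equalities:
$$|c\rangle\langle b|\,\langle c|b\rangle = \Lambda(c)\Lambda(b) = \Lambda(c)\,1\,\Lambda(b) = \Lambda(c)\Big[\sum_a |a\rangle\langle a|\Big]\Lambda(b) = |c\rangle\langle b|\,\big[\langle c|a\rangle\langle a|b\rangle + \langle c|a'\rangle\langle a'|b\rangle + \cdots\big] = |c\rangle\langle b| \sum_a \langle c|a\rangle\langle a|b\rangle, \tag{2.8}$$
from which we may infer the following key identities:
$$\langle c|b\rangle = \sum_a \langle c|a\rangle\langle a|b\rangle, \tag{2.9}$$
$$\text{For } c \to b': \quad \delta(b', b) = \sum_a \langle b'|a\rangle\langle a|b\rangle, \quad \text{using Eq. (2.3)}, \tag{2.10}$$
$$\text{For } b' \to b: \quad 1 = \sum_a \langle b|a\rangle\langle a|b\rangle, \quad \text{using Eq. (2.3)}, \tag{2.11}$$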

relevant to the underlying physics of the analysis. We note that under arbitrary scale transformations associated with any two observables $A$ and $B$,
$$|a\rangle\langle b| \to |a\rangle\langle b|\,\frac{\lambda(b)}{\lambda(a)}, \quad \text{and} \quad \langle a|b\rangle \to \langle a|b\rangle\,\frac{\lambda(a)}{\lambda(b)}, \tag{2.12}$$
all the equations involving the selective measurements and successive selective measurements in Eqs. (2.1)–(2.8), as well as all the identities in Eqs. (2.9)–(2.11), remain invariant. Because of the arbitrariness under the above scale transformations, the numerical factor $\langle b|a\rangle$, although of physical interest as a measure of the fraction of those systems prepared in state $a$ that are transmitted through a $B$-filter in state $b$, cannot have a physical significance by itself. However, the combination
$$\langle a|b\rangle\,\langle b|a\rangle = p_a(b), \tag{2.13}$$

denoted by $p_a(b)$, is invariant under the above scale transformations. It is of great interest to note the following properties of the numerical factor $p_a(b)$:

(i) $p_a(a) = 1$ (see Eq. (2.3)), (2.14)
(ii) $p_a(b)$ is invariant under the scale transformations in Eq. (2.12), (2.15)
(iii) $p_a(b)$ is a measure of the fraction of systems, all in state $a$, to be found in state $b$ after the corresponding selective measurement, (2.16)
(iv) $p_a(b)$ satisfies the normalization condition $\sum_b p_a(b) = 1$ (see Eq. (2.11)). (2.17)
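The identity (2.9) and properties (i) and (iv) above can be checked numerically. The sketch below is an illustration under stated assumptions: the states $a$, $b$, $c$ are taken to be spin-1/2 states along three differently tilted axes, with real components of the form given in Box 3.1 of the next chapter. The names `basis`, `amp` and `p` are hypothetical helpers.

```python
import math

def basis(theta):
    """Orthonormal spin-1/2 pair (up, down) along an axis tilted by theta from z (assumed form)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [(c, s), (-s, c)]

def amp(u, v):
    """Numerical factor <u|v> for real two-component states."""
    return u[0] * v[0] + u[1] * v[1]

def p(a, b):
    """p_a(b) = <a|b><b|a>, Eq. (2.13)."""
    return amp(a, b) * amp(b, a)

# Three descriptions A, B, C: the tilt angles are arbitrary illustrative values.
A, B, C = basis(0.0), basis(0.9), basis(2.3)

def completeness_gap():
    """Largest deviation | <c|b> - sum_a <c|a><a|b> | over all c and b, Eq. (2.9)."""
    return max(abs(amp(c, b) - sum(amp(c, a) * amp(a, b) for a in A))
               for c in C for b in B)

def total_probability(a):
    """sum_b p_a(b), which property (iv) requires to equal 1."""
    return sum(p(a, b) for b in B)

print(completeness_gap() < 1e-12)                    # Eq. (2.9)
print(abs(p(A[0], A[0]) - 1.0) < 1e-12)              # property (i)
print(abs(total_probability(A[0]) - 1.0) < 1e-12)    # property (iv)
```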

Accordingly, $p_a(b)$ qualifies to represent the probability of observing a system in state $b$ which was in state $a$ prior to the $B$-measurement. A probability must be real and non-negative: $p_a(b) \ge 0$. Both these conditions are satisfied if $\langle a|b\rangle = \langle b|a\rangle^*$, where $*$ denotes complex conjugation, thus introducing complex numbers into the formalism. The probability $p_a(b)$ then becomes
$$p_a(b) = |\langle b|a\rangle|^2 \ge 0, \tag{2.18}$$
and the scale factors $\lambda(a)$ in Eq. (2.12) must then satisfy the relations
$$\lambda(a)^* = 1/\lambda(a) \;\Rightarrow\; \lambda(a) = e^{\,i\phi(a)}, \quad \text{i.e., it is a phase factor,} \tag{2.19}$$

with $\phi(a)$ denoting a real number. The numerical factor $\langle b|a\rangle$ is referred to as the amplitude for obtaining the value $b$ in a $B$-measurement on a system initially known to be in a state $a$ of an $A$-measurement. It is also referred to as the transformation function from the $A$-description to the $B$-description. Upon introducing the trace operation, defined for any numerical factor $\alpha$ by
$$\mathrm{Tr}\big[\alpha\,|a\rangle\langle b|\big] = \alpha\,\langle b|a\rangle, \tag{2.20}$$
the probability $p_a(b)$ in Eq. (2.18) may be simply rewritten as
$$p_a(b) = \mathrm{Tr}\big[\Lambda(b)\Lambda(a)\big] = \mathrm{Tr}\big[\Lambda(a)\Lambda(b)\big] = \langle b|a\rangle\langle a|b\rangle \equiv \langle b|\Lambda(a)|b\rangle, \tag{2.21}$$

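noting that $\Lambda(a)\Lambda(b) = \langle a|b\rangle\,|a\rangle\langle b|$ with $\alpha = \langle a|b\rangle$, as well as $\langle a|b\rangle = \langle b|a\rangle^*$. An immediate application of the above expression for the probability is to the experimental situation in which a spin 1/2 particle is initially prepared with spin along the positive $z$-axis, and a selective measurement is then made to detect its spin along the $-\bar z$ direction, for the following system:

[Figure: a $z$-oriented Stern–Gerlach apparatus followed by a $\bar z$-oriented one; a beam with spin along $+z$ enters the second apparatus and a fraction emerges along $-\bar z$.]

This probability is clearly given by (see below Eq. (2.6))
$$p_{+z}(-\bar z) = \mathrm{Tr}\big[\Lambda(-\bar z)\Lambda(+z)\big] = \mathrm{Tr}\big[\,|-\bar z\rangle\langle +z|\;\langle -\bar z|+z\rangle\big] = |\langle -\bar z|+z\rangle|^2 = \sin^2(\theta/2), \tag{2.22}$$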


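where, as mentioned below Eq. (2.6), we learn in the next chapter how to compute the numerical factor $\langle -\bar z|+z\rangle$, which is given by $-\sin(\theta/2)$. From Eqs. (2.18), (2.21) we note that $p_a(b) = p_b(a)$, as a property of complex conjugation. This suggests considering the reversal of a sequence of selective measurements, called the adjoint transformation, denoted by $\dagger$, and defined as follows:
$$\big(|b\rangle\langle a|\big)^\dagger = |a\rangle\langle b|, \qquad \langle d|b\rangle^\dagger = \langle d|b\rangle^*, \tag{2.23}$$
with the latter following from
$$|a\rangle\,\langle d|b\rangle^\dagger\,\langle c| = \big(|c\rangle\langle d|\;|b\rangle\langle a|\big)^\dagger = |a\rangle\langle b|\;|d\rangle\langle c| = |a\rangle\,\langle d|b\rangle^*\,\langle c|. \tag{2.24}$$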
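The probability in Eq. (2.22), the symmetry $p_a(b) = p_b(a)$, and the invariance under the phase transformations of Eq. (2.19) can be illustrated with a short numerical sketch. It assumes the explicit two-component spin states quoted in Box 3.1 of the next chapter. The function names and the chosen angle and phases are illustrative assumptions.

```python
import cmath
import math

def spin_up(theta):
    """State with spin along +N, where N is tilted by theta from z (form assumed from Box 3.1)."""
    return (math.cos(theta / 2), math.sin(theta / 2))

def spin_down(theta):
    """State with spin along -N."""
    return (-math.sin(theta / 2), math.cos(theta / 2))

def amp(bra, ket):
    """Amplitude <bra|ket>; the bra components are complex conjugated."""
    return sum(complex(x).conjugate() * y for x, y in zip(bra, ket))

def prob(a, b):
    """p_a(b) = |<b|a>|^2, Eq. (2.18)."""
    return abs(amp(b, a)) ** 2

def rephase(state, phi):
    """Multiply a state by the phase factor exp(i*phi), cf. Eq. (2.19)."""
    return tuple(cmath.exp(1j * phi) * x for x in state)

theta = 1.1                      # arbitrary illustrative angle
plus_z = spin_up(0.0)
minus_zbar = spin_down(theta)

print(prob(plus_z, minus_zbar), math.sin(theta / 2) ** 2)   # Eq. (2.22)
print(prob(minus_zbar, plus_z))                              # p_b(a) = p_a(b)
print(prob(rephase(plus_z, 0.4), rephase(minus_zbar, -1.3))) # unchanged by phases
```

The three printed probabilities should agree to rounding error, while the amplitude itself changes under `rephase`. This is why only the combination in Eq. (2.13) carries physical meaning.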

Elementary probability theory then says that the expectation value of a physically measured quantity $B$, for systems initially prepared in state $a$, is obtained by multiplying $p_a(b)$ by $b$ and summing over all possible values $b$, i.e.,
$$\sum_b b\,p_a(b) = \sum_b b\,\mathrm{Tr}\big[\Lambda(b)\Lambda(a)\big] \equiv \langle B\rangle_a, \tag{2.25}$$

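where we have introduced the notation $\langle B\rangle_a$ for the above expectation value. Equation (2.25) suggests introducing the object
$$B = \sum_b b\,\Lambda(b) = \sum_b b\,|b\rangle\langle b|, \quad \text{and hence writing} \quad \langle B\rangle_a = \mathrm{Tr}\big[B\,\Lambda(a)\big], \tag{2.26}$$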

as a linear combination of $B$-selective measurement symbols. For simplicity of notation, we have used the same symbol $B$ in Eqs. (2.25), (2.26) as for the physical quantity it represents. The first expression in Eq. (2.26) suggests, in turn, introducing more general objects as the following linear combinations:
$$M = \sum_{b,\,b'} \langle b|M|b'\rangle\,|b\rangle\langle b'|. \tag{2.27}$$

This may represent an apparatus which selects systems in states $b'$, such as a set of particles with given momenta, and then, after the selected systems in state $b'$ go through some processes, such as collisions of the underlying particles, transmits systems emerging in states $b$, such as particles finally emerging in the process with some given final momenta, all depending on the numerical expression $\langle b|M|b'\rangle$.¹ Some properties satisfied by such general linear combinations are spelled out in Box 2.1, leading to the generation of matrices and matrix multiplication rules. One may also write the symbols $M$ in (2.27) in a mixed representation as follows:
$$M = \sum_{a,\,b} \langle b|M|a\rangle\,|b\rangle\langle a|, \tag{2.28}$$

leading to matrix multiplication rules similar to those given in Box 2.1. In this latter case, systems initially in a state specified by a given value $a$ of a physical quantity $A$ characteristic of the system are fed into a machine (apparatus) which produces some final state, represented by $|\psi\rangle$ and given by
$$|\psi\rangle = \sum_b |b\rangle\,\langle b|M|a\rangle, \quad \text{and we may simply write} \quad M = |\psi\rangle\langle a|. \tag{2.29}$$
The above operation of the machine (apparatus) may then be conveniently expressed as
$$M\,|a\rangle = |\psi\rangle, \qquad |a\rangle \xrightarrow{\;M\;} |\psi\rangle. \tag{2.30}$$

¹ For such an explicit apparatus, referred to as a Ramsey apparatus, with application to a spin 1/2 particle moving in a series of three magnetic fields switched on consecutively in time, see pp. 13, 14 and Sect. 8.8 of Manoukian [4]. For Ramsey's experiments, see, e.g., Ramsey's Nobel Lecture: Ramsey [6].

Box 2.1 Matrices and matrix multiplication rules follow

In general $M_1 M_2 = \sum_{b,\,b''} \langle b|(M_1 M_2)|b''\rangle\,|b\rangle\langle b''|$. Also, from (2.27) for $M_1$, $M_2$ taken separately,
$$M_1 M_2 = \sum_{b,\,b''} \Big[\sum_{b'} \langle b|M_1|b'\rangle\langle b'|M_2|b''\rangle\Big]\,|b\rangle\langle b''|,$$
from which we obtain the expected result of matrix multiplication
$$\langle b|M_1 M_2|b''\rangle = \sum_{b'} \langle b|M_1|b'\rangle\langle b'|M_2|b''\rangle.$$
The following identities are easily established:
$$\mathrm{Tr}\,M = \sum_b \langle b|M|b\rangle, \qquad M^\dagger = \sum_{b,\,b'} \langle b|M|b'\rangle^*\,|b'\rangle\langle b|, \qquad \langle b|M^\dagger|b'\rangle = \langle b'|M|b\rangle^*.$$
In a mixed description $M = \sum_{a,\,b} \langle b|M|a\rangle\,|b\rangle\langle a|$, we obviously have
$$M = \sum_{a,\,b} \langle a|M|b\rangle\,|a\rangle\langle b|, \qquad M = \sum_{a,\,b,\,b'} \langle b|M|b'\rangle\langle b'|a\rangle\,|b\rangle\langle a|,$$
$$\langle b|M|a\rangle = \mathrm{Tr}\big[M\,|a\rangle\langle b|\big], \qquad \langle a|M^\dagger|b\rangle = \langle b|M|a\rangle^*.$$
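The spectral form (2.26) and the two expressions for the expectation value, Eqs. (2.25) and (2.26), can be compared numerically. As an assumed example that is not worked out in the text at this point, take $B$ to be the spin component along the tilted axis of Box 3.1, with eigenvalues $b = \pm 1$, and a system prepared in state $+z$; both expressions should then give $\cos\theta$. All function names below are illustrative.

```python
import math

def basis(theta):
    """Eigenstates (b = +1, b = -1) of the spin component along the tilted axis (assumed form)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [(c, s), (-s, c)]

def outer(u, v, scale=1.0):
    """Matrix representing scale * |u><v|."""
    return [[scale * u[i] * v[j] for j in range(2)] for i in range(2)]

def add(m, n):
    return [[m[i][j] + n[i][j] for j in range(2)] for i in range(2)]

def matmul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

def observable(theta):
    """B = sum_b b |b><b| with b = +1, -1, Eq. (2.26)."""
    up, down = basis(theta)
    return add(outer(up, up, +1.0), outer(down, down, -1.0))

def expectation_by_trace(theta, a):
    """<B>_a = Tr[B Lambda(a)], second form in Eq. (2.26)."""
    return trace(matmul(observable(theta), outer(a, a)))

def expectation_by_sum(theta, a):
    """<B>_a = sum_b b p_a(b), Eq. (2.25)."""
    up, down = basis(theta)
    p_up = (up[0] * a[0] + up[1] * a[1]) ** 2
    p_down = (down[0] * a[0] + down[1] * a[1]) ** 2
    return (+1.0) * p_up + (-1.0) * p_down

theta = 0.8
plus_z = (1.0, 0.0)
print(expectation_by_trace(theta, plus_z), expectation_by_sum(theta, plus_z), math.cos(theta))
```

The matrix returned by `observable(theta)` has entries $\cos\theta$, $\sin\theta$, $\sin\theta$, $-\cos\theta$, as expected for the spin component along that axis in terms of the Pauli matrices of Box 3.1.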

The symbol $|\psi\rangle$ in (2.29) is a linear combination of the symbols $|b\rangle$, with numerical coefficients $\langle b|M|a\rangle$, and the symbols $|b\rangle$ acquire a mathematical significance as vectors generating a vector space whose dimensionality is directly obtained from the associated observable $B$ (see (2.26)) representing a complete set of compatible observables, say, $B = \{B_1, \ldots, B_k\}$ with $b = \{b_1, \ldots, b_k\}$. The dimensionality of the generated vector space coincides with the number of different vectors that one may define as $b_1, \ldots, b_k$ consistently take on their allowed real physical values. It is often convenient, but not always so, to use the notation $|b\rangle$ as well for the corresponding vector representation. The 0 vector in this vector space is associated with a measurement producing no systems at all. To define a vector space, we also have to see how the addition of such vectors as $|\psi\rangle$ arises.

To this end, consider the consecutive application of two machines $M_0 M$, i.e., $M$ followed by $M_0$, where $M$ is defined in (2.28) and
$$M_0 = \sum_{c\,\in\,\Delta} |c\rangle\langle c|, \tag{2.31}$$
corresponding to a $C$-filtering machine which selects $c$ values in some range $\Delta$ of values. This leads to
$$M_0 M = \sum_{c\,\in\,\Delta} \sum_b |c\rangle\langle c|b\rangle\,\langle b|M|a\rangle\,\langle a|. \tag{2.32}$$

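Upon introducing a state
$$|\varphi_b\rangle = \sum_{c\,\in\,\Delta} |c\rangle\langle c|b\rangle, \tag{2.33}$$
it gives rise to the state
$$|\chi\rangle = \sum_b |\varphi_b\rangle\,\langle b|M|a\rangle, \quad \text{for which} \quad M_0 M\,|a\rangle = |\chi\rangle. \tag{2.34}$$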


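The first equality in (2.34) provides a linear superposition of vectors $|\varphi_b\rangle$ in the underlying vector space generated by the vectors $|b\rangle$, thus consistently leading to the generation of a vector space. The zero vector in the underlying vector space may be defined by carrying out a selective measurement, via the symbol $\Lambda(c')$ with $c' \notin \Delta$, on the state $|\varphi_b\rangle$: $\Lambda(c')|\varphi_b\rangle = 0$.

The probability that the physical quantity $B$, characteristic of the system, takes the value $b$ if the system is in state $|\psi\rangle$ is, as before,
$$p_\psi(b) = \mathrm{Tr}\big[\Lambda(\psi)\Lambda(b)\big] = |\langle b|\psi\rangle|^2 = \langle b|\Lambda(\psi)|b\rangle, \quad \text{provided} \quad \sum_b |\langle b|\psi\rangle|^2 = 1. \tag{2.35}$$
Also
$$\Lambda(b)\,|\psi\rangle \equiv |b\rangle\langle b|\psi\rangle = |b\rangle\langle b|M|a\rangle, \quad \text{i.e.,} \quad \langle b|M|a\rangle = \langle b|\psi\rangle \equiv \psi(b). \tag{2.36}$$
Hence, from (2.29),
$$|\psi\rangle = \sum_b |b\rangle\,\psi(b), \tag{2.37}$$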

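where we have used the notation $\psi(b)$ for $\langle b|\psi\rangle$. On the other hand, upon application of a selective measurement $\Lambda(a)$, of a physical quantity $A$ characteristic of the system in state $|\psi\rangle$, one obtains
$$\Lambda(a)\,|\psi\rangle = \sum_b |a\rangle\langle a|b\rangle\,\psi(b) = |a\rangle\,\psi(a), \quad \text{where} \quad \psi(a) = \sum_b \langle a|b\rangle\,\psi(b), \tag{2.38}$$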

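and the latter equation gives rise to a transformation of the system from the $B$-description to the $A$-description, with $\langle a|b\rangle$, as mentioned before, denoting the corresponding transformation function. An interesting application is the machine $M_1$ which produces a state $|\varphi\rangle$ from a system in state $|a\rangle$ as follows:

[Figure: machine $M_1$; a system in state $a$ passes a $B$-filter transmitting the states $b$ and $b'$, followed by an $A$-filter transmitting the state $a$.]

$$M_1 = \Lambda(a)\big[\Lambda(b) + \Lambda(b')\big], \qquad M_1\,|a\rangle = |\varphi\rangle, \tag{2.39}$$
$$|\varphi\rangle = |a\rangle\,\big[\,|\langle a|b\rangle|^2 + |\langle a|b'\rangle|^2\,\big], \tag{2.40}$$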

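and the probability that the system ends up in state $a$ again is given by
$$|\langle a|\varphi\rangle|^2 = \big[\,|\langle a|b\rangle|^2 + |\langle a|b'\rangle|^2\,\big]^2, \quad \text{(see Eq. (2.35))}, \tag{2.41}$$
which is not equal to $\big[|\langle a|b\rangle|^2\big]^2 + \big[|\langle a|b'\rangle|^2\big]^2$, as one may naïvely expect. The difference between the correct expression in (2.41) and the latter,
$$\big[\,|\langle a|b\rangle|^2 + |\langle a|b'\rangle|^2\,\big]^2 - \Big(\big[|\langle a|b\rangle|^2\big]^2 + \big[|\langle a|b'\rangle|^2\big]^2\Big) = 2\,|\langle a|b\rangle|^2\,|\langle a|b'\rangle|^2, \tag{2.42}$$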


is referred to as an interference term coming from the two alternative states $|b\rangle$ and $|b'\rangle$ that the system prepared in $|a\rangle$ may go into before emerging in state $|a\rangle$ again. Detailed studies of the nature of such interference terms, with specific applications, will be carried out in the following chapter.

We are now led to consider the measurement symbol $|\psi_1\rangle\langle\psi_2|$, the trace of which leads to the inner product $\langle\psi_2|\psi_1\rangle$:
$$\mathrm{Tr}\big[\,|\psi_1\rangle\langle\psi_2|\,\big] = \langle\psi_2|\psi_1\rangle = \sum_b \psi_2^*(b)\,\psi_1(b), \tag{2.43}$$
where, from (2.37),
$$|\psi_1\rangle = \sum_b |b\rangle\,\psi_1(b), \qquad \langle\psi_2| = \sum_b \psi_2^*(b)\,\langle b|. \tag{2.44}$$

Evidently, the probability that a system prepared in state $|\psi_1\rangle$ is found in state $|\psi_2\rangle$ is simply given by
$$p_{\psi_1}(\psi_2) = |\langle\psi_2|\psi_1\rangle|^2. \tag{2.45}$$
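Equations (2.38), (2.43) and (2.45) can be illustrated together: the inner product computed from the components $\psi(b)$ in the $B$-description should agree with the one computed from the components $\psi(a)$ obtained through the transformation function $\langle a|b\rangle$. The sketch below assumes the spin-1/2 transformation function of Box 3.1 in the next chapter. The two sample states and all function names are arbitrary illustrative choices.

```python
import math

def transformation(theta):
    """Matrix of <a|b>: rows a = +z, -z; columns b = +zbar, -zbar (values assumed from Box 3.1)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def to_a_description(psi_b, theta):
    """psi(a) = sum_b <a|b> psi(b), Eq. (2.38)."""
    t = transformation(theta)
    return [sum(t[i][j] * psi_b[j] for j in range(2)) for i in range(2)]

def inner(psi2, psi1):
    """<psi2|psi1> = sum_b psi2*(b) psi1(b), Eq. (2.43)."""
    return sum(complex(x).conjugate() * y for x, y in zip(psi2, psi1))

def transition_probability(psi1, psi2):
    """p_{psi1}(psi2) = |<psi2|psi1>|^2, Eq. (2.45)."""
    return abs(inner(psi2, psi1)) ** 2

theta = 0.9
psi1_b = [0.6, 0.8j]                      # normalized: 0.36 + 0.64 = 1
psi2_b = [(1 + 1j) / 2, (1 - 1j) / 2]     # normalized: 0.5 + 0.5 = 1
psi1_a = to_a_description(psi1_b, theta)
psi2_a = to_a_description(psi2_b, theta)

print(inner(psi2_b, psi1_b), inner(psi2_a, psi1_a))   # same in both descriptions
print(transition_probability(psi1_b, psi2_b))
```

For these sample components the inner product is $-0.1 + 0.1i$ in either description, so the transition probability is 0.02 up to rounding.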

Thus we have a vector space generated by the vectors $|b\rangle$, on which the inner product $\langle\psi_2|\psi_1\rangle$ of vectors is given in (2.43). In particular, we note some of the properties of inner products as given in Box 2.2. Equation (2.26) for the observable $B$ gives rise to the eigenvalue equation
$$B\,|b\rangle = b\,|b\rangle, \quad B = \{B_1, \ldots, B_k\} \text{ complete set of compatible observables}, \quad b = \{b_1, \ldots, b_k\}, \tag{2.46}$$
$$B_i\,|b\rangle = b_i\,|b\rangle, \quad \text{simultaneously for } i = 1, \ldots, k, \tag{2.47}$$
$$\langle b|b'\rangle = \delta(b, b') = \prod_i \delta(b_i, b_i'), \quad \text{and} \quad B^\dagger = B, \tag{2.48}$$

where the condition $B^\dagger = B$ is referred to as a Hermiticity condition, with $B$ referred to as a Hermitian operator in the vector space generated by the vectors $|b\rangle$. The fact that the eigenvalue equations $B_i|b\rangle = b_i|b\rangle$ are satisfied simultaneously means that there is only one vector $|b_1, \ldots, b_k\rangle$, up to a phase factor, corresponding to a given set $\{b_1, \ldots, b_k\}$, and that the eigenvalue $b$ for a complete set of compatible observables $B$ is non-degenerate. The totality of all the values taken by the eigenvalue $b = \{b_1, \ldots, b_k\}$ is referred to as the spectrum of the Hermitian operator $B$. Moreover, the decomposition of the Hermitian operator in Eq. (2.26) as a linear combination of $B$-selective measurement symbols, multiplied by $b$, is referred to as a spectral decomposition of the operator $B$. The vector space on which the inner product $\langle\psi_2|\psi_1\rangle$ in (2.43) is defined is referred to as an inner-product space. A vector such as $|b\rangle$, and $\langle b|$, referred to as its dual, are popularly referred to as a ket and a bra, respectively.

In quantum physics, one generally deals not only with finite-dimensional inner-product spaces but also with infinite-dimensional ones, as well as with cases in which eigenvalues may take a continuous set of values. To this end, the concept of a (so-called separable) Hilbert space is defined in Box 2.3 to take such general cases into consideration as well. A particularly simple example of a Hilbert space² is the space of square-integrable functions, denoted by $L^2(\mathbb{R}^3)$, associated with a particle of spin zero, such that for any functions $f_1(\mathbf{x})$, $f_2(\mathbf{x})$ in $L^2(\mathbb{R}^3)$, $\mathbf{x}$ in $\mathbb{R}^3$,
$$\langle f_1|f_2\rangle = \int_{\mathbb{R}^3} d^3\mathbf{x}\; f_1^*(\mathbf{x})\,f_2(\mathbf{x}), \qquad \|f_i\|^2 = \int_{\mathbb{R}^3} d^3\mathbf{x}\; |f_i(\mathbf{x})|^2 < \infty, \quad i = 1, 2. \tag{2.49}$$

² For rigorous studies of Hilbert space concepts see, e.g., Prugovečki [5].


The objects $\langle f_1|$, as mentioned earlier, are referred to as dual vectors, and $\|f\|$ in (2.49) defines the norm of the vector $|f\rangle$. In later chapters, we will introduce functions consistent with the Pauli exclusion principle,³ to describe multi-electron atoms and matter in general, for which no two electrons occupy the same state.

Box 2.2 Some properties of inner products

$\langle\psi|\psi\rangle = \sum_b |\psi(b)|^2 \ge 0$, $\;\langle\phi|\psi\rangle^* = \langle\psi|\phi\rangle$, $\;\langle\alpha\phi|\psi\rangle = \alpha^*\langle\phi|\psi\rangle$, $\;\langle\phi|\alpha\psi\rangle = \alpha\langle\phi|\psi\rangle$ for any complex number $\alpha$, and $\langle\phi|(\psi_1 + \psi_2)\rangle = \langle\phi|\psi_1\rangle + \langle\phi|\psi_2\rangle$. For the meaning of $|(\psi_1 + \psi_2)\rangle$ see (2.34).

The norm of a vector $|\psi\rangle$ is defined by $\|\psi\| = \sqrt{\langle\psi|\psi\rangle}$, which should be normalized to one for a probabilistic interpretation. The following two inequalities, referred to, respectively, as the Cauchy–Schwarz inequality and the triangular inequality, are easily established:

(i) $|\langle\phi|\psi\rangle| \le \|\phi\|\,\|\psi\|$,
(ii) $\|\phi + \psi\| \le \|\phi\| + \|\psi\|$.

Box 2.3 The Hilbert space

A set of vectors is called a Hilbert space, denoted by $\mathcal{H}$, if:

(i) For $|f_1\rangle, |f_2\rangle \in \mathcal{H}$, so is $\alpha_1|f_1\rangle + \alpha_2|f_2\rangle$ for all complex numbers $\alpha_1$, $\alpha_2$. For the zero vector 0 in $\mathcal{H}$, $0 + |f\rangle = |f\rangle$ for all $|f\rangle \in \mathcal{H}$.

(ii) It is equipped with an inner product $\langle f_1|f_2\rangle$ for all $|f_1\rangle, |f_2\rangle \in \mathcal{H}$ such that: $\langle\alpha f_1|f_2\rangle = \alpha^*\langle f_1|f_2\rangle$, $\;\langle f_1|\alpha f_2\rangle = \alpha\langle f_1|f_2\rangle$, $\;\langle f_1|f_2 + f_3\rangle = \langle f_1|f_2\rangle + \langle f_1|f_3\rangle$, $\;\langle f_1|f_2\rangle^* = \langle f_2|f_1\rangle$, $\;\langle f|f\rangle \ge 0$, and $\langle f|f\rangle = 0$ if and only if $|f\rangle$ is the zero vector 0. The norm of $|f\rangle$: $\|f\| = \sqrt{\langle f|f\rangle}$.

(iii) In $\mathcal{H}$ there exists a sequence of orthonormal vectors $\{|f_1\rangle, |f_2\rangle, \ldots\}$, called a basis, i.e., $\langle f_i|f_j\rangle = \delta_{ij}$, and for any vector $|f\rangle \in \mathcal{H}$ we may find constants $C_k^{(N)}$ such that $\big\|f - \sum_{k=1}^{N} C_k^{(N)} f_k\big\| \to 0$ as $N \to \infty$. This property is referred to as the separability of $\mathcal{H}$.

(iv) A sequence of vectors $\{|f^{(1)}\rangle, |f^{(2)}\rangle, \ldots\}$ in $\mathcal{H}$ is called a Cauchy sequence if, given any $\varepsilon > 0$, we may find a positive integer $N$ such that $\|f^{(n)} - f^{(m)}\| < \varepsilon$ whenever $n > N$, $m > N$. Then every Cauchy sequence in $\mathcal{H}$ converges to some vector $|f\rangle \in \mathcal{H}$. That is, $\|f - f^{(n)}\| \to 0$ for $n \to \infty$. This latter property is referred to as the completeness of $\mathcal{H}$.

As in the finite-dimensional space we have, respectively, the Cauchy–Schwarz and triangular inequalities: $|\langle f_1|f_2\rangle| \le \|f_1\|\,\|f_2\|$, $\;\|f_1 + f_2\| \le \|f_1\| + \|f_2\|$.

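³ The Pauli exclusion principle and, in general, the Spin & Statistics Theorem will be studied in detail in Chap. 36.

Operators having a continuous spectrum, such as the position operator $\mathbf{X}$, require special consideration. In analogy to Eq. (2.26), its spectral decomposition may be formally defined by
$$\mathbf{X} = \int_{\mathbb{R}^3} d^3\mathbf{x}\;\mathbf{x}\,|\mathbf{x}\rangle\langle\mathbf{x}|, \qquad \Lambda(\mathbf{x}) = |\mathbf{x}\rangle\langle\mathbf{x}|, \qquad \Lambda(\mathbf{x}')\,\Lambda(\mathbf{x}) = \Lambda(\mathbf{x}')\,\delta^{(3)}(\mathbf{x}' - \mathbf{x}), \tag{2.50}$$
$$\langle\mathbf{x}'|\mathbf{x}\rangle = \delta^{(3)}(\mathbf{x}' - \mathbf{x}), \qquad 1 = \int_{\mathbb{R}^3} d^3\mathbf{x}\;|\mathbf{x}\rangle\langle\mathbf{x}|, \qquad |\psi\rangle = 1\,|\psi\rangle = \int_{\mathbb{R}^3} d^3\mathbf{x}\;|\mathbf{x}\rangle\langle\mathbf{x}|\psi\rangle, \tag{2.51}$$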


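where $\delta^{(3)}(\mathbf{x}' - \mathbf{x})$ is Dirac's delta. Thus we learn that states such as $|\mathbf{x}\rangle$ do not belong to a Hilbert space, since vectors in the latter have a finite norm $\|\cdot\| < \infty$. They may, however, be rigorously defined as anti-linear functionals on some vectors $|\psi\rangle$ in $L^2(\mathbb{R}^3)$, through
$$\langle\alpha_1\psi_1 + \alpha_2\psi_2|\mathbf{x}\rangle = \alpha_1^*\,\langle\psi_1|\mathbf{x}\rangle + \alpha_2^*\,\langle\psi_2|\mathbf{x}\rangle, \tag{2.52}$$
for all constants $\alpha_1$, $\alpha_2$. Objects such as $|\mathbf{x}\rangle$ obviously belong to a larger space than $\mathcal{H}$, since they are not square-integrable, a space which may be denoted by $\mathcal{D}^+$, while the vectors $|\psi\rangle$ on which these anti-linear functionals are defined belong to a space $\mathcal{D} \subset \mathcal{H}$. The triplet $\mathcal{D} \subset \mathcal{H} \subset \mathcal{D}^+$, in which physical computations are carried out, is referred to as a Rigged Hilbert Space. It is also known as a Gelfand triplet.⁴

In setting up an eigenvalue equation for an object such as the position operator $\mathbf{X}$, one may be guided by the experimental situation that a measurement of the position of a particle with infinite precision is impossible. One may then interpret the equation
$$\mathbf{X}\,f_{\mathbf{x}_0} = \mathbf{x}_0\,f_{\mathbf{x}_0} \;\Rightarrow\; \big(\mathbf{X} - \mathbf{x}_0\big)\,f_{\mathbf{x}_0} = 0 \;\Rightarrow\; f_{\mathbf{x}_0} \propto \delta^{(3)}(\mathbf{x} - \mathbf{x}_0), \tag{2.53}$$
in the following manner. For any $\varepsilon > 0$, as small as one wishes, set up experimentally, one may find a square-integrable function $\delta_\varepsilon^{(3)}(\mathbf{x} - \mathbf{x}_0)$, depending on $\varepsilon$, such that
$$\big\|(\mathbf{x} - \mathbf{x}_0)\,\delta_\varepsilon^{(3)}(\mathbf{x} - \mathbf{x}_0)\big\| = \Big[\int_{\mathbb{R}^3} d^3\mathbf{x}\;|\mathbf{x} - \mathbf{x}_0|^2\,\big|\delta_\varepsilon^{(3)}(\mathbf{x} - \mathbf{x}_0)\big|^2\Big]^{1/2} < \varepsilon, \tag{2.54}$$
where such an explicit function is
$$\delta_\varepsilon^{(3)}(\mathbf{x} - \mathbf{x}_0) = \Big(\frac{2}{\pi}\Big)^{3/4}\,\frac{1}{\varepsilon^{3/2}}\,\exp\Big[-\frac{(\mathbf{x} - \mathbf{x}_0)^2}{\varepsilon^2}\Big], \tag{2.55}$$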

with the position determined with a precision corresponding to an uncertainty not exceeding $\varepsilon$.

In quantum physics, in computations of physical quantities, one is really dealing with so-called self-adjoint operators, say, $A$, not just Hermitian ones, where the domain of definition of $A$, consisting of vectors $|\psi\rangle$ in $\mathcal{H}$ for which $\|A\psi\|^2 < \infty$, is large enough that any vector $|\phi\rangle$ in $\mathcal{H}$ may be approximated by a vector in the domain of definition of the operator $A$. That is, for any vector $|\phi\rangle$ and any $\varepsilon > 0$, we may find a vector $|\psi_\varepsilon\rangle$ in the domain of definition of the operator $A$ such that $\|\phi - \psi_\varepsilon\| < \varepsilon$. For such an operator, the adjoint $A^\dagger$ of $A$ is defined by
$$\langle\phi|A|\psi\rangle = \langle A^\dagger\phi|\psi\rangle, \quad \text{for all } |\psi\rangle \text{ in the domain of definition of } A. \tag{2.56}$$

If the domains of definition of $A^\dagger$ and $A$ coincide and $\langle\phi|A|\psi\rangle = \langle A\phi|\psi\rangle$, then $A$ is said to be self-adjoint, and one writes⁵ $A^\dagger = A$. The spectral decomposition of a self-adjoint (formally Hermitian) operator in general is much more involved⁶ than the special case in (2.26), which applies to a finite discrete spectrum. We may, however, formally proceed by extending the expression in (2.26) to the general case, as we have already done for the position operator in Eq. (2.50), in the following manner. We define the spectral decomposition of a self-adjoint (formally Hermitian) operator in terms of projection operators $P_A(\lambda)$, defined, in turn, as follows⁷
$$P_A(\lambda) = \int_{-\infty}^{\lambda} d\lambda'\;\Lambda_A(\lambda'), \qquad \Lambda_A(\lambda') = |\lambda'\rangle\langle\lambda'|, \qquad \langle\lambda'|\lambda''\rangle = \delta(\lambda' - \lambda''), \tag{2.57}$$
and the normalization condition in (2.4) becomes replaced by
$$P_A(\infty) = \int_{-\infty}^{\infty} d\lambda'\;\Lambda_A(\lambda') = 1. \tag{2.58}$$
Also
$$P_A(\lambda_1)\,P_A(\lambda_2) = \int_{-\infty}^{\lambda_1} d\lambda' \int_{-\infty}^{\lambda_2} d\lambda''\;|\lambda'\rangle\langle\lambda'|\lambda''\rangle\langle\lambda''| = \int_{-\infty}^{\lambda_1} d\lambda' \int_{-\infty}^{\lambda_2} d\lambda''\;|\lambda'\rangle\langle\lambda''|\;\delta(\lambda' - \lambda'') = \int_{-\infty}^{\lambda} d\lambda'\;|\lambda'\rangle\langle\lambda'| = P_A(\lambda), \qquad \lambda = \min(\lambda_1, \lambda_2).$$
Hence, in particular,
$$P_A(\lambda)\,P_A(\lambda) = P_A(\lambda). \tag{2.59}$$

⁴ Gelfand and Vilenkin [3].
⁵ For more details on self-adjoint operators and their spectra, see Sect. 1.8 and Chap. 4 of Manoukian [4].
⁶ For the mathematically inclined reader, I strongly recommend the book by Prugovečki [5], which involves a careful mathematical analysis of the spectral decomposition of self-adjoint operators in general, as well as Manoukian [4].
⁷ In the states $|\lambda\rangle$ we have suppressed other relevant quantum numbers that may be needed. These states may be considered in the sense of the Gelfand triplet description discussed above.
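A finite-dimensional stand-in can make the projection-operator relations concrete. The sketch below replaces the integrals in Eqs. (2.57)–(2.59) by sums over a discrete spectrum, which is an assumption made purely for illustration. The three eigenvalues, the eigenvectors, and the function names are hypothetical choices.

```python
import math

# Hypothetical discrete spectrum and orthonormal eigenvectors (illustration only).
EIGENVALUES = [-1.0, 0.5, 2.0]

def eigenvectors(alpha=0.6):
    """Three orthonormal vectors: the standard basis rotated by alpha about the third axis."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [(c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0)]

def projector(lam):
    """P_A(lam) = sum over eigenvalues <= lam of |lam'><lam'| (discrete analogue of Eq. 2.57)."""
    p = [[0.0] * 3 for _ in range(3)]
    for value, v in zip(EIGENVALUES, eigenvectors()):
        if value <= lam:
            for i in range(3):
                for j in range(3):
                    p[i][j] += v[i] * v[j]
    return p

def matmul(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def close(m, n, tol=1e-12):
    return all(abs(m[i][j] - n[i][j]) < tol for i in range(3) for j in range(3))

IDENTITY = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

# P(l1) P(l2) = P(min(l1, l2)); idempotence, Eq. (2.59); completeness, Eq. (2.58).
print(close(matmul(projector(0.7), projector(3.0)), projector(0.7)))
print(close(matmul(projector(0.7), projector(0.7)), projector(0.7)))
print(close(projector(math.inf), IDENTITY))
```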

Equation (2.59) provides the definition of a projection operator. Equation (2.58) is a statement of the fact that a sum over all values in the spectrum gives one, and thus defines a completeness relation. The spectral decomposition of a self-adjoint (formally Hermitian) operator is then formally defined as
$$A = \int_{\lambda \to -\infty}^{\lambda \to \infty} \lambda\;dP_A(\lambda) = \int_{-\infty}^{\infty} d\lambda\;\lambda\,\Lambda_A(\lambda). \tag{2.60}$$

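We also learn that for a state $|\psi\rangle$, by inserting the identity in the inner product and using (2.58), we may write
$$\|\psi\|^2 = \langle\psi|\psi\rangle = \langle\psi|\,1\,|\psi\rangle = \int_{-\infty}^{\infty} d\lambda\;\langle\psi|\lambda\rangle\langle\lambda|\psi\rangle = \int_{-\infty}^{\infty} d\lambda\;|\psi(\lambda)|^2, \tag{2.61}$$
$$\text{where} \quad \langle\psi|\lambda\rangle\langle\lambda|\psi\rangle \equiv |\psi(\lambda)|^2, \tag{2.62}$$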

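and $\|\psi\|$ defines the norm of the vector $|\psi\rangle$. Of particular interest is a self-adjoint (formally Hermitian) operator which is bounded below, i.e., which has a spectrum bounded from below:
$$A = \int_{-\lambda_o}^{\infty} \lambda\;dP_A(\lambda), \quad \text{i.e.,} \quad \lambda \ge -\lambda_o, \quad \text{and where} \quad -\lambda_o > -\infty, \tag{2.63}$$
from which
$$\langle\psi|A|\psi\rangle = \int_{-\lambda_o}^{\infty} \lambda\;d\langle\psi|P_A(\lambda)|\psi\rangle = \int_{-\lambda_o}^{\infty} \lambda\;d\lambda\;|\psi(\lambda)|^2 \ge -\lambda_o, \quad \text{for } \|\psi\| = 1, \tag{2.64}$$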

where the latter inequality follows from the fact that $\lambda \ge -\lambda_o$, that $|\psi(\lambda)|^2 \ge 0$, and that the latter cannot be equal to zero for all $\lambda$ since $\|\psi\|^2 \neq 0$ (see Eq. (2.61)). On physical grounds, referring to the Hamiltonian of a system represented by a self-adjoint (formally Hermitian) operator, a necessary condition for the stability of the corresponding quantum system is that its spectrum is bounded from below. Otherwise, if the system is in its lowest state, an infinite energy is required to excite the system, which is meaningless.

The development of the elegant formalism of quantum theory via selective measurements is due to Julian Schwinger [7, 8]. It has its roots in Dirac's [1, 2] abstract presentation in terms of projection operators and, as we have seen, provides tremendous insight into the physics behind the formalism. For an extensive analysis of this formalism in terms of selective measurements, including various applications, see Chaps. 1 & 8 of Manoukian [4].⁸

⁸ Other authors who have also shown interest in this approach to some extent include: F. A. Kaempfer (1965). Concepts in quantum mechanics. New York: Academic Press; K. Gottfried (1989). Quantum mechanics. Reading: Addison-Wesley; J. J. Sakurai (1994). Modern quantum mechanics. Reading: Addison-Wesley.


References

1. Dirac, P. A. M. (1988). The principles of quantum mechanics (4th ed.). Oxford: Oxford.
2. Dirac, P. A. M. (2012). The principles of quantum mechanics (2012 ed.). La Vergne: www.bnpublishing.net.
3. Gelfand, I. M., & Vilenkin, N. Y. (1964). Generalized functions, Volume 4: Applications of harmonic analysis. New York: Academic.
4. Manoukian, E. B. (2006). Quantum theory: A wide spectrum. Dordrecht: Springer.
5. Prugovečki, E. (1981). Quantum mechanics in Hilbert space (2nd ed.). New York: Academic.
6. Ramsey, N. F. (1990). Experiments with separated oscillatory fields and hydrogen masers. Reviews of Modern Physics, 62, 541–552.
7. Schwinger, J. (1991). Quantum kinematics and dynamics. Redwood City: Addison-Wesley.
8. Schwinger, J. (2001). Quantum mechanics: Symbolism of atomic measurements. Berlin: Springer.

3 Interference and Measurements

Prerequisite: Chap. 2

In this chapter one learns the basic truth about quantum physics, referred to as the principle of linear superposition, which implies that the probability for a system to go from an initial state to a final one through several intermediate states does not, in general, coincide with the sum of the probabilities evaluated for each alternative intermediate state. Instead, one has to sum the amplitudes over all the alternative intermediate states and then compute the probability of transition. This leads to additional terms in the probability, coming from cross terms between the alternative intermediate states, referred to as interference terms. This is demonstrated in Eqs. (3.4)–(3.6) and Fig. 3.1 below.

We will also learn that the mere presence of an apparatus measuring an attribute of a physical system, even when the experimentalist does not make a reading, i.e., the meter of the apparatus is not read, destroys the interference pattern mentioned above (see Eqs. (3.23), (3.24)).

Moreover, in a most simplistic description, consider a system which is not in a definite state but is represented by a sum over several eigenstates, say $\sum_b c_b\,|b\rangle$, with given expansion coefficients $c_b$ labeled by the possible eigenvalues $b$ of an attribute of the system under consideration. If a measurement carried out on the system leads to a given value, say $b$, for the physical quantity being measured, then in computing the probability that the system is found and remains in state $|b\rangle$, we learn that the measurement process effectively annihilates all the expansion coefficients in the sum, with the exception of the coefficient $c_b$ corresponding to the value $b$ obtained, and replaces $c_b$ by one, using the same notation for simplicity. This happens because one is dealing with a conditional probability, based on the input that a measurement of the attribute has already given the value $b$ for it. This is demonstrated in Eqs. (3.15), (3.17) below.

Suppose the spin of a spin 1/2 particle is initially prepared in the state $|+z\rangle$, in the notation of, e.g., Eq. (2.6) in the previous chapter, i.e., it is directed along the positive $z$-axis. The particle then goes through a Stern–Gerlach set-up, three of which, (a), (b) and (c), are considered in Fig. 3.1. The spin states under consideration are denoted by $|\pm z\rangle$, $|\pm\bar z\rangle$, where for $|+\bar z\rangle$, for example, the spin of the particle is directed along the positive $\bar z$-axis. The inner products of all these states are worked out in Box 3.1. We compute the various probabilities for the spin of the particle to go from the initial state $|+z\rangle$ to the final states $|\pm z\rangle$, after the particle has gone through the two Stern–Gerlach apparatuses as indicated in Fig. 3.1, where the $\bar z$-axis makes an angle $\theta$ with the $z$-axis as shown in Box 3.1. The apparatuses (machines) of the set-ups in Fig. 3.1, in the notation of the previous chapter, in terms of selective measurement symbols, are given, respectively, by
$$M^{(a)} = \big[\Lambda(+z) + \Lambda(-z)\big]\Lambda(-\bar z) = 1\,\Lambda(-\bar z), \qquad M^{(a)}|+z\rangle = |-\bar z\rangle\langle -\bar z|+z\rangle,$$
$$M^{(b)} = \big[\Lambda(+z) + \Lambda(-z)\big]\Lambda(+\bar z) = 1\,\Lambda(+\bar z), \qquad M^{(b)}|+z\rangle = |+\bar z\rangle\langle +\bar z|+z\rangle,$$
$$M^{(c)} = \big[\Lambda(+z) + \Lambda(-z)\big]\big[\Lambda(+\bar z) + \Lambda(-\bar z)\big] = 1, \qquad M^{(c)}|+z\rangle = |+z\rangle.$$

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_3

[Figure 3.1, panels (a), (b), (c): probabilities shown at the outputs are (a) $\sin^4(\theta/2)$ at $+z$ and $\tfrac14\sin^2\theta$ at $-z$; (b) $\cos^4(\theta/2)$ at $+z$ and $\tfrac14\sin^2\theta$ at $-z$; (c) 1 at $+z$ and 0 at $-z$.]

Fig. 3.1 A spin 1/2 particle initially prepared in the state $|+z\rangle$ goes through the Stern–Gerlach set-ups illustrated, where the $\bar z$-axis makes an angle $\theta$ with the $z$-axis in configuration space. In part (a) only a spin along the negative $\bar z$ direction is allowed to go through the system, while in part (b) only a spin along the positive $\bar z$ direction is allowed to go through. In part (c) spins in both directions are allowed to go through. The probabilities are shown in the figure and the corresponding spikes are not drawn to scale for any specific value of $\theta$
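The probabilities quoted in Fig. 3.1 can be reproduced with a short numerical sketch that sums the amplitudes over the intermediate states left open in each set-up and then squares the result. It assumes the real two-component spin states of Box 3.1. The labels (`"+zbar"`, `SETUPS`) and function names are illustrative, not notation from the text.

```python
import math

def states(theta):
    """Spin states of Box 3.1 as real two-component vectors (assumed form)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return {"+z": (1.0, 0.0), "-z": (0.0, 1.0), "+zbar": (c, s), "-zbar": (-s, c)}

def amp(u, v):
    """Amplitude <u|v> for real states."""
    return u[0] * v[0] + u[1] * v[1]

def probability(theta, final, open_channels):
    """| sum over open intermediate states m of <final|m><m|+z> |^2."""
    st = states(theta)
    amplitude = sum(amp(st[final], st[m]) * amp(st[m], st["+z"]) for m in open_channels)
    return amplitude ** 2

# Intermediate zbar-states allowed through in parts (a), (b), (c) of Fig. 3.1.
SETUPS = {"a": ["-zbar"], "b": ["+zbar"], "c": ["+zbar", "-zbar"]}

theta = 1.2   # arbitrary illustrative angle
for name, channels in SETUPS.items():
    print(name, probability(theta, "+z", channels), probability(theta, "-z", channels))

# Adding the probabilities of (a) and (b) does not reproduce (c):
naive = probability(theta, "+z", SETUPS["a"]) + probability(theta, "+z", SETUPS["b"])
print(naive, probability(theta, "+z", SETUPS["c"]))
```

For any angle with $\sin\theta \neq 0$, the last line should show the naive sum falling short of 1 by $\tfrac12\sin^2\theta$, which is the interference contribution discussed in this chapter.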

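The amplitudes corresponding to each case shown in Fig. 3.1 for the spin, in the initial state $|+z\rangle$, to finally end up again in the same state $|+z\rangle$ are given below:
$$\text{(a):} \quad \langle +z|\,\big[\,|-\bar z\rangle\langle -\bar z|\,\big]\,|+z\rangle = |\langle +z|-\bar z\rangle|^2 = \sin^2(\theta/2), \tag{3.1}$$
$$\text{(b):} \quad \langle +z|\,\big[\,|+\bar z\rangle\langle +\bar z|\,\big]\,|+z\rangle = |\langle +z|+\bar z\rangle|^2 = \cos^2(\theta/2), \tag{3.2}$$
$$\text{(c):} \quad \langle +z|\,\big[\,|+\bar z\rangle\langle +\bar z| + |-\bar z\rangle\langle -\bar z|\,\big]\,|+z\rangle = \langle +z|\,I\,|+z\rangle = 1, \tag{3.3}$$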

where we have used the inner products evaluated in Box 3.1. These lead to the probability values shown in Fig. 3.1 at the value +z upon squaring them. The amplitudes to go from the state   | + z to the state | − z, in cases (a), (b) and (c) above, which for the case (a) the amplitude is give by −z| | − z¯ −¯z | | + z s for example, are similarly obtained by using, in the process, the inner products evaluated in Box 3.1, yielding to the corresponding probability values shown in Fig. 3.1 at the point −z. Upon comparing the sum of the probabilities in cases (a) and (b) with the probabilities in case (c) for the same final states, one learns the basic fact about quantum physics that the probability of a system to go from an initial state to a final state through several intermediate states does not, in general, coincide with the sum of probabilities evaluated for each alternative intermediate state. And one must sum over the alternative amplitudes first from which transition probabilities are then evaluated. For example, the probability for the spin to remain in the same state | + z, i.e., for a spin non-flip, we note from the explicit expressions of the probabilities, given at the value +z given in Fig. 3.1, that


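Box 3.1 Spin polarization states

The Pauli matrices are given by

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix},\qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

[Figure: the unit vector $\mathbf N_0$ along the $z$-axis and the unit vector $\mathbf N$ in the $x$–$z$ plane, making an angle $\theta$ with the $z$-axis.]

Introduce the unit vectors $\mathbf N_0 = (0, 0, 1)$, $\mathbf N = (\sin\theta, 0, \cos\theta)$ just depicted to infer the eigenvalue equations

$$\boldsymbol\sigma\cdot\mathbf N_0\,\xi_{N_0} = \xi_{N_0},\qquad \boldsymbol\sigma\cdot\mathbf N\,\xi_{\pm N} = \pm\,\xi_{\pm N},$$

where

$$\xi_{N_0} = \begin{pmatrix} 1 \\ 0 \end{pmatrix},\qquad \xi_{N} = \begin{pmatrix} \cos(\theta/2) \\ \sin(\theta/2) \end{pmatrix},\qquad \xi_{-N} = \begin{pmatrix} -\sin(\theta/2) \\ \cos(\theta/2) \end{pmatrix},$$

the orthonormality conditions $\xi^\dagger_{\varepsilon N}\,\xi_{\varepsilon' N} = \delta(\varepsilon, \varepsilon')$, as well as the inner products $\xi^\dagger_{N_0}\xi_{N} = \cos(\theta/2)$, $\xi^\dagger_{N_0}\xi_{-N} = -\sin(\theta/2)$.

The following notations will be used in the text: $\langle +z| = \xi^\dagger_{N_0} = (1\ \ 0)$, $\langle -z| = \xi^\dagger_{-N_0} = (0\ \ 1)$, $\langle \pm\bar z| = \xi^\dagger_{\pm N}$. In these notations:

$$\langle +z|+\bar z\rangle = \cos(\theta/2) = \langle -z|-\bar z\rangle,\qquad \langle -z|+\bar z\rangle = \sin(\theta/2),\qquad \langle +z|-\bar z\rangle = -\sin(\theta/2) = \langle -\bar z|+z\rangle.$$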

$$\underbrace{\sin^4(\theta/2)}_{\text{probability confined to intermediate state } |-\bar z\rangle} \;+\; \underbrace{\cos^4(\theta/2)}_{\text{probability confined to intermediate state } |+\bar z\rangle} \;\neq\; 1, \tag{3.4}$$

for $\sin\theta \neq 0$, where the value “1” of the probability, on the right-hand side of the equation, is obtained by summing over the amplitudes for the two alternative intermediate states first. We note the identity

$$\sin^4(\theta/2) + \cos^4(\theta/2) = 1 - \tfrac{1}{2}\sin^2\theta, \tag{3.5}$$

and the additional cross term $-(1/2)\sin^2\theta$, arising from the two alternative intermediate states when the probabilities in (3.4) are compared by means of (3.5), is referred to as an interference term. Similarly, for the probability to go from the initial spin state $|+z\rangle$ to a final state $|-z\rangle$, i.e., for a spin flip, we note from the explicit expressions of the probabilities, at the value $-z$, given in Fig. 3.1, that

$$\underbrace{\tfrac{1}{4}\sin^2\theta}_{\text{probability confined to intermediate state } |-\bar z\rangle} \;+\; \underbrace{\tfrac{1}{4}\sin^2\theta}_{\text{probability confined to intermediate state } |+\bar z\rangle} \;\neq\; 0, \tag{3.6}$$

for $\sin\theta \neq 0$, where the “0” value of the probability, on the right-hand side of the equation, is obtained, as before, by summing over the amplitudes for the two alternative intermediate states first.

We next consider the role of measurement of a quantum mechanical system. To this end, consider a physical quantity, associated with a physical system, represented by a Hermitian operator $B$ which, for simplicity, is assumed to take on values from a discrete set of real numbers $\{b, b', \ldots\}$, with eigenvectors satisfying

$$B|b\rangle = b\,|b\rangle,\qquad \langle b|b'\rangle = \delta(b, b'). \tag{3.7}$$
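The interference statement in Eqs. (3.1)–(3.6) can be checked numerically from the spinors of Box 3.1. The following is only a minimal sketch: the function names and the sample angle are our own illustrative choices, and the spinors are taken to be real, as in Box 3.1.

```python
import math

def spinor_plus(theta):
    """xi_{+N} of Box 3.1: spin up along N = (sin(theta), 0, cos(theta))."""
    return (math.cos(theta / 2), math.sin(theta / 2))

def spinor_minus(theta):
    """xi_{-N} of Box 3.1: spin down along N."""
    return (-math.sin(theta / 2), math.cos(theta / 2))

def inner(bra, ket):
    """<bra|ket> for real two-component spinors."""
    return bra[0] * ket[0] + bra[1] * ket[1]

def probabilities(theta, final):
    """Return (P_a, P_b, P_c) for the initial state |+z> and a given final z-state.

    P_a, P_b: only the -zbar (resp. +zbar) intermediate state is let through.
    P_c: both are let through, so the amplitudes are added before squaring.
    """
    up_z = (1.0, 0.0)
    zbar_plus, zbar_minus = spinor_plus(theta), spinor_minus(theta)
    amp_a = inner(final, zbar_minus) * inner(zbar_minus, up_z)
    amp_b = inner(final, zbar_plus) * inner(zbar_plus, up_z)
    return amp_a ** 2, amp_b ** 2, (amp_a + amp_b) ** 2

theta = 1.1  # arbitrary sample angle with sin(theta) != 0
p_a, p_b, p_c = probabilities(theta, (1.0, 0.0))  # spin non-flip, final |+z>
q_a, q_b, q_c = probabilities(theta, (0.0, 1.0))  # spin flip, final |-z>

# Difference between "sum of probabilities" and "probability of summed amplitudes".
interference_non_flip = (p_a + p_b) - p_c
```

For any angle with $\sin\theta \neq 0$ the quantity `interference_non_flip` comes out as $-(1/2)\sin^2\theta$, while `q_a + q_b` is $(1/2)\sin^2\theta$ although `q_c` vanishes.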

We consider an apparatus whose states will be denoted by $|a, \nu\rangle$, where $a$ corresponds to a “needle” registering the value obtained in a measurement of the above physical quantity. The latter will be denoted by $a_b$ if the “needle” registers the value $b$ for the physical system. Here $\nu$ denotes the collection of all other quantum numbers needed to specify the state of the apparatus. We consider the situation in which the apparatus may, in general, disturb the physical system by the measuring process, causing the physical system to make a transition from a state specified by the value $b$, as registered by the apparatus, to some other set of values. After a value for the physical quantity is registered by the apparatus, $\nu$ may also change to some other set of values. The states $|a, \nu\rangle$ will be taken to satisfy the orthonormality condition


$$\langle a, \nu|a', \nu'\rangle = \delta(a, a')\,\delta(\nu, \nu'). \tag{3.8}$$

Let $|a, \nu\rangle$ denote the initial state of the apparatus. Prior to the measurement, the initial state of the combined system, consisting of the apparatus and the physical system, may then be defined by

$$|\psi_0\rangle = \sum_b c_b\,|b\rangle\,|a, \nu\rangle, \tag{3.9}$$

with arbitrary expansion coefficients $c_b$. During the registration process the combined system will evolve to a final state which may be expressed in the following manner

$$|\psi\rangle = \sum_b c_b\,|\phi^{(b)}\rangle,\qquad |\phi^{(b)}\rangle = \sum_{b',\nu'} |b'\rangle\,|a_b, \nu'\rangle\; C(b', a_b, \nu'; b, a, \nu), \tag{3.10}$$

showing that a correlation has occurred between the apparatus and the physical system, and also incorporating a general disturbing transition as a result of the interaction between these two sub-systems. The expansion coefficients $C(b', a_b, \nu'; b, a, \nu)$ satisfy the normalization condition

$$\sum_{b',\nu'} |C(b', a_b, \nu'; b, a, \nu)|^2 = 1,\quad\text{and hence}\quad \sum_{\nu'} |C(b', a_b, \nu'; b, a, \nu)|^2 \le 1, \tag{3.11}$$

on account of the normalization condition $\langle\phi^{(b)}|\phi^{(b')}\rangle = \delta(b, b')$. To obtain physical insight into the expression in (3.10), one may ask the question: (1) What is the probability that the value $b$ was obtained in the measurement process regardless of the transition made by the two sub-systems? This is given by

$$|\langle\psi|\phi^{(b)}\rangle|^2 = |c_b|^2 \sum_{b',\nu'} |C(b', a_b, \nu'; b, a, \nu)|^2 = |c_b|^2, \tag{3.12}$$

where we have used the normalization condition in (3.11). On the other hand, consider the more restrictive question: (2) What is the probability that the value $b$ was obtained in the measurement process and that the physical system remains in the state $|b\rangle$, irrespective of the final state of the apparatus? This is given simply by

$$|c_b|^2 \sum_{\nu'} |C(b, a_b, \nu'; b, a, \nu)|^2 \le |c_b|^2, \tag{3.13}$$

where, in writing the above equation, we have used the inequality in (3.11), and thus the probability is, in general, reduced relative to the one in (3.12). The above two equations lead to the following key question: (3) Given that the apparatus yielded a given value $b$, what is the probability that the physical system remains in the state $|b\rangle$, irrespective of the state of the apparatus? Such a probability is referred to as a conditional probability, written as

$$\mathrm{Prob}\big[\,\text{physical system remains in state } |b\rangle \;\big|\; \text{apparatus yielded the value } b\,\big], \tag{3.14}$$

and is given by the ratio of the following two probabilities (irrespective of the state of the apparatus¹)

$$\frac{\mathrm{Prob}[\text{apparatus yielded the value } b \text{ and the physical system remains in state } |b\rangle]}{\mathrm{Prob}[\text{apparatus yielded the value } b \text{ regardless of the state of the physical system}]} = \frac{|c_b|^2 \sum_{\nu'} |C(b, a_b, \nu'; b, a, \nu)|^2}{|c_b|^2} = \sum_{\nu'} |C(b, a_b, \nu'; b, a, \nu)|^2, \tag{3.15}$$

using Eqs. (3.12) and (3.13). The factor $|c_b|^2$ (for $|c_b|^2 \neq 0$) cancels out in the final expression of the conditional probability, reflecting the opening statement: given that the physical value $b$ was obtained. That is, it is “as if” the measurement carried out on the physical system, giving the value $b$ and read as $a_b$ on the apparatus, has forced all the expansion coefficients $\{c_{b'}, c_{b''}, \ldots\}$ in Eq. (3.10) to be zero with the exception of the coefficient $c_b$, and replaced the latter by one. Of particular pedagogical interest is an ideal apparatus for which the state of the combined system may be taken in the simple form

$$|\psi\rangle = \sum_b c_b\,|b\rangle\,|a_b\rangle. \tag{3.16}$$

¹ This is indicated by the sum over $\nu'$ in the respective probabilities.

This defines a state with perfect correlation between the apparatus and the physical system, with the value registered by the “needle” of the apparatus always coinciding with the given value taken by the physical system, instead of having a more general probabilistic distribution as indicated by the presence of the coefficients $C(b', a_b, \nu'; b, a, \nu)$ in (3.10). In this case the conditional probability in (3.15) reduces to

$$\frac{|c_b|^2}{|c_b|^2} = 1, \tag{3.17}$$

confirming that if the apparatus yields the value $b$, then the system is found in, i.e., reduced to, the definite state $|b\rangle$ with probability one, as if the measurement process annihilated all the expansion coefficients $c_{b'}$ in (3.10) with the exception of the coefficient $c_b$, corresponding to the value $b$ obtained, and replaced it by one! This is referred to as quantum state reduction. The concept of a conditional probability is well known to probabilists, but it seems, rather unfortunately, that it has not been sufficiently emphasized in the physics literature. As we have just seen, its importance in the quantum theory of measurement is obvious.

In a further example we consider the role of an ideal apparatus set up to determine the direction of the spin in the intermediate stage of the Stern–Gerlach set up, as shown in Fig. 3.2. The states of the apparatus in which its “needle” points along the spin of the particle in the positive/negative $\bar z$-directions, in a measurement process, will be denoted by $|\pm\bar z\rangle_A$, the subscript $A$ labeling the apparatus. For such an ideal apparatus it is easy to find the state of the combined spin-apparatus system in the intermediate stage where the spin direction is measured by the apparatus. To this end, first note that in the absence of the apparatus, we may use a completeness relation to write the initial state $|+z\rangle$ as follows:

$$|+z\rangle = |+\bar z\rangle\langle +\bar z|+z\rangle + |-\bar z\rangle\langle -\bar z|+z\rangle. \tag{3.18}$$



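[Figure: the spin of the particle, prepared in the state $|+z\rangle$, passes through the intermediate stage, where the apparatus registers $+$ or $-$ along $\bar z$; the probabilities for the emerging spin are $\sin^4(\theta/2) + \cos^4(\theta/2)$ at $+z$ and $\tfrac{1}{2}\sin^2\theta$ at $-z$.]

Fig. 3.2 An ideal apparatus is inserted in the intermediate stage to determine the component of the spin along the $\pm\bar z$ directions, but the experimentalist does not take a reading and hence does not know this result. The probabilities for the emerging spin, with spikes not drawn to scale, are also shown in the figure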


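Since the “needle” of the ideal apparatus is directed along the spin of the particle in a measurement, the state of the combined spin-apparatus system in the intermediate stage is then simply given by

$$|\psi\rangle = |+\bar z\rangle\,|+\bar z\rangle_A\,\langle +\bar z|+z\rangle + |-\bar z\rangle\,|-\bar z\rangle_A\,\langle -\bar z|+z\rangle, \tag{3.19}$$

where the subscript $A$ labels the states of the apparatus.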

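If no reading of the apparatus is undertaken and we are only interested in determining the probabilities of the directions of the final spin along the $\pm z$ directions, then we may take the trace, denoted by $\mathrm{Tr}_A$, of the projection operator $|\psi\rangle\langle\psi|$ over the apparatus states $|\pm\bar z\rangle_A$ to obtain

$$\mathrm{Tr}_A\,|\psi\rangle\langle\psi| = |+\bar z\rangle\,|\langle +\bar z|+z\rangle|^2\,\langle +\bar z| \;+\; |-\bar z\rangle\,|\langle -\bar z|+z\rangle|^2\,\langle -\bar z|, \tag{3.20}$$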

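where we have used the orthonormality conditions ${}_A\langle\varepsilon\bar z|\varepsilon'\bar z\rangle_A = \delta(\varepsilon, \varepsilon')$, with $\varepsilon, \varepsilon' = \pm$. Accordingly, if no reading of the apparatus is undertaken, the probabilities of the final spin emerging along the $\pm z$ directions are given by

Probability of final spin along $+z$:

$$\langle +z|\Big[\,|+\bar z\rangle\,|\langle +\bar z|+z\rangle|^2\,\langle +\bar z|\,\Big]|+z\rangle + \langle +z|\Big[\,|-\bar z\rangle\,|\langle -\bar z|+z\rangle|^2\,\langle -\bar z|\,\Big]|+z\rangle = |\langle +z|+\bar z\rangle|^4 + |\langle +z|-\bar z\rangle|^4, \tag{3.21}$$

Probability of final spin along $-z$:

$$\langle -z|\Big[\,|+\bar z\rangle\,|\langle +\bar z|+z\rangle|^2\,\langle +\bar z|\,\Big]|-z\rangle + \langle -z|\Big[\,|-\bar z\rangle\,|\langle -\bar z|+z\rangle|^2\,\langle -\bar z|\,\Big]|-z\rangle = |\langle -z|+\bar z\rangle|^2\,|\langle +\bar z|+z\rangle|^2 + |\langle -z|-\bar z\rangle|^2\,|\langle -\bar z|+z\rangle|^2, \tag{3.22}$$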

which, from the inner products determined in Box 3.1, lead to the probabilities shown in Fig. 3.2, where in the last case we have used the elementary identity $2\sin(\theta/2)\cos(\theta/2) = \sin\theta$. Upon comparing the probabilities in Fig. 3.2 with the sum of the probabilities in set ups (a) and (b) together in Fig. 3.1 at $+z$ and $-z$, we see that the interference disappears by the mere presence of the apparatus. On the other hand, if the meter is read and the spin is found, say, in the $-\bar z$ direction, thus singling out only the apparatus state $|-\bar z\rangle_A$, then the probabilities that the spin emerges finally in the $|\pm z\rangle$ states are respectively

$$\mathrm{Prob}[+z \text{ direction}]:\quad |\langle +z|-\bar z\rangle|^4 = \sin^4(\theta/2), \tag{3.23}$$

$$\mathrm{Prob}[-z \text{ direction}]:\quad |\langle -z|-\bar z\rangle|^2\,|\langle -\bar z|+z\rangle|^2 = \tfrac{1}{4}\sin^2\theta, \tag{3.24}$$

coinciding, as expected, with the probabilities at $+z$ and $-z$ in set up (a) in Fig. 3.1, in which only the component of spin along $-\bar z$ is allowed to go through the first Stern–Gerlach apparatus. Finally we note that, since we are dealing with an ideal apparatus, if the measuring apparatus determines the spin to be, say, in the $-\bar z$ direction, then the spin will stay in the state $|-\bar z\rangle$, with probability one, before it goes through the second Stern–Gerlach apparatus. For the case of a non-ideal apparatus in the last example, see Sect. 1.10.3 of my book: Manoukian [1]. For greater generality, where the orthogonality condition for the apparatus states, such as the one in (3.8), is relaxed, see Sects. 8.7 and 8.8 of the latter book. For an interesting book on the quantum theory of measurement in general, which also includes some critical papers on the subject, see the edited book quoted below.²
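The disappearance of the interference in Eqs. (3.21)–(3.24) can likewise be checked numerically. The following is a minimal sketch under the same assumptions as Box 3.1 (real spinors); the function names `unread_meter` and `read_meter` are our own illustrative labels and do not appear in the text.

```python
import math

# z-basis states, keyed by the sign of the spin component
Z_STATES = {+1: (1.0, 0.0), -1: (0.0, 1.0)}

def zbar_states(theta):
    """Spin states along +/- zbar (Box 3.1), keyed by sign."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return {+1: (c, s), -1: (-s, c)}

def inner(bra, ket):
    """<bra|ket> for real two-component spinors."""
    return bra[0] * ket[0] + bra[1] * ket[1]

def unread_meter(theta, final_sign):
    """Eqs. (3.21)-(3.22): apparatus present but not read.

    Tracing over the apparatus leaves a sum of probabilities,
    one for each needle position, with no cross term.
    """
    up_z, final = Z_STATES[+1], Z_STATES[final_sign]
    return sum(
        inner(final, s) ** 2 * inner(s, up_z) ** 2
        for s in zbar_states(theta).values()
    )

def read_meter(theta, needle_sign, final_sign):
    """Eqs. (3.23)-(3.24): needle found at needle_sign, spin emerges at final_sign."""
    s = zbar_states(theta)[needle_sign]
    return inner(Z_STATES[final_sign], s) ** 2 * inner(s, Z_STATES[+1]) ** 2

theta = 0.7  # arbitrary sample angle
prob_plus = unread_meter(theta, +1)
prob_minus = unread_meter(theta, -1)
```

With the apparatus in place the two probabilities still add up to one, but `prob_minus` is now $(1/2)\sin^2\theta$ rather than zero, as in Fig. 3.2.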

² Wheeler and Zurek [2].

References
1. Manoukian, E. B. (2006). Quantum theory: A wide spectrum. Dordrecht: Springer.
2. Wheeler, J. A., & Zurek, W. H. (Eds.). (1983). Quantum theory and measurement. Princeton: Princeton University Press.

4 Transformation Theory and Emergence of ℏ in the Formalism

Prerequisite Chap. 3

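In the last chapter, we have learned that if a system is prepared in a state $|\psi\rangle$, then the amplitude that the system is found in some state $|\phi\rangle$ is given by $\langle\phi|\psi\rangle$, and the probability of finding it in the state $|\phi\rangle$ is given by $|\langle\phi|\psi\rangle|^2$. On the other hand, suppose that $|\psi'\rangle$, $|\phi'\rangle$ denote the states resulting from $|\psi\rangle$, $|\phi\rangle$ by a symmetry transformation. Then invariance of the just mentioned probability under such a transformation means that $|\langle\phi'|\psi'\rangle|^2 = |\langle\phi|\psi\rangle|^2$. One expects to encounter two types of transformation operators that ensure such an invariance:

1. $|\psi'\rangle = U|\psi\rangle$, $|\phi'\rangle = U|\phi\rangle$, with

$$\langle U\phi|U\psi\rangle = \langle\phi|\psi\rangle,\qquad U\big[a|\phi\rangle + b|\psi\rangle\big] = a\,U|\phi\rangle + b\,U|\psi\rangle, \tag{4.1}$$

and $U$ is referred to as a (linear) unitary operator,

2. $|\psi'\rangle = U|\psi\rangle$, $|\phi'\rangle = U|\phi\rangle$, with

$$\langle U\phi|U\psi\rangle = \langle\phi|\psi\rangle^{*},\qquad U\big[a|\phi\rangle + b|\psi\rangle\big] = a^{*}\,U|\phi\rangle + b^{*}\,U|\psi\rangle, \tag{4.2}$$

and $U$ is referred to as an (anti-linear) anti-unitary operator,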

for any complex numbers $a$, $b$. This is indeed the case¹ and is referred to as Wigner’s Symmetry Transformation Theorem.² In the present chapter, we consider continuous transformations such as translations in space and time, and boosts, which impart a given velocity to a given frame of reference. In the last chapter, we have also learned how the concept of operators arises by considering a series of filtering processes. In this chapter we will see that symmetry transformations lead to basic commutation relations that are satisfied by the observables, i.e., by the physical operators involved in the transformations. We will also see how the (reduced) Planck constant ℏ arises consistently in the formalism of QM, by appropriately choosing the units of the observables in QM to coincide with those of their classical counterparts.

For the infinitesimal continuous symmetry transformations discussed above, one approaches the identity transformation. Since the identity transformation is, trivially, a unitary operator, all the transformations just mentioned, including one involving a rotation of a coordinate system, are implemented by unitary operators, i.e.,

$$|\bar\psi\rangle = U|\psi\rangle,\qquad U^\dagger U = I = U U^\dagger, \tag{4.3}$$

and for a given operator $A$,

$$\langle\bar\psi|A|\bar\psi\rangle = \langle\psi|\bar A|\psi\rangle\quad\text{implies the transformation}\quad \bar A = U^\dagger A\,U, \tag{4.4}$$

for the operator $A$. Discrete transformations, such as space reflection and time reversal, will be considered below, where we will see, in particular, that time reversal may be consistently implemented by an anti-unitary operator. We note that in the expression $\mathbf a + \mathbf v\,t$ (see Fig. 4.1), $\mathbf a$ denotes the initial displacement and $-\mathbf v$ denotes the velocity of frame F relative to $\bar{\rm F}$. One may also consider, in addition, a rotation of the coordinate system, with the space-time labeling now taking the form

$$\bar x^{\,i} = R^{ij}x^{j} - a^{i} - v^{i}t,\qquad \bar t = t - \tau, \tag{4.5}$$

$$\delta x^{i} = x^{i} - \bar x^{\,i} = \big(\delta^{ij} - R^{ij}\big)x^{j} + \delta a^{i} + \delta v^{i}\,t,\qquad \delta t = t - \bar t = \delta\tau, \tag{4.6}$$

¹ For a complete hair splitting proof of this see: Manoukian [2], pp. 55–65.

² Wigner [6]; [7], p. 64.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_4


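[Figure: a point P in space-time with labels $(\mathbf x, t)$ in frame F and $(\bar{\mathbf x}, \bar t)$ in frame $\bar{\rm F}$, which is in uniform motion relative to F: $\bar{\mathbf x} = \mathbf x - \mathbf a - \mathbf v t$, $\bar t = t - \tau$, with $\delta\mathbf x = \mathbf x - \bar{\mathbf x} = \delta\mathbf a + \delta\mathbf v\,t$ and $\delta t = t - \bar t = \delta\tau$.]

Fig. 4.1 A point P in space-time is labeled by $\mathbf x$ and $\bar{\mathbf x}$ in the respective frames F and $\bar{\rm F}$. The combination $\mathbf a + \mathbf v t$ represents a space translation as a function of time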

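where $[R^{ij}]$ is a matrix which implements a rotation of the coordinate system F by an angle, say, $\varphi$ about some unit vector $\mathbf n$. The unitary operator which implements such an infinitesimal transformation on states $|\psi\rangle$ and operators $A$ should be a linear combination of the infinitesimal parameters $\delta\mathbf a$, $\delta\mathbf v$, $\mathbf n\,\delta\varphi$, $\delta\tau$. That is, it should have the form

$$U = 1 + i\,G,\qquad G = \big[\mathbf P\cdot\delta\mathbf a - H\,\delta\tau + \mathbf N\cdot\delta\mathbf v + \mathbf J\cdot\mathbf n\,\delta\varphi\big], \tag{4.7}$$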

and is said to implement an infinitesimal Galilean transformation. The factor $i$ is introduced so that $G$ is a self-adjoint operator, $G^\dagger = G$. The latter operator is dimensionless, and the operator coefficients of $\delta a^{j}$, $\delta\tau$, $\delta v^{j}$, $n^{j}\delta\varphi$ in $G$ have, respectively, what are called in quantum physics natural units. These operators have important physical meanings and are, in general, counterparts of classical quantities which, however, are defined in terms of different units. For example, we will learn that $\mathbf P$ in (4.7) is the observable representing the momentum of a system, and according to the latter equation it would have the units [1/length] instead of [(mass×length)/second]. In order to make the comparison with the classical standards, an overall conversion factor is introduced, uniformly, by dividing $G$ in (4.7) by this conversion factor, which necessarily has the dimensions of action. The operator coefficients of the above mentioned parameters will then all be defined in the same units as their classical standards. The unit of action in quantum physics found empirically is that provided by ℏ (which is Planck’s constant divided by $2\pi$). By introducing this conversion factor we may rewrite (4.7) as³

$$U = 1 + \frac{i}{\hbar}\,G + \cdots, \tag{4.8}$$

in conformity with the classical standards. For example, as we will see below, the coefficient $P^{j}$ of $\delta a^{j}$ in their scalar product in (4.7), and the coefficient $H$ of $\delta\tau$, are associated respectively with the momentum components and the Hamiltonian of the system under consideration. Hence, upon dividing (4.7) by the unit of action ℏ, they will have the same units as in classical physics, as mentioned above. The operator $\mathbf N$ is referred to as the boost operator, and $\mathbf J$ is the familiar angular momentum operator. For simplicity, we will consider transformations not involving rotation, as depicted in Fig. 4.1. Under the infinitesimal transformation, an operator $A$, according to (4.4), transforms as

$$\bar A = U^\dagger A\,U \simeq \Big(1 - \frac{i}{\hbar}G + \cdots\Big)\,A\,\Big(1 + \frac{i}{\hbar}G + \cdots\Big), \tag{4.9}$$

$$\delta A = A - \bar A \simeq \frac{1}{i\hbar}\big[A, G\big],\qquad i\hbar\,\delta A = \delta a^{j}\big[A, P^{j}\big] + \delta v^{j}\big[A, N^{j}\big] - \delta\tau\,\big[A, H\big]. \tag{4.10}$$

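For transformations without a rotation, we have, in reference to Fig. 4.1,

$$\bar{\mathbf x} = \mathbf x - \mathbf a - \mathbf v\,t\quad\Rightarrow\quad \delta\mathbf x = \mathbf x - \bar{\mathbf x} = \delta\mathbf a + t\,\delta\mathbf v, \tag{4.11}$$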

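and for successive transformations the following chain of equations arises:

$$\bar{\mathbf x} = \mathbf x - \mathbf a - \mathbf v\,t,\qquad \bar t = t - \tau, \tag{4.12}$$

$$\bar{\bar{\mathbf x}} = \bar{\mathbf x} - \mathbf a' - \mathbf v'\,\bar t,\qquad \bar{\bar t} = \bar t - \tau'. \tag{4.13}$$

Hence

$$\bar{\bar{\mathbf x}} = \big(\mathbf x - \mathbf a - \mathbf v\,t\big) - \mathbf a' - \mathbf v'\big(t - \tau\big) \tag{4.14}$$

$$\phantom{\bar{\bar{\mathbf x}}} = \mathbf x - \big(\mathbf a + \mathbf a' - \mathbf v'\tau\big) - \big(\mathbf v + \mathbf v'\big)\,t,\qquad \bar{\bar t} = t - \big(\tau + \tau'\big), \tag{4.15}$$

and we have the following key group property for the parameters:

$$\big(\tau', \mathbf a', \mathbf v'\big)\big(\tau, \mathbf a, \mathbf v\big) = \big(\tau + \tau',\; \mathbf a + \mathbf a' - \mathbf v'\tau,\; \mathbf v + \mathbf v'\big). \tag{4.16}$$

³ The insertion of ℏ in the formalism via the consideration of general unitary transformations was emphasized by Schwinger in all his work: [3–5].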

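Moreover, we have:

The identity element:

$$I \equiv (0, \mathbf 0, \mathbf 0),\qquad (0, \mathbf 0, \mathbf 0)\big(\tau, \mathbf a, \mathbf v\big) = \big(\tau, \mathbf a, \mathbf v\big), \tag{4.17}$$

The inverse of a transformation:

$$\big(\tau, \mathbf a, \mathbf v\big)^{-1} \equiv \big(-\tau,\; -(\mathbf a + \mathbf v\,\tau),\; -\mathbf v\big),\qquad \big(-\tau,\; -(\mathbf a + \mathbf v\,\tau),\; -\mathbf v\big)\big(\tau, \mathbf a, \mathbf v\big) = (0, \mathbf 0, \mathbf 0) \equiv I, \tag{4.18}$$

The associativity rule:

$$\Big[\big(\tau_3, \mathbf a_3, \mathbf v_3\big)\big(\tau_2, \mathbf a_2, \mathbf v_2\big)\Big]\big(\tau_1, \mathbf a_1, \mathbf v_1\big) = \big(\tau_3, \mathbf a_3, \mathbf v_3\big)\Big[\big(\tau_2, \mathbf a_2, \mathbf v_2\big)\big(\tau_1, \mathbf a_1, \mathbf v_1\big)\Big]. \tag{4.19}$$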

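We consider the transformation along a closed path, where the inverse transformations are emphasized by the directions of the arrows along the way back to the starting point in the diagram below:

[Diagram: a closed four-sided path whose sides are labeled 1 and 2, traversed forwards and then backwards.]

With infinitesimal parameters, this is given by

$$\big(\delta\tau_2, \delta\mathbf a_2, \delta\mathbf v_2\big)^{-1}\big(\delta\tau_1, \delta\mathbf a_1, \delta\mathbf v_1\big)^{-1}\big(\delta\tau_2, \delta\mathbf a_2, \delta\mathbf v_2\big)\big(\delta\tau_1, \delta\mathbf a_1, \delta\mathbf v_1\big) = \big(\delta\tau, \delta\mathbf a, \delta\mathbf v\big),$$

where, directly from Eqs. (4.11), (4.12), (4.16), (4.18), we have for the above closed path:

$$\delta\tau = 0,\qquad \delta\mathbf a = \delta\mathbf v_1\,\delta\tau_2 - \delta\mathbf v_2\,\delta\tau_1,\qquad \delta\mathbf v = \mathbf 0. \tag{4.20}$$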
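The composition law (4.16), the inverse (4.18) and the closed-path result (4.20) are simple enough to be verified mechanically. The following is a minimal sketch in one space dimension with exact rational arithmetic; the function names and the sample parameter values are our own illustrative choices.

```python
from fractions import Fraction as F

def compose(g2, g1):
    """Eq. (4.16) in one dimension: apply g1 = (tau, a, v) first, then g2 = (tau', a', v')."""
    t2, a2, v2 = g2
    t1, a1, v1 = g1
    return (t1 + t2, a1 + a2 - v2 * t1, v1 + v2)

def inverse(g):
    """Eq. (4.18): the inverse of (tau, a, v)."""
    t, a, v = g
    return (-t, -(a + v * t), -v)

IDENTITY = (F(0), F(0), F(0))

# Arbitrary sample parameters (tau, a, v); they need not be small,
# since the closed-path result below holds exactly for this group law.
g1 = (F(1, 3), F(2), F(-1, 2))
g2 = (F(5, 7), F(-3, 4), F(4))
g3 = (F(-2), F(1, 9), F(3, 5))

# Closed path: g2^{-1} g1^{-1} g2 g1
closed_path = compose(inverse(g2), compose(inverse(g1), compose(g2, g1)))

# Eq. (4.20): delta_a = v1 * tau2 - v2 * tau1, with no net time shift or boost
expected = (F(0), g1[2] * g2[0] - g2[2] * g1[0], F(0))
```

Here `closed_path` and `expected` agree, so the only net effect of the closed path is a space translation, which is what forces the phase term in the corresponding unitary operator.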

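On the other hand, the unitary operator on states and operators, corresponding to the above closed path, is clearly given by

$$\Big(1 - \frac{i}{\hbar}G_2 + \cdots\Big)\Big(1 - \frac{i}{\hbar}G_1 + \cdots\Big)\Big(1 + \frac{i}{\hbar}G_2 + \cdots\Big)\Big(1 + \frac{i}{\hbar}G_1 + \cdots\Big) = 1 + \frac{1}{\hbar^2}\big[G_1, G_2\big] + \cdots = 1 + \frac{i}{\hbar}\,G + \cdots, \tag{4.21}$$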

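from which

$$G = \frac{1}{i\hbar}\big[G_1, G_2\big] = \frac{1}{i\hbar}\Big\{\delta a_1^{i}\,\delta a_2^{j}\,[P^{i}, P^{j}] - \big(\delta a_1^{i}\,\delta\tau_2 - \delta a_2^{i}\,\delta\tau_1\big)[P^{i}, H] + \big(\delta a_1^{i}\,\delta v_2^{j} - \delta a_2^{i}\,\delta v_1^{j}\big)[P^{i}, N^{j}] - \big(\delta v_1^{i}\,\delta\tau_2 - \delta v_2^{i}\,\delta\tau_1\big)[N^{i}, H] + \tfrac{1}{2}\big(\delta v_1^{i}\,\delta v_2^{j} - \delta v_2^{i}\,\delta v_1^{j}\big)[N^{i}, N^{j}]\Big\}, \tag{4.22}$$

where

$$G_\alpha = \delta\mathbf a_\alpha\cdot\mathbf P - \delta\tau_\alpha\,H + \delta\mathbf v_\alpha\cdot\mathbf N,\qquad \alpha = 1, 2, \tag{4.23}$$

and, from (4.20),

$$G = \delta\mathbf a\cdot\mathbf P - \delta\tau\,H + \delta\mathbf v\cdot\mathbf N + \delta\phi\,\mathbf 1 = \big(\delta v_1^{j}\,\delta\tau_2 - \delta v_2^{j}\,\delta\tau_1\big)P^{j} + M\big(\delta a_1^{j}\,\delta v_2^{j} - \delta a_2^{j}\,\delta v_1^{j}\big)\mathbf 1, \tag{4.24}$$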

for a closed path, where $\delta\phi$ is a phase factor which must be anti-symmetric in the indices 1 and 2, since $[G_1, G_2]$ is anti-symmetric in these indices, and must be constructed out of the scalar product of $\delta\mathbf v_\alpha$ and $\delta\mathbf a_\alpha$. It cannot depend on $\delta\tau_1\,\delta\tau_2$, since $[G_1, G_2]$ does not contain such a term due to its anti-symmetry in the two indices. This gives the explicit expression on the extreme right-hand side of (4.24), where $M$ is a constant to be interpreted physically.


Upon comparing the expression on the extreme right-hand side of Eq. (4.22) with its left-hand side, with $G$ as given in Eq. (4.24), we immediately obtain the following commutation relations:

$$[P^{i}, P^{j}] = 0,\qquad [P^{j}, H] = 0,\qquad [H, N^{j}] = i\hbar\,P^{j},\qquad [P^{i}, N^{j}] = i\hbar\,M\,\delta^{ij},\qquad [N^{i}, N^{j}] = 0. \tag{4.25}$$

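We now consider the application of Eq. (4.10) for $\delta\tau = 0$, i.e., via the generator $G = \mathbf P\cdot\delta\mathbf a + \mathbf N\cdot\delta\mathbf v$, for infinitesimal $\delta\mathbf a$, $\delta\mathbf v$, to the following operator $\mathbf Q$ defined by the linear combination

$$\mathbf Q = \frac{1}{M}\big(t\,\mathbf P - \mathbf N\big)\quad\Rightarrow\quad \delta\mathbf Q = \mathbf Q - \bar{\mathbf Q} = \mathbf Q - \Big(1 - \frac{i}{\hbar}G\Big)\mathbf Q\Big(1 + \frac{i}{\hbar}G\Big) = \frac{1}{i\hbar}\big[\mathbf Q, G\big]\quad(\text{from Eq. (4.10)})$$

$$\phantom{\mathbf Q} = \frac{1}{i\hbar}\Big(\delta a^{i}\,[\mathbf Q, P^{i}] + \delta v^{i}\,[\mathbf Q, N^{i}]\Big) = \delta\mathbf a + t\,\delta\mathbf v, \tag{4.26}$$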

and the latter follows from the definition of the operator $\mathbf Q$ and the commutation relations in (4.25). Hence Eqs. (4.11) and (4.26) allow us to identify $\mathbf Q$ with the position operator $\mathbf X$, and from its very definition, $\mathbf X = \big(t\,\mathbf P - \mathbf N\big)/M$, it satisfies, in particular, the following commutation relations, which follow directly from the ones in (4.25):

$$\big[X^{i}, X^{j}\big] = 0,\qquad \big[X^{i}, P^{j}\big] = i\hbar\,\delta^{ij},\qquad \big[\mathbf X, H\big] = i\hbar\,\frac{\mathbf P}{M}, \tag{4.27}$$

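which are admittedly well known to the reader. Now we consider two frames F and $\bar{\rm F}$ for which $\mathbf a = \mathbf 0$ (the origins of their coordinate systems coincide) and $\mathbf v = \mathbf 0$ (no relative motion), with the clock readings in frame $\bar{\rm F}$ shifted by $-\delta\tau$ relative to the ones in frame F, i.e., $\bar t = t - \delta\tau$. That is, $t$ in $\bar{\mathbf X}(t)$ corresponds to $\mathbf X(t + \delta\tau)$, with the former “lagging” behind by an amount $\delta\tau$. Accordingly,

$$\delta\mathbf X = \mathbf X(t) - \bar{\mathbf X}(t) = \mathbf X(t) - \mathbf X(t + \delta\tau) = -\delta\tau\,\dot{\mathbf X}. \tag{4.28}$$

Moreover, from (4.10) and (4.27),

$$\delta\mathbf X = (-\delta\tau)\,\frac{1}{i\hbar}\big[\mathbf X, H\big] = -\delta\tau\,\frac{\mathbf P}{M}, \tag{4.29}$$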

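from which we learn that

$$\dot{\mathbf X} = \frac{\mathbf P}{M},\qquad\text{moreover}\qquad \Big[\mathbf X, \frac{\mathbf P^2}{2M}\Big] = i\hbar\,\frac{\mathbf P}{M} = \big[\mathbf X, H\big], \tag{4.30}$$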

from which we are led to the following general expression for $H$:

$$H = \frac{\mathbf P^2}{2M} + H_1,\qquad\text{where}\quad \big[H_1, \mathbf P\big] = 0,\quad \big[H_1, \mathbf X\big] = 0. \tag{4.31}$$

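The expression $\dot{\mathbf X} = \mathbf P/M$, together with the expression for $H$ just given, leads inescapably to the identification of $\mathbf P$ as the momentum operator, $M$ as the mass and $H$ as the total Hamiltonian operator of the system in question. This will allow us, in the next chapter, to actually construct Hamiltonians for physical systems. Now it is easy to see why the time translation part of the unitary displacement operator was chosen as $-(\delta\tau/\hbar)H$ rather than $+(\delta\tau/\hbar)H$: the latter choice would give the wrong sign to the kinetic energy operator, $-\mathbf P^2/2M$, as is easily verified from Eq. (4.21) and below it. In the $\mathbf X$ description with a pure constant infinitesimal space displacement, $U = 1 + (i/\hbar)\,\delta\mathbf a\cdot\mathbf P + \cdots$, unitarity implies that

$$\langle\bar{\mathbf x}|\bar\psi\rangle = \langle\mathbf x|\psi\rangle,\qquad U|\psi\rangle = |\bar\psi\rangle,\qquad \langle\mathbf x - \delta\mathbf a| = \langle\bar{\mathbf x}| = \langle\mathbf x|\,U^\dagger. \tag{4.32}$$

Hence

$$\langle\mathbf x - \delta\mathbf a|\,U|\psi\rangle = \langle\mathbf x|\psi\rangle\quad\Rightarrow\quad (U\psi)(\mathbf x) = \psi(\mathbf x + \delta\mathbf a) = \psi(\mathbf x) + \delta\mathbf a\cdot\boldsymbol\nabla\psi(\mathbf x), \tag{4.33}$$

$$\text{i.e.,}\quad \Big(\Big[1 + \frac{i}{\hbar}\,\delta\mathbf a\cdot\mathbf P\Big]\psi\Big)(\mathbf x) = \psi(\mathbf x) + \delta\mathbf a\cdot\boldsymbol\nabla\psi(\mathbf x)\quad\Rightarrow\quad \mathbf P = -i\hbar\,\boldsymbol\nabla. \tag{4.34}$$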
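The commutation relations (4.25) and (4.27) can be checked in the $x$-description just obtained. The sketch below is only an illustration under several assumptions of ours: it works in one dimension, takes $H = P^2/2M$ (i.e., $H_1 = 0$), uses the identification $N = tP - MX$ that follows from the definition of $\mathbf Q$, and represents wavefunctions as polynomials given by lists of coefficients. The numerical values of `HBAR`, `M` and `T` are arbitrary.

```python
HBAR, M, T = 1.0, 2.0, 0.5  # illustrative values: hbar, mass and time t

def deriv(p):
    """d/dx acting on a polynomial with coefficients [c0, c1, c2, ...]."""
    return [k * p[k] for k in range(1, len(p))] or [0j]

def scale(c, p):
    return [c * pk for pk in p]

def add(p, q):
    """Sum of two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0j] * (n - len(p))
    q = list(q) + [0j] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def X(p):
    """Position operator: multiplication by x."""
    return [0j] + list(p)

def P(p):
    """Momentum operator P = -i hbar d/dx, Eq. (4.34)."""
    return scale(-1j * HBAR, deriv(p))

def H(p):
    """Free Hamiltonian P^2 / 2M (assumes H_1 = 0)."""
    return scale(1 / (2 * M), P(P(p)))

def N(p):
    """Boost operator N = t P - M X, from Q = (t P - N) / M = X."""
    return add(scale(T, P(p)), scale(-M, X(p)))

def commutator(A, B, p):
    """[A, B] applied to the polynomial p."""
    return add(A(B(p)), scale(-1, B(A(p))))

def close(p, q, tol=1e-12):
    """True if the polynomials p and q agree coefficient by coefficient."""
    return all(abs(c) < tol for c in add(p, scale(-1, q)))

psi = [1 + 2j, -0.5j, 3.0, 0.25 - 1j]  # arbitrary test polynomial

xp_ok = close(commutator(X, P, psi), scale(1j * HBAR, psi))          # [X, P] = i hbar
hn_ok = close(commutator(H, N, psi), scale(1j * HBAR, P(psi)))       # [H, N] = i hbar P
pn_ok = close(commutator(P, N, psi), scale(1j * HBAR * M, psi))      # [P, N] = i hbar M
xh_ok = close(commutator(X, H, psi), scale(1j * HBAR / M, P(psi)))   # [X, H] = i hbar P / M
```

All four flags come out `True` for any test polynomial, since the relations hold as operator identities in this representation.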

Since the total Hamiltonian $H$ and the total momentum $\mathbf P$ commute, as do the components of $\mathbf P$, we may generate finite space-time displacements in $N$ infinitesimal steps, with $\delta\tau = t/N$, $\delta\mathbf a = \mathbf a/N$, as $N \to \infty$:


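$$U(\mathbf a, t) = \Big[1 + \frac{i}{\hbar}\Big(\frac{\mathbf a}{N}\cdot\mathbf P - \frac{t}{N}\,H\Big)\Big]^{N} \to \exp\Big[\frac{i}{\hbar}\,\mathbf a\cdot\mathbf P\Big]\exp\Big[-\frac{i}{\hbar}\,t\,H\Big], \tag{4.35}$$

$$\text{or}\quad U(\mathbf a, t) = \exp\Big[\frac{i}{\hbar}\big(\mathbf a\cdot\mathbf P - t\,H\big)\Big], \tag{4.36}$$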

using the definition of an exponential, $(1 + c/N)^{N} \to \exp[c]$ for $N \to \infty$, for a given $c$. The inclusion of a rotation in the above analysis generates the familiar angular momentum commutation relations $[J^{i}, J^{j}] = i\hbar\,\varepsilon^{ijk}J^{k}$, where $\varepsilon^{ijk}$ is totally anti-symmetric in its indices, and $\varepsilon^{123} = +1$. For additional details see Manoukian [2], Chap. 2; Schwinger [3–5].

We now consider the parity transformation, i.e., space reflection, as well as time reversal. In particular, since the operation of space reflection followed by the infinitesimal changes $-\delta\mathbf a$, $-\delta\mathbf v$, $\delta\tau$ is equivalent to first performing the infinitesimal changes $\delta\mathbf a$, $\delta\mathbf v$, $\delta\tau$ followed by a space reflection operation, we may infer that

$$\Big[1 + \frac{i}{\hbar}\big(-\delta\mathbf a\cdot\mathbf P - \delta\tau\,H - \delta\mathbf v\cdot\mathbf N\big)\Big]\,\mathcal P = \mathcal P\,\Big[1 + \frac{i}{\hbar}\big(\delta\mathbf a\cdot\mathbf P - \delta\tau\,H + \delta\mathbf v\cdot\mathbf N\big)\Big], \tag{4.37}$$

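where $\mathcal P$ is the parity operator. The above gives

$$i\,\mathcal P\,\mathbf P\,\mathcal P^{-1} = i\,(-\mathbf P),\qquad i\,\mathcal P\,H\,\mathcal P^{-1} = i\,H,\qquad i\,\mathcal P\,\mathbf N\,\mathcal P^{-1} = i\,(-\mathbf N), \tag{4.38}$$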

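and the parity operator may be implemented by a unitary operator, as it does not complex conjugate the multiplicative $i$ factor (see Eq. (4.1)). On the other hand, since the operation of time reversal followed by the infinitesimal changes $\delta\mathbf a$, $-\delta\mathbf v$, $-\delta\tau$ is equivalent to the infinitesimal changes $\delta\mathbf a$, $\delta\mathbf v$, $\delta\tau$ followed by a time reversal, we may infer that

$$\Big[1 + \frac{i}{\hbar}\big(\delta\mathbf a\cdot\mathbf P + \delta\tau\,H - \delta\mathbf v\cdot\mathbf N\big)\Big]\,\mathcal T = \mathcal T\,\Big[1 + \frac{i}{\hbar}\big(\delta\mathbf a\cdot\mathbf P - \delta\tau\,H + \delta\mathbf v\cdot\mathbf N\big)\Big], \tag{4.39}$$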

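where $\mathcal T$ is the time reversal operator, which gives

$$\mathcal T\,(i\,\mathbf P)\,\mathcal T^{-1} = -i\,(-\mathbf P),\qquad \mathcal T\,(i\,H)\,\mathcal T^{-1} = -i\,H,\qquad \mathcal T\,(i\,\mathbf N)\,\mathcal T^{-1} = -i\,(\mathbf N), \tag{4.40}$$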

and as this operator complex conjugates the multiplicative $i$ factor, $i \to -i$, it may be implemented by an anti-unitary operator (see Eq. (4.2)).

We have seen above how ℏ restores, uniformly, the dimensions of the generators in Eq. (4.7), leading to the scaling of the generator $G \to G/\hbar$, as seen in Eq. (4.8), so that the dimensions of the operators for momentum, Hamiltonian, … coincide with those of their classical counterparts. For completeness, we close this chapter by elaborating on the emergence and the experimental evaluation of the fundamental unit of action ℏ. To this end, we define a Black-Body (BB) as a material surface which absorbs all the radiation incident upon it. This, in turn, heats up the material. At thermal equilibrium, with its temperature kept fixed, the BB emits electromagnetic radiation whose intensity distribution has a very special shape, sketched in Box 4.1 and referred to as a BB distribution, and the radiation emitted is referred to as BB radiation. As a model for a BB, one may consider a closed material cavity with a small hole in it, very small in comparison to the dimensions of the cavity, with the material surface kept at a fixed temperature; the radiation emitted from the interior of the enclosure may then be analyzed.

Max Planck in 1900 put forward the idea that the atoms in the material could exchange energy with the electromagnetic fields by absorbing and emitting radiation in discrete units.⁴ It was Einstein in 1905, however, who advanced the idea that radiation of frequency $\nu$, absorbed and emitted, also comes in units of energy $E = h\nu$, where $h$ is the Planck constant. Such a unit, as is well known, has come to be known as the photon. The word “photon”, however, was coined later by Lewis in 1926 [1]. The explicit distribution of the energy per unit volume per unit frequency $U(\nu, T)$ of radiation at a fixed absolute temperature $T$, as a function of the frequency $\nu$, will be derived in detail in Chap. 82 and is very much relevant to cosmology. As a matter of fact, the Microwave Background Radiation observed in nature, as we will see, fits perfectly with the BB distribution. A sketch of this distribution, as well as the equations leading to the determination of the reduced Planck constant ℏ, is given in Box 4.1 for the convenience of the reader. In particular we note that the total energy per unit volume $\int_0^\infty d\nu\,U(\nu, T) \to \infty$ for $\hbar \to 0$, as $a \to \infty$, with $a \sim 1/\hbar^3$ (see Box 4.1, and derived in Eq. (82.9)), due to the divergence of the integral at high frequencies, referred to as the ultraviolet catastrophe, in a classical description of black-body radiation.

⁴ This led to an understanding of how the color of a hot body depends on its temperature.


Box 4.1 ℏ and BB radiation: Its consistent numerical evaluation

The energy per unit volume per unit frequency of radiation emitted, $U(\nu, T)$, has the following shape for fixed $T$:

[Figure: $U(\nu, T)$ versus $\nu$ at fixed $T$, showing parabolic growth in $\nu$ at low frequencies, a maximum at $c/\lambda_{\rm peak} = c\,T/b$, and exponential damping in $\nu$ at high frequencies; the area under the curve is $a\,T^4$.]

(i) $\lambda_{\rm peak} = c/\nu$, at which $U(\nu, T)$ has its maximum value, is given by $\lambda_{\rm peak} = b/T$, with $b = 2.898 \times 10^{-3}$ m K, where K stands for the Kelvin degree. This is referred to as the Wien Displacement Law.

(ii) $\int_0^\infty d\nu\,U(\nu, T) = a\,T^4$, with $a = 4.723 \times 10^{-6}$ GeV/m³ K⁴. This is referred to as the Stefan–Boltzmann Law.

It is an empirical fact that, for a BB, the constants $b$ and $a$ are independent of the shape and the material of the cavity. That is, they cannot depend on characteristics of the material such as charge, mass, etc., and thus may involve a new constant. In particular, we note that $(\lambda_{\rm peak}/c)\,k_B T$ has the dimension of action = [energy] [second], $\propto \hbar$, where $k_B$ is the Boltzmann constant, which is a conversion factor: $T \leftrightarrow$ energy. If $\varepsilon = h\nu$ denotes the energy of a photon emitted in the radiation, then we will show in Chap. 82 that

$$b = \frac{2\pi\hbar c}{4.9651\,k_B},\qquad a = \frac{\pi^2 k_B^4}{15\,\hbar^3 c^3},$$

from which $\hbar = 6.582 \times 10^{-25}$ GeV s $= 1.0546 \times 10^{-34}$ J s, and $k_B = 8.617 \times 10^{-14}$ GeV/K $= 1.381 \times 10^{-23}$ J/K.
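The last step of Box 4.1, i.e., solving the two relations for $b$ and $a$ simultaneously for ℏ and $k_B$, may be sketched numerically as follows. The function name `hbar_and_kb` is our own; the measured inputs are the values of $b$ and $a$ quoted in the box, and the value of the speed of light is supplied by us.

```python
import math

C = 2.99792458e8      # speed of light in m/s (not quoted in the box)
B_WIEN = 2.898e-3     # m K, Wien constant b of Box 4.1
A_RAD = 4.723e-6      # GeV m^-3 K^-4, constant a of Box 4.1
X_PEAK = 4.9651       # numerical factor appearing in the expression for b

def hbar_and_kb(a, b, c=C, x=X_PEAK):
    """Solve b = 2 pi hbar c / (x k_B) and a = pi^2 k_B^4 / (15 hbar^3 c^3).

    The first relation fixes the ratio hbar / k_B; inserting it into the
    second one leaves a single power of k_B, which is then solved for.
    Returns (hbar in GeV s, k_B in GeV/K).
    """
    ratio = x * b / (2 * math.pi * c)                # hbar / k_B, in K s
    k_b = 15 * a * ratio ** 3 * c ** 3 / math.pi ** 2
    return ratio * k_b, k_b

hbar, k_b = hbar_and_kb(A_RAD, B_WIEN)
```

With the rounded inputs of the box, `hbar` and `k_b` reproduce the quoted values $6.582 \times 10^{-25}$ GeV s and $8.617 \times 10^{-14}$ GeV/K to well within one percent.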

References
1. Lewis, G. N. (1926). The conservation of photons. Nature, 118, 874–875.
2. Manoukian, E. B. (2006). Quantum theory: A wide spectrum. Dordrecht: Springer.
3. Schwinger, J. (1970). Particles, sources, and fields (Vol. I). Reading: Addison-Wesley.
4. Schwinger, J. (1991). Quantum kinematics and dynamics. Redwood City: Addison-Wesley.
5. Schwinger, J. (2001). Quantum mechanics. Symbolism of atomic measurements. Berlin: Springer.
6. Wigner, E. P. (1959). Group theory, and its applications to the quantum mechanics of atomic spectra. New York: Academic.
7. Wigner, E. P. (1963). Invariant quantum mechanical equations of motion. In Theoretical physics. Vienna: International Atomic Energy Agency.

5 Quantum Dynamics, Construction of Hamiltonians and Decay of Quantum Systems

Prerequisite Chaps. 2, 4

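In this chapter we will see how quantum dynamics may be described and how the Schrödinger equation arises by comparing physics in the frames F and $\bar{\rm F}$ as described in Fig. 4.1 of the last chapter. We will also learn how Hamiltonians are constructed consistently in QM. Finally, the decay of a quantum system will be briefly discussed and its associated expression for the energy-time uncertainty principle will be derived.

In particular, let $|\psi_{t+\delta\tau}\rangle$ denote a state determined at time $t + \delta\tau$ in frame F (see Fig. 4.1 in the previous chapter). For infinitesimal $\delta\tau$, this state is denoted by $|\bar\psi_t\rangle = \big[1 - (i/\hbar)\,\delta\tau\,H\big]|\psi_t\rangle$ in frame $\bar{\rm F}$, in which the clock reading is shifted by $-\delta\tau$ relative to the other frame, i.e., for a pure time translation

$$|\bar\psi_t\rangle = \Big[1 - \frac{i}{\hbar}\,\delta\tau\,H\Big]|\psi_t\rangle = |\psi_{t+\delta\tau}\rangle, \tag{5.1}$$

or

$$-\frac{i}{\hbar}\,H\,|\psi_t\rangle = \frac{1}{\delta\tau}\Big[|\psi_{t+\delta\tau}\rangle - |\psi_t\rangle\Big] \to \frac{\partial}{\partial t}|\psi_t\rangle,\qquad \delta\tau \to 0, \tag{5.2}$$

from which the Schrödinger equation follows,

$$i\hbar\,\frac{\partial}{\partial t}|\psi_t\rangle = H\,|\psi_t\rangle, \tag{5.3}$$

whose solution is given by

$$|\psi_t\rangle = \exp\Big[-\frac{i\,t}{\hbar}\,H\Big]|\psi\rangle,\qquad\text{where}\quad |\psi\rangle = |\psi_0\rangle. \tag{5.4}$$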
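The passage from the infinitesimal step (5.1) to the finite-time solution (5.4) can be illustrated numerically. The sketch below is only an illustration: as an assumption of ours it uses a two-level system with $H = \sigma_1$ in units where $\hbar = 1$, for which $\exp[-itH/\hbar] = \cos(t/\hbar) - i\sin(t/\hbar)\,\sigma_1$, and the function names are hypothetical.

```python
import math

HBAR = 1.0
H = [[0.0, 1.0], [1.0, 0.0]]  # illustrative two-level Hamiltonian (sigma_1)

def apply(m, v):
    """Multiply a 2x2 matrix by a two-component state."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def evolve_in_steps(psi0, t, n_steps):
    """Repeat the infinitesimal step of Eq. (5.1), with delta_tau = t / n_steps."""
    dtau = t / n_steps
    psi = list(psi0)
    for _ in range(n_steps):
        h_psi = apply(H, psi)
        psi = [psi[k] - 1j * dtau / HBAR * h_psi[k] for k in range(2)]
    return psi

def evolve_exact(psi0, t):
    """Eq. (5.4) for H = sigma_1: exp(-i t H / hbar) = cos(t/hbar) - i sin(t/hbar) sigma_1."""
    c, s = math.cos(t / HBAR), math.sin(t / HBAR)
    h_psi = apply(H, psi0)
    return [c * psi0[k] - 1j * s * h_psi[k] for k in range(2)]

psi0 = [1.0 + 0j, 0j]
stepped = evolve_in_steps(psi0, 1.0, 100_000)
exact = evolve_exact(psi0, 1.0)
error = max(abs(stepped[k] - exact[k]) for k in range(2))
```

The difference `error` decreases like $1/N$ with the number of steps $N$, in line with the limit $(1 + c/N)^{N} \to \exp[c]$.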

Let us summarize what we have learnt about the operators $\mathbf P$, $\mathbf X$, and the Hamiltonian $H$ from the previous chapter in Eqs. (4.27), (4.31):

$$[X^{i}, P^{j}] = i\hbar\,\delta^{ij},\qquad [X^{i}, X^{j}] = 0,\qquad [P^{i}, P^{j}] = 0, \tag{5.5}$$

$$H = \frac{\mathbf P^2}{2M} + H_1,\qquad\text{where}\quad [H_1, P^{j}] = 0,\quad\text{and}\quad [H_1, X^{j}] = 0. \tag{5.6}$$

These properties now allow us to construct the Hamiltonians for various physical systems. To this end, consider a system of $n$ particles of masses $m_1, \ldots, m_n$ with associated position vectors $\mathbf X_1, \ldots, \mathbf X_n$. Define the center of mass position:

$$\mathbf X = \sum_{\alpha=1}^{n}\frac{m_\alpha}{M}\,\mathbf X_\alpha,\qquad\text{where}\quad M = \sum_{\alpha=1}^{n} m_\alpha. \tag{5.7}$$

We may now repeat the analysis involved in Eqs. (4.32)–(4.34) of the previous chapter for the $n$ particles to obtain

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_5


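$$\langle\overline{\mathbf{x}}_1,\ldots,\overline{\mathbf{x}}_n|\overline{\psi}\rangle = \langle\mathbf{x}_1,\ldots,\mathbf{x}_n|\psi\rangle \;\Rightarrow\; \langle\mathbf{x}_1-\delta\mathbf{a},\ldots,\mathbf{x}_n-\delta\mathbf{a}|\overline{\psi}\rangle = \langle\mathbf{x}_1,\ldots,\mathbf{x}_n|\psi\rangle,$$

$$\langle\mathbf{x}_1,\ldots,\mathbf{x}_n|\,U\,|\psi\rangle = \langle\mathbf{x}_1+\delta\mathbf{a},\ldots,\mathbf{x}_n+\delta\mathbf{a}|\psi\rangle,$$

$$(U\psi)(\mathbf{x}_1,\ldots,\mathbf{x}_n) = \psi(\mathbf{x}_1+\delta\mathbf{a},\ldots,\mathbf{x}_n+\delta\mathbf{a}) = \Big[1+\delta\mathbf{a}\cdot\sum_{\alpha=1}^{n}\nabla_\alpha\Big]\psi(\mathbf{x}_1,\ldots,\mathbf{x}_n).$$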

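But from (4.34) in the previous chapter: $U = 1+(i/\hbar)\,\delta\mathbf{a}\cdot\mathbf{P}$, from which

$$\mathbf{P} = \sum_{\alpha=1}^{n}(-i\hbar)\nabla_\alpha = \sum_{\alpha=1}^{n}\mathbf{P}_\alpha, \quad \text{in an obvious notation.} \tag{5.8}$$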

The following commutation relations then consistently follow

$$[X_\alpha^i, X_\beta^j] = 0, \qquad [X_\alpha^i, P_\beta^j] = i\hbar\,\delta^{ij}\,\delta_{\alpha\beta}, \qquad [P_\alpha^i, P_\beta^j] = 0, \tag{5.9}$$

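where $\delta_{\alpha\beta}$ is the Kronecker delta. For the construction of multi-particle Hamiltonians, the following commutation relations are useful and are easily obtained:

$$[X_\alpha^i - X^i, P^j] = 0, \qquad [X_\alpha^i - X_\beta^i, P^j] = 0, \qquad [X_\alpha^i - X_\beta^i, X^j] = 0, \tag{5.10}$$

$$[X_\alpha^i - X^i, X^j] = 0, \qquad [P_\alpha^i - P_\beta^i, P^j] = 0, \qquad \Big[P_\alpha^i - \frac{m_\alpha}{M}P^i,\, X^j\Big] = 0, \tag{5.11}$$

$$\Big[P_\alpha^i - \frac{m_\alpha}{M}P^i,\, P^j\Big] = 0, \qquad [P_\alpha^i - P_\beta^i, X^j] = -\,i\hbar\,\frac{(m_\alpha - m_\beta)}{M}\,\delta^{ij}. \tag{5.12}$$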
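As a worked check of the less obvious entries, the short derivation below uses only the definitions of $\mathbf{X}$ in (5.7), $\mathbf{P}$ in (5.8) and the basic commutators (5.9); the summation index $\gamma$ is introduced here just for the calculation.

```latex
\begin{align*}
[P_\alpha^i, X^j]
  &= \sum_{\gamma=1}^{n}\frac{m_\gamma}{M}\,[P_\alpha^i, X_\gamma^j]
   = \sum_{\gamma=1}^{n}\frac{m_\gamma}{M}\,(-i\hbar)\,\delta^{ij}\delta_{\alpha\gamma}
   = -\,i\hbar\,\frac{m_\alpha}{M}\,\delta^{ij}, \\
[P^i, X^j]
  &= \sum_{\alpha=1}^{n}[P_\alpha^i, X^j]
   = -\,i\hbar\,\delta^{ij}\sum_{\alpha=1}^{n}\frac{m_\alpha}{M}
   = -\,i\hbar\,\delta^{ij}, \\
\Big[P_\alpha^i - \frac{m_\alpha}{M}P^i,\; X^j\Big]
  &= -\,i\hbar\,\frac{m_\alpha}{M}\,\delta^{ij} + i\hbar\,\frac{m_\alpha}{M}\,\delta^{ij} = 0, \\
[P_\alpha^i - P_\beta^i,\; X^j]
  &= -\,i\hbar\,\frac{(m_\alpha - m_\beta)}{M}\,\delta^{ij}.
\end{align*}
```

The same two steps, with the roles of positions and momenta exchanged, give $[X_\alpha^i - X^i, P^j] = i\hbar\,\delta^{ij} - i\hbar\,\delta^{ij} = 0$.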

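Accordingly a very general class of Hamiltonians consistent with Eqs. (5.6), (5.10)–(5.12) are

$$H = \frac{\mathbf{P}^2}{2M} + H_1,$$

$$H_1 = \sum_{\alpha=1}^{n}\frac{1}{2m_\alpha}\Big(\mathbf{P}_\alpha - \frac{m_\alpha}{M}\,\mathbf{P}\Big)^2 + V,$$

$$V = \sum_{1\le\alpha<\beta\le n} V_{\alpha\beta}\big(\mathbf{X}_\alpha - \mathbf{X}_\beta\big) + \sum_{\alpha=1}^{n} V_\alpha\big(\mathbf{X}_\alpha - \mathbf{X}\big).$$

…

1 The … $\mathbf{P}-\langle\mathbf{P}\rangle$ … $\ge 9\hbar^2/4$ is of no value here.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_6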

6 Stability of the H-Atom in Configuration Space

$$\int_{r\le r_0} d^3\mathbf{x}\;|\psi(\mathbf{x})|^2 = \mathrm{Prob}[r\le r_0], \tag{6.6}$$

denotes the probability that the electron, in a state $\psi$, at a radial distance $r$ from the proton, satisfies $r\le r_0$. Hence from Eqs. (6.4)–(6.6)

$$\langle\mathbf{P}^2\rangle^{1/2} \ge \frac{\hbar}{r_0}\,\mathrm{Prob}[r\le r_0]. \tag{6.7}$$

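On the other hand, for $\beta = \hbar/a_0$, with $a_0 = \hbar^2/\mu e^2$ denoting the Bohr radius, Eq. (6.3) leads to the following lower bound for the Hamiltonian of the H-atom, well known from your undergraduate studies in QM, in an arbitrary state$^2$ $\psi$:

$$\Big\langle\psi\Big|\Big[\frac{\mathbf{P}^2}{2\mu} - \frac{e^2}{|\mathbf{x}|}\Big]\Big|\psi\Big\rangle \ge -\,\frac{\mu e^4}{2\hbar^2}, \tag{6.8}$$

establishing the boundedness of the spectrum from below. This inequality is true for any state $\psi$. Since $\psi$ is arbitrary, the latter means, in turn, that we cannot find a state which gives a lower bound to the spectrum of the Hamiltonian in question lower than $-\mu e^4/2\hbar^2$ for any parameter $\mu$. In particular, if the state $\psi$ denotes a bound state, we also have

$$\Big\langle\psi\Big|\Big[\frac{\mathbf{P}^2}{2\mu} - \frac{e^2}{|\mathbf{x}|}\Big]\Big|\psi\Big\rangle < 0, \tag{6.9}$$

which, upon simply writing $\mathbf{P}^2/2\mu = \mathbf{P}^2/4\mu + \mathbf{P}^2/4\mu$, may be equivalently rewritten as

$$\Big\langle\psi\Big|\frac{\mathbf{P}^2}{4\mu}\Big|\psi\Big\rangle < -\,\Big\langle\psi\Big|\Big[\frac{\mathbf{P}^2}{4\mu} - \frac{e^2}{|\mathbf{x}|}\Big]\Big|\psi\Big\rangle. \tag{6.10}$$

Now as far as the Hamiltonian $[\,\mathbf{P}^2/4\mu - e^2/|\mathbf{x}|\,]$ is concerned, any state must satisfy the lower bound in Eq. (6.8) with $\mu$ in it simply replaced by $2\mu$; otherwise a state could be found that gives a lower value, contradicting the bound established in (6.8) with such a replacement made in defining this latter Hamiltonian. In particular, for a state $\psi$ satisfying Eq. (6.9), this implies that

$$\Big\langle\psi\Big|\Big[\frac{\mathbf{P}^2}{4\mu} - \frac{e^2}{|\mathbf{x}|}\Big]\Big|\psi\Big\rangle \ge -\,\frac{\mu e^4}{\hbar^2}, \tag{6.11}$$

which from Eq. (6.10) gives

$$\frac{1}{4\mu}\,\langle\psi|\mathbf{P}^2|\psi\rangle < \frac{\mu e^4}{\hbar^2}, \qquad \text{i.e.,} \qquad \langle\psi|\mathbf{P}^2|\psi\rangle < \frac{4\mu^2 e^4}{\hbar^2}. \tag{6.12}$$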

Hence we have the key bound for the H-atom as a bound state $\langle\mathbf{P}^2\rangle^{1/2} < 2\mu e^2/\hbar$.
− x u (x) ≡ − T  2 , which leads 3 3π 22 3π 22 d x 2 (x)     2 2/3 2 3π 1/3 to a Lieb–Thirring bound : < T, upon solving for T. d3 x 2 (x) m 16Z

for $k = 1, 2, \ldots, Z$, and where $a_0 = \hbar^2/me^2$ is the Bohr radius defined for an infinitely heavy nucleus. We may infer from (7.11) that the probability of any of the electrons falling to the center is zero, since the right-hand side of the above upper bound to the probability in question vanishes for $R\to 0$. That is, with 100% probability, none of the electrons fall ($R\to 0$) to the center of multi-electron atoms.$^7$ It is also worth indicating that, in general, a non-vanishing probability density of finding an electron at the origin does not necessarily imply that the probability of finding it there is nonzero, as the probability is obtained by multiplying the probability density by the volume element.

7 It must be stressed that any sharper estimates, such as obtaining a larger value than 3/2 for the power of $(R/a_0)$, or any smaller value of the coefficient 2.34 in the above inequality, change in no way the established result.

References
1. Blaise, P., Henri-Rousseau, O., & Merad, N. (1984). Further comments about the stability of the hydrogen atom. Journal of Chemical Education, 61, 957–958.
2. Lieb, E. H., & Thirring, W. E. (1975). Bound for the kinetic energy of fermions which proves the stability of matter. Physical Review Letters, 35, 687–689. [Errata (1975), 35, 1116(E)].
3. Luck, W. A. (1985). Why doesn’t the electron fall in the nucleus? Journal of Chemical Education, 62, 914–917.
4. Manoukian, E. B. (2006). Quantum theory: A wide spectrum. Dordrecht: Springer.
5. Manoukian, E. B. (2016). Do atomic electrons fall to the center of multi-electron atoms? Modern Physics Letters, B30, 1650082 [8 p.].
6. Mason, F. P., & Richardson, R. W. (1983). Why doesn’t the electron fall in the nucleus? Journal of Chemical Education, 60, 40–43.
7. Pauli, W. (1925). About the relationship of the closing of electron shells in atoms with the complex structure of the spectrum. Zeitschrift für Physik, 31, 765–779.
8. Schwinger, J. (1961). On the bound states of a given potential. Proceedings of National Academy of Sciences, USA, 47, 122–129.
9. Stoner, E. C. (1924). The distribution of electrons among atomic levels. Philosophical Magazine, 48, 719–725.

8 How Quantum Mechanics Forces Matter in Bulk to Occupy Such a Large Volume and Prevents It from Collapsing Around Us

Prerequisites: end of Chaps. 2, and 5

In my first year of graduate studies at McGill University in Montreal, I came across a statement originally made, without proof, by Paul Ehrenfest in an address concerning the award of the Lorentz medal to Wolfgang Pauli:$^1$ “We take a piece of metal, or a stone. When we think about it, we are astonished that this quantity of matter should occupy so large a volume”. He went on to state that the Pauli exclusion principle is the reason: “Only the Pauli Principle, no two electrons in the same state.” This statement always stayed in my mind, and an actual demonstration of a quantification of the large volume occupied by matter remained a challenge for me for years to come. This brings me to the subject matter of this chapter.

Let me reiterate what has been said about this subject in the introductory chapter, as certain aspects of it are worth repeating here. One of the greatest problems that quantum mechanics has addressed over the years is why matter in bulk does not collapse around us, which classical theory fails to describe satisfactorily, and why it occupies such a large volume. This chapter addresses, in a direct, mathematically rigorous and modern way, the fundamental role quantum mechanics plays in this, one of the most important problems in physics and, needless to say, also in chemistry and the sciences in general. If one prepares a list of the most important problems in quantum theory addressed over the years, the subject matter treated in this chapter will undoubtedly be on it. The chapter deals with the quantum theory of matter in bulk: why it does not collapse around us, and the unusually large volume that matter occupies, which was clearly emphasized by Ehrenfest as mentioned above.

It is amusing to read an article by Paul Sen [14] at the BBC, in which he emphasizes once more the large volume that matter occupies. He states: if you were to take out all of the empty space in atoms, and then compress all of the atoms so they were physically touching, all of the human beings on the planet would be about the size of an apple.

Invoking the Pauli exclusion principle in the fact that matter does not collapse around us turns out to be not only sufficient but necessary for the problem at hand.$^2$ On the other hand, if the Pauli exclusion principle is not invoked, it is interesting to quote Freeman Dyson,$^3$ who states: “[such] matter in bulk would collapse into a condensed high-density phase. The assembly of any two macroscopic objects would release energy comparable to that of an atomic bomb... . Matter without the exclusion principle is unstable.” In the translated version of the book by Shin’ichirō Tomonaga on spin,$^4$ one reads in the Preface: “The existence of spin, and the statistics associated with it, is the most subtle and ingenious design of Nature - without it the whole universe would collapse.” It is also interesting to recall the role the exclusion principle plays in the periodic table of elements.

The Hamiltonian under consideration in this work was given in Eq. (5.21) to be

1 In Klein [3], p. 617. The binding of the book did not look professionally done and was probably redone by the university library. I found a much better copy of this book several years later at the “Canada Institute for Scientific and Technical Information (CISTI)” in Ottawa, presently known as “The National Science Library (NSL)”. The problem raised by Ehrenfest stayed with me for nearly forty years and I looked hard to find an actual quantification of his statement of “matter occupying a large volume” and I could find none, not even by Ehrenfest himself. The present chapter is based on a paper: Manoukian and Sirininlakul [9]. [Paul Ehrenfest’s address appeared originally in 1931 in Versl. Akad. Amsterdam 40, pp. 121–126, and was published in Klein [3].]
2 Lieb and Thirring [5].
3 Dyson [1].
4 Tomonaga [16].

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_8

$$H = \sum_{i=1}^{N}\frac{\mathbf{p}_i^2}{2m} + \sum_{i<j}^{N}\frac{e^2}{|\mathbf{x}_i-\mathbf{x}_j|} - \sum_{i=1}^{N}\sum_{j=1}^{k}\frac{Z_j e^2}{|\mathbf{x}_i-\mathbf{R}_j|} + \sum_{i<j}^{k}\frac{Z_i Z_j e^2}{|\mathbf{R}_i-\mathbf{R}_j|}, \tag{8.1}$$

where $Z_j|e|$ is bounded in Nature and denotes the charge of the $j$th positively charged, infinitely heavy particle. Moreover, $\mathbf{x}_i$, $\mathbf{R}_j$ correspond, respectively, to the positions of the $N$ negatively and $k$ positively charged particles, and $m$ denotes the mass of the negatively charged particles. We also consider neutral matter, that is, $\sum_{j=1}^{k} Z_j = N$. The electron density is defined by

$$\rho(\mathbf{x}) = N\sum_{\sigma_1,\ldots,\sigma_N}\int d^3\mathbf{x}_2\ldots d^3\mathbf{x}_N\;|\Psi(\mathbf{x}\sigma_1,\mathbf{x}_2\sigma_2,\ldots,\mathbf{x}_N\sigma_N)|^2, \qquad \int d^3\mathbf{x}\;\rho(\mathbf{x}) = N, \tag{8.2}$$

where $\Psi(\mathbf{x}\sigma_1,\mathbf{x}_2\sigma_2,\ldots,\mathbf{x}_N\sigma_N)$ is a normalized wavefunction, anti-symmetric under the interchange of any pair $(\mathbf{x}_i\sigma_i)\leftrightarrow(\mathbf{x}_j\sigma_j)$, and the sums are over spins. As in the stability problem of the H-atom and the multi-electron atoms in the previous two chapters, a key estimate to discuss the above problems is a lower bound of the spectrum of the Hamiltonian of matter derived in a classic paper by Elliott Lieb and Walter Thirring:$^5$

$$\langle\Psi|\,H\,|\Psi\rangle \ge -\,8.3104\,\frac{me^4}{2\hbar^2}\,N\big[1+Z^{2/3}\big]^2, \qquad Z = \max_i Z_i, \tag{8.3}$$

which is referred to as a Lieb–Thirring bound. The numerical value 8.3104, however, is ours and may be further reduced, but this is not important in the present analysis.$^6$ The right-hand side of the inequality in (8.3) provides a lower bound to the spectrum.

An upper bound to the ground-state energy is readily derived. Consider the following infinitely separated $N$ clusters: $k$ ions (atoms), each in the ground-state, of nuclear charges $Z_1|e|,\ldots,Z_k|e|$, each having one electron, and $(N-k)$ free electrons, with the latter having vanishingly small kinetic energies. Formally, the ground-state energy of such a system is $-\sum_{i=1}^{k} Z_i^2\,me^4/2\hbar^2$, and since $Z_i^2\ge Z_i$ and $\sum_{i=1}^{k} Z_i = N$, this gives

$$E_N \le -\,\frac{me^4}{2\hbar^2}\,N, \tag{8.4}$$

thus establishing the $N$ power law behavior of the ground-state energy $E_N\sim -N$, with a finite (negative) numerical coefficient. As we will elaborate further at the end of the chapter, this establishes the stability of matter, i.e., it shows why matter does not collapse around us.

In Box 8.1, an upper bound of the average kinetic energy $T$ of the electrons in matter, with the latter given by

$$T = \sum_{\sigma_1,\ldots,\sigma_N}\int d^3\mathbf{x}_1\ldots d^3\mathbf{x}_N\;\Psi^*(\mathbf{x}_1\sigma_1,\ldots,\mathbf{x}_N\sigma_N)\sum_{i=1}^{N}\Big(\frac{-\hbar^2}{2m}\nabla_i^2\Big)\Psi(\mathbf{x}_1\sigma_1,\ldots,\mathbf{x}_N\sigma_N), \tag{8.5}$$

is derived and given by

$$T < 16.63\,\frac{me^4}{\hbar^2}\,N\big[1+Z^{2/3}\big]^2, \tag{8.6}$$

5 Lieb and Thirring [5]; see also Thirring [15].
6 The derivation of the lower bound in Eq. (8.3) is beyond the scope of this book. A pedagogical derivation of it may be found in my book: Manoukian [10], pp. 765–776.

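Box 8.1 Upper Bound for T

Let $|\phi(m)\rangle$ denote a normalized strictly negative energy state of matter. That is,

$$-\varepsilon_N[m] \le \langle\phi(m)|\,H\,|\phi(m)\rangle < 0, \qquad (*)$$

where $-\varepsilon_N[m] = E_N < 0$ denotes the lower end of the spectrum, and we have emphasized its dependence on the mass $m$. By definition of the ground-state, $|\phi(m/2)\rangle$ cannot lead for $\langle\phi(m/2)|\,H\,|\phi(m/2)\rangle$ to a numerical value lower than $-\varepsilon_N[m]$ for the same Hamiltonian with mass $m$. That is, $-\varepsilon_N[m] \le \langle\phi(m/2)|\,H\,|\phi(m/2)\rangle$, where we note that the interaction part $V$ in the Hamiltonian in (8.1) is independent of the mass scale $m$. Accordingly, we may rewrite the latter equation in detail as:

$$-\varepsilon_N[m] \le \Big\langle\phi(m/2)\Big|\Big[\sum_{i\le N}\frac{\mathbf{p}_i^2}{2m} + V\Big]\Big|\phi(m/2)\Big\rangle.$$

This equation, in turn, implies that for $m\to 2m$:

$$-\varepsilon_N[2m] \le \Big\langle\phi(m)\Big|\Big[\sum_{i\le N}\frac{\mathbf{p}_i^2}{4m} + V\Big]\Big|\phi(m)\Big\rangle.$$

Upon using the r.h.s. of the inequality $(*)$ above, as well as the latter inequality, we obtain:

$$\frac{1}{2}\Big\langle\phi(m)\Big|\sum_{i\le N}\frac{\mathbf{p}_i^2}{2m}\Big|\phi(m)\Big\rangle = \Big\langle\phi(m)\Big|\Big[\sum_{i\le N}\frac{\mathbf{p}_i^2}{2m} + V\Big]\Big|\phi(m)\Big\rangle - \Big\langle\phi(m)\Big|\Big[\sum_{i\le N}\frac{\mathbf{p}_i^2}{4m} + V\Big]\Big|\phi(m)\Big\rangle \le \varepsilon_N[2m]$$

for all such states $\phi(m)$. That is,

$$T \le 2\,\varepsilon_N[2m] = 2\times 8.3104\,\frac{2me^4}{2\hbar^2}\,N\big[1+Z^{2/3}\big]^2 < 16.63\,\frac{me^4}{\hbar^2}\,N\big[1+Z^{2/3}\big]^2.$$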

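while in Box 8.2, a lower bound$^7$ to $T$ is derived, expressed in terms of the electron density $\rho$:

$$\frac{3}{5}\Big(\frac{3\pi}{4}\Big)^{2/3}\frac{\hbar^2}{2m}\int d^3\mathbf{x}\;\rho^{5/3}(\mathbf{x}) \le T. \tag{8.7}$$

From the bounds in Eqs. (8.6) and (8.7), the following key bound for the integral of a power of the particle density $\rho(\mathbf{x})$ emerges:

$$\int d^3\mathbf{x}\;\rho^{5/3}(\mathbf{x}) < 32\,\frac{m^2e^4}{\hbar^4}\,N\big[1+Z^{2/3}\big]^2. \tag{8.8}$$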
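The numerical coefficients quoted in (8.6) and (8.8) can be checked with a few lines of arithmetic. The sketch below is only such a check: the function names are ours, and the physical factors $me^4/\hbar^2$, $m^2e^4/\hbar^4$ and $N[1+Z^{2/3}]^2$ are factored out so that only pure numbers are compared.

```python
import math

# Coefficient in (8.3), in units of m e^4 / (2 hbar^2) times N [1 + Z^(2/3)]^2.
LOWER_BOUND_COEFF = 8.3104


def kinetic_upper_coeff():
    """Coefficient of (m e^4 / hbar^2) N [1 + Z^(2/3)]^2 in T <= 2 eps_N[2m] (Box 8.1)."""
    # eps_N[2m] = 8.3104 * (2m) e^4 / (2 hbar^2) = 8.3104 * m e^4 / hbar^2
    return 2.0 * LOWER_BOUND_COEFF


def density_integral_coeff(t_coeff=16.63):
    """Coefficient of (m^2 e^4 / hbar^4) N [1 + Z^(2/3)]^2 bounding the integral of rho^(5/3)."""
    # Invert (8.7): integral <= T * (5/3) * (4 / (3 pi))^(2/3) * (2m / hbar^2)
    return t_coeff * (5.0 / 3.0) * (4.0 / (3.0 * math.pi)) ** (2.0 / 3.0) * 2.0
```

`kinetic_upper_coeff()` evaluates to a value just below 16.63, and `density_integral_coeff()` to a value just below 32, consistent with the rounded constants used in (8.6) and (8.8).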

Now let $\mathbf{x}$ denote the position of an electron relative, for example, to the center of mass of the nuclei, recalling that the Pauli exclusion principle was invoked in deriving the bound on the power of the electron number-density in (8.8). Let

$$\chi_R(\mathbf{x}) = 1, \text{ if } \mathbf{x} \text{ lies within a sphere of radius } R, \text{ and } = 0, \text{ otherwise.} \tag{8.9}$$

Then clearly for the probability to have the electrons within a sphere of radius $R$, we have

$$\mathrm{Prob}\big[|\mathbf{x}_1|\le R,\ldots,|\mathbf{x}_N|\le R\big] \le \mathrm{Prob}\big[|\mathbf{x}_1|\le R\big] = \frac{1}{N}\int d^3\mathbf{x}\;\chi_R(\mathbf{x})\,\rho(\mathbf{x}) \le \frac{1}{N}\Big(\int d^3\mathbf{x}\;\rho^{5/3}(\mathbf{x})\Big)^{3/5}(v_R)^{2/5}, \tag{8.10}$$

7 The lower bound makes use of an elegant bound derived by Schwinger [13], which is easily stated, as spelled out in Box 8.2, but its proof is beyond the scope of this book. For a complete and pedagogical proof of the Schwinger bound and its generalizations, see Manoukian [10], pp. 206–212, 216, 217. The explicit expression used is given in Eq. (4.5.92), p. 217 of the book.

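Box 8.2 Lower Bound for T in terms of the electron density ρ in Eq. (8.2)

First we may, in the process, make use of a Schwinger bound for the following Hamiltonian: $h_0 = \mathbf{p}^2/2m - u(\mathbf{x})$, $u(\mathbf{x})\ge 0$, given by:

$$h_0 \ge -\,\frac{4}{15\pi}\Big(\frac{2m}{\hbar^2}\Big)^{3/2}\int d^3\mathbf{x}\;\big(u(\mathbf{x})\big)^{5/2}.$$

This allows us to introduce the following hypothetical Hamiltonian: $h = \sum_{i\le N}\big[\mathbf{p}_i^2/2m - u(\mathbf{x}_i)\big]$, where

$$u(\mathbf{x}) = \frac{5}{3}\,T\,\frac{\rho^{2/3}(\mathbf{x})}{\int d^3\mathbf{x}'\;\rho^{5/3}(\mathbf{x}')} \quad (*), \qquad T = \Big\langle\Psi\Big|\sum_{i\le N}\frac{\mathbf{p}_i^2}{2m}\Big|\Psi\Big\rangle.$$

It is easily verified that $\langle\Psi|\sum_{i\le N}u(\mathbf{x}_i)|\Psi\rangle = (5/3)\,T$. Hence

$$\langle\Psi|\,h\,|\Psi\rangle = -\,\frac{2}{3}\,T. \qquad (**)$$

Allowing for multiplicity and spin degeneracy, we can put the $N$ fermions in the lowest energy levels of the hypothetical Hamiltonian $h$ in conformity with the Pauli exclusion principle, if $N\le$ the number of such levels. If $N$ is larger than this number of levels, the remaining free fermions may be chosen to have arbitrarily small ($\to 0$) kinetic energies, and be infinitely separated, to define the lowest energy of this Hamiltonian. Hence in all cases, this Hamiltonian is bounded below by 2, allowing for spin orientations, times the sum of the negative energy levels of the Hamiltonian $h_0$, allowing in the sum for multiplicity but not for spin degeneracy. This gives the bound:

$$\langle\Psi|\,h\,|\Psi\rangle \ge -\,2\,\frac{4}{15\pi}\Big(\frac{2m}{\hbar^2}\Big)^{3/2}\int d^3\mathbf{x}\;\big(u(\mathbf{x})\big)^{5/2}.$$

Using Eqs. $(*)$, $(**)$ above, this leads to the inequality

$$-\,\frac{2}{3}\,T \ge -\,2\,\frac{4}{15\pi}\Big(\frac{2m}{\hbar^2}\Big)^{3/2}\Big(\frac{5}{3}\Big)^{5/2}T^{5/2}\Big(\int d^3\mathbf{x}\;\rho^{5/3}(\mathbf{x})\Big)^{-3/2},$$

which upon solving for $T$ gives:

$$\frac{3}{5}\Big(\frac{3\pi}{4}\Big)^{2/3}\frac{\hbar^2}{2m}\int d^3\mathbf{x}\;\rho^{5/3}(\mathbf{x}) \le T.$$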

where in the last inequality we have used Hölder’s inequality, which generalizes the Schwarz inequality and states, in particular, that for two positive real functions $f_1(\mathbf{x})$, $f_2(\mathbf{x})$, and real numbers $a>0$, $b>0$, such that $(1/a)+(1/b)=1$,

$$\int d^3\mathbf{x}\;f_1(\mathbf{x})\,f_2(\mathbf{x}) \le \Big(\int d^3\mathbf{x}\;f_1^{a}(\mathbf{x})\Big)^{1/a}\Big(\int d^3\mathbf{x}\;f_2^{b}(\mathbf{x})\Big)^{1/b}, \tag{8.11}$$

with $a = 5/3$, $b = 5/2$, $(1/a)+(1/b)=1$, and the fact that $[\chi_R(\mathbf{x})]^{5/2} = \chi_R(\mathbf{x})$, and that $v_R = 4\pi R^3/3$. From (8.8) and (8.10), we have the fundamental inequality

$$\Big(\frac{1}{v_R}\Big)^{2/5}N^{2/5}\;\mathrm{Prob}\big[|\mathbf{x}_1|\le R,\ldots,|\mathbf{x}_N|\le R\big] < 8\,\Big(\frac{me^2}{\hbar^2}\Big)^{6/5}\big[1+Z^{2/3}\big]^{6/5}.$$

… $\alpha > 1$, where $(N+N)$ denotes the number of the negatively charged particles plus an equal number of positively charged particles.$^9$ This behavior for “bosonic matter” is unlike that of matter, with the exclusion principle, for which$^{10}$ $\alpha = 1$, as we have seen from Eqs. (8.3) and (8.4). A power law behavior with $\alpha > 1$ implies instability, as the formation of a single system consisting of $(2N+2N)$ particles is favored over two separate systems brought together, each consisting of $(N+N)$ particles, and the energy released upon collapse of the two systems into one, being proportional to $[(2N)^\alpha - 2(N)^\alpha]$, will be overwhelmingly large$^{11}$ for realistic large $N$, e.g., $N\sim 10^{23}$.

8 This was derived in: Manoukian and Sirininlakul [9]. See also Manoukian [12] for a pedagogical treatment.
9 See: Dyson [1]; Dyson and Lenard [2]; Lenard and Dyson [4]; Lieb [6]; Manoukian and Muthaporn [7]; Manoukian and Sirininlakul [8].
10 This is the key result established rigorously in the monumental paper by Lieb and Thirring [5].
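To get a feeling for the size of the factor $[(2N)^\alpha - 2(N)^\alpha]$, the sketch below evaluates it for $N\sim 10^{23}$. The function name is ours, and the value $\alpha = 5/3$ is used purely as an illustrative exponent larger than one (it is the exponent appearing in the title of Lieb [6]); the overall energy scale multiplying this factor is not included.

```python
def released_energy_factor(n, alpha):
    """Return (2n)**alpha - 2 * n**alpha, the factor the released energy is proportional to."""
    return (2.0 * n) ** alpha - 2.0 * n ** alpha


N_PARTICLES = 1e23  # realistic macroscopic particle number quoted in the text

# alpha = 1: matter with the exclusion principle, no energy gain from merging two systems.
stable_case = released_energy_factor(N_PARTICLES, 1.0)

# alpha = 5/3: illustrative super-linear exponent, assumed here only as an example.
unstable_case = released_energy_factor(N_PARTICLES, 5.0 / 3.0)
```

For $\alpha = 1$ the factor vanishes, while for any $\alpha > 1$ it grows like $(2^\alpha - 2)\,N^\alpha$, which is enormous for macroscopic $N$.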

References
1. Dyson, F. J. (1967). Ground-state energy of a finite system of charged particles. Journal of Mathematical Physics (NY), 8, 1538–1545.
2. Dyson, F. J., & Lenard, A. (1967). Stability of matter. I. Journal of Mathematical Physics (NY), 8, 423–434.
3. Klein, M. J. (Ed.). (1959). Paul Ehrenfest: Collected scientific papers. Amsterdam: North-Holland.
4. Lenard, A., & Dyson, F. J. (1968). Stability of matter. II. Journal of Mathematical Physics (NY), 9, 698–709.
5. Lieb, E. H., & Thirring, W. E. (1975). Bound for the kinetic energy of fermions which proves the stability of matter. Physical Review Letters, 35, 687–689. [Errata (1975), 35, 1116(E).]
6. Lieb, E. H. (1979). The $N^{5/3}$ law for bosons. Physics Letters, 70A, 71–73.
7. Manoukian, E. B., & Muthaporn, C. (2002). The collapse of “bosonic matter”. Progress of Theoretical Physics, 107, 927–939.
8. Manoukian, E. B., & Sirininlakul, S. (2004). Rigorous lower bounds for the ground state energy of matter. Physics Letters, 332A, 54–59. [Errata (2004), 337A, 496(E).]
9. Manoukian, E. B., & Sirininlakul, S. (2005). High density limit and inflation of matter. Physical Review Letters, 95, (190402), 1–3.
10. Manoukian, E. B. (2006). Quantum theory: A wide spectrum. Dordrecht: Springer.
11. Manoukian, E. B., Muthaporn, C., & Sirininlakul, S. (2006). Collapsing stage of “bosonic matter”. Physics Letters, 352A, 488–490.
12. Manoukian, E. B. (2013). Why matter occupies so large a volume? Communications in Theoretical Physics, 60, 677–686.
13. Schwinger, J. (1961). On the bound states of a given potential. Proceedings of National Academy of Sciences, USA, 47, 122–129.
14. Sen, P. (2007). You can’t see the atom. Published by BBC News: 2007, News Front Page. UK.
15. Thirring, W. E. (Ed.). (2005). The stability of matter: From atoms to stars, Selecta of E. H. Lieb (4th ed.). Heidelberg: Springer.
16. Tomonaga, S.-I. (1997). The story of spin (T. Oka, Trans.). Chicago: University of Chicago Press.

11 For the collapsing stage of “bosonic matter” see: Manoukian, Muthaporn and Sirininlakul [11].

9 Schrödinger’s Cat and Quantum Decoherence

Prerequisites: Chaps. 2, 3

The “Schrödinger’s cat” paradox arises when one is dealing with macroscopic systems, due to interference terms associated with the superposition principle, common in quantum physics, as we have encountered in Chap. 3. Schrödinger’s thought experiment of 1935$^1$ consists, in a simple description, of a vessel containing a live cat coupled by a lethal device to a radioactive substance. If the latter decays, this triggers the device to release a deadly gas and the cat dies. On the other hand, if no decay occurs, the cat lives (see Fig. 9.1). With the radioactive decay law obeying the probabilistic rules of quantum physics, a decay may or may not occur in the vessel within a given time specified by the “experimentalist”, which depends on the half-life of the radioactive substance. According to the principle of superposition of quantum physics of adding amplitudes, the state of the combined system inside the vessel will have the form

$$|\Phi\rangle = |\text{no decay, cat alive}\rangle\,a_+ + |\text{decay, cat dead}\rangle\,a_-, \tag{9.1}$$

with some coefficients $a_\pm$, consistent with the proper normalizability of the state $\Phi$. In one’s usual perception of the macroscopic world, the cat is either alive or dead and not in a superposition of the two states. The above state leads to a density operator given by

$$|\Phi\rangle\langle\Phi| = \Big[\,|\text{no decay, cat alive}\rangle\langle\text{no decay, cat alive}|\;|a_+|^2 + |\text{decay, cat dead}\rangle\langle\text{decay, cat dead}|\;|a_-|^2\,\Big]$$
$$\qquad + \Big[\,|\text{no decay, cat alive}\rangle\langle\text{decay, cat dead}|\;a_+a_-^* + |\text{decay, cat dead}\rangle\langle\text{no decay, cat alive}|\;a_+^*a_-\,\Big]. \tag{9.2}$$

The expressions within the second set of square brackets in (9.2) involve unrealistic interference terms between cat dead and cat alive states. Although such interference terms are common in the quantum microscopic world as seen, e.g., in Chap. 2, they are problematic in the interpretation of macroscopic phenomena. Unless one looks into the vessel, the cat is in a superposition of alive and dead states. One may argue, however, that macroscopic systems involving meters for measurements in the real world are never in isolation from the environment and are coupled to it. The environment surrounding a physical system under study consists of everything else monitoring observables being measured and provides, as one may argue, the different alternative readings of a meter being sought.$^2$ This gives a rather natural way of producing classical correlations between a system and the meter (detector), destroying quantum interference terms. The destruction of such interference terms, referred to as quantum decoherence, ensures that the system is in one of its alternative states rather than in a superposition of them.$^3$ Modelings of the environment as coupled to a physical system under consideration may be argued to destroy such interference terms as the ones in the second set of square brackets in (9.2), giving rise only to the expressions within the first set of square brackets, in conformity with one’s classical conception of measurements in the macroscopic world. We describe a simple model of generating a Schrödinger cat-like state and then show how quantum decoherence sets in, destroying interference terms.

1 Schrödinger [7]. See also Polkinghorne [5].
2 Zurek [10].
3 Decoherence was introduced by Zeh [9]. For recent extensive studies of decoherence see, e.g., the following books: Joos [2]; Schlosshauer [6].

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_9

Fig. 9.1 (a) In the vessel no decay has occurred and the cat is alive. (b) A decay has occurred and the cat dies. Unless one looks into the vessel, should one assume that the cat is in a superposition of cat alive/cat dead states leading to interference terms between these cat dead and cat alive states?

The modeling is based on an experiment carried out by M. Brune et al.,$^4$ in which the coupling of a two-level atom with a few-photon coherent field in a cavity was considered, generating a Schrödinger cat-like state of radiation, and the quantum decoherence in a measurement process was observed for such a state. The decoherence is considered as being due to dissipation corresponding to absorption by the cavity walls. One may model the experiment by considering the coupling of a spin 1/2 system and a harmonic oscillator, in a coherent state, to generate a “Schrödinger cat state”:

$$|\phi\rangle = \frac{1}{\sqrt{2}}\left[\begin{pmatrix}1\\0\end{pmatrix}|\alpha\rangle + \begin{pmatrix}0\\1\end{pmatrix}|-\alpha\rangle\right], \tag{9.3}$$

where $(1\;0)^{\top}$, $(0\;1)^{\top}$ correspond to two states of an atom, while the coherent states $|\alpha\rangle$, $|-\alpha\rangle$ correspond to two configurations of radiation. Intricacies of the harmonic oscillator are spelled out in Box 9.1, while those of coherent states are given in Box 9.2. The state in (9.3) is referred to as an entangled state in the sense that it cannot be written as the product of two states, with one involving spin states and the other involving just radiation states.$^5$ The spin states $(1\;0)^{\top}$, $(0\;1)^{\top}$ correspond to the radioactive substance in which no decay/a decay has occurred, while $|\alpha\rangle$/$|-\alpha\rangle$ correspond to the cat alive, cat dead, respectively. A key result in a coherent state description of the harmonic oscillator is that the expectation values of the position and momentum operators with respect to a coherent state coincide with the corresponding classical solutions of a harmonic oscillator (see Box 9.2). This is also true for the Hamiltonian of the harmonic oscillator, up to the familiar zero point energy of a harmonic oscillator, as also given in Box 9.2. Thus coherent states allow us to bring the quantum mechanical treatment of the harmonic oscillator into very close proximity with its classical counterpart description, and a contact with classical notions is most suitable for the experimentalist.

To generate the Schrödinger cat-like state in (9.3), we consider as a first stage the interaction of the spin of a spin 1/2 particle, in an initial state $(1\;0)^{\top}$, with a constant magnetic field $\mathbf{B} = B\,(0,1,0)$, with $B>0$, for a time $t = \pi mc/(2|e|B)$ with Hamiltonian $H_1 = -\boldsymbol{\mu}\cdot\mathbf{B}$,$^6$ where $\boldsymbol{\mu}$ denotes the magnetic dipole moment of the electron given by $\boldsymbol{\mu} = (e\hbar/2mc)\,\boldsymbol{\sigma}$ (see Box 9.3). This generates the state

$$\exp\Big[-\frac{i\,t}{\hbar}\,(-\boldsymbol{\mu}\cdot\mathbf{B})\Big]\begin{pmatrix}1\\0\end{pmatrix} = \frac{1}{\sqrt{2}}\left[\begin{pmatrix}1\\0\end{pmatrix} + \begin{pmatrix}0\\1\end{pmatrix}\right] = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\1\end{pmatrix}, \tag{9.4}$$

4 Brune et al. [1]. For another interesting experiment see, e.g.: Monroe et al. [4]. The latter experiment involved preparing an atom in a superposition of two spatially separated but localized wavepackets, thus creating a state $|\phi\rangle = [\,|x_1\rangle|\uparrow\rangle + |x_2\rangle|\downarrow\rangle\,]/\sqrt{2}$, where $|x_1\rangle$, $|x_2\rangle$ refer to wavepacket states corresponding to separated positions of the atom, and $|\uparrow\rangle$, $|\downarrow\rangle$ refer to internal states of the atom. The extension of the wavepackets was about 7 nm, and the separation between the wavepackets was not smaller than the rather macroscopic distance of 80 nm, which is large in comparison to the atomic dimension of the order of 0.1 nm.
5 That is, it cannot be written as $\big(a_1(1\;0)^{\top} + a_2(0\;1)^{\top}\big)\big(b_1|\alpha\rangle + b_2|-\alpha\rangle\big)$, as is easily verified.
6 If you are not familiar with the expression of this interaction, see Box 9.3 for all the details needed to see how such a Hamiltonian arises from the interaction of spin with a magnetic field.

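where $\boldsymbol{\mu}\cdot\mathbf{B} = (e\hbar/2mc)\,B\,\sigma_2$, and where we have used the identity involving the Pauli matrices

$$\exp\Big[i\,\frac{\varphi}{2}\,\mathbf{n}\cdot\boldsymbol{\sigma}\Big] = \cos\frac{\varphi}{2} + i\,[\,\mathbf{n}\cdot\boldsymbol{\sigma}\,]\sin\frac{\varphi}{2}, \tag{9.5}$$

for a given unit vector $\mathbf{n}$. As a second stage, we consider the interaction of the spin in the “initial state”, given now on the right-hand side of (9.4), with a harmonic oscillator taken in an initial coherent state $|-i\alpha_0\rangle$, with $\alpha_0$ being real and positive. The Hamiltonian will be taken in the simple form

$$H_2 = \hbar\omega\,a^\dagger a - \lambda\,\sigma_3\,a^\dagger a, \qquad \lambda > 0, \tag{9.6}$$

operating for a time $t = (\hbar\pi/2\lambda)$, where $a^\dagger$, $a$ are the creation, annihilation operators of a harmonic oscillator (see Box 9.1), and $\lambda$ is a parameter. This generates the state

$$\exp\Big[-\frac{i}{\hbar}\,t\,H_2\Big]\frac{1}{\sqrt{2}}\begin{pmatrix}1\\1\end{pmatrix}|-i\alpha_0\rangle = \frac{1}{\sqrt{2}}\left[\begin{pmatrix}1\\0\end{pmatrix}\exp\Big[-\frac{i\,t}{\hbar}(\hbar\omega-\lambda)\,a^\dagger a\Big]|-i\alpha_0\rangle + \begin{pmatrix}0\\1\end{pmatrix}\exp\Big[-\frac{i\,t}{\hbar}(\hbar\omega+\lambda)\,a^\dagger a\Big]|-i\alpha_0\rangle\right]$$
$$\qquad = \frac{1}{\sqrt{2}}\left[\begin{pmatrix}1\\0\end{pmatrix}|\alpha\rangle + \begin{pmatrix}0\\1\end{pmatrix}|-\alpha\rangle\right], \tag{9.7}$$

$$\alpha = e^{-i\phi}\,\alpha_0, \qquad \phi = \frac{\hbar\omega}{\lambda}\,\frac{\pi}{2}, \tag{9.8}$$

Box 9.1 The harmonic oscillator

You were undoubtedly introduced to the intricacies of the harmonic oscillator in your undergraduate training. Briefly, the Hamiltonian of a harmonic oscillator in one dimension is defined by $H = -(\hbar^2/2m)(\partial/\partial x)^2 + m\omega^2x^2/2$ in a standard notation. By defining the scaled variables $X = \sqrt{m\omega/\hbar}\;x$, $P = (1/\sqrt{m\omega\hbar})(-i\hbar\,\partial/\partial x)$, and introducing the operators

$$a = \frac{X+iP}{\sqrt{2}}, \qquad a^\dagger = \frac{X-iP}{\sqrt{2}}, \qquad x = \sqrt{\frac{\hbar}{2m\omega}}\,[a^\dagger + a], \qquad p = i\sqrt{\frac{m\omega\hbar}{2}}\,[a^\dagger - a],$$

the Hamiltonian takes the simple form $H = \hbar\omega\,(a^\dagger a + 1/2)$, satisfying the eigenvalue equation $H|n\rangle = \hbar\omega\,(n+1/2)|n\rangle$, $\langle n'|n\rangle = \delta_{n'n}$, where $a^\dagger|n\rangle = \sqrt{n+1}\,|n+1\rangle$, $a|n\rangle = \sqrt{n}\,|n-1\rangle$. The matrix elements of the position and momentum operators are

$$\langle n'|x(t)|n\rangle = \sqrt{\frac{\hbar}{2m\omega}}\Big[\sqrt{n+1}\;\delta_{n',n+1}\,e^{i\omega t} + \sqrt{n}\;\delta_{n',n-1}\,e^{-i\omega t}\Big], \quad t\ge 0, \qquad (*)$$

$$\langle n'|p(t)|n\rangle = i\sqrt{\frac{m\omega\hbar}{2}}\Big[\sqrt{n+1}\;\delta_{n',n+1}\,e^{i\omega t} - \sqrt{n}\;\delta_{n',n-1}\,e^{-i\omega t}\Big], \quad t\ge 0, \qquad (**)$$

which are non-diagonal, and any resemblance to the well known classical solutions $x_c(t) = |A|\cos(\omega t-\delta)$, $p_c(t) = -m\omega|A|\sin(\omega t-\delta)$ of a harmonic oscillator is quite remote. A coherent state $|z\rangle$ is defined in such a way that the expectation values $\langle z|x(t)|z\rangle \equiv x_c(t)$, $\langle z|p(t)|z\rangle \equiv p_c(t)$ coincide with the classical solutions, and $\langle z|H|z\rangle$ is as close as possible to the classical Hamiltonian $H_c = |A|^2(m\omega^2/2)$. More precisely, $\langle z|H|z\rangle = H_c + (\hbar\omega/2)$, taking into account the zero point energy $(\hbar\omega/2)$. The coherent states are special linear combinations of the states $|n\rangle$, and are given in Box 9.2, for $z = \sqrt{m\omega/2\hbar}\;|A|\,e^{i\delta}$.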
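The first-stage rotation in (9.4) can be checked numerically from the identity (9.5). The sketch below is a minimal check under the conventions stated in the text ($\mathbf{n} = (0,1,0)$, $e = -|e|$, $t = \pi mc/(2|e|B)$); the function names are ours. Since $i\sigma_2$ is a real matrix, plain real arithmetic suffices.

```python
import math


def rotation_about_y(angle):
    """2x2 matrix exp(i*(angle/2)*sigma_2), built from identity (9.5) with n = (0, 1, 0)."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    # sigma_2 = [[0, -i], [i, 0]], so i*sigma_2 = [[0, 1], [-1, 0]] is real.
    return [[c, s], [-s, c]]


def apply(matrix, vector):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# exp[-(i t/hbar)(-mu.B)] = exp[i*(e*B*t/(2*m*c))*sigma_2]. With e = -|e| and
# t = pi*m*c/(2|e|B) the exponent is -i*(pi/4)*sigma_2, i.e. angle = -pi/2 in (9.5).
final_state = apply(rotation_about_y(-math.pi / 2), [1.0, 0.0])
```

Both components of `final_state` come out equal to $1/\sqrt{2}$, reproducing the right-hand side of (9.4).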

Box 9.2 Coherent states of the harmonic oscillator

Using the explicit expressions for $\langle n'|x(t)|n\rangle$, $\langle n'|p(t)|n\rangle$, $H|n\rangle = \hbar\omega\,(n+1/2)|n\rangle$, $\langle n'|n\rangle = \delta_{n',n}$ given in Box 9.1, it is easy to verify that the linear combination $|z\rangle$ of the states $|n\rangle$ given by

$$|z\rangle = e^{-|z|^2/2}\sum_{n=0}^{\infty}\frac{(z)^n}{\sqrt{n!}}\,|n\rangle, \qquad |z;t\rangle = e^{-i\,tH/\hbar}\,|z\rangle = e^{-|z|^2/2}\,e^{-i\omega t/2}\sum_{n=0}^{\infty}\frac{\big(z\,e^{-i\omega t}\big)^n}{\sqrt{n!}}\,|n\rangle,$$

satisfies $\langle z|x(t)|z\rangle \equiv x_c(t)$, $\langle z|p(t)|z\rangle \equiv p_c(t)$, which coincide with the classical solutions, and $\langle z|H|z\rangle$ is as close as possible to the classical Hamiltonian $H_c = |A|^2(m\omega^2/2)$: $\langle z|H|z\rangle = H_c + (\hbar\omega/2)$, provided $z = \sqrt{m\omega/2\hbar}\;|A|\,e^{i\delta}$, by using the equations labeled by $(*)$, $(**)$ in Box 9.1, and the fact that $\sum_{n=0}^{\infty}(|z|^2)^n/n! = e^{|z|^2}$. The following properties should be noted:

(1) $\langle z|z\rangle = 1$,
(2) $\langle z'|z\rangle = \exp\big[\,z'^*z - |z|^2/2 - |z'|^2/2\,\big]$,
(3) $\langle -z|z\rangle = \exp\big[-2\,|z|^2\big]$,
(4) $|\langle x|z;t\rangle|^2 = \sqrt{m\omega/\pi\hbar}\;\exp\big[-m\omega\,(x-x_c)^2/\hbar\big]$ (i.e., it is normally distributed about the classical solution).

We have seen that the coherent states are special linear combinations of the states $|n\rangle$. The coherent states allow us to bring the quantum mechanical treatment of the harmonic oscillator into very close proximity with its classical counterpart description, and a contact with classical notions is most suitable for the experimentalist. For more details on coherent states of the harmonic oscillator see, e.g., Manoukian [3], Sect. 6.6.
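The properties listed in Box 9.2, and the way the second stage in (9.7) turns $|-i\alpha_0\rangle$ into $|\pm\alpha\rangle$, can be illustrated with a truncated sum over number states. This is a sketch under stated assumptions: units with $\hbar = 1$, a cutoff `n_max` on the series, and helper names (`fock_amplitudes`, `overlap`, `rotate`, `second_stage_labels`) that are ours rather than the book's.

```python
import cmath
import math


def fock_amplitudes(z, n_max=80):
    """Amplitudes <n|z> = exp(-|z|^2/2) z^n / sqrt(n!) for n = 0..n_max (truncated series)."""
    amps = []
    term = cmath.exp(-abs(z) ** 2 / 2)  # n = 0 amplitude
    for n in range(n_max + 1):
        amps.append(term)
        term = term * z / math.sqrt(n + 1)  # recursion to the (n+1)-th amplitude
    return amps


def overlap(z1, z2, n_max=80):
    """<z1|z2> evaluated as a truncated sum over number states."""
    a1, a2 = fock_amplitudes(z1, n_max), fock_amplitudes(z2, n_max)
    return sum(x.conjugate() * y for x, y in zip(a1, a2))


def rotate(z, theta):
    """Label of exp(-i*theta*a^dagger a)|z>, which is the coherent state |z e^{-i theta}>."""
    return z * cmath.exp(-1j * theta)


def second_stage_labels(alpha0, omega, lam):
    """Coherent-state labels of the two spin branches in (9.7), with hbar = 1 (assumption)."""
    t = math.pi / (2 * lam)                 # t = hbar*pi/(2*lambda)
    phi = (omega / lam) * (math.pi / 2)     # phase in (9.8)
    alpha = cmath.exp(-1j * phi) * alpha0
    up = rotate(-1j * alpha0, t * (omega - lam))    # spin-up branch
    down = rotate(-1j * alpha0, t * (omega + lam))  # spin-down branch
    return up, down, alpha
```

With these helpers, `overlap(z, z)` is 1 to rounding error, `overlap(-z, z)` agrees with $\exp[-2|z|^2]$, and `second_stage_labels` returns branch labels equal to $+\alpha$ and $-\alpha$, as in (9.7) and (9.8). The small overlap in property (3) is what makes the two “cat” components nearly orthogonal for large $|z|$.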

Box 9.3 How does the interaction term between the spin and the magnetic field arise?

The interaction term between the spin of, say, the electron (spin 1/2) and an external magnetic field $\mathbf{B}$ arises in the following manner. Consider the kinetic energy operator $\mathbf{p}^2/2m$ in the Hamiltonian of a particle of mass $m$ and charge $e$. Let $\mathbf{A}$ denote an external vector potential. The magnetic field is then given by $\mathbf{B} = \nabla\times\mathbf{A}$, and in the Coulomb gauge $\nabla\cdot\mathbf{A} = 0$. Now one makes the so-called minimal coupling substitution $\mathbf{p} \to \mathbf{p} - (e/c)\mathbf{A}$, where $c$ is the speed of light. Upon using the identity involving the Pauli matrices, $\sigma_i\sigma_j = \delta_{ij} + i\,\varepsilon_{ijk}\sigma_k$, $i, j, k = 1, 2, 3$, where $\varepsilon_{ijk}$ is totally anti-symmetric in its indices and $\varepsilon_{123} = +1$, the quadratic term $\mathbf{p}^2$ in the kinetic energy operator undergoes the following transformation:

$$\mathbf{p}^2 \equiv p_i p_j\,\sigma_i\sigma_j \;\to\; \Big(p_i - \frac{e}{c}A_i\Big)\Big(p_j - \frac{e}{c}A_j\Big)\big(\delta_{ij} + i\,\varepsilon_{ijk}\sigma_k\big) = \Big(\mathbf{p} - \frac{e}{c}\mathbf{A}\Big)^2 - \frac{e\hbar}{c}\,\boldsymbol{\sigma}\cdot(\nabla\times\mathbf{A}),$$

where we have used the facts that $\mathbf{p} = -i\hbar\nabla$, $(\mathbf{p}\cdot\mathbf{A}) = 0$, $\varepsilon_{ijk}\nabla_i A_j \equiv (\nabla\times\mathbf{A})_k$. All told, the following transformation arises in the kinetic energy operator: $\mathbf{p}^2/2m \to \big(\mathbf{p} - (e/c)\mathbf{A}\big)^2/2m - \boldsymbol{\mu}\cdot\mathbf{B}$, where $e < 0$ is the charge of the particle, $\boldsymbol{\mu} = (e/mc)\,\mathbf{S}$ is called the magnetic dipole moment, and $\mathbf{S} = \hbar\boldsymbol{\sigma}/2$ is the spin of the particle. This leads to the interaction term of the spin of the particle with a magnetic field in a Hamiltonian, $-\boldsymbol{\mu}\cdot\mathbf{B}$, as given above Eq. (9.4).

and we have used the property of the general operator $\exp[A\,a^\dagger a]$, acting on a coherent state, for an arbitrary numerical coefficient $A$, described in Box 9.1. This second stage mimics an interaction which finally gives rise to the correlation between the radioactive substance and the cat as discussed below (9.3) [or, formally, the interaction of an atom in a superposition of two of its levels with a coherent field in a cavity]. The combined system generated in (9.7) is now considered to be exposed to the environment. The environment will be described by a harmonic oscillator of several degrees of freedom. The time development of the system in (9.7) exposed to the environment will be taken to develop via the Hamiltonian:

$$H = H_0 + H_I,\qquad H_0 = \sum_k \hbar\omega_k\,a_k^\dagger a_k, \tag{9.9}$$

$$H_I = a^\dagger\sum_k \lambda_k b_k + a\sum_k \lambda_k^{*}\, b_k^\dagger, \tag{9.10}$$

omitting, for simplicity, the zero point energy, where $H_I$ describes the interaction of the meter with the environment, and $b_k$, $b_k^\dagger$ denote the annihilation and creation operators associated with the various degrees of freedom of the environment. We consider a continuous variable extension of the couplings $\lambda_k$, localized about the angular frequency $\omega$ of the meter, and express it as $\lambda(\omega)$. More precisely, one may introduce the density of such states $n(\omega)$, replacing the summations over $k$ in (9.9), (9.10) by an integral, and thus introduce, in the process, the parameter

$$\gamma = \frac{2\pi}{\hbar^2}\,|\lambda(\omega)|^2\,n(\omega). \tag{9.11}$$

For weak couplings and times short enough that the oscillator (the meter) has not changed much, we may take

$$\gamma t/2 \ll 1. \tag{9.12}$$

We define the initial states of the meter and the environment, $|\phi\rangle$, $|\phi_0\rangle_E$, as well as the density operator

$$\rho_T(t) = e^{-itH/\hbar}\,\rho_T(0)\,e^{itH/\hbar},\qquad \rho_T(0) = \big(|\phi\rangle\langle\phi|\big)\big(|\phi_0\rangle_E\,{}_E\langle\phi_0|\big). \tag{9.13}$$

The initial state of the environment is taken to be the ground state $|\phi_0\rangle_E = |0\rangle_E$. Hence, in particular,

$${}_E\langle 0|a_k|0\rangle_E = 0,\qquad {}_E\langle 0|a_k^\dagger|0\rangle_E = 0. \tag{9.14}$$

We take the trace over the environment (E) variables, thus defining the density operator of physical interest, pertaining to the meter-physical system in response to the environment, by

$$\rho(t) = \mathrm{Tr}_E\big[\rho_T(t)\big]. \tag{9.15}$$

To solve for $\rho(t)$, one may introduce, in the process, the operator

$$\eta(t) = e^{itH_0/\hbar}\,\rho_T(t)\,e^{-itH_0/\hbar},\qquad \mathrm{Tr}_E\big[\eta(t)\big] = \rho(t), \tag{9.16}$$

where $\eta(t)$, from the expression of $\rho_T(t)$ in (9.13), satisfies the differential equation

$$i\hbar\,\frac{d}{dt}\eta(t) = [H_I(t), \eta(t)],\qquad H_I(t) = e^{itH_0/\hbar}\,H_I\,e^{-itH_0/\hbar}. \tag{9.17}$$

Box 9.4 Quantum decoherence: Interaction of the meter with the environment

The differential equation $i\hbar\,d\eta(t)/dt = [H_I(t), \eta(t)]$ given in Eq. (9.17) may be solved to second order:

$$\eta(t) = \eta(0) - \frac{i}{\hbar}\int_0^t dt'\,\big[H_I(t'), \eta(0)\big] - \frac{1}{\hbar^2}\int_0^t dt'\int_0^{t'} dt''\,\big[H_I(t'), \big[H_I(t''), \eta(0)\big]\big],$$

where, from Eqs. (9.13) and (9.16), $\eta(0) = \big(|0\rangle_E\,{}_E\langle 0|\big)\big(|\phi\rangle\langle\phi|\big) \equiv \rho_E(0)\,\rho_S(0)$, respectively. We introduce the operator $Q(t)$ by $\rho(t) = \mathrm{Tr}_E[\eta(t)] = Q(t)\rho_S(0)$, where in detail

$$Q(t)\rho_S(0) = \rho_S(0) - \frac{i}{\hbar}\int_0^t dt'\,\mathrm{Tr}_E\big[H_I(t'), \rho_E(0)\rho_S(0)\big] - \frac{1}{\hbar^2}\int_0^t dt'\int_0^{t'} dt''\,\mathrm{Tr}_E\big[H_I(t'), \big[H_I(t''), \rho_E(0)\rho_S(0)\big]\big].$$

Hence

$$\dot{Q}(t)\rho_S(0) = -\frac{1}{\hbar^2}\int_0^t dt'\,\mathrm{Tr}_E\big[H_I(t), \big[H_I(t'), \rho_E(0)\rho_S(0)\big]\big],$$

where we have used Eq. (9.14) to set $-(i/\hbar)\,\mathrm{Tr}_E\big[H_I(t), \rho_E(0)\rho_S(0)\big] = 0$ in the above equation. We note that $\dot{Q}(t)$ is of second order. Moreover, $d\rho(t)/dt = \dot{Q}(t)\rho_S(0) = \big(\dot{Q}(t)Q^{-1}(t)\big)\rho(t) \simeq \dot{Q}(t)\rho(t)$, by replacing $Q^{-1}(t)$ by the identity in the operator $\dot{Q}(t)Q^{-1}(t)$ to second order. Accordingly, the following approximate differential equation is often taken in the literature for the (reduced, i.e. traced) density operator $\rho(t)$ in applications in the presence of the environment:

$$\dot{\rho}(t) = -\frac{1}{\hbar^2}\int_0^t dt'\,\mathrm{Tr}_E\big[H_I(t), \big[H_I(t'), \rho_E(0)\rho(t)\big]\big].$$

The analysis involved in solving this equation, leading to the solution given in Eq. (9.18), is tedious; for details of the underlying analysis see Sect. 12.7 of Manoukian [3]. See also Walls and Milburn [8].

A weak coupling approximation of the meter to the environment is described in Box 9.4. With the condition $\gamma t/2 \ll 1$ in (9.12) satisfied, it gives rise to the following interesting expression for the density operator of interest $\rho(t)$ in Eq. (9.16):

$$\rho(t) = \left[\,|c_+|^2\begin{pmatrix}1&0\\0&0\end{pmatrix}\big|\alpha e^{-\int_0^t I}\big\rangle\big\langle\alpha e^{-\int_0^t I}\big| + |c_-|^2\begin{pmatrix}0&0\\0&1\end{pmatrix}\big|{-\alpha} e^{-\int_0^t I}\big\rangle\big\langle{-\alpha} e^{-\int_0^t I}\big|\,\right] + \left[\,c_+c_-^{*}\begin{pmatrix}0&1\\0&0\end{pmatrix}\big|\alpha e^{-\int_0^t I}\big\rangle\big\langle{-\alpha} e^{-\int_0^t I}\big| + c_+^{*}c_-\begin{pmatrix}0&0\\1&0\end{pmatrix}\big|{-\alpha} e^{-\int_0^t I}\big\rangle\big\langle\alpha e^{-\int_0^t I}\big|\,\right] e^{-2|\alpha_0|^2\gamma t}, \tag{9.18}$$

where

$$I(t) = \frac{1}{\hbar^2}\int_0^{\infty} d\omega'\,|\lambda(\omega')|^2\,n(\omega')\int_0^t dt'\,e^{i(\omega-\omega')(t-t')}. \tag{9.19}$$

The expressions within the second square brackets in (9.18) give rise to "cat-alive/cat-dead" interference terms. The overall exponential factor $\exp[-2|\alpha_0|^2\gamma t]$ multiplying these expressions gives rise to a very welcome damping of them, relative to the first two terms in (9.18), for

$$|\alpha_0|^2 = \frac{1}{2}\,\frac{|A|^2 m\omega}{\hbar} \gg \frac{1}{2\gamma t}, \tag{9.20}$$

where $\alpha_0$ takes on a macroscopic value, is given in Box 9.2 (the expression for $z$) with $|\alpha_0|^2 \equiv |z|^2$, and is arbitrarily large for a classical amplitude of oscillations $|A|$, thus washing away the non-diagonal interference terms relative to the diagonal ones in Eq. (9.18). This establishes quantum decoherence setting in on a decoherence time scale $\sim 1/(\gamma|\alpha_0|^2)$.
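To get a feeling for how strongly the factor $\exp[-2|\alpha_0|^2\gamma t]$ in (9.18) suppresses the interference terms, the sketch below evaluates it for a microscopic and a macroscopic value of $|\alpha_0|^2$ at the same $\gamma t$. The numerical values of `gamma`, `t` and `alpha0_sq` are hypothetical and chosen only so that $\gamma t/2 \ll 1$, as required by (9.12).

```python
import math

def coherence_factor(alpha0_sq, gamma, t):
    """Damping factor exp(-2 |alpha0|^2 gamma t) of the off-diagonal terms in (9.18)."""
    return math.exp(-2.0 * alpha0_sq * gamma * t)

def decoherence_time(alpha0_sq, gamma):
    """Order-of-magnitude decoherence time scale 1 / (gamma |alpha0|^2)."""
    return 1.0 / (gamma * alpha0_sq)

gamma = 1.0      # hypothetical damping rate, in 1/s
t = 1.0e-3       # hypothetical elapsed time, in s, so that gamma*t/2 << 1

micro = coherence_factor(1.0, gamma, t)      # |alpha0|^2 of order one
macro = coherence_factor(1.0e6, gamma, t)    # "classical" amplitude
print(micro, macro, decoherence_time(1.0e6, gamma))
```

For $|\alpha_0|^2$ of order one the interference terms are essentially untouched at this time, whereas for the large value they are already negligible, in line with the decoherence time scale $\sim 1/(\gamma|\alpha_0|^2)$.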

References
1. Brune, M., et al. (1992). Manipulation of photons in a cavity by dispersive atom-field coupling. Quantum-nondemolition measurements and generation of "Schrödinger cat states". Physical Review A, 45, 5193–5214.
2. Joos, E., et al. (2003). Decoherence and the appearance of a classical world in quantum theory. Berlin: Springer.
3. Manoukian, E. B. (2006). Quantum theory: A wide spectrum. Dordrecht: Springer.
4. Monroe, C., et al. (1996). A "Schrödinger cat" superposition state of an atom. Science, 272(5265), 1131–1136.
5. Polkinghorne, J. C. (1985). The quantum world. New Jersey: Princeton University Press.
6. Schlosshauer, M. A. (2010). Decoherence and the quantum-to-classical transition. Berlin: Springer.
7. Schrödinger, E. (1935). The present situation in quantum mechanics. Naturwissenschaften, 23, 807–812. (English translation: Trimmer, J. D. (1980) in Proceedings of the American Philosophical Society, 124, 323–338.)
8. Walls, D. F., & Milburn, G. J. (1985). Effect of dissipation on quantum coherence. Physical Review A, 31, 2403–2408.
9. Zeh, H. D. (1970). On the interpretation of measurement in quantum theory. Foundations of Physics, 1, 69–76.
10. Zurek, W. H. (1991). Decoherence and the transition from quantum to classical. Physics Today, 44, 36–44.

Quantum Teleportation

10

In its simplest form, quantum teleportation is the transfer of the quantum state of a particle to another, distant particle without traversing the distance between them. The method relies on fundamental and mysterious aspects of quantum theory such as entanglement, introduced in the previous chapter in studying the Schrödinger's cat problem, and on the basic fact that a quantum system may be in a superposition of different states. The word teleportation was coined by the writer Charles Fort in 1931.¹ To realize such a transfer of a quantum state of a particle, call it particle 1, to another distant particle, call it particle 3, we will need an additional particle, call it particle 2. Suppose a person, called Grace, prepares particle 1 of spin 1/2 in a state

$$\begin{pmatrix}\alpha\\ \beta\end{pmatrix}_{1},\qquad |\alpha|^2 + |\beta|^2 = 1, \tag{10.1}$$

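and she wants another person, called Cary, at a distant location, to eventually have a particle (particle 3) in the above state $(\alpha\ \beta)^{\top}_3$. Initially Grace has particle 1 and Cary has no particles. Now suppose a third person, called Alfred, has two particles (particle 2 and particle 3), each of spin 1/2, and he prepares these two particles initially in the entangled state²

$$|\Psi^-\rangle_{23} = \frac{1}{\sqrt{2}}\left[\begin{pmatrix}1\\0\end{pmatrix}_{2}\begin{pmatrix}0\\1\end{pmatrix}_{3} - \begin{pmatrix}0\\1\end{pmatrix}_{2}\begin{pmatrix}1\\0\end{pmatrix}_{3}\right], \tag{10.2}$$

where recall that entanglement means that the above state cannot be written as the product of two states with one involving spin states of particle 2 and the other involving just spin states of particle 3.³ That is, our initial state involving particles 1, 2 and 3 is

$$\begin{pmatrix}\alpha\\ \beta\end{pmatrix}_{1}|\Psi^-\rangle_{23}. \tag{10.3}$$

Now Alfred sends particle 2 to Grace, and particle 3 to Cary. To find the final state of the three particles, we introduce the following four entangled states involving particles 1 and 2, which are now with Grace:

$$|\Psi^{\pm}\rangle_{12} = \frac{1}{\sqrt{2}}\left[\begin{pmatrix}1\\0\end{pmatrix}_{1}\begin{pmatrix}0\\1\end{pmatrix}_{2} \pm \begin{pmatrix}0\\1\end{pmatrix}_{1}\begin{pmatrix}1\\0\end{pmatrix}_{2}\right], \tag{10.4}$$

$$|\Phi^{\pm}\rangle_{12} = \frac{1}{\sqrt{2}}\left[\begin{pmatrix}0\\1\end{pmatrix}_{1}\begin{pmatrix}0\\1\end{pmatrix}_{2} \pm \begin{pmatrix}1\\0\end{pmatrix}_{1}\begin{pmatrix}1\\0\end{pmatrix}_{2}\right]. \tag{10.5}$$

¹ See Michell [4].
² In the previous chapter we have learnt how to prepare an entangled state.
³ That is, it cannot be written as $\big[a_1(1\ 0)^{\top} + a_2(0\ 1)^{\top}\big]_2\,\big[b_1(1\ 0)^{\top} + b_2(0\ 1)^{\top}\big]_3$, as is easily verified.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_10

Now it is straightforward, though it involves some algebra, to show from (10.3)–(10.5) that we may write

$$\begin{pmatrix}\alpha\\ \beta\end{pmatrix}_{1}|\Psi^-\rangle_{23} = \frac{1}{2}\left[-\begin{pmatrix}\alpha\\ \beta\end{pmatrix}_{3}|\Psi^-\rangle_{12} + \begin{pmatrix}-\alpha\\ \beta\end{pmatrix}_{3}|\Psi^+\rangle_{12} - \begin{pmatrix}\beta\\ \alpha\end{pmatrix}_{3}|\Phi^-\rangle_{12} + \begin{pmatrix}-\beta\\ \alpha\end{pmatrix}_{3}|\Phi^+\rangle_{12}\right], \tag{10.6}$$

where the left-hand side represents the initial state in (10.3). Now all Grace has to do is to put particles 1 and 2 in the state $|\Psi^-\rangle_{12}$.⁴ Then QM says that Cary's particle will be in the state $(\alpha\ \beta)^{\top}_3$. If it is unknown in which of the four states $|\Psi^{\pm}\rangle_{12}$, $|\Phi^{\pm}\rangle_{12}$ Grace's particles 1 and 2 have been projected, then there is a 25% probability that Cary's particle would be found in the state $(\alpha\ \beta)^{\top}_3$. Most interestingly, if Grace's measurement yields the entangled states $|\Psi^+\rangle_{12}$ OR $|\Phi^-\rangle_{12}$ OR $|\Phi^+\rangle_{12}$, instead of the state $|\Psi^-\rangle_{12}$, then all Grace has to do is to call Cary, e.g., by phone, and ask him to apply a magnetic field to his particle in the directions z OR x OR y, respectively, for specific periods of time to generate the state $(\alpha\ \beta)^{\top}_3$ in question, by applying, in the process, the unitary operator in Eq. (9.4) of the last chapter (Chap. 9):

$$\exp\!\Big[-\frac{it}{\hbar}\,(-\boldsymbol{\mu}\cdot\mathbf{B})\Big],\qquad \text{where } \boldsymbol{\mu}\propto\boldsymbol{\sigma},\ \mathbf{B} = B\,\mathbf{n}, \tag{10.7}$$

and use, respectively,

$$\text{e.g., } \exp\!\Big[-i\,\frac{\pi}{2}\,\sigma_j\Big] = -i\,\sigma_j,\qquad j = 3 \text{ or } 1 \text{ or } 2, \tag{10.8}$$

and

$$\sigma_3\begin{pmatrix}-\alpha\\ \beta\end{pmatrix}_{3} = -\begin{pmatrix}\alpha\\ \beta\end{pmatrix}_{3},\qquad \sigma_1\begin{pmatrix}\beta\\ \alpha\end{pmatrix}_{3} = \begin{pmatrix}\alpha\\ \beta\end{pmatrix}_{3},\qquad \sigma_2\begin{pmatrix}-\beta\\ \alpha\end{pmatrix}_{3} = -i\begin{pmatrix}\alpha\\ \beta\end{pmatrix}_{3}, \tag{10.9}$$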

all leading to the desired state $(\alpha\ \beta)^{\top}_3$, where the $\sigma_j$ are the Pauli matrices, thus completing the process of teleportation to the state in question. The pioneering work on teleportation was carried out by C. H. Bennett et al.⁵ in 1993, who proposed the teleportation of a quantum state. Several interesting experiments on teleportation have since been carried out.⁶ In particular, the experiment by J.-G. Ren et al. [7] involved the teleportation of the polarization of a photon to another one over a distance of up to 1400 km, in comparison to earlier investigations over relatively shorter distances. The teleportation of energy has even been proposed.⁷ As we have seen, teleportation uses and relies on very basic facts of quantum physics, and the interest in this area of investigation will undoubtedly continue and will possibly lead to many surprising results in the future.
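The algebra behind the decomposition (10.6) and the corrections (10.9) is easy to get wrong by a sign, so a numerical check is worthwhile. The sketch below uses plain Python lists for the spinors; it assumes the ordering convention in which $(1\ 0)^{\top}$ is written first in $|\Psi^{\pm}\rangle$ and $(0\ 1)^{\top}(0\ 1)^{\top}$ is written first in $|\Phi^{\pm}\rangle$, and the sample values of `alpha` and `beta` and all helper names are illustrative.

```python
import math

UP, DOWN = [1, 0], [0, 1]      # spinors (1 0)^T and (0 1)^T
S = 1 / math.sqrt(2)

def kron(*vectors):
    """Tensor product of vectors given as lists (leftmost factor varies slowest)."""
    out = [1 + 0j]
    for v in vectors:
        out = [x * y for x in out for y in v]
    return out

def combine(terms):
    """Linear combination of equally long vectors; terms = [(coefficient, vector), ...]."""
    length = len(terms[0][1])
    return [sum(c * v[i] for c, v in terms) for i in range(length)]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def overlap(a, b):
    return sum(x.conjugate() * y for x, y in zip(a, b))

alpha, beta = 0.6, 0.8j        # sample normalized amplitudes (assumption)
target = [alpha, beta]

psi_minus_23 = combine([(S, kron(UP, DOWN)), (-S, kron(DOWN, UP))])
BELL_12 = {
    "Psi-": combine([(S, kron(UP, DOWN)), (-S, kron(DOWN, UP))]),
    "Psi+": combine([(S, kron(UP, DOWN)), (S, kron(DOWN, UP))]),
    "Phi-": combine([(S, kron(DOWN, DOWN)), (-S, kron(UP, UP))]),
    "Phi+": combine([(S, kron(DOWN, DOWN)), (S, kron(UP, UP))]),
}
# State of particle 3 multiplying each Bell state of particles 1, 2 in (10.6)
CARY = {
    "Psi-": [-alpha, -beta],
    "Psi+": [-alpha, beta],
    "Phi-": [-beta, -alpha],
    "Phi+": [-beta, alpha],
}

lhs = kron(target, psi_minus_23)                                   # state (10.3)
rhs = combine([(0.5, kron(BELL_12[k], CARY[k])) for k in BELL_12])  # right side of (10.6)
max_diff = max(abs(a - b) for a, b in zip(lhs, rhs))

PAULI = {1: [[0, 1], [1, 0]], 2: [[0, -1j], [1j, 0]], 3: [[1, 0], [0, -1]]}
CORRECTION = {"Psi+": 3, "Phi-": 1, "Phi+": 2}   # field along z, x, y, respectively
fidelity = {k: abs(overlap(target, matvec(PAULI[j], CARY[k]))) ** 2
            for k, j in CORRECTION.items()}
print(max_diff, fidelity)
```

The fidelities compare Cary's corrected state with $(\alpha\ \beta)^{\top}$ up to an overall phase, which is all that matters physically; the overall factors $-1$ and $-i$ in (10.8), (10.9) drop out of this comparison.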

References
1. Bennett, C. H., et al. (1993). Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels. Physical Review Letters, 70, 1895–1899.
2. Bouwmeester, D., et al. (1997). Experimental quantum teleportation. Nature, 390, 575–579.
3. Hotta, M. (2008). A protocol for quantum energy distribution. Physics Letters A, 372, 5671–5676.
4. Michell, J., & Richard, B. (2000). The rough guide to unexplained phenomena. London: Rough Guides Ltd.
5. Miranowicz, A., & Tamaki, K. (2002). An introduction to quantum teleportation. Mathematical Sciences (Suri-Kagaku), 473, 28–34.
6. Nielsen, M. A., et al. (1998). Complete quantum teleportation using nuclear magnetic resonance. Nature, 396, 52–55.
7. Ren, J.-G., et al. (2017). Ground-to-satellite quantum teleportation. Nature, 549, 70–73.
8. Valivarthi, R., et al. (2016). Quantum teleportation across a metropolitan fibre network. Nature Photonics, 10, 676–680.
9. Wei, Y. (2016). How to teleport a particle rather than a state. Physical Review E, 93, 066103.
10. Yin, J., et al. (2012). Quantum teleportation and entanglement over 100-kilometer free space channels. Nature, 488, 185–188.

⁴ We have seen in the previous chapter how entangled states may be generated.
⁵ Bennett et al. [1].
⁶ See, e.g., Bouwmeester et al. [2]; Nielsen et al. [6]; Miranowicz and Tamaki [5]; Yin et al. [10]; Valivarthi et al. [8]; Ren et al. [7].
⁷ Hotta [3]. See also: Wei [9].

Lorentz Frames and Minkowski Spacetime Prerequisites Chaps. 1 and 4

11

In order to describe, in particular, the dynamics of particles at sufficiently high energies,¹ the need arises to describe physics in the so-called relativistic regime. At such energies, as imposed by Nature, an exchange takes place between energy and matter which allows, in turn, the creation of an unlimited number of particles, so that the number of particles in a given process is not necessarily conserved, and particles may travel at speeds comparable to the speed of light. This description is carried out by the introduction of the concept of Minkowski spacetime of special relativity which, in turn, avoids instantaneous action at a distance between spatially separated particles and objects. This is the subject matter of the present chapter and the following three chapters (Chaps. 12–14). In Chap. 14, we will learn, notably, that Maxwell's equations of electrodynamics, for example, satisfy the criteria of relativistic physics.² A frame in which a particle initially at rest remains at rest, and a particle initially in uniform motion continues in uniform motion, unless acted upon by an external force, is referred to as an inertial frame of reference (IFR), or inertial frame (IF) for short. A frame in uniform motion w.r.t. an IF is also an IF. Thus IFs should all be equivalent in describing physical laws. This is referred to as the principle of relativity. Taken as a physical law, the speed of light in vacuum is the same in all IFs, independent of the speed of its source and independent of its direction of propagation.³ The principle of relativity and the universality of the speed of light in IFs constitute the postulates of the theory of special relativity (SR) introduced by Albert Einstein in 1905.⁴ If the speeds of light determined in two different IFs were different, this would allow one to distinguish between the two IFs, thus violating the principle of relativity.
The modern definition is that the speed of light is an absolute constant $c = 2.99792458\times 10^{8}$ m/s, with the meter defined as the distance traveled by light in $10^{-8}/2.99792458$ s. Spacetime consists of all possible events. An event E, such as an explosion, occurs in spacetime independently of any coordinate system that one may set up. In a given coordinate system, associated with a given inertial frame, an event E is labeled by coordinates $(t, \mathbf{r})$ stating when and where the event has occurred in the given coordinate system. Different inertial frames give different labelings for the same event in spacetime. Taking Maxwell's equations, with electromagnetic waves propagating at a constant speed equal to $c$, as the basis for a consistent development of the SR theory, the correct relationship between such different labelings in these different coordinate systems is provided by the so-called Lorentz transformations (LT), to be derived below, rather than by their Galilean counterparts, which simply fail to give rise to the same form of the familiar Maxwell's equations with merely relabeled variables (see Chap. 14). We will see that the Galilean transformations are recovered from the LTs in the limit as $c$ is taken to infinity. An IF is also referred to as a Lorentz frame (LF) in which, by definition, the SR theory holds. A vector $V$ is defined as an ordered set of two events $(E_1, E_2)$, which, again, may be defined independently of any coordinate system and as such is referred to as a geometric object. The different realizations of such vectors in different LFs are again related by LTs. This allows us to readily satisfy the first postulate of SR in constructing physical laws which have the same form in every LF, involving simply the mere relabeling of the underlying variables. This will be clear in our investigation of Maxwell's equations of electrodynamics in Chap. 14. The set of all possible such vectors, as just described, generates a vector space. The set of all events together with the generated vector space is referred to as the Minkowski spacetime, which is equipped with a scalar product $UV$ for any given two vectors $U$ and $V$. This scalar product will be spelled out subsequently as it emerges from the theory.

¹ This will be developed through Chaps. 15–48.
² Even if you are familiar with special relativity, you will find these four chapters devoted to its description in this book quite helpful.
³ For relevant experiments, see Waddoups, Edwards and Merrill [18]; Babcock and Bergman [2]. See also Beckmann and Mandics [3]. For a test of the independence of the speed of the light source, in the microscopic domain of elementary particles, see Alväger et al. [1]. For a macroscopic test at cosmological distances in binary star systems, see Brecher [4]. See also, e.g., Ragulsky [16], for the constancy of the speed of light in all directions. For the independence of the speed of light of its frequency, see, e.g., Schaefer [17].
⁴ Einstein [7]. See also Einstein [8], Chapter III: On the Electrodynamics of Moving Bodies, pp. 35–65. For an excellent historical account of the theoretical development which eventually led to the SR theory, beginning in the 1880s, see Pauli [15]. Some nodding familiarity with special relativity is expected in this book. An excellent introductory source, which also describes early experiments, is French [9].

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_11

With the above brief introduction, let us search for the underlying LT of SR. Consider two inertial frames $F$ and $F'$ such that the latter moves with a velocity $\mathbf{v}$ relative to the first frame. Suppose a particle travels from one spacetime point to another. The coordinate labels for the particle in the corresponding two frames, for infinitesimal changes, may be written as

$$F:\ (t, \mathbf{r}),\ (t + dt, \mathbf{r} + d\mathbf{r}),\qquad F':\ (t', \mathbf{r}'),\ (t' + dt', \mathbf{r}' + d\mathbf{r}'). \tag{11.1}$$

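We first consider the boundary condition such that the space coordinate axes, and the origins, associated with the two frames coincide for $\mathbf{v} = 0$. The corresponding LT are then referred to as special LT. We assume that a functional relationship $\mathbf{r}' = \mathbf{r}'(\mathbf{r}, t)$, $t' = t'(\mathbf{r}, t)$ exists. The chain rule of calculus gives

$$d\mathbf{r}' = \sum_{j=1}^{3}\frac{\partial\mathbf{r}'}{\partial x^j}\,dx^j + \frac{\partial\mathbf{r}'}{\partial t}\,dt,\qquad dt' = \sum_{j=1}^{3}\frac{\partial t'}{\partial x^j}\,dx^j + \frac{\partial t'}{\partial t}\,dt,\qquad \mathbf{r} = (x^1, x^2, x^3). \tag{11.2}$$

Quite generally, we may write

$$d\mathbf{r}' = A\,d\mathbf{r} + \big(B\,dt + C\,\mathbf{v}\cdot d\mathbf{r}\big)\mathbf{v},\qquad dt' = D\,dt + E\,\mathbf{v}\cdot d\mathbf{r}, \tag{11.3}$$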

where $\mathbf{r}$, $\mathbf{v}$, $d\mathbf{r}$ are the only available 3-vectors at hand in $F$, with $d\mathbf{r}$ and $dt$ the only infinitesimal quantities. The assumed independence of the outcome of an experiment of "when and where" it occurs in spacetime is taken as a property of spacetime referred to as the homogeneity of spacetime. This means, in turn, that the coefficients $A$ to $E$ must be independent of $\mathbf{r}$ and $t$. The directional dependence on $d\mathbf{r}$ in (11.3) has already been taken into account, and the assumed isotropy of spacetime means that the coefficients in question cannot depend on a particular direction of the velocity vector $\mathbf{v}$.⁵ That is, these coefficients may, in general, depend only on the speed and may be expressed as functions of $\mathbf{v}^2$. The speed of light, of course, is considered as a fixed number. The inverse transformation to (11.3) may be simply written by replacing $\mathbf{v}$ by $-\mathbf{v}$ and using the same coefficients $A$ to $E$, since they are invariant under the replacement of $\mathbf{v}$ by $-\mathbf{v}$. That is,

$$d\mathbf{r} = A\,d\mathbf{r}' - \big(B\,dt' - C\,\mathbf{v}\cdot d\mathbf{r}'\big)\mathbf{v},\qquad dt = D\,dt' - E\,\mathbf{v}\cdot d\mathbf{r}'. \tag{11.4}$$

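Upon considering first the components of (11.3), (11.4) perpendicular to $\mathbf{v}$, we obtain

$$d\mathbf{r}'_{\perp} = A(\mathbf{v}^2)\,d\mathbf{r}_{\perp},\qquad d\mathbf{r}_{\perp} = A(\mathbf{v}^2)\,d\mathbf{r}'_{\perp}, \tag{11.5}$$

from which $A^2(\mathbf{v}^2) = 1$, or $A(\mathbf{v}^2) = +1$ using the boundary condition that for $\mathbf{v} = 0$, $A(0) = +1$. Thus we may write

$$d\mathbf{r}' = d\mathbf{r} + \big(B\,dt + C\,\mathbf{v}\cdot d\mathbf{r}\big)\mathbf{v},\qquad dt' = D\,dt + E\,\mathbf{v}\cdot d\mathbf{r}, \tag{11.6}$$

$$d\mathbf{r} = d\mathbf{r}' - \big(B\,dt' - C\,\mathbf{v}\cdot d\mathbf{r}'\big)\mathbf{v},\qquad dt = D\,dt' - E\,\mathbf{v}\cdot d\mathbf{r}'. \tag{11.7}$$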

These equations must give us the infinitesimal changes in the space and time variables corresponding to particles as determined in the two frames. For a particle at rest in $F$, $d\mathbf{r} = 0$, and the two equations in (11.6) give $-\mathbf{v} = d\mathbf{r}'/dt' = (B/D)\,\mathbf{v}$ $\Rightarrow$ $D = -B$. On the other hand, for a particle at rest in $F'$, $d\mathbf{r}' = 0$, and the first equation in (11.6) gives

$$0 = \frac{d\mathbf{r}}{dt} + \Big(B + C\,\mathbf{v}\cdot\frac{d\mathbf{r}}{dt}\Big)\mathbf{v} = \mathbf{v} + \big(B + C\,\mathbf{v}^2\big)\mathbf{v}\ \Rightarrow\ C\,\mathbf{v}^2 = -(1 + B),$$

while in this latter case, the second equations in (11.7) and (11.6) together, in turn, give

$$dt = D\,dt' = D\,dt\,\big(D + E\,\mathbf{v}^2\big)\ \Rightarrow\ D\big(D + E\,\mathbf{v}^2\big) = 1\ \Rightarrow\ B\big(B - E\,\mathbf{v}^2\big) = 1.$$

⁵ For an experiment on the isotropy of space, see, e.g., Chupp et al. [5].

Therefore, we have, in particular,

$$d\mathbf{r}' = d\mathbf{r} + \Big(B\,dt - \frac{(1 + B)}{\mathbf{v}^2}\,\mathbf{v}\cdot d\mathbf{r}\Big)\mathbf{v},\qquad dt' = -B\,dt + \Big(\frac{B^2 - 1}{B\,\mathbf{v}^2}\Big)\mathbf{v}\cdot d\mathbf{r}. \tag{11.8}$$

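Two problems remain. We have to establish the equality of the speed of light as determined in both frames, by evaluating, in the process, the only unknown $B$, and show that the boundary condition at $\mathbf{v} = 0$ is satisfied. We have the explicit expression following from (11.8):

$$\frac{d\mathbf{r}'}{dt'} = \frac{\dfrac{d\mathbf{r}}{dt} + \Big(B - \dfrac{(1+B)}{\mathbf{v}^2}\,\mathbf{v}\cdot\dfrac{d\mathbf{r}}{dt}\Big)\mathbf{v}}{-B + \Big(\dfrac{B^2 - 1}{B\,\mathbf{v}^2}\Big)\mathbf{v}\cdot\dfrac{d\mathbf{r}}{dt}}. \tag{11.9}$$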

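Now consider light traveling in a direction perpendicular to $\mathbf{v}$ in $F$. This gives

$$c'_{\perp} = -\frac{c}{B},\quad c'_{\parallel} = -v\ \Rightarrow\ (c'_{\perp})^2 + (c'_{\parallel})^2 = \frac{c^2}{B^2} + v^2 \equiv c^2\ \Rightarrow\ B^2 = \frac{1}{1 - v^2/c^2},\qquad B = -\frac{1}{\sqrt{1 - v^2/c^2}},$$

with a minus sign so that for $\mathbf{v} \to 0$ the limit $(B + 1)/\mathbf{v}^2 \to -1/(2c^2)$ exists in (11.9), and $d\mathbf{r}' = d\mathbf{r}$ as both frames coincide. All told we have,

$$d\mathbf{r}' = d\mathbf{r} - \gamma\,\mathbf{v}\,dt + \frac{(\gamma - 1)}{\mathbf{v}^2}\,(\mathbf{v}\cdot d\mathbf{r})\,\mathbf{v}, \tag{11.10}$$

$$dt' = \gamma\Big(dt - \frac{\mathbf{v}\cdot d\mathbf{r}}{c^2}\Big),\qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}}, \tag{11.11}$$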

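where $\gamma$ is referred to as the Lorentz factor. If the origins of both coordinate systems coincide at $t = 0$, $t' = 0$, then the above equations are readily integrated to give

$$\mathbf{r}' = \mathbf{r} - \gamma\,\mathbf{v}\,t + \frac{(\gamma - 1)}{\mathbf{v}^2}\,(\mathbf{v}\cdot\mathbf{r})\,\mathbf{v},\qquad t' = \gamma\Big(t - \frac{\mathbf{v}\cdot\mathbf{r}}{c^2}\Big). \tag{11.12}$$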
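The transformation (11.12) is straightforward to implement, and doing so lets one check the invariance of $\mathbf{r}^2 - (ct)^2$, the inverse obtained by $\mathbf{v} \to -\mathbf{v}$, and the Galilean limit. The following is a minimal sketch; the function names and the sample event are illustrative, and units with $c = 1$ are assumed by default.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def boost(t, r, v, c=1.0):
    """Special Lorentz transformation (11.12): returns (t', r') for an event (t, r)."""
    v2 = dot(v, v)
    if v2 == 0.0:
        return t, tuple(r)
    gamma = 1.0 / math.sqrt(1.0 - v2 / c ** 2)
    vr = dot(v, r)
    r_prime = tuple(ri - gamma * vi * t + (gamma - 1.0) * vr * vi / v2
                    for ri, vi in zip(r, v))
    t_prime = gamma * (t - vr / c ** 2)
    return t_prime, r_prime

def interval(t, r, c=1.0):
    """The combination r^2 - (ct)^2 of an event relative to the origin."""
    return dot(r, r) - (c * t) ** 2

# Sample event and velocity in units with c = 1 (illustrative values)
t, r, v = 2.0, (1.0, 2.0, 3.0), (0.6, 0.0, 0.0)
t_p, r_p = boost(t, r, v)
print(t_p, r_p, interval(t, r), interval(t_p, r_p))

# Inverse transformation: replace v by -v
t_back, r_back = boost(t_p, r_p, tuple(-vi for vi in v))

# Galilean limit: everyday speed with c in m/s
C = 299_792_458.0
t_g, r_g = boost(2.0, (1.0, 2.0, 3.0), (30.0, 0.0, 0.0), c=C)
```

For the sample values, $\gamma = 1.25$, and both the primed and the unprimed labels give $\mathbf{r}^2 - (ct)^2 = 10$ up to rounding. In the last call, `t_g` and `r_g` differ from the Galilean result $t' = t$, $\mathbf{r}' = \mathbf{r} - \mathbf{v}t$ only far below measurable precision.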

Note that for $v^2/c^2 \to 0$, $\gamma \to 1$, and we recover the so-called Galilean transformations of Newtonian mechanics: $\mathbf{r}' = \mathbf{r} - \mathbf{v}t$, $t' = t$. An important invariance property is given below in the third equation, which follows from the first two equations above it:

$$(\mathbf{r}')^2 = (\mathbf{r})^2 + (\gamma^2 - 1)\frac{(\mathbf{v}\cdot\mathbf{r})^2}{\mathbf{v}^2} - 2\gamma^2\,\mathbf{v}\cdot\mathbf{r}\,t + \gamma^2\mathbf{v}^2t^2, \tag{11.13}$$

$$(ct')^2 = \gamma^2(ct)^2 - 2\gamma^2\,\mathbf{v}\cdot\mathbf{r}\,t + \gamma^2\frac{(\mathbf{v}\cdot\mathbf{r})^2}{c^2}, \tag{11.14}$$

$$(\mathbf{r}')^2 - (ct')^2 = (\mathbf{r})^2 - (ct)^2,\qquad \frac{\gamma^2 - 1}{\mathbf{v}^2} \equiv \frac{\gamma^2}{c^2}, \tag{11.15}$$

which, of course, has the fundamental built-in property of the constancy of the speed of light in different inertial frames.⁶ This invariance property generalizes invariance under a rotation in Euclidean space, for which $\mathbf{r}'^2 = \mathbf{r}^2$, $t' = t$.

⁶ Some textbooks start from this last invariant equality to discuss LT.

To make the invariance property of a given equation evident, we introduce the following standard notation for time and space variables, $\mathbf{r} = (x^1, x^2, x^3)$:

$$x^0 = c\,t,\qquad x^i \text{ for the space components, with Latin index } i = 1, 2, 3, \tag{11.16}$$

and due to the minus sign in the first equation in (11.15), one may introduce a diagonal matrix $[\eta_{\mu\nu}] = \mathrm{diag}[-1, 1, 1, 1]$, with the Greek indices $\mu, \nu$ running over $\{0, 1, 2, 3\}$, and rewrite (11.15) in the compact form:

$$\eta_{\mu\nu}\,x'^{\mu}x'^{\nu} = \eta_{\mu\nu}\,x^{\mu}x^{\nu}, \tag{11.17}$$

using the Einstein summation convention to sum over upper and lower identical indices. The inverse of the matrix $[\eta_{\mu\nu}]$ is given by $[\eta^{\nu\sigma}]$, which coincides with $[\eta_{\nu\sigma}]$, such that $\eta_{\mu\nu}\eta^{\nu\sigma} = \delta_{\mu}{}^{\sigma}$, $[\delta_{\mu}{}^{\sigma}] = \mathrm{diag}[1, 1, 1, 1]$. The matrix $[\eta_{\mu\nu}]$ is called the Minkowski metric and generalizes the Euclidean metric $[\delta_{ij}]$, with the Latin indices $i, j$ going over 1, 2, 3, with invariance under coordinate rotation given by $\delta_{ij}\,x'^{i}x'^{j} = \delta_{ij}\,x^{i}x^{j}$. We note the following key equations that follow upon setting $\mu = 0$ and $\mu = i$, in turn, in $\eta_{\mu\nu}x^{\nu}$, and then summing over the remaining index:

$$\eta_{0\nu}x^{\nu} = \eta_{00}x^{0} = -x^{0} \equiv x_0,\qquad \eta_{i\nu}x^{\nu} = \eta_{ij}x^{j} = \delta_{ij}x^{j} = +x^{i} \equiv x_i. \tag{11.18}$$

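That is, the metric may lower an upper index, changing an upper index to a lower index: $\eta_{\mu\nu}x^{\nu} = x_{\mu}$. These, in turn, imply that we may simply write $x^0 = -x_0$, and $x^i = x_i$ for the space (Latin) indices. The Lorentz invariance statement in (11.17) may then also be simply rewritten as $x'_{\mu}x'^{\mu} = x_{\sigma}x^{\sigma}$. On the other hand, one may raise a lower index via the inverse matrix: $\eta^{\mu\nu}x_{\nu} = x^{\mu}$. The (homogeneous) Lorentz transformations in (11.12) may now be combined in the elegant form given in Box 11.1.

Box 11.1 Homogeneous Lorentz Transformations

$$x'^{\mu} = \Lambda^{\mu}{}_{\nu}\,x^{\nu},\qquad \Lambda^{i}{}_{j} = \delta^{i}{}_{j} + (\gamma - 1)\frac{v^i v^j}{\mathbf{v}^2},\quad \Lambda^{0}{}_{0} = \gamma,\quad \Lambda^{0}{}_{i} = -\gamma\,v^i/c,\quad \Lambda^{i}{}_{0} = -\gamma\,v^i/c,$$

$$\gamma = (1 - v^2/c^2)^{-1/2},\quad v_i = v^i,\qquad \Lambda \equiv [\Lambda^{\mu}{}_{\nu}] = \begin{pmatrix}\Lambda^{0}{}_{0} & \Lambda^{0}{}_{1} & \Lambda^{0}{}_{2} & \Lambda^{0}{}_{3}\\ \Lambda^{1}{}_{0} & \Lambda^{1}{}_{1} & \Lambda^{1}{}_{2} & \Lambda^{1}{}_{3}\\ \Lambda^{2}{}_{0} & \Lambda^{2}{}_{1} & \Lambda^{2}{}_{2} & \Lambda^{2}{}_{3}\\ \Lambda^{3}{}_{0} & \Lambda^{3}{}_{1} & \Lambda^{3}{}_{2} & \Lambda^{3}{}_{3}\end{pmatrix},\qquad (\Lambda^{\top})^{\nu}{}_{\mu} = (\Lambda)^{\mu}{}_{\nu}.$$

$$[\eta_{\mu\nu}\,x'^{\mu}x'^{\nu}] = [x^{\sigma}\Lambda^{\mu}{}_{\sigma}\,\eta_{\mu\nu}\,\Lambda^{\nu}{}_{\lambda}x^{\lambda}] = x^{\top}(\Lambda^{\top}\eta\,\Lambda)\,x = x^{\top}\eta\,x,\qquad x^{\top} \equiv (x^0\ x^1\ x^2\ x^3).$$

$$\Rightarrow\ \Lambda^{\top}\eta\,\Lambda = \eta,\qquad \det[\eta_{\mu\nu}] = -1 \neq 0\ \Rightarrow\ (\det\Lambda)^2 = 1,\qquad \det\Lambda = +1 \text{ for special LT},\ \Lambda^{0}{}_{0} > 0.$$

One may introduce, in general, four-vectors $U$, $V$ in Minkowski spacetime as ordered sets of two events; as discussed earlier, these are entities which may be defined independently of any coordinate system, and they are referred to as geometric objects. In given coordinate systems, they may have components $(U^{\mu}, V^{\nu})$, $(U'^{\mu}, V'^{\nu})$, and, in turn, one defines their (invariant) scalar product by

$$UV \equiv U'^{\mu}\eta_{\mu\nu}V'^{\nu} = U^{\mu}\eta_{\mu\nu}V^{\nu},\qquad U'^{\mu} = \Lambda^{\mu}{}_{\sigma}U^{\sigma},\qquad V'^{\mu} = \Lambda^{\mu}{}_{\sigma}V^{\sigma}. \tag{11.19}$$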
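The statements collected in Box 11.1 and Eq. (11.19) can be verified numerically by building $\Lambda$ explicitly. The sketch below does this with plain nested lists; it assumes units with $c = 1$, so that the velocity enters as `beta` $= \mathbf{v}/c$, and the helper names and sample vectors are illustrative.

```python
import math

ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]

def lorentz_matrix(beta):
    """Matrix [Lambda^mu_nu] of Box 11.1 for beta = v/c (a 3-tuple with |beta| < 1)."""
    b2 = sum(b * b for b in beta)
    gamma = 1.0 / math.sqrt(1.0 - b2)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma
    for i in range(3):
        lam[0][i + 1] = lam[i + 1][0] = -gamma * beta[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            extra = (gamma - 1.0) * beta[i] * beta[j] / b2 if b2 > 0 else 0.0
            lam[i + 1][j + 1] = delta + extra
    return lam

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4 x 4)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * a * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j, a in enumerate(m[0]))

def minkowski_dot(u, v):
    """Scalar product U^mu eta_mu_nu V^nu = U.V - U^0 V^0."""
    return sum(ETA[m][m] * u[m] * v[m] for m in range(4))

lam = lorentz_matrix((0.3, -0.4, 0.5))
lhs = matmul(transpose(lam), matmul(ETA, lam))          # Lambda^T eta Lambda
residual = max(abs(lhs[i][j] - ETA[i][j]) for i in range(4) for j in range(4))

U, V = [2.0, 1.0, 0.5, -1.0], [1.0, 3.0, -2.0, 0.25]    # sample components
U_p = [sum(lam[m][s] * U[s] for s in range(4)) for m in range(4)]
V_p = [sum(lam[m][s] * V[s] for s in range(4)) for m in range(4)]
print(residual, det(lam), minkowski_dot(U, V), minkowski_dot(U_p, V_p))
```

Working with $\mathbf{v}/c$ avoids mixing numbers of very different magnitude; restoring $c$ only changes the off-diagonal entries $\Lambda^{0}{}_{i}$, $\Lambda^{i}{}_{0}$ through the factor $v^i/c$ already shown in Box 11.1.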

The invariance of the scalar product follows in the same manner as the invariance of the product $x^{\mu}\eta_{\mu\nu}x^{\nu}$. Note also the notations $\eta_{\mu\nu}U^{\nu} = U_{\mu}$, $\eta^{\mu\nu}U_{\nu} = U^{\mu}$.⁷

Some Properties of Four-Vectors in Minkowski Spacetime: We often use the notation

$$U = (U^0, \mathbf{U}),\qquad U^2 = \mathbf{U}^2 - (U^0)^2,\qquad UV \equiv U^{\mu}V_{\mu} = \mathbf{U}\cdot\mathbf{V} - U^0V^0. \tag{11.20}$$

A non-zero⁸ four-vector may satisfy one of the following three conditions:

$$U^2 \begin{cases} > 0, & \text{space-like 4-vector},\\ < 0, & \text{time-like 4-vector},\\ = 0, & \text{light-like 4-vector}.\end{cases} \tag{11.21}$$

⁷ The components $U_{\mu}$ and $U^{\mu}$ are often, especially in the old literature, referred to as covariant and contravariant components of a vector, respectively.
⁸ That is, at least one of its components is not zero.

Let $U$ and $V$ be two non-zero four-vectors. They are said to be orthogonal if $UV = 0$. The following properties of non-zero four-vectors should be noted:
(i) Clearly a light-like 4-vector is orthogonal to itself.
(ii) Given $U^2 < 0$, $V^2 \le 0$, then $UV \neq 0$. To see this, note that the two given conditions imply that $|\mathbf{U}| < |U^0|$, $|\mathbf{V}| \le |V^0|$. Also $|\mathbf{U}\cdot\mathbf{V}| \le |\mathbf{U}||\mathbf{V}|$. Hence $|\mathbf{U}\cdot\mathbf{V}| < |U^0V^0|$. That is, $\mathbf{U}\cdot\mathbf{V} \neq U^0V^0$.
(iii) Given $UV = 0$, $V^2 < 0$, then $U^2 > 0$. To see this, note that the two given conditions imply that $\mathbf{U}\cdot\mathbf{V} = U^0V^0$, $\mathbf{V}^2 < (V^0)^2$. If $|\mathbf{V}| = 0$, then $U^0 = 0$ since $V^0 \neq 0$, and since $U$ is not the zero 4-vector, this implies that $U^2 > 0$. The same conclusion holds whenever $U^0 = 0$. On the other hand, if $|\mathbf{V}| \neq 0$ and $U^0 \neq 0$, then $(U^0V^0)^2 \le |\mathbf{U}|^2|\mathbf{V}|^2$ and $|\mathbf{V}|^2 < (V^0)^2$, which imply that $(U^0)^2|\mathbf{V}|^2 < (U^0V^0)^2 \le |\mathbf{U}|^2|\mathbf{V}|^2$. That is, $(U^0)^2 < |\mathbf{U}|^2$.
(iv) Given $UV = 0$, $V^2 = 0$, then either $U^2 > 0$ or $U = \lambda V$, where $\lambda$ is some real number. To see this, note that the two given conditions imply that $\mathbf{U}\cdot\mathbf{V} = U^0V^0$, $|\mathbf{V}| = |V^0|$. That is, $|U^0V^0| \le |\mathbf{U}||\mathbf{V}|$ and hence $|U^0| \le |\mathbf{U}|$, or $U^2 \ge 0$. In particular, if $U^2 = 0$, then $|\mathbf{U}||\mathbf{V}||\cos\theta| = |U^0V^0| = |\mathbf{U}||\mathbf{V}|$, $|\cos\theta| = 1$. That is, $\mathbf{U} = \lambda\mathbf{V}$. The given conditions then imply that $U^0V^0 = \lambda|\mathbf{V}|^2 = \lambda(V^0)^2$. Since $V^0 \neq 0$, this means that $U^0 = \lambda V^0$ as well. That is, $U = \lambda V$.
(v) Given that $UV = 0$, $V^2 = 0$, $U^2 = 0$. Then $U = \lambda V$. This follows from the last part of the proof in (iv) above.
One may introduce tensors of second, third, ..., rank as geometric objects with components $F^{\mu\nu}$, $F^{\mu\nu\sigma}$, ..., in a given coordinate system, involving more than one index, which transform as the product of four-vectors. For example, one has $F'^{\mu\nu} = \Lambda^{\mu}{}_{\sigma}\Lambda^{\nu}{}_{\lambda}F^{\sigma\lambda}$, and so on. They are said to transform covariantly. On the other hand, a product such as $F^{\mu\nu}F_{\mu\nu}$, involving no free indices, as one is summing over both indices according to the Einstein summation convention, is an invariant. That is, it has the same value in every inertial frame.
Indices of tensors are lowered via the metric in the same way as those of vectors, while indices may be raised by the inverse $[\eta^{\mu\nu}]$ of the metric. Consider an object moving with a given uniform velocity in a given inertial frame $F$. The recorded time $t$ corresponding to this motion in $F$ is referred to as a coordinate time. On the other hand, the clock reading in the object's rest frame is referred to as proper time, denoted by $\tau$. By definition, the rest frame of the object means that it is not moving in it. As the object moves, say, from $\mathbf{r}_1$ to $\mathbf{r}_2$, from time $t_1$ to time $t_2$, as determined in frame $F$, the invariance property relating the frame $F$ and the object's rest frame gives

$$c^2(\tau_2 - \tau_1)^2 = c^2(t_2 - t_1)^2 - (\mathbf{r}_2 - \mathbf{r}_1)^2,\qquad (x'^{\mu}_2 - x'^{\mu}_1) = \Lambda^{\mu}{}_{\nu}\,(x^{\nu}_2 - x^{\nu}_1), \tag{11.22}$$

$x'^{\mu} = (c\tau, \mathbf{0})$, $x^{\mu} = (ct, \mathbf{r})$. That is,

$$(\tau_2 - \tau_1) = (t_2 - t_1)\sqrt{1 - \frac{(\mathbf{r}_2 - \mathbf{r}_1)^2}{c^2(t_2 - t_1)^2}} < (t_2 - t_1), \tag{11.23}$$

showing that the time elapsed in the rest frame of the object is shorter than in the coordinate frame $F$ in which the object is moving. This is referred to as time dilation. A well known classic experiment⁹ implies, in particular, that for the decay $\mu^- \to e^- + \nu_{\mu} + \tilde{\nu}_e$, there will be fewer decays with the moving muons than would be expected without time dilation. For $v(\mu^-) = 0.994\,c$ $\Rightarrow$ $t/\tau \approx \sqrt{84}$, where $t$ is a measure of the survival time of muons in flight and $\tau$ is a measure of the survival time of muons at rest, both before decay, consistent with the SR prediction.

⁹ See, e.g., Frisch and Smith [10]. For a general review paper on experimental tests of time dilation, see Gwinner [11].

The LT for space variables in (11.12) implies, with $\perp$ and $\parallel$ corresponding, respectively, to perpendicular and parallel to $\mathbf{v}$, that

$$\mathbf{r}'_2 - \mathbf{r}'_1 = \mathbf{r}_2 - \mathbf{r}_1 - \gamma\,\mathbf{v}(t_2 - t_1) + \frac{(\gamma - 1)}{\mathbf{v}^2}\,[\mathbf{v}\cdot(\mathbf{r}_2 - \mathbf{r}_1)]\,\mathbf{v}, \tag{11.24}$$

$$\mathbf{r}'_{\perp 2} - \mathbf{r}'_{\perp 1} = \mathbf{r}_{\perp 2} - \mathbf{r}_{\perp 1}, \tag{11.25}$$

showing that a change in readings occurs only along the velocity 3-vector $\mathbf{v}$. We recall that the rest frame of a moving object with velocity $\mathbf{v}$ in the frame $F$ is denoted by $F'$. Consider a stick, set along the x-axis, moving with velocity $v$ along the x-axis of frame $F$. When one makes a measurement of a moving stick, one has to locate the end points of the stick simultaneously¹⁰ in frame $F$, which means that we must set $t_2 = t_1 \equiv t$ in Eq. (11.24); otherwise the stick will move in the meantime. This gives simply $(x'_2 - x'_1) = \gamma\,(x_2 - x_1)$, where $x_2 - x_1 \equiv L$ denotes the length of the stick measured in frame $F$.

[Figure: a stick moving with velocity $v$ along the x-axis of frame $F$, with its end points $x_1$ and $x_2$ located at the same time $t$.]

If $L_0$ denotes the length of the stick in its rest frame, referred to as the proper length, i.e., $x'_2 - x'_1 \equiv L_0$, then Eq. (11.24), with $t_2 = t_1 \equiv t$, gives

$$L_0 = \frac{1}{\sqrt{1 - v^2/c^2}}\,L,\qquad \text{that is}\quad L = \sqrt{1 - v^2/c^2}\;L_0 < L_0, \tag{11.26}$$

showing a length contraction in the measurement of a stick in motion.11 The LT transformation of the time variable in (11.12) gives [t2 − t1 ] = γ

  v · (r2 − r1 ) [t2 − t1 ] − , c2

(11.27)

which implies that if two events occur simultaneously at different points $\mathbf{r}_2$ and $\mathbf{r}_1$ in F, that is $t_2 = t_1$, then they do not occur simultaneously in F′, and vice versa. One may define the interval between two events in spacetime with coordinate labels $(t_1, \mathbf{r}_1)$ and $(t_2, \mathbf{r}_2)$, in a given coordinate system, by

$$s_{12}^2 = (\mathbf{r}_1 - \mathbf{r}_2)^2 - c^2 (t_1 - t_2)^2 = -\,\tau_{12}^2, \tag{11.28}$$

and for a time-like interval (squared), that is, $s_{12}^2 < 0$, a distance may be defined between these two events in Minkowski spacetime by $d_{12} = \sqrt{-s_{12}^2}$, in a given coordinate system. A peculiar property of measuring distances in Minkowski spacetime is the so-called reversal of the triangular inequality one encounters in Euclidean space. This arises as follows. Consider three events with coordinate labels $E_1: (t_1, \mathbf{r}_1)$, $E_2: (t_2, \mathbf{r}_2)$, $E_3: (t_3, \mathbf{r}_3)$ which are causally related in pairs. That is, $s_{12}^2 \le 0$, $s_{13}^2 \le 0$, $s_{23}^2 \le 0$, and, say, $t_2 > t_3 > t_1$. Define $d_{ij} = \sqrt{-s_{ij}^2}$, where $s_{ij}^2 = (\mathbf{r}_i - \mathbf{r}_j)^2 - c^2 (t_i - t_j)^2$. Then

$$d_{12} \ge d_{13} + d_{23}.$$

[Figure: spacetime diagram with vertical axis ct, showing the events $(ct_1, \mathbf{r}_1)$, $(ct_3, \mathbf{r}_3)$ and $(ct_2, \mathbf{r}_2)$.]

This is established in Box 11.2. That is, unlike Euclidean space, the straight line between two events in Minkowski spacetime does not correspond to the shortest path but to the longest one. In particular, $\mathbf{r}_1 = \mathbf{r}_2$ corresponds to the interesting case of the so-called twin paradox, in which the twin who stays put at home is found to be older than her twin sister who travels, when they finally meet.¹²

¹⁰ This requires that the clocks in the frame F in which the stick is moving are synchronized. If the distance between two clocks is d and the time reading in one clock is t = 0, then when a light signal is sent from this clock and received at the second clock, the reading on the latter clock should be set equal to t = d/c, which is the time taken for the light signal to reach the second clock. At this time both clocks will read the time t = d/c and will be synchronized.

¹¹ For an interesting study on the search for the length contraction of a relativistically moving stick observed on a two-dimensional surface, as on a photograph, see Manoukian and Sukkhasena [14].

Box 11.2 The Reversal of the Triangular Inequality in Minkowski Spacetime

Let $T_1 = c(t_3 - t_1)$, $T_2 = c(t_2 - t_3)$, $\mathbf{R}_1 = \mathbf{r}_3 - \mathbf{r}_1$, $\mathbf{R}_2 = \mathbf{r}_2 - \mathbf{r}_3$. Then $T_1 \ge |\mathbf{R}_1| \equiv R_1$, $T_2 \ge |\mathbf{R}_2| \equiv R_2$. Moreover, $T_1 T_2 \ge R_1 R_2$. We may explicitly write:

$$d_{12}^2 = d_{13}^2 + d_{23}^2 + 2\big(T_1 T_2 - \mathbf{R}_1\cdot\mathbf{R}_2\big) = (d_{13} + d_{23})^2 + 2\big(T_1 T_2 - \mathbf{R}_1\cdot\mathbf{R}_2 - d_{13}\,d_{23}\big).$$

Moreover, $\mathbf{R}_1\cdot\mathbf{R}_2 \le R_1 R_2$ and $(T_1 R_2)^2 + (T_2 R_1)^2 \ge 2\,T_1 T_2 R_1 R_2$. Since $T_1 T_2 - \mathbf{R}_1\cdot\mathbf{R}_2 \ge 0$, we have

$$\big(T_1 T_2 - \mathbf{R}_1\cdot\mathbf{R}_2\big)^2 \ge \big(T_1 T_2 - R_1 R_2\big)^2 \ge (T_1^2 - R_1^2)(T_2^2 - R_2^2) \equiv d_{13}^2\, d_{23}^2.$$

That is, $d_{12}^2 \ge (d_{13} + d_{23})^2$, or $d_{12} \ge d_{13} + d_{23}$.
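The inequality of Box 11.2 can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `interval_distance`, the choice of units with c = 1 and the three sample events are illustrative assumptions. The events are chosen with $\mathbf{r}_1 = \mathbf{r}_2$, i.e., in the twin-paradox configuration.

```python
import math

def interval_distance(e_a, e_b, c=1.0):
    """Return d = sqrt(-s^2) for two causally related events (t, x, y, z).

    s^2 = (r_a - r_b)^2 - c^2 (t_a - t_b)^2, as in Eq. (11.28).
    """
    dt = e_a[0] - e_b[0]
    dr2 = sum((a - b) ** 2 for a, b in zip(e_a[1:], e_b[1:]))
    s2 = dr2 - (c * dt) ** 2
    if s2 > 1e-12:
        raise ValueError("events are space-like separated")
    return math.sqrt(max(-s2, 0.0))

# Hypothetical events in units with c = 1, ordered as t2 > t3 > t1.
E1 = (0.0, 0.0, 0.0, 0.0)    # departure
E3 = (5.0, 3.0, 0.0, 0.0)    # turning point of the traveling twin
E2 = (10.0, 0.0, 0.0, 0.0)   # reunion, at the same place as E1

d12 = interval_distance(E1, E2)   # straight worldline (stay-at-home twin)
d13 = interval_distance(E1, E3)
d23 = interval_distance(E3, E2)
print(d12, d13 + d23)
```

In this example the straight path between E1 and E2 is the longer one, in line with the reversed inequality.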

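The LTs considered earlier in Box 11.1 give rise to the following equations for $t = 0$, $\mathbf{v} = 0$:

$$x'^{\,i} = \Lambda^{i}{}_{j}\, x^{j}, \quad \Lambda^{i}{}_{j} = \delta^{i}{}_{j}, \qquad ct' = \Lambda^{0}{}_{j}\, x^{j}, \quad \Lambda^{0}{}_{j} = 0, \tag{11.29}$$

and the space coordinates of both frames coincide, with the clocks, set at the corresponding origins, both reading zero. One may, quite generally, consider the case in which the space coordinate system of F′ is initially rotated relative to that of frame F. This will be worked out in detail in the next chapter.

[Figure: rotation of the vector $\mathbf{r}$ into $\mathbf{r}'$ by an angle ϑ about a unit vector $\mathbf{n}$ through the origin O.]

This leads to replacing $\Lambda^{i}{}_{j} = \delta^{i}{}_{j}$, for $\mathbf{v} = 0$, effectively by

$$R^{ij} = \delta^{ij} - \varepsilon^{ikj}\, n^{k} \sin\vartheta + (\delta^{ij} - n^{i} n^{j})(\cos\vartheta - 1), \tag{11.30}$$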

corresponding to a rotation of $\mathbf{r}$ by an angle ϑ about a unit vector $\mathbf{n}$, where $\varepsilon^{ijk}$ is totally anti-symmetric with $\varepsilon^{123} = +1$. One may also consider the case in which the space origins of the coordinate systems in question do not necessarily coincide for $t = 0$, $\mathbf{v} = 0$, with different clock readings at the corresponding origins. This leads to transformations, referred to as Poincaré transformations, which have the general structure

$$x'^{\mu} = \Lambda^{\mu}{}_{\nu}\, x^{\nu} - b^{\mu}, \tag{11.31}$$

where, for simplicity of notation, we have used the same symbol Λ. Under a subsequent transformation, we have

$$x''^{\lambda} = \Lambda'^{\lambda}{}_{\mu}\, x'^{\mu} - b'^{\lambda} = \Lambda'^{\lambda}{}_{\mu}\, \Lambda^{\mu}{}_{\nu}\, x^{\nu} - \big(\Lambda'^{\lambda}{}_{\mu}\, b^{\mu} + b'^{\lambda}\big). \tag{11.32}$$

Hence a Poincaré transformation may be specified by a pair (Λ, b), with Λ = ($\Lambda^{\mu}{}_{\nu}$) and b = ($b^{\mu}$), satisfying the following group properties, readily established directly from Eq. (11.32) and defining the Poincaré group:

1. Group multiplication: (Λ′, b′)(Λ, b) = (Λ′Λ, Λ′b + b′).
2. Identity (I, 0): (I, 0)(Λ, b) = (Λ, b).
3. Inverse (Λ, b)⁻¹ = (Λ⁻¹, −Λ⁻¹b): (Λ, b)⁻¹(Λ, b) = (I, 0).
4. Associative rule: (Λ₃, b₃)[(Λ₂, b₂)(Λ₁, b₁)] = [(Λ₃, b₃)(Λ₂, b₂)](Λ₁, b₁).
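The group properties of the pair (Λ, b) can be tried out on concrete numbers. The sketch below is illustrative only: all function names are hypothetical, the matrices are plain nested lists acting on (ct, x, y, z), and the inverse of Λ is computed from the assumed standard property Λ⁻¹ = ηΛᵀη of Lorentz matrices with η = diag(−1, 1, 1, 1).

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the Minkowski metric

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def matvec(a, x):
    return [sum(a[i][k] * x[k] for k in range(4)) for i in range(4)]

def boost_x(beta):
    """Boost along the x-axis with speed beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def lorentz_inverse(lam):
    """Lambda^-1 = eta Lambda^T eta (assumed property of Lorentz matrices)."""
    return [[ETA[r] * lam[c][r] * ETA[c] for c in range(4)] for r in range(4)]

def act(elem, x):
    """x' = Lambda x - b, as in Eq. (11.31)."""
    lam, b = elem
    return [xi - bi for xi, bi in zip(matvec(lam, x), b)]

def compose(second, first):
    """(L', b')(L, b) = (L'L, L'b + b')."""
    lam2, b2 = second
    lam1, b1 = first
    return (matmul(lam2, lam1),
            [p + q for p, q in zip(matvec(lam2, b1), b2)])

def inverse(elem):
    """(L, b)^-1 = (L^-1, -L^-1 b)."""
    lam, b = elem
    lam_inv = lorentz_inverse(lam)
    return (lam_inv, [-y for y in matvec(lam_inv, b)])

# Hypothetical sample transformations and event.
g1 = (boost_x(0.6), [1.0, 2.0, 0.0, 0.0])
g2 = (boost_x(-0.3), [0.5, -1.0, 3.0, 0.0])
x = [2.0, 1.0, -1.0, 4.0]

step_by_step = act(g2, act(g1, x))
composed = act(compose(g2, g1), x)
unit = compose(inverse(g1), g1)   # should be (I, 0) up to rounding
```

Applying the two transformations one after the other, or applying their product once, gives the same event coordinates, which is the content of the multiplication rule.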

By causality it is meant that there exists a cause-effect relationship between events, i.e., events have a definite order in time. Consider a frame F′, referred to as a subluminal frame, moving with speed $|\mathbf{v}| < c$ in a frame F. If $\mathbf{u} = d\mathbf{r}/dt$, with $|\mathbf{u}| \le c$, denotes the velocity of a given particle in F, then the LT of the time variable for $dt > 0$ gives

$$dt' = \gamma\, dt \left(1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right) \ge \gamma\, dt \left(1 - \frac{|\mathbf{v}|}{c}\right) > 0, \tag{11.33}$$

that is, $dt' > 0$ as well, and the order of events is preserved in all inertial subluminal frames. [It is easily verified that if $|\mathbf{u}| > c$, one may always find a subluminal frame F′, moving with a specific speed v < c, in which the order of the sequence of events is interchanged relative to that in frame F, thus violating this sacred principle of causality.] The LTs now allow us to construct covariant dynamical equations, that is, equations which have the same form in every IF, involving a mere relabeling of the underlying variables in the different IFs. One such description is provided by Maxwell's equations of electrodynamics described in Chap. 14. This is then extended to provide a covariant description of all of the present successful theories of particle physics in subsequent chapters. Finally, note that under a Lorentz transformation $x^{\mu} \to x'^{\mu}$, the Jacobian of the transformation is given by $|\det[\partial x'^{\mu}/\partial x^{\nu}]| \equiv |\det[\Lambda^{\mu}{}_{\nu}]| = 1$, with the latter property shown in Box 11.1. Hence the volume element

$$(dx) \equiv dx^0\, dx^1\, dx^2\, dx^3 \tag{11.34}$$

is Lorentz invariant.

¹² At first, this may sound inconsistent, but it is not: in the simplest description, two different inertial frames are involved for the traveling twin, while only one is involved for the twin who stays home, so that there is no symmetry between the two twins based on arguments of relative motion; see, e.g., Davies [6]. For an actual experiment involving clocks going around the world, see the classic paper by Hafele and Keating [12].

In later chapters, we will also see the unique role Lorentz frames and SR play in developing the general theory of relativity involving gravitation thanks to a principle referred to as the principle of equivalence.

References

1. Alväger, T., et al. (1964). Measuring the velocity of light emitted by fast sources, using accelerated particles. Physics Letters, 12, 260–262.
2. Babcock, G. C., & Bergman, T. G. (1964). Determination of the constancy of the speed of light. Journal of the Optical Society of America, 54, 147–151.
3. Beckmann, P., & Mandics, P. (1965). Test of the constancy of the velocity of electromagnetic radiation in high vacuum. Radio Science Journal of Research, 69D, 623–628.
4. Brecher, K. (1977). Is the speed of light independent of the source? Physical Review Letters, 39, 1051–1054. [Erratum: ibid., 1236.]
5. Chupp, T. E., et al. (1989). Results of a new test of local Lorentz invariance: A search for mass anisotropy in 21Ne. Physical Review Letters, 63, 1541–1545.
6. Davies, P. (1996). About time: Einstein's unfinished revolution (p. 59). New York: Simon & Schuster.
7. Einstein, A. (1905). Zur Elektrodynamik bewegter Körper. Annalen der Physik, 17, 891–921.
8. Einstein, A., Lorentz, H. A., Weyl, H., & Minkowski, H. (1952). The principle of relativity. New York: Dover Publications.
9. French, A. P. (1968). Special relativity. M.I.T. introductory physics.
10. Frisch, D. H., & Smith, J. H. (1963). Measurement of relativistic time dilation using μ-mesons. American Journal of Physics, 31, 342–355.
11. Gwinner, G. (2005). Experimental tests of time dilation in special relativity. Modern Physics Letters A, 20, 791–805.
12. Hafele, J. C., & Keating, R. E. (1972). Around-the-world atomic clocks: Observed relativistic time gains. Science, 177, 168–170.
13. Manoukian, E. B. (1993). On the reversal of the triangular inequality in Minkowski space in relativity. European Journal of Physics, 14, 43.
14. Manoukian, E. B., & Sukkhasena, S. (2002). Projection of relativistically moving objects on two-dimensional plane, the 'train' paradox and the visibility of the Lorentz contraction. European Journal of Physics, 23, 103–110.
15. Pauli, W. (1958). Theory of relativity. London: Pergamon Press.
16. Ragulsky, V. V. (1997). Determination of light velocity dependence on direction of propagation. Physics Letters A, 235, 125–128.
17. Schaefer, B. E. (1999). Severe limits on variations of the speed of light with frequency. Physical Review Letters, 82, 4964–4966.
18. Waddoups, R. O., Edwards, W. F., & Merrill, J. J. (1965). Experimental investigation of the second postulate of special relativity. Journal of the Optical Society of America, 55, 142–143.

12 The Celebrated Lorentz Group: Underlying Transformations Derived

Prerequisite: Chap. 11

We have seen that homogeneous Lorentz transformations leave the following scalar product¹

$$\eta_{\mu\nu}\, x^{\mu} x^{\nu} \tag{12.1}$$

invariant.

In the present chapter, we reverse the earlier approach and look for all real linear transformations which leave the expression in (12.1) invariant. In doing so, we will have to consider those involving space and/or time reflections and their products defined, respectively, by $(x^0, \mathbf{x}) \to (x^0, -\mathbf{x})$; $(x^0, \mathbf{x}) \to (-x^0, \mathbf{x})$; $(x^0, \mathbf{x}) \to (-x^0, -\mathbf{x})$. Physically, in an experiment measuring the speed of light, $\eta_{\mu\nu}\, x^{\mu} x^{\nu} = \mathbf{x}\cdot\mathbf{x} - c^2 t^2 = 0$. The invariance property in (12.1) means, in particular, that the speed of light is c in every inertial frame with corresponding coordinate labels $x'^{\mu}$, related to the coordinate system with coordinate labels $x^{\mu}$, for which² $\eta_{\sigma\rho}\, x'^{\sigma} x'^{\rho} = \eta_{\mu\nu}\, x^{\mu} x^{\nu}$. The underlying transformations provide the symmetry of space and time of particle physics. Even in the presence of gravitation, we will learn in a later chapter, thanks to the principle of equivalence, that at every point in spacetime one may set up an inertial frame in which non-gravitational laws take on their special relativistic forms and hence have this underlying symmetry already built into them. We will generalize these transformations further when dealing with supersymmetric physical theories in a later chapter. Let $\mathcal{L}$ denote the set {Ω} of all linear transformations defined by

$$\Omega^{\mu}{}_{\nu}\, x^{\nu} = x'^{\mu}, \tag{12.2}$$

satisfying the invariance property

$$\eta_{\mu\nu}\, x'^{\mu} x'^{\nu} = \eta_{\sigma\rho}\, x^{\sigma} x^{\rho} \quad \Rightarrow \quad \eta_{\mu\nu}\, \Omega^{\mu}{}_{\sigma}\, \Omega^{\nu}{}_{\rho} = \eta_{\sigma\rho}. \tag{12.3}$$

The latter equality, in turn, implies that

$$\begin{aligned}
\eta_{\mu\nu}\, \Omega^{\mu}{}_{\sigma}\, \Omega^{\nu}{}_{\rho} = \eta_{\sigma\rho} \quad &\Rightarrow \quad \big(\eta_{\mu\nu}\, \Omega^{\mu}{}_{\sigma}\, \Omega^{\nu}{}_{\rho}\big)\, \eta^{\lambda\sigma} = \eta^{\lambda\sigma}\eta_{\sigma\rho}, \\
\big(\Omega^{\mu}{}_{\sigma}\, \eta^{\sigma\lambda}\, \eta_{\mu\nu}\big)\, \Omega^{\nu}{}_{\rho} = \delta^{\lambda}{}_{\rho} \quad &\Rightarrow \quad (\Omega^{-1})^{\lambda}{}_{\nu} = \eta_{\nu\mu}\, \Omega^{\mu}{}_{\sigma}\, \eta^{\sigma\lambda}.
\end{aligned} \tag{12.4}$$

On the other hand, with μ and ν, in $\Omega^{\mu}{}_{\nu}$, labeling, respectively, the column and row of the matrix, then, as far as matrix multiplication is concerned, the matrix elements of the transpose matrix $\Omega^{\top}$ may be written as $(\Omega^{\top})^{\mu}{}_{\nu} = \Omega^{\nu}{}_{\mu}$. This allows us to rewrite the second equation in (12.3) in a convenient matrix multiplication form as follows:

$$\Omega^{\top}\, \eta\, \Omega = \eta. \tag{12.5}$$

Since $\det \eta = -1 \neq 0$, and due to the property $\det \Omega^{\top} = \det \Omega$ of the transpose of a matrix, the above equation implies that $(\det \Omega)^2 = 1$, i.e.,

$$\det \Omega = \pm 1, \tag{12.6}$$

¹ At one time the legendary Abdus Salam was giving some lectures on particles and fields, and he needed some aspects of the Lorentz group. When he found out that most of the students attending the lectures were not familiar with the Lorentz group, he discontinued his lectures and started lecturing on the Lorentz group instead.

² A lucid account of the Lorentz group may be found in Streater and Wightman [1]. Generators and representations of the group will be discussed in Chap. 15.

© Springer Nature Switzerland AG 2020, E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_12

referred to, respectively, as proper and improper transformations. We will consider both cases below. We first show that $\mathcal{L}$ forms a group. That is:

(i) If Ω₁, Ω₂ are in $\mathcal{L}$, then so is Ω₁Ω₂.
(ii) (Ω₁Ω₂)Ω₃ = Ω₁(Ω₂Ω₃), for Ω₁, Ω₂, Ω₃ in $\mathcal{L}$.
(iii) $\mathcal{L}$ contains the identity transformation I such that IΩ = ΩI = Ω, for Ω in $\mathcal{L}$.
(iv) If Ω is in $\mathcal{L}$, then its inverse Ω⁻¹ exists and is in $\mathcal{L}$, where Ω⁻¹Ω = ΩΩ⁻¹ = I.

To verify the above properties for Ω₁, Ω₂ in $\mathcal{L}$, recall first that $(\Omega_1\Omega_2)^{\top} = \Omega_2^{\top}\Omega_1^{\top}$.

• Multiplying $\Omega_1^{\top}\eta\,\Omega_1 = \eta$ from the left by $\Omega_2^{\top}$ and from the right by $\Omega_2$ gives $(\Omega_1\Omega_2)^{\top}\eta\,(\Omega_1\Omega_2) = \Omega_2^{\top}\eta\,\Omega_2$; but $\Omega_2^{\top}\eta\,\Omega_2 = \eta$, hence $(\Omega_1\Omega_2)^{\top}\eta\,(\Omega_1\Omega_2) = \eta$, thus establishing (i).
• Given that Ω₁, Ω₂, Ω₃ are in $\mathcal{L}$, then from (i) Ω₁Ω₂ and Ω₂Ω₃ are in $\mathcal{L}$, and hence (Ω₁Ω₂)Ω₃ and Ω₁(Ω₂Ω₃) are in $\mathcal{L}$ as well, and (ii) follows from the rules of matrix multiplication.
• (iii) is obvious since $I = [\delta^{\mu}{}_{\nu}]$ leaves Eq. (12.3) invariant and hence is in $\mathcal{L}$.
• From Eq. (12.6), $\det\Omega \neq 0$, i.e., Ω⁻¹ exists. Upon multiplying Eq. (12.5) from the left by $(\Omega^{-1})^{\top}$, and from the right by Ω⁻¹, we obtain $\eta = (\Omega^{-1})^{\top}\eta\,\Omega^{-1}$, that is, Ω⁻¹ is in $\mathcal{L}$ as well, thus establishing (iv).

Thus $\mathcal{L}$ is a group and is referred to as the Lorentz Group.³ The relation ΩΩ⁻¹ = I may be written in detail as $\Omega^{\mu}{}_{\nu}\,(\Omega^{-1})^{\nu}{}_{\sigma} = \delta^{\mu}{}_{\sigma}$. In particular, for μ = i, σ = 0, this gives, using (12.4),

$$\Omega^{i}{}_{0}\, \Omega^{0}{}_{0} - \sum_{j=1}^{3} \Omega^{i}{}_{j}\, \Omega^{0}{}_{j} = 0, \tag{12.7}$$

while for μ = 0 = σ, it gives, using (12.4),

$$(\Omega^{0}{}_{0})^2 = 1 + \sum_{j=1}^{3} (\Omega^{0}{}_{j})^2, \tag{12.8}$$

that is

$$\Omega^{0}{}_{0} \ge 1, \quad \text{or} \quad \Omega^{0}{}_{0} \le -1, \tag{12.9}$$

referred to, respectively, as orthochronous and non-orthochronous transformations. Thus $\mathcal{L}$ splits into 4 parts denoted by $\mathcal{L}_+^{\uparrow}$, $\mathcal{L}_+^{\downarrow}$, $\mathcal{L}_-^{\uparrow}$, $\mathcal{L}_-^{\downarrow}$, where

(1) $\mathcal{L}_+^{\uparrow}$ consists of the set of all transformations in $\mathcal{L}$ such that det Ω = +1, $\Omega^{0}{}_{0} \ge 1$, and includes the identity. It is readily verified that it forms a group. It is referred to as the restricted Lorentz group.
(2) $\mathcal{L}_+^{\downarrow}$ consists of the set of all transformations in $\mathcal{L}$ such that det Ω = +1, $\Omega^{0}{}_{0} \le -1$, and does not include the identity. Therefore it does not form a group. It will be readily seen below that it includes space-time reflections.
(3) $\mathcal{L}_-^{\uparrow}$ consists of the set of all transformations in $\mathcal{L}$ such that det Ω = −1, $\Omega^{0}{}_{0} \ge 1$. It does not include the identity and hence does not form a group as such. It includes space reflection.
(4) $\mathcal{L}_-^{\downarrow}$ consists of the set of all transformations in $\mathcal{L}$ such that det Ω = −1, $\Omega^{0}{}_{0} \le -1$, and does not include the identity. Therefore it does not form a group. It includes time reflection.

We first consider all the elements in $\mathcal{L}_+^{\uparrow}$. With $\Omega^{0}{}_{0} \ge 1$, we define a matrix M with matrix elements

$$M^{0}{}_{0} = \Omega^{0}{}_{0}, \quad M^{0}{}_{i} = -\,\Omega^{0}{}_{i}, \quad M^{i}{}_{0} = -\,\Omega^{0}{}_{i}, \quad M^{i}{}_{j} = \delta^{ij} + \frac{\Omega^{0}{}_{i}\, \Omega^{0}{}_{j}}{\Omega^{0}{}_{0} + 1}. \tag{12.10}$$

³ It is also referred to as the homogeneous Lorentz group, while the Poincaré group, which includes translations in space and time, is referred to as the inhomogeneous Lorentz group. The latter is discussed in the previous chapter.




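Obviously, M is in $\mathcal{L}_+^{\uparrow}$, with $M^{0}{}_{0} = \Omega^{0}{}_{0} \ge 1$; it coincides with a homogeneous Lorentz transformation given in the previous chapter, with det M = +1. Moreover, we define the matrix R with matrix elements

$$R^{\mu}{}_{\sigma} = \Omega^{\mu}{}_{\nu}\, M^{\nu}{}_{\sigma}, \tag{12.11}$$

which we evaluate case by case:

• μ = 0, σ = 0: $R^{0}{}_{0} = (\Omega^{0}{}_{0})^2 - \sum_{j=1}^{3} (\Omega^{0}{}_{j})^2 = 1$, from (12.8).
• μ = i, σ = 0: $R^{i}{}_{0} = \Omega^{i}{}_{0}\, \Omega^{0}{}_{0} - \sum_{j=1}^{3} \Omega^{i}{}_{j}\, \Omega^{0}{}_{j} = 0$, from (12.7).
• μ = 0, σ = j:
$$R^{0}{}_{j} = -\,\Omega^{0}{}_{0}\, \Omega^{0}{}_{j} + \sum_{i=1}^{3} \Omega^{0}{}_{i} \left(\delta_{ij} + \frac{\Omega^{0}{}_{i}\, \Omega^{0}{}_{j}}{\Omega^{0}{}_{0} + 1}\right) = \Omega^{0}{}_{j}\, \frac{-(\Omega^{0}{}_{0})^2 - \Omega^{0}{}_{0} + 1 + \Omega^{0}{}_{0} + \sum_{i=1}^{3} (\Omega^{0}{}_{i})^2}{\Omega^{0}{}_{0} + 1} = 0.$$
• μ = i, σ = j: $R^{i}{}_{j} \equiv a_{ij}$ are just some real numbers, with

$$[R^{\mu}{}_{\nu}] = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & & & \\ 0 & & a_{ij} & \\ 0 & & & \end{pmatrix}, \qquad a_{ij}\, a_{ik} = \delta_{jk}, \qquad \det[a_{ij}] = +1, \tag{12.12}$$

where in writing the last two equalities we have used the fact that $[R^{\mu}{}_{\nu}]$ is also in $\mathcal{L}_+^{\uparrow}$. Moreover,

$$a_{ij}\, a_{ik} = \delta_{jk} \quad \Rightarrow \quad (a^{-1})_{ji}\, a_{ik} = \delta_{jk}, \quad \text{i.e.,} \quad (a^{-1})_{ji} = a_{ij}. \tag{12.13}$$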

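To find the explicit structure of the matrix R, we proceed as follows. First note that

$$[(a^{-1})_{ji}] = \frac{[\mathrm{cofactor}(a_{ji})]^{\top}}{\det a} = [\mathrm{cofactor}(a_{ji})]^{\top}. \tag{12.14}$$

In particular,

$$a_{11} = \det\begin{pmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{pmatrix}, \quad a_{22} = \det\begin{pmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{pmatrix}, \quad a_{33} = \det\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \tag{12.15}$$

which give

$$a_{11} = a_{22}a_{33} - a_{23}a_{32} \quad \Rightarrow \quad a_{23}a_{32} = a_{22}a_{33} - a_{11}, \tag{12.16}$$

$$a_{22} = a_{11}a_{33} - a_{13}a_{31} \quad \Rightarrow \quad a_{13}a_{31} = a_{11}a_{33} - a_{22}, \tag{12.17}$$

$$a_{33} = a_{11}a_{22} - a_{12}a_{21} \quad \Rightarrow \quad a_{12}a_{21} = a_{11}a_{22} - a_{33}. \tag{12.18}$$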

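Define

$$b_1 = \tfrac{1}{2}\big(a_{23} - a_{32}\big), \quad b_2 = \tfrac{1}{2}\big(a_{31} - a_{13}\big), \quad b_3 = \tfrac{1}{2}\big(a_{12} - a_{21}\big), \tag{12.19}$$

and note that $a_{ij}\, a_{ik} = \delta_{jk}$ implies that

$$\sum_{i,j=1}^{3} (a_{ij})^2 = 3. \tag{12.20}$$

Moreover, we define

$$\rho = \tfrac{1}{2}\big(a_{11} + a_{22} + a_{33} - 1\big), \qquad \mathbf{b} = (b_1, b_2, b_3), \tag{12.21}$$

and use, in the process, Eqs. (12.16)–(12.20), to obtain the basic equality

$$\rho^2 + \mathbf{b}^2 = 1 \quad \Rightarrow \quad \rho = \cos\vartheta, \quad \mathbf{b} = \mathbf{n}\sin\vartheta, \quad \mathbf{n}^2 = 1. \tag{12.22}$$

The matrix elements $a_{ij}$ may be written as a symmetric part plus an anti-symmetric part as follows:

$$a_{ij} = \tfrac{1}{2}\big(a_{ij} + a_{ji}\big) + \tfrac{1}{2}\big(a_{ij} - a_{ji}\big), \quad \text{implying the general structure} \tag{12.23}$$

$$a_{ij} \equiv \big[A(\vartheta)\, \delta_{ij} + B(\vartheta)\, n_i n_j\big] - \big[\varepsilon_{ikj}\, n_k \sin\vartheta\big], \quad \text{see (12.19) and (12.22) for the } b_i, \tag{12.24}$$

normalized such that $a_{ij}\big|_{\vartheta=0} = \delta_{ij}$, i.e., A(0) = 1, B(0) = 0. From the definition of ρ, it follows that

$$\tfrac{1}{2}\big(3A(\vartheta) + B(\vartheta) - 1\big) = \cos\vartheta \quad \Rightarrow \quad B(\vartheta) = 1 + 2\cos\vartheta - 3A(\vartheta). \tag{12.25}$$

For $n_1 = 0 = n_2$, $n_3 \neq 0$ (so that $n_3^2 = 1$),

$$a_{11} = A = a_{22}, \quad a_{23} = 0, \quad a_{33} = A + (1 + 2\cos\vartheta - 3A), \quad a_{11} = a_{22}a_{33} - a_{23}a_{32}, \tag{12.26}$$

$$\Rightarrow \quad A = A\,\big(1 + 2\cos\vartheta - 2A\big) \quad \Rightarrow \quad A(\vartheta) = \cos\vartheta, \quad B(\vartheta) = 1 - \cos\vartheta. \tag{12.27}$$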

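The same results follow for the other choices of $\mathbf{n}$ with a single non-vanishing component, and for ($n_i \neq 0$, i = 1, 2, 3). Therefore any Ω in $\mathcal{L}_+^{\uparrow}$ may be written as Ω = R M⁻¹, where in detail

$$\Omega^{\mu}{}_{\nu} = R^{\mu}{}_{\rho}\, (M^{-1})^{\rho}{}_{\nu}, \qquad (M^{-1})^{\rho}{}_{\nu} = \eta_{\nu\alpha}\, M^{\alpha}{}_{\beta}\, \eta^{\beta\rho} \quad \big(\text{see Eq. (12.4)}\big), \tag{12.28}$$

$$R^{0}{}_{0} = 1, \quad R^{0}{}_{i} = 0 = R^{i}{}_{0}, \tag{12.29}$$

$$R^{i}{}_{j} = \delta^{ij}\cos\vartheta + (1 - \cos\vartheta)\, n^i n^j - \varepsilon^{ikj}\, n^k \sin\vartheta, \tag{12.30}$$

[Figure: rotation of $\mathbf{r} = (x^1, x^2, x^3)$ into $\mathbf{r}' = (x'^1, x'^2, x'^3)$ by an angle ϑ about the unit vector $\mathbf{n}$ through the origin O, with $x'^{i} = R^{i}{}_{j}\, x^{j}$.]

$$(M^{-1})^{0}{}_{0} = \Omega^{0}{}_{0}, \quad (M^{-1})^{0}{}_{i} = \Omega^{0}{}_{i}, \quad (M^{-1})^{i}{}_{0} = \Omega^{0}{}_{i}, \quad (M^{-1})^{i}{}_{j} = \delta^{ij} + \frac{\Omega^{0}{}_{i}\, \Omega^{0}{}_{j}}{\Omega^{0}{}_{0} + 1}. \tag{12.31}$$

With $\Omega^{0}{}_{0} \ge 1$, and c denoting the speed of light, we may set

$$\mathbf{v} = -\,\frac{c}{\Omega^{0}{}_{0}}\, \big(\Omega^{0}{}_{1}, \Omega^{0}{}_{2}, \Omega^{0}{}_{3}\big) \quad \Rightarrow \quad \mathbf{v}^2 = \Big(\frac{c}{\Omega^{0}{}_{0}}\Big)^2 \sum_{j=1}^{3} (\Omega^{0}{}_{j})^2, \quad \text{and from (12.8),} \tag{12.32}$$

$$v = \frac{c}{\Omega^{0}{}_{0}} \sqrt{(\Omega^{0}{}_{0})^2 - 1}, \qquad \Omega^{0}{}_{0} = \frac{1}{\sqrt{1 - v^2/c^2}} \equiv \gamma, \qquad \frac{1}{\Omega^{0}{}_{0} + 1} = \frac{c^2 (\gamma - 1)}{v^2 \gamma^2}. \tag{12.33}$$

Thus we may write

$$\Omega^{0}{}_{j} = -\,\gamma\, \frac{v^j}{c}, \qquad \frac{\Omega^{0}{}_{i}\, \Omega^{0}{}_{j}}{\Omega^{0}{}_{0} + 1} = (\gamma - 1)\, \frac{v^i v^j}{v^2}. \tag{12.34}$$

Box 12.1 Summary: The Lorentz Group $\mathcal{L}$

$\eta_{\mu\nu}\, x'^{\mu} x'^{\nu} = \eta_{\sigma\rho}\, x^{\sigma} x^{\rho}$, $\quad x'^{\mu} = \Omega^{\mu}{}_{\nu}\, x^{\nu}$, $\quad$ Ω element in $\mathcal{L} = \mathcal{L}_+^{\uparrow} \cup \mathcal{L}_+^{\downarrow} \cup \mathcal{L}_-^{\uparrow} \cup \mathcal{L}_-^{\downarrow}$.

• $\mathcal{L}_+^{\uparrow}$ forms a group. For Ω in $\mathcal{L}_+^{\uparrow}$, det Ω = +1, and:

$\Omega^{i}{}_{j} = R^{ij} + (\gamma - 1)\, R^{ik}\, \dfrac{v^k v^j}{v^2}$, $\quad \Omega^{0}{}_{0} \equiv \gamma = 1/\sqrt{1 - v^2/c^2}$, $\quad \Omega^{0}{}_{i} = -\gamma\, v^i/c$,

$\Omega^{i}{}_{0} = -\gamma\, R^{ij} v^j/c$, $\quad R^{ij} = \delta^{ij} - \varepsilon^{ikj}\, n^k \sin\vartheta + (\delta^{ij} - n^i n^j)(\cos\vartheta - 1)$.

• For $\tilde{\Omega}$ in $\mathcal{L}_+^{\downarrow}$: $\tilde{\Omega} = I_{st}\, \Omega$, $I_{st} = \mathrm{diag}[-1, -1, -1, -1]$, $\det\tilde{\Omega} = +1$.
• For $\tilde{\Omega}$ in $\mathcal{L}_-^{\uparrow}$: $\tilde{\Omega} = I_{s}\, \Omega$, $I_{s} = \mathrm{diag}[+1, -1, -1, -1]$, $\det\tilde{\Omega} = -1$.
• For $\tilde{\Omega}$ in $\mathcal{L}_-^{\downarrow}$: $\tilde{\Omega} = I_{t}\, \Omega$, $I_{t} = \mathrm{diag}[-1, +1, +1, +1]$, $\det\tilde{\Omega} = -1$.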
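The explicit matrix elements summarized in Box 12.1 can be verified numerically against the defining invariance condition (12.3). The following is a minimal sketch under stated assumptions: the function names, the index convention `omega[mu][nu]` for $\Omega^{\mu}{}_{\nu}$, the units with c = 1 and the sample values of $\mathbf{v}$, $\mathbf{n}$ and ϑ are all illustrative choices, not taken from the text.

```python
import math

def levi_civita(i, j, k):
    """Totally anti-symmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

def rotation(n, theta):
    """R_ij = d_ij - eps_ikj n_k sin(theta) + (d_ij - n_i n_j)(cos(theta) - 1)."""
    s, c = math.sin(theta), math.cos(theta)
    rot = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            d = 1.0 if i == j else 0.0
            eps_n = sum(levi_civita(i, k, j) * n[k] for k in range(3))
            rot[i][j] = d - eps_n * s + (d - n[i] * n[j]) * (c - 1.0)
    return rot

def lorentz_matrix(v, n, theta, c=1.0):
    """Omega of Box 12.1 for a non-zero velocity v and a rotation (n, theta)."""
    v2 = sum(x * x for x in v)
    g = 1.0 / math.sqrt(1.0 - v2 / c ** 2)
    rot = rotation(n, theta)
    rv = [sum(rot[i][k] * v[k] for k in range(3)) for i in range(3)]
    om = [[0.0] * 4 for _ in range(4)]
    om[0][0] = g
    for i in range(3):
        om[0][i + 1] = -g * v[i] / c
        om[i + 1][0] = -g * rv[i] / c
        for j in range(3):
            om[i + 1][j + 1] = rot[i][j] + (g - 1.0) * rv[i] * v[j] / v2
    return om

def invariance_defect(om):
    """Largest entry of |Omega^T eta Omega - eta|, cf. Eq. (12.5)."""
    eta = [-1.0, 1.0, 1.0, 1.0]
    worst = 0.0
    for s in range(4):
        for r in range(4):
            val = sum(eta[m] * om[m][s] * om[m][r] for m in range(4))
            target = eta[s] if s == r else 0.0
            worst = max(worst, abs(val - target))
    return worst

# Hypothetical parameters: v in units of c, unit vector n, angle in radians.
omega = lorentz_matrix(v=(0.3, -0.2, 0.5), n=(0.0, 0.6, 0.8), theta=0.7)
defect = invariance_defect(omega)
print(defect, omega[0][0])
```

The defect should vanish up to rounding errors, and $\Omega^{0}{}_{0} = \gamma \ge 1$ confirms that the transformation is orthochronous.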


The above analysis brings us back into contact with our earlier chapter (Chap. 11) dealing with Lorentz transformations, but now including rotations of coordinate systems, as well as space and time reflections. It is not enough just to state the properties of the Lorentz group; the derivation of the underlying explicit expressions of the transformations, as done above, is essential. Under a Poincaré transformation, including a rotation, we then have the transformation (see below Eq. (11.31) in Chap. 11, as well as (12.4) above with $(\Omega^{-1})^{\mu\nu} = \Omega^{\nu\mu}$)

$$x'^{\mu} = \Omega^{\mu}{}_{\nu}\, x^{\nu} - b^{\mu}, \quad (\Omega', b')(\Omega, b) = (\Omega'\Omega,\ \Omega' b + b'), \quad (\Omega, b)^{-1} = (\Omega^{-1},\ -\Omega^{-1} b), \tag{12.35}$$

and for infinitesimal transformations with parameters δϑ, $\delta b^{\mu}$, $\delta\beta^{i}$, $\beta^{i} = v^{i}/c$, we have from lines three and four in Box 12.1,

$$\Omega^{\mu}{}_{\nu} = \delta^{\mu}{}_{\nu} + \delta\omega^{\mu}{}_{\nu}, \quad \delta\omega^{\mu\nu} = -\,\delta\omega^{\nu\mu}, \quad \delta\omega^{i}{}_{j} = \varepsilon^{ijk}\, n^{k}\, \delta\vartheta, \quad \delta\omega^{i}{}_{0} = -\,\delta\beta^{i}, \quad \delta\omega^{0}{}_{i} = -\,\delta\beta^{i}, \quad \delta\omega^{0}{}_{0} = 0. \tag{12.36}$$

Reference

1. Streater, R. F., & Wightman, A. S. (2000). PCT, spin and statistics, and all that. Princeton: Princeton University Press.

13 Physics in Minkowski Spacetime: Applications

Prerequisite: Chap. 11

In the present chapter, we consider some pertinent applications of the Lorentz transformations in Chap. 11. You are probably familiar with some of these applications from your undergraduate studies, but nevertheless, I advise you to go carefully through their descriptions. Upon dividing the Lorentz transformation of the differential space variable $d\mathbf{r}'$ by the differential of the time variable $dt'$, we obtain from Eqs. (11.10) and (11.19) in Chap. 11 the expression

$$\frac{d\mathbf{r}'}{dt'} = \frac{d\mathbf{r}/dt - \gamma\, \mathbf{v} + \big[(\gamma - 1)/v^2\big]\big(\mathbf{v}\cdot d\mathbf{r}/dt\big)\, \mathbf{v}}{\gamma\, \big[1 - \mathbf{v}\cdot(d\mathbf{r}/dt)/c^2\big]}, \qquad v = |\mathbf{v}|. \tag{13.1}$$

But $\mathbf{u} = d\mathbf{r}/dt$ and $\mathbf{u}' = d\mathbf{r}'/dt'$ denote the velocities of a particle as determined, respectively, in frames F and F′. Accordingly, one has the following transformation law of the velocity of the particle in question:

$$\mathbf{u}' = \frac{\mathbf{u} - \gamma\, \mathbf{v} + (\gamma - 1)\, \mathbf{v}\, (\mathbf{v}\cdot\mathbf{u}/v^2)}{\gamma\, \big[1 - (\mathbf{v}\cdot\mathbf{u}/c^2)\big]}. \tag{13.2}$$
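The velocity transformation law (13.2) translates directly into a few lines of code. The following is a minimal sketch, with the function names, the units c = 1 and the sample velocities chosen here for illustration only. It checks two familiar consequences: the speed of light is unchanged, and collinear velocities combine according to the usual one-dimensional addition formula.

```python
import math

def transform_velocity(u, v, c=1.0):
    """Velocity u' seen in a frame moving with velocity v, as in Eq. (13.2)."""
    v2 = sum(x * x for x in v)
    if v2 == 0.0:
        return tuple(u)
    g = 1.0 / math.sqrt(1.0 - v2 / c ** 2)
    vu = sum(a * b for a, b in zip(v, u))
    denom = g * (1.0 - vu / c ** 2)
    return tuple((ui - g * vi + (g - 1.0) * vi * vu / v2) / denom
                 for ui, vi in zip(u, v))

def speed(w):
    return math.sqrt(sum(x * x for x in w))

# A light ray along y, observed from a frame moving along x with v = 0.6 c.
light = transform_velocity((0.0, 1.0, 0.0), (0.6, 0.0, 0.0))

# A particle with u = 0.5 c along +x, seen from a frame moving with 0.5 c along -x.
collinear = transform_velocity((0.5, 0.0, 0.0), (-0.5, 0.0, 0.0))

print(speed(light), collinear[0])
```

The direction of the light ray changes, but its speed remains c; the collinear case reproduces (0.5 + 0.5)/(1 + 0.25) = 0.8 in units of c.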

At this stage, one is tempted to define the momentum of a particle in relativity by multiplying the velocity $\mathbf{u}$ of a particle by its mass: $m\,\mathbf{u}$. For one thing, we see from the above equation that we cannot simply add such expressions to generate a total momentum which would be conserved in every inertial frame, because the time differential dt, in $\mathbf{u} = d\mathbf{r}/dt$, also has a transformation law under a LT. To understand how this expression is to be modified in relativity, consider a sequence of events: E(λ), a ≤ λ ≤ b, where λ is an invariant parameter, that is, it has the same value in every inertial frame. This sequence of events defines a curve in spacetime, and it may describe the motion of a particle in spacetime, referred to as the worldline of the particle. One may, in turn, introduce a displacement four vector ds, which describes infinitesimal changes of the particle's position in spacetime. In given coordinate systems with coordinate labels $x^{\mu}$, $x'^{\mu}$, it would be represented, respectively, by $dx^{\mu}(\lambda)$, $dx'^{\mu}(\lambda)$, as a tangent vector to a worldline describing infinitesimal changes of the particle with λ, as determined in the corresponding frames. The scalar product of ds with itself is then given by the familiar expression Eq. (11.15) of Chap. 11,

$$ds\, ds \equiv ds^2 = \eta_{\mu\nu}\, dx^{\mu} dx^{\nu} = (d\mathbf{r})^2 - c^2 dt^2 = \eta_{\mu\nu}\, dx'^{\mu} dx'^{\nu} = (d\mathbf{r}')^2 - c^2 dt'^2. \tag{13.3}$$

The infinitesimal $ds^2$ is referred to as the line element (squared) or just the line element. Since it is an invariant, we may evaluate it, in particular, in the rest frame of the particle, for which the line element will take the simple form $ds^2 \equiv -c^2 d\tau^2$, where τ is, of course, an invariant, measured in the rest frame of the particle, and is referred to as the proper time, already introduced in Chap. 11, above Eq. (11.22). We note that

$$d\tau = dt\, \sqrt{1 - \Big(\frac{d\mathbf{r}}{c\, dt}\Big)^2} = \frac{dt}{\gamma}, \tag{13.4}$$

where $d\mathbf{r}/dt = \mathbf{v}$ is the velocity of the particle, and t is referred to as coordinate time. One may take the parameter λ to coincide with τ, and define the four velocity vector ds/dτ ≡ U as a tangent to a particle's worldline with components $dx^{\mu}/d\tau$, $dx'^{\mu}/d\tau$, ..., in different coordinate systems. The components are related by a Lorentz transformation $U'^{\mu} = \Lambda^{\mu}{}_{\nu}\, U^{\nu}$, such that $\eta_{\mu\nu}\, U^{\mu} U^{\nu} = -c^2$ is an invariant. What is the significance of $U^0$? Or, more precisely, what is the significance of the more relevant object $m U^0$, where m is the mass of the particle? It is given explicitly by

$$m U^0 = c\, m\, \frac{dt}{d\tau} = m c\, \gamma = \frac{1}{c}\Big[m c^2 + \frac{1}{2}\, m \mathbf{v}^2 + \cdots\Big], \qquad \frac{d\mathbf{r}}{dt} = \mathbf{v}, \quad \gamma = \frac{1}{\sqrt{1 - \mathbf{v}^2/c^2}}, \tag{13.5}$$

© Springer Nature Switzerland AG 2020, E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_13

expressed in powers of v. We recognize $m\mathbf{v}^2/2$ as the non-relativistic kinetic energy of the particle, and hence $mc^2$ is interpreted as a rest energy, arising by putting v = 0, and $c\, m\, U^0 \equiv E$ as the total energy of the particle, which includes its rest energy $E_0 = mc^2$ (the famous Einstein equation). We may thus define the energy momentum, or more conveniently the four momentum, of a massive particle of mass m by

$$p^{\mu} = (p^0, \mathbf{p}), \quad p^0 = m U^0 = \frac{E}{c}, \quad E = \gamma\, m c^2, \quad (U^{\mu}) = \Big(\frac{dx^{\mu}}{d\tau}\Big) = (\gamma c, \gamma \mathbf{v}), \tag{13.6}$$

$$\mathbf{p} = \gamma\, m\, \mathbf{v}, \quad p^2 \equiv \eta_{\mu\nu}\, p^{\mu} p^{\nu} = m^2\, \eta_{\mu\nu}\, U^{\mu} U^{\nu} = -m^2 c^2 \quad \Rightarrow \quad p^2 = -\,m^2 c^2, \quad E = \sqrt{\mathbf{p}^2 c^2 + m^2 c^4}, \tag{13.7}$$

and the mass m of a particle is an invariant,¹ where we have used (13.4) and the last two equalities in (13.5) in writing the last equality in (13.6). What about the above expressions for a massless particle such as the photon? In this case one may use the Einstein equation for the photon energy, E = hν, where ν is its frequency and h is Planck's constant. We may then summarize the expressions for the four momentum of a massive particle to be given by

$$p = \Big(\frac{E}{c}, \mathbf{p}\Big), \quad E = \sqrt{\mathbf{p}^2 c^2 + m^2 c^4}, \quad p^{\mu} p_{\mu} = -m^2 c^2, \quad U = (\gamma c, \gamma \mathbf{v}), \tag{13.8}$$

and for a massless particle of associated frequency ν:

$$E = h\nu, \quad \mathbf{p} = \frac{h\nu}{c}\, \mathbf{n}, \quad |\mathbf{n}| = 1, \quad p^{\mu} p_{\mu} = 0. \tag{13.9}$$
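The invariants in (13.8) and (13.9) are easy to confirm numerically. The sketch below is illustrative only: the helper names, the units with c = 1 and h = 1, and the sample mass, velocity and frequency are assumptions made here, not values from the text.

```python
import math

def four_momentum(m, v, c=1.0):
    """p = (E/c, p) = (gamma m c, gamma m v) for a massive particle."""
    v2 = sum(x * x for x in v)
    g = 1.0 / math.sqrt(1.0 - v2 / c ** 2)
    return (g * m * c,) + tuple(g * m * x for x in v)

def photon_four_momentum(nu, n, h=1.0, c=1.0):
    """p = (h nu / c, (h nu / c) n) for a massless particle, |n| = 1."""
    return (h * nu / c,) + tuple(h * nu * x / c for x in n)

def minkowski_square(p):
    """p^mu p_mu with signature (-, +, +, +)."""
    return sum(x * x for x in p[1:]) - p[0] ** 2

# Hypothetical massive particle: m = 2, v = 0.6 c along x (gamma = 1.25).
p_massive = four_momentum(2.0, (0.6, 0.0, 0.0))
# Hypothetical photon of frequency 3 traveling along z.
p_photon = photon_four_momentum(3.0, (0.0, 0.0, 1.0))

print(minkowski_square(p_massive), minkowski_square(p_photon))
```

With these numbers the massive particle gives $p^{\mu}p_{\mu} = -m^2c^2 = -4$, while the photon gives zero, independently of the chosen velocity or direction.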

In a scattering process of particles of various four momenta, $p_1 + p_2 + \cdots + p_n \to p'_1 + p'_2 + \cdots + p'_m$, energy-momentum conservation reads $\sum_{i=1}^{n} p_i - \sum_{i=1}^{m} p'_i = 0$ (see also Eq. (13.34)). Since a vanishing four vector is zero in every inertial frame, conservation of energy-momentum, defined together as a covariant equation, holds rigorously in every Lorentz frame. We now carry out some applications of the above findings, including those given in Boxes 13.1–13.4, some of which are elementary and are chosen for the convenience of readers of different backgrounds. Box 13.1 deals with the aberration of light, giving rise to an apparent position of a star in a moving frame, while Boxes 13.2–13.4 deal, respectively, with the Compton effect, the Doppler effect and particle decay.

Box 13.1 Aberration of Light

Frame F′ moves with a velocity $\mathbf{v} = (0, 0, v)$ along the z-axis of the coordinate system corresponding to the frame F. We have $\mathbf{c} = \mathbf{n}\, c$, $\mathbf{c}' = \mathbf{n}'\, c$, $\gamma = 1/\sqrt{1 - v^2/c^2}$, and

$$\mathbf{n}' = \frac{\mathbf{n} - \gamma\, \mathbf{v}/c + (\gamma - 1)\, \mathbf{n}_{\parallel}}{\gamma\, (1 - \mathbf{v}\cdot\mathbf{n}/c)} = \frac{\mathbf{n}_{\perp} + \gamma\, (\mathbf{n}_{\parallel} - \mathbf{v}/c)}{\gamma\, (1 - \mathbf{v}\cdot\mathbf{n}/c)}.$$

[Figure: direction of the light from a star reaching the observer at O, at an angle θ to the z-axis in F and at an angle θ′ (apparent position) in F′.]

With $\mathbf{n} = (-\sin\theta, 0, -\cos\theta)$ and $\mathbf{n}' = (-\sin\theta', 0, -\cos\theta')$,

$$\sin\theta' = \frac{\sin\theta}{\gamma\, (1 + v\cos\theta/c)}, \qquad \cos\theta' = \frac{\cos\theta + v/c}{1 + v\cos\theta/c}, \qquad \Rightarrow \quad \tan\theta' = \frac{\sin\theta}{\gamma\, (\cos\theta + v/c)}.$$

The aberration of light is detected from the Earth (frame F′) because of the change in the direction of its velocity during the year, and leads to an evident change in the apparent position of a star. In general, for a frame F′ for which the speed v → c, cos θ′ → 1, sin θ′ → 0, and the apparent position appears in the forward direction to the observer in the moving frame.
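The aberration formulas of Box 13.1 can be evaluated for a few speeds to see how the apparent position moves toward the forward direction. This is a minimal sketch: the function name and the sample values of β = v/c are illustrative assumptions (β ≈ 10⁻⁴ corresponds roughly to the orbital speed of the Earth, about 30 km/s).

```python
import math

def apparent_angle(theta, beta):
    """Angle theta' in the moving frame, from sin(theta') and cos(theta') of Box 13.1.

    theta is the angle in frame F (radians) and beta = v/c.
    """
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    sin_p = math.sin(theta) / (g * (1.0 + beta * math.cos(theta)))
    cos_p = (math.cos(theta) + beta) / (1.0 + beta * math.cos(theta))
    return math.atan2(sin_p, cos_p)

theta = math.pi / 2  # star seen at right angles to the motion in frame F
for beta in (1e-4, 0.5, 0.99):
    shift = math.degrees(theta - apparent_angle(theta, beta))
    print(f"beta = {beta}: apparent position shifted forward by {shift:.4f} degrees")
```

For θ = 90° the formulas reduce to cos θ′ = v/c, so the apparent angle shrinks from nearly 90° at small speeds toward 0° as v → c.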

¹ The old terminology of γm being the relativistic mass, and m the rest one, has essentially disappeared in the modern literature.

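Box 13.2 Compton Effect

$\gamma + e^{-}(\text{at rest}) \to \gamma + e^{-}$, $\quad k + p_e = k' + p'_e$, $\quad p_e = (m_e c, \mathbf{0})$,

$$k^2 = 0 = k'^2, \quad p'_e = k - k' + p_e, \quad -m_e^2 c^2 = -2\, k k' + 2\, p_e (k - k') - m_e^2 c^2,$$

$$\frac{(h\nu)(h\nu')}{c^2}\, (\cos\theta - 1) = -\, m_e\, h\, (\nu - \nu') \quad \Rightarrow \quad \frac{1}{\nu'} - \frac{1}{\nu} = \frac{h}{m_e c^2}\, \big(1 - \cos\theta\big),$$

where $h/m_e c$ is the Compton wavelength of the electron; in terms of the wavelengths λ = c/ν, the relation reads $\lambda' - \lambda = (h/m_e c)(1 - \cos\theta)$. The recoil kinetic energy $T_e$ is easily found from the energy conservation law

$$m_e c^2 + h\nu = T_e + m_e c^2 + h\nu', \quad T_e = h(\nu - \nu') \quad \Rightarrow \quad T_e = \frac{h\nu}{1 + \big[m_e c^2 / h\nu\, (1 - \cos\theta)\big]}.$$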
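The two results of Box 13.2 can be cross-checked against each other numerically: the recoil energy computed from the closed formula should equal h(ν − ν′). The sketch below is illustrative; the function names and the sample frequency and angle are assumptions, while the constants are the SI/CODATA 2018 values.

```python
import math

H = 6.62607015e-34       # Planck constant in J s (exact SI value)
M_E = 9.1093837015e-31   # electron mass in kg (CODATA 2018)
C = 299792458.0          # speed of light in m/s (exact SI value)

def scattered_frequency(nu, theta):
    """nu' from 1/nu' - 1/nu = (h / m_e c^2)(1 - cos(theta))."""
    return 1.0 / (1.0 / nu + H / (M_E * C ** 2) * (1.0 - math.cos(theta)))

def recoil_energy(nu, theta):
    """T_e = h nu / (1 + m_e c^2 / [h nu (1 - cos(theta))])."""
    x = 1.0 - math.cos(theta)
    if x == 0.0:
        return 0.0  # forward scattering: no energy transfer
    return H * nu / (1.0 + M_E * C ** 2 / (H * nu * x))

nu = 1.0e20               # hypothetical incoming photon frequency in Hz
theta = math.pi / 2       # hypothetical scattering angle
nu_prime = scattered_frequency(nu, theta)
t_e = recoil_energy(nu, theta)
print(nu_prime, t_e, H * (nu - nu_prime))
```

The last two printed numbers agree, which is simply the energy conservation statement $T_e = h(\nu - \nu')$ of the box.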

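In addition to these applications, including the ones in Boxes 13.3 and 13.4, we also consider two-body scattering processes, as well as successive Lorentz transformations and the so-called Thomas precession. Finally, we will elaborate on the structure of the energy-momentum tensor of matter.

Two-Particle Scattering: $p_1 + p_2 \to p'_1 + p'_2 + \cdots + p'_{n-1} + p'_n$.

[Figure: two incoming particles of four momenta $p_1$, $p_2$ scattering into outgoing particles of four momenta $p'_1$, $p'_2$, ..., $p'_{n-1}$, $p'_n$.]

• In the lab frame: $p_2 = (m_2 c, \mathbf{0})_{\rm LAB}$, $p_1 = (E_1^{\rm LAB}/c, \mathbf{p}_1^{\rm LAB})_{\rm LAB}$,

$$p_1 p_2\Big|_{\rm LAB} = -\,m_2\, E_1^{\rm LAB} = -\,m_2 \sqrt{(\mathbf{p}_1^{\rm LAB})^2 c^2 + m_1^2 c^4} \quad \Rightarrow \quad m_2\, c\, |\mathbf{p}_1^{\rm LAB}| = \sqrt{(p_1 p_2)^2 - m_1^2 m_2^2 c^4}. \tag{13.10}$$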

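Box 13.3 The Doppler Effect

[Figure: an object of mass M and four momentum p emits a photon of four momentum k′ at an angle θ to its velocity, leaving an object of mass M′ and four momentum p′.]

$$p = p' + k', \quad p - k' = p', \quad p^2 - 2\, p k' + k'^2 = p'^2, \quad -M^2 c^2 - 2\, p k' = -M'^2 c^2,$$

$$p = (\gamma M c, \gamma M \mathbf{v}), \quad k' = (h\nu'/c,\ h\nu'\, \mathbf{n}/c), \quad p k' = \mathbf{p}\cdot\mathbf{k}' - p^0 k'^0 = \gamma M h\nu'\, (v\cos\theta/c - 1),$$

$$h\nu' = \frac{\big[(M^2 - M'^2)/2M\big]\, c^2}{\gamma\, \big[1 - (v/c)\cos\theta\big]}. \quad \text{For } v = 0,\ h\nu' \to h\nu, \text{ therefore} \quad h\nu' = \frac{h\nu}{\gamma\, \big[1 - (v/c)\cos\theta\big]}.$$

The momenta p, p′, k′ and the frequency ν′ are determined by an observer, while ν is the frequency of light emitted in the object's frame. As the source approaches an observer from the left, 0 < θ < π/2, ν′ > ν, and the light reaching the observer is blue shifted. As the source moves away from an observer to the right, π/2 < θ < π, ν′ < ν, and the light reaching the observer is red shifted.

[Figure: a source approaching the observer (0 < θ < π/2) and receding from the observer (π/2 < θ < π).]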
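The Doppler formula of Box 13.3 can be evaluated for the head-on, receding and transverse cases. This is a minimal sketch with an illustrative function name and sample value β = v/c = 0.6 (for which γ = 1.25); neither is taken from the text.

```python
import math

def observed_frequency(nu, beta, theta):
    """nu' = nu / (gamma [1 - beta cos(theta)]), with beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return nu / (g * (1.0 - beta * math.cos(theta)))

nu = 1.0      # emitted frequency in the source frame (arbitrary units)
beta = 0.6
approaching = observed_frequency(nu, beta, 0.0)          # blue shifted
receding = observed_frequency(nu, beta, math.pi)         # red shifted
transverse = observed_frequency(nu, beta, math.pi / 2)   # pure time dilation
print(approaching, receding, transverse)
```

For θ = 0 and θ = π the formula reduces to the longitudinal factors $\sqrt{(1+\beta)/(1-\beta)}$ and $\sqrt{(1-\beta)/(1+\beta)}$, while at θ = π/2 only the factor 1/γ remains.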


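Box 13.4 Particle Decay

$p \to p_1 + p_2$: $\quad p = (Mc, \mathbf{0})$, $\quad p_1 = (E_1/c, \mathbf{p}_1)$, $\quad p_2 = (E_2/c, -\mathbf{p}_1)$.

[Figure: a particle of four momentum p at rest decaying into two particles of four momenta $p_1$ and $p_2$ emitted back to back.]

$$p_2 = p - p_1, \quad p_2^2 = p^2 + p_1^2 - 2\, p p_1 \quad \Rightarrow \quad -m_2^2 c^2 = -(M^2 + m_1^2)\, c^2 + 2 M E_1 \quad \Rightarrow$$

$$E_1 = \frac{\big(M^2 - [m_2^2 - m_1^2]\big)\, c^2}{2M}, \qquad E_2 = \frac{\big(M^2 + [m_2^2 - m_1^2]\big)\, c^2}{2M}, \qquad \frac{E_1 E_2}{c^4} = \frac{M^4 - [m_2^2 - m_1^2]^2}{4 M^2},$$

$$p^2 = p_1^2 + p_2^2 + 2\, p_1 p_2 \quad \to \quad -M^2 c^2 = -(m_1^2 + m_2^2)\, c^2 - 2\, \mathbf{p}_1^2 - 2\, \frac{E_1 E_2}{c^2},$$

$$\Rightarrow \quad |\mathbf{p}_1| = \frac{c}{2M} \sqrt{\big[M^2 - (m_2 - m_1)^2\big]\big[M^2 - (m_2 + m_1)^2\big]}.$$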
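The decay kinematics of Box 13.4 can be checked on a familiar case. The sketch below is illustrative only: the function name is hypothetical, units with c = 1 and masses in MeV are assumed, and the charged pion and muon masses are approximate published values used merely as sample input, with the neutrino treated as massless.

```python
import math

def two_body_decay(M, m1, m2, c=1.0):
    """Energies and momentum of the decay products in the rest frame of M."""
    if M < m1 + m2:
        raise ValueError("decay is kinematically forbidden")
    e1 = (M ** 2 - (m2 ** 2 - m1 ** 2)) * c ** 2 / (2.0 * M)
    e2 = (M ** 2 + (m2 ** 2 - m1 ** 2)) * c ** 2 / (2.0 * M)
    p = c / (2.0 * M) * math.sqrt((M ** 2 - (m2 - m1) ** 2)
                                  * (M ** 2 - (m2 + m1) ** 2))
    return e1, e2, p

# Sample input: pi+ -> mu+ + nu, masses in MeV (approximate values).
M_PI, M_MU, M_NU = 139.57039, 105.6583755, 0.0
e_mu, e_nu, p_mu = two_body_decay(M_PI, M_MU, M_NU)
print(e_mu, e_nu, p_mu)
```

The energies add up to the rest energy of the parent, and each product satisfies $E^2 = \mathbf{p}^2c^2 + m^2c^4$; for these inputs the common momentum comes out close to 29.8 MeV/c.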

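• In the center of momentum frame: $p_1 = (E_1^{\rm CM}/c, \mathbf{p}_1^{\rm CM})_{\rm CM}$, $p_2 = (E_2^{\rm CM}/c, -\mathbf{p}_1^{\rm CM})_{\rm CM}$,

$$-(p_1 + p_2)^2\Big|_{\rm CM} = \frac{(E_{\rm TOT}^{\rm CM})^2}{c^2} = \frac{(E_1^{\rm LAB} + m_2 c^2)^2}{c^2} - (\mathbf{p}_1^{\rm LAB})^2, \qquad E_1^{\rm LAB} = \sqrt{(\mathbf{p}_1^{\rm LAB})^2 c^2 + m_1^2 c^4},$$

$$\Rightarrow \quad E_{\rm TOT}^{\rm CM} = \sqrt{(m_1^2 + m_2^2)\, c^4 + 2\, m_2\, E_1^{\rm LAB}\, c^2}. \tag{13.11}$$

Moreover,

$$p_1 p_2 = -(\mathbf{p}_1^{\rm CM})^2 - \frac{E_1^{\rm CM} E_2^{\rm CM}}{c^2}, \qquad (p_1 p_2)^2 = (\mathbf{p}_1^{\rm CM})^4 + 2\, (\mathbf{p}_1^{\rm CM})^2\, \frac{E_1^{\rm CM} E_2^{\rm CM}}{c^2} + \frac{(E_1^{\rm CM} E_2^{\rm CM})^2}{c^4},$$

$$(p_1 p_2)^2 = (\mathbf{p}_1^{\rm CM})^2 \Big[2\, (\mathbf{p}_1^{\rm CM})^2 + (m_1^2 + m_2^2)\, c^2 + 2\, \frac{E_1^{\rm CM} E_2^{\rm CM}}{c^2}\Big] + m_1^2 m_2^2 c^4,$$

$$(p_1 p_2)^2 - m_1^2 m_2^2 c^4 = (\mathbf{p}_1^{\rm CM})^2 \Big[2\, (\mathbf{p}_1^{\rm CM})^2 + (m_1^2 + m_2^2)\, c^2 + 2\, \frac{E_1^{\rm CM} E_2^{\rm CM}}{c^2}\Big],$$

$$s \equiv -(p_1 + p_2)^2 = -p_1^2 - p_2^2 - 2\, p_1 p_2 = (m_1^2 + m_2^2)\, c^2 + 2\, (\mathbf{p}_1^{\rm CM})^2 + 2\, \frac{E_1^{\rm CM} E_2^{\rm CM}}{c^2} = \frac{(p_1 p_2)^2 - m_1^2 m_2^2 c^4}{(\mathbf{p}_1^{\rm CM})^2}, \tag{13.12}$$

$$\Rightarrow \quad |\mathbf{p}_1^{\rm CM}| = \frac{\sqrt{(p_1 p_2)^2 - m_1^2 m_2^2 c^4}}{\sqrt{s}}, \quad \text{also} \quad |\mathbf{p}_1^{\rm CM}| = \frac{m_2\, |\mathbf{p}_1^{\rm LAB}|\, c^2}{E_{\rm TOT}^{\rm CM}}, \tag{13.13}$$

$$\Rightarrow \quad \sqrt{(p_1 p_2)^2 - m_1^2 m_2^2 c^4} = \frac{1}{2} \sqrt{s^2 - 2\, (m_1^2 + m_2^2)\, c^2\, s + (m_1^2 - m_2^2)^2 c^4}. \tag{13.14}$$

In particular, for n = 2, we may introduce the Lorentz invariant Mandelstam variables:

[Figure: two-body process with incoming four momenta $p_1$, $p_2$ and outgoing four momenta $p'_1$, $p'_2$.]

$$s = -(p_1 + p_2)^2 = -(p'_1 + p'_2)^2, \qquad t = -(p_1 - p'_1)^2 = -(p_2 - p'_2)^2, \tag{13.15}$$

$$u = -(p_1 - p'_2)^2 = -(p'_1 - p_2)^2, \quad \text{such that} \quad s + t + u = \big(m_1^2 + m_2^2 + m'^2_1 + m'^2_2\big)\, c^2. \tag{13.16}$$

Successive infinitesimal Lorentz transformations: Thomas Precession: Consider three frames F, F′, F″ with relative motion as given below:

• Velocity of frame F′ in frame F: $\mathbf{v}$.
• Velocity of frame F″ in frame F: $\mathbf{v} + \delta\mathbf{v}$.
• Velocity of frame F′ in frame F″: $-\Delta\mathbf{v}$.

[Figure: the frames F, F′ and F″, with F′ moving with velocity $\mathbf{v}$ and F″ with velocity $\mathbf{v} + \delta\mathbf{v}$ relative to F, and relative velocity $\Delta\mathbf{v}$ between F′ and F″.]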

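We consider the consequence of the underlying successive Lorentz transformations. We first express the velocity $\Delta\mathbf{v}$ in terms of the infinitesimal velocity $\delta\mathbf{v}$. To this end, since $\mathbf{v} + \delta\mathbf{v}$ is the velocity of frame F″ in F, we may replace $\mathbf{v}$ in (13.2) by $\mathbf{v} + \delta\mathbf{v}$. On the other hand, since $-\Delta\mathbf{v}$ denotes the velocity of frame F′ in F″, while the velocity of F′ in F is $\mathbf{v}$, we may replace $\mathbf{u}$ in (13.2) by $\mathbf{v}$, and replace $\mathbf{u}'$ by $-\Delta\mathbf{v}$. These replacements lead from (13.2) to the equation:

$$-\Delta\mathbf{v} = \frac{\mathbf{v} - \gamma\, (\mathbf{v} + \delta\mathbf{v}) + (\gamma - 1)\, (\mathbf{v} + \delta\mathbf{v})\, \big((\mathbf{v} + \delta\mathbf{v})\cdot\mathbf{v}/(\mathbf{v} + \delta\mathbf{v})^2\big)}{\gamma\, \big[1 - (\mathbf{v} + \delta\mathbf{v})\cdot\mathbf{v}/c^2\big]}, \tag{13.17}$$

$$\gamma = \frac{1}{\sqrt{1 - (\mathbf{v} + \delta\mathbf{v})^2/c^2}}. \tag{13.18}$$

The following approximations are easily verified:

$$\frac{1}{(\mathbf{v} + \delta\mathbf{v})^2} \simeq \frac{1}{v^2}\Big(1 - 2\, \frac{\mathbf{v}\cdot\delta\mathbf{v}}{v^2}\Big), \tag{13.19}$$

$$\frac{1}{1 - (\mathbf{v} + \delta\mathbf{v})\cdot\mathbf{v}/c^2} \simeq \frac{1}{(1 - v^2/c^2)}\Big(1 + \frac{\mathbf{v}\cdot\delta\mathbf{v}/c^2}{(1 - v^2/c^2)}\Big), \tag{13.20}$$

$$\frac{1}{\sqrt{1 - (\mathbf{v} + \delta\mathbf{v})^2/c^2}} \simeq \gamma_0\Big(1 + \gamma_0^2\, \frac{\mathbf{v}\cdot\delta\mathbf{v}}{c^2}\Big), \quad \text{where } \gamma_0 = \frac{1}{\sqrt{1 - v^2/c^2}}. \tag{13.21}$$

Some algebra leads to the following approximate expression for $\Delta\mathbf{v}$:

$$\Delta\mathbf{v} = \gamma_0\, \delta\mathbf{v} + \gamma_0\, (\gamma_0 - 1)\, \mathbf{v}\, \frac{\mathbf{v}\cdot\delta\mathbf{v}}{v^2}, \qquad \gamma_0 = \frac{1}{\sqrt{1 - v^2/c^2}}, \tag{13.22}$$

which is the velocity of frame F″ relative to F′. Accordingly, for infinitesimal $\delta\mathbf{v}$, we have

$$\mathbf{r}'' \simeq \mathbf{r}' - \Delta\mathbf{v}\, t', \qquad \mathbf{r}' = \mathbf{r} - \gamma_0\, \mathbf{v}\, t + (\gamma_0 - 1)\, \mathbf{v}\, \frac{\mathbf{v}\cdot\mathbf{r}}{v^2}, \qquad t' = \gamma_0\Big(t - \frac{\mathbf{v}\cdot\mathbf{r}}{c^2}\Big), \tag{13.23}$$

$$\mathbf{r}'' \simeq \mathbf{r} - \gamma_0\, \mathbf{v}\, t + (\gamma_0 - 1)\, \mathbf{v}\, \frac{\mathbf{v}\cdot\mathbf{r}}{v^2} - \Big[\gamma_0\, \delta\mathbf{v} + \gamma_0\, (\gamma_0 - 1)\, \mathbf{v}\, \frac{\mathbf{v}\cdot\delta\mathbf{v}}{v^2}\Big]\, \gamma_0\Big(t - \frac{\mathbf{v}\cdot\mathbf{r}}{c^2}\Big). \tag{13.24}$$

The coefficient of t is given by $-\gamma_0 \big[\mathbf{v} + \gamma_0\, \delta\mathbf{v} + \gamma_0 (\gamma_0 - 1)\, \mathbf{v}\, (\mathbf{v}\cdot\delta\mathbf{v})/v^2\big]$. The Lorentz factor for F → F″, however, is given by

$$\rho = \frac{1}{\sqrt{1 - (\mathbf{v} + \delta\mathbf{v})^2/c^2}} \simeq \gamma_0\Big(1 + \gamma_0^2\, \frac{\mathbf{v}\cdot\delta\mathbf{v}}{c^2}\Big), \tag{13.25}$$

and the coefficient of t may be rewritten approximately as

$$-\rho\, \Big[\mathbf{v} + \delta\mathbf{v} + \frac{(\gamma_0 - 1)}{v^2}\big(\delta\mathbf{v}\, v^2 - \mathbf{v}\, (\mathbf{v}\cdot\delta\mathbf{v})\big)\Big] = -\rho\, \Big[\mathbf{v} + \delta\mathbf{v} - \frac{(\gamma_0 - 1)}{v^2}\, (\delta\mathbf{v}\times\mathbf{v})\times\mathbf{v}\Big]. \tag{13.26}$$

Upon setting

$$\mathbf{v} + \delta\mathbf{v} = \mathbf{u}, \qquad \mathbf{v} + \delta\mathbf{v} - \frac{(\gamma_0 - 1)}{v^2}\, (\delta\mathbf{v}\times\mathbf{v})\times\mathbf{v} = \mathbf{u}_R, \tag{13.27}$$

and taking into account that

$$\frac{(\rho - 1)}{u^2} \simeq \frac{(\gamma_0 - 1)}{v^2}\Big[1 + (\gamma_0 - 1)(\gamma_0 + 2)\, \frac{\mathbf{v}\cdot\delta\mathbf{v}}{v^2}\Big], \tag{13.28}$$

some lengthy, though straightforward, algebra gives

$$\mathbf{r}'' \simeq \mathbf{r}_R + \frac{(\rho - 1)}{u^2}\, \mathbf{u}_R\, (\mathbf{u}\cdot\mathbf{r}) - \rho\, \mathbf{u}_R\, t, \qquad \mathbf{r}_R = \mathbf{r} - \frac{(\gamma_0 - 1)}{v^2}\, (\delta\mathbf{v}\times\mathbf{v})\times\mathbf{r}, \tag{13.29}$$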


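which is a homogeneous Lorentz transformation including rotation, referred to as a Wigner rotation.² To see this, note that with $r^i \equiv x^i$ and $r^i_R \equiv x^i_R$, the second equation above may be rewritten as

\[ x^i_R = x^i - \varepsilon^{ijk}\,\frac{(\gamma_0 - 1)}{v^2}\,(\delta\mathbf{v}\times\mathbf{v})^j\,x^k, \tag{13.30} \]

from which, upon comparison with Eq. (11.30) of Chap. 11, we may infer that this refers to a rotation of the coordinate system $F''$ by an infinitesimal angle

\[ \delta\xi = \frac{(\gamma_0 - 1)}{v^2}\,\big|\delta\mathbf{v}\times\mathbf{v}\big|, \qquad \big(\sin\delta\xi \approx \delta\xi,\ \cos\delta\xi \approx 1\big), \tag{13.31} \]

about the vector $(\delta\mathbf{v}\times\mathbf{v})$. To gain more insight into this finding, we recall that we have a frame $F'$ moving with velocity $\mathbf{v}$, and then a frame $F''$ moving with velocity $\mathbf{v}+\delta\mathbf{v}$, both relative to our frame $F$. Consider, for example, the spin of the electron. As the electron accelerates, say, from a velocity $\mathbf{v}$ to a velocity $\mathbf{v}+\delta\mathbf{v}$ in a frame $F$, during a time $t$ to $t+\delta t$, the spin would precess relative to $F$ with an angular velocity

\[ \boldsymbol{\omega} \simeq \frac{(\gamma_0 - 1)}{v^2}\Big(\frac{\delta\mathbf{v}}{\delta t}\times\mathbf{v}\Big) = \frac{1}{c^2}\,\frac{\gamma_0^2}{\gamma_0 + 1}\Big(\frac{\delta\mathbf{v}}{\delta t}\times\mathbf{v}\Big), \tag{13.32} \]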

referred to as the Thomas Precession.³ In a non-relativistic limit $v^2 \ll c^2$, the above angular velocity becomes simply⁴ $\boldsymbol{\omega} \simeq (1/2c^2)\,(\delta\mathbf{v}/\delta t)\times\mathbf{v}$.

The Energy-Momentum Tensor: We may introduce a second rank symmetric tensor $T^{\mu\nu}$, referred to as the energy-momentum tensor, with the following interpretations of its components:
• $T^{00}$ denotes the energy density $\rho$ of a medium (times $c^2$),
• $T^{0j}$ denotes the momentum density in the $j$th direction (times $c$),
• $T^{ij}$ denotes the $j$th component of momentum passing, per unit time, across a surface whose normal is in the $i$th direction.

The decrease in momentum along a direction specified by index $j$ through a surface $S$ enclosing a volume $V$ is, by definition,

\[ -\frac{\partial}{\partial t}\int_V \mathrm{d}^3\mathbf{r}\;\frac{T^{0j}}{c} = \oint_S \mathrm{d}S\; n^i\,T^{ij} = \int_V \mathrm{d}^3\mathbf{r}\;\partial_i T^{ij}, \tag{13.33} \]

where in the last equality we have used Gauss' theorem, and $\mathbf{n} = (n^1, n^2, n^3)$ is a unit vector. For an infinitesimal $V$, this is expressed locally by $\partial_\mu T^{\mu j} = 0$, and, invoking covariance, by $\partial_\mu T^{\mu\nu} = 0$, which represents the conservation of the energy-momentum tensor. On the other hand, conservation of $T^{\mu\nu}$ implies, for $P^\nu$ denoting the total energy-momenta of a system of particles, that if $V$ is replaced by all space $\mathbb{R}^3$, and $|T^{i\nu}| \to 0$ for $|\mathbf{r}| \to \infty$, then

\[ \frac{\partial P^\nu}{\partial t} = \frac{\partial}{c\,\partial t}\int_{\mathbb{R}^3} \mathrm{d}^3\mathbf{r}\;T^{0\nu} = -\int_{\mathbb{R}^3} \mathrm{d}^3\mathbf{r}\;\partial_i T^{i\nu} = -\oint \mathrm{d}S\; n^i\,T^{i\nu} \to 0, \tag{13.34} \]

for a surface of infinite radius, implying the conservation of the total momentum $P^\nu$.

For a continuous medium, consider a perfect fluid, that is, a medium which is isotropic in an inertial frame moving with the fluid (the co-moving frame): the pressure applied to a given element of the fluid is transmitted equally in all directions. In such a frame the momentum density $T^{0j}$ vanishes. Denote its energy density in this co-moving frame by $\rho c^2$. Let $U^\mu$ be the components of the four velocity vector of the fluid in any given frame. With $U^\mu$ and $\eta^{\mu\nu}$ as the only given vector and tensor, respectively, at hand, the general structure of the energy-momentum tensor for the fluid may be written as

\[ T^{\mu\nu} = A\,\eta^{\mu\nu} + B\,U^\mu U^\nu. \tag{13.35} \]

Since $\eta^{\mu\nu}$ and $U^\mu U^\nu$ correspond to tensors, the coefficients $A$ and $B$ must be scalars, that is, they must be invariant. Hence, to find $A$ and $B$, we may go to the co-moving IF in which $U^\mu = (c, \mathbf{0})$. In particular, $T^{0j} = T^{j0} = 0$, and isotropy implies that $T^{ij} = P\,\delta^{ij}$, where $P$ is the pressure determined in the co-moving frame, and hence $A = P$. Also $\rho c^2 = T^{00} = -P + B c^2$, from which

\[ B = \rho + \frac{P}{c^2}, \tag{13.36} \]

where $\rho$ is the density. That is, for a perfect fluid,⁵

\[ T^{\mu\nu} = P\,\eta^{\mu\nu} + \Big(\rho + \frac{P}{c^2}\Big)U^\mu U^\nu, \qquad U^\mu \eta_{\mu\nu} U^\nu = -c^2. \tag{13.37} \]

² Wigner [3].
³ Thomas [2].
⁴ This has an observable effect in the spectrum of the hydrogen atom as it contributes to the spin-orbit coupling, where the dynamics of the spin of the electron, due to its interaction with the magnetic field set up by the motion of the proton in the frame of the electron, is altered by this additional Thomas precession of the spin predicted by relativity. See, e.g., Manoukian [1], p. 376.
⁵ In a gravitational field, the same structure of the energy-momentum tensor of a perfect fluid holds true by invoking general covariance, which amounts to simply replacing $\eta^{\mu\nu}$ by the inverse $g^{\mu\nu}$ of the metric in the general case. That is, the following expression emerges in the general case: $T^{\mu\nu} = P\,g^{\mu\nu} + (\rho + P/c^2)\,U^\mu U^\nu$, and $U^\mu g_{\mu\nu} U^\nu = -c^2$.


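The pressure may be a function of the energy density and other variables, and the conservation law of the energy-momentum tensor will give rise to differential equations relating the underlying variables of the theory. In particular, for low velocities, going up to second order in $\mathbf{v}$, with $P/c^2 \ll \rho$: $T^{00} \simeq \rho c^2$, $T^{0j} \simeq \rho\,c\,v^j$, $T^{ij} \simeq P\,\delta^{ij} + (\rho + P/c^2)\,v^i v^j \simeq P\,\delta^{ij} + \rho\,v^i v^j$, and the conservation laws $\partial_\mu T^{\mu 0} = 0$, $\partial_\mu T^{\mu j} = 0$ lead, together, to the equations

\[ \frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\,\mathbf{v}) = 0, \qquad \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\,\mathbf{v} = -\frac{\nabla P}{\rho}, \qquad \mathbf{v} = (v^1, v^2, v^3). \tag{13.38} \]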

For a cloud of dust particles (of low velocities), i.e., of non-interacting particles, the particles are at rest in the co-moving IF, and hence the energy-momentum tensor for the cloud may be obtained from the one above for the perfect fluid by setting the pressure $P$ essentially equal to zero in (13.37).⁶
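The first-order expression (13.22) for $\Delta\mathbf{v}$, and the two forms of the Thomas coefficient in (13.32), lend themselves to a quick numerical check. The sketch below assumes units with $c = 1$ and assumes that (13.2) is the standard relativistic velocity-composition law, written here in the form used in (13.17); the function names and the sample velocities are illustrative only.

```python
import math

C = 1.0  # units with c = 1 (an assumption made for this sketch)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gamma(v):
    return 1.0 / math.sqrt(1.0 - dot(v, v) / C**2)


def velocity_in_moving_frame(u, v):
    """Velocity u as seen from a frame moving with velocity v.

    Assumed standard composition law:
    u' = [u - g v + (g - 1) v (v.u)/v^2] / [g (1 - v.u/c^2)].
    """
    g = gamma(v)
    vu = dot(v, u)
    v2 = dot(v, v)
    denom = g * (1.0 - vu / C**2)
    return [(ui - g * vi + (g - 1.0) * vi * vu / v2) / denom
            for ui, vi in zip(u, v)]


def delta_v_first_order(v, dv):
    """First-order result of Eq. (13.22)."""
    g0 = gamma(v)
    coef = (g0 - 1.0) * dot(v, dv) / dot(v, v)
    return [g0 * (dvi + coef * vi) for vi, dvi in zip(v, dv)]


def thomas_coefficient(v):
    """(gamma0 - 1)/v^2, the factor multiplying (dv/dt) x v in Eq. (13.32)."""
    return (gamma(v) - 1.0) / dot(v, v)


v = [0.6, 0.0, 0.0]
dv = [1e-6, 2e-6, 0.0]
u = [a + b for a, b in zip(v, dv)]

exact = velocity_in_moving_frame(u, v)   # velocity of F'' relative to F'
approx = delta_v_first_order(v, dv)
max_err = max(abs(e - a) for e, a in zip(exact, approx))
print(max_err)  # second order in dv, hence far smaller than |dv|
```

The residual is of second order in $\delta\mathbf{v}$, which is what one expects from an expansion truncated at first order.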

References
1. Manoukian, E. B. (2006). Quantum theory: A wide spectrum. Dordrecht: Springer.
2. Thomas, L. H. (1926). Motion of the spinning electron. Nature, 117, 514.
3. Wigner, E. P. (1939). On unitary representations of the inhomogeneous Lorentz group. Annals of Mathematics, 40, 149–204.

⁶ See also Eq. (83.26) in Chap. 83, which relates density and pressure for such a cloud of dust particles and shows that the pressure is negligible in comparison to the density $\rho$ by a factor of the order $v^2$.

14 First Unified Field Theory: Maxwell’s Equations and Lorentz Covariance

Prerequisites: Chap. 11

Maxwell’s equations unify electricity, magnetism, and optics. They give rise to the first relativistic dynamical theory ever devised by men. In rationalized Gaussian units, in vacuum, they are given by¹

\[ \nabla\cdot\mathbf{E} = \rho, \qquad \nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \qquad \nabla\cdot\mathbf{B} = 0, \qquad \nabla\times\mathbf{B} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{\mathbf{J}}{c}, \tag{14.1} \]

where the first equation is a consequence of Coulomb’s law, the second encompasses Faraday’s law of induction, the third is a statement of the absence of magnetic charge (monopoles), and the last is a statement of Ampère’s law. $\mathbf{E}$ and $\mathbf{B}$ denote, respectively, the electric and magnetic fields, while $\rho$ and $\mathbf{J}$ denote, respectively, the charge and current densities.

Maxwell’s equations involve space and time derivatives of the electric and magnetic fields. That is, these two fields are to be represented by a second rank tensor $F^{\mu\nu}$. The electric and magnetic fields, together, involve six components. That is, the tensor $F^{\mu\nu}$ must be an antisymmetric tensor in its indices, $F^{\mu\nu} = -F^{\nu\mu}$, thus involving only six components. Moreover, note that in the time derivative in (14.1), $t$ is multiplied by $c$, and $ct = x^0$.

Consider an infinitesimal portion of charge $\mathrm{d}q = \rho_0\,\mathrm{d}V_0$, within a volume $\mathrm{d}V_0$, instantaneously at rest, with corresponding charge density $\rho_0$. An argument by Pauli is that there is no reason for such an infinitesimal charge to leak outside the corresponding volume $\mathrm{d}V$ when the infinitesimal charge $\mathrm{d}q$ is set in motion (with velocity $\mathbf{u}$), implying the invariance of the infinitesimal charge

\[ \rho_0\,\mathrm{d}V_0 = \rho\,\mathrm{d}V \equiv \mathrm{d}q, \qquad \mathrm{d}V = \sqrt{1 - \frac{u^2}{c^2}}\;\mathrm{d}V_0, \quad\Rightarrow\quad \rho = \gamma\rho_0, \tag{14.2} \]

by using, in the process, Eq. (11.26). Moreover, by definition $\mathbf{J} = \rho\,\mathbf{u}$. We may thus write

\[ \rho = \rho_0\,\frac{U^0}{c}, \qquad \mathbf{J} = \rho_0\,\mathbf{U} = \rho_0\,\gamma\,\mathbf{u}, \qquad U^\mu = (\gamma c, \gamma\mathbf{u}), \qquad \gamma = \frac{1}{\sqrt{1 - u^2/c^2}}, \tag{14.3} \]

\[ J^\mu = \big(\rho c, \mathbf{J}\big), \tag{14.4} \]

where the expression for $U^\mu$, for a given velocity, is given in (13.8). Now we may simply set:

\[ E^i = F^{0i}, \qquad B^i = \frac{1}{2}\,\varepsilon^{ijk}F^{jk}, \qquad F^{ij} = \varepsilon^{ijk}B^k, \qquad \text{and note that } \big(\nabla\times\mathbf{B}\big)^i = \varepsilon^{ijk}\,\partial_j B^k, \tag{14.5} \]

where

\[ \varepsilon^{ijk} = \begin{cases} +1 & \text{if } \{i,j,k\} \text{ is an even permutation of } \{1,2,3\}, \\ -1 & \text{if } \{i,j,k\} \text{ is an odd permutation of } \{1,2,3\}, \\ \ \ 0 & \text{if two or three of the indices in } \{i,j,k\} \text{ are equal.} \end{cases} \tag{14.6} \]

¹ In Gaussian units $\rho$ and $\mathbf{J}$ get multiplied by $4\pi$. Rationalized Gaussian units are widely used in modern field theory.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_14


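The first and last equations, and the second and third equations, in (14.1) may now be combined, respectively, to read

\[ \partial_\mu F^{\mu\nu} = -\frac{J^\nu}{c}, \qquad \partial^\mu F^{\sigma\nu} + \partial^\nu F^{\mu\sigma} + \partial^\sigma F^{\nu\mu} = 0, \tag{14.7} \]

\[ J^\nu = \big(\rho c, \mathbf{J}\big) = \rho_0\,U^\nu, \qquad E^i = F^{0i}, \qquad B^i = \frac{1}{2}\,\varepsilon^{ijk}F^{jk}, \qquad \text{with } \partial_\mu J^\mu = 0, \tag{14.8} \]

where note that $\partial_\nu\partial_\mu F^{\mu\nu} = 0$ due to the anti-symmetric nature of $F^{\mu\nu}$ in its indices, and hence $\partial_\mu J^\mu = 0$, which simply expresses the conservation of charge: $\partial\rho/\partial t + \nabla\cdot\mathbf{J} = 0$. The second rank tensor field $F^{\mu\nu}$ is referred to as the Faraday tensor. It transforms as the product of two vectors under a homogeneous Lorentz transformation, with a velocity $\mathbf{v}$ relative to the initial frame,

\[ F'^{\mu\nu} = \Lambda^\mu{}_\lambda\,\Lambda^\nu{}_\sigma\,F^{\lambda\sigma}, \tag{14.9} \]

\[ \mathbf{E}'_\parallel = \mathbf{E}_\parallel, \qquad \mathbf{B}'_\parallel = \mathbf{B}_\parallel, \qquad \mathbf{E}'_\perp = \gamma\Big[\mathbf{E}_\perp + \frac{\mathbf{v}}{c}\times\mathbf{B}\Big], \qquad \mathbf{B}'_\perp = \gamma\Big[\mathbf{B}_\perp - \frac{\mathbf{v}}{c}\times\mathbf{E}\Big], \tag{14.10} \]

\[ \gamma = \frac{1}{\sqrt{1 - v^2/c^2}}, \qquad v = |\mathbf{v}|, \tag{14.11} \]

where $\parallel$ and $\perp$ denote components parallel and perpendicular to the 3-vector $\mathbf{v}$, respectively. In particular, note that $\mathbf{E}'_\parallel\cdot\mathbf{B}'_\parallel = \mathbf{E}_\parallel\cdot\mathbf{B}_\parallel$, and

\[ \begin{aligned} \mathbf{E}'_\perp\cdot\mathbf{B}'_\perp &= \gamma^2\Big[\mathbf{E}_\perp\cdot\mathbf{B}_\perp - \mathbf{E}_\perp\cdot\Big(\frac{\mathbf{v}}{c}\times\mathbf{E}\Big) + \mathbf{B}_\perp\cdot\Big(\frac{\mathbf{v}}{c}\times\mathbf{B}\Big) - \Big(\frac{\mathbf{v}}{c}\times\mathbf{E}\Big)\cdot\Big(\frac{\mathbf{v}}{c}\times\mathbf{B}\Big)\Big] \\ &= \gamma^2\Big[\mathbf{E}_\perp\cdot\mathbf{B}_\perp - 0 + 0 - \frac{v^2}{c^2}\big(\mathbf{E}\cdot\mathbf{B} - \mathbf{E}_\parallel\cdot\mathbf{B}_\parallel\big)\Big] = \mathbf{E}_\perp\cdot\mathbf{B}_\perp, \end{aligned} \tag{14.12} \]

as well, where we have used the identities $\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = \mathbf{B}\cdot(\mathbf{C}\times\mathbf{A})$ and $(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{C}\times\mathbf{D}) = (\mathbf{A}\cdot\mathbf{C})(\mathbf{B}\cdot\mathbf{D}) - (\mathbf{A}\cdot\mathbf{D})(\mathbf{B}\cdot\mathbf{C})$. Accordingly, $\mathbf{E}\cdot\mathbf{B}$ is an invariant. A similar analysis shows that $\mathbf{E}^2 - \mathbf{B}^2$ is also an invariant. One may, in turn, define the dual of $F_{\lambda\sigma}$ by

\[ {}^{*}F^{\mu\nu} = \frac{1}{2}\,\varepsilon^{\mu\nu\lambda\sigma}F_{\lambda\sigma}, \tag{14.13} \]

where $\varepsilon^{\mu\nu\lambda\sigma}$, defined in analogy with $\varepsilon^{ijk}$, is totally anti-symmetric with $\varepsilon^{0123} = +1$, to express the above two invariants in the form

\[ -\frac{1}{4}\,{}^{*}F^{\mu\nu}F_{\mu\nu} = \mathbf{E}\cdot\mathbf{B}, \qquad -\frac{1}{4}\,F^{\mu\nu}F_{\mu\nu} = \frac{1}{2}\big(\mathbf{E}^2 - \mathbf{B}^2\big). \tag{14.14} \]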
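The invariance of $\mathbf{E}\cdot\mathbf{B}$ and $\mathbf{E}^2 - \mathbf{B}^2$ under the transformation (14.10) can be illustrated numerically. This is a minimal sketch assuming units with $c = 1$ and arbitrary sample field values; the helper names `split` and `boost_fields` are hypothetical and introduced only for this illustration.

```python
import math

C = 1.0  # units with c = 1 (an assumption made for this sketch)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def split(a, v):
    """Return the components of a parallel and perpendicular to v."""
    s = dot(a, v) / dot(v, v)
    par = [s * x for x in v]
    perp = [x - p for x, p in zip(a, par)]
    return par, perp


def boost_fields(E, B, v):
    """Transform (E, B) to a frame moving with velocity v, following Eq. (14.10)."""
    g = 1.0 / math.sqrt(1.0 - dot(v, v) / C**2)
    E_par, E_perp = split(E, v)
    B_par, B_perp = split(B, v)
    v_x_B = cross(v, B)
    v_x_E = cross(v, E)
    E_new = [p + g * (t + x / C) for p, t, x in zip(E_par, E_perp, v_x_B)]
    B_new = [p + g * (t - x / C) for p, t, x in zip(B_par, B_perp, v_x_E)]
    return E_new, B_new


# Arbitrary sample values (assumed for illustration)
E = [1.0, 2.0, 3.0]
B = [-0.5, 0.4, 2.0]
v = [0.3, -0.2, 0.5]

E2, B2 = boost_fields(E, B, v)
print(dot(E, B), dot(E2, B2))                              # E.B before and after
print(dot(E, E) - dot(B, B), dot(E2, E2) - dot(B2, B2))    # E^2 - B^2 before and after
```

The individual field components change substantially under the boost, while the two printed pairs agree up to rounding.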

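Electromagnetic Waves: Away from charge distributions and current densities, i.e., in regions in which these densities are zero, Maxwell’s equations in vacuum read

\[ \nabla\cdot\mathbf{E} = 0, \qquad \nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{c\,\partial t}, \qquad \nabla\cdot\mathbf{B} = 0, \qquad \nabla\times\mathbf{B} = \frac{\partial\mathbf{E}}{c\,\partial t}. \tag{14.15} \]

Using the basic equations

\[ \frac{\partial^2\mathbf{E}}{c^2\,\partial t^2} = \frac{\partial(\nabla\times\mathbf{B})}{c\,\partial t} = -\nabla\times(\nabla\times\mathbf{E}) = -\nabla(\nabla\cdot\mathbf{E}) + \nabla^2\mathbf{E} = +\nabla^2\mathbf{E}, \tag{14.16} \]

we obtain

\[ \nabla^2\mathbf{E} = \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2}, \qquad \nabla^2\mathbf{B} = \frac{1}{c^2}\frac{\partial^2\mathbf{B}}{\partial t^2}, \qquad \text{where } \nabla^2 = \sum_{j=1}^{3}\Big(\frac{\partial}{\partial x^j}\Big)^2, \tag{14.17} \]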


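and similarly for $\mathbf{B}$, which are wave equations for the fields in question, propagating with the speed of light $c$.

The Vector Potential: Solution of Electrodynamics: Since $\nabla\cdot\mathbf{B} = 0$, we may introduce a vector field $\mathbf{A}$ and write $\mathbf{B} = \nabla\times\mathbf{A}$, since $\nabla\cdot(\nabla\times\mathbf{A}) = 0$. On the other hand, from the second equation in (14.1):

\[ \nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} = -\frac{1}{c}\frac{\partial(\nabla\times\mathbf{A})}{\partial t}, \qquad \nabla\times\Big[\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}\Big] = 0, \quad\Rightarrow\quad \mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} \equiv -\nabla\phi. \tag{14.18} \]

Thus, we may express the electric and magnetic fields in terms of the fields $(\phi, \mathbf{A})$ as follows:

\[ \mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{B} = \nabla\times\mathbf{A}. \tag{14.19} \]

It is readily verified that the Faraday tensor field $F^{\mu\nu}$ may be rewritten as

\[ F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu, \qquad A^\mu = (\phi, \mathbf{A}), \qquad F^{0i} = E^i, \qquad F^{ij} = \varepsilon^{ijk}B^k. \tag{14.20} \]

Clearly, if one carries out a gauge transformation $A^\mu \to A^\mu + \partial^\mu\lambda = A^\mu_{\rm new}$, for a given function $\lambda$, then $F^{\mu\nu}$ remains gauge invariant. The above gauge transformation implies that $\partial_\mu A^\mu \to \partial_\mu A^\mu + \Box\lambda = \partial_\nu A^\nu_{\rm new}$. Accordingly, if one chooses $\Box\lambda = -\partial_\mu A^\mu$, then one may appropriately choose a vector potential $A^\mu_{\rm new}$ such that

\[ \partial_\mu A^\mu_{\rm new} = 0, \tag{14.21} \]

often referred to as the Lorenz gauge. In subsequent analysis in this chapter we omit the word “new” in $A^\mu_{\rm new}$ while working in this gauge.²

Box 14.1 $[-\Box + m^2]^{-1}$ and Energy-Momentum Transfer: the Propagator

\[ [-\Box + m^2c^2]\,\Delta_+(x - x') = \delta^{(4)}(x - x'), \qquad \Box = \nabla^2 - \partial_0^2, \qquad (\mathrm{d}k) = \mathrm{d}k^0\,\mathrm{d}k^1\,\mathrm{d}k^2\,\mathrm{d}k^3, \]

\[ \delta^{(4)}(x - x') = \int\frac{(\mathrm{d}k)}{(2\pi)^4}\,\exp[\mathrm{i}k(x - x')], \qquad k(x - x') = \mathbf{k}\cdot(\mathbf{x} - \mathbf{x}') - k^0(x^0 - x'^0), \]

and

\[ \frac{1}{-\Box + m^2c^4}\,\exp[\mathrm{i}k(x - x')] = \frac{1}{k^2 + m^2c^4}\,\exp[\mathrm{i}k(x - x')], \qquad (k^2 + m^2c^4) = -(k^0 - E)(k^0 + E), \]

\[ E = \sqrt{\mathbf{k}^2 + m^2c^4}, \qquad \Delta_+(x - x') = -\int\frac{(\mathrm{d}k)}{(2\pi)^4}\,\frac{\exp[\mathrm{i}k(x - x')]}{(k^0 - E)(k^0 + E)}. \]

[Figure: the complex $k^0$-plane, with poles at $-E + \mathrm{i}\epsilon$ and $E - \mathrm{i}\epsilon$.]

Consider the complex $k^0$-plane. Let ${\rm Im}\,k^0$ denote the imaginary part of $k^0$. For $x^0 > x'^0$, the real part of $\exp[-\mathrm{i}k^0(x^0 - x'^0)] \to 0$ for ${\rm Im}\,k^0 \to -\infty$, i.e., we may close the contour of integration from below, picking up the pole at $k^0 = E$. Similarly, for $x^0 < x'^0$ we may close the contour from above, picking up the pole at $k^0 = -E$.

Residue Theorem:

\[ x^0 > x'^0: \quad \int\frac{\mathrm{d}k^0}{2\pi}\,\frac{\exp[-\mathrm{i}k^0(x^0 - x'^0)]}{(k^0 - E)(k^0 + E)} = -\,\mathrm{i}\,\frac{\exp[-\mathrm{i}E(x^0 - x'^0)]}{2E}, \]

\[ x^0 < x'^0: \quad \int\frac{\mathrm{d}k^0}{2\pi}\,\frac{\exp[-\mathrm{i}k^0(x^0 - x'^0)]}{(k^0 - E)(k^0 + E)} = +\,\mathrm{i}\,\frac{\exp[+\mathrm{i}E(x^0 - x'^0)]}{-2E}, \]

hence

\[ \Delta_+(x - x') = \mathrm{i}\int\frac{\mathrm{d}^3\mathbf{k}}{(2\pi)^3\,2k^0}\,\exp[\mathrm{i}\mathbf{k}\cdot(\mathbf{x} - \mathbf{x}')]\,\exp[-\mathrm{i}\,k^0\,|x^0 - x'^0|\,], \qquad k^0 = +\sqrt{\mathbf{k}^2 + m^2c^4}. \]

This may also be rewritten as

\[ \Delta_+(x - x') = \int\frac{(\mathrm{d}k)}{(2\pi)^4}\,\frac{\exp[\mathrm{i}k(x - x')]}{(k^2 + m^2c^4 - \mathrm{i}\epsilon)}, \qquad \epsilon \to +0, \]

since $(k^2 + m^2c^4 - \mathrm{i}\epsilon) = -[\,k^0 - (E - \mathrm{i}\epsilon)\,][\,k^0 - (-E + \mathrm{i}\epsilon)\,]$, $\epsilon \to +0$.

² More details on gauge transformations will be given later in Chap. 17.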


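From Eq. (14.7) we may then write, in the Lorenz gauge and setting here $c$ equal to one for simplicity of the notation,

\[ \partial_\mu F^{\mu\nu}(x) = -J^\nu(x), \qquad \partial_\mu A^\mu(x) = 0 \tag{14.22} \]

\[ \rightarrow\quad \Box A^\nu(x) = -J^\nu(x), \qquad \Box \equiv \nabla^2 - \partial_0^2, \tag{14.23} \]

\[ \text{hence}\quad A^\nu(x) = \int(\mathrm{d}x')\,D_+(x - x')\,J^\nu(x'), \qquad (\mathrm{d}x') \equiv \mathrm{d}x'^0\,\mathrm{d}x'^1\,\mathrm{d}x'^2\,\mathrm{d}x'^3, \tag{14.24} \]

\[ \text{where}\quad -\Box D_+(x - x') = \delta^{(4)}(x - x'), \tag{14.25} \]

so that Eq. (14.23) holds, as is easily verified using the property of the integral over the Dirac delta $\delta^{(4)}(x - x')$: $\int(\mathrm{d}x')\,\delta^{(4)}(x - x')\,J^\nu(x') = J^\nu(x)$, and from Box 14.1,

\[ D_+(x - x') = \Delta_+(x - x')\Big|_{m=0} = \int\frac{(\mathrm{d}k)}{(2\pi)^4}\,\frac{\mathrm{e}^{\mathrm{i}k(x - x')}}{k^2 - \mathrm{i}\epsilon} = \mathrm{i}\int\frac{\mathrm{d}^3\mathbf{k}}{(2\pi)^3\,2|\mathbf{k}|}\times\begin{cases} \mathrm{e}^{+\mathrm{i}k(x - x')}, & \text{for } x^0 > x'^0, \\ \mathrm{e}^{-\mathrm{i}k(x - x')}, & \text{for } x^0 < x'^0, \end{cases} \tag{14.26} \]

with $k^0 = |\mathbf{k}|$ in the above two cases. Hence, in particular, for $x^0 > x'^0$, at which point the current $J^\nu(x)$ vanishes,

\[ A^\nu(x) = \mathrm{i}\int\frac{\mathrm{d}^3\mathbf{k}}{(2\pi)^3\,2|\mathbf{k}|}\int(\mathrm{d}x')\,\mathrm{e}^{+\mathrm{i}k(x - x')}\,J^\nu(x'), \qquad k = (\,|\mathbf{k}|, \mathbf{k}\,). \tag{14.27} \]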

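In the following chapters, involving particle dynamics which arise from the exchange of energy-momentum between particles, as well as from the creation of new particles in scattering processes, the necessity arises of interpreting the inverse operator $[-\Box + m^2]^{-1}$ associated with a particle of given mass $m$, referred to as a propagator.³ The significance of this inverse operator is investigated in Box 14.1, and much use of it will be made in subsequent chapters. The $\mathrm{i}\epsilon$ factor in the propagator in Box 14.1 specifies what is appropriately called the Schwinger–Feynman physical boundary condition.

The Lorentz Force and Relativity: To find an equation of motion of a charged particle in an external electromagnetic field $F^{\mu\nu}$ consistent with SR, we first introduce the variables

\[ U^\mu = (\gamma c, \gamma\mathbf{u}), \qquad \mathbf{p} = m\mathbf{U} = m\gamma\mathbf{u}, \qquad p^0 = mU^0 = m\gamma c, \qquad \gamma = \frac{1}{\sqrt{1 - u^2/c^2}}, \qquad \mathrm{d}t = \gamma\,\mathrm{d}\tau, \tag{14.28} \]

\[ p^\mu p_\mu = -m^2c^2. \tag{14.29} \]

We also recall from Eq. (13.6) in Chap. 13 that $E = mcU^0$, where

\[ E = mc^2\gamma = \frac{mc^2}{\sqrt{1 - u^2/c^2}} \simeq mc^2 + \frac{1}{2}\,mu^2 + O(u^4), \tag{14.30} \]

denotes the total energy of a particle, with $mc^2$ denoting its rest energy. A covariant equation may now be defined as follows and, in turn, determines the four vector with components $f^\mu$ introduced below:

\[ \frac{\mathrm{d}}{\mathrm{d}\tau}\,p^\mu = f^\mu \quad\Rightarrow\quad p_\mu\frac{\mathrm{d}}{\mathrm{d}\tau}\,p^\mu = 0, \qquad p_\mu f^\mu = 0, \qquad \text{from which } f^0 = \frac{\mathbf{p}\cdot\mathbf{f}}{p^0}, \tag{14.31} \]

using, in the process, that $p_\mu\,\mathrm{d}p^\mu/\mathrm{d}\tau = (1/2)\,\mathrm{d}(p_\mu p^\mu)/\mathrm{d}\tau = -(1/2)\,\mathrm{d}(m^2c^2)/\mathrm{d}\tau = 0$. On the other hand,

\[ \frac{\mathrm{d}}{\mathrm{d}\tau}\,\mathbf{p} = \mathbf{f}, \qquad \frac{\mathrm{d}}{\mathrm{d}\tau}\,m\mathbf{U} = \frac{\mathrm{d}t}{\mathrm{d}\tau}\frac{\mathrm{d}}{\mathrm{d}t}\,m\mathbf{U} = \gamma\,\frac{\mathrm{d}}{\mathrm{d}t}\,m\mathbf{U} = \mathbf{f}, \qquad \text{while the force is defined by } \mathbf{F} = \frac{\mathrm{d}}{\mathrm{d}t}\,m\mathbf{U}, \]

\[ \text{hence } \mathbf{f} = \gamma\,\mathbf{F}, \qquad \frac{\mathrm{d}}{\mathrm{d}\tau}\,p^0 = f^0, \qquad \frac{\mathrm{d}}{\mathrm{d}t}\,mU^0 = \frac{1}{\gamma}\,f^0 = \frac{\mathbf{u}\cdot\mathbf{F}}{c}, \tag{14.32} \]

³ The propagators of different particles of different spins, as we will see later, involve such a factor; see, e.g., Chap. 16.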

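\[ \frac{\mathrm{d}}{\mathrm{d}t}\,E = \mathbf{u}\cdot\mathbf{F}, \qquad \text{and} \qquad f^\mu = \Big(\frac{\gamma}{c}\,\mathbf{u}\cdot\mathbf{F},\ \gamma\,\mathbf{F}\Big), \tag{14.33} \]

where $\mathrm{d}E/\mathrm{d}t = \mathbf{u}\cdot\mathbf{F}$ is the expression for power. Let $q$ be the charge of a test particle moving in an external electromagnetic field $F^{\mu\nu}$ produced by a given charge $Q$. Consider the rest frame $F'$ in which the charge $Q$ is instantaneously at rest (a co-moving frame). Let $\mathbf{u}$ denote the instantaneous velocity of charge $q$ in frame $F$, and let $\mathbf{u}'$ denote its velocity in frame $F'$ at the instant when charge $Q$ is at rest in it. It is an empirical fact that the force on a test charge $q$ due to a static charge $Q$ is given by

\[ \mathbf{F}' = q\,\mathbf{E}', \tag{14.34} \]

where $\mathbf{E}'$ is the electric field produced by charge $Q$ instantaneously at rest in frame $F'$. Hence

\[ \mathbf{f}' = q\,\gamma'\,\mathbf{E}', \qquad f'^0 = \frac{q}{c}\,\gamma'\,\mathbf{u}'\cdot\mathbf{E}', \qquad \gamma' = \frac{1}{\sqrt{1 - u'^2/c^2}}. \tag{14.35} \]

That is,

\[ f'^i = \frac{q}{c}\,U'^0 F'^{0i}, \qquad f'^0 = \frac{q}{c}\,F'^{0i}\,U'_i, \quad\Rightarrow\quad f'^\mu = \frac{q}{c}\,F'^{\mu\nu}\,U'_\nu \ \text{ in } F' \text{ in covariant form.} \tag{14.36} \]

Accordingly,

\[ f^\mu = \frac{q}{c}\,F^{\mu\nu}\,U_\nu \ \text{ in frame } F \text{ by invoking covariance, i.e.,} \tag{14.37} \]

\[ \frac{\mathrm{d}p^\mu}{\mathrm{d}\tau} = \frac{q}{c}\,F^{\mu\nu}\,U_\nu, \tag{14.38} \]

as the Lorentz covariant equation of motion of a test charged particle in an electromagnetic field $F^{\mu\nu}$. Since $F^{ij} = \varepsilon^{ijk}B_k$, and $\varepsilon^{ijk}u_j B_k = (\mathbf{u}\times\mathbf{B})^i$, we obtain the explicit expression for the force $\mathbf{F} = \mathrm{d}\mathbf{p}/\mathrm{d}t$ from (14.38), with $U^\mu$ given in (14.28), to be given by

\[ \mathbf{F} = q\Big[\mathbf{E} + \frac{\mathbf{u}}{c}\times\mathbf{B}\Big], \tag{14.39} \]

referred to as the Lorentz Force, where note that $F^{i0}U_0 = F^{0i}U^0$.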
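The step from the covariant equation (14.38) to the Lorentz force (14.39) can be checked component by component. The sketch below assumes units with $c = 1$, the metric signature $(-,+,+,+)$ used in (14.29), and arbitrary sample values for the charge, fields and velocity; the function names are hypothetical. It builds $F^{\mu\nu}$ from (14.20), evaluates $f^\mu = (q/c)F^{\mu\nu}U_\nu$, and compares with $(\gamma\,\mathbf{u}\cdot\mathbf{F}/c,\ \gamma\mathbf{F})$ from (14.33).

```python
import math

C = 1.0  # units with c = 1 (an assumption made for this sketch)


def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return ((i - j) * (j - k) * (k - i)) // 2


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def faraday_tensor(E, B):
    """F^{0i} = E^i, F^{ij} = eps^{ijk} B^k, antisymmetric, as in Eq. (14.20)."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
        for j in range(3):
            F[i + 1][j + 1] = sum(eps(i, j, k) * B[k] for k in range(3))
    return F


def four_force(q, E, B, u):
    """Return (f^mu, gamma) with f^mu = (q/c) F^{mu nu} U_nu."""
    g = 1.0 / math.sqrt(1.0 - dot(u, u) / C**2)
    U_lower = [-g * C] + [g * x for x in u]  # U_nu = eta_{nu mu} U^mu
    F = faraday_tensor(E, B)
    f = [q / C * sum(F[m][n] * U_lower[n] for n in range(4)) for m in range(4)]
    return f, g


def lorentz_force(q, E, B, u):
    """Three-force of Eq. (14.39)."""
    return [q * (e + x / C) for e, x in zip(E, cross(u, B))]


# Arbitrary sample values (assumed for illustration)
q = 2.0
E = [1.0, -2.0, 0.5]
B = [0.3, 0.1, -0.7]
u = [0.1, 0.4, -0.2]

f, g = four_force(q, E, B, u)
F3 = lorentz_force(q, E, B, u)
print(f[1:], [g * x for x in F3])   # spatial part of f^mu versus gamma * F
print(f[0], g * dot(u, F3) / C)     # time component versus gamma * u.F / c
```

Since $\mathbf{u}\cdot(\mathbf{u}\times\mathbf{B}) = 0$, the time component only involves the electric field, consistent with the statement that magnetic forces do no work.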

15 QM Meets Relativity and Birth of QFT: Fields and Particles

Prerequisites: Eqs. (12.35), (12.36) of Chap. 12

Quantum field theory (QFT) was born over 90 years ago, when QM met relativity in 1926.¹ The need for developing such a description of nature arises in the following manner. At sufficiently high energies, one is confronted with the necessary requirement of developing a formalism, as imposed by Nature, which extends quantum theory to the relativistic regime,² the latter having been described in detail in Chaps. 11–14. A relativistic theory, as a result of the exchange that takes place between energy and matter, allows the creation of an unlimited number of particles, and the number of particles in a given process need not be conserved. An appropriate description of such physical processes, for which a variable number of particles may be created or destroyed, in the quantum world, is provided by the very rich concept of a quantum field. The theory which emerges from extending quantum physics to the relativistic regime is called quantum field theory, and may be symbolically, and most appropriately, represented by QM + Relativity = QFT. The latter is a multi-particle theory.³

With Minkowski spacetime physics and its underlying theory developed in the earlier mentioned chapters, we may now, appropriately, deal with the large number of particles observed in nature, describe their interactions at high energies, as elaborated above, and confront experiments. Our present reasonably successful QFT descriptions of basic interactions are: quantum electrodynamics (QED), quantum chromodynamics (QCD), the electroweak theory (EW) and grand unified field theories (GUT). Much work, however, is still needed, in particular, to incorporate gravitation with the existing theories in a unified manner. The existing reasonably successful theories, mentioned above, meet the severe test of renormalizability, which provides a consistent perturbative description of their underlying processes at energies met in present experiments. These theories, as well as the concept of renormalization, will be discussed, as we go along, in coming chapters. Unfortunately, a quantum description of GR fails to meet the criterion of renormalizability, as we will see in a later chapter (Chap. 72), and many attempts have been made in recent years to develop a consistent description of quantum gravity, but much still has to be done before one may even claim that a right approach to it has been developed.

A key paper in the development of QFT was a paper⁴ by Paul Dirac in 1927 which extended quantum methods to the electromagnetic field, thus providing a theoretical description of how photons emerge in the quantization of the electromagnetic field. Although this paper appeared after a fundamental paper⁵ by Born, Heisenberg and Jordan in 1926, who also applied quantum mechanical methods to the electromagnetic field, giving rise to a system with an infinite number of degrees of freedom described as a set of independent harmonic oscillators of various frequencies, Dirac’s paper is considered to mark the birthdate of quantum electrodynamics (a name coined by Dirac himself). It provided the prototype for the introduction of fields, as operators, for other particles with different spins, and for the development of the other field theory interactions mentioned above as well.

We will represent a field, in general, by $\chi(x)$ as a function of spacetime variables. The physical significance of a quantum field and the particle content of a theory emerge from the commutation relation of the field with the energy-momentum operator, of commuting components $P^\mu$, corresponding to the underlying physical system, and by considering, in particular, the vacuum state $|{\rm vac}\rangle$ as an eigenstate of $P^\mu$ of zero energy-momentum: $P^\mu|{\rm vac}\rangle = 0$. Spacetime translations may be carried out via the generator $P^\mu$, giving rise to the expression

\[ \chi(x) = \mathrm{e}^{-\mathrm{i}x_\mu P^\mu}\,\chi(0)\,\mathrm{e}^{\mathrm{i}x_\mu P^\mu}, \qquad \text{and hence} \qquad -\mathrm{i}\,\partial^\mu\chi(x) = [\chi(x), P^\mu]. \tag{15.1} \]

¹ In this chapter to Chap. 37, we use units for which $\hbar$ and the speed of light $c$ are set equal to one, as is customary in quantum field theory.
² Quite a detailed account of the historical development of QFT, from its birth in 1926 up to the present day, is given in the introductory chapter of my book: Manoukian [4], pp. 1–43, and would certainly be most valuable for readers at all levels.
³ It is unfortunate that some of the older textbooks introduce a field, in the process, by expanding it in terms of creation and annihilation operators and refer to the underlying theory as relativistic quantum mechanics, not emphasizing the quantum field theory aspect that a field may create not only a single particle but many other particles as well, as seen below.
⁴ Dirac [2].
⁵ Born, Heisenberg and Jordan [1]. Reprinted in van der Waerden [6]. For early papers on QED, see Schwinger [5].

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_15

Upon introducing the Fourier transform and a single particle state $|p', \nu\rangle$, where $\nu$ stands for other labels needed to specify the state of a particle, one may write

\[ \chi(x) = \int\frac{(\mathrm{d}p)}{(2\pi)^4}\,\mathrm{e}^{\mathrm{i}xp}\,\chi(p), \qquad \langle p', \nu|\,P^\mu = p'^\mu\,\langle p', \nu|. \tag{15.2} \]

Then the second equation in (15.1) gives $p^\mu\chi(p) = \chi(p)P^\mu - P^\mu\chi(p)$, or $\chi(p)P^\mu = P^\mu\chi(p) + p^\mu\chi(p)$, and the latter leads to the fundamental equation

\[ \langle p', \nu|\,\chi(p)\,P^\mu = (p'^\mu + p^\mu)\,\langle p', \nu|\,\chi(p), \qquad \text{similarly} \qquad \langle{\rm vac}|\,\chi(p)\,P^\mu = p^\mu\,\langle{\rm vac}|\,\chi(p). \tag{15.3} \]

That is, for $p^0 \gtrless 0$, $\chi(p)$ injects/absorbs a quantity of energy $|p^0|$.⁶ In particular, the vacuum expectation value of the field $\chi(x)$ is given by $\langle{\rm vac}|\chi(x)|{\rm vac}\rangle = \langle{\rm vac}|\chi(0)|{\rm vac}\rangle \equiv \chi_c$,

χ(x) ≡ χ (x) − χc ,

(15.4)

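where $\chi_c$ is just a c-number. A hermitian (scalar) field has the special role that this c-number need not be equal to zero. This is unlike other fields, for which one may argue, by imposing the invariance of the vacuum under a non-trivial transformation of a given field, that the corresponding vacuum expectation value of the field vanishes. The simplest example of the latter case is an application to a pseudo-scalar field, for which invariance of the vacuum state under parity, $\langle{\rm vac}|\chi(0)|{\rm vac}\rangle = -\langle{\rm vac}|\chi(0)|{\rm vac}\rangle$, implies that the vacuum expectation value of a pseudo-scalar field is zero. The spectral resolution of the energy-momentum operator⁷ is defined through the equations

\[ \mathrm{e}^{\mathrm{i}xP} = \int\frac{(\mathrm{d}p)}{(2\pi)^4}\,\mathrm{e}^{\mathrm{i}xp}\,\Omega(p), \qquad 1 = \int\frac{(\mathrm{d}p)}{(2\pi)^4}\,\Omega(p), \tag{15.5} \]

where $\Omega(p)$ is a projection operator on states with energy-momentum $p$, and may be written as

\[ \Omega(p) = (2\pi)^4\,\delta^{(4)}(p)\,|{\rm vac}\rangle\langle{\rm vac}| + \sum_\nu (2\pi)\,\delta(p^2 + m^2)\,\theta(p^0)\,|\mathbf{p}, \nu\rangle\langle\mathbf{p}, \nu| + \Omega'(p), \tag{15.6} \]

\[ \theta(p^0) = 1 \ \text{for } p^0 > 0, \qquad \theta(p^0) = 0 \ \text{for } p^0 < 0, \tag{15.7} \]

\[ \delta(p^2 + m^2) = \frac{1}{2E}\big[\delta(p^0 - E) + \delta(p^0 + E)\big], \qquad E = +\sqrt{\mathbf{p}^2 + m^2}, \tag{15.8} \]

and the first equation in (15.1), in turn, implies that

\[ \langle{\rm vac}|\,\chi(x) = \langle{\rm vac}|\,\chi(0)\,\mathrm{e}^{\mathrm{i}xP} = \int\frac{(\mathrm{d}p)}{(2\pi)^4}\,\mathrm{e}^{\mathrm{i}xp}\,\langle{\rm vac}|\,\chi(0)\,\Omega(p), \tag{15.9} \]

\[ \langle{\rm vac}|\,\chi(x) = \chi_c\,\langle{\rm vac}| + \sum_\nu\int\frac{\mathrm{d}^3\mathbf{p}}{(2\pi)^3\,2p^0}\,\mathrm{e}^{\mathrm{i}xp}\,\langle{\rm vac}|\chi(0)|\mathbf{p}, \nu\rangle\langle\mathbf{p}, \nu| + \int\frac{(\mathrm{d}p)}{(2\pi)^4}\,\mathrm{e}^{\mathrm{i}xp}\,\langle{\rm vac}|\,\chi(0)\,\Omega'(p), \tag{15.10} \]

\[ \langle{\rm vac}|\mathbf{p}, \nu\rangle = 0, \qquad \langle{\rm vac}|\,\Omega'(p) = 0, \qquad \langle\mathbf{p}, \nu|\,\Omega'(p) = 0, \tag{15.11} \]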

where the second integral on the right-hand side of (15.10) corresponds to multi- or composite-particles that might be created by the field from the vacuum, while the first one corresponds to a particle of mass $m$ associated with the field, in which $p^0 = \sqrt{\mathbf{p}^2 + m^2}$. Moreover

\[ \langle{\rm vac}|\chi(0)|\mathbf{p}, \nu\rangle = \sqrt{Z}\;U(\mathbf{p}, \nu), \qquad \langle{\rm vac}|\chi(x)|\mathbf{p}, \nu\rangle = \sqrt{Z}\;U(\mathbf{p}, \nu)\,\mathrm{e}^{\mathrm{i}px}. \tag{15.12} \]

Here $p^0 = \sqrt{\mathbf{p}^2 + m^2}$; $U(\mathbf{p}, \nu)$, which may include a phase factor, denotes the wavefunction of the particle in the momentum description, and the real coefficient $\sqrt{Z}$, as required by the normalization condition of probability in quantum theory, may not necessarily be equal to one, since the second integral in (15.10) is not necessarily equal to zero, as the field may describe more than one particle. $Z$ is called a wavefunction renormalization constant.⁸ Situations may arise in which a single isolated particle cannot be described, or in which the theory predicts only composite and other multi-particle states. In such cases $Z$, as the probability of the creation of the particle in question by the field out of the vacuum, is equal to zero. It is important to note that the integration measure, corresponding to the second term on the right-hand side of (15.6), in an integral such as

\[ \int\frac{(\mathrm{d}p)}{(2\pi)^4}\,\big[(2\pi)\,\delta(p^2 + m^2)\,\theta(p^0)\big]\{\cdot\} = \int\frac{\mathrm{d}^3\mathbf{p}}{(2\pi)^3\,2p^0}\,\{\cdot\}, \qquad p^0 = \sqrt{\mathbf{p}^2 + m^2}, \tag{15.13} \]

is Lorentz invariant, as a Lorentz transformation does not change the sign of the component $p^0$. In considering physical processes, a certain particle may be observed to emerge in the theory. To determine the conditional probability that such a particle, given that it was observed in the process, has some given characteristics, such as a given momentum and a given spin projection, one divides the field $\chi(x)$ by $\sqrt{Z}$, leading to the definition of a renormalized field:

\[ \chi_{\rm ren} = \chi/\sqrt{Z}, \tag{15.14} \]

⁶ The idea of expanding (free) fields in terms of creation and annihilation operators comes from here. Here we have a more realistic and more complete situation dealing with a multi-particle theory.
⁷ The second relation in (15.5) is referred to as the resolution of the identity.

defined independently of any perturbation theory description, thus isolating the particle and bringing it into evidence. Here is where $Z$ acquires the name of a wavefunction renormalization constant. From Eq. (15.10), in particular, and its generalization by considering the first equation in (15.1), together with Eq. (15.5), we learn that fields spread through all of space and vibrate in spacetime. The photon, for example, is considered as a vibration and an excitation of the electromagnetic field. This will be fully exploited in Chap. 34, Box 34.2, to show how the so-called Higgs field vibrates about its vacuum expectation value and how the Higgs boson is a vibration and an excitation of the Higgs field. The simultaneous vibrations of several fields in spacetime and the excitations of corresponding particles lead to a myriad of basic processes involving such particles.

Now that we have found the intimate connection between fields and particles, let us see how we may describe the dynamics of the underlying particles, most conveniently, in terms of these quantum fields as dynamical variables. Disturbances created by a measurement process cannot propagate faster than the speed of light, and different regions of space, at any given fixed time, are dynamically independent in the sense that any measurement made in a given region of space is compatible with one made in a different region of space. Space, at any given fixed time, defines a special case of a space-like (hyper)surface, the latter being defined in such a manner that every two distinct points $x_1$, $x_2$ lying on it are space-like separated, $(x_2 - x_1)^2 > 0$, and hence cannot be connected by any signal. The quantum fields, as dynamical variables that are functions of space and time, may be rewritten as $\chi_a(x) = \chi_{a\mathbf{x}}(t)$, where $a$ is any index that the field may carry, and $\mathbf{x}$, as a continuous variable, may vary over all space points. A field describes a dynamical variable of an (uncountably) infinite number of degrees of freedom. This is to be compared with the situation in classical dynamics, in which a dynamical variable $q_i(t)$ describes a finite number of degrees of freedom as $i$ takes on a finite number of values. The independent dynamical field variables $\chi_{a\mathbf{x}}(t)$, for different $\mathbf{x}$-values, thus must satisfy locality conditions. For $t' = t$, the commutator of a scalar field $\varphi(x)$, for example, at different $\mathbf{x}$-values vanishes: $[\varphi(x), \varphi(x')] = 0$, and more generally for any two space-like separated points $x$, $x'$. Similarly, for a Dirac field $\psi_a(x)$, studied in the coming chapters, the anti-commutation relation $\{\psi_a(x), \psi_b^\dagger(x')\} = 0$ holds for any two space-like separated points $x$, $x'$. As in classical mechanics, one may then set up Lagrangians, in terms of these dynamical variables, to develop field theoretical descriptions of the basic interactions in nature.

QFT plays a unique role in fundamental theoretical physics. In his 1958, 1959–1960 Caltech lectures on QFT of fundamental processes, the very first statement the legendary Richard Feynman makes is that the lectures cover all of physics.⁹ It is not difficult to understand what Feynman meant by covering all of physics. The role of fundamental theoretical physics is to describe the basic interactions of nature, and QFT, par excellence, is supposed to do just that. His statement is, of course, even more relevant today than it was then, as the main goal today is to provide a unified description of all the fundamental interactions in nature.

To close this chapter, we introduce the generators which implement Poincaré (inhomogeneous Lorentz) transformations on fields and particle states, induced by the infinitesimal Poincaré transformations of coordinates in Minkowski spacetime as given in Eqs. (12.35) in Chap. 12:

\[ x'^\mu = \Omega^\mu{}_\nu\,x^\nu - b^\mu, \qquad (\Omega', b')(\Omega, b) = (\Omega'\Omega,\ \Omega' b + b'), \qquad \big(\Omega, b\big)^{-1} = \big(\Omega^{-1}, -\Omega^{-1}b\big), \qquad \big(\Omega^{-1}\big)^{\mu\nu} = \Omega^{\nu\mu}. \tag{15.15} \]

⁸ The presence of a wavefunction renormalization constant is dictated by QM independently of perturbation theory.
⁹ Feynman [3], page 1.

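Such a transformation, being connected to the identity one, induces a unitary operator to act on particle states and quantum fields. To find the algebra satisfied by the generators of such a transformation, we consider successive infinitesimal Poincaré transformations forming a closed path as follows

\[ (\Omega_2, b_2)^{-1}(\Omega_1, b_1)^{-1}(\Omega_2, b_2)(\Omega_1, b_1) = (\Omega, b), \tag{15.16} \]

[Figure: closed path made of four segments labeled 1, 2, 1, 2, with the last two traversed in reverse.]

emphasizing the reversal of the transformations in the third and the fourth segments of the path. The matrix elements of $\Omega$ are explicitly given in lines three and four in Box 12.1. For infinitesimal transformations with parameters $\delta\vartheta$, $\delta b^\mu$, $\delta\beta^i$, $\beta^i = v^i/c$, we have from Eq. (12.36) in Chap. 12:

\[ \delta x^\mu = x^\mu - x'^\mu = \delta b^\mu - \delta\omega^\mu{}_\nu\,x^\nu, \tag{15.17} \]

\[ \Omega^\mu{}_\nu = \delta^\mu{}_\nu + \delta\omega^\mu{}_\nu, \qquad \delta\omega^{\mu\nu} = -\delta\omega^{\nu\mu}, \tag{15.18} \]

where

\[ \delta\omega^{ij} = \varepsilon^{ijk}\,n^k\,\delta\vartheta, \qquad \delta\omega^i{}_0 = -\delta\beta^i = \delta\omega^0{}_i, \qquad \delta\omega^0{}_0 = 0. \tag{15.19} \]

The group properties in Eq. (15.15) readily lead to

\[ \delta\omega^{\mu\nu} = \delta\omega_2^{\,\mu\rho}\,\delta\omega_{1\rho}{}^{\nu} - \delta\omega_2^{\,\nu\rho}\,\delta\omega_{1\rho}{}^{\mu}, \tag{15.20} \]

\[ \delta b^\mu = \delta\omega_2{}^{\mu}{}_{\rho}\,\delta b_1^{\,\rho} - \delta\omega_1{}^{\mu}{}_{\rho}\,\delta b_2^{\,\rho}. \tag{15.21} \]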
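The closed-path composition (15.16) and its second-order results (15.20), (15.21) can be illustrated with explicit $4\times 4$ matrices. The sketch below is only a numerical illustration under several assumptions: the finite transformations are taken as matrix exponentials $\Omega_j = \exp(\delta\omega_j)$ of the mixed-index generators in (15.19), so that exact inverses are available; the predicted results are written in mixed-index form, $\delta\omega^\mu{}_\nu = (\delta\omega_2\delta\omega_1 - \delta\omega_1\delta\omega_2)^\mu{}_\nu$ and $\delta b = \delta\omega_2\,\delta b_1 - \delta\omega_1\,\delta b_2$; and all parameter values and function names are hypothetical.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return ((i - j) * (j - k) * (k - i)) // 2


def identity():
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def matvec(a, x):
    return [sum(a[i][k] * x[k] for k in range(4)) for i in range(4)]


def expm(a, terms=12):
    """Matrix exponential by its Taylor series (adequate for tiny generators)."""
    result, term = identity(), identity()
    for n in range(1, terms + 1):
        term = [[x / n for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


def generator(axis, dtheta, dbeta):
    """Mixed-index matrix delta omega^mu_nu built from Eq. (15.19)."""
    a = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        a[i + 1][0] = -dbeta[i]
        a[0][i + 1] = -dbeta[i]
        for j in range(3):
            a[i + 1][j + 1] = sum(eps(i, j, k) * axis[k] for k in range(3)) * dtheta
    return a


def compose(g2, g1):
    """(Omega', b')(Omega, b) = (Omega' Omega, Omega' b + b'), Eq. (15.15)."""
    (O2, b2), (O1, b1) = g2, g1
    return matmul(O2, O1), [x + y for x, y in zip(matvec(O2, b1), b2)]


def element(a, b):
    return expm(a), list(b)


def inverse_element(a, b):
    """(Omega, b)^{-1} = (Omega^{-1}, -Omega^{-1} b), with Omega^{-1} = exp(-a)."""
    O_inv = expm([[-x for x in row] for row in a])
    return O_inv, [-x for x in matvec(O_inv, b)]


# Hypothetical small parameters for the two transformations
a1 = generator([0.0, 0.0, 1.0], 1e-4, [2e-4, 0.0, 0.0])
b1 = [1e-4, 2e-4, 0.0, -1e-4]
a2 = generator([1.0, 0.0, 0.0], 2e-4, [0.0, 1e-4, 0.0])
b2 = [0.0, -1e-4, 3e-4, 1e-4]

# Closed path of Eq. (15.16): g2^{-1} g1^{-1} g2 g1
Omega, b = compose(inverse_element(a2, b2),
                   compose(inverse_element(a1, b1),
                           compose(element(a2, b2), element(a1, b1))))

m21, m12 = matmul(a2, a1), matmul(a1, a2)
d_omega = [[m21[i][j] - m12[i][j] for j in range(4)] for i in range(4)]
d_b = [x - y for x, y in zip(matvec(a2, b1), matvec(a1, b2))]

one = identity()
err_omega = max(abs(Omega[i][j] - one[i][j] - d_omega[i][j])
                for i in range(4) for j in range(4))
err_b = max(abs(x - y) for x, y in zip(b, d_b))
size_omega = max(abs(x) for row in d_omega for x in row)
size_b = max(abs(x) for x in d_b)
print(err_omega, size_omega)  # residual is third order, prediction is second order
print(err_b, size_b)
```

Using exponentials rather than $1 \pm \delta\omega_j$ avoids spurious second-order terms from approximate inverses, so that the residual printed above is of third order in the small parameters.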

The unitary operator corresponding, for example, to $(\Omega, b)$, for infinitesimal transformations, has the structure

$$U = 1 + iG, \qquad G = \delta b_\mu P^\mu + \frac{1}{2}\,\delta\omega_{\mu\nu}J^{\mu\nu}, \qquad J^{\mu\nu} = -J^{\nu\mu}, \tag{15.22}$$

with $P^\mu$, $J^{\mu\nu}$ denoting the energy-momentum and angular momentum operators, respectively, generating spacetime translations and homogeneous Lorentz transformations. The latter, corresponding to $J^{\mu\nu}$, consist of Lorentz boosts and spatial rotations. In more detail, $J^{23} \equiv J^1$, $J^{31} \equiv J^2$, $J^{12} \equiv J^3$ correspond to the three components of orbital angular momentum generating rotations, $J^{01}$, $J^{02}$, $J^{03}$ generate boosts, while $P^0$, $(P^1, P^2, P^3)$, denoting the Hamiltonian and the momentum components, generate spacetime translations, as we have already explicitly seen in Eqs. (15.1) and (15.9) above. Again the constant $\hbar$ is introduced, as in Chap. 4 in non-relativistic QM, via the uniform scale transformation $G \to G/\hbar$, because the unit of energy is not simply taken as the inverse of the unit of time (which would otherwise define a so-called natural unit), and so on, consistently, for all the other variables in $G$. Here, as is customary in quantum field theory, we use units for which $\hbar$ is set equal to one. Referring to the closed path in Eq. (15.16), one has

$$U_2^{-1}U_1^{-1}U_2U_1 = U, \qquad U_j = 1 + iG_j, \quad U_j^{-1} = 1 - iG_j, \quad j = 1, 2, \qquad U = 1 + iG, \tag{15.23}$$

which leads to

$$G = \frac{1}{i}\,[G_1, G_2], \tag{15.24}$$

with the parameters of the generator $G$ given by $(\delta\omega^{\mu\nu}, \delta b^\mu)$ spelled out in (15.20), (15.21), and those of $G_1$, $G_2$ denoted by $(\delta\omega_1^{\mu\nu}, \delta b_1^\mu)$, $(\delta\omega_2^{\mu\nu}, \delta b_2^\mu)$. Comparing the coefficients of identical products of the above parameters on both sides of (15.24), as we have done in Eqs. (4.22)–(4.25) in the non-relativistic case in Chap. 4, gives the Poincaré algebra

$$[P^\mu, P^\nu] = 0, \tag{15.25}$$

$$[P^\mu, J^{\sigma\rho}] = i\left(\eta^{\mu\rho}P^\sigma - \eta^{\mu\sigma}P^\rho\right), \tag{15.26}$$

$$[J^{\mu\nu}, J^{\sigma\rho}] = i\left(\eta^{\mu\sigma}J^{\nu\rho} - \eta^{\nu\sigma}J^{\mu\rho} + \eta^{\nu\rho}J^{\mu\sigma} - \eta^{\mu\rho}J^{\nu\sigma}\right), \tag{15.27}$$

satisfied by the generators $P^\mu$, $J^{\sigma\rho}$.

We now consider discrete transformations consisting of space reflection (parity transformation) P, time reversal T and charge conjugation C, for which a particle is replaced by its antiparticle. A key criterion for deciding whether T, for example, should be unitary or anti-unitary is to eliminate the choice which would lead to the inconsistent result that the Hamiltonian is unbounded from below. This is inferred as follows. A time reversal followed by an infinitesimal time translation $\delta\tau$, via the Hamiltonian $H$ of the system under consideration, is equivalent to an infinitesimal time translation $-\delta\tau$ followed by a time reversal, i.e., we have, in general, the equality

$$[1 - i\,\delta\tau\,H]\,\mathrm{T} = \mathrm{T}\,[1 + i\,\delta\tau\,(H + \varphi)], \tag{15.28}$$

up to a phase factor $\varphi$. If T were unitary,¹⁰ this would give $H\mathrm{T} = -\mathrm{T}(H + \varphi)$. For a state $|\eta\rangle$ with positive and arbitrarily large energy $E$, there would then correspond a state $\mathrm{T}|\eta\rangle$ with energy $-(E + \varphi)$, and the Hamiltonian would be unbounded from below. That is, T is to be implemented by an anti-unitary operator which, in the process of applying time reversal, complex conjugates¹¹ the factor $i$ on the right-hand side of (15.28), leading to $H\mathrm{T} = \mathrm{T}(H + \varphi)$ and avoiding the nonphysical unboundedness of the Hamiltonian from below. A similar analysis for the parity transformation P, as applied to $[1 + iG]$ in (15.22), leads one to infer that it is to be implemented by a unitary operator. On the other hand, C, unlike P and T, is not involved with space and time reflections, and may be implemented by a unitary operator. Accordingly, the product CPT is, in turn, implemented by an anti-unitary operator. The corresponding symmetry, embodied in the so-called CPT Theorem, will be dealt with in Chap. 36.
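As a concrete check of the Lorentz part of the Poincaré algebra, Eq. (15.27), one can use explicit 4 × 4 matrices. The sketch below is an illustration under stated assumptions, not the author's construction: it takes the four-vector representation $(J^{\mu\nu})^\alpha{}_\beta = -i\,(\eta^{\mu\alpha}\delta^\nu{}_\beta - \eta^{\nu\alpha}\delta^\mu{}_\beta)$, with the overall factor chosen here so that Eq. (15.27) holds with the signature $(-,+,+,+)$. The helper names `lorentz_generator` and `algebra_residual` are hypothetical.

```python
from itertools import product

import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # assumed metric signature (-,+,+,+)


def lorentz_generator(mu, nu):
    """Four-vector representation (J^{mu nu})^alpha_beta (assumed normalization)."""
    j = np.zeros((4, 4), dtype=complex)
    for alpha, beta in product(range(4), repeat=2):
        j[alpha, beta] = -1j * (
            ETA[mu, alpha] * (nu == beta) - ETA[nu, alpha] * (mu == beta)
        )
    return j


def algebra_residual():
    """Largest entry of [J, J] minus the right-hand side of Eq. (15.27)."""
    worst = 0.0
    for mu, nu, sig, rho in product(range(4), repeat=4):
        lhs = (
            lorentz_generator(mu, nu) @ lorentz_generator(sig, rho)
            - lorentz_generator(sig, rho) @ lorentz_generator(mu, nu)
        )
        rhs = 1j * (
            ETA[mu, sig] * lorentz_generator(nu, rho)
            - ETA[nu, sig] * lorentz_generator(mu, rho)
            + ETA[nu, rho] * lorentz_generator(mu, sig)
            - ETA[mu, rho] * lorentz_generator(nu, sig)
        )
        worst = max(worst, np.abs(lhs - rhs).max())
    return worst


if __name__ == "__main__":
    print(algebra_residual())
```

The residual vanishes up to rounding for all index combinations, and the antisymmetry $J^{\mu\nu} = -J^{\nu\mu}$ of Eq. (15.22) is built into the definition.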

References
1. Born, M., Heisenberg, W., & Jordan, P. (1926). Zur Quantenmechanik III. Zeitschrift für Physik, 35, 557–615.
2. Dirac, P. A. M. (1927). The quantum theory of emission and absorption of radiation. Proceedings of the Royal Society of London, A, 114, 243–265.
3. Feynman, R. P. (1982). The theory of fundamental processes. Menlo Park: The Benjamin/Cummings Publishing Co., 6th Printing.
4. Manoukian, E. B. (2016). Quantum field theory I: Foundations and abelian and non-abelian gauge theories. Switzerland: Springer.
5. Schwinger, J. (Ed.). (1958). Selected papers on quantum electrodynamics. New York: Dover.
6. van der Waerden, B. L. (Ed.). (1968). Sources of quantum mechanics. New York: Dover.

10 See Eq. (4.1) in Chap. 4.
11 See Eq. (4.2) in Chap. 4.

16 The Five Types of Fields You Meet in Quantum Field Theory

Prerequisites Chap. 14

Present QFT/HEP involves five types of fields, of spins 0, 1/2, 1, 3/2 and 2.¹ Spins 0, 1/2, 1 occur in the so-called standard model of elementary particles, consisting of the electroweak interaction and of the strong interaction, while a quantum treatment of GR involves a massless spin 2 field, and its supersymmetric version involves both spin 2 and spin 3/2 massless fields. The restriction to these 5 types of fields arises because one often encounters difficulties in formulating consistent field theoretical descriptions involving higher spin fields, as a consequence of the non-damping high energy behavior of the underlying theories, as will be discussed in later chapters dealing with so-called renormalization theory. Even spins 3/2 and 2 are problematic in constructing consistent quantum gravitational theories. This problem is also shared by a massive spin 1 field (see Eq. (16.36) and below Eq. (16.37)). In the latter case, one starts with a massless spin 1 field and generates its mass by the so-called Higgs mechanism, as will be discussed in detail in Chaps. 33–35. Thus at present one deals with these 5 types of fields. We describe the 5 types of fields in turn. In this chapter, we investigate fields for non-interacting particles as they are initially produced, as well as when they are on their way to be detected, by external (classical) sources, with the latter acting as emitters and detectors, respectively, as they are encountered in experimental situations.

• Spin 0: The Klein–Gordon Field
The field equation of a spin zero field $\phi(x)$ in the presence of an external classical source $K(x)$ is given by

$$(-\Box + m^2)\,\phi(x) = K(x), \qquad \Box = \nabla^2 - \partial_0^2. \tag{16.1}$$

In a region of spacetime where the source $K(x) = 0$, $(-\Box + m^2)\phi(x) = 0$, and a Fourier transform of the field, $\phi(x) = \int [(dp)/(2\pi)^4]\,\exp[ixp]\,\phi(p)$, leads to the necessary constraint on the energy-momentum of a free particle: $(p^2 + m^2)\phi(p) = 0$, $p^2 = -m^2$. Let

$$|0_-\rangle\ \text{denote the vacuum state before the source is switched on,} \tag{16.2}$$

$$|0_+\rangle\ \text{denote the vacuum state after the source is finally switched off.} \tag{16.3}$$

Since the source, in the process of being switched on, may create particles, QM says that $|\langle 0_+|0_-\rangle|$ is not necessarily one. Taking the matrix element of Eq. (16.1) between $\langle 0_+|$ and $|0_-\rangle$, and setting $\langle\phi(x)\rangle = \langle 0_+|\phi(x)|0_-\rangle/\langle 0_+|0_-\rangle$, we have

$$\langle\phi(x)\rangle = \frac{1}{(-\Box + m^2)}\,K(x) = \int (dx')\,\Delta_+(x - x')\,K(x'), \tag{16.4}$$

where $\Delta_+(x - x')$ is the propagator defined earlier in Box 14.1 of Chap. 14, together with the Schwinger–Feynman physical boundary condition associated with energy transfer specified by the $i\epsilon$ factor, for a particle with a given mass $m$, and is given by

1 With all due respect to Mitch Albom, the title of this chapter is similar to the one of his interesting book: M. Albom. The five people you meet in heaven. New York: Hachette Books (2006), on an entirely different subject.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_16




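$$\Delta_+(x - x') = \int \frac{(dp)}{(2\pi)^4}\,\frac{\exp[i(x - x')p]}{p^2 + m^2 - i\epsilon}, \qquad \epsilon \to +0, \qquad (dp) = dp^0\,dp^1\,dp^2\,dp^3, \tag{16.5}$$

$$\phantom{\Delta_+(x - x')} = i\int \frac{d^3\mathbf{p}}{(2\pi)^3\,2p^0}\; e^{i(\mathbf{x} - \mathbf{x}')\cdot\mathbf{p}}\; e^{-i|x^0 - x'^0|\,p^0}, \qquad p^0 = \sqrt{\mathbf{p}^2 + m^2}, \tag{16.6}$$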

where the expression of the last integral has been explicitly derived in Box 14.1 of Chap. 14.

• Spin 1/2: The Dirac Field
We consider first a spin 1/2 particle in the absence of an external source. The spin 1/2 particle has 2 spin states, so let $\Psi$ be a two-component object. The condition that the particle satisfies the correct energy-momentum relation, $E = \sqrt{\mathbf{p}^2 + m^2}$, implies by Fourier transform that²

$$\left[\, i\frac{\partial}{\partial t} - \sqrt{-\nabla^2 + m^2}\,\right]\Psi = 0. \tag{16.7}$$

For simplicity of notation, write $i\partial/\partial t = E$, $-i\nabla = \mathbf{p}$, to rewrite the above equation equivalently as $[\sqrt{\mathbf{p}^2 + m^2} - E]\Psi = 0$. Multiplying the latter from the left by $[\sqrt{\mathbf{p}^2 + m^2} + E]$ gives

$$\left[\mathbf{p}^2 + m^2 - E^2\right]\Psi = 0. \tag{16.8}$$

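We may use the identity involving the Pauli matrices, $\sigma_i\sigma_j = \delta_{ij} + i\varepsilon_{ijk}\sigma_k$, where

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

and rewrite (16.8) as

$$\left[\,\boldsymbol{\sigma}\cdot\mathbf{p}\;\boldsymbol{\sigma}\cdot\mathbf{p} - (E - m)(E + m)\right]\Psi = 0. \tag{16.9}$$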

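Now $[\boldsymbol{\sigma}\cdot\mathbf{p}]\Psi$ is simply a two-component object, and we may equivalently introduce a two-component object $\Phi$ proportional to $[\boldsymbol{\sigma}\cdot\mathbf{p}]\Psi$, defined by

$$\Phi = \frac{1}{(E + m)}\,[\boldsymbol{\sigma}\cdot\mathbf{p}]\,\Psi, \tag{16.10}$$

to rewrite Eq. (16.9), by using, in the process, directly the above equation, equivalently and simply in terms of the following two equations

$$\boldsymbol{\sigma}\cdot\mathbf{p}\,\Phi - E\Psi + m\Psi = 0, \qquad E\Phi + m\Phi - \boldsymbol{\sigma}\cdot\mathbf{p}\,\Psi = 0, \tag{16.11}$$

where the first equation reduces to Eq. (16.9), and the second one coincides with Eq. (16.10). These may be conveniently combined into one equation

$$\left[\begin{pmatrix} 0 & \boldsymbol{\sigma} \\ -\boldsymbol{\sigma} & 0 \end{pmatrix}\cdot\mathbf{p} - \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}E + m\begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix}\right]\begin{pmatrix} \Psi \\ \Phi \end{pmatrix} = 0, \tag{16.12}$$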

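where $I$ is the 2 × 2 unit matrix. This equation suggests introducing the 4 × 4 matrices

$$\gamma^0 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \qquad \boldsymbol{\gamma} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ -\boldsymbol{\sigma} & 0 \end{pmatrix}, \qquad \text{and} \quad \psi = \begin{pmatrix} \Psi \\ \Phi \end{pmatrix}, \tag{16.13}$$

and rewriting Eq. (16.12) as

$$\left(\frac{\gamma^\mu\partial_\mu}{i} + m\right)\psi(x) = 0, \qquad \{\gamma^\mu, \gamma^\nu\} = -2\eta^{\mu\nu}, \qquad \gamma^{0\dagger} = \gamma^0, \quad \boldsymbol{\gamma}^\dagger = -\boldsymbol{\gamma}, \tag{16.14}$$

2 The square root operator $\sqrt{-\nabla^2 + m^2}$ is well defined which, unfortunately, is not sufficiently emphasized in the literature; see, e.g., Daubechies and Lieb [4].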


which is the celebrated Dirac equation.³ Solutions of the Dirac equation for $E > 0$ and $E < 0$, denoted by $u(\mathbf{p}, \sigma)$ and $v(\mathbf{p}, \sigma)$, respectively, and some aspects of the Dirac theory are spelled out in detail in Box 16.1, with $\sigma$ corresponding to spin states. As we will see in Eq. (16.28) below, $E < 0$ corresponds to the anti-particle (of positive energy) of the electron, the positron, of opposite charge to that of the electron. The anti-commutation relation of the gamma matrices above is readily established from the explicit expressions of the gamma matrices in (16.13) by using the property of the Pauli matrices below Eq. (16.8).

In the presence of an external source $\eta(x)$, the above equation is replaced by

$$\left(\frac{\gamma\partial}{i} + m\right)\psi(x) = \eta(x), \qquad \gamma\partial = \gamma^\mu\partial_\mu. \tag{16.15}$$

We may multiply the above equation from the left by $[-(\gamma\partial/i) + m]$, and use the anti-commutation relations in (16.14) to obtain, in the process, $[-(\gamma\partial/i) + m][(\gamma\partial/i) + m] = [-\Box + m^2]$, leading to the equation

$$[-\Box + m^2]\,\psi(x) = \left(-\frac{\gamma\partial}{i} + m\right)\eta(x). \tag{16.16}$$
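The algebra behind Eqs. (16.14) and (16.16) is easy to check with explicit matrices. The sketch below is an illustration only: it builds the gamma matrices of Eq. (16.13) from the Pauli matrices, verifies $\{\gamma^\mu, \gamma^\nu\} = -2\eta^{\mu\nu}$, and verifies the momentum-space version of the factorization, $(-\gamma p + m)(\gamma p + m) = p^2 + m^2$ with $\gamma p = \gamma^\mu p_\mu$, for an arbitrary (off-shell) four-momentum. The function names are hypothetical, and the signature $(-,+,+,+)$ is assumed as in the text.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # assumed metric signature (-,+,+,+)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def dirac_gammas():
    """Gamma matrices of Eq. (16.13): gamma^0 = diag(I, -I), gamma = [[0, s], [-s, 0]]."""
    g0 = np.block([[I2, Z2], [Z2, -I2]])
    return [g0] + [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]


def clifford_defect(gammas):
    """Largest entry of {gamma^mu, gamma^nu} + 2 eta^{mu nu}."""
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            anti = gammas[mu] @ gammas[nu] + gammas[nu] @ gammas[mu]
            worst = max(worst, np.abs(anti + 2 * ETA[mu, nu] * np.eye(4)).max())
    return worst


def slash(gammas, p_upper):
    """gamma p = gamma^mu p_mu, with p_mu = eta_{mu nu} p^nu."""
    p_lower = ETA @ p_upper
    return sum(p_lower[mu] * gammas[mu] for mu in range(4))


def factorization_defect(gammas, p_upper, m):
    """Largest entry of (-gamma p + m)(gamma p + m) - (p^2 + m^2)."""
    gp = slash(gammas, p_upper)
    p_sq = p_upper @ ETA @ p_upper
    lhs = (-gp + m * np.eye(4)) @ (gp + m * np.eye(4))
    return np.abs(lhs - (p_sq + m**2) * np.eye(4)).max()


if __name__ == "__main__":
    g = dirac_gammas()
    print(clifford_defect(g), factorization_defect(g, np.array([0.7, 0.1, -0.4, 2.0]), 1.3))
```

Replacing $p_\mu$ by $\partial_\mu/i$ turns the second check into the operator identity used to pass from (16.15) to (16.16).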

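Upon defining $\langle\psi(x)\rangle = \langle 0_+|\psi(x)|0_-\rangle/\langle 0_+|0_-\rangle$, as we have done for the scalar field, we may, using in the process (16.15) and (16.16), write

$$\langle\psi(x)\rangle = \int (dx')\,S_+(x - x')\,\eta(x'), \qquad \left(\frac{\gamma\partial}{i} + m\right)S_+(x - x') = \delta^{(4)}(x - x'), \tag{16.17}$$

$$S_+(x - x') = \int \frac{(dp)}{(2\pi)^4}\,\frac{(-\gamma p + m)}{p^2 + m^2 - i\epsilon}\,e^{ip(x - x')}, \quad \epsilon \to +0,$$
$$\phantom{S_+(x - x')} = i\int \frac{d^3\mathbf{p}}{(2\pi)^3\,2p^0} \times \begin{cases} (-\gamma p + m)\,e^{+ip(x - x')}, & \text{for } x^0 > x'^0, \\ (\gamma p + m)\,e^{-ip(x - x')}, & \text{for } x^0 < x'^0, \end{cases} \tag{16.18}$$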

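Box 16.1 Solutions and some aspects of the Dirac equation in the momentum description

$$(\gamma p + m)\,u(\mathbf{p}, \sigma) = 0, \qquad \bar u(\mathbf{p}, \sigma)(\gamma p + m) = 0, \qquad \bar u \equiv u^\dagger\gamma^0, \qquad p^0 = \sqrt{\mathbf{p}^2 + m^2}.$$

$$u(\mathbf{p}, \sigma) = \sqrt{\frac{p^0 + m}{2m}}\begin{pmatrix} \xi_\sigma \\[2pt] \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{p^0 + m}\,\xi_\sigma \end{pmatrix}, \qquad \sigma = \pm 1, \qquad \xi_\sigma^\dagger\xi_{\sigma'} = \delta_{\sigma\sigma'},$$

$$\xi_{+\mathbf{N}} = \begin{pmatrix} e^{-i\phi/2}\cos(\theta/2) \\ e^{+i\phi/2}\sin(\theta/2) \end{pmatrix}, \qquad \xi_{-\mathbf{N}} = \begin{pmatrix} -e^{-i\phi/2}\sin(\theta/2) \\ +e^{+i\phi/2}\cos(\theta/2) \end{pmatrix},$$

for $\mathbf{p} = |\mathbf{p}|\mathbf{N}$, $\mathbf{N} = (\cos\phi\sin\theta, \sin\phi\sin\theta, \cos\theta)$, $\boldsymbol{\sigma}\cdot\mathbf{N}\,\xi_{\pm\mathbf{N}} = \pm\,\xi_{\pm\mathbf{N}}$.

$$v(\mathbf{p}, \sigma) = \gamma^5 u(\mathbf{p}, -\sigma), \qquad \bar v(\mathbf{p}, \sigma) = -\bar u(\mathbf{p}, -\sigma)\,\gamma^5, \qquad \gamma^5 = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} = i\gamma^0\gamma^1\gamma^2\gamma^3, \qquad \{\gamma^5, \gamma^\mu\} = 0.$$

$$(-\gamma p + m)\,v(\mathbf{p}, \sigma) = 0, \qquad \bar v(\mathbf{p}, \sigma)(-\gamma p + m) = 0, \qquad \bar v = v^\dagger\gamma^0, \qquad p^0 = \sqrt{\mathbf{p}^2 + m^2}.$$

$$\sum_\sigma u(\mathbf{p}, \sigma)\,\bar u(\mathbf{p}, \sigma) = \frac{(-\gamma p + m)}{2m}, \qquad \sum_\sigma v(\mathbf{p}, \sigma)\,\bar v(\mathbf{p}, \sigma) = -\frac{(\gamma p + m)}{2m}. \qquad (\ast)$$

$$\bar u(\mathbf{p}, \sigma)\,u(\mathbf{p}, \sigma') = \delta_{\sigma\sigma'}, \qquad \bar v(\mathbf{p}, \sigma)\,v(\mathbf{p}, \sigma') = -\delta_{\sigma\sigma'}, \qquad \bar u(\mathbf{p}, \sigma)\,v(\mathbf{p}, \sigma') = 0.$$

The identities in $(\ast)$ and the identity $\mathrm{Tr}[\gamma^\mu\gamma^\sigma\gamma^\nu\gamma^\lambda] = 4[\eta^{\mu\sigma}\eta^{\nu\lambda} - \eta^{\mu\nu}\eta^{\sigma\lambda} + \eta^{\mu\lambda}\eta^{\sigma\nu}]$ give

$$m^2\sum_{\text{spins}}\mathrm{Tr}[\bar v(\mathbf{k}', \sigma')\gamma^\mu u(\mathbf{k}, \sigma)\,\bar u(\mathbf{k}, \sigma)\gamma^\nu v(\mathbf{k}', \sigma')] = \frac{m}{2}\sum_{\text{spin}}\mathrm{Tr}[\bar v(\mathbf{k}', \sigma')\gamma^\mu(-\gamma k + m)\gamma^\nu v(\mathbf{k}', \sigma')]$$
$$= \frac{1}{4}\,\mathrm{Tr}[\gamma^\mu(-\gamma k + m)\gamma^\nu(-\gamma k' - m)]\Big|_{m=0} = \frac{1}{4}\,\mathrm{Tr}[\gamma^\mu\gamma^\sigma\gamma^\nu\gamma^\lambda]\,k_\sigma k'_\lambda.$$

Hence:

• $\displaystyle m^2\sum_{\text{spins}}\mathrm{Tr}[\bar v(\mathbf{k}', \sigma')\gamma^\mu u(\mathbf{k}, \sigma)\,\bar u(\mathbf{k}, \sigma)\gamma^\nu v(\mathbf{k}', \sigma')]\Big|_{m=0} = k^\mu k'^\nu - \eta^{\mu\nu}\,kk' + k^\nu k'^\mu.$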
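The relations collected in Box 16.1 can be spot-checked numerically. The following is a minimal sketch under the conventions of the text (gamma matrices of Eq. (16.13), $\gamma p = \boldsymbol{\gamma}\cdot\mathbf{p} - \gamma^0 p^0$, signature $(-,+,+,+)$); the helper names `xi`, `u_spinor`, `v_spinor`, `bar` and `box_16_1_defects` are hypothetical and introduced only for this illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
G0 = np.block([[I2, Z2], [Z2, -I2]])
GS = [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]
G5 = 1j * G0 @ GS[0] @ GS[1] @ GS[2]


def xi(sigma, theta, phi):
    """Helicity two-spinors xi_{+N}, xi_{-N} of Box 16.1."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    if sigma == +1:
        return np.array([np.exp(-0.5j * phi) * c, np.exp(0.5j * phi) * s])
    return np.array([-np.exp(-0.5j * phi) * s, np.exp(0.5j * phi) * c])


def u_spinor(p_vec, sigma, m):
    p_abs = np.linalg.norm(p_vec)
    e = np.sqrt(p_abs**2 + m**2)
    theta, phi = np.arccos(p_vec[2] / p_abs), np.arctan2(p_vec[1], p_vec[0])
    x = xi(sigma, theta, phi)
    sigma_dot_p = sum(p_vec[i] * SIGMA[i] for i in range(3))
    return np.sqrt((e + m) / (2 * m)) * np.concatenate([x, sigma_dot_p @ x / (e + m)])


def v_spinor(p_vec, sigma, m):
    return G5 @ u_spinor(p_vec, -sigma, m)


def bar(w):
    """Dirac adjoint w^dagger gamma^0 as a row vector."""
    return w.conj() @ G0


def box_16_1_defects(p_vec, m):
    """Largest deviations from the relations listed in Box 16.1."""
    e = np.sqrt(p_vec @ p_vec + m**2)
    gp = sum(p_vec[i] * GS[i] for i in range(3)) - e * G0  # gamma^mu p_mu
    one = np.eye(4)
    us = {s: u_spinor(p_vec, s, m) for s in (+1, -1)}
    vs = {s: v_spinor(p_vec, s, m) for s in (+1, -1)}
    sum_u = sum(np.outer(us[s], bar(us[s])) for s in us)
    sum_v = sum(np.outer(vs[s], bar(vs[s])) for s in vs)
    defects = [
        max(np.abs((gp + m * one) @ us[s]).max() for s in us),
        max(np.abs((-gp + m * one) @ vs[s]).max() for s in vs),
        np.abs(sum_u - (-gp + m * one) / (2 * m)).max(),
        np.abs(sum_v + (gp + m * one) / (2 * m)).max(),
    ]
    for s in (+1, -1):
        for t in (+1, -1):
            delta = 1.0 if s == t else 0.0
            defects.append(abs(bar(us[s]) @ us[t] - delta))
            defects.append(abs(bar(vs[s]) @ vs[t] + delta))
            defects.append(abs(bar(us[s]) @ vs[t]))
    return max(defects)


if __name__ == "__main__":
    print(box_16_1_defects(np.array([0.3, -0.4, 1.2]), 1.0))
```

The returned number is the largest violation found among the Dirac equations for $u$ and $v$, the two spin sums marked $(\ast)$, and the normalization conditions; it should vanish up to rounding.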

3 By doubling the number of components from 2 to 4, we thus see that Dirac was able to generate an equation with the derivative appearing in it to first order: Dirac [5].

where $p^0 = \sqrt{\mathbf{p}^2 + m^2}$, and $S_+(x - x')$ is the spin 1/2 propagator. Consider a source $\bar\eta_1(x)$ switched on after the source $\eta(x')$ in (16.17) is switched off, i.e., $x^0 > x'^0$. Then the first case in (16.18) gives

$$i\int (dx)\,\bar\eta_1(x)\,\langle\psi(x)\rangle = \int \frac{d^3\mathbf{p}}{(2\pi)^3\,2p^0}\,[i\bar\eta_1(p)]\,(-\gamma p + m)\,[i\eta(p)], \qquad \bar\eta_1 \equiv \eta_1^\dagger\gamma^0. \tag{16.19}$$

Now using the first identity on the line marked $(\ast)$ in Box 16.1,⁴ the above expression becomes

$$\sum_\sigma \underbrace{\left[\, i\bar\eta_1(p)\,u(\mathbf{p}, \sigma)\sqrt{\frac{m\,d^3\mathbf{p}}{(2\pi)^3 p^0}}\,\right]}_{\text{amplitude of particle detection}}\;\underbrace{\left[\sqrt{\frac{m\,d^3\mathbf{p}}{(2\pi)^3 p^0}}\;\bar u(\mathbf{p}, \sigma)\,i\eta(p)\right]}_{\text{amplitude of particle emission}}, \tag{16.20}$$

in a convenient notation used often by Schwinger.⁵ On the other hand, let $\bar\eta_2(x)$ be a source which is switched off before the source in (16.17) is switched on, i.e., $x^0 < x'^0$. Using the second case in (16.18), the fact that, according to the Spin & Statistics connection, as we will see in Chap. 36, the fermion sources $\bar\eta_2(x)$, $\eta(x')$ necessarily anti-commute, as well as the second identity on the line marked $(\ast)$ in Box 16.1, we obtain

$$i\int (dx)\,\bar\eta_2(x)\,\langle\psi(x)\rangle = \sum_\sigma \underbrace{\left[\sqrt{\frac{m\,d^3\mathbf{p}}{(2\pi)^3 p^0}}\,\big(-i\,\bar v(\mathbf{p}, \sigma)\,\eta(-p)\big)\right]}_{\text{amplitude for anti-particle detection}}\;\underbrace{\left[\big(-i\,\bar\eta_2(-p)\,v(\mathbf{p}, \sigma)\big)\sqrt{\frac{m\,d^3\mathbf{p}}{(2\pi)^3 p^0}}\,\right]}_{\text{amplitude for anti-particle emission}}. \tag{16.21}$$

For example, if the amplitudes in (16.20) are for the electron, the amplitudes in (16.21) are then for the positron. How do we know that the latter correspond to the positron? To answer this question, we simply have to consider the electron in an external, i.e., classical, electromagnetic field $A_\mu(x)$. This is done by replacing $\partial_\mu/i$ by $\partial_\mu/i - eA_\mu(x)$ in (16.15), where $e$ is the charge of the electron. The modified propagator then satisfies the equation (see (16.17))

$$\left[\gamma^\mu\left(\frac{\partial_\mu}{i} - eA_\mu\right) + m\right]S_+^A(x, x') = \delta^{(4)}(x - x'), \tag{16.22}$$

$$\left[\gamma^\mu\frac{\partial_\mu}{i} + m\right]S_+^A(x, x') \simeq \delta^{(4)}(x - x') + e\gamma^\mu A_\mu\,S_+(x - x'), \tag{16.23}$$

with the second equation valid for a sufficiently weak external field $A_\mu(x)$. By applying directly the operator $[\gamma^\mu\partial_\mu/i + m]$ to it, and integrating over the delta function in the integral, it is easily verified that this amounts to replacing the propagator $S_+(x - x')$ by

$$S_+^A(x, x') \simeq S_+(x - x') + e\int (dx'')\,S_+(x - x'')\,\gamma^\mu A_\mu(x'')\,S_+(x'' - x'), \tag{16.24}$$

for a sufficiently weak external field $A_\mu(x'')$ which, in practice, is effective only for $x^0 > x''^0 > x'^0$. All told, (16.20) becomes replaced by

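$$\sum_{\sigma, \sigma'}\left[\, i\bar\eta_1(p')\,u(\mathbf{p}', \sigma')\sqrt{\frac{m\,d^3\mathbf{p}'}{(2\pi)^3 p'^0}}\,\right]\langle\mathbf{p}'\sigma'|\mathbf{p}\sigma\rangle_{A,\text{part.}}\left[\sqrt{\frac{m\,d^3\mathbf{p}}{(2\pi)^3 p^0}}\;\bar u(\mathbf{p}, \sigma)\,i\eta(p)\right], \tag{16.25}$$

$$\langle\mathbf{p}'\sigma'|\mathbf{p}\sigma\rangle_{A,\text{part.}} = (2\pi)^3\,\frac{p^0}{m}\,\delta_{\sigma'\sigma}\,\delta^{(3)}(\mathbf{p}' - \mathbf{p}) + i\,e\,\bar u(\mathbf{p}', \sigma')\,\gamma^\mu A_\mu(p' - p)\,u(\mathbf{p}, \sigma). \tag{16.26}$$

4 For all details of the properties of the momentum description solutions $u(\mathbf{p}, \sigma)$ and $v(\mathbf{p}, \sigma)$ of the Dirac equation, with $\sigma$ corresponding to spin components, refer to this Box.
5 Schwinger [13].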

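The second term in (16.26) describes the transition amplitude for the process $(\mathbf{p}, \sigma) \to (\mathbf{p}', \sigma')$ of the electron in the external field $A_\mu$ to a different final state. We verify that for $A_\mu = 0$, (16.25) reduces, upon integration over $\mathbf{p}'$ and summing over $\sigma'$, to (16.20). On the other hand, in the second case, in the presence of the external field $A_\mu$, Eq. (16.21) becomes merely replaced by

$$\sum_{\sigma, \sigma'}\left[\sqrt{\frac{m\,d^3\mathbf{p}'}{(2\pi)^3 p'^0}}\,\big(-i\,\bar v(\mathbf{p}', \sigma')\,\eta(-p')\big)\right]\langle\mathbf{p}'\sigma'|\mathbf{p}\sigma\rangle_{A,\text{anti-part.}}\left[\big(-i\,\bar\eta_2(-p)\,v(\mathbf{p}, \sigma)\big)\sqrt{\frac{m\,d^3\mathbf{p}}{(2\pi)^3 p^0}}\,\right], \tag{16.27}$$

$$\langle\mathbf{p}'\sigma'|\mathbf{p}\sigma\rangle_{A,\text{anti-part.}} = (2\pi)^3\,\frac{p^0}{m}\,\delta_{\sigma'\sigma}\,\delta^{(3)}(\mathbf{p}' - \mathbf{p}) - i\,e\,\bar v(\mathbf{p}, \sigma)\,\gamma^\mu A_\mu(p' - p)\,v(\mathbf{p}', \sigma'), \tag{16.28}$$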

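with the charge $e$ in (16.26) replaced by $-e$ in (16.28). The second term in (16.28) describes the transition amplitude of the positron in the external field $A_\mu$, as a particle of the same mass as the electron but with opposite charge.

Box 16.2 C (charge conjugation), P (parity), T (time reversal) transformations of the Dirac field

$$\left(\gamma^\mu\partial_\mu/i + m\right)\psi(x) = 0, \qquad x = (x^0, \mathbf{x}), \qquad \gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}, \qquad \text{and} \quad \mathscr{C} = i\gamma^2\gamma^0.$$

• Parity: $x' = (x^0, -\mathbf{x})$, $\left(\gamma^\mu\partial'_\mu/i + m\right)\psi(x') = 0$. Multiply from the left by $\gamma^0$ and use $\{\gamma^0, \gamma^j\} = 0$ to obtain $\left(\gamma^\mu\partial_\mu/i + m\right)\gamma^0\psi(x') = 0$ ⇒ $\mathrm{P}\psi(x)\mathrm{P}^{-1} = \gamma^0\psi(x')$.

• Time reversal: $x' = (-x^0, \mathbf{x})$, $\left(\gamma^\mu\partial'_\mu/i + m\right)\psi(x') = 0$. Multiply from the left by $\gamma^5\mathscr{C}$ and use $[\gamma^5\mathscr{C}, \gamma^0] = 0$, $[\gamma^5\mathscr{C}, \gamma^2] = 0$, $\{\gamma^5\mathscr{C}, \gamma^1\} = 0$, $\{\gamma^5\mathscr{C}, \gamma^3\} = 0$. Now multiply the resulting equation from the left by $\mathrm{T}^{-1}$ and from the right by $\mathrm{T}$, using the anti-unitary nature of T (see the end of Chap. 15), to obtain $\left(\gamma^\mu\partial_\mu/i + m\right)\mathrm{T}^{-1}\gamma^5\mathscr{C}\psi(x')\mathrm{T} = 0$ ⇒ $\mathrm{T}\psi(x)\mathrm{T}^{-1} = \gamma^5\mathscr{C}\psi(x')$.

• Charge conjugation: Consider the equation $\left(\gamma^\mu[\partial_\mu/i - eA_\mu(x)] + m\right)\psi(x) = 0$ of a particle of charge $e$, with $A_\mu(x)$ a given (real) external, i.e., classical, field. Then it is readily checked that $\psi^C(x) = \mathscr{C}\,(\psi^\dagger(x)\gamma^0)^\top \equiv \mathscr{C}\,\bar\psi^\top(x)$ satisfies the equation $\left(\gamma^\mu[\partial_\mu/i + eA_\mu(x)] + m\right)\psi^C(x) = 0$, in the same given external field, as that of a particle with charge $-e$, i.e., $\mathrm{C}\psi(x)\mathrm{C}^{-1} = \mathscr{C}\,\bar\psi^\top(x)$. $\mathscr{C}$ is called the charge conjugation matrix.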

Paul Dirac developed⁶ his equation in 1928 with the prediction of a positively charged particle, and eventually the anti-electron (positron) was discovered by Anderson⁷ in 1932; it had been observed earlier, in 1929, by Chung-Yao Chao⁸ but not pursued to its conclusion.⁹ Apparently,¹⁰ Dirac himself remarked in one of his talks that his equation was more intelligent than its author. It is also interesting to note that George Gamow referred¹¹ to Dirac's predicted "new" positively charged particle, prior to the discovery of the positron, as a "donkey electron", because it would move in the opposite direction to an electron in an applied force.

6 Dirac [5].
7 Anderson [1, 2].
8 C.-Y. Chao was a graduate student at Caltech, and a Ph.D. student of Robert Millikan.
9 See Mehra and Rechenberg [10], p. 804. See also Anderson and Anderson [3]. It is rather surprising that Chao did not share the Nobel Prize with Anderson.
10 Weisskopf [15].
11 Weisskopf [15].

We see that the quantum field theory analysis, via the propagator, predicts quite simply the existence of the positron and of anti-matter in general.¹² The C, P, T transformations of the Dirac field are carried out in Box 16.2.

• Spin 1, m = 0: The Maxwell Field, the Photon
We have seen in Chap. 14 that the vector potential $A^\mu(x)$, with $F^{\mu\nu} = \partial^\mu A^\nu(x) - \partial^\nu A^\mu(x)$, in the Lorenz gauge $\partial_\mu A^\mu = 0$, satisfies the field equation (see also Eqs. (14.22)–(14.27))

$$-\partial_\mu F^{\mu\nu}(x) = J^\nu(x) \;\Rightarrow\; -\Box A^\mu(x) = J^\mu(x), \qquad \partial_\nu J^\nu = 0, \tag{16.29}$$

and in a quantum setting,

$$\langle A^\mu(x)\rangle = \int (dx')\,D_+(x - x')\,J^\mu(x'), \tag{16.30}$$

$$D_+(x - x') = \int \frac{(dk)}{(2\pi)^4}\,\frac{\exp[ik(x - x')]}{k^2 - i\epsilon}, \quad \epsilon \to +0,$$
$$\phantom{D_+(x - x')} = i\int \frac{d^3\mathbf{k}}{(2\pi)^3\,2|\mathbf{k}|} \times \begin{cases} \exp[+ik(x - x')], & x^0 > x'^0, \\ \exp[-ik(x - x')], & x^0 < x'^0. \end{cases} \tag{16.31}$$

With $J^\nu(x')$ a conserved emission source of radiation energy, i.e., $\partial_\nu J^\nu(x') = 0$, as given in (16.29), we may introduce a (conserved) detection source $\tilde J^\mu(x)$, i.e., $\partial_\mu\tilde J^\mu = 0$, which is switched on after the source $J^\mu(x')$ is switched off, i.e., $x^0 > x'^0$, to obtain from (16.30) and (16.31),

$$i\int (dx)\,\tilde J_\mu(x)\,\langle A^\mu(x)\rangle = \sum_{\lambda = 1, 2}\underbrace{\left[\sqrt{\frac{d^3\mathbf{k}}{(2\pi)^3\,2|\mathbf{k}|}}\;i\tilde J^{i\ast}(k)\,e^i_\lambda(\mathbf{k})\right]}_{\text{amplitude of detection}}\;\underbrace{\left[\, iJ^j(k)\,e^j_\lambda(\mathbf{k})\sqrt{\frac{d^3\mathbf{k}}{(2\pi)^3\,2|\mathbf{k}|}}\,\right]}_{\text{amplitude of emission}}, \tag{16.32}$$

describing, causally, the emission and the detection of radiation. Here $\mathbf{e}_1(\mathbf{k})$, $\mathbf{e}_2(\mathbf{k})$ are two orthonormal unit vectors in 3D Euclidean space, referred to as polarization vectors, such that $k^i e^i_\lambda(\mathbf{k}) = 0$, $\lambda = 1, 2$, and the vectors $\mathbf{k}/|\mathbf{k}|$, $\mathbf{e}_1$, $\mathbf{e}_2$ provide a complete set of orthonormal vectors in 3D Euclidean space, i.e.,

$$\delta^{ij} = \sum_{\lambda = 1, 2} e^i_\lambda e^j_\lambda + \frac{k^i k^j}{\mathbf{k}^2}.$$

Upon writing $e^\mu_\lambda = (0, \mathbf{e}_\lambda)$, i.e., $k_\mu e^\mu_\lambda = 0$, $e^\mu_\lambda e_{\mu\lambda'} = \delta_{\lambda\lambda'}$, and introducing the four-vectors $k = (k^0, \mathbf{k})$, $\bar k = (k^0, -\mathbf{k})$, satisfying the photon massless condition $k^2 = \bar k^2 = \mathbf{k}^2 - k^{02} \equiv 0$, and $k\bar k = -\mathbf{k}^2 - k^{02} \equiv -2\mathbf{k}^2 \equiv -2k^{02}$, we may write the 4D completeness relation

$$\eta^{\mu\nu} = \frac{k^\mu\bar k^\nu + \bar k^\mu k^\nu}{k\bar k} + \sum_{\lambda = 1, 2} e^\mu_\lambda e^\nu_\lambda,$$

leading to the key result

$$\tilde J^\ast_\mu(k)\,J^\mu(k) = \tilde J^\ast_\mu(k)\,\eta^{\mu\nu}J_\nu(k) = 0 + \sum_\lambda \tilde J^\ast_\mu(k)\,e^\mu_\lambda e^\nu_\lambda\,J_\nu(k) = \sum_\lambda \tilde J^{i\ast}(k)\,e^i_\lambda e^j_\lambda\,J^j(k),$$

as given in the integrand in (16.32), where we have used the conservation laws $\tilde J^\ast_\mu(k)\,k^\mu = 0 = k^\nu J_\nu(k)$ for the Fourier transforms of the current sources. The 4D completeness relation is readily verified:

$$-1 = \eta^{00} = \frac{k^{02} + k^{02}}{-2k^{02}} + 0, \qquad 0 = \eta^{0i} = \eta^{i0} = 0 + 0, \qquad \delta^{ij} = \eta^{ij} = \frac{-2k^i k^j}{-2\mathbf{k}^2} + \sum_\lambda e^i_\lambda e^j_\lambda,$$

12 In Dirac's original formulation he assumed the existence of negative energy states with negative mass, with energies going down to $-\infty$ [Dirac [6, 7]], and that all these states are filled with electrons in accord with the Pauli exclusion principle. He argued that a negative energy electron may absorb radiation of sufficient energy to jump to a positive energy state, leaving behind a "hole" with a surplus of positive energy and positive charge, which was eventually interpreted as the positron. The quantum field theory analysis, via the propagator, predicts the existence of the positron with no such assumption of an infinite Dirac sea of negatively charged states. In this respect the legendary Julian Schwinger remarked in 1973 [Schwinger [14]] that the "hole" theory "is best described as a historical curiosity, and forgotten". As a young post-doctoral fellow, I was privileged to be present at this event.

with the latter expression coinciding with the 3D completeness relation above. This ingenious way of expressing the 4D completeness relation is due to Schwinger [12].

• Spin 1, m ≠ 0
Consider first the field equation of a massive vector boson $V^\mu$ in the absence of an external source. We may refer to Maxwell's equation in (16.29) and add a mass term, obtaining

$$-\partial_\mu F^{\mu\nu} + m^2 V^\nu = 0 \;\Rightarrow\; -\partial_\nu\partial_\mu F^{\mu\nu} + m^2\partial_\nu V^\nu = 0 \;\Rightarrow\; (-\Box + m^2)V^\mu = 0, \quad \partial_\mu V^\mu = 0, \qquad F^{\mu\nu} = \partial^\mu V^\nu - \partial^\nu V^\mu. \tag{16.33}$$
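In momentum space, Eq. (16.33) says that a plane-wave solution $V^\mu \propto e^\mu\,e^{ikx}$ must have $k^2 = -m^2$ and $k_\mu e^\mu = 0$, which leaves three independent polarization vectors. The sketch below illustrates this numerically and also checks the completeness relation quoted later in this section as Eq. (16.39). It is an illustration under stated assumptions: real linear transverse vectors are used in place of the helicity labels $\pm$, the constant phase $1/i$ of the longitudinal vector in Eq. (16.40) is dropped since it cancels in all the checks, and the function names are hypothetical.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # assumed metric signature (-,+,+,+)


def transverse_pair(k_vec):
    """Two real orthonormal 3-vectors perpendicular to k_vec."""
    n = k_vec / np.linalg.norm(k_vec)
    helper = np.eye(3)[np.argmin(np.abs(n))]  # axis least aligned with n
    e1 = np.cross(n, helper)
    e1 = e1 / np.linalg.norm(e1)
    e2 = np.cross(n, e1)
    return e1, e2


def massive_polarizations(k_vec, m):
    """Two transverse and one longitudinal polarization four-vectors."""
    k_abs = np.linalg.norm(k_vec)
    k0 = np.sqrt(k_abs**2 + m**2)
    e1, e2 = transverse_pair(k_vec)
    e_long = np.concatenate([[k_abs], k0 * k_vec / k_abs]) / m
    return [np.concatenate([[0.0], e1]), np.concatenate([[0.0], e2]), e_long]


def massive_vector_defects(k_vec, m):
    """Deviations from transversality, the field equation and completeness."""
    k0 = np.sqrt(k_vec @ k_vec + m**2)
    k_up = np.concatenate([[k0], k_vec])
    k_low = ETA @ k_up
    k_sq = k_low @ k_up  # equals -m^2 on the mass shell
    pols = massive_polarizations(k_vec, m)

    transversality = max(abs(k_low @ e) for e in pols)
    # momentum-space form of -d_mu F^{mu nu} + m^2 V^nu acting on V^mu
    operator = ETA * (k_sq + m**2) - np.outer(k_low, k_low)
    field_equation = max(np.abs(operator @ e).max() for e in pols)
    # eta^{mu nu} has the same numerical entries as eta_{mu nu}
    completeness = np.abs(
        sum(np.outer(e, e) for e in pols) - (ETA + np.outer(k_up, k_up) / m**2)
    ).max()
    normalization = max(abs(e @ ETA @ e - 1.0) for e in pols)
    return transversality, field_equation, completeness, normalization


if __name__ == "__main__":
    print(massive_vector_defects(np.array([0.5, -1.0, 2.0]), 0.8))
```

All four numbers vanish up to rounding, confirming that the constraint $\partial_\mu V^\mu = 0$ removes exactly one of the four components.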

The first two equations in (16.33) give the correct equations for the description of a spin 1 particle. The first one gives the correct energy-momentum constraint of a massive particle. On the other hand, $V^\mu$ has 4 components, and the second equation in (16.33) gives a constraint leaving only three independent components for the vector field, corresponding to the three states of a spin 1 particle familiar from quantum mechanics. In the presence of an external source $K^\mu(x)$, the field equation then takes the form

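$$-\partial_\mu F^{\mu\nu} + m^2 V^\nu = K^\nu \;\Rightarrow\; \left[\eta_{\mu\nu}(-\Box + m^2) + \partial_\mu\partial_\nu\right]V^\mu = K_\nu, \tag{16.34}$$

$$\langle V^\mu(x)\rangle = \int (dx')\,\Delta_+^{\mu\sigma}(x - x')\,K_\sigma(x'), \tag{16.35}$$

$$\Delta_+^{\mu\sigma}(x - x') = \int \frac{(dk)}{(2\pi)^4}\,\frac{e^{ik(x - x')}}{k^2 + m^2 - i\epsilon}\left[\eta^{\mu\sigma} + \frac{k^\mu k^\sigma}{m^2}\right] = \left[\eta^{\mu\sigma} - \frac{\partial^\mu\partial^\sigma}{m^2}\right]\Delta_+(x - x'), \tag{16.36}$$

$$\text{where} \quad \left[\eta_{\mu\nu}(-\Box + m^2) + \partial_\mu\partial_\nu\right]\left[\eta^{\mu\sigma} - \frac{\partial^\mu\partial^\sigma}{m^2}\right] = \delta_\nu{}^\sigma\,(-\Box + m^2), \tag{16.37}$$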

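and $\Delta_+^{\mu\sigma}(x - x')$ is the massive vector field propagator. We note that since $m \neq 0$, if we scale the momenta of its momentum description by a scale parameter $\lambda$, then, unlike its massless counterpart, $\Delta_+^{\mu\sigma}(\lambda k)$ does not vanish for $\lambda \to \infty$ because of the $k^\mu k^\sigma/m^2$ term in it. This non-vanishing of the massive spin 1 boson propagator at large momenta, in the momentum description, caused much difficulty in the early stages of the development of the electroweak theory, involving massive vector bosons, as it does not provide a convergence factor at high energies. This will be discussed in a later chapter. If we introduce a detection source $\tilde K^\mu(x)$ which is switched on after $K^\sigma(x')$ is switched off in (16.35), we obtain

$$i\int (dx)\,\tilde K^\dagger_\mu(x)\,\langle V^\mu(x)\rangle = \sum_{\lambda = +, 0, -}\left[\, i\tilde K^\dagger_\mu(k)\,e^\mu_\lambda\sqrt{\frac{d^3\mathbf{k}}{(2\pi)^3\,2k^0}}\,\right]\left[\sqrt{\frac{d^3\mathbf{k}}{(2\pi)^3\,2k^0}}\;iK^\sigma(k)\,e^\ast_{\sigma\lambda}\right], \tag{16.38}$$

with a completeness relation

$$\left[\eta^{\mu\nu} + \frac{k^\mu k^\nu}{m^2}\right] = \sum_{\lambda = +, 0, -} e^\mu_\lambda\,e^{\ast\nu}_\lambda, \tag{16.39}$$

$$e^{\ast\mu}_\lambda\,e_{\lambda'\mu} = \delta_{\lambda\lambda'}, \quad k_\mu e^\mu_\lambda = 0, \quad k^2 = -m^2, \qquad e^\mu_+ = (0, \mathbf{e}_+), \quad e^\mu_- = (0, \mathbf{e}_-), \quad e^\mu_0 = \frac{1}{im}\left(|\mathbf{k}|,\ k^0\,\mathbf{k}/|\mathbf{k}|\right), \quad \mathbf{k}\cdot\mathbf{e}_\pm = 0, \tag{16.40}$$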

where the relations in (16.39), (16.40) are readily checked exactly as we have already done for the photon case below Eq. (16.32).

• Spin 3/2, m = 0: Massless Rarita–Schwinger Field
The Rarita–Schwinger massless field equation is given by¹³

13 Rarita and Schwinger [11], Schwinger [13], Manoukian [8].


ημν



−γ ∂ i



− γμ


 ∂ν ∂μ γ∂ − γν − γ μ γ ν ψν = K μ , i i i

(16.41)

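where $\psi_\nu$ carries a spinor index, which is suppressed for convenience, as well as a Lorentz vector index $\nu$. The external source $K^\mu$ carries the same type of indices as the spinor field $\psi^\mu$. We consider the covariant gauge constraint $\partial^\nu\psi_\nu = 0$, as we have done for the photon. The challenge is now to show that this equation does indeed describe a massless particle of spin 3/2. We multiply (16.41) by $\partial_\mu/i$ and use the fact that $(\gamma\partial)^2 = \{\gamma^\mu, \gamma^\nu\}\partial_\mu\partial_\nu/2 = -\eta^{\mu\nu}\partial_\mu\partial_\nu = -\Box$, $[\Box, \gamma^\nu] = 0$, to obtain

$$\left[-\gamma^\nu\Box + \Box\gamma^\nu\right]\psi_\nu = \frac{\partial_\mu K^\mu}{i} \;\Rightarrow\; \partial_\mu K^\mu = 0. \tag{16.42}$$

On the other hand, if we multiply (16.41) by $\gamma_\mu$, we obtain

$$\frac{\gamma\partial}{i}\,\gamma\psi = \frac{1}{2}\,\gamma K. \tag{16.43}$$

Moreover, if we multiply Eq. (16.41) by $\gamma\partial/i$, and use (16.42) and (16.43), the following convenient equation emerges for the field (recall that $\gamma^\nu\gamma\partial = -\gamma\partial\gamma^\nu - 2\partial^\nu$):

$$-\Box\psi^\mu = \left[\eta^{\mu\nu}\left(-\frac{\gamma\partial}{i}\right) - \frac{\partial^\mu}{2i}\,\gamma^\nu - \frac{1}{2}\,\gamma^\mu\left(-\frac{\gamma\partial}{i}\right)\gamma^\nu\right]K_\nu. \tag{16.44}$$

First we note that in the absence of the external source $-\Box\psi^\mu = 0$, giving $p^2 = 0$ from a Fourier transform, thus establishing the masslessness of the particle. Second, consider a conserved source $\tilde K_\mu$ switched on after the source $K_\nu$ is switched off. This, with $\bar{\tilde K}_\mu = \tilde K_\mu^\dagger\gamma^0$, gives

$$i\int (dx)\,\bar{\tilde K}_\mu(x)\,\langle\psi^\mu(x)\rangle = \int \frac{d^3\mathbf{p}}{(2\pi)^3\,2p^0}\,\left[i\bar{\tilde K}_\mu(p)\right]P^{\mu\nu}(p)\left[iK_\nu(p)\right], \tag{16.45}$$

$$P^{\mu\nu}(p) = \eta^{\mu\nu}(-\gamma p) - \frac{1}{2}\,\gamma^\mu(-\gamma p)\gamma^\nu, \qquad p^0 = |\mathbf{p}|, \tag{16.46}$$

$$\phantom{P^{\mu\nu}(p)} = \eta^{\mu\nu}(-\gamma p) - \frac{1}{2}\,\eta^{\mu\sigma}\gamma_\sigma(-\gamma p)\gamma_\rho\,\eta^{\rho\nu}. \tag{16.47}$$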

Note that on account that p ν K ν ( p) = 0, p μ K˜ μ ( p) = 0, and that (γ p)(γ p) = − p 2 = 0, with p 2 = 0 due to the masslessness of the particle discussed above, we may, according to the analysis below Eq. (16.32), make the following replacements in the expression for P μν ( p) in (16.47), with polarization vectors eμλ = (0, eλ ), as given in Box 16.3: μσ μ s ∗i i ρν ν ημν → δsμ δkν esλ e∗k → γ j eλ e∗k λ , η γσ → δs eλ eλ γ , γρ η λ δk , j

(16.48)

with a summation over λ = ±, λ = ±, understood. The Latin indices s, k, i, j go over 1, 2, 3. Massless spinors are introduced  in Box 16.3, satisfying the basic relation σ =± u(p, σ )u(p, σ ) = (−γ p). Accordingly, P μν ( p) in (16.45) now effectively reads as






ν 1 i j j eμλ δλλ u(., σ )u(., σ ) − e∗i λ γ u(., σ )u(., σ )γ eλ eλ 2 λ,λ =±  μ

=2 eλ δλλ u(., σ )u(., σ ) − δλλ u(., −λ)u(., −λ) eνλ

P μν = 2

λ,λ =±

=2

 μ  eλ u(., λ)u(., λ)e∗ν U μ (., κ)U ν (., κ), λ ] = 2|p|

λ=± μ

U (., +) =

μ e+

(16.49)

κ= ± 3/2

ξ+ 0 μ μ , U (., −) = e− , U †μ (p, κ) Uμ (p, κ  ) = δκκ  , 0 ξ−

(16.50)

Box 16.3 Equations encountered in describing the spin content of the Rarita-Schwinger field m = 0. Consider p = |p|(0, 0, 1). γ p u(p, σ ) = 0, ⇒ |p|(γ 3 − γ 0 ) u(p, +) = 0, (σ 3 ∓ 1)ξ± = 0,   ξ 0 1 0 0 −I , ξ+ = , ξ− = , γ0 = . u(p, +) = |p| + , u(p, −) = |p| ξ− 0 1 −I 0 0  γp  1 μ u † (p, σ )u(p, σ  ) = |p|δσ σ  , , eλ = (0, eλ ), ei± = √ (±1, −i, 0), u(p, σ ) u(p, ¯ σ)= − 2 2 σ =± √ √ ∗i i ∗i i ∗i i ∗i i e+ γ u(., +) = 2 u(., −), e− γ u(., −) = 2 u(., +), e+ γ u(., −) = 0, e− γ u(., +) = 0. i i Therefore e∗i ¯ +)γ j e+ = 2 u(., −) u(., ¯ −), e∗i ¯ +)γ j e− = 0, + γ u(., +) u(., + γ u(., +) u(., j

j

i e∗i ¯ −)γ j e− = 2 u(., +) u(., ¯ +), − γ u(., −) u(., j

i e∗i ¯ −)γ j e+ = 0. − γ u(., −) u(., j

in a convenient notation, where in the second line in (16.49) we have, as a consequence of the equations in the last two lines in Box 16.3, that when we sum over $\sigma$, in the second term within the square brackets, only terms with $\lambda' = \lambda$ contribute, leading to the expression $-u(\cdot, -\lambda)\,\bar u(\cdot, -\lambda)$ for each value of $\lambda$. Finally, in the third line, we have used the fact that when we sum over $\sigma = \pm\lambda$ in the first term, for a given value of $\lambda$, only the term $\sigma = \lambda$ contributes to the total expression. This establishes the spin 3/2 character of the massless Rarita–Schwinger field, involving two spin states.

• Spin 2, m = 0: The Einstein Field, the Graviton
This is considered at length in Chap. 51, and we summarize the basic equations here for the convenience of the reader. The corresponding massless spin 2 field $h^{\mu\nu}$ may be appropriately referred to as the Einstein field, with the graviton as the associated particle. The field equation is given by

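$$-\Box h^{\mu\nu}(x) = \frac{16\pi G}{c^4}\left[\frac{\eta^{\mu\sigma}\eta^{\nu\rho} + \eta^{\nu\sigma}\eta^{\mu\rho} - \eta^{\mu\nu}\eta^{\sigma\rho}}{2}\right]T_{\sigma\rho}(x), \tag{16.51}$$

where $T_{\sigma\rho}$ is a conserved second rank symmetric tensor, $\partial^\sigma T_{\sigma\rho} = 0$, and $G$ is the Newtonian gravitational constant. For a detection source, with associated symmetric and conserved second rank tensor $\tilde T_{\mu\nu}$, which is switched on after $T_{\sigma\rho}$ is switched off, one has for $i\int (dx)\,\tilde T_{\mu\nu}(x)\,\langle h^{\mu\nu}(x)\rangle$ the expression

$$\frac{16\pi G}{c^4}\int \frac{d^3\mathbf{k}}{(2\pi)^3\,2k^0}\,\left[i\tilde T^\ast_{\mu\nu}(k)\right]\left[\frac{\eta^{\mu\sigma}\eta^{\nu\rho} + \eta^{\nu\sigma}\eta^{\mu\rho} - \eta^{\mu\nu}\eta^{\sigma\rho}}{2}\right]\left[iT_{\sigma\rho}(k)\right], \qquad k^0 = |\mathbf{k}|, \tag{16.52}$$

which describes the transmission of energy from the emission source $T_{\sigma\rho}(k)$ to the detection source $\tilde T^\ast_{\mu\nu}(k)$. In Box 16.4, it is shown that, due to the conservation of the sources, we may effectively write

$$\frac{\eta^{\mu\sigma}\eta^{\nu\rho} + \eta^{\nu\sigma}\eta^{\mu\rho} - \eta^{\mu\nu}\eta^{\sigma\rho}}{2} = \sum_{\zeta = \pm 2} e^{\mu\nu}_\zeta\,e^{\ast\sigma\rho}_\zeta, \tag{16.53}$$

$$\text{with} \quad k_\mu e^\mu_\pm = 0, \quad e^{\ast\mu}_\lambda\,e_{\mu\lambda'} = \delta_{\lambda\lambda'}, \quad e^{\ast\mu}_\lambda = -e^\mu_{-\lambda}, \qquad e^{\mu\nu}_{\pm 2} = e^\mu_\pm e^\nu_\pm, \tag{16.54}$$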

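establishing the spin 2 content of the graviton. For a detailed geometrical description of the polarization states of gravitation, see also Chap. 52.¹⁴

Box 16.4 Equations encountered in describing the spin content of the graviton

$$\eta^{\mu\nu} = \frac{k^\mu\bar k^\nu + \bar k^\mu k^\nu}{k\bar k} + \sum_{\lambda = \pm} e^\mu_\lambda\,e^{\ast\nu}_\lambda = \frac{k^\mu\bar k^\nu + \bar k^\mu k^\nu}{k\bar k} + \sum_{\lambda = \pm} e^{\ast\mu}_\lambda\,e^\nu_\lambda, \qquad e^{\ast\mu}_\lambda = -e^\mu_{-\lambda} \quad \text{(see Box 16.3)}.$$

$$\frac{\eta^{\mu\sigma}\eta^{\nu\rho} + \eta^{\nu\sigma}\eta^{\mu\rho} - \eta^{\mu\nu}\eta^{\sigma\rho}}{2} \equiv \pi^{\mu\nu, \sigma\rho},$$

and due to the conservation of the sources, we effectively have

$$\pi^{\mu\nu, \sigma\rho} = \frac{1}{2}\sum_{\lambda, \lambda'}\left(e^\mu_\lambda e^{\ast\sigma}_\lambda\,e^\nu_{\lambda'} e^{\ast\rho}_{\lambda'} + e^\mu_\lambda e^{\ast\rho}_\lambda\,e^\nu_{\lambda'} e^{\ast\sigma}_{\lambda'} - e^\mu_\lambda e^{\ast\nu}_\lambda\,e^\sigma_{\lambda'} e^{\ast\rho}_{\lambda'}\right).$$

Or

$$\pi^{\mu\nu, \sigma\rho} = \sum_\lambda e^\mu_\lambda e^{\ast\sigma}_\lambda\,e^\nu_\lambda e^{\ast\rho}_\lambda + \frac{1}{2}\sum_\lambda e^\mu_\lambda\left(e^{\ast\sigma}_\lambda\,e^\nu_{-\lambda} e^{\ast\rho}_{-\lambda} + e^{\ast\rho}_\lambda\,e^\nu_{-\lambda} e^{\ast\sigma}_{-\lambda} - e^{\ast\nu}_\lambda\,e^\sigma_{-\lambda} e^{\ast\rho}_{-\lambda} - e^{\ast\nu}_\lambda\,e^\sigma_\lambda e^{\ast\rho}_\lambda\right).$$

The first and the third terms within the round brackets cancel out, as well as the second and the fourth terms, on using $e^{\ast\mu}_\lambda = -e^\mu_{-\lambda}$. This gives $\pi^{\mu\nu, \sigma\rho} = \sum_{\zeta = \pm 2} e^{\mu\nu}_\zeta\,e^{\ast\sigma\rho}_\zeta$, $e^{\mu\nu}_{\pm 2} = e^\mu_\pm e^\nu_\pm$.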
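The two statements of Box 16.4 lend themselves to a quick numerical check. The sketch below is illustrative only: for a massless momentum $k = (|\mathbf{k}|, \mathbf{k})$ it builds helicity vectors $\mathbf{e}_\pm = (\pm\mathbf{e}_1 - i\mathbf{e}_2)/\sqrt{2}$ from a right-handed transverse frame (an assumption that agrees with the vectors of Box 16.3 up to a rotation about $\mathbf{k}$), verifies the 4D completeness relation, and then verifies that, once $\eta^{\mu\nu}$ is effectively replaced by $\sum_\lambda e^\mu_\lambda e^{\ast\nu}_\lambda$ between conserved sources, the tensor $\pi^{\mu\nu,\sigma\rho}$ reduces to the sum over the two helicity $\pm 2$ states. All function names are hypothetical.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # assumed metric signature (-,+,+,+)


def helicity_vectors(k_vec):
    """Four-vectors e_+ and e_- = -(e_+)^* with vanishing time component."""
    n = k_vec / np.linalg.norm(k_vec)
    helper = np.eye(3)[np.argmin(np.abs(n))]
    e1 = np.cross(helper, n)
    e1 = e1 / np.linalg.norm(e1)
    e2 = np.cross(n, e1)  # (e1, e2, n) is a right-handed frame
    e_plus = (e1 - 1j * e2) / np.sqrt(2)
    e_minus = (-e1 - 1j * e2) / np.sqrt(2)
    return np.concatenate([[0.0], e_plus]), np.concatenate([[0.0], e_minus])


def completeness_defect(k_vec):
    """eta^{mu nu} minus the right-hand side of the 4D completeness relation."""
    k_abs = np.linalg.norm(k_vec)
    k_up = np.concatenate([[k_abs], k_vec])
    kbar_up = np.concatenate([[k_abs], -k_vec])
    k_kbar = k_up @ ETA @ kbar_up
    e_plus, e_minus = helicity_vectors(k_vec)
    rhs = (np.outer(k_up, kbar_up) + np.outer(kbar_up, k_up)) / k_kbar
    rhs = rhs + sum(np.outer(e, e.conj()) for e in (e_plus, e_minus))
    return np.abs(ETA - rhs).max()


def spin2_defect(k_vec):
    """pi^{mu nu, sigma rho} with eta -> sum_lambda e e^*, minus the helicity-2 sum."""
    e_plus, e_minus = helicity_vectors(k_vec)
    proj = sum(np.outer(e, e.conj()) for e in (e_plus, e_minus))
    pi = 0.5 * (
        np.einsum("ms,nr->mnsr", proj, proj)
        + np.einsum("ns,mr->mnsr", proj, proj)
        - np.einsum("mn,sr->mnsr", proj, proj)
    )
    helicity_two = sum(
        np.einsum("m,n,s,r->mnsr", e, e, e.conj(), e.conj())
        for e in (e_plus, e_minus)
    )
    return np.abs(pi - helicity_two).max()


if __name__ == "__main__":
    k = np.array([0.4, -1.1, 0.7])
    print(completeness_defect(k), spin2_defect(k))
```

Both defects vanish up to rounding, so only the two helicity states $\zeta = \pm 2$ survive between conserved sources.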

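Box 16.5 Some Representations of the Gamma Matrices

Consider the following transformations: $\gamma^\mu \to G\gamma^\mu G^{-1}$, where $G$ is a unitary matrix.

Dirac: $G = I$, $\quad \gamma^0 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}$, $\quad \gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}$, $\quad \boldsymbol{\gamma} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ -\boldsymbol{\sigma} & 0 \end{pmatrix}$.

Chiral: $G = \dfrac{1}{\sqrt{2}}\begin{pmatrix} I & I \\ -I & I \end{pmatrix}$, $\quad \gamma^0 = \begin{pmatrix} 0 & -I \\ -I & 0 \end{pmatrix}$, $\quad \gamma^5 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}$, $\quad \boldsymbol{\gamma} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ -\boldsymbol{\sigma} & 0 \end{pmatrix}$.

Majorana: $G = \dfrac{1}{\sqrt{2}}\begin{pmatrix} I & \sigma^2 \\ \sigma^2 & -I \end{pmatrix}$, $\quad \gamma^0 = \begin{pmatrix} 0 & \sigma^2 \\ \sigma^2 & 0 \end{pmatrix}$, $\quad \gamma^5 = \begin{pmatrix} \sigma^2 & 0 \\ 0 & -\sigma^2 \end{pmatrix}$,
$\gamma^1 = \begin{pmatrix} i\sigma^3 & 0 \\ 0 & i\sigma^3 \end{pmatrix}$, $\quad \gamma^2 = \begin{pmatrix} 0 & -\sigma^2 \\ \sigma^2 & 0 \end{pmatrix}$, $\quad \gamma^3 = \begin{pmatrix} -i\sigma^1 & 0 \\ 0 & -i\sigma^1 \end{pmatrix}$.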
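The entries of Box 16.5 can be reproduced by carrying out the transformation $\gamma^\mu \to G\gamma^\mu G^{-1}$ explicitly. The sketch below is a minimal illustration, starting from the Dirac representation of Eq. (16.13); the helper names are hypothetical. It checks that the anti-commutation relations survive the change of representation, that $\gamma^5$ is diagonal in the chiral representation, and that all four gamma matrices are purely imaginary in the Majorana representation.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # assumed metric signature (-,+,+,+)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

DIRAC = [np.block([[I2, Z2], [Z2, -I2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in SIGMA
]
G_CHIRAL = np.block([[I2, I2], [-I2, I2]]) / np.sqrt(2)
G_MAJORANA = np.block([[I2, SIGMA[1]], [SIGMA[1], -I2]]) / np.sqrt(2)


def transform(gammas, g):
    """gamma^mu -> G gamma^mu G^{-1} for a unitary G (so G^{-1} = G^dagger)."""
    return [g @ gm @ g.conj().T for gm in gammas]


def gamma5(gammas):
    return 1j * gammas[0] @ gammas[1] @ gammas[2] @ gammas[3]


def clifford_defect(gammas):
    """Largest entry of {gamma^mu, gamma^nu} + 2 eta^{mu nu}."""
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            anti = gammas[mu] @ gammas[nu] + gammas[nu] @ gammas[mu]
            worst = max(worst, np.abs(anti + 2 * ETA[mu, nu] * np.eye(4)).max())
    return worst


if __name__ == "__main__":
    chiral = transform(DIRAC, G_CHIRAL)
    majorana = transform(DIRAC, G_MAJORANA)
    print(clifford_defect(chiral), clifford_defect(majorana))
    print(np.round(gamma5(chiral).real, 12))
    print(max(np.abs(g.real).max() for g in majorana))
```

That $\gamma^5$ is diagonal in the chiral representation is what makes that representation convenient for massless fermions, as noted in the text following the Box.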

Some useful representations of the gamma matrices are given in Box 16.5. The unitarily equivalent representations of the gamma matrices in this Box are useful in dealing with certain problems, as we will see later. For example, the chiral representation is useful in dealing with massless fermions. In practice there may be a particle whose mass is very small in comparison to the mass of another particle involved in a given process. In such a case, one may consider setting the mass of the former particle equal to zero in studying

14 For more detailed studies of the field equations of various fields, see also Manoukian [9], as well as Schwinger [13].


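Box 16.6 Left- and Right-Handedness (Chirality) and the Spin 1/2 Field

$\psi$ a spin 1/2 field: $\psi_L = \dfrac{1-\gamma^5}{2}\,\psi$, $\psi_R = \dfrac{1+\gamma^5}{2}\,\psi$. The Dirac spinors $u(\mathbf{p},\sigma)$, $\sigma = \pm$, satisfy $(\gamma p + m)\,u(\mathbf{p},\sigma) = 0$. Working in the chiral representation of the gamma matrices given in Box 16.5, and with $\mathbf{N} = \mathbf{p}/|\mathbf{p}|$, we have

$$u(\mathbf{p},\sigma) = \sqrt{\frac{p^0+m}{m}}\;\sigma\,\tilde{u}(\mathbf{p},\sigma), \qquad \tilde{u}(\mathbf{p},\sigma) = \frac{1}{2(p^0+m)}\begin{pmatrix} +(p^0+m+|\mathbf{p}|\sigma)\,\xi_\sigma \\ -(p^0+m-|\mathbf{p}|\sigma)\,\xi_\sigma \end{pmatrix},$$

where $\xi_\pm$ are given in Box 16.1.

$$\tilde{u}_L(\mathbf{p},\sigma) = \frac{-\sigma\,(p^0+m-|\mathbf{p}|\sigma)}{2(p^0+m)}\begin{pmatrix} 0 \\ \xi_\sigma \end{pmatrix}, \qquad \tilde{u}_R(\mathbf{p},\sigma) = \frac{\sigma\,(p^0+m+|\mathbf{p}|\sigma)}{2(p^0+m)}\begin{pmatrix} \xi_\sigma \\ 0 \end{pmatrix}.$$

For $m = 0$,

$$\tilde{u}_L(\mathbf{p},-) = \begin{pmatrix} 0 \\ \xi_- \end{pmatrix}, \quad \tilde{u}_R(\mathbf{p},+) = \begin{pmatrix} \xi_+ \\ 0 \end{pmatrix}, \quad \tilde{u}_L^\dagger(\mathbf{p},-)\,\tilde{u}_R(\mathbf{p},+) = 0, \quad \tilde{u}_{L/R}^\dagger(\mathbf{p},\mp)\,\tilde{u}_{L/R}(\mathbf{p},\mp) = 1.$$

Helicity: $\begin{pmatrix} \boldsymbol{\sigma}\cdot\mathbf{N} & 0 \\ 0 & \boldsymbol{\sigma}\cdot\mathbf{N} \end{pmatrix}\tilde{u}_{L/R}(\mathbf{p},\mp) = \mp\,\tilde{u}_{L/R}(\mathbf{p},\mp)$. Chirality: $\gamma^5\,\tilde{u}_{L/R}(\mathbf{p},\mp) = \mp\,\tilde{u}_{L/R}(\mathbf{p},\mp)$.

Space Reflection: $\mathbf{p} \to -\mathbf{p}$: $(\theta,\phi) \to (\pi-\theta,\,\phi+\pi)$, $\xi_+ \to i\,\xi_-$, $\xi_- \to i\,\xi_+$, $\Rightarrow$

Under Parity: $i\gamma^0\,u_R(\mathbf{p},+)\big|_{\mathbf{p}\to-\mathbf{p}} = u_L(\mathbf{p},-)$, $\quad i\gamma^0\,u_L(\mathbf{p},-)\big|_{\mathbf{p}\to-\mathbf{p}} = u_R(\mathbf{p},+)$.

Right-Handed: helicity = +1. Left-Handed: helicity = −1.

That is, massless particles can be left-handed or right-handed, and the parity transformation relates the two.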

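Box 16.7 Lorentz transformation of a Dirac spinor

Under a homogeneous Lorentz transformation $x^\mu \to \Lambda^\mu{}_\nu x^\nu = x'^\mu$ and $\partial'_\mu = \Lambda_\mu{}^\nu \partial_\nu$. Accordingly the Dirac equation under such a transformation takes the form: $\left[\gamma^\mu \Lambda_\mu{}^\nu \partial_\nu/i + m\right] K\psi(x) = 0$, where $K\psi(x) = \psi'(x')$, where $K$ is some matrix.

Upon multiplying the latter equation from the left by $K^{-1}$, the resulting equation reduces to the Dirac equation $\left[\gamma^\mu\partial_\mu/i + m\right]\psi(x) = 0$, if $K^{-1}\gamma^\mu \Lambda_\mu{}^\nu K = \gamma^\nu$. Upon multiplying this equation by $\Lambda^\sigma{}_\nu$ and using Eq. (12.4) in Chapter 12: $\Lambda_\mu{}^\nu = (\Lambda^{-1})^\nu{}_\mu$ gives

• $K^{-1}\gamma^\sigma K = \Lambda^\sigma{}_\nu\,\gamma^\nu$.

We take the adjoint of the Lorentz transformed Dirac equation above to obtain: $\psi^\dagger(x) K^\dagger\left[-(\gamma^\mu)^\dagger \Lambda_\mu{}^\nu \overleftarrow{\partial}_\nu/i + m\right] = 0$. Upon writing $\psi^\dagger(x) = \bar{\psi}(x)\gamma^0$, and multiplying the earlier equation by $(\gamma^0 K^\dagger)^{-1}$ from the right, the resulting equation reduces to the adjoint of the Dirac equation if $(\gamma^0 K^\dagger)(\gamma^\mu)^\dagger \Lambda_\mu{}^\nu (\gamma^0 K^\dagger)^{-1} = \gamma^\nu$. Again we multiply this equation by $\Lambda^\sigma{}_\nu$, and use the properties $\Lambda_\mu{}^\nu = (\Lambda^{-1})^\nu{}_\mu$, $(\gamma^\mu)^\dagger = \gamma^0\gamma^\mu\gamma^0$ to obtain

$$(\gamma^0 K^\dagger \gamma^0)\,\gamma^\sigma\,(\gamma^0 K^\dagger \gamma^0)^{-1} = \Lambda^\sigma{}_\nu\,\gamma^\nu \equiv K^{-1}\gamma^\sigma K$$

from the equation next to • above. This gives: $\gamma^0 K^\dagger \gamma^0 = K^{-1}$ or

• $K^\dagger \gamma^0 = \gamma^0 K^{-1}$.

For infinitesimal transformations, we have from Eqs. (15.17)-(15.21) in Chapter 15, $\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \delta\omega^\mu{}_\nu$, and we may write $K = I + \dfrac{i}{2}\,\delta\omega_{\mu\nu} S^{\mu\nu}$, and replacing it in the first equation next to • and using the identity $\left[\gamma^\sigma, [\gamma^\mu,\gamma^\nu]\right] = 4\,(\gamma^\mu\eta^{\sigma\nu} - \gamma^\nu\eta^{\sigma\mu})$ gives $S^{\mu\nu} = \dfrac{i}{4}\,[\gamma^\mu,\gamma^\nu]$.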

the process in question. For example, in the problem of muon decay, $\mu^- \to e^-\,\tilde{\nu}_e\,\nu_\mu$, one may consider setting the masses of the electron $e^-$, the electron-anti-neutrino $\tilde{\nu}_e$, and the muon-neutrino $\nu_\mu$ equal to zero in comparison to the mass of the muon, which is a few hundred times as massive as the other three particles. In Box 16.6, we consider massless spin 1/2 particles, and introduce the concepts of right-handed and left-handed particles, as well as the concept of helicity, which is defined, in general, as the projection of the spin of a particle along its momentum. The content of Box 16.6 will be useful in later applications. In Box 16.7, we consider the transformation of a Dirac spinor under a Lorentz transformation.


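Finally, in Box 16.8 some general properties of the gamma matrices are given based on the anti-commutation relation $\{\gamma^\mu,\gamma^\nu\} = -2\eta^{\mu\nu}$ in Eq. (16.14). This chapter gives all the fundamental fields used in present quantum field theoretical descriptions of basic interactions.

Box 16.8 Some properties of the gamma matrices

$$\gamma^\mu\gamma^\nu = -\eta^{\mu\nu} I + \tfrac{1}{2}[\gamma^\mu,\gamma^\nu]; \quad \mathrm{Tr}\,[\gamma^\mu] = 0; \quad (\gamma^0)^2 = I; \quad (\gamma^i)^2 = -I,\ i = 1,2,3; \quad \eta_{\mu\nu}\gamma^\mu\gamma^\nu = -4I,$$

$$\eta_{\mu\nu}\gamma^\mu\gamma^\sigma\gamma^\nu = 2\gamma^\sigma; \quad \eta_{\mu\nu}\gamma^\mu\gamma^\sigma\gamma^\lambda\gamma^\nu = 4\eta^{\sigma\lambda}; \quad \eta_{\mu\nu}\gamma^\mu\gamma^\sigma\gamma^\lambda\gamma^\rho\gamma^\nu = 2\gamma^\rho\gamma^\lambda\gamma^\sigma;$$

$$\left[\gamma^\mu,[\gamma^\sigma,\gamma^\rho]\right] = 4\left(\gamma^\sigma\eta^{\mu\rho} - \gamma^\rho\eta^{\mu\sigma}\right); \quad \mathrm{Tr}\,[\gamma^\mu\gamma^\nu] = -4\eta^{\mu\nu};$$

$$\mathrm{Tr}\,[\gamma^\alpha\gamma^\beta\gamma^\mu\gamma^\nu] = 4\left(\eta^{\alpha\beta}\eta^{\mu\nu} - \eta^{\alpha\mu}\eta^{\beta\nu} + \eta^{\alpha\nu}\eta^{\beta\mu}\right),$$

$$\mathrm{Tr}\,[\gamma^5\gamma^\alpha\gamma^\beta\gamma^\mu\gamma^\nu] = -4i\,\varepsilon^{\alpha\beta\mu\nu}, \quad \varepsilon^{\alpha\beta\mu\nu} \text{ totally anti-symmetric with } \varepsilon^{0123} = +1; \quad \mathrm{Tr}\,[\text{odd number of } \gamma\text{'s}] = 0.$$

$$\mathrm{Tr}\,[\gamma^5] = 0; \quad \mathrm{Tr}\,[\gamma^5\gamma^\mu] = 0; \quad (\gamma^5)^2 = I; \quad \{\gamma^5,\gamma^\mu\} = 0; \quad (\gamma^\mu a_\mu)^2 = -I\left(\mathbf{a}^2 - (a^0)^2\right); \quad \gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3;$$

$$(\boldsymbol{\gamma}\cdot\mathbf{a})^2 = -I\,\mathbf{a}^2, \quad \mathbf{a} = (a^1,a^2,a^3), \quad a_0 = -a^0, \quad a_i = a^i,\ i = 1,2,3.$$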

References

1. Anderson, C. D. (1932). The apparent existence of easily deflectable positives. Science, 76, 238–239.
2. Anderson, C. D. (1933). The positive electron. Physical Review, 43, 491–494.
3. Anderson, C. D., & Anderson, H. L. (1983). Unraveling the particle content of cosmic rays. In L. M. Brown & L. Hoddeson (Eds.), The birth of particle physics (pp. 135–136). Cambridge: Cambridge University Press.
4. Daubechies, I., & Lieb, E. H. (1983). One-electron relativistic molecules with Coulomb interactions. Communications in Mathematical Physics, 90, 497–510.
5. Dirac, P. A. M. (1928). The quantum theory of the electron. I. Proceedings of the Royal Society of London, A, 117, 610–624.
6. Dirac, P. A. M. (1930). A theory of electrons and protons. Proceedings of the Royal Society of London, A, 126, 360–365.
7. Dirac, P. A. M. (1930). On the annihilation of the electrons and protons. Proceedings of the Cambridge Philosophical Society, 26, 361–375.
8. Manoukian, E. B. (2016). Rarita-Schwinger massless field in covariant and Coulomb gauges. Modern Physics Letters A, 31, 1650047 (pp. 1–7).
9. Manoukian, E. B. (2016). Quantum field theory I: Foundations and abelian and non-abelian gauge theories. Switzerland: Springer.
10. Mehra, J., & Rechenberg, H. (2000). The historical development of quantum theory, vol. 6: The completion of quantum mechanics, 1926–1941. Berlin: Springer.
11. Rarita, W., & Schwinger, J. (1941). On a theory of particles with half-integral spin. Physical Review, 60, 61.
12. Schwinger, J. (1969). Particles and sources. New York: Gordon & Breach.
13. Schwinger, J. (1970). Particles, sources, and fields, vol. 1. Reading: Addison-Wesley.
14. Schwinger, J. (1973). A report on quantum electrodynamics. In J. Mehra (Ed.), The physicist’s conception of nature. Dordrecht: D. Reidel Publishing Company.
15. Weisskopf, V. F. (1980). Growing up with field theory, and recent trends in particle physics. “The (1979) Bernard Gregory Lectures at CERN”, 29 p. Geneva: CERN.

17 Gauge Fields

Prerequisites: Chap. 16

In order to describe the approximate charge independent nature of the strong interaction between the proton and neutron, Werner Heisenberg in 1932 introduced the concept of isospin,¹ for which the proton p and neutron n may be considered as two states of the same particle of isospin $I = 1/2$, with the corresponding states identified, respectively, by the quantum numbers $I_3 = \pm 1/2$, in analogy to the description of the spin states of a particle of spin 1/2. Charge independence of a theory involving the proton and neutron would then be implemented by its invariance under rotations in isotopic space. This is described by putting the proton and neutron in a doublet $(p\;\, n)^\top$, with the transformations in question carried out via a unitary operator $U$: $(p\;\, n)^\top \to U\,(p\;\, n)^\top$, $U^\dagger U = I$.

In the simplest possible case, if we make a spacetime dependent phase change of a charged field, such as the Dirac field,² $\psi(x) \to \exp[\,i\theta(x)\,]\,\psi(x)$, which may be considered as a rotation in a two-dimensional complex plane set up at each point of spacetime, then in order that the Dirac equation $\left[\gamma^\mu\partial_\mu/i + m\right]\psi(x) = 0$ remains invariant, we must introduce a structure $\Gamma_\mu(x)$, referred to as a connection, telling us how the phase, or the angle of rotation $\theta(x)$, changes as we move infinitesimally away from the point $x$. This requires rewriting the Dirac equation as

$$\left[\gamma^\mu\left(\frac{\partial_\mu}{i} - \Gamma_\mu(x)\right) + m\right]\psi(x) = 0, \tag{17.1}$$

meaning that the partial derivative $\partial_\mu$ of $\psi(x)$ is to be changed to a new one, referred to as a covariant derivative, denoted by $\nabla_\mu$ and defined by

$$\partial_\mu\psi(x) \to \nabla_\mu\psi(x) = \left(\partial_\mu - i\,\Gamma_\mu(x)\right)\psi(x), \tag{17.2}$$

and under the local phase change of the field $\psi(x)$, the structure $\Gamma_\mu$ then transforms as:

$$\Gamma_\mu(x) \to \Gamma_\mu(x) + \partial_\mu\theta(x). \tag{17.3}$$
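As a short worked check of why (17.3) is the required transformation law, one can apply the local phase change directly to the combination appearing in (17.1). The primed symbols below are introduced only for this check and denote the transformed quantities.

```latex
\begin{aligned}
\psi'(x) &= e^{i\theta(x)}\,\psi(x), \qquad \Gamma'_\mu(x) = \Gamma_\mu(x) + \partial_\mu\theta(x), \\[4pt]
\left(\frac{\partial_\mu}{i} - \Gamma'_\mu\right)\psi'
  &= e^{i\theta}\left(\frac{\partial_\mu}{i} + \partial_\mu\theta - \Gamma_\mu - \partial_\mu\theta\right)\psi
   = e^{i\theta}\left(\frac{\partial_\mu}{i} - \Gamma_\mu\right)\psi, \\[4pt]
\left[\gamma^\mu\left(\frac{\partial_\mu}{i} - \Gamma'_\mu\right) + m\right]\psi'
  &= e^{i\theta}\left[\gamma^\mu\left(\frac{\partial_\mu}{i} - \Gamma_\mu\right) + m\right]\psi = 0 .
\end{aligned}
```

The term $\partial_\mu\theta$ produced by differentiating the phase is cancelled by the shift of the connection, so the covariant derivative of $\psi$ picks up the same phase factor as $\psi$ itself.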

¹ Heisenberg [2].
² Recall that a rotation by an angle θ in the complex plane is given by $z = e^{i\theta}\,r$.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_17

What have we achieved? By considering independent spacetime dependent phase changes of the field $\psi(x)$ at different spacetime points $x$, we had to introduce a field $\Gamma_\mu(x)$, referred to as a connection, which tells us how the phase, or the angle of rotation $\theta(x)$, changes as we move infinitesimally away from the point $x$. By introducing, in turn, a charge e, one may then define the so-called vector potential $A_\mu(x)$, referred to as a gauge field, by the equation $\Gamma_\mu(x) = eA_\mu(x)$. Thus the local transformations just described generate a force of interaction between charges mediated by the gauge field. The underlying group of phase transformations is referred to as an abelian gauge group, denoted by U(1), and the vector potential is referred to as an abelian gauge field.

The generalization of the above example to a symmetry involving transformations between particles assigned to, say, N different states of a given quantum number, as for the isospin case mentioned above, is as follows. At each point of spacetime $x$, we set up an N-dimensional Euclidean coordinate system such that one has complete freedom in orienting the coordinate systems differently as we move from one spacetime point to another. Let $e_i(x)$, $i = 1, 2, \ldots, N$, denote the basis vectors of an N-dimensional Euclidean space set up at a point $x$. Consider a curve in spacetime parameterized by some parameter λ with coordinate labels $x(\lambda)$. The change of the basis vectors $e_i(x(\lambda + d\lambda)) - e_i(x(\lambda))$, under an infinitesimal change, must vanish for $dx^\mu \to 0$. It may also be expanded in terms of the basis vectors $e_j(x(\lambda))$ in such a limit. Accordingly, we may write

$$\frac{d}{d\lambda}\,e_i(x(\lambda)) = -i\,\Gamma_{i\mu}{}^{j}(x(\lambda))\,e_j(x(\lambda))\,\frac{dx^\mu(\lambda)}{d\lambda}, \tag{17.4}$$

where the $\Gamma_{i\mu}{}^{j}(x)$ are expansion coefficients, with $j = 1, 2, \ldots, N$. The N × N matrix $\Gamma_\mu$ is referred to as a connection. It tells us how the coordinate systems are oriented as we move from a point $x$ to an infinitesimally close point $x + dx$ on a given curve. The $-i$ factor in (17.4) is chosen for convenience. At any given value taken by the parameter λ, we may introduce an N-plet, as an N × 1 column matrix, $\chi(\lambda)$, expanded in terms of the basis vectors $e_i(x(\lambda))$, and its derivative w.r.t. λ, as follows from (17.4) and the chain rule $d/d\lambda = (dx^\mu/d\lambda)\,\partial/\partial x^\mu$:

$$\chi(\lambda) = \chi_i(x(\lambda))\,e_i(x(\lambda)), \tag{17.5}$$

$$\frac{d\chi(\lambda)}{d\lambda} = \frac{dx^\mu(\lambda)}{d\lambda}\left[\partial_\mu\chi_i(x) - i\,\Gamma_{i\mu}{}^{j}(x)\,\chi_j(x)\right]e_i(x(\lambda)), \tag{17.6}$$

where the index $i$ denotes the various states of the quantum number. For example, for the isospin case discussed earlier, $i = 1$ corresponds to $I_3 = +1/2$, while $i = 2$ corresponds to $I_3 = -1/2$. That is, particles corresponding to different states are grouped together in a column vector, $\chi^\top = (\chi_1, \ldots, \chi_N)$, of N possible states, referred to as a multiplet. From Eq. (17.6), we may infer that the freedom of setting up Euclidean coordinate systems at different spacetime points with, in general, different orientations, forces the partial derivative $\partial_\mu$ of the field components $\chi_i(x)$, of physical interest, to be replaced by

$$(\nabla_\mu)_i{}^{j} = \delta_i{}^{j}\,\partial_\mu - i\,\Gamma_{i\mu}{}^{j}(x), \qquad (\nabla_\mu\chi)_i(x) = \partial_\mu\chi_i(x) - i\,\Gamma_{i\mu}{}^{j}(x)\,\chi_j(x). \tag{17.7}$$

Invariant theories, under symmetry transformations, such as of isospin, are constructed by introducing unitary local transformations of such fields $\chi^\top(x) = (\chi_1(x), \ldots, \chi_N(x))$ in the Euclidean coordinate system set up at the point $x$. Thus we introduce N × N unitary matrices $V(x) = [V_i{}^{j}(x)]$, $V^\dagger V = I = VV^\dagger$, $V^\dagger = V^{-1}$, with transformations given by

$$\chi_i(x) \to V_i{}^{j}(x)\,\chi_j(x). \tag{17.8}$$

The construction of symmetric theories invariant under the above transformations demands that the covariant derivative transforms the same way as the field components. That is, now with $\chi^\top(x) = (\chi_1(x), \ldots, \chi_N(x))$, with $\chi(x)$ representing a column vector, we may, in a matrix notation, conveniently write

$$\chi(x) \to V(x)\,\chi(x), \qquad \nabla_\mu\chi(x) \to V(x)\left[\nabla_\mu\chi(x)\right]. \tag{17.9}$$

It is easily verified that these transformations require, from (17.7), that the connection $\Gamma_\mu(x)$, as a matrix, transforms as

$$\Gamma_\mu(x) \to V(x)\left[\Gamma_\mu(x) - \frac{\partial_\mu}{i}\right]V^{-1}(x). \tag{17.10}$$
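The verification behind (17.10) can be spelled out in a few lines. The sketch below uses matrix notation, writes $\Gamma'_\mu$ for the transformed connection (a symbol introduced only for this check), and uses $\partial_\mu(VV^{-1}) = 0$ to trade $(\partial_\mu V)V^{-1}$ for $-V\,\partial_\mu V^{-1}$.

```latex
\begin{aligned}
\left(\partial_\mu - i\,\Gamma'_\mu\right)(V\chi) &= V\left(\partial_\mu - i\,\Gamma_\mu\right)\chi
  &&\text{(requirement (17.9))} \\[4pt]
(\partial_\mu V)\,\chi - i\,\Gamma'_\mu V\chi &= -\,i\,V\,\Gamma_\mu\,\chi
  &&\text{(the terms } V\partial_\mu\chi \text{ cancel)} \\[4pt]
\Gamma'_\mu &= V\,\Gamma_\mu V^{-1} - i\,(\partial_\mu V)\,V^{-1}
  &&\text{(valid for arbitrary } \chi\text{)} \\[4pt]
 &= V\,\Gamma_\mu V^{-1} + i\,V\,\partial_\mu V^{-1}
  = V\left[\Gamma_\mu - \frac{\partial_\mu}{i}\right]V^{-1} .
\end{aligned}
```

For $N = 1$ and $V = e^{i\theta(x)}$ the last line reduces to $\Gamma_\mu + \partial_\mu\theta$, in agreement with the abelian rule (17.3).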

This, in particular, shows from (17.4) how the basis vectors, and hence the orientations of the locally set up coordinate systems, in general change under such transformations as we move from one spacetime point to a neighboring one. Clearly the N components of a field $\chi^\top(x) = (\chi_1(x), \ldots, \chi_N(x))$ will vary from one spacetime point to another due to the generally different orientations of the coordinate systems set up at each point, implying different mixing between the components of $\chi(x)$. One may hold the space associated with the internal degrees of freedom (charge, isospin, color, …) responsible for such changes incurred by the field by introducing, in the process, the concept of the curvature of such a space, locally defined at each spacetime point, as a measure of the changes it manifests. This gives rise to a geometrical way of introducing fields that interact with charges and other physical entities corresponding to other internal degrees of freedom. The concept of curvature may be introduced in the following way. We first define the gauge-covariant derivative of $\chi(x)$ along a Lorentz vector $V^\mu$ by $V^\mu\nabla_\mu\chi(x)$. Consider the transfer of $\chi(x)$ along the closed path as shown in the set of equations below as it is evaluated at different points:


(17.11)

From which the following expression emerges for the change in twisting by going around the closed path

$$\chi_F(x_1) - \chi_I(x_1) = \delta\xi_1\,\delta\xi_2\; v_1^\mu\, v_2^\nu\,[\nabla_\mu,\nabla_\nu]\,\chi_I(x_1), \tag{17.12}$$

$$R_{\mu\nu}(x) = i\,[\nabla_\mu,\nabla_\nu] = \partial_\mu\Gamma_\nu(x) - \partial_\nu\Gamma_\mu(x) + \frac{1}{i}\left[\Gamma_\mu(x)\Gamma_\nu(x) - \Gamma_\nu(x)\Gamma_\mu(x)\right], \tag{17.13}$$

where $R_{\mu\nu}$ is the curvature responsible for the change of the state of $\chi_I(x)$ by going around a closed path. From the transformation law of $\Gamma_\mu$ in (17.10), we may then infer the following simple transformation law for $R_{\mu\nu}(x)$

$$R_{\mu\nu}(x) \to V(x)\,R_{\mu\nu}(x)\,V^{-1}(x), \qquad \mathrm{Tr}\left[R^{\mu\nu}(x)R_{\mu\nu}(x)\right] \to \mathrm{Tr}\left[R^{\mu\nu}(x)R_{\mu\nu}(x)\right], \tag{17.14}$$

where $\mathrm{Tr}(\cdot) = \sum_{i=1}^{N}(\cdot)_{ii}$, with the second relation showing an important invariance property which is a key equation for the construction of symmetric theories. The unitary matrices $V(x)$ we will consider are of the form

$$V^{i}{}_{j}(x) = \left[\exp\left(i\,\boldsymbol{\varphi}(x)\cdot\mathbf{t}\right)\right]^{i}{}_{j}, \qquad \boldsymbol{\varphi}(x) = (\varphi_1(x), \ldots, \varphi_M(x)), \quad i, j = 1, \ldots, N, \tag{17.15}$$

where the indices $i, j$ specify the various components of a field in internal space, with M matrices $t_a$: $\mathbf{t} = (t_1, \ldots, t_M)$, represented by N × N Hermitian matrices, referred to as generators of the transformation, whose basic properties will be spelled out below, and where $\boldsymbol{\varphi}(x)$ is real and, as indicated, may depend on the spacetime point $x$. We may also consider unitary matrices, as above, multiplied by a phase factor $\exp[\,i\theta(x)\,]$. This will be important, for example, in the electroweak theory needed to describe a quantum number referred to as weak hypercharge. In such cases, we will have a product of two groups denoted by G × U(1), where G denotes the group of transformations in (17.15). A group property satisfied by the matrices in (17.15) is given by

$$\exp\left(i\,\boldsymbol{\varphi}_1(x)\cdot\mathbf{t}\right)\exp\left(i\,\boldsymbol{\varphi}_2(x)\cdot\mathbf{t}\right) = \exp\left(i\,\boldsymbol{\varphi}(x)\cdot\mathbf{t}\right), \tag{17.16}$$

for given $\boldsymbol{\varphi}_1(x)$, $\boldsymbol{\varphi}_2(x)$. The general expression of the commutation relations satisfied by the generators $t_a$ is derived in Box 17.1, and with a proper normalization condition of the trace of a product of two generators, we have

$$[\,t_a, t_b\,] = i\,f_{abc}\,t_c, \qquad \mathrm{Tr}\,(t_c\,t_d) = \frac{1}{2}\,\delta_{cd}, \tag{17.17}$$

where the numerical coefficients $f_{abc}$ are referred to as structure constants. By multiplying the above commutation relation by $t_d$, taking the trace of the resulting expression, and using the normalization condition, we obtain

$$\mathrm{Tr}\left(t_d\,[t_a, t_b]\right) = \frac{i}{2}\,f_{abd}, \tag{17.18}$$

which, by the cyclic property of the trace, implies that the structure constants $f_{abc}$ are totally anti-symmetric in their indices. Hermiticity of the matrices, $t_a^\dagger = t_a$, and the fact, as seen from the commutation relations in (17.17), that $\mathrm{Tr}\,(t_a) = 0$, imply, from the relation $\det A = \exp[\,\mathrm{Tr}\ln A\,]$, that

$$\det\left[\exp\left(i\,\boldsymbol{\varphi}\cdot\mathbf{t}\right)\right] = \exp\left[i\,\varphi_a\,\mathrm{Tr}(t_a)\right] = 1. \tag{17.19}$$

The underlying group involves N × N unitary matrices of determinant one. Such a group is referred to as the SU(N) group, where U stands for unitary and S for special, i.e., of determinant one. This group involves $M = N^2 - 1$ (Hermitian) traceless matrices $t_a$, i.e., $a = 1, 2, \ldots, (N^2 - 1)$. Moreover, $N - 1$ of these are in diagonal form. The number $N - 1$ is referred to as the rank of the group.³ Generators of SU(2), SU(3) are given in Box 17.2. The underlying theories invariant under such a group, involving non-commuting generators, are referred to as non-abelian gauge theories. A non-abelian gauge field $A_\mu$, referred to as a Yang-Mills or as a Yang-Mills-Shaw field,⁴ with corresponding field strength $G_{\mu\nu}(x)$, may be defined in terms of the connection in (17.4) and the curvature in (17.13), respectively, by introducing, in the process, a coupling parameter $g_o$:

$$A_\mu(x) = \frac{1}{g_o}\,\Gamma_\mu(x), \qquad \nabla_\mu = \partial_\mu - i\,g_o A_\mu(x), \tag{17.20}$$

$$G_{\mu\nu} = \frac{1}{g_o}\,R_{\mu\nu}(x) = \partial_\mu A_\nu(x) - \partial_\nu A_\mu(x) - i\,g_o\,[A_\mu, A_\nu]. \tag{17.21}$$

From a geometrical setting, this provides a way of introducing vector fields, as dynamical variables, that interact with charge and physical entities corresponding to other internal degrees of freedom carried by a given field $\chi(x)$, referred to as a matter field, with $g_o$ defining an overall coupling parameter between the fields. The vector fields may themselves possess internal degrees of freedom, and the geometrical description with an associated curvature in charge (internal degrees of freedom) space produces, naturally and in a simple way, means to describe interactions between these curious entities we call gauge fields, all with the same coupling parameter $g_o$. In later treatment, we consider the parametrization ϕ(x) = go (x) . The field $\chi$, and the vector field $A_\mu(x)$ together with the given field strength $G_{\mu\nu}(x)$, have from (17.9), (17.10)/(17.20) and (17.14)/(17.21), the following transformations

$$\chi(x) \to V(x)\,\chi(x), \qquad A_\mu(x) \to V(x)\left[A_\mu(x) + \frac{i}{g_o}\,\partial_\mu\right]V^{-1}(x), \tag{17.22}$$

$$G_{\mu\nu}(x) \to V(x)\,G_{\mu\nu}(x)\,V^{-1}(x), \qquad \mathrm{Tr}\left[G^{\mu\nu}(x)G_{\mu\nu}(x)\right] \to \mathrm{Tr}\left[G^{\mu\nu}(x)G_{\mu\nu}(x)\right]. \tag{17.23}$$

Of particular interest are infinitesimal transformations of the fields $\chi(x)$, $A_\mu(x)$, which are obtained from the above transformations to be

$$\chi(x) \to \chi(x) + i\,\delta\varphi_a(x)\,t_a\,\chi(x), \tag{17.24}$$

$$A_\mu(x) \to A_\mu(x) + \frac{1}{g_o}\,\partial_\mu\delta\varphi_a(x)\,t_a + i\,\delta\varphi_a(x)\,[\,t_a, A_\mu(x)\,]. \tag{17.25}$$

Box 17.1 Commutation relations of the generators $t_a$

Quite generally $\exp(i\,\boldsymbol{\varphi}_1\cdot\mathbf{t})\exp(i\,\boldsymbol{\varphi}_2\cdot\mathbf{t}) = \exp(i\,\boldsymbol{\varphi}\cdot\mathbf{t})$. In particular, for any two components $t_a$, $t_b$ of $\mathbf{t}$, and a real number ξ:

$$\exp(-i\xi t_b)\,\exp(-i\xi t_a)\,\exp(i\xi t_b)\,\exp(i\xi t_a) = \exp(i\,\varphi_c\,t_c).$$

Clearly the coefficients $\varphi_c$ depend on $t_a$, $t_b$ and → 0 for ξ → 0. Using the classic Baker-Campbell-Hausdorff formula: $\exp A\,\exp B = \exp[\,A + B + C\,]$, with

$$C = \frac{1}{2}[A,B] + \frac{1}{12}[A,[A,B]\,] + \frac{1}{12}[B,[B,A]\,] + \cdots,$$

gives

$$\exp\left[-(i\xi)^2\,[t_a,t_b] + O(\xi^3) + \cdots\right] = \exp(i\,\varphi_c\,t_c),$$

leading to $\varphi_c = \xi^2 f_{abc} + \cdots$ for any given pair $(a, b)$, with a choice of real numerical coefficients $f_{abc}$, and for ξ → 0, the following commutation relations emerge

• $[\,t_a, t_b\,] = i\,f_{abc}\,t_c$.

³ For details on the explicit construction of generators of the group, see: Manoukian [3], pp. 379–382, p. 503.
⁴ Yang and Mills [5]; and work done independently by Shaw [4].

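Box 17.2 Generators of SU(2) and SU(3)

Generators of SU(2) are given by: $t_a = \sigma_a/2$, where $\sigma_1, \sigma_2, \sigma_3$ are the Pauli matrices:

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \varepsilon_{123} = +1, \text{ one diagonal matrix.}$$

For SU(3) they are given by: $t_a = \lambda_a/2$, $a = 1, \ldots, 8$, where $\lambda_a$ are the Gell-Mann matrices [1]:

$$\lambda_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda_2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$

$$\lambda_4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad \lambda_5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, \quad \lambda_6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},$$

$$\lambda_7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad \lambda_8 = \frac{1}{\sqrt{3}}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}, \qquad \mathrm{Tr}\,[\lambda_a\lambda_b] = 2\,\delta_{ab}.$$

The structure constants $f_{abc}$ in $[t_a, t_b] = i\,f_{abc}\,t_c$ are easily evaluated and are given by:

$$f_{123} = 1, \quad f_{147} = -f_{156} = f_{246} = f_{257} = f_{345} = -f_{367} = \frac{1}{2}, \quad f_{458} = f_{678} = \frac{\sqrt{3}}{2}.$$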
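The quoted structure constants can be reproduced numerically from the Gell-Mann matrices. The following plain-Python sketch assumes the normalization Tr(t_c t_d) = δ_cd/2 of (17.17), from which f_abc = −2i Tr(t_c [t_a, t_b]); the function names are illustrative and not taken from the text.

```python
def mat(rows):
    """Convert a nested list to complex entries."""
    return [[complex(x) for x in row] for row in rows]

# Gell-Mann matrices lambda_1 ... lambda_8 of Box 17.2
LAMBDA = [
    mat([[0, 1, 0], [1, 0, 0], [0, 0, 0]]),
    mat([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]),
    mat([[1, 0, 0], [0, -1, 0], [0, 0, 0]]),
    mat([[0, 0, 1], [0, 0, 0], [1, 0, 0]]),
    mat([[0, 0, -1j], [0, 0, 0], [1j, 0, 0]]),
    mat([[0, 0, 0], [0, 0, 1], [0, 1, 0]]),
    mat([[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]),
    [[x / 3 ** 0.5 for x in row]
     for row in mat([[1, 0, 0], [0, 1, 0], [0, 0, -2]])],
]

# Generators t_a = lambda_a / 2
T = [[[x / 2 for x in row] for row in lam] for lam in LAMBDA]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def structure_constant(a, b, c):
    """f_abc (1-based labels) from f_abc = -2i Tr(t_c [t_a, t_b])."""
    ta, tb, tc = T[a - 1], T[b - 1], T[c - 1]
    ab, ba = matmul(ta, tb), matmul(tb, ta)
    comm = [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]
    return (-2j * trace(matmul(tc, comm))).real

for labels in [(1, 2, 3), (1, 4, 7), (1, 5, 6), (4, 5, 8), (6, 7, 8)]:
    print(labels, round(structure_constant(*labels), 6))
```

Looping over all index triples with the same function also gives a direct numerical check of the total antisymmetry of f_abc.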

From the second and third terms on the right-hand side of (17.25), we make use of the matrix nature of $A_\mu(x)$ to write ($f_{abc} = f_{cab}$)

$$A_\mu(x) = A^c_\mu(x)\,t_c, \qquad A^c_\mu(x) \to A^c_\mu(x) + \frac{1}{g_o}\,\partial_\mu\delta\varphi_c(x) - f_{cab}\,\delta\varphi_a(x)\,A^b_\mu(x). \tag{17.26}$$

Here we see that in addition to a rather familiar change $\partial_\mu\delta\varphi_c(x)/g_o$ encountered in electrodynamics, various internal degrees of freedom are associated with $A^a_\mu(x)$ as well. The field strength $G_{\mu\nu}$ may then be rewritten as

$$G_{\mu\nu}(x) = G^c_{\mu\nu}(x)\,t_c, \tag{17.27}$$

$$G^c_{\mu\nu}(x) = \partial_\mu A^c_\nu(x) - \partial_\nu A^c_\mu(x) + g_o\,f_{cab}\,A^a_\mu(x)\,A^b_\nu(x). \tag{17.28}$$

One may set up a so-called kinetic energy term for the gauge vector fields, and, due to the invariance property on the right-hand side of (17.23), one is bound to consider an expression like

$$-\frac{1}{2}\,\mathrm{Tr}\left[G_{\mu\nu}(x)\,G^{\mu\nu}(x)\right] = -\frac{1}{2}\,\mathrm{Tr}\,(t_a t_b)\;G^a_{\mu\nu}(x)\,G^{b\,\mu\nu}(x) = -\frac{1}{4}\,G^a_{\mu\nu}(x)\,G^{a\,\mu\nu}(x), \tag{17.29}$$

which automatically involves the self coupling of the gauge field itself. The field strength $G^a_{\mu\nu}$ replaces the Faraday tensor $F_{\mu\nu}$ of electrodynamics in non-abelian gauge theories.

References

1. Gell-Mann, M. (1962). Symmetries of baryons and mesons. Physical Review, 125, 1067–1084.
2. Heisenberg, W. (1932). Über den Bau der Atomkerne. Zeitschrift für Physik, 77, 1–11.
3. Manoukian, E. B. (2016). Quantum field theory I: Foundations and abelian and non-abelian gauge theories. Dordrecht: Springer.
4. Shaw, R. (1955). The problem of particle types and other contributions to the theory of elementary particles. Ph.D. Thesis. Cambridge University.
5. Yang, C. N., & Mills, R. L. (1954). Conservation of isotopic spin and isotopic gauge invariance. Physical Review, 96, 191–195.

18 Particles & Symmetries

Prerequisite: SU(2), SU(3) in Chap. 17

An overwhelming number of particles have been observed since the discovery of the electron by Joseph Thomson in 1897. These include the hydrogen nucleus, observed to play a fundamental role in atomic structure, which Ernest Rutherford in 1919 referred to as the proton; the neutron, discovered by James Chadwick in 1932; and the positron, discovered by Carl Anderson in 1933,¹ after having been observed earlier, in 1929, by Chung-Yao Chao² in work that was not pursued to its conclusion.³ It eventually became important to develop a classification scheme, with an underlying symmetry, to explain such an overwhelming number of particle states. The dust settled when Gell-Mann and, independently, Ne’eman⁴ in 1961 introduced what is referred to as the eightfold way and realized that particles with the same spin J and the same intrinsic parity may be accommodated in certain multiplets. This chapter deals with the particles observed in nature and their classification. Once such particles are introduced, one may then use the concept of fields, introduced earlier, to develop field theoretical models to describe their interactions. This will be carried out in later chapters. In order to express the charge Q of a particle in terms of its isospin component $I_3$, it was necessary to introduce a quantum number Y, referred to as the hypercharge, and write

$$Q = I_3 + \frac{Y}{2}. \tag{18.1}$$
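Relation (18.1), combined with Y = B + S, can be checked against the standard charge, isospin and strangeness assignments of the spin 1/2 baryons and spin 0 mesons discussed in this chapter. The snippet below is a small sketch; the particle labels and the tuple layout are illustrative choices.

```python
from fractions import Fraction as F

# (name, charge Q, I3, baryon number B, strangeness S)
HADRONS = [
    ("p", 1, F(1, 2), 1, 0), ("n", 0, F(-1, 2), 1, 0),
    ("Sigma+", 1, 1, 1, -1), ("Sigma0", 0, 0, 1, -1), ("Sigma-", -1, -1, 1, -1),
    ("Lambda0", 0, 0, 1, -1),
    ("Xi0", 0, F(1, 2), 1, -2), ("Xi-", -1, F(-1, 2), 1, -2),
    ("K+", 1, F(1, 2), 0, 1), ("K0", 0, F(-1, 2), 0, 1),
    ("pi+", 1, 1, 0, 0), ("pi0", 0, 0, 0, 0), ("pi-", -1, -1, 0, 0),
    ("eta", 0, 0, 0, 0),
    ("K0bar", 0, F(1, 2), 0, -1), ("K-", -1, F(-1, 2), 0, -1),
]

def charge(i3, b, s):
    """Q = I3 + Y/2 with hypercharge Y = B + S."""
    return i3 + F(b + s, 2)

for name, q, i3, b, s in HADRONS:
    print(f"{name:8s} Y = {b + s:+d}  Q = {charge(i3, b, s)}  matches: {charge(i3, b, s) == q}")
```

Exact rational arithmetic via `Fraction` avoids any rounding when half-integer isospin components are added.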

For example,⁵ for the spin 1/2 hadrons $(p, n)$; $(\Sigma^+, \Sigma^0, \Sigma^-)$; $(\Lambda^0)$; $(\Xi^0, \Xi^-)$, consisting, respectively, of an isospin doublet, a triplet, a singlet and a doublet, the hypercharges are Y = 1, 0, 0, −1. As a baryon number B = 1 is associated with these particles,⁶ it has become customary to rewrite the hypercharge in terms of a new quantum number S, referred to as strangeness, and write Y = B + S = 1 + S. For the baryons just mentioned, then, S = 0, −1, −1, −2. On the other hand, mesons⁷ have baryon number equal to zero, and for mesons of spin J = 0 and intrinsic parity −1: $(K^+, K^0)$; $(\pi^+, \pi^0, \pi^-)$; $(\eta)$; $(\eta')$; $(\bar{K}^0, K^-)$, Y = S = +1, 0, 0, 0, −1, respectively. By introducing a new quantum number S in addition to the isospin, one necessarily had to enlarge the symmetry group SU(2) of isospin to SU(3), introduced in Chap. 17, to accommodate the new quantum number. It was quite remarkable that all these particles fell nicely on geometrical figures consisting of graphs plotted with Y

¹ Anderson [2].
² C.-Y. Chao was a graduate student at Caltech, and a Ph.D. student of Robert Millikan.
³ See Mehra and Rechenberg [17], p. 804. See also Anderson and Anderson [3]. It is rather surprising that Chao did not share the Nobel Prize with Anderson.
⁴ Gell-Mann [11]; Ne’eman [18]. Although Gell-Mann was awarded the Nobel Prize for his work on the classification of particles, it is rather unfortunate that Ne’eman was left out even after Feynman nominated him and Gell-Mann. Credit should also be given to Shoichi Sakata [22], who did the earlier pioneering work on particle classification and chose the proton, the neutron and the Lambda baryon, instead of quarks, as the building blocks for the composition of hadrons, a choice which did not quite work.
⁵ For a massive particle, its spin is defined as the angular momentum in its rest frame. Since the orbital angular momentum is perpendicular to the momentum of a particle, the spin of a massless particle may be defined in terms of its helicity as the projection of the angular momentum along its momentum.
⁶ A baryon is a strongly interacting particle of half-odd integer spin such as the proton and the neutron. As we will see, it may be described as being a composite of three quarks. A baryon number +1 is associated with a baryon and −1 with its antiparticle. Baryon number is conserved. At very high energies, ∼10¹⁶ GeV and above, GUTs, however, predict, for example, that a proton may decay with a lifetime much larger than the age of the Universe. The latter will be discussed in Chap. 37.
⁷ A meson is a strongly interacting particle of integer spin and of zero baryon number. As we will see, it is described as being a composite of a quark and an antiquark.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_18

[Figure: three panels, labelled J = 0, J = 1/2 and J = 3/2, each plotting S against I3.]

Fig. 18.1 Meson octet (J = 0), and a singlet, baryon octet (J = 1/2), and baryon decuplet (J = 3/2)

versus $I_3$ or equally for S versus $I_3$. The meson octet of spin J = 0, together with a singlet,⁸ and the baryon octet of spin J = 1/2, as well as the baryon decuplet of spin J = 3/2, are shown in Fig. 18.1, where the $\Sigma^*$, $\Xi^*$ are more massive than the $\Sigma$, $\Xi$. Approximate empirical relations may be given for the masses of the particles for given isospin multiplets, for the meson octet, the baryon octet, as well as for the baryon decuplet, respectively,

$$M_K^2 = \frac{3M_\eta^2 + M_\pi^2}{4}, \qquad M_\Xi + M_N = \frac{3M_\Lambda + M_\Sigma}{2}, \qquad M = M_0 + M_1 Y, \tag{18.2}$$

where $M_N$ denotes an average of the p and n masses, and for the decuplet of masses M, we simply have a linear equation in Y.

The SU(3) symmetry, introduced in Chap. 17, has among its representations a fundamental triplet representation, denoted by $3$, as well as an anti-triplet, denoted by $\bar{3}$, in addition to the octet and decuplet representations. The particles associated with the triplet representation⁹ are denoted by u, d and s, standing for up, down and strange, which Gell-Mann referred to as quarks:

⁸ The singlet may be identified as a linear combination of the η and η′ particle states, see Tanabashi [23].
⁹ Gell-Mann [12]; Zweig [25]. The triplet, consisting of only three entities (quarks), is referred to as the fundamental multiplet as all other multiplets, or groupings, can be made from the triplet.

[Figure: the quark triplet plotted as S against I3, with d at I3 = −1/2, S = 0; u at I3 = +1/2, S = 0; and s at I3 = 0, S = −1.]

while Zweig referred to them as aces. Their respective charges are 2/3, −1/3, −1/3 in units of the proton charge, with strangeness S = 0, 0, −1. The entire SU(3) family of particles may be constructed out of these three quarks, and the corresponding model is called the quark model. Of course, other quarks have been discovered since the quark model was originally introduced: the charm c, the bottom b and the top t quarks. The next chapter is devoted to discussing how the s, c, b, t quarks came about, weaving the underlying fabric of particle physics.

The standard model (SM) of particle physics consists of the electroweak theory, unifying the electromagnetic and weak interactions, as well as the modern theory of strong interactions, referred to as quantum chromodynamics. The standard model contains six leptons (ℓ) and the six quarks (q) mentioned above, u, d, c, s, t, b, referred to as six flavors of leptons and six flavors of quarks, with each of the quarks coming in three different colors, color being an additional quantum number for quarks, the need for which will be discussed below. The SM contains as well their antiparticles, the mediators consisting of vector bosons, and in addition at least one Higgs boson, which will be considered in detail later. The classifications of leptons and quarks are spelled out in Table 18.1, and they fall into three generations shown vertically. Here $L_\ell$ stands for the lepton number associated with lepton ℓ.¹⁰ On the other hand, S, C, B, T stand, respectively, for the quantum numbers strangeness, charm, bottomness (or beauty) and topness (or truth) assigned to the quarks. Note that the signs of the non-vanishing values of S, C, B, T are taken to coincide with the signs of the charges of the corresponding quarks. The unit of charge is taken to be that of the proton, |e|. The signs of all these quantum numbers are reversed for the antiparticles. The baryon number B of each quark is 1/3 and that of each antiquark is −1/3. All the leptons and quarks are of spin 1/2.

The mediators consist of the photon γ for the electromagnetic force, the massive vector bosons $W^\pm$, $Z^0$ (of approximate masses 80.4 GeV/c², 91.2 GeV/c²) for the weak force, and eight massless vector bosons, referred to as gluons, which also carry color, for the strong interaction between the quarks and for the interactions between themselves. The underlying theory of the interactions of quarks and gluons is referred to as quantum chromodynamics and is considered in Chap. 32. Moreover, the Higgs boson, as the excitation of the Higgs field which gives masses to the massive particles, is a scalar, i.e., of spin 0, of approximate mass 125.1 GeV/c². The Higgs field will be studied in Chap. 34, while Chap. 19 deals with the important question of how the s, c, b, t quarks came about and emerged in the underlying fabric of elementary particles. The vector bosons $W^\pm$, $Z^0$ were introduced in the development of the modern theory of the weak (electroweak) interaction, stemming from Fermi’s original theory, in the “image” of QED, where the photon of the latter theory is replaced by these vector bosons in the weak interactions. They will be discussed in Chaps. 33–35 in the electroweak theory, which unifies electrodynamics and the weak interaction. These massive vector bosons have all been observed experimentally.¹¹

The up u and down d quarks belong to an isotopic doublet I = 1/2 with components $I_3 = \pm 1/2$, respectively. The concept of isotopic spin¹² (or isospin) was introduced by Werner Heisenberg in 1932 in order to describe the approximate charge independent nature of the strong interaction between the proton and neutron, for which the proton p and neutron n may be considered as two states of the same particle of isospin I = 1/2, with the corresponding states identified, respectively, by the quantum numbers $I_3 = \pm 1/2$, in analogy to the description of the spin states of a particle of spin 1/2.
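Since charge, isospin component, strangeness and baryon number are additive, the quantum numbers of a hadron follow from its quark content. The sketch below uses the u, d, s values quoted in this chapter; writing an antiquark as an uppercase letter is a notation invented here for the example, and the quark assignments of the hadrons in the loop are the standard ones rather than something stated in this section.

```python
from fractions import Fraction as F

# Q, I3, S and baryon number B of the light quarks (values from the text)
QUARKS = {
    "u": {"Q": F(2, 3), "I3": F(1, 2), "S": F(0), "B": F(1, 3)},
    "d": {"Q": F(-1, 3), "I3": F(-1, 2), "S": F(0), "B": F(1, 3)},
    "s": {"Q": F(-1, 3), "I3": F(0), "S": F(-1), "B": F(1, 3)},
}

def quantum_numbers(content):
    """Sum the additive quantum numbers of a quark content string.

    Lowercase letters are quarks; uppercase letters stand for the
    corresponding antiquarks (hypothetical notation), whose quantum
    numbers all change sign.
    """
    total = {"Q": F(0), "I3": F(0), "S": F(0), "B": F(0)}
    for symbol in content:
        sign = -1 if symbol.isupper() else 1
        for key, value in QUARKS[symbol.lower()].items():
            total[key] += sign * value
    return total

for name, content in [("p", "uud"), ("n", "udd"), ("Delta++", "uuu"),
                      ("Omega-", "sss"), ("pi+", "uD"), ("K+", "uS")]:
    qn = quantum_numbers(content)
    # Consistency with Q = I3 + (B + S)/2
    ok = qn["Q"] == qn["I3"] + (qn["B"] + qn["S"]) / 2
    print(name, {k: str(v) for k, v in qn.items()}, "Q = I3 + Y/2:", ok)
```

The same additive rule is what fixes the quark charges in the first place: three quarks must reproduce the integer charges of the baryons, which is not possible with charges 3/2, −1/3, −1/3 but works with 2/3, −1/3, −1/3.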
The composite nature of the nucleons, as having an underlying structure composed of point-like quarks as well as gluons, became evident from the deep inelastic scattering data13 at SLAC, where electrons were fired, e.g., at protons.14 It is interesting to note that about 50% of the momentum of the proton is carried by the gluons.15 Because the quarks may emit gluons which, in turn, may emit quark pairs, the collection of all particles confined to the nucleons has been referred to as partons by Feynman.16 The need to introduce colors for the quarks, as an additional distinguishing feature, arose for several reasons. Let us first introduce three colors for each quark, say red (R), blue (B) and green (G),17 and anti-red ($\bar R$),

10 The observation that there are different types of neutrinos was made by Danby et al. [8]. The tau lepton τ was discovered by Perl et al. [20].
11 See, Rubbia [21].
12 Heisenberg [15].
13 Bloom et al. [6]; Breidenbach et al. [7].
14 For a detailed treatment of deep inelastic scattering see: Manoukian [16], pp. 429–456.
15 For an explicit derivation of this see: Manoukian [16], p. 456.
16 Feynman [9]; Bjorken and Paschos [5].
17 Needless to say, these have nothing to do with color in optics.

Table 18.1 Classification of leptons and quarks and their (approximate) masses in MeV/c^2: S stands for strangeness, C for charm, B for beauty, and T for truth

Lepton ℓ (mass)         Q      L_e    L_μ    L_τ
ν_e (< 2 × 10^-6)       0      1      0      0
e (0.511)               −1     1      0      0
ν_μ (< 0.19)            0      0      1      0
μ (106)                 −1     0      1      0
ν_τ (< 18.2)            0      0      0      1
τ (1777)                −1     0      0      1

Quark q (mass)          Q      I_3    S      C      B      T
u (2.2)                 +2/3   +1/2   0      0      0      0
d (4.7)                 −1/3   −1/2   0      0      0      0
c (1280)                +2/3   0      0      1      0      0
s (96)                  −1/3   0      −1     0      0      0
t (173100)              +2/3   0      0      0      0      1
b (4180)                −1/3   0      0      0      −1     0

anti-blue ($\bar B$) and anti-green ($\bar G$) for the antiquarks. Free quarks have never been observed, and one assumes that hadrons are colorless, that is, they are bound states of a red, a blue and a green quark, or bound states of a red/anti-red, a blue/anti-blue or a green/anti-green quark-antiquark pair. This suggests defining three-component vectors $(A_1, A_2, A_3)$, for each quark A, in a three-dimensional space, with 1 identified with red, 2 with blue and 3 with green. Colorless invariants may then be defined as $A_i \bar B^i$ or $\varepsilon^{ijk} A_i B_j C_k$ for specifying the quark contents of hadrons, where $\varepsilon^{ijk}$ is totally anti-symmetric with $\varepsilon^{123} = +1$. Since every quark carries spin 1/2, the first invariants define bosons (spins 0 or 1), while the second define baryons (spins 1/2, 3/2). The particle $\Delta^{++}$ is a spin 3/2 particle carrying two units of charge. A quark content for it may be defined by $\Delta^{++} = (uuu)$. This particle, however, must satisfy the Pauli exclusion principle (Fermi–Dirac statistics), and its state must be anti-symmetric under the interchange of any two u quarks. Accordingly, a detailed description of its quark content necessitates introducing the additional identifying label (color) for quarks, and we write

$$|\Delta^{++}\rangle = \frac{1}{\sqrt{6}}\, \varepsilon^{ijk}\, |u_i\, u_j\, u_k\rangle. \tag{18.3}$$
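The anti-symmetry of the color factor in Eq. (18.3) can be made explicit with a short enumeration. The following is a minimal sketch, assuming the index convention of the text (1 = red, 2 = blue, 3 = green); the helper names `permutation_sign`, `swap` and `color_state` are illustrative only, and the overall sign of the state is a matter of convention.

```python
from itertools import permutations

COLORS = ("R", "B", "G")  # index 1 -> red, 2 -> blue, 3 -> green, as in the text


def permutation_sign(perm):
    """Sign of a permutation of (0, 1, 2), i.e. the Levi-Civita symbol."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign


# Unnormalized color wavefunction: sum over eps^{ijk} |c_i c_j c_k>.
# Keys are color strings such as "RBG"; values are the coefficients +1 or -1.
color_state = {
    "".join(COLORS[i] for i in perm): permutation_sign(perm)
    for perm in permutations(range(3))
}


def swap(term, a, b):
    """Exchange the colors carried by quarks a and b in one term."""
    letters = list(term)
    letters[a], letters[b] = letters[b], letters[a]
    return "".join(letters)


# Interchanging any two quarks flips the sign of every coefficient.
for a, b in ((0, 1), (0, 2), (1, 2)):
    for term, coefficient in color_state.items():
        assert color_state[swap(term, a, b)] == -coefficient

print(color_state)
```

The six terms each appear with weight ±1, so the normalization factor is $1/\sqrt{6}$, as in Eq. (18.3).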

Since the color structure $|RBG - RGB + BGR - BRG + GRB - GBR\rangle/\sqrt{6}$, as obtained from above, must be common to all baryons, and is anti-symmetric, the [space × spin × flavor] part of the wavefunction of a baryon must be symmetric. The gluons, like the quarks, are not observed as free particles, at least at present energies. Moreover, a gluon may emit a quark and an antiquark, or it may be coupled to an incoming quark (antiquark) and an outgoing quark (antiquark). Accordingly, a gluon carries color in bilinear color/anti-color form. The state invariant under $SU_c(3)$ transformations in color (c) space, $|R\bar R + B\bar B + G\bar G\rangle/\sqrt{3}$, a bilinear color/anti-color form which is nothing but a scalar product of the form $\sum_c c\,\bar c/\sqrt{3}$, is colorless, which would mean that it is unconfined and that the corresponding gluon should appear as a free particle. Accordingly, it is omitted, as it is not observed: the strong force is short-ranged. This leaves 8 possible color states that one may define for gluons, which correspond to the 8 generators of color $SU_c(3)$ discussed in Chap. 17 for the SU(3) symmetry. We recall from Eq. (17.26) and Box 17.1, Chap. 17, that a gluon, as a Yang–Mills field, is expressed as a 3 × 3 matrix $A_\mu = A_\mu^a\, t_a$, where $t_a$ are generators of the $SU_c(3)$ group, $t_a = \lambda_a/2$, with the $\lambda_a$ denoting the Gell-Mann matrices introduced in Chap. 17. In Box 18.1, the Gell-Mann matrices are expressed in terms of color states of quarks/antiquarks, from which an orthonormal set of color/anti-color states for gluons is readily extracted. The necessity of introducing three colors for quarks also comes from experiments comparing the cross section for the process $e^+e^- \to$ hadrons to the cross section for $e^+e^- \to \mu^+\mu^-$ at high energies. With single photon exchange, $e^+e^- \to \gamma \to q\bar q$, $e^+e^- \to \gamma \to \mu^+\mu^-$, and neglecting all the masses involved at sufficiently high energies, the ratio of the cross sections arises simply as $\sigma(e^+e^- \to q\bar q)/\sigma(e^+e^- \to \mu^+\mu^-) = Q_q^2 |e|^2/|e|^2 = Q_q^2$, since each of the cross sections is then proportional to the charge squared of the relevant particles. For example, at a center-of-mass energy of 10.52 GeV,


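Box 18.1 Quarks/antiquarks and gluons color states: The Gell-Mann matrices expressed in terms of quark/antiquark color states in color space

We may define color states of the quarks by

$$|R\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad |B\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad |G\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},$$

and for the antiquarks $\langle\bar R| = (1\ 0\ 0)$, $\langle\bar B| = (0\ 1\ 0)$, $\langle\bar G| = (0\ 0\ 1)$. The Gell-Mann matrices in Box 17.2, Chapter 17, may be rewritten as:

$$\lambda_1 = |R\rangle\langle\bar B| + |B\rangle\langle\bar R|, \qquad \lambda_2 = -i\big(|R\rangle\langle\bar B| - |B\rangle\langle\bar R|\big), \qquad \lambda_3 = |R\rangle\langle\bar R| - |B\rangle\langle\bar B|,$$
$$\lambda_4 = |R\rangle\langle\bar G| + |G\rangle\langle\bar R|, \qquad \lambda_5 = -i\big(|R\rangle\langle\bar G| - |G\rangle\langle\bar R|\big), \qquad \lambda_6 = |B\rangle\langle\bar G| + |G\rangle\langle\bar B|,$$
$$\lambda_7 = -i\big(|B\rangle\langle\bar G| - |G\rangle\langle\bar B|\big), \qquad \lambda_8 = \big(|R\rangle\langle\bar R| + |B\rangle\langle\bar B| - 2|G\rangle\langle\bar G|\big)/\sqrt{3}.$$

One may then define the following 8 color states for the gluons:

$$|1\rangle = (R\bar B + B\bar R)/\sqrt{2}, \qquad |2\rangle = -i(R\bar B - B\bar R)/\sqrt{2}, \qquad |3\rangle = (R\bar R - B\bar B)/\sqrt{2},$$
$$|4\rangle = (R\bar G + G\bar R)/\sqrt{2}, \qquad |5\rangle = -i(R\bar G - G\bar R)/\sqrt{2}, \qquad |6\rangle = (B\bar G + G\bar B)/\sqrt{2},$$
$$|7\rangle = -i(B\bar G - G\bar B)/\sqrt{2}, \qquad |8\rangle = (R\bar R + B\bar B - 2G\bar G)/\sqrt{6}.$$

The invariant colorless state corresponding to $(R\bar R + B\bar B + G\bar G)/\sqrt{3}$ is omitted.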
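The construction in Box 18.1 can be checked numerically. The sketch below is only illustrative: it assumes the ordering $|R\rangle = (1,0,0)^\top$, $|B\rangle = (0,1,0)^\top$, $|G\rangle = (0,0,1)^\top$, builds each $\lambda_a$ from color/anti-color outer products, and verifies that the eight matrices are traceless (that is, orthogonal to the omitted colorless combination) and satisfy $\mathrm{Tr}(\lambda_a\lambda_b) = 2\delta_{ab}$. The helper names `ket_bra`, `combine` and `trace_of_product` are hypothetical.

```python
from math import sqrt

R, B, G = 0, 1, 2  # assumed ordering of the color basis vectors


def ket_bra(color, anticolor):
    """3x3 matrix |color><anticolor| with a single unit entry."""
    matrix = [[0j] * 3 for _ in range(3)]
    matrix[color][anticolor] = 1 + 0j
    return matrix


def combine(*terms):
    """Linear combination sum_k coefficient_k * matrix_k."""
    total = [[0j] * 3 for _ in range(3)]
    for coefficient, matrix in terms:
        for row in range(3):
            for col in range(3):
                total[row][col] += coefficient * matrix[row][col]
    return total


def trace_of_product(a, b):
    """Tr(a b) for two 3x3 matrices."""
    return sum(a[i][j] * b[j][i] for i in range(3) for j in range(3))


INV_SQRT3 = 1 / sqrt(3)
GELL_MANN = [
    combine((1, ket_bra(R, B)), (1, ket_bra(B, R))),      # lambda_1
    combine((-1j, ket_bra(R, B)), (1j, ket_bra(B, R))),   # lambda_2
    combine((1, ket_bra(R, R)), (-1, ket_bra(B, B))),     # lambda_3
    combine((1, ket_bra(R, G)), (1, ket_bra(G, R))),      # lambda_4
    combine((-1j, ket_bra(R, G)), (1j, ket_bra(G, R))),   # lambda_5
    combine((1, ket_bra(B, G)), (1, ket_bra(G, B))),      # lambda_6
    combine((-1j, ket_bra(B, G)), (1j, ket_bra(G, B))),   # lambda_7
    combine((INV_SQRT3, ket_bra(R, R)), (INV_SQRT3, ket_bra(B, B)),
            (-2 * INV_SQRT3, ket_bra(G, G))),             # lambda_8
]
# Unnormalized colorless combination R Rbar + B Bbar + G Gbar
IDENTITY = combine((1, ket_bra(R, R)), (1, ket_bra(B, B)), (1, ket_bra(G, G)))

for a, lam_a in enumerate(GELL_MANN):
    # Tracelessness: no overlap with the colorless singlet.
    assert abs(trace_of_product(lam_a, IDENTITY)) < 1e-12
    for b, lam_b in enumerate(GELL_MANN):
        expected = 2.0 if a == b else 0.0
        assert abs(trace_of_product(lam_a, lam_b) - expected) < 1e-12

print("8 traceless generators with Tr(la lb) = 2 delta_ab")
```

With this normalization, dividing each $\lambda_a$ by $\sqrt{2}$ gives coefficient matrices of unit norm, which is where the factors $1/\sqrt{2}$ and $1/\sqrt{6}$ in the gluon color states come from.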

only the first 5 quarks may be produced, where $m_b < 5.0$ GeV. Accordingly, if the color of quarks is taken into consideration, together with the charges, we have

$$R = \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)} = 3 \times \left[\left(\tfrac{2}{3}\right)^2 + \left(\tfrac{1}{3}\right)^2 + \left(\tfrac{1}{3}\right)^2 + \left(\tfrac{2}{3}\right)^2 + \left(\tfrac{1}{3}\right)^2\right] = \frac{11}{3}, \tag{18.4}$$

and would be equal to 11/9 without the color factor 3. Experimentally $R \simeq 3.6$,18 consistent with three colors for quarks. Also note that, apart from the top quark t and the Higgs boson, the $Z^0$ vector boson is the most massive particle in the standard model. The lifetimes decrease with increasing masses of the unstable particles: $(\mu): 2.2 \times 10^{-6} > (\tau): 2.9 \times 10^{-13} > (W^\pm): 3.1 \times 10^{-25} > (Z^0): 2.6 \times 10^{-25}$ in seconds. With the mass of the $Z^0$ particle being 91.2 GeV/$c^2$, the $Z^0$ may decay to any lepton pair $\ell\bar\ell$, or to any pair of quarks $q\bar q$ up to, and including, the bottom quark b, that is, up to three generations of quarks. If it were heavier, i.e., heavier than the top quark t, with a shorter lifetime, it could presumably have decayed into a pair of quarks of a fourth generation, signalling the existence of a new generation, and a correlation between the lifetime and the number of quarks of masses less than $m_Z/2$. It is evident that a quark must be of spin 1/2 in order to form baryons. But independently of this, the spin 1/2 character of quarks is consistent with the jet-axis distribution of a two-jet formation19 in the process $e^+e^- \to q\bar q$ in the center-of-mass frame of the $e^+$, $e^-$ system:

[Figure: head-on $e^-$, $e^+$ collision in the center-of-mass frame; θ is the angle between the beam axis and the jet axis.]
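The color counting in Eq. (18.4) is simple enough to reproduce with exact fractions. The following is a minimal sketch, assuming only the quark charges of Table 18.1; the function name `r_ratio` and its `n_colors` parameter are illustrative choices.

```python
from fractions import Fraction

# Quark charges in units of the proton charge |e| (Table 18.1)
QUARK_CHARGES = {
    "u": Fraction(2, 3), "d": Fraction(-1, 3), "s": Fraction(-1, 3),
    "c": Fraction(2, 3), "b": Fraction(-1, 3), "t": Fraction(2, 3),
}


def r_ratio(flavors, n_colors=3):
    """Lowest-order R = sigma(hadrons)/sigma(mu+ mu-) for the given quark flavors."""
    return n_colors * sum(QUARK_CHARGES[f] ** 2 for f in flavors)


accessible = "udscb"  # flavors that can be pair-produced at 10.52 GeV
print(r_ratio(accessible))              # 11/3
print(r_ratio(accessible, n_colors=1))  # 11/9
```

The value 11/3 ≈ 3.67 is close to the measured R, whereas 11/9 ≈ 1.22 is not, which is the argument for three colors made in the text.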

When a quark-antiquark pair is initially produced in such a process, the emerging particles develop bremsstrahlung cascades of narrowly collimated gluons, which in turn lead to the further production of quark-antiquark pairs. These processes continue until all colored particles are combined into color-singlet hadrons, leading to the generation of jets of hadrons at large distances. The formation of hadrons from the $q$, $\bar q$ pair is called hadronization. The jet-axis angular distribution, at high energies, is

18 Ammar et al. [1].
19 Hanson et al. [13].

observed20 to be given by $\propto (1 + \cos^2\theta)$, with the coefficient of $\cos^2\theta$ approximately equal to one. This distribution is readily derived and is given in Box 18.2.21 The charge Q of a quark may be conveniently expressed in the form

$$Q = I_3 + \tfrac{1}{2}\, Y, \qquad Y = \mathcal{B} + S + C + B + T, \tag{18.5}$$

where $\mathcal{B} = \pm 1/3$ is the baryon number of a quark/antiquark (not to be confused with the bottomness B), and Y is referred to as the hypercharge. The above formula is referred to as the Gell-Mann & Nishijima formula.22 Box 18.3 describes how two and three spin 1/2's may be combined, which is quite useful in working out the quark content of mesons and baryons. We recall that the quark content of mesons is given by quark/antiquark, which involves combining two spin 1/2's, while the quark content of baryons is given by quark/quark/quark, which involves combining three spin 1/2's. We recall that in $(u\ d)^\top$, the isotopic spin components of u, d are $I_3 = \pm 1/2$. On the other hand, for the antiparticles $(\bar u\ \bar d)^\top$, the isotopic spin components of $\bar u$, $\bar d$ are $I_3 = \mp 1/2$. To combine the isotopic spins of the particle and antiparticle, we have to rotate $(\bar u\ \bar d)^\top$ by an angle π, and treat them on an equal footing:

$$\exp[-i\pi\sigma_2/2] \begin{pmatrix} \bar u \\ \bar d \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \bar u \\ \bar d \end{pmatrix} = \begin{pmatrix} -\bar d \\ \bar u \end{pmatrix}. \tag{18.6}$$

We may now treat $(u\ d)^\top$ and $(-\bar d\ \bar u)^\top$ as two independent spin 1/2 systems, each containing two states, denoted, respectively, by 2 and 2. Accordingly, Box 18.3 gives

$$(I = 1):\ \text{a triplet} = \begin{cases} -|u\bar d\rangle, & (I_3 = +1) \\ |u\bar u - d\bar d\rangle/\sqrt{2}, & (I_3 = 0) \\ |d\bar u\rangle, & (I_3 = -1) \end{cases} \tag{18.7}$$

$$(I = 0,\ I_3 = 0):\ \text{a singlet} = |u\bar u + d\bar d\rangle/\sqrt{2}, \tag{18.8}$$

represented symbolically as 2 ⊗ 2 = 3 ⊕ 1. The baryon states written in terms of the u and d quarks for the decomposition 2 ⊗ 2 ⊗ 2 = 4 ⊕ 2 ⊕ 2 may be read directly from Box 18.3 by making the replacements ↑ → u, ↓ → d, with $I_3 = \pm 3/2, \pm 1/2$.

Box 18.2 Jet-axis distribution in $e^+e^- \to q\bar q$

Consider the process $e^+(k')\, e^-(k) \to \bar q(p')\, q(p)$. We consider all particles to be unpolarized. Recall that $u$ and $\bar v$ represent the annihilation of a particle and an antiparticle, while $\bar u$ and $v$ represent the creation of a particle and an antiparticle. We may then construct a vector for the initial configuration, $\bar v(k', \sigma')\gamma^\mu u(k, \sigma)$, which will be coupled to a photon. To find the jet-axis distribution, we do not need to consider the photon in the analysis at all. At sufficiently high energies, we may set all the underlying masses equal to zero. To derive the jet-axis distribution, we may use the following equation from Box 16.1 of Chap. 16:

$$m^2 \sum_{\text{spins}} \mathrm{Tr}\big[\, \bar v(k', \sigma')\gamma^\mu u(k, \sigma)\; \bar u(k, \sigma)\gamma^\nu v(k', \sigma') \,\big]\Big|_{m=0} = k^\mu k'^\nu - \eta^{\mu\nu}\, kk' + k^\nu k'^\mu.$$

Similarly, for the final state, we have $p_\mu p'_\nu - \eta_{\mu\nu}\, pp' + p_\nu p'_\mu$. With $\mathbf{k}' = -\mathbf{k}$, $\mathbf{p}' = -\mathbf{p}$, $\mathbf{k}\cdot\mathbf{p}/|\mathbf{k}||\mathbf{p}| = \cos\theta$, $|\mathbf{k}| = |\mathbf{p}|$, we have

$$[k^\mu k'^\nu - \eta^{\mu\nu}\, kk' + k^\nu k'^\mu]\, [p_\mu p'_\nu - \eta_{\mu\nu}\, pp' + p_\nu p'_\mu] = 2\, [\, k'p'\; kp + k'p\; kp' \,] \propto [(1 - \cos\theta)^2 + (1 + \cos\theta)^2] \propto (1 + \cos^2\theta),$$

consistent with observation.
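The contraction in Box 18.2 can be verified numerically for massless four-momenta. The following is a sketch under stated assumptions: the metric signature is taken as (−, +, +, +), the beam is placed along the z axis, and the helper names `dot`, `current_tensor`, `contract` and `angular_weight` are illustrative. The result does not depend on the overall sign of the metric, since every term is quadratic in scalar products.

```python
from math import cos, sin, isclose

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric (assumed signature)


def dot(a, b):
    """Minkowski scalar product a.b."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))


def current_tensor(a, b):
    """a^mu b^nu + a^nu b^mu - eta^{mu nu} (a.b), with upper indices."""
    ab = dot(a, b)
    return [
        [a[m] * b[n] + a[n] * b[m] - (ETA[m] if m == n else 0.0) * ab
         for n in range(4)]
        for m in range(4)
    ]


def contract(t1, t2):
    """T1^{mu nu} T2_{mu nu}: lower both indices of t2 with the metric and sum."""
    return sum(
        ETA[m] * ETA[n] * t1[m][n] * t2[m][n]
        for m in range(4) for n in range(4)
    )


def angular_weight(theta, energy=1.0):
    """Contracted initial and final tensors for jet-axis angle theta."""
    e = energy
    k = (e, 0.0, 0.0, e)                                   # electron
    k_prime = (e, 0.0, 0.0, -e)                            # positron
    p = (e, e * sin(theta), 0.0, e * cos(theta))           # quark
    p_prime = (e, -e * sin(theta), 0.0, -e * cos(theta))   # antiquark
    return contract(current_tensor(k, k_prime), current_tensor(p, p_prime))


for theta in (0.0, 0.4, 1.1, 1.5707963, 2.5):
    expected = 4.0 * (1.0 + cos(theta) ** 2)  # 4 E^4 (1 + cos^2 theta) with E = 1
    assert isclose(angular_weight(theta), expected, rel_tol=1e-9, abs_tol=1e-12)

print("contraction is proportional to 1 + cos^2(theta)")
```

The proportionality constant $4E^4$ follows from $kp = k'p' = -E^2(1 - \cos\theta)$ and $kp' = k'p = -E^2(1 + \cos\theta)$ in this signature.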

20 Hanson et al. [14].
21 For the underlying theoretical description of two and three jets, see, e.g., Manoukian [16], pp. 444–447.
22 Nishijima [19]; Gell-Mann [10].

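Box 18.3 Combining two and three spin 1/2's

• Combining two spin 1/2's: $|1/2, m_1\rangle |1/2, m_2\rangle \equiv |m_1, m_2\rangle$,

$$|j, m\rangle = \sum_{m_1, m_2} |m_1 m_2\rangle \langle m_1 m_2 | j, m\rangle, \qquad m = m_1 + m_2, \qquad m_1 = \pm 1/2:\ \uparrow/\downarrow.$$

$$|1, +1\rangle = |\!\uparrow\uparrow\rangle, \quad |1, 0\rangle = \tfrac{1}{\sqrt{2}}\big[\, |\!\uparrow\downarrow\rangle + |\!\downarrow\uparrow\rangle \,\big], \quad |1, -1\rangle = |\!\downarrow\downarrow\rangle, \qquad |0, 0\rangle = \tfrac{1}{\sqrt{2}}\big[\, |\!\uparrow\downarrow\rangle - |\!\downarrow\uparrow\rangle \,\big],$$

represented as the product of two states leading to a triplet and a singlet: 2 ⊗ 2 = 3 ⊕ 1.

• Combining three spin 1/2's: We may first immediately introduce four symmetric states:

$$\big|\tfrac{3}{2}, \tfrac{3}{2}\big\rangle = |\!\uparrow\uparrow\uparrow\rangle, \qquad \big|\tfrac{3}{2}, \tfrac{1}{2}\big\rangle = \tfrac{1}{\sqrt{3}}\big[\, |\!\downarrow\uparrow\uparrow\rangle + |\!\uparrow\downarrow\uparrow\rangle + |\!\uparrow\uparrow\downarrow\rangle \,\big],$$
$$\big|\tfrac{3}{2}, -\tfrac{1}{2}\big\rangle = \tfrac{1}{\sqrt{3}}\big[\, |\!\downarrow\downarrow\uparrow\rangle + |\!\uparrow\downarrow\downarrow\rangle + |\!\downarrow\uparrow\downarrow\rangle \,\big], \qquad \big|\tfrac{3}{2}, -\tfrac{3}{2}\big\rangle = |\!\downarrow\downarrow\downarrow\rangle.$$

We may introduce two mixed states orthogonal, respectively, to $|\tfrac{3}{2}, \tfrac{1}{2}\rangle$, $|\tfrac{3}{2}, -\tfrac{1}{2}\rangle$:

$$\big|\tfrac{1}{2}, \tfrac{1}{2}\big\rangle = \tfrac{1}{\sqrt{6}}\big[\, 2|\!\uparrow\uparrow\downarrow\rangle - |\!\downarrow\uparrow\uparrow\rangle - |\!\uparrow\downarrow\uparrow\rangle \,\big], \qquad \big|\tfrac{1}{2}, -\tfrac{1}{2}\big\rangle = \tfrac{1}{\sqrt{6}}\big[\, |\!\uparrow\downarrow\downarrow\rangle + |\!\downarrow\uparrow\downarrow\rangle - 2|\!\downarrow\downarrow\uparrow\rangle \,\big].$$

Finally, we may introduce two mixed states, both anti-symmetric in the first two spins and orthogonal to all of the states above:

$$\big|\tfrac{1}{2}, \tfrac{1}{2}\big\rangle' = \tfrac{1}{\sqrt{2}}\big[\, |\!\uparrow\downarrow\uparrow\rangle - |\!\downarrow\uparrow\uparrow\rangle \,\big], \qquad \big|\tfrac{1}{2}, -\tfrac{1}{2}\big\rangle' = \tfrac{1}{\sqrt{2}}\big[\, |\!\uparrow\downarrow\downarrow\rangle - |\!\downarrow\uparrow\downarrow\rangle \,\big].$$

The above decomposition into eight states is represented by: 2 ⊗ 2 ⊗ 2 = 4 ⊕ 2 ⊕ 2.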
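The eight three-spin states of the decomposition 2 ⊗ 2 ⊗ 2 = 4 ⊕ 2 ⊕ 2 can be checked for orthonormality with a few lines of code. This is a minimal sketch, not the book's procedure: spins are written as "+" (up) and "-" (down), and the $m = -1/2$ partner of the doublet that is anti-symmetric in the first two spins is generated here by applying the total lowering operator $J_- = S_-^{(1)} + S_-^{(2)} + S_-^{(3)}$ rather than typed in by hand. The helper names `inner` and `lower` are illustrative.

```python
from math import sqrt, isclose


def inner(a, b):
    """Inner product of two states with real amplitudes."""
    return sum(amplitude * b.get(basis, 0.0) for basis, amplitude in a.items())


def lower(psi):
    """Apply J_-: flip one '+' to '-' at a time and add up the results."""
    out = {}
    for basis, amplitude in psi.items():
        for i, spin in enumerate(basis):
            if spin == "+":
                flipped = basis[:i] + "-" + basis[i + 1:]
                out[flipped] = out.get(flipped, 0.0) + amplitude
    return out


r2, r3, r6 = sqrt(2), sqrt(3), sqrt(6)

# The four symmetric states, j = 3/2
quartet = [
    {"+++": 1.0},
    {"-++": 1 / r3, "+-+": 1 / r3, "++-": 1 / r3},
    {"--+": 1 / r3, "+--": 1 / r3, "-+-": 1 / r3},
    {"---": 1.0},
]
# Mixed doublet orthogonal to the j = 3/2 states, j = 1/2
mixed_symmetric = [
    {"++-": 2 / r6, "-++": -1 / r6, "+-+": -1 / r6},
    {"+--": 1 / r6, "-+-": 1 / r6, "--+": -2 / r6},
]
# Doublet anti-symmetric in the first two spins; partner obtained with J_-
anti_up = {"+-+": 1 / r2, "-++": -1 / r2}
mixed_antisymmetric = [anti_up, lower(anti_up)]

states = quartet + mixed_symmetric + mixed_antisymmetric
for i, a in enumerate(states):
    for j, b in enumerate(states):
        assert isclose(inner(a, b), 1.0 if i == j else 0.0, abs_tol=1e-12)

print(len(states), "orthonormal states")
```

Replacing "+" by u and "-" by d turns the same table into the isospin content of the baryon states built from u and d quarks.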

The quark contents of some baryons (J = 1/2, J = 3/2) and mesons (J = 0, J = 1) are given in Table 18.2.23 It is worth considering again why the $\pi^0$ state is a composition of two distinct quark combinations $(u\bar u)$, $(d\bar d)$. Look at the J = 0 multiplet in Fig. 18.1 at S = 0, and move along the $I_3$ axis. At the largest value $I_3 = +1$, we have the quark combination $(u\bar d)$. When you move to the left along the axis, reducing $I_3$ by 1, two possible combinations arise: $(u\bar d) \to (u\bar u)$ or $(u\bar d) \to (d\bar d)$. The same conclusion is, consistently, reached if you start at the smallest value $I_3 = -1$ and move to the right, increasing the value of $I_3$ by 1: you encounter two possibilities, $(d\bar u) \to (d\bar d)$ or $(d\bar u) \to (u\bar u)$, which explains why the $\pi^0$ is composed of two distinct quark contents. We now have to combine the strange quark s with the u and d quarks. We recall that mesons consist of quark/antiquark combinations, while baryons consist of quark/quark/quark combinations. We first consider mesons. We may first introduce particles with quark contents $(u\bar d)$, $(\bar u d)$, $(u\bar s)$, $(d\bar s)$, $(\bar u s)$, $(\bar d s)$, as well as a mixed combination of $(u\bar u, d\bar d)$ with a corresponding state $|u\bar u - d\bar d\rangle/\sqrt{2}$, identified with $\pi^0$ as seen above. We may also introduce the symmetric state $\big(|u\bar u\rangle + |d\bar d\rangle + |s\bar s\rangle\big)/\sqrt{3}$, with which the particle $\eta'$ is approximately24 identified, and the state $\big(|u\bar u\rangle + |d\bar d\rangle - 2|s\bar s\rangle\big)/\sqrt{6}$, orthogonal to it and to the $\pi^0$ state, with which the particle η is approximately identified. The above leads to the decomposition $3 \otimes \bar 3 = 8 \oplus 1$, corresponding to the meson octet together with a singlet, or just a nonet, in Fig. 18.1. For baryons, look at the baryon octet in Fig. 18.1. Suppose you choose the S = −2 value. This means that one must include two strange quarks in the analysis. At the extreme right-hand side $I_3 = 1/2$. Accordingly, the quark content of $\Xi^0$ is (uss).
As you move to the left, the value of $I_3$ is decreased by 1, giving rise to the replacement (uss) → (dss), with the latter providing the quark content of $\Xi^-$, and so on. The analysis for the baryon decuplet is no different. Of particular interest is the $\Omega^-$ particle, corresponding to S = −3, $I_3 = 0$, which has quark content sss. Its existence was predicted in 1961 by the eightfold way before its discovery in 1964.25 The above analysis gives the following decomposition for baryons: 3 ⊗ 3 ⊗ 3 = 10 ⊕ 8 ⊕ 8 ⊕ 1. Here 10 corresponds to the decuplet just mentioned, one 8 is the baryon octet in Fig. 18.1, the other 8 provides another baryon octet, and 1 is a singlet. Finally, the quark contents of particles involving charm and bottom may also be worked out, with much more labor. Particles composed of quarks just within the set {u, d, c} or just within the set {u, d, b} are readily handled. At least at present energies, no internal structure needs to be introduced, a priori, for leptons.

23 The Σ*, K* particles are more massive than the Σ, K particles, and of different spins.
24 The particles η′ and η are considered as linear combinations of, and predominantly equal to, the $\big(|u\bar u\rangle + |d\bar d\rangle + |s\bar s\rangle\big)/\sqrt{3}$ and $\big(|u\bar u\rangle + |d\bar d\rangle - 2|s\bar s\rangle\big)/\sqrt{6}$ states, respectively, see PDG [23]. The masses of η, η′ are 548, 958 MeV/$c^2$, respectively.
25 Barnes et al. [4].

Table 18.2 Quark contents of some baryons (J^P = 1/2+, J^P = 3/2+) & mesons (J^P = 0−, J^P = 1−)

J^P = 1/2+   p        n          Λ0       Σ+       Σ0             Σ−       Ξ0       Ξ−
Content      (uud)    (udd)      (uds)    (uus)    (uds)          (dds)    (uss)    (dss)

J^P = 3/2+   Δ++      Δ+         Δ0       Δ−       Σ*+            Σ*0      Σ*−      Ω−
Content      (uuu)    (uud)      (udd)    (ddd)    (uus)          (uds)    (dds)    (sss)

J^P = 0−     π+       π0         π−       K+       K−             K0       K̄0       η, η′
Content      (ud̄)     (uū, dd̄)   (dū)     (us̄)     (sū)           (ds̄)     (sd̄)     (uū, dd̄, ss̄)

J^P = 0−     D+       D−         D0       D̄0       B+             B−       B0       B̄0
Content      (cd̄)     (dc̄)       (cū)     (uc̄)     (ub̄)           (bū)     (db̄)     (bd̄)

J^P = 1−     ρ+       ρ0         ρ−       K*+      K*0            K*−      J/ψ      ϒ
Content      (ud̄)     (uū, dd̄)   (dū)     (us̄)     (ds̄, sd̄)       (sū)     (cc̄)     (bb̄)

Table 18.3 Comparison of the three fundamental particle interactions

Interaction       Approximate range   Approximate typical decay lifetimes   Mediators
Strong            10^-13 cm           ∼10^-23 s                             gluons
Weak              10^-16 cm           ∼10^-12 to 10^-8 s                    W±, Z0
Electrodynamics   ∞                   ∼10^-16 s                             γ
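The quark assignments of Tables 18.1 and 18.2 can be cross-checked against the Gell-Mann & Nishijima formula (18.5). The sketch below is illustrative only: it takes the quantum numbers from Table 18.1, uses baryon number 1/3 for each quark, and marks an antiquark with a trailing "~" (a notation assumed here, not used in the text). The names `charge_from_hypercharge` and `hadron_charge` are hypothetical helpers.

```python
from fractions import Fraction as F

# Quantum numbers per quark from Table 18.1: charge Q, I3, S, C, bottomness B, T
QUARKS = {
    "u": {"Q": F(2, 3), "I3": F(1, 2), "S": 0, "C": 0, "B": 0, "T": 0},
    "d": {"Q": F(-1, 3), "I3": F(-1, 2), "S": 0, "C": 0, "B": 0, "T": 0},
    "c": {"Q": F(2, 3), "I3": F(0), "S": 0, "C": 1, "B": 0, "T": 0},
    "s": {"Q": F(-1, 3), "I3": F(0), "S": -1, "C": 0, "B": 0, "T": 0},
    "t": {"Q": F(2, 3), "I3": F(0), "S": 0, "C": 0, "B": 0, "T": 1},
    "b": {"Q": F(-1, 3), "I3": F(0), "S": 0, "C": 0, "B": -1, "T": 0},
}
BARYON_NUMBER = F(1, 3)  # baryon number of every quark


def charge_from_hypercharge(flavor):
    """Q = I3 + Y/2 with Y = (baryon number) + S + C + B + T."""
    q = QUARKS[flavor]
    hypercharge = BARYON_NUMBER + q["S"] + q["C"] + q["B"] + q["T"]
    return q["I3"] + hypercharge / 2


def hadron_charge(content):
    """Total charge of a hadron, e.g. ['u', 'u', 'd'] or ['u', 'd~']."""
    total = F(0)
    for constituent in content:
        if constituent.endswith("~"):          # antiquark: charge reversed
            total -= QUARKS[constituent[:-1]]["Q"]
        else:
            total += QUARKS[constituent]["Q"]
    return total


# The formula reproduces the tabulated charge of each of the six quarks.
for flavor, numbers in QUARKS.items():
    assert charge_from_hypercharge(flavor) == numbers["Q"]

# A few entries of Table 18.2
assert hadron_charge(["u", "u", "d"]) == 1    # proton
assert hadron_charge(["u", "d", "d"]) == 0    # neutron
assert hadron_charge(["u", "u", "u"]) == 2    # Delta++
assert hadron_charge(["s", "s", "s"]) == -1   # Omega-
assert hadron_charge(["u", "d~"]) == 1        # pi+
assert hadron_charge(["s", "u~"]) == -1       # K-
print("charges consistent with Tables 18.1 and 18.2")
```

Exact fractions are used so that the comparisons are free of rounding error.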

Hideki Yukawa's suggestion26 in 1935 that the strong interaction is mediated by the exchange of a massive particle (the π meson) between nucleons played an important role in the formulation of the modern description of the strong interaction. As in electrodynamics, where the mediator of the force is the photon, the mediators in the strong interaction are the gluons, while in the weak interaction they are the $W^\pm$ and $Z^0$ bosons, as mentioned earlier. The strong interaction is responsible for the formation of nuclei, and its interaction range is approximately 1 fm = $10^{-13}$ cm, while the strong interaction time scale is ∼ 1 fm/c $\simeq 10^{-13}/(3 \times 10^{10}) \simeq 3.3 \times 10^{-24}$ s, where the speed of light $c \simeq 3 \times 10^{10}$ cm/s. This approximately specifies the decay lifetimes27 τ in strong interactions, such as in $\rho \to \pi\pi$ with $\tau \simeq 4 \times 10^{-24}$ s, $\Delta \to N\pi$ with $\tau \simeq 6 \times 10^{-24}$ s. The weak interaction does not form bound states; it rather gives rise to the decay of particles. Its interaction range may be estimated, say, from the exchange of a W boson, to be $\sim \hbar c/M_W c^2 \approx 2.5 \times 10^{-16}$ cm. In the weak interactions, decay lifetimes are much longer than in the strong ones. For example, a particle carrying a strange quark would decay by violating strangeness and would essentially drag, leaving visible tracks in bubble chambers, before it decays. For example, $\Lambda^0(uds)$ decays into $p\,\pi^-$, violating strangeness, with a lifetime $\tau \simeq 2.6 \times 10^{-10}$ s. Other weak decays are, e.g., $K^-(s\bar u) \to \pi^-\pi^0$, with $\tau \simeq 10^{-8}$ s, and $D^0(c\bar u) \to K\pi\pi$ with $\tau \simeq 4 \times 10^{-13}$ s, violating strangeness and charm, respectively. We will discuss such violating processes in detail in the next chapter. Electrodynamics binds together atoms and molecules. The range of its interaction is ∞. Lifetimes in electrodynamics are determined from typical processes such as $\pi^0 \to \gamma\gamma$ with $\tau \simeq 9 \times 10^{-17}$ s. Table 18.3 roughly summarizes some of the properties of the three fundamental interactions in particle physics.
One should also not forget the really long lifetime of the neutron in β decay, $n \to p\, e^- \tilde\nu_e$, a weak-interaction process violating parity (see Chap. 20), which takes about 15 min. The strengths of the interactions are best discussed when they are considered at specific energies, or equivalently at specific distances, as will be discussed in detail in the following chapters. For example, the fine-structure constant $\alpha \simeq 1/137$ describes the strength of electrodynamic interactions at the atomic level and at larger distances. On the other hand, at higher energies, such as at an energy of the order of the mass of the Z boson, corresponding to much smaller distances, the effective fine-structure constant is $\simeq 1/128$ and thus becomes stronger, as we will see in Chap. 28. All particles experience the gravitational force. It is, in particular, responsible for binding bodies on an astronomical scale and is of ∞ range. The gravitational interaction at the quantum level will be dealt with in Chap. 72. As a very rough rule of thumb, one often compares the strengths of the forces exerted between protons which are just in contact as given by $1 : 10^{-2} : 10^{-7} : 10^{-38}$ for the strong : electromagnetic : weak : gravitational interactions, respectively.

26 Yukawa [24].
27 The lifetime τ is related to the half-life $\tau_{1/2}$ by the relation $\tau = \tau_{1/2}/\ln 2$, where $\tau_{1/2}$ is the time taken for 1/2 of the particles in a large number of particles to decay.
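The order-of-magnitude estimates quoted for the strong and weak interactions follow from a few lines of arithmetic. The following is a rough sketch, assuming the standard value ħc ≈ 197.327 MeV fm (not quoted in the text) together with the masses and lifetimes given above; the variable and function names are illustrative.

```python
from math import log

HBAR_C_MEV_FM = 197.327        # hbar * c in MeV fm (standard value, an input assumed here)
C_CM_PER_S = 2.998e10          # speed of light in cm/s
FM_IN_CM = 1.0e-13             # 1 fm expressed in cm
M_W_MEV = 80.4e3               # W boson mass in MeV/c^2, as quoted in the chapter

# Strong interaction: time for light to cross ~1 fm
strong_time_s = FM_IN_CM / C_CM_PER_S

# Weak interaction: range ~ hbar c / (M_W c^2)
weak_range_cm = HBAR_C_MEV_FM / M_W_MEV * FM_IN_CM

# Distance c * tau covered at nearly the speed of light by a Lambda0 (tau ~ 2.6e-10 s),
# ignoring time dilation: this is why such particles leave visible tracks
lambda_track_cm = C_CM_PER_S * 2.6e-10


def half_life(mean_lifetime):
    """Half-life from the mean lifetime: tau_1/2 = tau * ln 2."""
    return mean_lifetime * log(2)


print(f"strong time scale ~ {strong_time_s:.1e} s")
print(f"weak range        ~ {weak_range_cm:.1e} cm")
print(f"Lambda0 c*tau     ~ {lambda_track_cm:.1f} cm")
print(f"neutron half-life ~ {half_life(15.0):.1f} min for tau ~ 15 min")
```

The numbers are only as good as the inputs: using the Z mass instead of the W mass, for instance, shifts the weak range by about 10%, which is immaterial at the level of Table 18.3.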

The next chapter investigates how the strange s, the charm c, the bottom b and the top t quarks came about. The C, CP, and T symmetry violations which occur in the weak interactions are the subject matter of the chapter following it, where, as mentioned before, charge conjugation C consists of replacing every particle in a given process by its antiparticle, while parity P consists of considering a given process as reflected through the origin of a coordinate system and involves the reversal of the direction of the three-momentum of every particle as well as the consideration of their intrinsic parities.28 The T transformation involves reversing the direction in which a process evolves; as such, the particles initially going into it become outgoing, and vice versa, while the directions of their momenta and their spin projections are reversed. Isospin I is violated in electrodynamics. The strong interactions seem to respect all such symmetries. All experiments seem to indicate that the product CPT is conserved. This is also established theoretically in QFT in Chap. 36.

References

1. Ammar, R., et al. (1998). Measurement of the total cross section e+e− → hadrons at √s = 10.52 GeV. Physical Review D, 57, 1350–1358.
2. Anderson, C. D. (1933). The positive electron. Physical Review, 43, 491–494.
3. Anderson, C. D., & Anderson, H. L. (1983). Unraveling the particle content of cosmic rays. In L. M. Brown & L. Hoddeson (Eds.), The birth of particle physics (pp. 135–136). Cambridge: Cambridge University Press.
4. Barnes, V. E., et al. (1964). Observation of a hyperon with strangeness minus three. Physical Review Letters, 12, 204–206.
5. Bjorken, J. D., & Paschos, E. A. (1969). Inelastic electron-proton and γ-proton scattering and the structure of the nucleon. Physical Review, 185, 1975–1982.
6. Bloom, E. D., et al. (1969). High-energy inelastic e-p scattering at 6° and 10°. Physical Review Letters, 23, 930–934.
7. Breidenbach, M., et al. (1969). Observed behavior of highly inelastic electron-proton scattering. Physical Review Letters, 23, 935–939.
8. Danby, G., et al. (1962). Observation of high-energy neutrino reactions and the existence of two kinds of neutrinos. Physical Review Letters, 9, 36–44.
9. Feynman, R. P. (1969). The behavior of hadron collisions at extreme energies. In Proceedings of the 3rd topical conference on high energy collisions. Stony Brook. New York: Gordon & Breach.
10. Gell-Mann, M. (1956). The interpretation of the new particles as displaced charged multiplets. Il Nuovo Cimento, 4(S2), 848–866.
11. Gell-Mann, M. (1961). The eightfold way: A theory of strong interaction symmetry. California Institute of Technology Synchrotron Laboratory Report CTSL-20.
12. Gell-Mann, M. (1964). A schematic model of baryons and mesons. Physics Letters, 8, 214–215.
13. Hanson, G., et al. (1975). Evidence for jet structure production by e+e− annihilation. Physical Review Letters, 35, 1609–1612.
14. Hanson, G., et al. (1982). Hadron production by e+e− annihilation at center-of-mass energies between 2.6 and 7.8 GeV. II. Jet structure and related inclusive distributions. Physical Review D, 26, 991–1012.
15. Heisenberg, W. (1932). Über den Bau der Atomkerne. Zeitschrift für Physik, 77, 1–11.
16. Manoukian, E. B. (2016). Quantum field theory I: Foundations and abelian and non-abelian gauge theories. Switzerland: Springer.
17. Mehra, J., & Rechenberg, H. (2000). The historical development of quantum theory, Volume 6: The completion of quantum mechanics 1926–1941. Berlin: Springer.
18. Ne'eman, Y. (1961). Derivation of strong interactions from a gauge theory. Nuclear Physics, 26, 222–229.
19. Nishijima, K. (1955). Charge independence for V-particles. Progress of Theoretical Physics, 13, 285–304.
20. Perl, M. L., et al. (1975). Evidence for anomalous lepton production in e+e− annihilation. Physical Review Letters, 35, 1489–1492.
21. Rubbia, C. (1984). Experimental observation of the intermediate vector bosons, W+, W− and Z0. Nobel Lectures, 8 December (pp. 240–287).
22. Sakata, S. (1956). On a composite model for the new particles. Progress of Theoretical Physics, 16, 686–688.
23. Tanabashi, M., et al. (Particle Data Group) (2018). Physical Review D, 98, 010001.
24. Yukawa, H. (1935). On the interaction of elementary particles I. Proceedings of the Physical Mathematical Society, Japan, 17, 48–57.
25. Zweig, G. (1964). An SU3 model for strong interaction symmetry and its breaking. CERN-8419-TH-412.

28 Intrinsic parities P of some of the particles are given in Table 18.2.

19 Strange, Charm, Bottom and the Top Quarks: How They Came About?

Prerequisites: Chap. 18

In the previous chapter, we introduced the fundamental quarks, the roles they play in particle physics, as well as the underlying symmetries involved. In this chapter, we consider the important aspect of how the strange s, the charm c, the bottom or beauty b and the top t quarks came about and emerged in the underlying fabric of particle physics. The inescapable introduction of the up u and down d quarks in describing the quark content of the proton and the neutron was already evident from the deep inelastic scattering data of nucleons; quarks were introduced in 1964 by Murray Gell-Mann, George Zweig and Yuval Ne'eman to explain the Eightfold Way classification scheme of hadrons.1

The s Quark: As a consequence of charge conservation, the electron $e^-$, being the charged particle with the smallest mass, is a stable particle: it cannot decay while conserving charge. On the other hand, a more massive charged particle such as the $\rho^-$ particle, of quark content $\rho^- = (d\bar u)$, decays into two pions, $\rho^- \to \pi^-\pi^0$, within a very short lifetime of the order of $4 \times 10^{-24}$ s, as if effortlessly, and conserves charge. With the Fermi unit 1 fm = $10^{-15}$ m denoting roughly the radius of a nucleus, the characteristic time scale of the strong interaction is $\approx 10^{-15}/(3 \times 10^{8}) \simeq 3.3 \times 10^{-24}$ s, where the speed of light $c \simeq 3 \times 10^{8}$ m/s. On the other hand, certain particles were created in large quantities on such a short time scale and, no matter how massive they were, seemed to be trying desperately to conserve a certain quantum number, drag, and eventually decay rather slowly, failing to conserve the quantum number in question. Such particles are, for example, the kaons K and the hyperons, such as Λ and Σ, which decay relatively slowly (typically in about $10^{-10}$ s or so), much more slowly than one would expect given their large masses. They were referred to as strange particles.
For example, in a collision of non-strange particles such as nucleons and pions, $\pi^- + p \to \Lambda^0 + K^0$, the strange particles $(K^0, \Lambda^0)$ were produced in large numbers and only in such pairs. The $\Lambda^0$ particle, for example, decays to a proton and a pion, $\Lambda^0 \to p + \pi^-$, with a lifetime $2.63 \times 10^{-10}$ s, and for a beam of $K^0$ particles, for example, a fraction decays with a lifetime of only $0.89 \times 10^{-10}$ s into two pions, and the remaining ones decay into three pions with a lifetime of $5.2 \times 10^{-8}$ s. The existence of a strange quark s was postulated by Gell-Mann, Pais, Nakano and Nishijima2 to explain this strange phenomenon of the production of strange particles in pairs by the strong interaction, and their subsequent relatively slow decay, characteristic of weak decays. Because of the production of these strange particles in pairs, as just discussed, it was postulated that the associated new quantum number, strangeness, is conserved in strong interactions but violated in the weak interactions responsible for the decay of the emerging strange particles. In particular, the lightest particles containing a strange quark cannot decay by the strong interaction, and must instead decay via the much slower weak interaction. If a particle had a small mass and decayed slowly, that would be expected. But that the hyperon $\Lambda^0$, with a mass of 1116 MeV/$c^2$, decays so slowly, with a lifetime $2.63 \times 10^{-10}$ s as mentioned above, is rather surprising. Unlike strongly decaying particles with very short lifetimes $\sim 10^{-24}$ s, the strange particles, because of their long lifetimes, drag and leave tracks in cloud chambers, such as the ones observed by Rochester and Butler3 in 1947, which left visible V-shaped pair tracks. The production of the strange particles is a fast process, but their decay is a slower process, giving rise to these visible tracks.
The quark contents of the $\Lambda^0$ and $K^0$ particles, for example, are given by $\Lambda^0 = (uds)$, $K^0 = (d\bar s)$, and their daughters discussed above carry none of the strange quantum numbers $s$, $\bar s$, thus violating strangeness conservation in the weak interaction, while the pair $\Lambda^0$, $K^0$, when produced together, has zero total strangeness, conserving strangeness in the strong interaction. Think of these strange particles, carrying the quantum number $s$ or $\bar s$, as trying desperately to conserve the strange

1 The original papers are collected in Gell-Mann and Ne'eman [16].
2 Gell-Mann [15]; Pais [23]; Nakano and Nishijima [22]. See also Nambu et al. [21].
3 Rochester and Butler [26].

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_19


quantum number and eventually give up and decay, while dragging for a relatively longer time, as mentioned earlier. The neutral vector boson φ, of mass 1.02 GeV/$c^2$, contains both a strange quark and its antiquark and decays strongly, with a lifetime $\sim 1.6 \times 10^{-22}$ s, e.g., to $K^+K^-$, conserving strangeness, with daughters of quark contents $K^+ = (u\bar s)$, $K^- = (s\bar u)$. The $K^-$, for example, decays into $\pi^-\pi^0$, with a lifetime $\sim 1.2 \times 10^{-8}$ s, violating strangeness conservation through the weak interaction.

The c Quark: A key ingredient that eventually led to the emergence of the charm quark in theoretical particle physics was significant work carried out by Cabibbo,4 striving to preserve the universality of the weak interaction by flavor mixing in defining currents in weak interactions involving quarks. To understand this, let us first recall that the current in QED is formally given by $\bar\psi(e)\gamma^\sigma\psi(e)$, a bilinear form (a product of two fields), which may be conveniently rewritten as $\bar e e$, omitting, for simplicity, the spacetime structure and the underlying gamma matrices. This defines a neutral current describing an electron going to an electron (coupled to the photon, a neutral particle). We consider three types of processes: (A) those involving quarks and conserving strangeness, with only d ⇔ u (or their antiquarks) transitions, (B) those involving quarks but violating strangeness with ΔS = 1, and (C) those not involving quarks. Cabibbo realized that, to suppress the contribution of a current to processes of type (B), consistently with experiments, such a current is to be multiplied by $\sin\theta_C$, where $\theta_C$ is referred to as the Cabibbo angle, and that unitarity then requires multiplying the current, defined in terms of quarks, in type (A) by $\cos\theta_C \approx 1$. On the other hand, processes not involving quarks, of type (C), remain unmodified. Thus the following modifications arise for the currents of the following processes:

$$\pi^+(u\bar d) \to \mu^+\nu_\mu, \qquad \bar d u \to \cos\theta_C\; \bar d u, \tag{19.1}$$
$$K^+(u\bar s) \to \mu^+\nu_\mu, \qquad \bar s u \to \sin\theta_C\; \bar s u, \tag{19.2}$$
$$n(udd) \to p(uud)\, e^-\tilde\nu_e, \qquad \bar u d \to \cos\theta_C\; \bar u d, \tag{19.3}$$
$$\Lambda^0(uds) \to p(uud)\, e^-\tilde\nu_e, \qquad \bar u s \to \sin\theta_C\; \bar u s, \tag{19.4}$$
$$\mu^+ \to e^+\nu_e\tilde\nu_\mu, \qquad \text{no change}. \tag{19.5}$$
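To get a feeling for the size of the Cabibbo factors in Eqs. (19.1)–(19.4), one can evaluate them for the experimental value $\sin\theta_C \simeq 0.23$ quoted in this chapter. This is a small illustrative calculation; the variable names are assumptions of the sketch, and the weights below refer only to the squared mixing factors, not to complete decay rates.

```python
from math import sqrt

SIN_THETA_C = 0.23                           # experimental value quoted in the chapter
COS_THETA_C = sqrt(1.0 - SIN_THETA_C ** 2)

strangeness_conserving_weight = COS_THETA_C ** 2   # multiplies |Delta S| = 0 rates
strangeness_changing_weight = SIN_THETA_C ** 2     # multiplies |Delta S| = 1 rates
relative_suppression = strangeness_changing_weight / strangeness_conserving_weight

print(f"cos^2 = {strangeness_conserving_weight:.3f}")  # 0.947
print(f"sin^2 = {strangeness_changing_weight:.3f}")    # 0.053
print(f"tan^2 = {relative_suppression:.3f}")           # 0.056
```

The two weights add up to one, which is the sense in which the mixing preserves the universality of the weak coupling while suppressing strangeness-changing transitions by roughly a factor of twenty in rate.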

In particular, we learn that the coupling parameter squared of the decay process $n \to p\,e^-\tilde\nu_e$ is smaller, by a factor of $\cos^2\theta_C$, than the coupling squared of the decay process $\mu^+ \to e^+\nu_e\tilde\nu_\mu$ in a theory involving the product of two currents, as in the Fermi theory to be discussed in Chap. 33. Experimentally, $\sin\theta_C \simeq 0.23$. One may generate the currents on the right-hand sides of the above equations by introducing, in the process, the doublet $(u\ d)^\top \to (u\ d')^\top$, where $d' = d\cos\theta_C + s\sin\theta_C$. Charged currents may then be defined by

$$\bar u d' = \bar u d\,\cos\theta_C + \bar u s\,\sin\theta_C, \qquad \bar d' u = \bar d u\,\cos\theta_C + \bar s u\,\sin\theta_C. \tag{19.6}$$

On the other hand, a general neutral current is given by

$$\bar u u + \bar d' d' = \bar u u + \underbrace{(\bar d d\,\cos^2\theta_C + \bar s s\,\sin^2\theta_C)}_{\Delta S = 0} + \underbrace{(\bar s d + \bar d s)\,\sin\theta_C\cos\theta_C}_{\Delta S = 1}, \tag{19.7}$$

which clearly involves a strangeness changing term. Such a term gives rise to a net leading order contribution to transition probabilities already at the tree level, that is, with no radiative corrections involved, and it would dominate over the smaller radiative corrections, in contradiction with experiment. In order to suppress such a tree level term, Bjørken, Glashow, Iliopoulos, and Maiani⁵ introduced a fourth quark $c$, referred to as the charm quark, with which one may precisely cancel out the above $\Delta S = 1$ term as follows. One introduces the doublets $(u\ d')^\top$, $(c\ s')^\top$ in defining currents, where

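$$\begin{pmatrix} d' \\ s' \end{pmatrix} = \begin{pmatrix} \cos\theta_C & \sin\theta_C \\ -\sin\theta_C & \cos\theta_C \end{pmatrix} \begin{pmatrix} d \\ s \end{pmatrix}. \tag{19.8}$$

⁴ Cabibbo [12].
⁵ Bjørken and Glashow [11]; Glashow, Iliopoulos and Maiani [17].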


This leads to the neutral current

$$\bar u u + \bar c c + \bar d' d' + \bar s' s' = \bar u u + \bar c c + (\bar d d + \bar s s)\cos^2\theta_C + (\bar s s + \bar d d)\sin^2\theta_C, \tag{19.9}$$

involving no $\Delta S = 1$ term at the tree level. This explains the strong suppression of the strangeness-changing neutral current associated with the process $K^0 \to \mu^+\mu^-$, where the quark content of $K^0 = (d\bar s)$, relative to the strangeness-changing charged current associated, say, with the process $K^+ \to \mu^+\nu_\mu$, observed experimentally, where the quark content of $K^+ = (u\bar s)$. In terms of quarks, the change of charge becomes evident through the emission of a $W^+$ particle in the process $u\bar s \to W^+ \to \mu^+\nu_\mu$. The introduction of the charm quark $c$ provides a remarkable symmetry between leptons and quarks for the first two generations: $(\nu_e\ e)^\top$, $(\nu_\mu\ \mu)^\top$ and $(u\ d)^\top$, $(c\ s)^\top$ as weak isospin doublets. The weak neutral current (coupled to the $Z$ boson) does not change flavor in the Standard Model proper at the tree level, whereas the weak charged currents (coupled to the $W$ bosons) do. This is referred to as the GIM mechanism. More generally, if one includes the quarks of the three generations, then no flavor-changing neutral current occurs, at the tree level, if one introduces the Cabibbo-Kobayashi-Maskawa (CKM)⁶ unitary matrix $V = [V_{ij}]$, where $V_{ij}$ denotes the amplitude for $i$th quark $\to$ $j$th quark, and defines the combinations

$$U' = V\,U, \qquad U' = (d'\ s'\ b')^\top, \qquad U = (d\ s\ b)^\top, \tag{19.10}$$

$$\begin{pmatrix} d' \\ s' \\ b' \end{pmatrix} = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix} \begin{pmatrix} d \\ s \\ b \end{pmatrix}. \tag{19.11}$$

This gives rise to a neutral current

$$\bar d' d' + \bar s' s' + \bar b' b' = \bar U\,V^\dagger V\,U = \bar U U = \bar d d + \bar s s + \bar b b, \tag{19.12}$$

involving no flavor change at the tree level. The CKM matrix is a generalization of the Cabibbo matrix defined in (19.8).⁷ We will encounter the CKM matrix again in Chap. 35 when incorporating quarks in the electroweak theory. The existence of the charm quark was confirmed through the observation of an invariant mass peak in $e^+e^-$ collisions at SLAC,⁸ as well as in the collisions $pp \to e^+e^- + X$, where $X$ stands for anything accompanying the pair $e^+e^-$, at the Brookhaven National Laboratory,⁹ by groups led, respectively, by B. Richter and S. C. C. Ting, who were awarded the Nobel Prize in 1976. The particle in question consists of a charm quark and its antiparticle, $c\bar c$, and was called the $\psi$ and the $J$ particle by the respective groups. Now it is most often referred to as the $J/\psi$ particle. It is a chargeless vector boson, with a mass of about 3.1 GeV/$c^2$. Some typical decays of $J/\psi$ are into $e^+e^-$, $\mu^+\mu^-$, or into pions, with a lifetime of the order $\sim 10^{-20}$ s, which is surprisingly about $10^3$ times longer than the lifetimes of hadrons in the same mass range, which are of the order $\sim 10^{-23}$ s. To confirm the charm hypothesis it was necessary to produce particles carrying a net charm quantum number, referred to as naked charm. Such particles were discovered,¹⁰ for example the $D$ mesons, which contain a charm quark (or its antiparticle) and an up or down quark, such as the $D^0 = (c\bar u)$ particle. The charm quark $c$ has charge $2/3\,|e|$ and weak isospin component $I_3 = +1/2$, and thus provided a complete symmetry between the leptons $(\nu_e\ e^-)^\top$, $(\nu_\mu\ \mu^-)^\top$ and the quark doublets $(u\ d)^\top$, $(c\ s)^\top$ that had emerged at that time. The discovery of $J/\psi$ was confirmed in November of 1974 and came to be known as the November revolution. The symmetry between the leptons and quarks recovered above was "spoiled" when the tau lepton $\tau$ was discovered¹¹ in 1975, where $\tau$ and its neutrino $\nu_\tau$ form the third generation of lepton doublets.
Soon after, however, the bottom, or beauty, quark $b$ was discovered in 1977, and the top quark $t$ was discovered 17 years after the bottom quark, in 1994, completing the lepton and quark symmetry of three generations. The bottom and top quarks are discussed next.
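The cancellation behind (19.7) and (19.9) can be checked numerically. The following is a minimal sketch, assuming the quoted value $\sin\theta_C \simeq 0.23$ and treating the quark bilinears simply as entries of a 2×2 coefficient table in the $(d, s)$ basis. The helper names `cabibbo_matrix` and `neutral_current` are illustrative only and do not come from the text.

```python
import math

SIN_THETA_C = 0.23                      # experimental value quoted in the text
COS_THETA_C = math.sqrt(1.0 - SIN_THETA_C**2)

def cabibbo_matrix(s, c):
    """Rows give d' and s' in terms of (d, s), as in (19.8)."""
    return [[c, s], [-s, c]]

def neutral_current(rows):
    """Coefficient table N[i][j] multiplying bar(q_i) q_j, with q = (d, s).

    Each primed field r contributes bar(q') q' = sum_ij r[i] r[j] bar(q_i) q_j.
    The mixing coefficients are real, so no complex conjugation is needed.
    """
    table = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        for i in range(2):
            for j in range(2):
                table[i][j] += r[i] * r[j]
    return table

V = cabibbo_matrix(SIN_THETA_C, COS_THETA_C)
without_charm = neutral_current([V[0]])   # only bar(d') d', as in (19.7)
with_charm = neutral_current(V)           # bar(d') d' + bar(s') s', as in (19.9)

print("cos^2(theta_C)                       =", COS_THETA_C**2)
print("bar(d) s coefficient without charm   =", without_charm[0][1])
print("bar(d) s coefficient with charm      =", with_charm[0][1])
```

Without the second doublet the strangeness-changing coefficient is $\sin\theta_C\cos\theta_C$; with it, the off-diagonal entries vanish because the rows of the rotation matrix are orthonormal. The same argument, with a unitary $V$, underlies (19.12).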

⁶ Cabibbo [12]; Kobayashi and Maskawa [20].
⁷ It is rather surprising that only Kobayashi and Maskawa were awarded the Nobel Prize while Cabibbo was left out.
⁸ Augustin et al. [7].
⁹ Aubert et al. [6].
¹⁰ Cazzoli et al. [13]; Goldhaber et al. [18]; Peruzzi et al. [25].
¹¹ Perl et al. [24].


The b Quark: In 1977, a di-muon resonance at about 9.5 GeV was observed in proton-nucleus collisions¹² at Fermilab, as well as in $e^+e^-$ annihilation at DESY.¹³ The particle in question was called the upsilon $\Upsilon$ and is identified as a spin-1 particle. It subsequently decays, e.g., into a pair of muons or an electron-positron pair, with a lifetime of $\sim 2\times10^{-20}$ s. The $\Upsilon$ particle is assumed to consist of a pair $b\bar b$ of a new quark, more massive than the strange one, and its antiparticle, referred to as the bottom or beauty quark. Higher excited states of the $\Upsilon$ particle were subsequently also observed¹⁴ at the Cornell Electron Storage Ring. To confirm the bottom/beauty hypothesis it was necessary to produce particles involving the bottom or beauty quark or its antiparticle, referred to as bare bottom or naked beauty. The excited state of mass 10.58 GeV/$c^2$, which is above the threshold for the production of a $B\bar B$ pair, each meson having a mass of about 5.27 GeV/$c^2$, decays as follows: $\Upsilon(10.58) \to B^0\bar B^0$, to the bare bottom or naked beauty mesons with quark contents $\bar B^0 = (b\bar d)$ and $B^0 = (d\bar b)$, containing a single $b$ quark and its antiparticle, respectively. Moreover, these mesons have been observed¹⁵ to decay violating bottom/beauty conservation. For example, $B^0$ decays into $D^*\pi\pi$, with a lifetime $\sim 1.5\times10^{-12}$ s, typical of a weak decay. The $D^*$ particles have charges $0, \pm$, with quark contents $D^{*0} = c\bar u$, $D^{*+} = c\bar d$, $D^{*-} = d\bar c$. The $b$ quark is identified as belonging to a doublet with weak isospin component $I_3 = -1/2$, i.e., as a down-type quark of a third generation (and not a singlet), consistent with experiment on the angular distribution of $b$ hadrons in $e^+e^-$ collisions.¹⁶ We are on our way to completing the third generation of quarks, consistent with three generations of leptons, by introducing, in the process, the top quark $t$, the heaviest quark of them all.
The t Quark: The top quark finally emerged in 1994, that is, about 17 years after the bottom one, in proton-antiproton collisions at Fermilab.¹⁷ It is estimated to be very massive, $\sim 173.5$ GeV/$c^2$, and decays with a lifetime $\sim 5\times10^{-25}$ s.¹⁸ That is, it is too short lived, even on the strong interaction time scale of order $1\ \text{fm}/c \sim 3.3\times10^{-24}$ s estimated earlier, to have enough time, before its decay, to bind with other quarks and form hadrons, i.e., it fails to hadronize. On the other hand, the other quarks hadronize, i.e., they combine with other quarks to form hadrons, and can only be observed as such. In $p\bar p$ collisions, $p\bar p \to t\bar t + X$, with $t \to W^+ b$, $\bar t \to W^- \bar b$, and single top production has also finally been observed in recent years,¹⁹ confirming its existence. The top quark decays predominantly to a $W$ boson and a bottom quark, less frequently to a strange one, and even less so to a down quark, i.e., $|V_{tb}| \gg |V_{ts}| > |V_{td}|$ in the CKM matrix in (19.11).²⁰ The top quark mass is estimated from the momenta of its decay products.
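The comparison of time scales made above is a one-line piece of arithmetic. The sketch below reproduces it, assuming the standard value of the speed of light and the top lifetime quoted in the text; the variable names are illustrative.

```python
C = 2.99792458e8          # speed of light in m/s
FERMI = 1.0e-15           # 1 fm in metres

strong_time = FERMI / C   # time for light to cross 1 fm, the strong-interaction scale
top_lifetime = 5.0e-25    # s, value quoted in the text

print(f"1 fm/c = {strong_time:.2e} s")
print(f"top lifetime / (1 fm/c) = {top_lifetime / strong_time:.2f}")
```

The ratio comes out well below one, which is the quantitative content of the statement that the top quark decays before it can hadronize.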

¹² Herb [19].
¹³ Berger et al. [10].
¹⁴ E.g., Finocchiaro et al. [14]. $\Upsilon'''$ is above the threshold for $B\bar B$ production.
¹⁵ Behrends et al. [9].
¹⁶ Bartel [8].
¹⁷ Abe et al. [4]; Abachi et al. [1].
¹⁸ The $t$ quark was predicted by Kobayashi and Maskawa [20] in their work on CP violation.
¹⁹ Abazov et al. [2, 3]; Aaltonen et al. [5].
²⁰ Magnitudes of these matrix elements, i.e., the absolute values of the amplitudes, will be given in Chap. 35 on the electroweak theory.

References
1. Abachi, S., et al. (1995). Observation of the top quark. Physical Review Letters, 74, 2632–2637.
2. Abazov, V. M., et al. (2007). Evidence for production of single top quarks and first direct measurement of $|V_{tb}|$. Physical Review Letters, 98, 181802.
3. Abazov, V. M., et al. (2009). Observation of single top quark production. Physical Review Letters, 103, 092001.
4. Abe, F., et al. (1994). Evidence for top quark production in $p\bar p$ collisions at $\sqrt{s} = 1.8$ TeV. Physical Review Letters, 73, 225–231.
5. Aaltonen, T., et al. (2009). First observation of electroweak single top quark production. Physical Review Letters, 103, 092002.
6. Aubert, J. J., et al. (1974). Experimental observation of a heavy particle J. Physical Review Letters, 33, 1404–1406.
7. Augustin, J.-E., et al. (1974). Discovery of a narrow resonance in $e^+e^-$ annihilation. Physical Review Letters, 33, 1406–1408.
8. Bartel, W. (1984). A measurement of the electroweak induced charge asymmetry in $e^+e^- \to b\bar b$. Physics Letters, 146B, 437–442.
9. Behrends, S., et al. (1983). Observation of exclusive decay modes of b-flavored mesons. Physical Review Letters, 50, 881–885.
10. Berger, Ch., et al. (1978). Observation of a narrow resonance formed in $e^+e^-$ annihilation at 9.46 GeV. Physics Letters, 76B, 243–245.
11. Bjørken, B. J., & Glashow, S. L. (1964). Elementary particles and SU(4). Physics Letters, 11, 255.
12. Cabibbo, N. (1963). Unitary symmetry and leptonic decays. Physical Review Letters, 10, 531–533.
13. Cazzoli, E. G., et al. (1975). Evidence for $\Delta S = -\Delta Q$ currents or charmed-baryon production by neutrinos. Physical Review Letters, 34, 1125–1129.
14. Finocchiaro, G., et al. (1980). Observation of the $\Upsilon'''$ at the Cornell Electron Storage Ring. Physical Review Letters, 45, 222–224.
15. Gell-Mann, M. (1953). Isotopic spin and new unstable particles. Physical Review, 92, 833–834.
16. Gell-Mann, M., & Ne'eman, Y. (1964). The eightfold way. New York: Benjamin.
17. Glashow, S. L., Iliopoulos, J. L., & Maiani, L. (1970). Weak interactions with lepton-hadron symmetry. Physical Review D, 2, 1285–1292.
18. Goldhaber, G., et al. (1976). Observation in $e^+e^-$ annihilation of a narrow state at 1865 MeV/$c^2$ decaying to $K\pi$, $K\pi\pi\pi$. Physical Review Letters, 37, 255–259.
19. Herb, S. W. (1977). Observation of a di-muon resonance at 9.5 GeV in 400-GeV proton-nucleus collisions. Physical Review Letters, 39, 252–255.
20. Kobayashi, M., & Maskawa, K. (1973). CP violation in the renormalizable theory of weak interaction. Progress of Theoretical Physics, 49, 652–657.
21. Nambu, Y., et al. (1951). On the nature of V-particles, I. Progress of Theoretical Physics, 6, 615–619.
22. Nakano, T., & Nishijima, K. (1953). Charge independence for V-particles. Progress of Theoretical Physics, 10, 581–582.
23. Pais, A. (1952). Some remarks on the V-particles. Physical Review, 86, 663–673.
24. Perl, M. L., et al. (1975). Evidence for anomalous lepton production in $e^+e^-$ annihilation. Physical Review Letters, 35, 1489–1492.
25. Peruzzi, I., et al. (1976). Observation of a narrow charged state at 1876 MeV/$c^2$ decaying to an exotic combination of $K\pi\pi$. Physical Review Letters, 37, 269–271.
26. Rochester, G. D., & Butler, C. C. (1947). Evidence for the existence of new unstable elementary particles. Nature, 160, 855–857.

20 C, P, CP and T Violations in Weak Interactions

Prerequisites Chap. 18

We recall the meanings of the fundamental C, P and T transformations of basic processes. Charge conjugation C consists of replacing every particle in a given process by its anti-particle, while the parity transformation P, also referred to as space reflection, consists of considering a given process as reflected through the origin of a coordinate system, and involves the reversal of the direction of the three-momentum of every particle as well as the consideration of their intrinsic parities. Time reversal T involves reversing the direction in which a process evolves; as such, the particles initially going into it become outgoing, and vice versa, while the directions of their momenta and their spin projections are reversed. As far as is known, the product transformation CPT is a symmetry of nature. The CPT symmetry in QFT is the subject matter of Chap. 36. In the weak interactions P, C, CP and T violations have been observed and will be considered in detail below. To discuss these symmetry breakings, one needs to combine various angular momenta and to consider some basic aspects of spherical harmonics in dealing with orbital angular momentum; all that is needed here is summarized in Box 20.1. That P violation occurs in the weak interaction was first emphasized¹ by T. D. Lee and C. N. Yang in 1956, and verified experimentally² by C. S. Wu in 1957 in the $\beta$-decay of the spin-polarized nucleus $^{60}$Co. In the simplest description of $\beta$-decay, $n \to p + e^- + \tilde\nu_e$, if the spin of $n$ is polarized in the $+z$ direction and $\theta$ denotes the angle between the $z$-axis and the direction of the momentum of the emerging electron, then under space reflection $\theta \to \pi - \theta$, and the data show that $\text{Prob}(\theta) \neq \text{Prob}(\pi - \theta)$, giving rise to an up-down asymmetry and establishing P violation. Observations³ show that neutrinos are left-handed, while anti-neutrinos are right-handed. That is, in particular, the anti-neutrino $\tilde\nu_e$ observed in $\beta$-decay is right-handed.
In Box 20.2, we consider P violation in the decay $\Lambda^0 \to p + \pi^-$, where $\Lambda^0$, $\pi^-$ have each an intrinsic parity of $-1$, while the intrinsic parity of the proton is $+1$, and thus the intrinsic parities need not be considered in the analysis. An early experiment⁴ investigated the helicities of the electron and positron in the charge conjugate processes of $\mu^\mp$ decay: $\mu^- \to e^- + \tilde\nu_e + \nu_\mu$, $\mu^+ \to e^+ + \nu_e + \tilde\nu_\mu$, and found them to be opposite to each other, with the electron being left-handed and the positron being right-handed.⁵ These charge conjugate weak decay processes are interesting and worth considering in the light of C violation. Suppose the $\mu^-$ particle, at rest, has its spin, say, in the $+z$ direction. As a simple description of C violation here, we

¹ Lee and Yang [10].
² Wu [15].
³ See, e.g., Goldhaber, Grodzins and Sunyar [8].
⁴ See Culligan, Frank and Holt [5].
⁵ These processes have been studied theoretically in detail, e.g., in Okun and Šehter [12].

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_20


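Box 20.1 Combining angular momenta and some aspects of the spherical harmonics

Combining angular momenta: $\mathbf J = \mathbf J_1 + \mathbf J_2$, $j = |j_1 - j_2|, \ldots, j_1 + j_2$, $m = m_1 + m_2$, $(j_1,\ m_1 = -j_1, -j_1+1, \cdots, +j_1)$, $(j_2,\ m_2 = -j_2, -j_2+1, \cdots, +j_2)$, $|j_1, m_1\rangle|j_2, m_2\rangle \equiv |m_1, m_2\rangle$.

• $|j, m\rangle = \sum_{m_1, m_2} |m_1, m_2\rangle\langle m_1, m_2|j, m\rangle$.

• $j_1 = 1/2$, $j_2 = 1/2$:
$$|1,+1\rangle = |1/2, 1/2\rangle, \quad |1,0\rangle = \tfrac{1}{\sqrt2}\big[|1/2,-1/2\rangle + |{-1/2},1/2\rangle\big], \quad |1,-1\rangle = |{-1/2},-1/2\rangle,$$
$$|0,0\rangle = \tfrac{1}{\sqrt2}\big[|1/2,-1/2\rangle - |{-1/2},1/2\rangle\big].$$

• $j_1 = 1$, $j_2 = 1/2$:
$$|1/2, 1/2\rangle = \sqrt{2/3}\,|1,-1/2\rangle - \sqrt{1/3}\,|0,1/2\rangle, \qquad |1/2,-1/2\rangle = \sqrt{1/3}\,|0,-1/2\rangle - \sqrt{2/3}\,|{-1},1/2\rangle,$$
$$|3/2, 3/2\rangle = |1,1/2\rangle, \qquad |3/2, 1/2\rangle = \sqrt{1/3}\,|1,-1/2\rangle + \sqrt{2/3}\,|0,1/2\rangle,$$
$$|3/2,-1/2\rangle = \sqrt{2/3}\,|0,-1/2\rangle + \sqrt{1/3}\,|{-1},1/2\rangle, \qquad |3/2,-3/2\rangle = |{-1},-1/2\rangle.$$

• Spherical harmonics: $\langle\theta,\phi|\ell,m\rangle \equiv Y_{\ell m}(\theta,\phi)$, $\ell = 0, 1, 2, \ldots$; $m = -\ell, -\ell+1, \cdots, +\ell$.

• Under space reflection, $\theta \to \pi - \theta$, $\phi \to \phi + \pi$: $Y_{\ell m}(\theta,\phi) \to (-1)^\ell\, Y_{\ell m}(\theta,\phi)$.

$$Y_{00}(\theta,\phi) = \sqrt{1/4\pi}, \qquad Y_{10}(\theta,\phi) = \sqrt{3/4\pi}\,\cos\theta, \qquad Y_{11}(\theta,\phi) = -\sqrt{3/8\pi}\;e^{i\phi}\sin\theta, \ \cdots$$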
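The coefficients listed in Box 20.1 for $j_1 = 1$, $j_2 = 1/2$ can be checked for internal consistency. The sketch below stores each state $|j, m\rangle$ as a dictionary of coefficients in the product basis $|m_1, m_2\rangle$ and verifies orthonormality and $m = m_1 + m_2$. The data layout and function names are illustrative choices, not part of the text.

```python
from math import sqrt, isclose

S = 0.5  # spin-1/2 projection

# States |j, m> for j1 = 1, j2 = 1/2, written as {(m1, m2): coefficient}.
states = {
    (1.5,  1.5): {(1,  S): 1.0},
    (1.5,  0.5): {(1, -S): sqrt(1 / 3), (0,  S): sqrt(2 / 3)},
    (1.5, -0.5): {(0, -S): sqrt(2 / 3), (-1, S): sqrt(1 / 3)},
    (1.5, -1.5): {(-1, -S): 1.0},
    (0.5,  0.5): {(1, -S): sqrt(2 / 3), (0,  S): -sqrt(1 / 3)},
    (0.5, -0.5): {(0, -S): sqrt(1 / 3), (-1, S): -sqrt(2 / 3)},
}

def overlap(a, b):
    """Inner product of two states with real coefficients."""
    return sum(coeff * b.get(key, 0.0) for key, coeff in a.items())

def is_orthonormal(table):
    """True if <j,m|j',m'> equals 1 on the diagonal and 0 elsewhere."""
    for x in table:
        for y in table:
            expected = 1.0 if x == y else 0.0
            if not isclose(overlap(table[x], table[y]), expected, abs_tol=1e-12):
                return False
    return True

def m_is_additive(table):
    """True if every component of |j, m> satisfies m = m1 + m2."""
    return all(
        isclose(m1 + m2, m)
        for (_, m), components in table.items()
        for (m1, m2) in components
    )

print(is_orthonormal(states), m_is_additive(states))
```

Both checks pass for the table as listed, which is a quick way to catch a sign or square-root slip when copying such coefficients.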

Box 20.2 P violation in the process $\Lambda^0 \to p + \pi^-$

• Consider the spin 1/2 of $\Lambda^0$ along the $+z$-axis. For the pair of particles $p$ and $\pi^-$, we must combine an orbital angular momentum of quantum number $\ell$ for the motion of the pair and a spin 1/2. That is, we must have $1/2 = |\ell - 1/2|$, i.e., $(\ell = 0,\ m_0 = 0)$ or $(\ell = 1,\ m_1 = 1, 0)$. We use the notation $|\ell, m_\ell\rangle$ for the orbital angular part, and $|1/2, m_s = \pm 1/2\rangle$ for the spin of the proton, with total $m_T = m_\ell + m_s$. The final state must have a total $m_T = 1/2$. Then, for some constants $A$ and $B$, and using Box 20.1 to combine angular momenta with $j_1 = \ell$, $j_2 = 1/2$, it is given by
$$|\psi\rangle = A\,|0,0\rangle|1/2,1/2\rangle + B\Big[\sqrt{2/3}\,|1,1\rangle|1/2,-1/2\rangle - \sqrt{1/3}\,|1,0\rangle|1/2,1/2\rangle\Big].$$
By using $\langle\theta,\phi|\ell,m\rangle \equiv Y_{\ell m}(\theta,\phi)$, we may define the angular distribution state of the proton:
$$\psi(\theta,\phi) = |1/2,1/2\rangle\Big[A\,Y_{00}(\theta,\phi) - B\sqrt{1/3}\;Y_{10}(\theta,\phi)\Big] + |1/2,-1/2\rangle\Big[B\sqrt{2/3}\;Y_{11}(\theta,\phi)\Big].$$
From the expressions of $Y_{\ell m}$ in Box 20.1, the angular distribution of the proton $|\psi(\theta,\phi)|^2$ is then given by
$$|\psi(\theta,\phi)|^2 \propto |A|^2 + |B|^2 - (A^*B + B^*A)\cos\theta \propto \big[1 + \alpha\cos\theta\big], \qquad \alpha = -\frac{[A^*B + B^*A]}{[\,|A|^2 + |B|^2\,]}.$$
From the Particle Data Group, on $\Lambda$ decay parameters, $\alpha \neq 0$, i.e., $\ell = 0$ and $\ell = 1$ both contribute, with $A \neq 0$, $B \neq 0$. Since $\cos\theta \to -\cos\theta$ under space reflection, $|\psi(\theta,\phi)|^2 \neq |\psi(\pi-\theta,\phi+\pi)|^2$ and parity is violated.
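The step from $\psi(\theta,\phi)$ to the $1 + \alpha\cos\theta$ distribution in Box 20.2 can be verified numerically. The sketch below evaluates $|\psi|^2$ directly from the spherical harmonics of Box 20.1 and compares it with the closed form, including the overall factor $1/4\pi$. The amplitudes `A` and `B` are hypothetical numbers chosen only for illustration.

```python
import cmath
from math import cos, sin, pi, sqrt, isclose

def Y00(theta, phi):
    return sqrt(1 / (4 * pi))

def Y10(theta, phi):
    return sqrt(3 / (4 * pi)) * cos(theta)

def Y11(theta, phi):
    return -sqrt(3 / (8 * pi)) * cmath.exp(1j * phi) * sin(theta)

def distribution(A, B, theta, phi):
    """|psi|^2 summed over the two proton spin projections."""
    spin_up = A * Y00(theta, phi) - B * sqrt(1 / 3) * Y10(theta, phi)
    spin_down = B * sqrt(2 / 3) * Y11(theta, phi)
    return abs(spin_up) ** 2 + abs(spin_down) ** 2

def closed_form(A, B, theta):
    """(|A|^2 + |B|^2 - (A*B + B*A) cos(theta)) / (4 pi)."""
    interference = (A.conjugate() * B + B.conjugate() * A).real
    return (abs(A) ** 2 + abs(B) ** 2 - interference * cos(theta)) / (4 * pi)

A, B = 1.0 + 0.2j, 0.4 - 0.3j      # hypothetical s-wave and p-wave amplitudes
theta, phi = 0.7, 1.1

direct = distribution(A, B, theta, phi)
reflected = distribution(A, B, pi - theta, phi + pi)

print(isclose(direct, closed_form(A, B, theta)))   # the two forms agree
print(isclose(direct, reflected))                  # False: the reflected distribution differs
print(isclose(distribution(A, 0j, theta, phi),
              distribution(A, 0j, pi - theta, phi + pi)))  # symmetric if only one wave contributes
```

The asymmetry disappears as soon as either amplitude is set to zero, which is why a nonzero measured $\alpha$ requires both partial waves.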

may start by neglecting the masses of $e^-$, $\tilde\nu_e$, $\nu_\mu$ in comparison to their momenta and to the mass of the muon. We consider a detector which detects the electron in the decay process in which the electron emerges carrying its maximum available energy. To find this maximum available energy for the electron in $\mu^-(p_1) \to e^-(p_2) + \tilde\nu_e(p_3) + \nu_\mu(p_4)$, energy-momentum conservation gives
$$(p_1 - p_2)^2 = -m_\mu^2 + 2\,m_\mu E_e = (p_3 + p_4)^2 \ \Rightarrow\ E_e = \frac{m_\mu^2 - M^2}{2\,m_\mu},$$
where $M^2 \equiv -(p_3 + p_4)^2 = 2\,|\mathbf p_3||\mathbf p_4|(1 - \cos\vartheta)$ attains its minimum value, $0$, for $\cos\vartheta = 1$, i.e., when $\tilde\nu_e$, $\nu_\mu$ move in the same direction. Thus the maximum energy that may be carried by the electron is $E_e^{\max} = m_\mu/2$, and $\tilde\nu_e$, $\nu_\mu$ then move in the direction opposite to that of the electron. To describe this process, we use the Fermi description, which will be considered in detail in Chaps. 33 and 35, with an amplitude given by the simple structure

$$\mathcal M \propto \big[\bar\nu_\mu\,\gamma^\mu(1 - \gamma^5)\,\mu\big]\big[\bar e\,\gamma_\mu(1 - \gamma^5)\,\tilde\nu_e\big], \tag{20.1}$$


expressed in terms of momentum-space Dirac spinors, where $\tilde\nu_e$, in particular, represents the Dirac spinor of an anti-particle created with its spin along its momentum, that is, $\tilde\nu_e = v(\mathbf p_3, +) = \gamma^5 u(\mathbf p_3, -)$, while $\nu_\mu$ represents the Dirac spinor of a particle with its spin opposite to its momentum, i.e., $\nu_\mu = u(\mathbf p_4, -)$, as described in Box 16.1 in Chap. 16. Thus, in the rest frame of $\mu^-$, if $\mathbf p$ denotes the momentum of $e^-$, then $\tilde\nu_e$ and $\nu_\mu$ will move in the direction opposite to $\mathbf p$ with momenta, say, $-\alpha_1\mathbf p$, $-\alpha_2\mathbf p$, with positive $\alpha_1$, $\alpha_2$, $\alpha_1 + \alpha_2 = 1$. The $e^-$ in (20.1) must be left-handed, as a right-handed one gives zero for the expression in it. C violation of the process is worked out in Box 20.3.
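The kinematics leading to $E_e^{\max} = m_\mu/2$ can be illustrated with numbers. This is a minimal sketch that assumes a muon mass of 105.66 MeV/$c^2$ (an input not quoted in this chapter) and neglects the electron and neutrino masses, as in the text; the sharing fraction `alpha1` is an arbitrary illustrative choice.

```python
M_MU = 105.66  # muon mass in MeV/c^2 (assumed input value)

def electron_energy(m_sq, m_mu=M_MU):
    """E_e = (m_mu^2 - M^2) / (2 m_mu), with the electron mass neglected."""
    return (m_mu**2 - m_sq) / (2.0 * m_mu)

# Collinear neutrinos: M^2 = 0, so the electron takes its maximum energy.
e_max = electron_energy(0.0)

# The neutrinos recoil against the electron and share the remaining momentum.
alpha1 = 0.3                        # illustrative fraction, 0 < alpha1 < 1
p3 = alpha1 * e_max                 # |p3| = alpha1 |p|
p4 = (1.0 - alpha1) * e_max         # |p4| = alpha2 |p|, alpha1 + alpha2 = 1

total_energy = e_max + p3 + p4      # should equal m_mu in the muon rest frame
net_momentum = e_max - p3 - p4      # along the electron direction, should vanish

print(f"E_e(max) = {e_max:.2f} MeV")
print(f"total energy = {total_energy:.2f} MeV, net momentum = {net_momentum:.2e} MeV")
print(electron_energy(500.0) < e_max)   # any M^2 > 0 lowers the electron energy
```

Energy and momentum balance for any split $\alpha_1 + \alpha_2 = 1$, so the maximum-energy configuration fixes only the direction of the two neutrinos, not how they share the recoil.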

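Box 20.3 C violation of the process $\mu^- \to e^- + \tilde\nu_e + \nu_\mu$

We work in the chiral representation (Box 16.5 in Chap. 16):
$$\gamma^0 = \begin{pmatrix} 0 & -I \\ -I & 0 \end{pmatrix}, \qquad \gamma^5 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}.$$
Suppose that $\mu^-$, at rest, has its spin along the $+z$ direction. Then (Box 16.5 in Chap. 16), in a convenient notation, $\mu^- = \big([\,1\ \ 0\,]\ [\,{-1}\ \ 0\,]\big)^\top$, up to a normalization constant. Let $\mathbf p$ denote the momentum of the electron. Then, for the process $\mu^- \to e^- + \tilde\nu_e + \nu_\mu$,
$$e^- = u(\mathbf p, \text{opposite } \mathbf p) = \big([\,0\ \ 0\,]\ [\,{-\sin(\theta/2)}\ \ \cos(\theta/2)\,]\big)^\top,$$
$$\tilde\nu_e = v(-\alpha_1\mathbf p, \text{along } {-\mathbf p}) = \gamma^5 u(-\alpha_1\mathbf p, \text{opposite } {-\mathbf p}) = \big([\,0\ \ 0\,]\ [\,{-\cos(\theta/2)}\ \ {-\sin(\theta/2)}\,]\big)^\top,$$
$$\nu_\mu = u(-\alpha_2\mathbf p, \text{opposite } {-\mathbf p}) = \big([\,0\ \ 0\,]\ [\,\cos(\theta/2)\ \ \sin(\theta/2)\,]\big)^\top,$$
$$\bar\nu_\mu\,\gamma^\sigma(1-\gamma^5)\,\mu^- = 2\big({-\cos(\theta/2)},\ \sin(\theta/2),\ i\sin(\theta/2),\ \cos(\theta/2)\big),$$
$$\bar e^-\,\gamma_\sigma(1-\gamma^5)\,\tilde\nu_e = 2\big(0,\ \cos\theta,\ i,\ {-\sin\theta}\big).$$
Accordingly, given that the electron has emerged carrying its maximum possible energy, the amplitude that the angle between its momentum and the spin of $\mu^-$ is equal to $\theta$ is given by
$$\mathcal M(\mu^- \to e^- + \tilde\nu_e + \nu_\mu) \propto \big[\bar\nu_\mu\gamma^\sigma(1-\gamma^5)\mu^-\big]\big[\bar e^-\gamma_\sigma(1-\gamma^5)\tilde\nu_e\big] \propto \sin(\theta/2).$$
A similar analysis gives
$$\mathcal M(\mu^+ \to e^+ + \nu_e + \tilde\nu_\mu) \propto \cos(\theta/2).$$
Accordingly, given that $e^-/e^+$ have emerged carrying their maximum possible energy, the probabilities that their momenta make an angle $\theta$ with the spins of their respective muons are, respectively, given by
$$\text{Prob}_\mp(\theta) = \frac{1}{4\pi}\big(1 \mp \cos\theta\big), \qquad \int \text{Prob}_\mp(\theta)\,d\Omega = 1,$$
where we have used the identities $\sin^2(\theta/2) = (1-\cos\theta)/2$, $\cos^2(\theta/2) = (1+\cos\theta)/2$. That is, $e^-$ moves, predominantly, in a direction opposite to the spin of $\mu^-$, while $e^+$ moves, predominantly, in a direction along the spin of $\mu^+$, thus confirming C violation.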
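The contraction of the two currents in Box 20.3 and the normalization of the resulting probabilities can be checked numerically. The sketch below assumes that the two four-component lists in the box are paired component by component (one carrying an upper index, the other a lower index), so that the contraction is a plain sum of products; that reading of the component ordering is an assumption, as are the function names.

```python
from math import sin, cos, pi

def muon_current(theta):
    """Components of nu_mu-bar gamma^sigma (1 - gamma^5) mu^- as listed in Box 20.3."""
    h = theta / 2
    return [2 * c for c in (-cos(h), sin(h), 1j * sin(h), cos(h))]

def electron_current(theta):
    """Components of e-bar gamma_sigma (1 - gamma^5) nu_e-tilde as listed in Box 20.3."""
    return [2 * c for c in (0, cos(theta), 1j, -sin(theta))]

def amplitude(theta):
    # Assumption: components are paired index by index (upper with lower).
    return sum(a * b for a, b in zip(muon_current(theta), electron_current(theta)))

def prob(theta, sign):
    """Prob(theta) = (1 + sign * cos(theta)) / (4 pi); sign = -1 for e^-, +1 for e^+."""
    return (1 + sign * cos(theta)) / (4 * pi)

def integrate_over_sphere(f, n=2000):
    """Midpoint rule in theta; the phi integration contributes a factor 2 pi."""
    dtheta = pi / n
    total = sum(f((k + 0.5) * dtheta) * sin((k + 0.5) * dtheta) for k in range(n))
    return total * dtheta * 2 * pi

for theta in (0.3, 1.0, 2.5):
    print(theta, amplitude(theta), -8 * sin(theta / 2))

print(integrate_over_sphere(lambda t: prob(t, -1)))
print(integrate_over_sphere(lambda t: prob(t, +1)))
```

Under the stated pairing the contraction equals $-8\sin(\theta/2)$ for every angle, consistent with $\mathcal M \propto \sin(\theta/2)$, and both probability densities integrate to one over the sphere.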

CP violation was first observed in the $K^0 - \bar K^0$ system in 1964.⁶ The $K^0$ particle⁷ is a neutral particle ($J^P = 0^-$) produced via the strong interaction, e.g., through the process $\pi^- + p \to K^0 + \Lambda^0$. The quark contents of the products are $\Lambda^0 = (uds)$, $K^0 = (d\bar s)$, and the latter process conserves strangeness. The anti-particle of $K^0$ has quark content $\bar K^0 = (s\bar d)$ and is a different particle. The main observation is that when one produces a beam of $K^0$ particles, part of the beam decays rapidly into two pions, and a part decays later into three pions, with corresponding lifetimes of about $10^{-10}$ s and $10^{-8}$ s, characteristic of weak decays. This has led to the assumption⁸ that the state describing the $K^0$ particle is a linear combination of two states having decay laws with different lifetimes. One may then define the initial state of the $K^0$ meson as $|K^0\rangle = (1/\sqrt2)\big[|K_1\rangle + |K_2\rangle\big]$, with $|K_1\rangle$ and $|K_2\rangle$ denoting the short and long lived states, which decay, respectively, to two pions and three pions. Here we see the linear superposition principle of quantum theory, of adding amplitudes, in action. If CP is conserved, the $K_2$ cannot decay into two pions (see Box 20.4). Such a decay was subsequently observed by Christenson et al.,⁹ mentioned above. This rare decay of the long lived kaon into two pions gave rise to the discovery of CP violation. For details of CP violation see Box 20.4. For consistency with experiments, the long lived state $|K_2\rangle$ had to be slightly modified to include the $|K_1\rangle$ state as follows: $|K_L\rangle = \big(\sqrt{1 + |\epsilon|^2}\,\big)^{-1}\big[\epsilon\,|K_1\rangle + |K_2\rangle\big]$, where $|\epsilon| \approx 2\times10^{-3}$.¹⁰ A direct test of T invariance was carried out in the $K^0 - \bar K^0$ system¹¹ and is based on comparing the transition rates $K^0 \to \bar K^0$ and $\bar K^0 \to K^0$; it is given in Box 20.4.

Box 20.4 CP violation, and direct T violation in the $K^0 - \bar K^0$ system

• The $|K^0\rangle$ and $|\bar K^0\rangle$ states are defined by the linear combinations
$$|K^0\rangle = \tfrac{1}{\sqrt2}\big[|K_1\rangle + |K_2\rangle\big], \qquad |\bar K^0\rangle = \tfrac{1}{\sqrt2}\big[|K_2\rangle - |K_1\rangle\big].$$
On the other hand, $P|K^0\rangle = -|K^0\rangle$, $CP|K^0\rangle = -|\bar K^0\rangle$, $CP|\bar K^0\rangle = -|K^0\rangle$, from which we infer that

• $CP|K_1\rangle = |K_1\rangle$, $\quad CP|K_2\rangle = -|K_2\rangle$.

For a neutral particle decaying into two pions, we have to consider the pair products $(\pi^0\pi^0)$, $(\pi^+\pi^-)$. The kaon as well as the pions are of spin zero, and since under space reflection (see Box 20.1) $Y_{\ell m} \to (-1)^\ell\,Y_{\ell m}$, this requires that $\ell = 0$. On the other hand, due to the pseudoscalar nature of the pions, $P\langle\theta,\phi|\pi\pi\rangle = (-1)^2(-1)^\ell\langle\theta,\phi|\pi\pi\rangle$, with $(-1)^2(-1)^\ell = 1$. According to Bose statistics, this requires that the isospin final state is even under the exchange $\pi^+ \leftrightarrow \pi^-$, $\pi^0 \leftrightarrow \pi^0$. That is, the final state is a linear combination of $\big[|\pi^+\pi^-\rangle + |\pi^-\pi^+\rangle\big]$, $|\pi^0\pi^0\rangle$. This is invariant under C. Accordingly, $CP|\pi\pi\rangle = |\pi\pi\rangle$, and if CP is conserved, only the $K_1$ mode can decay into two pions. The mere fact that the rare decay of the long lived state $|K_2\rangle$ into two pions has been observed, as mentioned in the text, establishes CP violation. [It is not difficult to show that the $|K_2\rangle$ mode can decay into three pions, e.g., $K_2 \to \pi^0\pi^0\pi^0$, where $CP|\pi^0\pi^0\pi^0\rangle = -|\pi^0\pi^0\pi^0\rangle$.]

• Direct T violation was observed experimentally in the $K^0 - \bar K^0$ system by comparing the forward and backward probabilities of the $K^0 - \bar K^0$ virtual transitions, given by
$$\frac{\text{Prob}(\bar K^0 \to K^0) - \text{Prob}(K^0 \to \bar K^0)}{\text{Prob}(\bar K^0 \to K^0) + \text{Prob}(K^0 \to \bar K^0)} = (4.3 - 8.9)\times10^{-3}.$$
[See also PDG [14] for the tabulation of this asymmetry.] The mere fact that the latter does not vanish provides evidence of time-reversal non-invariance.

⁶ Christenson et al. [4].
⁷ The neutral kaon was discovered in 1947 by Rochester and Butler [13].
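The CP assignments of $|K_1\rangle$ and $|K_2\rangle$ follow from a two-by-two matrix computation, sketched below. The basis ordering $(|K^0\rangle, |\bar K^0\rangle)$, the helper names, and the choice of a real $\epsilon$ (its phase is ignored here) are assumptions made for illustration.

```python
from math import sqrt, isclose

# Basis order: (|K0>, |K0bar>). CP|K0> = -|K0bar>, CP|K0bar> = -|K0>.
CP = [[0.0, -1.0],
      [-1.0, 0.0]]

def apply(matrix, vector):
    """Matrix-vector product for 2x2 matrices."""
    return [sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Inverting |K0> = (K1 + K2)/sqrt(2), |K0bar> = (K2 - K1)/sqrt(2):
K1 = [1 / sqrt(2), -1 / sqrt(2)]
K2 = [1 / sqrt(2), 1 / sqrt(2)]

def k_long(eps):
    """|K_L> = (eps |K1> + |K2>) / sqrt(1 + |eps|^2), eps taken real here."""
    norm = sqrt(1 + abs(eps) ** 2)
    return [(eps * a + b) / norm for a, b in zip(K1, K2)]

KL = k_long(2e-3)
cp_even_fraction = dot(K1, KL) ** 2   # probability of finding the CP-even state in K_L

print(apply(CP, K1), K1)      # CP K1 = +K1
print(apply(CP, K2), K2)      # CP K2 = -K2
print(cp_even_fraction)       # of order |eps|^2, about 4e-6
```

The small CP-even admixture is what allows the long lived state to decay occasionally into two pions.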

⁸ Gell-Mann and Pais [7].
⁹ Christenson et al. [4].
¹⁰ CP violation is also observed in the B-meson system; see Aubert et al. [2]; Chao et al. [3]. CP violation has also been considered in baryon decay, which would indicate an asymmetry in the decay rates of a baryon and an anti-baryon: LHCb collaboration [11]. See also the paper by Durieux and Grossman [6], which may be relevant in understanding the matter-antimatter imbalance in the Universe.
¹¹ Angelopoulos et al. [1]. Tests of T-invariance in the neutral kaon system go back to investigations carried out by Kabir [9].

References
1. Angelopoulos, A., et al. (1998). First direct observation of time-reversal non-invariance in the neutral-kaon system. Physics Letters B, 444, 43–51.
2. Aubert, R., et al. (2004). Direct CP violating asymmetry in $B^0 \to K^+\pi^-$ decays. Physical Review Letters, 93, 131801.
3. Chao, Y., et al. (2004). Evidence for direct CP violation in $B^0 \to K^+\pi^-$ decays. Physical Review Letters, 93, 191802.
4. Christenson, J. H., et al. (1964). Evidence for the $2\pi$ decay of the $K_2^0$ meson. Physical Review Letters, 13, 138–141.
5. Culligan, G., Frank, S. G. F., & Holt, J. R. (1959). Longitudinal polarization of the electrons from the decay of unpolarized positive and negative muons. Proceedings of the Physical Society, 73, 169–177.
6. Durieux, G., & Grossman, Y. (2017). CP violation: Another piece of the puzzle. Nature Physics, 13, 322.
7. Gell-Mann, M., & Pais, A. (1955). Behavior of neutral particles under charge conjugation. Physical Review, 97, 1387–1390.
8. Goldhaber, M., Grodzins, L., & Sunyar, A. W. (1959). Helicity of neutrinos. Physical Review, 109, 1015–1017.
9. Kabir, P. K. (1968). Tests of T and TCP invariance in $K^0$ decay. Nature, 220, 1310–1313.
10. Lee, T. D., & Yang, C. N. (1956). Question of parity conservation in weak interactions. Physical Review, 104, 254–258.
11. LHCb collaboration. (2017). Measurement of matter-antimatter differences in beauty baryon decays. Nature Physics, 13, 391–396.
12. Okun, L. B., & Šehter, Y. M. (1958). On the polarization of electrons from the decay of muons. Nuovo Cimento, 10, 359–364.
13. Rochester, G. D., & Butler, C. C. (1947). Evidence for the existence of new unstable elementary particles. Nature, 160, 855–857.
14. Tanabashi, M., et al. (2018). Particle data group (2018). Physical Review D, 98, 010001.
15. Wu, C. S. (1957). Experimental tests of parity conservation in beta decay. Physical Review, 105, 1413–1415.

21 Lagrangians: Varying Action Integrals in QFT

Prerequisite Chap. 16

Field equations of a given theory may be derived by taking functional derivatives of the corresponding action integral w.r.t. the fields appearing in the Lagrangian density $\mathcal L(x)$ of the underlying theory; this is referred to as the principle of stationary action.¹ Let $\chi(x)$ be a generic field; then the field equation generated by the functional derivative of the action w.r.t. the field $\chi(x)$ is given by

$$\frac{\delta}{\delta\chi(x)}\int (dx')\,\mathcal L(x') = 0, \tag{21.1}$$

where

$$A = \int (dx)\,\mathcal L(x) \tag{21.2}$$

is referred to as the action integral.
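The content of (21.1) can be made concrete on a lattice, where the functional derivative becomes an ordinary partial derivative divided by the cell size. The following is a rough sketch under simplifying assumptions: a static field in one spatial dimension on a periodic grid, with the Klein-Gordon-type density $\mathcal L = -\tfrac12(\varphi')^2 - \tfrac12 m^2\varphi^2 + K\varphi$ (the static reduction of the Lagrangian density considered later in this chapter). All names and numerical parameters are illustrative.

```python
from math import sin, pi

N, M = 400, 1.0                 # grid points and mass parameter (illustrative)
H = 2 * pi / N                  # lattice spacing on a periodic interval [0, 2 pi)
xs = [i * H for i in range(N)]

def action(phi, source):
    """Discretized A = sum_i H * [ -(phi')^2/2 - M^2 phi^2/2 + K phi ]."""
    total = 0.0
    for i in range(N):
        dphi = (phi[(i + 1) % N] - phi[i]) / H
        total += H * (-0.5 * dphi**2 - 0.5 * M**2 * phi[i]**2 + source[i] * phi[i])
    return total

def functional_derivative(phi, source, i, delta=1e-4):
    """Central difference of the action w.r.t. phi[i], divided by the cell size H."""
    up = list(phi)
    up[i] += delta
    down = list(phi)
    down[i] -= delta
    return (action(up, source) - action(down, source)) / (2 * delta * H)

# phi = sin(x) solves (-d^2/dx^2 + M^2) phi = K for K = (1 + M^2) sin(x).
source = [(1 + M**2) * sin(x) for x in xs]
solution = [sin(x) for x in xs]
zero_field = [0.0] * N

i = 37
residual_solution = functional_derivative(solution, source, i)
residual_zero = functional_derivative(zero_field, source, i)

print(residual_solution)            # close to 0: the action is stationary at the solution
print(residual_zero, source[i])     # equals K(x_i): phi = 0 is not a solution
```

At the solution the derivative vanishes up to the lattice error of order $H^2$, while for a field that does not satisfy the field equation it returns the left-over terms of that equation.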

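The generation of field equations by taking functional derivatives of the action is a faster way than, but equivalent to, the application of the familiar Euler-Lagrange equations you learn in mechanics. A Lagrangian density is, in general, a sum of products of fields and their derivatives, as we will encounter in subsequent chapters. Accordingly, we have to set up the rules and learn how to take functional derivatives of fields and their derivatives. We consider, in turn, bosonic as well as fermionic fields as given below.

• Bosonic fields:

$$\frac{\delta}{\delta\chi_a^{\mu}(x)}\,\chi_b^{\nu}(x') = \delta^{\nu}{}_{\mu}\,\delta_{ba}\,\delta^{(4)}(x'-x), \tag{21.3}$$

$$\frac{\delta}{\delta\chi_a^{\mu}(x)}\,\partial'_\rho\chi_b^{\nu}(x') = \delta^{\nu}{}_{\mu}\,\delta_{ba}\,\partial'_\rho\,\delta^{(4)}(x'-x) = -\,\delta^{\nu}{}_{\mu}\,\delta_{ba}\,\partial_\rho\,\delta^{(4)}(x'-x), \tag{21.4}$$

$$\frac{\delta}{\delta\chi_a^{\mu}(x)}\Big[\partial'_\rho\chi_b^{\alpha}(x')\,\partial'_\sigma\chi_c^{\beta}(x')\Big] = \delta^{\alpha}{}_{\mu}\,\delta_{ba}\,\partial'_\rho\,\delta^{(4)}(x'-x)\,\partial'_\sigma\chi_c^{\beta}(x') + \partial'_\rho\chi_b^{\alpha}(x')\,\delta^{\beta}{}_{\mu}\,\delta_{ca}\,\partial'_\sigma\,\delta^{(4)}(x'-x)$$
$$= -\,\delta^{(4)}(x'-x)\,\partial_\rho\partial_\sigma\Big[\delta^{\alpha}{}_{\mu}\,\delta_{ba}\,\chi_c^{\beta}(x) + \delta^{\beta}{}_{\mu}\,\delta_{ca}\,\chi_b^{\alpha}(x)\Big], \tag{21.5}$$

where the last equality in (21.5) is understood under an integration over $x'$, after an integration by parts.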

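• Fermionic fields:

$$\frac{\delta}{\delta\psi_a^{\mu}(x)}\,\psi_b^{\nu}(x') = \delta^{\nu}{}_{\mu}\,\delta_{ba}\,\delta^{(4)}(x'-x), \tag{21.6}$$

$$\frac{\delta}{\delta\psi_a^{\mu}(x)}\Big[\psi_b^{\nu}(x')\,\psi_c^{\sigma}(x')\Big] = \Big[\delta^{\nu}{}_{\mu}\,\delta_{ba}\,\psi_c^{\sigma}(x') - \delta^{\sigma}{}_{\mu}\,\delta_{ca}\,\psi_b^{\nu}(x')\Big]\delta^{(4)}(x'-x), \tag{21.7}$$

$$\frac{\delta}{\delta\psi_a^{\mu}(x)}\Big[\bar\psi_b^{\nu}(x')\,\psi_c^{\sigma}(x')\Big] = -\,\bar\psi_b^{\nu}(x)\,\delta^{\sigma}{}_{\mu}\,\delta_{ca}\,\delta^{(4)}(x'-x), \tag{21.8}$$

$$\frac{\delta}{\delta\bar\psi_a^{\mu}(x)}\Big[\bar\psi_b^{\nu}(x')\,\psi_c^{\sigma}(x')\Big] = +\,\psi_c^{\sigma}(x)\,\delta^{\nu}{}_{\mu}\,\delta_{ba}\,\delta^{(4)}(x'-x), \tag{21.9}$$

$$\frac{\delta}{\delta\psi_a^{\mu}(x)}\Big[\partial'_\rho\bar\psi_b^{\nu}(x')\,\psi_c^{\sigma}(x')\Big] = -\,\big[\partial_\rho\bar\psi_b^{\nu}(x)\big]\,\delta^{\sigma}{}_{\mu}\,\delta_{ca}\,\delta^{(4)}(x'-x), \tag{21.10}$$

$$\frac{\delta}{\delta\bar\psi_a^{\mu}(x)}\Big[\bar\psi_b^{\nu}(x')\,\partial'_\rho\psi_c^{\sigma}(x')\Big] = +\,\big[\partial_\rho\psi_c^{\sigma}(x)\big]\,\delta^{\nu}{}_{\mu}\,\delta_{ba}\,\delta^{(4)}(x'-x). \tag{21.11}$$

¹ For the rather non-trivial aspect of the rationale of the principle of stationary action, see Manoukian [1], pp. 146–150.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_21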

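We learn, in particular, that a minus sign arises when we commute a fermionic field with the functional derivative of a fermionic field. For example, consider the Lagrangian density

$$\mathcal L = \frac12\left[\frac{\partial_\mu\bar\psi}{i}\,\gamma^\mu\psi - \bar\psi\,\gamma^\mu\,\frac{\partial_\mu\psi}{i}\right] - m\,\bar\psi\psi + \bar\eta\,\psi + \bar\psi\,\eta, \tag{21.12}$$

where $\eta$ is an external source (see Chap. 16), and $\bar\psi = \psi^\dagger\gamma^0$. In particular, note that

$$\frac{\delta}{\delta\bar\psi(x)}\Big[\partial'_\mu\bar\psi(x')\,\gamma^\mu\psi(x')\Big] = \partial'_\mu\,\delta^{(4)}(x'-x)\,\gamma^\mu\psi(x') = -\,\partial_\mu\gamma^\mu\psi(x)\,\delta^{(4)}(x'-x), \tag{21.13}$$

and similarly for the other terms in $\mathcal L(x')$, which leads, upon integration over $x'$, to

$$-\left[\frac{\gamma^\mu\partial_\mu}{i} + m\right]\psi(x) + \eta(x) = 0 \ \Rightarrow\ \left[\frac{\gamma^\mu\partial_\mu}{i} + m\right]\psi(x) = \eta(x). \tag{21.14}$$

If $|0_-\rangle$ denotes the vacuum state before the source $\eta$ is switched on, while $|0_+\rangle$ denotes the vacuum state after the source $\eta$ is switched off (see Chap. 16), then

$$\frac{\langle 0_+|\psi(x)|0_-\rangle}{\langle 0_+|0_-\rangle} = \int (dx')\,S_+(x-x')\,\eta(x'), \tag{21.15}$$

$$S_+(x-x') = \int \frac{(dp)}{(2\pi)^4}\; e^{\,i p(x-x')}\;\frac{(-\gamma p + m)}{p^2 + m^2 - i\epsilon}, \qquad \epsilon \to +0. \tag{21.16}$$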

As another example, consider the Lagrangian density

$$\mathcal L = -\frac14 F_{\mu\nu}F^{\mu\nu} + J^\mu A_\mu - \chi\,\partial_\mu A^\mu + \frac{\lambda}{2}\chi^2, \qquad F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu, \tag{21.17}$$

where $J^\mu$ is an external source, and the term $-\chi\,\partial_\mu A^\mu + \frac{\lambda}{2}\chi^2$, with $\chi$ referred to as an auxiliary field, is a gauge fixing term, which generates covariant gauges that may be used in describing the photon, as we will just see. The field equations of $A^\mu$ and $\chi$ are elementary and are given by

$$-\partial_\mu F^{\mu\nu} = J^\nu + \partial^\nu\chi, \qquad \partial_\mu A^\mu = \lambda\chi, \tag{21.18}$$

and upon taking the partial derivative $\partial_\nu$ of the first equation, while using the second equation in the first, we obtain the equivalent equations

$$-\Box A^\mu = J^\mu + (1-\lambda)\,\partial^\mu\chi, \qquad -\Box\chi = \partial^\nu J_\nu. \tag{21.19}$$
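The equivalence of (21.18) and (21.19) is easy to test in momentum space. The sketch below assumes fields proportional to $e^{ikx}$, so that $\partial_\mu \to i k_\mu$ and $-\Box \to k^2$, and a metric of signature $(-,+,+,+)$. The momentum `k`, the source `J` and the gauge parameter `lam` are arbitrary illustrative numbers, and the helper names are not from the text.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal of the metric, signature (-,+,+,+) assumed

def dot(a, b):
    """Lorentz contraction a^mu b_mu for components given with upper indices."""
    return sum(ETA[m] * a[m] * b[m] for m in range(4))

def solve(k, J, lam):
    """Solve (21.19) for A^mu and chi, with d_mu -> i k_mu and -Box -> k^2."""
    k2 = dot(k, k)
    chi = 1j * dot(k, J) / k2                                        # k^2 chi = i k.J
    A = [(J[m] + (1 - lam) * 1j * k[m] * chi) / k2 for m in range(4)]  # k^2 A = J + (1-lam) i k chi
    return A, chi

k = [0.3, 1.0, -0.5, 2.0]              # off-shell momentum, k^2 != 0
J = [1.0 + 0.5j, -0.2, 0.7j, 0.4]      # arbitrary source components J^mu
lam = 0.35

A, chi = solve(k, J, lam)
k2 = dot(k, k)

# Second equation of (21.18): d_mu A^mu = lam * chi  ->  i k.A - lam chi = 0
gauge_residual = 1j * dot(k, A) - lam * chi

# First equation of (21.18): -d_mu F^{mu nu} = J^nu + d^nu chi
#   ->  k^2 A^nu - k^nu (k.A) - J^nu - i k^nu chi = 0
field_residual = [k2 * A[n] - k[n] * dot(k, A) - J[n] - 1j * k[n] * chi for n in range(4)]

print(abs(gauge_residual))
print(max(abs(r) for r in field_residual))
```

Both residuals vanish to rounding accuracy for any value of `lam`, which illustrates that the parameter $\lambda$ only selects the gauge and does not alter the consistency of the two sets of equations.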

The auxiliary field $\chi$, together with the arbitrary real parameter $\lambda$, provides a wealth of covariant gauges for electrodynamics that will be discussed later. By taking the matrix elements of the above two equations, $\langle 0_+|\cdots|0_-\rangle$, where $|0_-\rangle$ is the vacuum state before the source $J^\mu$ is switched on, while $|0_+\rangle$ is the vacuum state after the source $J^\mu$ is switched off (see Chap. 16), we obtain

$$\frac{\langle 0_+|A^\mu(x)|0_-\rangle}{\langle 0_+|0_-\rangle} = \int (dx')\,D_+(x-x')\left[J^\mu(x') - (1-\lambda)\,\frac{\partial'^\mu\partial'^\nu}{\Box'}\,J_\nu(x')\right] = \int (dx')\,D_+^{\mu\nu}(x-x')\,J_\nu(x'), \tag{21.20}$$

where

$$D_+^{\mu\nu}(x-x') = \int \frac{(dk)}{(2\pi)^4}\left[\eta^{\mu\nu} - (1-\lambda)\,\frac{k^\mu k^\nu}{k^2}\right]\frac{e^{\,i k(x-x')}}{k^2 - i\epsilon}, \qquad \epsilon \to +0, \tag{21.21}$$

defines the photon propagator in covariant gauges specified by the parameter $\lambda$. In particular, $\lambda = 1$ defines the Feynman gauge, while $\lambda = 0$ defines the Landau gauge.

The situation with the (Hermitian) Klein–Gordon field interacting with an external source $K$ is particularly simple. For example, for the Lagrangian density (see also Chap. 16)

$$\mathcal L = -\frac12\,\partial_\mu\varphi\,\partial^\mu\varphi - \frac{m^2}{2}\,\varphi^2 + K\varphi, \qquad \big({-\Box} + m^2\big)\varphi(x) = K(x), \tag{21.22}$$

$$\frac{\langle 0_+|\varphi(x)|0_-\rangle}{\langle 0_+|0_-\rangle} = \int (dx')\,\Delta_+(x-x')\,K(x'), \tag{21.23}$$

$$\Delta_+(x-x') = \int \frac{(dk)}{(2\pi)^4}\,\frac{e^{\,i k(x-x')}}{k^2 + m^2 - i\epsilon}, \qquad \epsilon \to +0. \tag{21.24}$$

Reference
1. Manoukian, E. B. (2016). Quantum field theory. I: Foundations and abelian and non-abelian gauge theories. Switzerland: Springer.

Quantum Dynamics: The Functional Differential Formalism of QFT

22

Prerequisites Chaps. 16, 17 and 21

The differential formalism of QFT was introduced by Julian Schwinger.¹ It is referred to as the Schwinger Action Principle, as the Quantum Dynamical Principle, or simply as the Action Principle. It is a most powerful, easy to apply, and very elegant formalism, which involves carrying out functional differentiations of explicitly known functional expressions. Transition amplitudes are then extracted by factoring out, from the vacuum-to-vacuum transition amplitude $\langle 0_+|0_-\rangle$, the amplitudes for particles that are initially produced and eventually interact, and the amplitudes for particles that are eventually detected in a given process, as they are encountered in experimental situations.

Suppose one has a Hamiltonian $H(t,\lambda)$ describing the underlying dynamics of a system, depending on some parameters, such as coupling parameters and external sources, that we wish to vary in order to investigate the response of the system to their variations. These parameters are collectively denoted by $\lambda$. The Hamiltonian may have an explicit time dependence assumed to come from such parameters. We may, in turn, introduce the Hamiltonian $H(t,0)$ for $\lambda$ set equal to zero, which may depend on some other parameters which we do not want to vary, and hence the general explicit time dependence in $H(t,0)$ may come from these additional parameters. The corresponding time development unitary operators $U^\dagger(t,\lambda)$ and $U^\dagger(t,0)$, associated with these two Hamiltonians, satisfy the equations

$$i\frac{d}{dt}U^\dagger(t,\lambda) = U^\dagger(t,\lambda)\,H(t,\lambda), \qquad i\frac{d}{dt}U^\dagger(t,0) = U^\dagger(t,0)\,H(t,0). \tag{22.1}$$

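Given a state $\langle a,\lambda=0;\,t| = \langle a,\lambda=0|\,U^\dagger(t,0)$, corresponding to the Hamiltonian $H(t,0)$, we define a state

$$\langle a\,\tau,\lambda| = \langle a,\lambda=0|\,U^\dagger(\tau,0)\,U(\tau,\lambda), \tag{22.2}$$

developing in time via the unitary operator $U^\dagger(t,\lambda)$, i.e.,

$$\langle a\,\tau,\lambda;\,t| = \langle a\,\tau,\lambda|\,U^\dagger(t,\lambda), \qquad i\frac{d}{dt}\langle a\,\tau,\lambda;\,t| = \langle a\,\tau,\lambda;\,t|\,H(t,\lambda). \tag{22.3}$$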

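The important aspect to note about this state is that it satisfies the constraint

$$\langle a\,\tau,\lambda|\,U^\dagger(t,\lambda)\Big|_{t=\tau} = \langle a,\lambda=0|\,U^\dagger(\tau,0)\,U(\tau,\lambda)\,U^\dagger(\tau,\lambda) = \langle a,\lambda=0;\,\tau| = \langle a,\lambda=0|\,U^\dagger(\tau,0), \tag{22.4}$$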

using the unitarity condition $U(\tau,\lambda)\,U^\dagger(\tau,\lambda) = I$. That is, it coincides with the time dependent state associated with the Hamiltonian $H(t,0)$. In practice, one may choose $\tau\to\pm\infty$, with states $\langle a\,{\pm\infty},\lambda;\,t|$ describing scattering states as they approach the states $\langle a,\lambda=0;\,\pm\infty|$ for $t\to\pm\infty$. We use the rather standard notation

$$\langle b\,{+\infty},\lambda\,|\,a\,{-\infty},\lambda\rangle = \langle b\ \mathrm{out}\,|\,a\ \mathrm{in}\rangle, \tag{22.5}$$

obtained from the state $\langle a\,\tau,\lambda|$ defined in (22.2). These are the transition amplitudes one wishes to compute in QFT. To the above end, we introduce, respectively, the states and the following operator

$$|a\,\tau_1,\lambda;\,t\rangle, \qquad \langle b\,\tau_2,\lambda;\,t|, \qquad V(t,\lambda) \equiv U^\dagger(t,0)\,U(t,\lambda), \qquad \tau_1\le t\le\tau_2, \tag{22.6}$$

and rewrite $\langle a\,\tau,\lambda|$ in (22.2) as

$$\langle a\,\tau,\lambda| = \langle a,\lambda=0|\,V(\tau,\lambda). \tag{22.7}$$

¹ Schwinger [5–9]. See also Johnson [1]; Lam [2]; Manoukian [3, 4].

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_22

In Box 22.1, it is shown that the variation of the amplitude $\langle b\,\tau_2,\lambda|a\,\tau_1,\lambda\rangle$, of interest, with $\lambda$ is given by

$$\delta\langle b\,\tau_2,\lambda\,|\,a\,\tau_1,\lambda\rangle = -\,i\,\langle b\,\tau_2,\lambda|\int_{\tau_1}^{\tau_2} d\tau\;\delta\mathbb{H}(\tau,\lambda)\,|a\,\tau_1,\lambda\rangle, \tag{22.8}$$

where $\mathbb{H}(\tau,\lambda)$ is the Hamiltonian in the Heisenberg representation, i.e.,

$$\mathbb{H}(\tau,\lambda) = H(\chi(\tau),\pi(\tau),\lambda,\tau) = U^\dagger(\tau,\lambda)\,H(\chi,\pi,\lambda,\tau)\,U(\tau,\lambda), \tag{22.9}$$

where $H(\chi,\pi,\lambda,\tau)$ is expressed completely in terms of independent fields $\chi$ and their canonical conjugate momenta $\pi = \pi[\chi]$, that is, fields for which $\pi[\chi]\neq 0$, by definition. The canonical conjugate momentum is defined as in mechanics by $\pi[\chi] = \partial\mathcal{L}(x)/\partial\big(\partial_0\chi(x)\big)$, where the derivative here is taken as in ordinary calculus (a partial derivative) rather than as a functional derivative,² and $\mathcal{L}(x)$ denotes the underlying Lagrangian density of the theory. Unlike $\chi$, $\pi$, the Heisenberg picture representations $\chi(\tau)$, $\pi(\tau)$ may depend on the parameters $\lambda$ since $U(\tau,\lambda)$ does. Moreover, the variation $\delta\mathbb{H}(\tau,\lambda)$ w.r.t. $\lambda$ in (22.8) is taken by keeping $\chi(\tau)$, $\pi(\tau)$ in $\mathbb{H}(\tau,\lambda)$ fixed. Equation (22.8) is the celebrated Schwinger Dynamical Principle, also known as the Quantum Dynamical Principle (QDP).

Equation (22.8) may be simplified and generalized further in the following manner. Consider the limits $\tau_2\to+\infty$, $\tau_1\to-\infty$ relevant to applications. Moreover, consider $|0\,{-\infty},\lambda\rangle$ and $\langle 0\,{+\infty},\lambda|$ corresponding, respectively, to the vacuum states in the remote past and the distant future: before particles were produced by external sources to participate in an interaction process, and after all the particles that were eventually produced in the process were finally detected by external sources and the process ended up again in the vacuum state. That is, one may consider the situation where one starts with a vacuum state and ends up in a vacuum state. QM says that these vacuum states are not necessarily the same since new particles may be produced or annihilated in the process, i.e., $|\langle 0\,{+\infty},\lambda|0\,{-\infty},\lambda\rangle|$ is not necessarily equal to one. These states will be denoted, respectively, by $|0\,{-\infty},\lambda\rangle \equiv |0_-\rangle$, $\langle 0\,{+\infty},\lambda| \equiv \langle 0_+|$. Then with $\int_{-\infty}^{+\infty} d\tau\,\delta\mathbb{H}(\tau,\lambda)$ replaced by $-\int (dx)\,\delta\mathcal{L}(x)$, as in mechanics, Eq. (22.8) takes the following equivalent form³ expressed instead in terms of the Lagrangian density $\mathcal{L}(x)$:

$$\delta\langle 0_+|0_-\rangle = i\,\langle 0_+|\int (dx)\;\delta\mathcal{L}(x)\,|0_-\rangle, \tag{22.10}$$

² See the definition of functional derivatives in Chap. 21.
³ For the derivation involved in the transition from (22.8) to (22.10) as well as the proof of (22.12), see Manoukian [4], pp. 171–174.

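Box 22.1 Variations of $\langle b\,\tau_2,\lambda|a\,\tau_1,\lambda\rangle$, $\langle 0_+|0_-\rangle$

$$i\frac{d}{d\tau}\Big[V(\tau_2,\lambda)\,V^\dagger(\tau,\lambda)\,V(\tau,\lambda')\,V^\dagger(\tau_1,\lambda')\Big] = V(\tau_2,\lambda)\Big[U^\dagger(\tau,\lambda)\big(H(\tau,\lambda)-H(\tau,\lambda')\big)U(\tau,\lambda')\Big]V^\dagger(\tau_1,\lambda').$$

Integrate over $\tau$: $\tau_1\le\tau\le\tau_2$ to obtain

$$V(\tau_2,\lambda)V^\dagger(\tau_1,\lambda) - V(\tau_2,\lambda')V^\dagger(\tau_1,\lambda') = -\,i\,V(\tau_2,\lambda)\int_{\tau_1}^{\tau_2} d\tau\,\Big[U^\dagger(\tau,\lambda)\big(H(\tau,\lambda)-H(\tau,\lambda')\big)U(\tau,\lambda')\Big]V^\dagger(\tau_1,\lambda').$$

For $\delta\lambda = \lambda-\lambda'$,

$$\delta\Big[V(\tau_2,\lambda)V^\dagger(\tau_1,\lambda)\Big] = -\,i\,V(\tau_2,\lambda)\int_{\tau_1}^{\tau_2} d\tau\,\Big[U^\dagger(\tau,\lambda)\,\delta H(\tau,\lambda)\,U(\tau,\lambda)\Big]V^\dagger(\tau_1,\lambda).$$

Take the matrix element $\langle b,\lambda=0|\cdots|a,\lambda=0\rangle$ of the above, and note that $\delta|a \text{ or } b,\lambda=0\rangle = 0$. Upon using the definition of the states in (22.7) we obtain

$$\delta\langle b\,\tau_2,\lambda|a\,\tau_1,\lambda\rangle = -\,i\,\langle b\,\tau_2,\lambda|\int_{\tau_1}^{\tau_2} d\tau\,\Big[U^\dagger(\tau,\lambda)\,\delta H(\tau,\lambda)\,U(\tau,\lambda)\Big]|a\,\tau_1,\lambda\rangle.$$

$H(\tau,\lambda) = H(\chi,\pi,\lambda,\tau)$ is expressed in terms of independent fields $\chi$ and their canonical conjugate momenta $\pi = \pi[\chi]$, that is, fields for which $\pi[\chi]\neq 0$. Therefore, we may write $U^\dagger(\tau,\lambda)\,\delta H(\tau,\lambda)\,U(\tau,\lambda) = \delta H(\chi(\tau),\pi(\tau),\lambda,\tau) \equiv \delta\mathbb{H}(\tau,\lambda)$, with the latter in the Heisenberg representation, provided $\chi(\tau)$, $\pi(\tau)$ in $H(\chi(\tau),\pi(\tau),\lambda,\tau)$, which would have a $\lambda$-dependence, are kept fixed.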

where the variation $\delta\mathcal{L}(x)$ is carried out on the explicit dependence of $\mathcal{L}(x)$ on $\lambda$, with all the fields and their derivatives kept fixed. This version is more convenient for applications. Its simplicity is remarkable. It is also referred to as the Schwinger Dynamical Principle or as the Quantum Dynamical Principle.

Gauge theories also involve so-called dependent fields, and the following further generalization of the above equation is important for solving the quantum dynamical problem associated with them. To this end, a field $\rho(x)$ is called a dependent field if its canonical conjugate momentum $\pi[\rho]$ vanishes. That is, if $\mathcal{L}(x,\lambda)$ is the Lagrangian density of the system under consideration, then

$$\pi[\rho] = \frac{\partial\mathcal{L}(x)}{\partial\big(\partial_0\rho(x)\big)} = 0, \tag{22.11}$$

where the derivative here is taken as in ordinary calculus rather than as a functional derivative. Now a dependent field $\rho(x)$ may be written as a function of the independent fields and their derivatives, as well as perhaps of the parameters of the theory, some of which are the $\lambda$ parameters we wish to vary. Accordingly, consider the variation of the matrix element of an arbitrary function $F(\chi(x),\pi(x),\lambda) = B(x,\lambda)$, where $\chi(x)$ are independent fields and $\pi(x)$ their canonical conjugate momenta, and suppose that the dependent fields $\rho(x)$ may be expressed in terms of $\chi(x)$, $\pi(x)$ and possibly their space derivatives, as well as, perhaps, some of the parameters $\lambda$ we wish to vary. Then it is readily verified that

$$\delta\langle 0_+|B(x,\lambda)|0_-\rangle = i\,\langle 0_+|\int (dx')\,\big(\delta\mathcal{L}(x')\,B(x,\lambda)\big)_+|0_-\rangle + \langle 0_+|\delta B(x,\lambda)|0_-\rangle, \tag{22.12}$$

$$\big(\delta\mathcal{L}(x')\,B(x,\lambda)\big)_+ = \delta\mathcal{L}(x')\,B(x,\lambda)\,\theta(x'^0-x^0) + B(x,\lambda)\,\delta\mathcal{L}(x')\,\theta(x^0-x'^0), \tag{22.13}$$

where the latter equation defines the time-ordered product of $\delta\mathcal{L}(x')$ and $B(x,\lambda)$, and $\theta(y) = 1$ for $y > 0$, and $= 0$ for $y < 0$. In (22.12), the variation $\delta B(x,\lambda)$ is w.r.t. $\lambda$ with $\chi(x)$ and $\pi(x)$ kept fixed.

As the simplest example, consider a (Hermitian) Klein–Gordon field interacting with an external source $K(x)$ (see Eqs. (21.22)–(21.24) in Chap. 21)

$$\mathcal{L} = -\frac{1}{2}\partial_\mu\varphi\,\partial^\mu\varphi - \frac{m^2}{2}\varphi^2 + K\varphi, \qquad \frac{\langle 0_+|\varphi(x)|0_-\rangle}{\langle 0_+|0_-\rangle} = \int (dx')\,\Delta_+(x-x')\,K(x'), \tag{22.14}$$

$$(-i)\frac{\delta}{\delta K(x)}\langle 0_+|0_-\rangle = \langle 0_+|\varphi(x)|0_-\rangle = \int (dx')\,\Delta_+(x-x')\,K(x')\,\langle 0_+|0_-\rangle, \tag{22.15}$$

where the functional derivative has been introduced in (21.3), and we have used (22.10), with the Lagrangian density given in (22.14), to write the first equality in (22.15). The above equation is like the equation $df(x)/dx = a(x)f(x)$, whose solution is $f(x)\sim \exp\big[\int^x dx'\,a(x')\big]$. Accordingly

$$\langle 0_+|0_-\rangle = \exp\Big[\frac{i}{2}\int (dx)(dx')\,K(x)\,\Delta_+(x-x')\,K(x')\Big], \tag{22.16}$$

properly normalized to unity for $K = 0$, and where the factor $1/2$ arises due to the reality of $K$.

As another example consider a spin $1/2$ field interacting with an external Grassmann source as given in Chap. 21, Eqs. (21.12)–(21.16)

$$\mathcal{L} = \frac{1}{2}\Big[\Big(\frac{\partial_\mu\overline{\psi}}{i}\Big)\gamma^\mu\psi - \overline{\psi}\,\gamma^\mu\frac{\partial_\mu\psi}{i}\Big] - m_0\,\overline{\psi}\psi + \overline{\eta}\,\psi + \overline{\psi}\,\eta, \tag{22.17}$$

$$\frac{\langle 0_+|\psi(x)|0_-\rangle}{\langle 0_+|0_-\rangle} = \int (dx')\,S_+(x-x')\,\eta(x'), \tag{22.18}$$

$$S_+(x-x') = \int \frac{(dp)}{(2\pi)^4}\; e^{\,ip(x-x')}\;\frac{(-\gamma p + m_0)}{p^2+m_0^2-i\epsilon}. \tag{22.19}$$
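Before turning to the Grassmann sources, the bosonic relations (22.15) and (22.16) can be checked in a finite-dimensional setting. The sketch below is only an illustration under stated assumptions: the source $K(x)$ is replaced by a three-component list `K`, the propagator by a symmetric matrix `Delta` with arbitrarily chosen entries, and the functional derivative by an ordinary partial derivative estimated with central differences. The names `Z` and `minus_i_dZ_dK` are hypothetical.

```python
import cmath

def Z(K, Delta):
    """Finite-dimensional analogue of (22.16): exp[(i/2) sum_ij K_i Delta_ij K_j]."""
    n = len(K)
    quad = sum(K[i] * Delta[i][j] * K[j] for i in range(n) for j in range(n))
    return cmath.exp(0.5j * quad)

def minus_i_dZ_dK(K, Delta, j, h=1e-6):
    """(-i) dZ/dK_j, estimated by a central difference in the j-th source component."""
    up = list(K)
    up[j] += h
    dn = list(K)
    dn[j] -= h
    return -1j * (Z(up, Delta) - Z(dn, Delta)) / (2 * h)

# Assumed sample data: a symmetric "propagator" matrix and a real "source" vector.
Delta = [[2.0, 0.5, 0.0], [0.5, 1.0, -0.3], [0.0, -0.3, 1.5]]
K = [0.4, -0.2, 0.7]

# Analogue of (22.15): (-i) dZ/dK_j should equal (Delta K)_j * Z.
residuals = []
for j in range(3):
    expected = sum(Delta[j][i] * K[i] for i in range(3)) * Z(K, Delta)
    residuals.append(abs(minus_i_dZ_dK(K, Delta, j) - expected))
print(max(residuals))
```

The factor $1/2$ in the exponent is what makes the derivative come out as $(\Delta K)_j$ rather than twice that, since each $K_j$ appears twice in the symmetric quadratic form.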

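Then with functional derivatives of the Grassmann sources as defined for spinor fields in (21.6)–(21.11),

$$(-i)\frac{\delta}{\delta\overline{\eta}(x)}\langle 0_+|0_-\rangle = \langle 0_+|\psi(x)|0_-\rangle = \int (dx')\,S_+(x-x')\,\eta(x')\,\langle 0_+|0_-\rangle. \tag{22.20}$$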

Hence the solution of the vacuum-to-vacuum transition amplitude is given by

$$\langle 0_+|0_-\rangle = \exp\Big[i\int (dx)(dx')\,\overline{\eta}(x)\,S_+(x-x')\,\eta(x')\Big], \tag{22.21}$$

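properly normalized to unity for $\eta = 0$, $\overline{\eta} = 0$.

As another example, consider the Maxwell field in the presence of an external current $J^\mu$ in arbitrary covariant gauges, specified by a gauge parameter $\lambda$ as given in Eqs. (21.17)–(21.21) in Chap. 21

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + J^\mu A_\mu - \chi\,\partial_\mu A^\mu + \frac{\lambda}{2}\chi^2, \qquad F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu, \tag{22.22}$$

$$\frac{\langle 0_+|A^\mu(x)|0_-\rangle}{\langle 0_+|0_-\rangle} = \int (dx')\,D_+(x-x')\Big[J^\mu(x') - (1-\lambda)\frac{\partial'^\mu\partial'^\nu}{\Box'}J_\nu(x')\Big] = \int (dx')\,D_+^{\mu\nu}(x-x')\,J_\nu(x'). \tag{22.23}$$

Or

$$(-i)\frac{\delta}{\delta J_\mu(x)}\langle 0_+|0_-\rangle = \int (dx')\,D_+^{\mu\nu}(x-x')\,J_\nu(x')\,\langle 0_+|0_-\rangle, \tag{22.24}$$

$$\langle 0_+|0_-\rangle = \exp\Big[\frac{i}{2}\int (dx)(dx')\,J_\mu(x)\,D_+^{\mu\nu}(x-x')\,J_\nu(x')\Big], \tag{22.25}$$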

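normalized to unity for $J^\mu = 0$. The factor $1/2$ in (22.25) occurs because of the reality condition of $J^\mu$ and the symmetry of $D_+^{\mu\nu}$ in its indices.

As another example, consider the Lagrangian density

$$\mathcal{L}(x) = -\frac{1}{2}\partial_\mu\varphi\,\partial^\mu\varphi - \frac{m^2}{2}\varphi^2 + K\varphi + \frac{\lambda}{4}\varphi^4, \tag{22.26}$$

for which (22.10) gives

$$\frac{\partial}{\partial\lambda}\langle 0_+|0_-\rangle = i\,\langle 0_+|\int (dx')\,\frac{1}{4}\varphi^4(x')\,|0_-\rangle = \frac{i}{4}\int (dx')\Big[(-i)\frac{\delta}{\delta K(x')}\Big]^4\langle 0_+|0_-\rangle. \tag{22.27}$$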
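The replacement $\varphi \to (-i)\delta/\delta K$ behind (22.27) can be seen at work in a zero-dimensional toy version. The following is a minimal sketch, assuming a single real variable `phi` in place of the field, a single number `K` in place of the source, an arbitrary coefficient `a` standing in for the quadratic part, and ordinary derivatives (estimated by finite differences) in place of functional ones. The function names are hypothetical.

```python
import cmath

def weight(phi, K, lam, a=1.0):
    """Toy stand-in for exp(i * action): quadratic term, source term, quartic coupling."""
    return cmath.exp(1j * (-0.5 * a * phi**2 + K * phi + 0.25 * lam * phi**4))

def d_dlam(phi, K, lam, h=1e-5):
    """Central-difference derivative of the weight with respect to the coupling lam."""
    return (weight(phi, K, lam + h) - weight(phi, K, lam - h)) / (2 * h)

def fourth_source_derivative(phi, K, lam, h=1e-2):
    """[(-i) d/dK]^4 applied to the weight, using a five-point central stencil."""
    w = [weight(phi, K + n * h, lam) for n in (-2, -1, 0, 1, 2)]
    d4 = (w[0] - 4 * w[1] + 6 * w[2] - 4 * w[3] + w[4]) / h**4
    return (-1j) ** 4 * d4

phi, K, lam = 0.7, 0.3, 0.2                             # assumed sample values
lhs = d_dlam(phi, K, lam)                               # d/d(lam) of the weight
rhs = 0.25j * fourth_source_derivative(phi, K, lam)     # (i/4) [(-i) d/dK]^4 of the weight
print(abs(lhs - rhs))
```

Both sides reduce to $(i/4)\,\varphi^4$ times the weight, which is the pointwise content of (22.27); the field-theoretic statement adds the integration over $x'$ and the vacuum matrix element.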

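Let $\widehat{\mathcal{L}}_I(x) = (\lambda/4)\big[(-i)\delta/\delta K(x)\big]^4$, which is the operator obtained from the interaction Lagrangian density $\mathcal{L}_I(x) = (\lambda/4)\varphi^4(x)$ by making the replacement $\varphi(x)\to(-i)\delta/\delta K(x)$. We may readily integrate (22.27) over $\lambda$ to obtain

$$\langle 0_+|0_-\rangle_\lambda = \exp\Big[i\int (dx)\,\widehat{\mathcal{L}}_I(x)\Big]\langle 0_+|0_-\rangle_{\lambda=0}, \tag{22.28}$$

$$\langle 0_+|0_-\rangle_\lambda = \exp\Big[i\int (dy)\,\widehat{\mathcal{L}}_I(y)\Big]\exp\Big[\frac{i}{2}\int (dx)(dx')\,K(x)\,\Delta_+(x-x')\,K(x')\Big], \tag{22.29}$$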

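giving the explicit full solution of the underlying system, which involves merely functional differentiations.

Before considering the quantization problem of gauge theories, we consider an example where the relation in (22.12) is used. To this end, consider the formal Lagrangian density involving a vector and a scalar particle

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - \frac{1}{2}\partial_\mu\varphi\,\partial^\mu\varphi - \frac{m^2}{2}V_\mu V^\mu - \frac{\lambda}{2}\varphi^2 V_\mu V^\mu + V_\mu K^\mu + \varphi K, \tag{22.30}$$

$$\pi^k = \frac{\partial\mathcal{L}}{\partial(\partial_0 V_k)} = \partial^k V^0 - \partial^0 V^k, \qquad \pi^0 = \frac{\partial\mathcal{L}}{\partial(\partial_0 V_0)} = 0, \qquad \pi[\varphi] = \frac{\partial\mathcal{L}}{\partial(\partial_0\varphi)} = \partial_0\varphi, \tag{22.31}$$

$$-\partial_\mu\big(\partial^\mu V^\nu - \partial^\nu V^\mu\big) + \big(m^2 + \lambda\varphi^2\big)V^\nu = K^\nu, \qquad -\Box\varphi + \lambda V_\mu V^\mu\varphi = K. \tag{22.32}$$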

That is, the only dependent field is $V^0$. Upon taking the derivative $\partial_k$ of $\pi^k$ in (22.31), and using it in the $\nu = 0$ component of the first equation in (22.32), we obtain the key equation

$$\big(m^2 + \lambda\varphi^2\big)V^0 = K^0 + \partial_k\pi^k. \tag{22.33}$$

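Now if we vary w.r.t. $K^0$, we have to keep $\varphi$, $\partial_k\pi^k$ fixed. To this end, let

$$\widehat{\varphi}(x) = (-i)\frac{\delta}{\delta K(x)}, \qquad \widehat{V}^\mu(x) = (-i)\frac{\delta}{\delta K_\mu(x)}, \qquad \widehat{\mathcal{L}}_I(x) = -\frac{\lambda}{2}\,\widehat{\varphi}^{\,2}(x)\,\widehat{V}_\mu(x)\,\widehat{V}^\mu(x). \tag{22.34}$$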

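In Box 22.2, a derivation of Eq. (22.36) below is given. It reads

$$(-2)(-i)\frac{\partial}{\partial\lambda}\langle 0_+|0_-\rangle = \langle 0_+|\int (dx)\,\varphi^2(x)\,V_\mu(x)V^\mu(x)\,|0_-\rangle \tag{22.35}$$

$$= \Big[\int (dx)\,\widehat{\varphi}^{\,2}(x)\,\widehat{V}_\mu(x)\widehat{V}^\mu(x) - i\int (dx)(dx')\,\delta^{(4)}(x'-x)\,\frac{\widehat{\varphi}^{\,2}(x)}{m^2+\lambda\,\widehat{\varphi}^{\,2}(x)}\,\delta^{(4)}(x-x')\Big]\langle 0_+|0_-\rangle. \tag{22.36}$$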


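Box 22.2 Functional Derivatives Involving a Dependent Field

$(m^2 + \lambda\varphi^2)V^0 = K^0 + \partial_k\pi^k$. If we vary w.r.t. $K^0$, we have to keep $\varphi$, $\partial_k\pi^k$ fixed, i.e.,

$$(-i)\frac{\delta}{\delta K^0(x)}\,(-i)\frac{\delta}{\delta K^0(x')}\Big[(-i)\frac{\delta}{\delta K(x')}\Big]^2\langle 0_+|0_-\rangle$$

$$= (-i)\frac{\delta}{\delta K^0(x)}\,(-i)\frac{\delta}{\delta K^0(x')}\,\langle 0_+|\varphi^2(x')|0_-\rangle = (-i)\frac{\delta}{\delta K^0(x)}\,\langle 0_+|\big(\varphi^2(x')V_0(x')\big)_+|0_-\rangle$$

$$= \langle 0_+|\big(\varphi^2(x')V_0(x')V_0(x)\big)_+|0_-\rangle - i\,\langle 0_+|\Big(\varphi^2(x')\frac{\delta}{\delta K^0(x)}V_0(x')\Big)_+|0_-\rangle$$

$$= \langle 0_+|\big(\varphi^2(x')V_0(x')V_0(x)\big)_+|0_-\rangle + i\,\delta^{(4)}(x'-x)\,\frac{\big[(-i)\delta/\delta K(x')\big]^2}{m^2+\lambda\big[(-i)\delta/\delta K(x')\big]^2}\,\langle 0_+|0_-\rangle.$$

On the other hand

$$(-i)\frac{\delta}{\delta K^k(x)}\,(-i)\frac{\delta}{\delta K_k(x')}\Big[(-i)\frac{\delta}{\delta K(x')}\Big]^2\langle 0_+|0_-\rangle = \langle 0_+|\big(\varphi^2(x')V^k(x')V_k(x)\big)_+|0_-\rangle.$$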

Upon integration over $\lambda$, the above equations give

$$\langle 0_+|0_-\rangle_\lambda = \exp\Big[i\int (dx)\,\widehat{\mathcal{L}}_I(x)\Big]\,F[\widehat{\varphi},\lambda]\,\langle 0_+|0_-\rangle_{\lambda=0}, \tag{22.37}$$

$$F[\widehat{\varphi},\lambda] = \exp\Big[-\frac{1}{2}\int (dx)(dx')\,\delta^{(4)}(x'-x)\,\ln\Big(1+\frac{\lambda}{m^2}\widehat{\varphi}^{\,2}(x)\Big)\,\delta^{(4)}(x-x')\Big], \tag{22.38}$$

where in writing the expression for $F[\widehat{\varphi},\lambda]$, we have used the elementary integral $\int_0^\lambda a\,d\lambda'/[b+a\lambda'] = \ln[1+\lambda a/b]$, and

$$\langle 0_+|0_-\rangle_{\lambda=0} = \exp\Big[\frac{i}{2}\int (dx)(dx')\,K_\mu(x)\,\Delta_+^{\mu\nu}(x-x')\,K_\nu(x')\Big]\times\exp\Big[\frac{i}{2}\int (dx)(dx')\,K(x)\,D_+(x-x')\,K(x')\Big], \tag{22.39}$$

where $D_+(x-x')$ and $\Delta_+^{\mu\nu}(x-x')$ are, respectively, given in Eqs. (16.31) and (16.36) in Chap. 16.

From the basic equations such as (22.27) and (22.35), which follow from direct applications of the quantum dynamical principle, expressing a Lagrangian density in terms of functional derivatives of external sources, we learn that a Lagrangian density is to be defined as the coincident limit of a time-ordered product of its fields. This means, for example, that for the product of two fermionic fields $AB$, the expression of its time-ordered product, $\big(A(x'^0)B(x^0)\big)_+ = A(x'^0)B(x^0)$ for $x'^0 > x^0$, and $= -B(x^0)A(x'^0)$ for $x^0 > x'^0$, leads to the following definition of its coincident limit: $[A(x^0)B(x^0) - B(x^0)A(x^0)]/2$, while for bosonic fields $CD$, it is simply defined by $[C(x^0)D(x^0) + D(x^0)C(x^0)]/2$. This means that although one is dealing with field operators, one is free to interchange the order of the product of two fermion fields provided one multiplies the resulting expression by a minus sign, while for a product of bosonic fields one may arbitrarily change their order in a Lagrangian density. This is thanks to the coincident time limit of the time-ordered product nature of a Lagrangian density. This point is rarely emphasized in the literature, and the student blindly changes the order of quantum fields in Lagrangian densities without knowing why this is permissible.

The application of the quantum dynamical principle to solve for the vacuum-to-vacuum transition amplitude in terms of functional derivatives with respect to the external sources coupled to their corresponding fields then follows in a straightforward manner, as we have seen above.

References
1. Johnson, K. (1968). 9th Latin American School of Physics, Santiago de Chile. In K. Johnson & I. Saavedra (Eds.), Solid state physics, and particle physics. New York: W. A. Benjamin.
2. Lam, C. S. (1965). Feynman rules and Feynman integrals for systems with higher-spin fields. Nuovo Cimento, 38, 1755–1765.
3. Manoukian, E. B. (1986). Action principle and quantization of gauge fields. Physical Review D, 34, 3739–3749.
4. Manoukian, E. B. (2016). Quantum field theory I: Foundations and abelian and non-abelian gauge theories. Switzerland: Springer.
5. Schwinger, J. (1951). On the Green's functions of quantized fields. I. Proceedings of the National Academy of Sciences, USA, 37, 452–455.
6. Schwinger, J. (1953). The theory of quantized fields. II, III. Physical Review, 91, 713–728, 728–740.
7. Schwinger, J. (1960). Unitary transformations and the action principle. Proceedings of the National Academy of Sciences, USA, 46, 883–897.
8. Schwinger, J. (1962). Exterior algebra and the action principle I. Proceedings of the National Academy of Sciences, USA, 48, 603–611.
9. Schwinger, J. (1973). A report on quantum electrodynamics. In J. Mehra (Ed.), The physicist's conception of nature. Dordrecht: D. Reidel Publishing Company.

Quantization of Gauge Theories and Constraints: Functional Differential Formalism

23

Prerequisites Chaps. 16, 21 and 22

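The classic abelian gauge theory is QED, described by the Lagrangian density

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}\Big[\Big(\frac{\partial_\mu\overline{\psi}}{i}\Big)\gamma^\mu\psi - \overline{\psi}\,\gamma^\mu\frac{\partial_\mu\psi}{i}\Big] - m_0\,\overline{\psi}\psi + \frac{1}{2}e_0\big[\,\overline{\psi},\gamma^\mu\psi\,\big]A_\mu - \chi\,\partial_\mu A^\mu + \frac{\lambda}{2}\chi^2 + A_\mu J^\mu + \overline{\eta}\,\psi + \overline{\psi}\,\eta, \qquad F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu, \tag{23.1}$$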

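where $J^\mu$, $\eta$, $\overline{\eta}$ are external sources. The current $j^\mu = (1/2)\,e_0\big[\,\overline{\psi},\gamma^\mu\psi\,\big]$ is defined as a commutator to satisfy the charge conjugation¹ property of the current²

$$\mathcal{C}\,j^\mu(x)\,\mathcal{C}^{-1} = -\,j^\mu(x), \qquad j^\mu = \frac{1}{2}e_0\big[\,\overline{\psi},\gamma^\mu\psi\,\big]. \tag{23.2}$$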

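The term $[-\chi\,\partial_\mu A^\mu + (\lambda/2)\chi^2]$ is a gauge fixing term as we have already seen in Eqs. (21.17)–(21.21) in Chap. 21. The field equations are readily derived to be

$$-\partial_\mu F^{\mu\nu} = J^\nu + j^\nu + \partial^\nu\chi, \qquad \Big[\gamma^\mu\Big(\frac{\partial_\mu}{i} - e_0A_\mu\Big) + m_0\Big]\psi = \eta, \qquad \partial_\mu A^\mu = \lambda\chi. \tag{23.3}$$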

Using the anti-commutativity properties of the fermion fields, as well as of the functional derivatives w.r.t. the Grassmann sources, we have

$$\frac{\delta}{\delta\overline{\eta}(x)}\,\overline{\eta}(x')\psi(x') = +\,\psi(x')\,\delta^{(4)}(x'-x), \qquad \frac{\delta}{\delta\eta(x)}\,\overline{\psi}(x')\eta(x') = -\,\overline{\psi}(x')\,\delta^{(4)}(x'-x). \tag{23.4}$$

From these relations, and the fact that $\psi$, $\overline{\psi}$ are independent fields, as their time derivatives in the Lagrangian density are not zero, we obtain from (22.10), with the Lagrangian density given in (23.1),

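$$\frac{\partial}{\partial e_0}\langle 0_+|0_-\rangle = i\int (dx)\,\langle 0_+|\big(A_\mu(x)\,\overline{\psi}(x)\gamma^\mu\psi(x)\big)_+|0_-\rangle \tag{23.5}$$

$$= i\int (dx)\,(-i)\frac{\delta}{\delta J^\mu(x)}\,\langle 0_+|\big(\overline{\psi}(x)\gamma^\mu\psi(x)\big)_+|0_-\rangle \tag{23.6}$$

$$= i\int (dx)\,(-i)\frac{\delta}{\delta J^\mu(x)}\,(+i)\frac{\delta}{\delta\eta(x)}\,\gamma^\mu\,(-i)\frac{\delta}{\delta\overline{\eta}(x)}\,\langle 0_+|0_-\rangle, \tag{23.7}$$

$$\langle 0_+|0_-\rangle = \exp\Big[e_0\int (dx)\,\frac{\delta}{\delta J^\mu(x)}\,\frac{\delta}{\delta\eta(x)}\,\gamma^\mu\,\frac{\delta}{\delta\overline{\eta}(x)}\Big]\,\langle 0_+|0_-\rangle\Big|_{e_0=0}, \tag{23.8}$$

where

$$\langle 0_+|0_-\rangle\Big|_{e_0=0} = \exp\Big[i\int (dx)(dx')\,\overline{\eta}(x)\,S_+(x-x')\,\eta(x')\Big]\times\exp\Big[\frac{i}{2}\int (dx)(dx')\,J_\mu(x)\,D_+^{\mu\nu}(x-x')\,J_\nu(x')\Big], \tag{23.9}$$

$$S_+(x-x') = \int \frac{(dp)}{(2\pi)^4}\; e^{\,ip(x-x')}\;\frac{(-\gamma p + m_0)}{p^2+m_0^2-i\epsilon}, \qquad \epsilon\to+0, \tag{23.10}$$

$$D_+^{\mu\nu}(x-x') = \int \frac{(dk)}{(2\pi)^4}\;\frac{e^{\,ik(x-x')}}{k^2-i\epsilon}\Big[\eta^{\mu\nu} - (1-\lambda)\frac{k^\mu k^\nu}{k^2}\Big], \qquad \epsilon\to+0. \tag{23.11}$$

¹ See Box 16.2, in Chap. 16.
² This is a transformation of operators and does not act on the c-number $e_0$ as one may otherwise naïvely carry out.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_23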

That is, the full vacuum-to-vacuum transition amplitude $\langle 0_+|0_-\rangle$ of QED may be simply obtained by functional differentiations of the exact functional $\langle 0_+|0_-\rangle|_{e_0=0}$ given above in arbitrary covariant gauges specified by the gauge parameter $\lambda$. This equation may be used in various applications as we will see later.

The non-abelian case is harder to tackle. Consider the Lagrangian density of a non-abelian gauge field³ $A_a^\mu$

$$\mathcal{L} = -\frac{1}{4}G_{a\mu\nu}G_a^{\mu\nu} + A_{a\mu}J_a^\mu, \qquad \nabla_{ab\mu}\nabla_{bc\nu}\,G_c^{\mu\nu} = 0, \tag{23.12}$$

$$G_a^{\mu\nu} = \partial^\mu A_a^\nu - \partial^\nu A_a^\mu + g_0 f_{abc}A_b^\mu A_c^\nu, \qquad \nabla_{ab}^\mu = \delta_{ab}\,\partial^\mu + g_0 f_{acb}A_c^\mu \tag{23.13}$$

(see Chap. 17). In the above equation we have not specified any particular gauge as we have done in the abelian case earlier. We work in the celebrated Coulomb gauge:

$$\partial_j A_a^j = 0, \qquad j = 1, 2, 3, \tag{23.14}$$

from which we may, for example, solve for $A_a^3$, given by

$$A_a^3 = -\,\partial_3^{-1}\partial_\kappa A_a^\kappa, \qquad \kappa = 1, 2, \tag{23.15}$$

becoming a dependent variable. The variation of $A_a^j$ may then be conveniently written as a variation of the independent components $A_a^1$, $A_a^2$:

$$\delta A_a^j = \big(\delta^{j\kappa} - \delta^{j3}\,\partial_3^{-1}\partial_\kappa\big)\,\delta A_a^\kappa, \qquad \kappa = 1, 2. \tag{23.16}$$
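The elimination of $A_a^3$ in (23.15) is easy to see in momentum space. The sketch below is an illustration only, assuming a single plane-wave mode so that $\partial_j \to i k_j$ and $\partial_3^{-1} \to 1/(i k_3)$ with $k_3 \neq 0$, and dropping the group index $a$. The function names and sample values are hypothetical.

```python
def solve_A3(k, A1, A2):
    """Momentum-space form of (23.15): A3 = -(k1*A1 + k2*A2) / k3, valid for k3 != 0."""
    k1, k2, k3 = k
    return -(k1 * A1 + k2 * A2) / k3

def divergence(k, A):
    """k_j A^j, the Fourier image of the spatial divergence up to a factor of i."""
    return sum(kj * Aj for kj, Aj in zip(k, A))

k = (0.3, -1.2, 0.8)                     # assumed sample momentum with k3 != 0
A1, A2 = 0.5 + 0.1j, -0.7 + 0.4j         # independent components for one Fourier mode
A = (A1, A2, solve_A3(k, A1, A2))
print(abs(divergence(k, A)))             # Coulomb gauge condition (23.14)

# (23.16): a variation of A3 is fixed by the variations of A1 and A2 alone.
d1, d2 = 0.01, -0.02j
dA3 = solve_A3(k, A1 + d1, A2 + d2) - A[2]
print(abs(dA3 - solve_A3(k, d1, d2)))
```

Because the map from $(A^1, A^2)$ to $A^3$ is linear, the variation in (23.16) uses the same operator as the solution in (23.15), which is what the second check illustrates.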

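At the outset, we note the following key equation needed to solve for $\langle 0_+|0_-\rangle$:

$$\mathcal{L}_I = -\frac{1}{2}g_0 f_{abc}A_{b\mu}A_{c\nu}F_a^{\mu\nu} - \frac{1}{4}g_0^2 f_{abc}f_{ab'c'}A_{b\mu}A_{c\nu}A_{b'}^\mu A_{c'}^\nu, \tag{23.17}$$

$$\frac{\partial}{\partial g_0}\langle 0_+|0_-\rangle = -\frac{i}{2}\int (dx)\,f_{abc}\,\langle 0_+|\big(G_a^{\mu\nu}A_{b\mu}A_{c\nu}\big)_+|0_-\rangle \tag{23.18}$$

$$= -\,i\int (dx)\,f_{abc}\,\langle 0_+|\big(G_a^{j0}A_{bj}A_{c0}\big)_+|0_-\rangle - \frac{i}{2}\int (dx)\,f_{abc}\,\langle 0_+|\big(G_a^{ij}A_{bi}A_{cj}\big)_+|0_-\rangle, \tag{23.19}$$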

and we want to express the right-hand side of the latter equation in terms of functional differentiations acting on $\langle 0_+|0_-\rangle$, as we have done in the abelian case. To the above end, first note that since $G_a^{00} = 0$, the Lagrangian density does not involve $\partial_0 A_a^0$. Moreover $A_a^3$ is a dependent field. That is, the following canonical conjugate momenta vanish: $\pi_a^0 = 0$, $\pi_a^3 = 0$. On the other hand, in Box 23.1 it is shown that

$$\pi_a^\kappa = G_a^{\kappa 0} - \partial_3^{-1}\partial^\kappa G_a^{30},\ \ \kappa = 1, 2, \quad\Rightarrow\quad \pi_a^j = G_a^{j0} - \partial_3^{-1}\partial^j G_a^{30},\ \ j = 1, 2, 3, \tag{23.20}$$

$$G_a^{j0}(x) = \pi_a^j(x) - \partial^j\int (dx')\,D_{ab}(x,x')\Big[\nabla_{bci}\,\pi_c^i(x') + J_b^0(x')\Big], \tag{23.21}$$

$$\Big[\delta_{da}\,\partial_j + g_0 f_{dca}A_{cj}\Big]\partial^j D_{ab}(x,x') = \delta^{(4)}(x,x')\,\delta_{db}, \tag{23.22}$$

where $\pi_a^3$ trivially vanishes. Since $A_a^0$ is a dependent field, then so is $G_a^{j0}$. Accordingly, by keeping $\pi_a^j$ fixed, as well as the independent fields $A_a^1$, $A_a^2$ and hence also $A_a^3$, which is expressed in terms of the latter two independent fields, the following basic equation emerges

$$\frac{\delta}{\delta J_c^0(x')}\,G_a^{j0}(x) = -\,\partial^j D_{ac}(x,x'), \qquad \frac{\delta}{\delta J_c^i(x')}\,G_a^{j0}(x) = 0. \tag{23.23}$$

³ For a proof of the second relation in (23.12) see Manoukian [2], p. 574, solution of Problem 6.3.

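Let $\widehat{A}_a^\mu = (-i)\,\delta/\delta J_{a\mu}$. In reference to the first part of Eq. (23.19),

$$\widehat{A}_{c0}(x')\,\widehat{G}_a^{j0}(x)\,\widehat{A}_{bj}(x')\,\langle 0_+|0_-\rangle \tag{23.24}$$

$$= \widehat{A}_{c0}(x')\,\langle 0_+|\big(G_a^{j0}(x)A_{bj}(x')\big)_+|0_-\rangle \tag{23.25}$$

$$= \langle 0_+|\big(A_{c0}(x')\,G_a^{j0}(x)\,A_{bj}(x')\big)_+|0_-\rangle + i\,\partial^j\widehat{D}_{ac}(x,x')\,\widehat{A}_{bj}(x')\,\langle 0_+|0_-\rangle, \tag{23.26}$$

where in the second line we have used the facts that $A_a^1$, $A_a^2$ are independent fields and $A_b^3$ is expressed in terms of the two independent fields. In the third line, we have simply used the first equation in (23.23). Accordingly,

$$\langle 0_+|\big(A_{c0}(x)\,G_a^{j0}(x')\,A_{bj}(x')\big)_+|0_-\rangle = \widehat{A}_{c0}(x)\,\widehat{G}_a^{j0}(x')\,\widehat{A}_{bj}(x')\,\langle 0_+|0_-\rangle - i\,\widehat{A}_{bj}(x')\,\partial^j\widehat{D}_{ac}(x,x')\,\langle 0_+|0_-\rangle. \tag{23.27}$$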

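The second term on the extreme right-hand side of (23.19) is expressed within the brackets in terms of independent field components. Accordingly, by multiplying the above equation, in particular, by $-i$, and taking into account the second simple term in (23.19), we obtain

$$\frac{\partial}{\partial g_0}\langle 0_+|0_-\rangle = \Big[\int (dx)\; i\,\frac{\partial}{\partial g_0}\widehat{\mathcal{L}}_I(x) + \int (dx)(dx')\;\delta^{(4)}(x-x')\,f_{abc}\,\widehat{A}_{bj}(x)\,\partial^j\widehat{D}_{ca}(x,x')\Big]\langle 0_+|0_-\rangle, \tag{23.28}$$

Box 23.1 Some of the intricacies in quantization of gauge fields in the Coulomb gauge

$\partial_j A_a^j = 0$, $j = 1, 2, 3$; $\ A_a^3 = -\partial_3^{-1}\partial_\kappa A_a^\kappa$, $\kappa = 1, 2$; $\ \delta A_a^j = \big(\delta^{j\kappa} - \delta^{j3}\partial_3^{-1}\partial_\kappa\big)\delta A_a^\kappa$; $\ G_a^{00} = 0$.

Field equations:

$$\nabla_{abj}\,G_b^{j0} \equiv \nabla_{ab\mu}\,G_b^{\mu 0} = -J_a^0, \qquad \nabla_{ab\mu}\,G_b^{\mu j} = \partial_3^{-1}\partial^j\Big[\nabla_{ac\mu}\,G_c^{\mu 3} + J_a^3\Big] - J_a^j.$$

These may be combined to read

$$\nabla_{ab\mu}\,G_b^{\mu\nu} = \delta^{\nu j}\,\partial_j\,\partial_3^{-1}\Big[\nabla_{ac\mu}\,G_c^{\mu 3} + J_a^3\Big] - J_a^\nu.$$

Apply $\nabla_{da\nu}$ to the latter:

$$\nabla_{daj}\,\partial^j\,\partial_3^{-1}\Big[\nabla_{ac\mu}\,G_c^{\mu 3} + J_a^3\Big] = \nabla_{da\nu}\,J_a^\nu,$$

using the relation $\nabla_{ab\mu}\nabla_{bc\nu}\,G_c^{\mu\nu} = 0$. Introduce the Green function $D_{ab}$ satisfying the equation $\nabla_{daj}\,\partial^j D_{ab}(x,x') = \delta^{(4)}(x,x')\,\delta_{db}$, to obtain

$$\nabla_{ab\mu}\,G_b^{\mu\nu} = -\Big[\delta_{ac}\,\delta^\nu{}_\sigma - \delta^{\nu j}\,\partial_j\,D_{ab}\,\nabla_{bc\sigma}\Big]J_c^\sigma$$

in matrix notation. On the other hand

$$\pi_a^\kappa = G_a^{\kappa 0} - \partial_3^{-1}\partial^\kappa G_a^{30}, \quad \pi_a^3 = 0, \quad \pi_a^0 = 0, \quad\Rightarrow\quad \pi_a^\mu = G_a^{\mu 0} - \delta^{\mu j}\,\partial_3^{-1}\partial_j\,G_a^{30}.$$

$$\nabla_{ba\mu}\,\pi_a^\mu = \nabla_{ba\mu}\,G_a^{\mu 0} - \nabla_{baj}\,\partial_3^{-1}\partial^j G_a^{30} = -J_b^0 - \nabla_{baj}\,\partial_3^{-1}\partial^j G_a^{30}, \quad\Rightarrow\quad \nabla_{baj}\,\partial_3^{-1}\partial^j G_a^{30} = -\Big[\nabla_{ba\mu}\,\pi_a^\mu + J_b^0\Big].$$

Therefore $G_a^{\mu 0} = \pi_a^\mu - \delta^{\mu j}\,\partial_j\,D_{ab}\Big[\nabla_{bc\nu}\,\pi_c^\nu + J_b^0\Big]$.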

where we have used the anti-symmetry property of $f_{abc}$. The second expression within the square brackets denotes the trace of an infinite dimensional matrix in spacetime and in internal degrees of freedom

$$\int (dx)(dx')\;\delta^{(4)}(x-x')\,f_{abc}\,\widehat{A}_{bj}(x)\,\partial^j\widehat{D}_{ca}(x,x') = \mathrm{Tr}\Big[\big(f\widehat{A}^{\,j}\big)_{ac}\,\partial_j\,\Big[\frac{1}{\nabla^2 + g_0 f\widehat{A}^{\,i}\partial_i}\Big]_{ca}\Big]. \tag{23.29}$$

An elementary integration of the latter over $g_0$ from 0 to $g_0$ gives $\mathrm{Tr}\ln\big[\delta_{ac} + g_0(1/\nabla^2)f_{abc}\widehat{A}_b^{\,j}\partial_j\big]$. Accordingly, (23.28) may be integrated over $g_0$ to give

$$\langle 0_+|0_-\rangle_C = \exp\Big(\mathrm{Tr}\ln\big[\widehat{\nabla}_{ac}^{\,j}\,\partial_j\big]\Big)\,\exp\Big[i\int (dx)\,\widehat{\mathcal{L}}_I(x)\Big]\,\langle 0_+|0_-\rangle_C\Big|_{g_0=0}, \tag{23.30}$$

up to an unimportant multiplicative numerical factor independent of $J_c^\mu$, where C stands for Coulomb. The amplitude $\langle 0_+|0_-\rangle_C|_{g_0=0}$ is worked out in Box 23.2, and is given by

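$$\langle 0_+|0_-\rangle_C\Big|_{g_0=0} = \exp\Big[\frac{i}{2}\int (dx)(dx')\,J_a^\mu(x)\,D^C_{ab\,\mu\nu}(x-x')\,J_b^\nu(x')\Big], \tag{23.31}$$

$$D^C_{ab\,\mu\nu} = \delta_{ab}\,D^C_{\mu\nu}, \qquad D^C_{00}(k) = -\frac{1}{\mathbf{k}^2}, \qquad D^C_{ij}(k) = \Big(\delta_{ij} - \frac{k_ik_j}{\mathbf{k}^2}\Big)\frac{1}{k^2-i\epsilon}, \qquad D^C_{0j} = 0 = D^C_{j0}. \tag{23.32}$$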
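The components listed in (23.32) can be assembled and checked for transversality, $k^i D^C_{ij}(k) = 0$, which is the momentum-space statement of the Coulomb gauge condition. The following is a minimal sketch, assuming a real off-shell momentum so that the $-i\epsilon$ can be dropped, the metric signature $(-,+,+,+)$ implied by the propagators of this chapter, and the group factor $\delta_{ab}$ left out. The function name is hypothetical.

```python
def coulomb_propagator(k0, kvec):
    """4x4 array D^C_{mu nu}(k) of (23.32) for real off-shell k, epsilon dropped."""
    ksq3 = sum(c * c for c in kvec)      # three-vector squared
    ksq4 = ksq3 - k0 * k0                # k^2 with the assumed metric (-,+,+,+)
    D = [[0.0] * 4 for _ in range(4)]
    D[0][0] = -1.0 / ksq3
    for i in range(3):
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            D[i + 1][j + 1] = (delta - kvec[i] * kvec[j] / ksq3) / ksq4
    return D

k0, kvec = 0.4, (0.3, -1.2, 0.8)         # assumed sample momentum
D = coulomb_propagator(k0, kvec)

# Contract the spatial block with k^i: every component should vanish.
transversality = [sum(kvec[i] * D[i + 1][j + 1] for i in range(3)) for j in range(3)]
print(max(abs(t) for t in transversality))
```

Only the spatial block is transverse; the instantaneous $D^C_{00}$ component carries the static Coulomb interaction and does not depend on $k^0$.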

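In Boxes 23.3 and 23.4, it is shown that $\mathrm{Tr}\ln\big[\widehat{\nabla}_{ac}^{\,\mu}\,\partial_\mu\big] = \mathrm{Tr}\ln\big[\partial_\mu\widehat{\nabla}_{ac}^{\,\mu}\big]$ independently of which gauge one chooses. The corresponding expression for $\langle 0_+|0_-\rangle$ for $g_0\neq 0$ in arbitrary covariant gauges specified by a parameter $\lambda$, as given in Eqs. (21.17), (21.18), (21.21) in Chap. 21 and (22.22), (22.25) in Chap. 22, may, by covariance, with the substitutions $\mathrm{Tr}\ln\big[\widehat{\nabla}_{ac}^{\,j}\,\partial_j\big] \to \mathrm{Tr}\ln\big[\partial_\mu\widehat{\nabla}_{ac}^{\,\mu}\big]$ and $D^C_{ab\,\mu\nu} \to D_{+\,ab\,\mu\nu}$, be inferred to be given by Eqs. (23.33) and (23.34).

Box 23.2 Propagator in Coulomb gauge

$\mathcal{L} = -(1/4)F_{a\mu\nu}F_a^{\mu\nu}$, $\ F_a^{\mu\nu} = \partial^\mu A_a^\nu - \partial^\nu A_a^\mu$; $\ \partial_j A_a^j = 0$, $j = 1, 2, 3$; $\ A_a^3 = -\partial_3^{-1}\partial_\kappa A_a^\kappa$, $\kappa = 1, 2$.

As in Box 23.1, the field equations emerge as

$$\partial_\mu F_a^{\mu\nu} = \delta^{\nu j}\,\partial_j\,\partial_3^{-1}\Big[\partial_\mu F_a^{\mu 3} + J_a^3\Big] - J_a^\nu.$$

Apply $\partial_\nu$ to the latter:

$$\partial_3^{-1}\Big[\partial_\mu F_a^{\mu 3} + J_a^3\Big] = \frac{1}{\nabla^2}\,\partial_\nu J_a^\nu, \quad\Rightarrow\quad \partial_\mu F_a^{\mu\nu} = -\Big[\eta^{\nu\sigma} - \eta^{\nu j}\frac{\partial_j\partial^\sigma}{\nabla^2}\Big]J_{a\sigma}.$$

In detail, this equation leads to

$$-\nabla^2 A_a^0(x) = J_a^0, \qquad -\Box A_a^i(x) = \Big[\delta^{ij} - \frac{\partial^i\partial^j}{\nabla^2}\Big]J_a^j.$$

These give rise to the gauge field propagator in Eq. (23.32), in the Coulomb gauge.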



  

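Box 23.3 $\ f_{acb}\,\langle x'|A_c^\mu\,\partial_\mu\frac{1}{\Box}|x\rangle = f_{bca}\,\langle x|\frac{1}{\Box}\,\partial_\mu A_c^\mu|x'\rangle$

$$f_{acb}\,\langle x'|A_c^\mu\,\partial_\mu\frac{1}{\Box}|x\rangle = f_{acb}\int (dx'')\,\langle x'|x''\rangle\,A_c^\mu(x'')\,\langle x''|\partial_\mu\frac{1}{\Box}|x\rangle$$

$$= f_{acb}\int (dx'')\,\delta^{(4)}(x'-x'')\,A_c^\mu(x'')\,\partial''_\mu\frac{1}{\Box''}\,\delta^{(4)}(x''-x) = f_{acb}\,A_c^\mu(x')\,\partial'_\mu\frac{1}{\Box'}\,\delta^{(4)}(x'-x)$$

$$= -\,f_{acb}\,A_c^\mu(x')\,\partial_\mu\frac{1}{\Box}\,\delta^{(4)}(x-x') = f_{bca}\,\frac{1}{\Box}\,\partial_\mu\Big[\delta^{(4)}(x-x')\,A_c^\mu(x')\Big]$$

$$= f_{bca}\int (dx'')\,\langle x|\frac{1}{\Box}\partial_\mu|x''\rangle\,\delta^{(4)}(x''-x')\,A_c^\mu(x'') = f_{bca}\,\langle x|\frac{1}{\Box}\,\partial_\mu A_c^\mu|x'\rangle.$$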


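Box 23.4 $\ \mathrm{Tr}\ln\big[\nabla_{ab}^\mu\,\partial_\mu\big] = \mathrm{Tr}\ln\big[\partial_\mu\nabla_{ab}^\mu\big]$

Using, in the process, the relation in Box 23.3, we have

$$\mathrm{Tr}\ln\Big[\delta_{ab} + g_0 f_{acb}A_c^\mu\,\partial_\mu\frac{1}{\Box}\Big] = \sum_{n\ge 1}\frac{(-1)^{n-1}g_0^{\,n}}{n}\int (dx_1)\cdots(dx_n)\; f_{a_1c_1a_2}\cdots f_{a_nc_na_1}\,\langle x_1|A_{c_1}^{\mu_1}\partial_{\mu_1}\frac{1}{\Box}|x_2\rangle\cdots\langle x_n|A_{c_n}^{\mu_n}\partial_{\mu_n}\frac{1}{\Box}|x_1\rangle$$

$$= \sum_{n\ge 1}\frac{(-1)^{n-1}g_0^{\,n}}{n}\int (dx_n)\cdots(dx_1)\; f_{a_1c_na_n}\cdots f_{a_2c_1a_1}\,\langle x_1|\frac{1}{\Box}\partial_{\mu_n}A_{c_n}^{\mu_n}|x_n\rangle\cdots\langle x_2|\frac{1}{\Box}\partial_{\mu_1}A_{c_1}^{\mu_1}|x_1\rangle = \mathrm{Tr}\ln\Big[\delta_{ab} + g_0 f_{acb}\frac{1}{\Box}\,\partial_\mu A_c^\mu\Big].$$

Moreover

$$\mathrm{Tr}\ln\big[\nabla_{ab}^\mu\,\partial_\mu\big] = \mathrm{Tr}\ln\big[\delta_{ab}\Box + g_0 f_{acb}A_c^\mu\,\partial_\mu\big] = \mathrm{Tr}\ln\big[\delta_{ab}\Box\big] + \mathrm{Tr}\ln\Big[\delta_{ab} + g_0 f_{acb}A_c^\mu\,\partial_\mu\frac{1}{\Box}\Big]$$

$$= \mathrm{Tr}\ln\big[\delta_{ab}\Box\big] + \mathrm{Tr}\ln\Big[\delta_{ab} + g_0 f_{acb}\frac{1}{\Box}\,\partial_\mu A_c^\mu\Big] = \mathrm{Tr}\ln\big[\partial_\mu\nabla_{ab}^\mu\big].$$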

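$$\langle 0_+|0_-\rangle = \exp\Big(\mathrm{Tr}\ln\big[\partial_\mu\widehat{\nabla}_{ac}^{\,\mu}\big]\Big)\,\exp\Big[i\int (dx)\,\widehat{\mathcal{L}}_I(x)\Big]\,\exp\Big[\frac{i}{2}\int (dx)(dx')\,J_a^\mu(x)\,D_{+\,\mu\nu\,ab}(x-x')\,J_b^\nu(x')\Big], \tag{23.33}$$

$$D_{+\,ab}^{\mu\nu}(x-x') = \delta_{ab}\int \frac{(dk)}{(2\pi)^4}\;\frac{e^{\,ik(x-x')}}{k^2-i\epsilon}\Big[\eta^{\mu\nu} - (1-\lambda)\frac{k^\mu k^\nu}{k^2}\Big]. \tag{23.34}$$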
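The role of the gauge parameter in (23.34) shows up when the tensor structure is contracted with $k_\mu$: analytically, $k_\mu\big[\eta^{\mu\nu} - (1-\lambda)k^\mu k^\nu/k^2\big]/k^2 = \lambda\,k^\nu/k^2$, so the propagator is transverse only in the Landau gauge $\lambda = 0$. The sketch below checks this numerically under stated assumptions: a real off-shell momentum with the $-i\epsilon$ dropped, the metric $\mathrm{diag}(-1,1,1,1)$, and the group factor $\delta_{ab}$ omitted. The function names are hypothetical.

```python
ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal of the assumed metric (-,+,+,+)

def lower(k):
    """Lower the index of a four-vector k^mu using the diagonal metric."""
    return [ETA[m] * k[m] for m in range(4)]

def propagator_tensor(k, lam):
    """[eta^{mu nu} - (1 - lam) k^mu k^nu / k^2] / k^2 for off-shell k, epsilon dropped."""
    ksq = sum(ETA[m] * k[m] * k[m] for m in range(4))
    return [[((ETA[m] if m == n else 0.0) - (1.0 - lam) * k[m] * k[n] / ksq) / ksq
             for n in range(4)] for m in range(4)]

def contract(k, lam):
    """k_mu D^{mu nu}(k) as a list indexed by nu."""
    D = propagator_tensor(k, lam)
    k_low = lower(k)
    return [sum(k_low[m] * D[m][n] for m in range(4)) for n in range(4)]

k = [0.4, 0.3, -1.2, 0.8]      # assumed sample momentum k^mu
landau = contract(k, 0.0)      # Landau gauge
feynman = contract(k, 1.0)     # Feynman gauge
print(landau)
print(feynman)
```

In the Feynman gauge the contraction returns $k^\nu/k^2$, which is the longitudinal part that drops out of amplitudes once the propagator is attached to conserved currents.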

The inclusion of fermion fields is now straightforward using the fact that their time derivatives in the Lagrangian density are not zero, and hence are independent fields as they were in the abelian case. The quantization of non-abelian gauge theories using the quantum dynamical principle was developed in Manoukian [1].

References
1. Manoukian, E. B. (1986). Action principle and quantization of gauge fields. Physical Review D, 34, 3739–3749.
2. Manoukian, E. B. (2016). Quantum field theory I: Foundations and abelian and non-abelian gauge theories. Switzerland: Springer.

Functional (Path) Integral Formalism as a Functional Fourier Transform of the Functional Differential One

24

Prerequisites Chaps. 21, 22 and 23

The functional integral formalism was introduced by Richard Feynman.¹ It is referred to as the Feynman path integral formalism or as the path integral formalism. It has its roots in earlier investigations by Paul Dirac.² As a generating functional for Green functions, it has the general structure $\int d\mu[\chi]\,e^{\,i\,\mathrm{Action}}$. Here $d\mu[\chi]$ defines a measure of integration over classical fields as the counterparts of the quantum fields of the theory. "Action" denotes the classical action. In simple cases $d\mu[\chi]$ takes the form $\prod_{a,x}d\chi_a(x)$ as a product over all spacetime points as well as over any other label $a$ a field may carry. In a theory involving constraints the latter expression is, in general, modified. The path integral is a method which allows one to derive, for example, the expression for $\langle 0_+|0_-\rangle$ by carrying out integrations of continual, i.e., (uncountably) infinite dimensional, integrals. To develop this powerful formalism, let us first learn how to introduce continual integrals. To this end, consider first the integrals

$$\int_{-\infty}^{\infty}\frac{dy}{\sqrt{2\pi/i}}\;\exp\Big[-\,i\,\frac{a}{2}\,y^2\Big] = \frac{1}{\sqrt{a}}, \tag{24.1}$$

$$\int_{-\infty}^{\infty}\frac{dy}{\sqrt{2\pi/i}}\;\exp\Big[\,i\Big(-\frac{a}{2}\,y^2 + b\,y\Big)\Big] = \frac{1}{\sqrt{a}}\,\exp\Big[\frac{i}{2}\,\frac{b^2}{a}\Big], \tag{24.2}$$

$$\int\frac{dy_1}{\sqrt{2\pi/i}}\cdots\frac{dy_n}{\sqrt{2\pi/i}}\;\exp\Big[\,i\Big(-\frac{1}{2}\sum_{i,j=1}^n y_iA_{ij}y_j + \sum_{j=1}^n K_jy_j\Big)\Big] = \frac{1}{\sqrt{\det A}}\,\exp\Big[\frac{i}{2}\sum_{i,j=1}^n K_iA^{-1}_{ij}K_j\Big]. \tag{24.3}$$
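The one-dimensional formula (24.2) can be tested numerically once the oscillatory integrand is made convergent. The sketch below is a hedged illustration: it assumes a complex $a$ with a negative imaginary part (a finite stand-in for the damping that an $-i\epsilon$ prescription would supply), a plain Riemann sum on a truncated grid, and the principal branch of the square roots. The function names, cutoff and step size are hypothetical choices.

```python
import cmath

def lhs_24_2(a, b, y_max=30.0, dy=0.01):
    """Riemann-sum estimate of the left-hand side of (24.2) on [-y_max, y_max]."""
    norm = cmath.sqrt(2 * cmath.pi / 1j)          # the factor sqrt(2*pi/i)
    steps = int(round(2 * y_max / dy))
    total = 0j
    for s in range(steps + 1):
        y = -y_max + s * dy
        total += cmath.exp(1j * (-0.5 * a * y * y + b * y))
    return total * dy / norm

def rhs_24_2(a, b):
    """Closed form on the right-hand side of (24.2)."""
    return cmath.exp(0.5j * b * b / a) / cmath.sqrt(a)

a = 1.0 - 0.5j    # assumed: Im(a) < 0 so that the integrand decays like exp(-0.25 y^2)
b = 0.8
numeric = lhs_24_2(a, b)
closed = rhs_24_2(a, b)
print(abs(numeric - closed))
```

The same completion of the square, applied to each eigen-direction of the matrix $A$, is what turns (24.2) into the $n$-dimensional result (24.3).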

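On the other hand, applying the differential operator $\exp\big[i\lambda\sum_{\ell=1}^n\big(-i\,\partial/\partial K_\ell\big)^N\big]$ to the last equation, beginning from its r.h.s., and using the identity $\exp\big[i(-i\,d/dx)^n\big]\exp[ikx] = \exp\big[i(kx + (k)^n)\big]$, we obtain

$$\exp\Big[i\lambda\sum_{\ell=1}^n\Big(-i\frac{\partial}{\partial K_\ell}\Big)^N\Big]\exp\Big[\frac{i}{2}\sum_{i,j=1}^n K_iA^{-1}_{ij}K_j\Big] = \sqrt{\det A}\int\prod_{\ell=1}^n\frac{dy_\ell}{\sqrt{2\pi/i}}\;\exp\Big[\,i\Big(-\frac{1}{2}\sum_{i,j=1}^n y_iA_{ij}y_j + \sum_{j=1}^n K_jy_j + \lambda\sum_{j=1}^n (y_j)^N\Big)\Big]. \tag{24.4}$$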

¹ Feynman [5, 7, 8].
² Dirac [3].

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_24

Now we consider the limit $n\to\infty$ corresponding to all points of spacetime as well as to the internal degrees of freedom and the various components a field may carry, with the $y_n$ now going over to fields $\chi_a(x)$. The integral corresponding to (24.3) is then defined as follows in terms of integrations over classical fields:

$$\int (\mathcal{D}\chi)\,\exp\Big[\,i\Big(\int (dx)(dx')\Big(-\frac{1}{2}\chi_a(x)\,M^{-1}_{ab}(x,x')\,\chi_b(x')\Big) + \int (dx)\,K_a(x)\,\chi_a(x)\Big)\Big] = \frac{1}{\sqrt{\det M^{-1}}}\,\exp\Big[\frac{i}{2}\int (dx)(dx')\,K_a(x)\,M_{ab}(x,x')\,K_b(x')\Big], \tag{24.5}$$

where $\det M^{-1}$ is defined as a determinant in spacetime variables and in the indices $a$. It is obtained by setting $K_a(x)$ equal to zero and is hence just a multiplicative numerical factor. In applications it may be omitted as it cancels out. On the other hand, $(\mathcal{D}\chi)$ and $M^{-1}_{ab}(x,x')$, as the inverse of $M_{ab}(x,x')$, are defined as follows:

$$(\mathcal{D}\chi) = \prod_{x,a}\frac{d\chi_a(x)}{\sqrt{2\pi/i}}, \qquad \int (dx')\,M^{-1}_{ab}(x,x')\,M_{bc}(x',x'') = \delta^{(4)}(x,x'')\,\delta_{ac}. \tag{24.6}$$

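Moreover the integral corresponding to (24.4) becomes

$$\sqrt{\det M^{-1}}\int (\mathcal{D}\chi)\,\exp\Big[\,i\Big(-\frac{1}{2}\int (dx)(dx')\,\chi_a(x)\,M^{-1}_{ab}(x,x')\,\chi_b(x') + \int (dx)\,K_a(x)\,\chi_a(x) + \lambda\int (dx)\,\big(\chi_a(x)\big)^N\Big)\Big]$$

$$= \exp\Big[i\lambda\int (dx)\Big(-i\frac{\delta}{\delta K_a(x)}\Big)^N\Big]\exp\Big[\frac{i}{2}\int (dx)(dx')\,K_a(x)\,M_{ab}(x,x')\,K_b(x')\Big]. \tag{24.7}$$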

As an immediate application, consider the Lagrangian density of a Hermitian scalar field,

$$\mathcal{L} = -\frac{1}{2}\partial_\mu\varphi\,\partial^\mu\varphi - \frac{m^2}{2}\varphi^2 + \frac{\lambda}{4}\varphi^4 + K\varphi. \tag{24.8}$$

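Then the vacuum-to-vacuum amplitude as obtained directly from the differential formalism is given by

$$\langle 0_+|0_-\rangle = \exp\Big[i\,\frac{\lambda}{4}\int (dx)\Big(-i\frac{\delta}{\delta K(x)}\Big)^4\Big]\exp\Big[\frac{i}{2}\int (dx)(dx')\,K(x)\,\Delta_+(x-x')\,K(x')\Big], \tag{24.9}$$

where $\Delta_+(x-x') = (-\Box + m^2 - i\epsilon)^{-1}\delta^{(4)}(x-x')$ is the propagator. The path integral expression, on the other hand, is from (24.7), up to an unimportant numerical factor independent of $K$, given by

$$\langle 0_+|0_-\rangle = \int (\mathcal{D}\varphi)\,\exp\Big[\,i\Big(\int (dx)\Big(-\frac{1}{2}\varphi(x)\big[-\Box + m^2\big]\varphi(x)\Big) + \frac{\lambda}{4}\int (dx)\,\varphi^4(x) + \int (dx)\,K(x)\,\varphi(x)\Big)\Big], \tag{24.10}$$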

defined as integrations over classical fields, where, for convenience, we used the same notation for the classical field as the quantum one. An integration by parts in the first term in the exponential gives

$$
\int (dx) \Big( -\frac{1}{2}\,\phi(x)\,[-\Box + m^2]\,\phi(x) + \frac{\lambda}{4}\,\phi^4(x) + K(x)\,\phi(x) \Big)
= \int (dx) \Big( -\frac{1}{2}\,\partial_\mu\phi(x)\,\partial^\mu\phi(x) - \frac{m^2}{2}\,\phi^2(x) + \frac{\lambda}{4}\,\phi^4(x) + K(x)\,\phi(x) \Big)
= \int (dx)\, \mathcal{L}(x),
\tag{24.11}
$$

denoting the action integral. In order to obtain the vacuum-to-vacuum amplitude $\langle 0_+ | 0_- \rangle$ of the underlying theory, it is thus far easier to start from the functional differential form in (24.9), by simple functional differentiation $[\delta/\delta K(x)]\,K(x') = \delta^{(4)}(x' - x)$, than by starting from the functional integral in (24.10) and attempting to carry out the continual functional integral in it. We also note that by re-writing (24.10) in the form

$$
\int (\mathcal{D}\phi)\, \exp i\Big[ \int (dx) \Big( -\frac{1}{2}\,\phi(x)\,[-\Box + m^2]\,\phi(x) + \frac{\lambda}{4}\,\phi^4(x) \Big) \Big]\, \exp\Big[ i \int (dx)\, K(x)\,\phi(x) \Big],
\tag{24.12}
$$

one recognizes the first exponential, as a functional of $\phi$, in the above functional (path) integral simply as the functional Fourier transform of $\langle 0_+ | 0_- \rangle$, as given in (24.9) as a functional differential form, by comparing it to the elementary one-dimensional case as follows. Given two explicitly known functions $f_1(x)$, $f_2(k)$ of single variables $x$ and $k$, respectively, we may write

$$
f_1\Big( -i\frac{d}{dk} \Big) f_2(k) = \int_{-\infty}^{\infty} dx\, f(x)\, e^{ikx}, \quad \text{where} \quad
f(x) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \Big[ f_1\Big( -i\frac{d}{dk} \Big) f_2(k) \Big]\, e^{-ikx},
$$

where the expression on the left-hand side of the first equality corresponds to the functional differential form, i.e., of $\langle 0_+ | 0_- \rangle$, expressed in terms of two known functions, while the expression on its right-hand side corresponds to the functional (path) integral form.

Now we introduce functional integrals for anti-commuting variables, referred to as Grassmann variables.³ Consider a set of anti-commuting variables $\bar\rho_1, \rho_1; \ldots; \bar\rho_n, \rho_n$, occurring in pairs, for which, for example, $(\rho_1)^2 = 0$, $(\bar\rho_1)^2 = 0$. A typical integral of interest involving these Grassmann variables is given by

$$
\int \frac{d\bar\rho_1\, d\rho_1}{i} \cdots \frac{d\bar\rho_n\, d\rho_n}{i}\, \exp i\Big[ -\sum_{i,j=1}^{n} \bar\rho_i\, M^{-1}_{ij}\, \rho_j + \sum_{j=1}^{n} \big( \bar\rho_j\,\eta_j + \bar\eta_j\,\rho_j \big) \Big]
= \det M^{-1}\, \exp\Big[ i \sum_{i,j=1}^{n} \bar\eta_i\, M_{ij}\, \eta_j \Big],
\tag{24.13}
$$

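where the $M_{ij}$ are matrix elements of c-numbers. A major difference between the bosonic case in (24.5) and the fermionic one in (24.13) above is that $\det M^{-1}$, with a square root, appears in the denominator on the right-hand side of the equation in (24.5) rather than in the numerator as in (24.13). In the limit $n \to \infty$, one may then define a continual integral by

$$
\int (\mathcal{D}\bar\rho)(\mathcal{D}\rho)\, \exp i\Big[ -\int (dx)(dx')\, \bar\rho_a(x)\, M^{-1}_{ab}(x,x')\, \rho_b(x') + \int (dx) \big( \bar\rho_a(x)\,\eta_a(x) + \bar\eta_a(x)\,\rho_a(x) \big) \Big]
= (\det M^{-1})\, \exp\Big[ i \int (dx)(dx')\, \bar\eta_a(x)\, M_{ab}(x,x')\, \eta_b(x') \Big],
\tag{24.14}
$$

where $(\mathcal{D}\bar\rho)(\mathcal{D}\rho) = \prod_{x,a} (d\bar\rho_a/i)(d\rho_a/i)$. Again, we note that $(\det M^{-1})$ is independent of $\eta_a$, $\bar\eta_a$, as it is obtained by setting them equal to zero. As an application of the above equation, consider $M^{-1}(x,x') = (\gamma^\mu \partial_\mu/i + m)\,\delta^{(4)}(x - x')$. Up to a multiplicative numerical factor independent of $\eta$, $\bar\eta$, it leads to the useful expression

$$
\exp\Big[ i \int (dx)(dx')\, \bar\eta(x)\, S_+(x - x')\, \eta(x') \Big]
= \int (\mathcal{D}\bar\rho)(\mathcal{D}\rho)\, \exp i\Big[ -\int (dx)\, \bar\rho(x) \Big( \gamma^\mu \frac{\partial_\mu}{i} + m \Big) \rho(x) + \int (dx) \big( \bar\rho(x)\,\eta(x) + \bar\eta(x)\,\rho(x) \big) \Big].
\tag{24.15}
$$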
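The statement that the determinant lands in the numerator for anti-commuting variables can be checked for small $n$ with a toy Grassmann algebra. The following is a minimal sketch, assuming sources set to zero, a real matrix `A` standing in for the kernel, and a particular ordering of the generators; the helper names (`mul_monomials`, `grassmann_exp`, `top_coefficient`) are hypothetical, and convention-dependent constants such as the factors of $i$ in the measure are deliberately left out.

```python
import numpy as np

def mul_monomials(a, b):
    """Multiply two monomials given as sorted tuples of generator indices.

    Returns (sorted_monomial, sign); sign is 0 if a generator repeats,
    since the square of any Grassmann generator vanishes.
    """
    if set(a) & set(b):
        return None, 0
    seq = a + b
    inversions = sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )
    return tuple(sorted(seq)), (-1 if inversions % 2 else 1)

def mul(x, y):
    """Product of two algebra elements stored as {monomial: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            m, s = mul_monomials(a, b)
            if s:
                out[m] = out.get(m, 0.0) + s * ca * cb
    return out

def grassmann_exp(x, order):
    """exp(x) as a finite series; it terminates because x is nilpotent."""
    result = {(): 1.0}
    term = {(): 1.0}
    for k in range(1, order + 1):
        term = {m: c / k for m, c in mul(term, x).items()}
        for m, c in term.items():
            result[m] = result.get(m, 0.0) + c
    return result

def top_coefficient(A):
    """Coefficient of rho_bar_1 rho_1 ... rho_bar_n rho_n in exp(sum rho_bar_i A_ij rho_j)."""
    n = A.shape[0]
    bilinear = {}
    for i in range(n):
        for j in range(n):
            # rho_bar_i is generator 2i, rho_j is generator 2j+1 (assumed ordering)
            m, s = mul_monomials((2 * i,), (2 * j + 1,))
            bilinear[m] = bilinear.get(m, 0.0) + s * A[i, j]
    return grassmann_exp(bilinear, n).get(tuple(range(2 * n)), 0.0)

A = np.random.default_rng(1).normal(size=(3, 3))
print(top_coefficient(A), np.linalg.det(A))
```

Since the Berezin integral simply picks out the coefficient of the top monomial, the two printed numbers agreeing illustrates that the Gaussian Grassmann integral is proportional to the determinant itself, in contrast to the inverse square root in the bosonic case (24.5).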

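The path integral expression of QED may be readily given. For the photon propagator in an arbitrary covariant gauge specified by a parameter $\lambda$, we have (see Eqs. (22.17), (22.22)–(22.25) in Chap. 22)

$$
\mathcal{L} = \frac{1}{2}\Big[ \frac{\partial_\mu \bar\psi}{i}\,\gamma^\mu \psi - \bar\psi\,\gamma^\mu\, \frac{\partial_\mu \psi}{i} \Big] - m_0\,\bar\psi\psi + \bar\eta\,\psi + \bar\psi\,\eta - \frac{1}{4}\,F_{\mu\nu}F^{\mu\nu} - \chi\,\partial_\mu A^\mu + \frac{\lambda}{2}\,\chi^2 + A^\mu J_\mu,
\tag{24.16}
$$

$$
D^{\mu\nu}(k) = \Big[ \eta^{\mu\nu} - (1 - \lambda)\,\frac{k^\mu k^\nu}{k^2} \Big] \frac{1}{k^2 - i\epsilon}, \qquad
D^{-1}_{\mu\nu}(k) = \eta_{\mu\nu}\,k^2 - \Big( 1 - \frac{1}{\lambda} \Big) k_\mu k_\nu, \qquad
D^{-1}_{\mu\sigma}(k)\, D^{\sigma\nu}(k) = \delta_\mu{}^\nu.
\tag{24.17}
$$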
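The inverse relation in (24.17) is easy to verify numerically. The snippet below is a small sketch, assuming the metric $\eta = \mathrm{diag}(-1, 1, 1, 1)$ implied by the mass-shell convention $p^2 = -m^2$, an off-shell momentum with $k^2 \neq 0$, and the $i\epsilon$ dropped; the function names are illustrative only.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # assumed metric signature (-,+,+,+)

def propagator_upper(k_up, lam):
    """D^{mu nu}(k) of (24.17) with the i*epsilon dropped (requires k^2 != 0)."""
    k2 = k_up @ ETA @ k_up
    return (ETA - (1.0 - lam) * np.outer(k_up, k_up) / k2) / k2

def inverse_lower(k_up, lam):
    """D^{-1}_{mu nu}(k) of (24.17)."""
    k2 = k_up @ ETA @ k_up
    k_low = ETA @ k_up
    return ETA * k2 - (1.0 - 1.0 / lam) * np.outer(k_low, k_low)

k = np.array([0.3, 1.0, -2.0, 0.5])  # arbitrary off-shell momentum
for lam in (0.5, 1.0, 3.0):
    product = inverse_lower(k, lam) @ propagator_upper(k, lam)
    print(lam, np.allclose(product, np.eye(4)))
```

The case `lam = 1.0` corresponds to the Feynman gauge, where both expressions reduce to multiples of the metric.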

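Accordingly, up to a multiplicative numerical factor independent of $J_\mu$,

$$
\exp\Big[ \frac{i}{2}\int (dx)(dx')\, J_\mu(x)\, D^{\mu\nu}(x - x')\, J_\nu(x') \Big]
= \int (\mathcal{D}A)\, \exp i\Big[ \int (dx) \Big( -\frac{1}{2}\,A^\mu \Big[ -\eta_{\mu\nu}\Box + \partial_\mu\partial_\nu - \frac{1}{\lambda}\,\partial_\mu\partial_\nu \Big] A^\nu + A^\mu J_\mu \Big) \Big]
= \int (\mathcal{D}A)\, \exp i\Big[ \int (dx) \Big( -\frac{1}{4}\,F_{\mu\nu}F^{\mu\nu} - \frac{1}{2\lambda}\,\partial_\mu A^\mu\, \partial_\nu A^\nu + A^\mu J_\mu \Big) \Big],
\tag{24.18}
$$

where $F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$, using, for convenience, the same notation for the classical fields $A^\mu(x)$ as the quantum ones. Moreover, using the property of delta functions, we may write

$$
\exp\Big[ -\frac{i}{2\lambda}\int (dx)\, \partial_\mu A^\mu(x)\, \partial_\nu A^\nu(x) \Big]
= \int (\mathcal{D}\chi)\, \delta\big( \partial_\mu A^\mu - \lambda\chi \big)\, \exp i\Big[ \int (dx) \Big( -\chi(x)\,\partial_\mu A^\mu(x) + \frac{\lambda}{2}\,\chi^2(x) \Big) \Big],
\tag{24.19}
$$

where $\delta\big( \partial_\mu A^\mu - \lambda\chi \big) = \prod_x \delta\big( \partial_\mu A^\mu(x) - \lambda\chi(x) \big)$. From (24.14), (24.15), (24.18) and (24.10), the explicit expression for $\langle 0_+ | 0_- \rangle$ in the path integral representation, as follows from the differential one, emerges as

$$
\langle 0_+ | 0_- \rangle = \int (\mathcal{D}\bar\rho)(\mathcal{D}\rho)(\mathcal{D}\chi)(\mathcal{D}A)\, \delta\big( \partial_\mu A^\mu - \lambda\chi \big)\, \exp\Big[ i \int (dx)\, \mathcal{L}_c(x) \Big],
\tag{24.20}
$$

3 For a more detailed treatment of integrations over Grassmann variables, see Manoukian [9], pp. 57–70.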

up to a multiplicative numerical factor independent of $\eta$, $\bar\eta$, $J_\mu$, where $\mathcal{L}_c(x)$ is the classical Lagrangian density obtained from the quantum one in (24.16) by the substitutions $\psi, \bar\psi, A^\mu \to \rho, \bar\rho, A^\mu$, with the latter now classical fields (and the first two Grassmann fields). The above equation shows that the gauge constraint is imposed via the delta functional $\delta\big( \partial_\mu A^\mu - \lambda\chi \big)$. An equivalent expression to the one in (24.20) is obtained as before by integrating over the field $\chi$, giving the following full path integral expression for QED⁴

$$
\langle 0_+ | 0_- \rangle = \int (\mathcal{D}\bar\rho)(\mathcal{D}\rho)(\mathcal{D}A)\, \exp i\Big[ \int (dx) \Big( \mathcal{L}_c(x) - \frac{1}{2\lambda}\,\partial_\mu A^\mu\, \partial_\nu A^\nu \Big) \Big].
\tag{24.21}
$$

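We now turn to the path integral expression for the non-abelian gauge theory given by the Lagrangian density

$$
\mathcal{L} = -\frac{1}{4}\,G_{a\mu\nu}\,G_a^{\mu\nu} + A_{a\mu}\,J_a^\mu, \qquad
G_a^{\mu\nu} = \partial^\mu A_a^\nu - \partial^\nu A_a^\mu + g_0\, f_{abc}\, A_b^\mu A_c^\nu,
\tag{24.22}
$$

$$
\mathcal{L}_I = -\frac{1}{2}\,g_0\, f_{abc}\, A_{b\mu} A_{c\nu}\, F_a^{\mu\nu} - \frac{1}{4}\,g_0^2\, f_{abc}\, f_{ab'c'}\, A_{b\mu} A_{c\nu}\, A_{b'}^\mu A_{c'}^\nu, \qquad
F_a^{\mu\nu} = \partial^\mu A_a^\nu - \partial^\nu A_a^\mu,
\tag{24.23}
$$

$$
\nabla_{ab}^\mu = \delta_{ab}\,\partial^\mu + g_0\, f_{acb}\, A_c^\mu.
\tag{24.24}
$$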

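In the Coulomb gauge $\partial_j A_a^j = 0$, it is now straightforward to derive the expression for $\langle 0_+ | 0_- \rangle$, with $\langle 0_+ | 0_- \rangle_{g_0 = 0}$ derived, in turn, in Box 24.1. It may be written down from the expression of $\langle 0_+ | 0_- \rangle_C$ given in Eq. (23.30) in Chap. 23 to be (a hat denoting the replacement of $A_a^\mu$ by $(-i)\,\delta/\delta J_{a\mu}$, as in Box 24.1)

$$
\langle 0_+ | 0_- \rangle_C = \det\big[ \hat\nabla_{ac}^{\,j}\, \partial_j \big]\, \exp\Big[ i \int (dx)\, \hat{\mathcal{L}}_I(x) \Big]\, \langle 0_+ | 0_- \rangle_C \Big|_{g_0 = 0}
= \int (\mathcal{D}A)\, \det\big[ \nabla_{ac}^{\,j}\, \partial_j \big] \prod_b \delta\big( \partial_j A_b^j \big)\, \exp i\Big[ \int (dx) \Big( -\frac{1}{4}\,G_{a\mu\nu}(x)\,G_a^{\mu\nu}(x) + A_{a\mu}(x)\,J_a^\mu(x) \Big) \Big].
\tag{24.25}
$$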

The covariant version of the above expression may be obtained from Eqs. (23.33), (23.34) in the last chapter together with Eqs. (24.17), (24.21) above, to be given by⁵

$$
\langle 0_+ | 0_- \rangle = \int (\mathcal{D}A)\, \det\big[ \partial_\mu \nabla_{ac}^\mu \big]\, \exp i\Big[ \int (dx) \Big( -\frac{1}{4}\,G_{a\mu\nu}(x)\,G_a^{\mu\nu}(x) + A_{a\mu}(x)\,J_a^\mu(x) - \frac{1}{2\lambda}\,\partial_\mu A_a^\mu\, \partial_\nu A_a^\nu \Big) \Big],
\tag{24.26}
$$

where we have used the relation $\exp \mathrm{Tr} \ln [B] = \det B$ for a matrix $B$. A factor such as $\det[\partial_\mu \nabla_{ac}^\mu]$ in (24.26), modifying the measure $(\mathcal{D}A)$, is referred to as a Feynman-DeWitt-Faddeev-Popov factor or, more commonly, as a Faddeev-Popov determinant.⁶ Unlike the bosonic case in (24.5), we have seen in (24.14) that the determinant of $M^{-1}$ there appears in the numerator on its right-hand side, without a square root. It is obtained by setting $\eta = 0$, $\bar\eta = 0$ in (24.14). Accordingly, by carrying out the substitution $M \to -M$, we may express $\det[\partial_\mu \nabla_{ac}^\mu]$ in (24.26) as a path integral

$$
\det\big[ \partial_\mu \nabla_{ac}^\mu \big] = \int (\mathcal{D}\bar\rho)(\mathcal{D}\rho)\, \exp\Big[ i \int (dx)(dx')\, \bar\rho_a(x)\, \partial_\mu \nabla_{ac}^\mu\, \rho_c(x') \Big]
\tag{24.27}
$$

4 The Landau gauge $\lambda = 0$ is obviously obtained by deleting the $(1/2\lambda)\,\partial_\mu A^\mu \partial_\nu A^\nu$ term and by multiplying the integrand by $\delta(\partial_\mu A^\mu)$ (see (24.20)), and by using the propagator in (24.16) with $\lambda$ set in it equal to zero.
5 For a rigorous derivation of this, see Manoukian [9], pp. 389–395.

up to a multiplicative numerical factor, defining spin 0 fields in the theory which obey Fermi-Dirac statistics. Because of this unusual property, such spin 0 fields are referred to as ghost fields. As they are not observable particles described by asymptotic in and out states, there is no physical inconsistency in the underlying theoretical description. They necessarily emerge here, however, in order to cancel out those contributions in the theory which would otherwise destroy gauge invariance. The part $\delta_{ac}\,\Box$ appearing in $\partial_\mu \nabla_{ac}^\mu$ in Eq. (24.27) (see Eq. (24.24)) gives rise to a propagator equal to $\delta_{ac}\,D_+(x - x')$ for the ghost particles. One may finally rewrite (24.26) as

$$
\langle 0_+ | 0_- \rangle = \int (\mathcal{D}A)(\mathcal{D}\bar\rho)(\mathcal{D}\rho)\, \exp i\Big[ \int (dx) \Big( \mathcal{L}_{\text{eff}}(x) + A_{a\mu}\,J_a^\mu(x) \Big) \Big],
\tag{24.28}
$$

$$
\mathcal{L}_{\text{eff}} = -\frac{1}{4}\,G_{a\mu\nu}(x)\,G_a^{\mu\nu}(x) - \frac{1}{2\lambda}\,\partial_\mu A_a^\mu\, \partial_\nu A_a^\nu + \bar\rho_a\, \partial_\mu \nabla_{ab}^\mu\, \rho_b.
\tag{24.29}
$$

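Box 24.1 Path integral of $\langle 0_+ | 0_- \rangle_{g_0 = 0}$ in the Coulomb gauge

$$
\langle 0_+ | \partial_\nu F_a^{\nu\mu}(x) | 0_- \rangle_{g_0 = 0} = -\bar{J}_a^\mu(x)\, \langle 0_+ | 0_- \rangle_{g_0 = 0}, \qquad
\bar{J}_a^\mu = J_a^\mu - \delta^{\mu j}\, \frac{\partial_j}{\nabla^2}\, \partial_\nu J_a^\nu.
$$

Let $\hat{A}_a^\mu(x) = (-i)\,\delta/\delta J_{a\mu}(x)$. Then $\partial_\nu \hat{F}_a^{\nu\mu}(x)\, \langle 0_+ | 0_- \rangle_{g_0 = 0} = -\bar{J}_a^\mu\, \langle 0_+ | 0_- \rangle_{g_0 = 0}$ $(\ast)$.

Let $\hat{A} = (-1/4) \int (dx)\, \hat{F}_{\nu\mu}(x)\, \hat{F}^{\nu\mu}(x)$. The relations

$$
\Big[ \frac{\delta}{\delta J_a^\mu(x)},\, J_b^\nu(x') \Big] = \delta_\mu{}^\nu\, \delta_{ab}\, \delta^{(4)}(x' - x), \qquad
\Big[ \frac{\delta}{\delta J_a^\mu(x)},\, \frac{\delta}{\delta J_b^\nu(x')} \Big] = 0,
$$

lead to $[\, i\hat{A},\, \bar{J}_a^\mu \,] = \partial_\nu \hat{F}_a^{\nu\mu}(x)$ and $[\, i\hat{A},\, [\, i\hat{A},\, \bar{J}_a^\mu \,]\,] = 0$. Hence, from Eq. $(\ast)$ above, we obtain $\big( [\, i\hat{A},\, \bar{J}_a^\mu \,] + \bar{J}_a^\mu \big)\, \langle 0_+ | 0_- \rangle_{g_0 = 0} = 0$. We may now use the fact that for two operators $A$ and $B$, if $[A, [A, B]] = 0$ then $e^{A} B\, e^{-A} = [A, B] + B$, to infer that $\exp[\, i\hat{A}\,]\, \bar{J}_a^\mu\, \exp[-i\hat{A}\,]\, \langle 0_+ | 0_- \rangle_{g_0 = 0} = 0$, or $\bar{J}_a^\mu\, \exp[-i\hat{A}\,]\, \langle 0_+ | 0_- \rangle_{g_0 = 0} = 0$.

The basic equation $x f(x) = 0$ implies that $f(x) = \text{Const.}\ \delta(x)$, from which we obtain the equation

$$
\langle 0_+ | 0_- \rangle_{g_0 = 0} = \exp[\, i\hat{A}\,] \prod_{a\mu} \delta\big( \bar{J}_a^\mu \big), \qquad \text{where} \quad \delta\big( \bar{J}_a^\mu \big) = \prod_x \delta\big( \bar{J}_a^\mu(x) \big).
$$

In Box 24.2 we show that

$$
\prod_{a\mu} \delta\big( \bar{J}_a^\mu \big) = \prod_a \delta\Big( -i\,\partial_j \frac{\delta}{\delta J_a^j} \Big) \prod_{b\mu} \delta\big( J_b^\mu \big).
$$

Using the functional integral $\prod_{a\mu} \delta(J_a^\mu) = \int (\mathcal{D}A)\, \exp\big[\, i \int (dx)\, A_{a\mu}\, J_a^\mu(x) \big]$ then gives

$$
\langle 0_+ | 0_- \rangle_{g_0 = 0} = \prod_a \delta\Big( -i\,\partial_j \frac{\delta}{\delta J_a^j} \Big)\, \exp[\, i\hat{A}\,] \prod_{b\mu} \delta\big( J_b^\mu \big),
$$

or

$$
\langle 0_+ | 0_- \rangle_{g_0 = 0} = \int (\mathcal{D}A) \prod_b \delta\big( \partial_j A_b^j \big)\, \exp i\Big[ \int (dx) \Big( -\frac{1}{4}\,F_{a\mu\nu}(x)\,F_a^{\mu\nu}(x) + A_{a\mu}(x)\,J_a^\mu(x) \Big) \Big].
$$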

6 Feynman [6]; DeWitt [1, 2]; Faddeev and Popov [4].

Box 24.2 A basic relation in the Coulomb gauge

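It is shown that

$$
\prod_{a\mu x} \delta\big( \bar{J}_a^\mu(x) \big) = \prod_{ax} \delta\Big( -i\,\partial_j \frac{\delta}{\delta J_a^j(x)} \Big) \prod_{b\mu x} \delta\big( J_b^\mu(x) \big),
$$

up to a multiplicative constant. To this end, first note that $\delta(J^0) = \prod_{ax} \delta\big( J_a^0(x) \big)$ means to set $J_a^0(x)$ equal to zero for all $(a, x)$ in a function multiplied by it. Accordingly,

$$
\begin{aligned}
\prod_{a\mu x} \delta\big( \bar{J}_a^\mu(x) \big)
&= \delta(J^0)\, \delta\Big( J^i - \frac{\partial^i \partial^j}{\nabla^2}\, J^j \Big)
= \delta(J^0) \int (\mathcal{D}A)\, \exp i\Big[ \int (dx)\, J_a^i \Big( A_a^i - \frac{\partial^i \partial^j}{\nabla^2}\, A_a^j \Big) \Big] \\
&= \delta(J^0) \int (\mathcal{D}A) \prod_{bx} \int d\lambda_b(x)\, \delta\big( \partial^j A_b^j(x) - \lambda_b(x) \big)\, \exp i\Big[ \int (dx)\, J_a^i \Big( A_a^i - \frac{\partial^i}{\nabla^2}\, \lambda_a \Big) \Big] \\
&= \delta(J^0) \prod_{bx} \delta\Big( -i\,\partial^i \frac{\delta}{\delta J_b^i(x)} \Big) \int (\mathcal{D}A) \prod_{bx} \int d\lambda_b(x)\, \exp i\Big[ \int (dx)\, J_a^i \Big( A_a^i - \frac{\partial^i}{\nabla^2}\, \lambda_a \Big) \Big] \\
&= \delta(J^0) \prod_{bx} \delta\Big( -i\,\partial^i \frac{\delta}{\delta J_b^i(x)} \Big) \prod_{bx} \int d\lambda_b(x) \int (\mathcal{D}A)\, \exp i\Big[ \int (dx)\, J_a^i A_a^i \Big], \qquad A_a^i \to A_a^i + \frac{\partial^i}{\nabla^2}\, \lambda_a \\
&= \prod_{ax} \delta\Big( -i\,\partial_j \frac{\delta}{\delta J_a^j(x)} \Big) \prod_{b\mu x} \delta\big( J_b^\mu(x) \big),
\end{aligned}
$$

up to a multiplicative factor that does not contribute in applications.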

References

1. DeWitt, B. (1964). Theory for radiative corrections for non-abelian gauge fields. Physical Review Letters, 12, 742–746.
2. DeWitt, B. (1967). Quantum theory of gravity. II. The manifestly covariant theory. Physical Review, 162, 1195–1239.
3. Dirac, P. A. M. (1933). The Lagrangian in quantum mechanics. Physikalische Zeitschrift der Sowjetunion, 3, 64–72.
4. Faddeev, L. D., & Popov, V. N. (1967). Feynman diagrams for the Yang-Mills field. Physics Letters B, 25, 29–30.
5. Feynman, R. P. (1948). Space-time approach to non-relativistic quantum mechanics. Reviews of Modern Physics, 20, 367–387.
6. Feynman, R. P. (1963). Quantum theory of gravitation. Acta Physica Polonica, 24, 697–722.
7. Feynman, R. P., & Hibbs, A. R. (1965). Quantum mechanics and path integrals. New York: McGraw-Hill.
8. Feynman, R. P. (1972). The development of the space-time view of quantum electrodynamics. In Nobel Lectures. Physics 1963–1970. 11 Dec 1965. Amsterdam: Elsevier.
9. Manoukian, E. B. (2016). Quantum field theory I: Foundations and abelian and non-abelian gauge theories. Switzerland: Springer.

25 Transition Amplitudes and the Meaning of Virtual Particles

Prerequisites: Chaps. 13, 21–23

The energy-momentum of a particle $p^\mu = (p^0, \mathbf{p})$ satisfies the constraint $p_\mu p^\mu \equiv p^2 = -m^2$, referred to as the mass shell constraint, for $m \neq 0$ and $m = 0$. This corresponds to particles detected in experiments. There are, however, particles, referred to as virtual particles, for which $p^2 \neq -m^2$. Such particles may not be observed and hence are not detected, but they have a key role in fundamental processes and in associated radiative corrections. They may have a surplus of momentum $|\mathbf{p}|$, for which $\mathbf{p}^2 - (p^0)^2 > -m^2$, or they may have a surplus of energy, for which $\mathbf{p}^2 - (p^0)^2 < -m^2$. A virtual particle may borrow energy-momentum from an incident particle to restore the balance between its energy and momentum, thus satisfying the mass shell restriction, and emerge in the process in question as an observable particle. On the other hand, it may give up all of its unbalanced energy-momentum to an incident particle to scatter it off, or it may simply lead to the creation of some particles. Such conversions of virtual particles provide the essence of a Feynman diagram. All such processes are, of course, subject to symmetry constraints such as charge conservation.¹ In Fig. 25.1, we provide three examples involving virtual particles:

• In Fig. 25.1a, a process is given in which a photon,² of momentum $k_1$, is one of the initial particles, and an electron, of momentum $p_2$, is one of the emerging particles, while the remaining part of the process is depicted by the shaded area. Consistent with charge conservation, an electron with momentum $p$ is shown in this part of the figure. In Box 25.1, it is shown that $p^2 > -m^2$, so the latter electron is a virtual one. It borrows energy-momentum from the photon and surfaces out of the process as a real particle.
• In Fig. 25.1b, we consider the scattering of an electron, with a non-zero angle between its initial and final three-momenta, caused by a virtual photon of momentum $k$, shown to satisfy the off-shell condition $k^2 > 0$. The underlying kinematics is worked out in Box 25.2.
• In Fig. 25.1c, the virtual photon of momentum $k$ is shown to satisfy the off-shell condition $k^2 < 0$, and it gives off its energy-momentum to create an electron-positron pair. The underlying kinematics is worked out in Box 25.3.

Box 25.1 $p = p_2 - k_1$, $\mathbf{p}_2 \neq 0$, $\mathbf{k}_1 \neq 0$, $p_2^2 = -m^2$, $k_1^2 = 0$ $\Rightarrow$ $p^2 > -m^2$

$$
p^2 = -m^2 + 2|\mathbf{k}_1| \Big( \sqrt{|\mathbf{p}_2|^2 + m^2} - |\mathbf{p}_2| \cos\theta \Big),
$$

$$
\Rightarrow\ p^2 + m^2 = 2|\mathbf{k}_1| \Big( \sqrt{|\mathbf{p}_2|^2 + m^2} - |\mathbf{p}_2| \cos\theta \Big) \geq 2|\mathbf{k}_1| \Big( \sqrt{|\mathbf{p}_2|^2 + m^2} - |\mathbf{p}_2| \Big) > 0, \qquad p^2 > -m^2.
$$

1 Due to the masslessness of the photon, a charged particle may be accompanied by an unlimited number of undetected, almost zero energy photons, referred to as soft photons, and the mass shell restriction for a charged particle may then be valid only up to the inherent experimental resolution. This will be studied in the next chapter.
2 By momentum, of course, one means a four-momentum.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_25

[Figure 25.1: three diagrams, panels (a), (b) and (c); graphics not reproduced. Panel kinematics: (a) $p = p_2 - k_1$, $\mathbf{p}_2 \neq 0$, $\mathbf{k}_1 \neq 0$, $p_2^2 = -m^2$, $k_1^2 = 0$. (b) $k = p_1 - p_2$, $\cos\theta < 1$, $\mathbf{p}_1 \neq 0$, $\mathbf{p}_2 \neq 0$, $p_1^2 = p_2^2 = -m^2$. (c) $k = p_1 + p_2$, $p_1^2 = p_2^2 = -m^2$.]

Fig. 25.1 Three examples involving virtual particles as described above, with $p^2 > -m^2$, $k^2 > 0$, $k^2 < 0$, respectively, in (a), (b) and (c)

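Box 25.2 $k = p_1 - p_2$, $\cos\theta < 1$, $\mathbf{p}_1 \neq 0$, $\mathbf{p}_2 \neq 0$, $p_1^2 = p_2^2 = -m^2$ $\Rightarrow$ $k^2 > 0$

$$
\begin{aligned}
|\mathbf{p}_1|^2 |\mathbf{p}_2|^2 \big( 1 - \cos^2\theta \big) &\geq -m^2 \big( |\mathbf{p}_1| - |\mathbf{p}_2| \big)^2 \quad \text{(trivially)} \\
&= -m^2 \big( |\mathbf{p}_1|^2 + |\mathbf{p}_2|^2 \big) + 2m^2 |\mathbf{p}_1||\mathbf{p}_2| > -m^2 \big( |\mathbf{p}_1|^2 + |\mathbf{p}_2|^2 \big) + 2m^2 |\mathbf{p}_1||\mathbf{p}_2| \cos\theta \quad (1 > \cos\theta) \\
\Rightarrow\ |\mathbf{p}_1|^2 |\mathbf{p}_2|^2 + m^2 \big( |\mathbf{p}_1|^2 + |\mathbf{p}_2|^2 \big) &> |\mathbf{p}_1|^2 |\mathbf{p}_2|^2 \cos^2\theta + 2m^2 |\mathbf{p}_1||\mathbf{p}_2| \cos\theta, \\
\text{(by adding } m^4 \text{ on both sides)} \ \Rightarrow\ \big( |\mathbf{p}_1|^2 + m^2 \big)\big( |\mathbf{p}_2|^2 + m^2 \big) &> \big( |\mathbf{p}_1||\mathbf{p}_2| \cos\theta + m^2 \big)^2, \\
\Rightarrow\ \sqrt{|\mathbf{p}_1|^2 + m^2}\, \sqrt{|\mathbf{p}_2|^2 + m^2} &> \big|\, |\mathbf{p}_1||\mathbf{p}_2| \cos\theta + m^2 \,\big|. \quad (\ast) \\
\Rightarrow\ k^2 = p_1^2 + p_2^2 - 2|\mathbf{p}_1||\mathbf{p}_2| \cos\theta + 2 p_1^0 p_2^0 &= -2m^2 - 2|\mathbf{p}_1||\mathbf{p}_2| \cos\theta + 2\sqrt{|\mathbf{p}_1|^2 + m^2}\, \sqrt{|\mathbf{p}_2|^2 + m^2}, \\
\Rightarrow\ k^2 \geq 2 \Big[ \sqrt{|\mathbf{p}_1|^2 + m^2}\, \sqrt{|\mathbf{p}_2|^2 + m^2} - \big|\, |\mathbf{p}_1||\mathbf{p}_2| \cos\theta + m^2 \,\big| \Big] &> 0. \quad \text{(From } (\ast)\text{)}
\end{aligned}
$$

Box 25.3 $k = p_1 + p_2$, $p_1^2 = p_2^2 = -m^2$ $\Rightarrow$ $k^2 < 0$

$$
\begin{aligned}
k^2 &= -2m^2 - 2\sqrt{|\mathbf{p}_1|^2 + m^2}\, \sqrt{|\mathbf{p}_2|^2 + m^2} + 2|\mathbf{p}_1||\mathbf{p}_2| \cos\theta
\leq -2m^2 - 2\sqrt{|\mathbf{p}_1|^2 + m^2}\, \sqrt{|\mathbf{p}_2|^2 + m^2} + 2|\mathbf{p}_1||\mathbf{p}_2| \\
&= -2m^2 - 2 \Big[ \sqrt{|\mathbf{p}_1|^2 + m^2}\, \sqrt{|\mathbf{p}_2|^2 + m^2} - |\mathbf{p}_1||\mathbf{p}_2| \Big] < 0.
\end{aligned}
$$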
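The three inequalities of Boxes 25.1 to 25.3 can also be spot-checked numerically. The following is a minimal sketch, assuming the metric convention $p^2 = \mathbf{p}^2 - (p^0)^2$ used in the text, randomly drawn three-momenta, and an electron mass value in MeV chosen purely for illustration; the function names are hypothetical.

```python
import numpy as np

def minkowski_sq(p):
    """p^2 = |p_vec|^2 - (p^0)^2 for a four-vector p = (p^0, p_vec)."""
    return float(np.dot(p[1:], p[1:]) - p[0] ** 2)

def on_shell(p_vec, m):
    """Four-momentum with p^2 = -m^2 built from a three-momentum."""
    p_vec = np.asarray(p_vec, dtype=float)
    return np.concatenate(([np.sqrt(p_vec @ p_vec + m * m)], p_vec))

def check_virtualities(n_trials=1000, m=0.511, seed=0):
    """Return True if all three off-shell conditions hold in every trial."""
    rng = np.random.default_rng(seed)
    for _ in range(n_trials):
        p1 = on_shell(rng.normal(size=3) * 5.0, m)    # real electron
        p2 = on_shell(rng.normal(size=3) * 5.0, m)    # real electron (or positron)
        k1 = on_shell(rng.normal(size=3) * 5.0, 0.0)  # real photon
        if not minkowski_sq(p2 - k1) > -m * m:  # Box 25.1: virtual electron
            return False
        if not minkowski_sq(p1 - p2) > 0.0:     # Box 25.2: exchanged photon
            return False
        if not minkowski_sq(p1 + p2) < 0.0:     # Box 25.3: pair-producing photon
            return False
    return True

print(check_virtualities())
```

A random scan of this kind is no substitute for the proofs in the boxes, but it is a quick way to catch a sign error in the metric convention.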

Schwinger³ developed an ingenious, physically clear and elegant method to quickly extract transition amplitudes of fundamental processes directly from the vacuum-to-vacuum transition amplitude of the theory under consideration, in which multiplicative numerical factors also come out automatically. Moreover, it is closely related to actual experimental situations in which particles are produced and emitted, eventually interact, and are finally detected. Thus one starts in a vacuum state, before particles are produced and emitted, and ends up in a vacuum state after the emerging particles are finally detected. In Chap. 16, we have already determined the amplitudes of particles emitted and particles detected by respective sources for particles encountered in QFT/HE. The transition amplitude of a given process may then be extracted as the coefficient of these amplitudes of emission and detection in the expression of $\langle 0_+ | 0_- \rangle$ as follows:

$$
\begin{aligned}
&\langle 0_+ \,|\, \text{particles (out) detected by sources} \rangle \\
&\quad \times \langle \text{particles (out) on their way to detection} \,|\, \text{particles (in) emitted by sources} \rangle \\
&\quad \times \langle \text{particles (in) emitted by sources} \,|\, 0_- \rangle.
\end{aligned}
\tag{25.1}
$$

The amplitudes of emission and detection of particles are given by the product of the individual single-particle amplitudes, which have been determined in Chap. 16. In the transition amplitude on the second line of the above equation, the external sources are set equal to zero. As an application, consider QED in the Feynman gauge.⁴ The explicit solution of the vacuum-to-vacuum transition amplitude $\langle 0_+ | 0_- \rangle$ is given by (see Eqs. (23.8)–(23.11))

$$
\langle 0_+ | 0_- \rangle = \exp\Big[ e \int (dx)\, \frac{\delta}{\delta J^\mu(x)}\, \frac{\delta}{\delta \eta(x)}\, \gamma^\mu\, \frac{\delta}{\delta \bar\eta(x)} \Big]\, \langle 0_+ | 0_- \rangle_{e=0},
\tag{25.2}
$$

$$
\langle 0_+ | 0_- \rangle_{e=0} = \exp\Big[ i \int (dx)(dx')\, \bar\eta(x)\, S_+(x - x')\, \eta(x') \Big] \times \exp\Big[ \frac{i}{2} \int (dx)(dx')\, J_\sigma(x)\, D_+^{\sigma\rho}(x - x')\, J_\rho(x') \Big],
\tag{25.3}
$$

$$
S_+(x - x') = \int \frac{(dp)}{(2\pi)^4}\, e^{\,i p (x - x')}\, \frac{(-\gamma p + m)}{p^2 + m^2 - i\epsilon}, \quad \epsilon \to +0,
\tag{25.4}
$$

$$
D_+^{\sigma\rho}(x - x') = \eta^{\sigma\rho} \int \frac{(dk)}{(2\pi)^4}\, \frac{e^{\,i k (x - x')}}{k^2 - i\epsilon}, \quad \epsilon \to +0.
\tag{25.5}
$$

3 Schwinger [2, 3]. See also Manoukian [1].
4 Physical quantities should be gauge invariant. This is the essence of gauge theories. One chooses a particular gauge for convenience, e.g., to simplify the calculations involved in the problem at hand.

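As a monumental classic application of the above expressions, we consider the celebrated Compton scattering: $e^- \gamma \to e^- \gamma$. To describe this process, we need two powers of $J^\mu$ for emission and detection of a photon, one power of $\eta$ for emission of an electron, and one power of $\bar\eta$ for detection of an electron. The corresponding expression to be considered then, to lowest order in $e$, is from (25.2) given by

$$
\frac{1}{2}\, e^2 \int (dx_1)\, \frac{\delta}{\delta J^\mu(x_1)}\, \frac{\delta}{\delta \eta(x_1)}\, \gamma^\mu\, \frac{\delta}{\delta \bar\eta(x_1)} \int (dx_2)\, \frac{\delta}{\delta J^\nu(x_2)}\, \frac{\delta}{\delta \eta(x_2)}\, \gamma^\nu\, \frac{\delta}{\delta \bar\eta(x_2)}
\times \exp\Big[ i \int (dx)(dx')\, \bar\eta(x)\, S_+(x - x')\, \eta(x') \Big]\, \exp\Big[ \frac{i}{2} \int (dx)(dx')\, J_\mu(x)\, D_+^{\mu\nu}(x - x')\, J_\nu(x') \Big].
\tag{25.6}
$$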

Considering only connected integrals, i.e., excluding products of two integrals with no variables in common, the above equation leads, for the powers of the sources mentioned above, to the following explicit expression

$$
i e^2 \int (dx_1)(dx_2)\, \Big[ \int (dx)(dx')\, \bar\eta(x)\, S_+(x - x_1)\, \gamma^{\mu_1}\, S_+(x_1 - x_2)\, \gamma^{\mu_2}\, S_+(x_2 - x')\, \eta(x') \Big]
\times \Big[ \int (dz_1)(dz_2)\, D_{+\mu_1\nu_1}(x_1 - z_1)\, J^{\nu_1}(z_1)\, D_{+\mu_2\nu_2}(x_2 - z_2)\, J^{\nu_2}(z_2) \Big].
\tag{25.7}
$$

To describe the process in question, we write $\eta = \eta_1 + \eta_2$, $J^\mu = J_1^\mu + J_2^\mu$, where $\eta_1$ and $J_1^\mu$ are switched on and then switched off in the remote past after an electron and a photon are, respectively, emitted, while $\eta_2$ and $J_2^\mu$ are switched on in the distant future and then switched off after the outgoing electron and photon are, respectively, detected (absorbed), with the interaction taking place later in time than the emissions and earlier than the detections. From (25.7), the whole process is then described by the expression

$$
i e^2 \int (dx)(dx')(dx_1)(dx_2)(dz_1)(dz_2)\ \bar\eta_2(x)\, S_+(x - x_2)\, \gamma^{\mu_2}\, S_+(x_2 - x_1)\, \gamma^{\mu_1}\, S_+(x_1 - x')\, \eta_1(x')
\times \Big[ D_{+\mu_2\nu_2}(x_2 - z_2)\, J_2^{\nu_2}(z_2)\, D_{+\mu_1\nu_1}(x_1 - z_1)\, J_1^{\nu_1}(z_1) + D_{+\mu_1\nu_1}(x_1 - z_2)\, J_2^{\nu_1}(z_2)\, D_{+\mu_2\nu_2}(x_2 - z_1)\, J_1^{\nu_2}(z_1) \Big].
\tag{25.8}
$$

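Here $x^0 > x_2^0$, $x_1^0 > x'^0$ with $x^0$, $-x'^0$ arbitrarily large, which means that an electron has been emitted and then detected. This allows us to make the substitutions

$$
\int (dx')\, S_+(x_1 - x')\, \eta_1(x') \to i \int \sum_{\sigma_1} \frac{m\, d^3\mathbf{p}_1}{p_1^0\, (2\pi)^3}\, e^{\,i p_1 x_1}\, u(\mathbf{p}_1, \sigma_1)\, \big[\, \bar{u}(\mathbf{p}_1, \sigma_1)\, \eta_1(p_1) \big],
\tag{25.9}
$$

$$
\int (dx)\, \bar\eta_2(x)\, S_+(x - x_2) \to i \int \sum_{\sigma_2} \frac{m\, d^3\mathbf{p}_2}{p_2^0\, (2\pi)^3}\, e^{-i p_2 x_2}\, \big[\, \bar\eta_2(p_2)\, u(\mathbf{p}_2, \sigma_2) \big]\, \bar{u}(\mathbf{p}_2, \sigma_2).
\tag{25.10}
$$

Similarly, we have $z_2^0 > x_2^0$, $z_1^0 < x_1^0$, with $z_2^0$, $-z_1^0$ arbitrarily large, which means that a photon has been emitted and then detected. Hence we may make the substitutions, considering real polarization vectors,

$$
\int (dz_1)\, D_{+\mu_1\nu_1}(x_1 - z_1)\, J_1^{\nu_1}(z_1) \to i \int \sum_{\lambda_1} \frac{d^3\mathbf{k}_1}{2k_1^0\, (2\pi)^3}\, e^{\,i k_1 x_1}\, e_{\lambda_1 \mu_1}\, \big[\, e_{\lambda_1 \nu_1}\, J_1^{\nu_1}(k_1) \big],
\tag{25.11}
$$

$$
\int (dz_2)\, D_{+\mu_2\nu_2}(x_2 - z_2)\, J_2^{\nu_2}(z_2) \to i \int \sum_{\lambda_2} \frac{d^3\mathbf{k}_2}{2k_2^0\, (2\pi)^3}\, e^{-i k_2 x_2}\, \big[\, J_2^{\nu_2 \ast}(k_2)\, e_{\lambda_2 \nu_2} \big]\, e_{\lambda_2 \mu_2}.
\tag{25.12}
$$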

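We consider the notations:

$$
\langle \mathbf{p}_1, \sigma_1; \mathbf{k}_1, \lambda_1 | 0_- \rangle = \Big[ i \sqrt{\frac{m\, d^3\mathbf{p}_1}{p_1^0\, (2\pi)^3}}\ \bar{u}(\mathbf{p}_1, \sigma_1)\, \eta_1(p_1) \Big] \Big[ i \sqrt{\frac{d^3\mathbf{k}_1}{2k_1^0\, (2\pi)^3}}\ e_{\lambda_1 \nu_1}\, J_1^{\nu_1}(k_1) \Big],
\tag{25.13}
$$

$$
\langle 0_+ | \mathbf{p}_2, \sigma_2; \mathbf{k}_2, \lambda_2 \rangle = \Big[ i \sqrt{\frac{m\, d^3\mathbf{p}_2}{p_2^0\, (2\pi)^3}}\ \bar\eta_2(p_2)\, u(\mathbf{p}_2, \sigma_2) \Big] \Big[ i \sqrt{\frac{d^3\mathbf{k}_2}{2k_2^0\, (2\pi)^3}}\ J_2^{\nu_2 \ast}(k_2)\, e_{\lambda_2 \nu_2} \Big],
\tag{25.14}
$$

denoting the amplitude for the emission of an electron of momentum-spin state $\mathbf{p}_1, \sigma_1$ and a photon of momentum-polarization state $\mathbf{k}_1, \lambda_1$, and the amplitude for the detection of an electron of momentum-spin state $\mathbf{p}_2, \sigma_2$ and a photon of momentum-polarization state $\mathbf{k}_2, \lambda_2$, respectively (see Eq. (16.32) and below it in Chap. 16). Upon substituting the last five equations in (25.8), and integrating over $x_1$, $x_2$, we obtain

$$
\langle 0_+ | \mathbf{p}_2, \sigma_2; \mathbf{k}_2, \lambda_2 \rangle\ \mathcal{A}\ \langle \mathbf{p}_1, \sigma_1; \mathbf{k}_1, \lambda_1 | 0_- \rangle,
\tag{25.15}
$$

$$
\mathcal{A} = i e^2 (2\pi)^4\, \delta^{(4)}(p_2 + k_2 - p_1 - k_1)\, \sqrt{\frac{m\, d^3\mathbf{p}_2}{p_2^0\, (2\pi)^3}}\, \sqrt{\frac{d^3\mathbf{k}_2}{2k_2^0\, (2\pi)^3}}\, \sqrt{\frac{d^3\mathbf{k}_1}{2k_1^0\, (2\pi)^3}}\, \sqrt{\frac{m\, d^3\mathbf{p}_1}{p_1^0\, (2\pi)^3}}\ \big[ \cdots \big],
\tag{25.16}
$$

$$
\big[ \cdots \big] = \bar{u}(\mathbf{p}_2, \sigma_2) \Big[ e_{\lambda_2 \mu_2}(k_2)\, \gamma^{\mu_2}\, \frac{-\gamma (p_1 + k_1) + m}{(p_1 + k_1)^2 + m^2}\, \gamma^{\nu_1}\, e_{\lambda_1 \nu_1}(k_1)
+ e_{\lambda_1 \nu_1}(k_1)\, \gamma^{\nu_1}\, \frac{-\gamma (p_1 - k_2) + m}{(p_1 - k_2)^2 + m^2}\, \gamma^{\mu_2}\, e_{\lambda_2 \mu_2}(k_2) \Big] u(\mathbf{p}_1, \sigma_1).
\tag{25.17}
$$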

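The contributions of the above two terms to the scattering process are as follows:

[Diagrams not reproduced: (left) incoming $p_1$, $k_1$ and outgoing $p_2$, $k_2$ joined by an internal electron line of momentum $p_1 + k_1$; (right) the crossed diagram, with an internal electron line of momentum $p_1 - k_2$.]

where a solid line denotes an electron and a wavy one denotes a photon. We consider initial states with sharp momenta $\mathbf{p}_1$, $\mathbf{k}_1$. In scattering theory, one carries out, in the process of the analysis, a box normalization, with the scattering process considered to occur within a three-dimensional space of volume $V$. This amounts to replacing $d^3\mathbf{p}_1/(2\pi)^3$ and $d^3\mathbf{k}_1/(2\pi)^3$ each by $1/V$. Accordingly, $\mathcal{A}$ takes the form

$$
\mathcal{A} = i e^2 (2\pi)^4\, \sqrt{\frac{d^3\mathbf{p}_2}{2p_2^0\, (2\pi)^3}}\, \sqrt{\frac{d^3\mathbf{k}_2}{2k_2^0\, (2\pi)^3}}\ \frac{\delta^{(4)}(p_2 + k_2 - p_1 - k_1)}{V \sqrt{4 p_1^0 k_1^0}}\ 2m\, \big[ \cdots \big].
\tag{25.18}
$$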

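Now we compare the contribution of the above process to the vacuum-to-vacuum transition amplitude $\langle 0_+ | 0_- \rangle$ to obtain, to lowest order in $e$,

$$
\langle 0_+ | 0_- \rangle \Big|_{e^- \gamma \to e^- \gamma} :\quad \langle 0_+ | \mathbf{p}_2, \sigma_2; \mathbf{k}_2, \lambda_2 \rangle\, \langle \mathbf{p}_2, \sigma_2; \mathbf{k}_2, \lambda_2 | \mathbf{p}_1, \sigma_1; \mathbf{k}_1, \lambda_1 \rangle\, \langle \mathbf{p}_1, \sigma_1; \mathbf{k}_1, \lambda_1 | 0_- \rangle,
\tag{25.19}
$$

from which we infer that the transition amplitude with sharp initial states is given by

$$
\langle \mathbf{p}_2, \sigma_2; \mathbf{k}_2, \lambda_2 | \mathbf{p}_1, \sigma_1; \mathbf{k}_1, \lambda_1 \rangle = \mathcal{A}.
\tag{25.20}
$$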

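To find the transition probability per unit time, we have to take the absolute value squared of the above expression. Using the property of a delta function, a prescription usually attributed to Enrico Fermi is to write

$$
\delta^{(4)}(p) = \int \frac{(dx)}{(2\pi)^4}\, e^{\,i p x} \ \Rightarrow\ \big[ \delta^{(4)}(p) \big]^2 = \delta^{(4)}(p)\, \frac{1}{(2\pi)^4} \int (dx) = \delta^{(4)}(p)\, \frac{V T}{(2\pi)^4},
\tag{25.21}
$$

leading to the transition probability, per unit time, with final momenta in $d^3\mathbf{p}_2$, $d^3\mathbf{k}_2$, given by

$$
\frac{d^3\mathbf{p}_2}{2p_2^0}\, \frac{d^3\mathbf{k}_2}{2k_2^0}\ \frac{e^4 m^2}{4\pi^2}\ \frac{\delta^{(4)}(p_2 + k_2 - p_1 - k_1)}{V\, p_1^0\, k_1^0}\ \big| [\cdots] \big|^2.
\tag{25.22}
$$

We consider an experiment where the initial electron is at rest. To evaluate $\big| [\cdots] \big|^2$, we may use the basic identities (see Boxes 16.1 and 16.8):

$$
\sum_\sigma u(\mathbf{p}, \sigma)\, \bar{u}(\mathbf{p}, \sigma) = \frac{(-\gamma p + m)}{2m}, \qquad \{ \gamma^\mu, \gamma^\nu \} = -2\eta^{\mu\nu}, \qquad \gamma_\mu (\gamma^\alpha \gamma^\beta) \gamma^\mu = 4\eta^{\alpha\beta},
\tag{25.23}
$$

$$
\gamma_\mu \big( \gamma^\alpha \gamma^\beta \gamma^\rho \big) \gamma^\mu = 2\gamma^\rho \gamma^\beta \gamma^\alpha, \qquad \mathrm{Tr}[\gamma^\alpha \gamma^\beta \gamma^\mu \gamma^\nu] = 4 \big( \eta^{\alpha\beta} \eta^{\mu\nu} - \eta^{\alpha\mu} \eta^{\beta\nu} + \eta^{\alpha\nu} \eta^{\beta\mu} \big),
\tag{25.24}
$$

as well as $\mathrm{Tr}[\text{odd number of } \gamma\text{'s}] = 0$. A straightforward, though tedious, evaluation, averaging over the initial spin projections $\sigma_1$ and summing over the final spin projections $\sigma_2$ of the electron, gives

$$
\frac{1}{2} \sum_{\sigma_1, \sigma_2} \big| [\cdots] \big|^2 = \frac{1}{4m^2} \Big[ \frac{k_2^0}{k_1^0} + \frac{k_1^0}{k_2^0} + 2 \Big( 2 \big[ e_{\lambda_2}^\mu\, e_{\lambda_1 \mu} \big]^2 - 1 \Big) \Big].
\tag{25.25}
$$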

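We also average over initial photon-polarization states $\lambda_1$ and sum over final photon-polarization states $\lambda_2$, each with two states.⁵ Polarization vectors may be given by $e_\lambda^\mu = (0, \mathbf{e}_\lambda)$, satisfying $k_\mu e_\lambda^\mu = \mathbf{k} \cdot \mathbf{e}_\lambda = 0$, $e_{\lambda_1 \mu}\, e_{\lambda_2}^\mu = \delta_{\lambda_1, \lambda_2}$ (for polarization vectors belonging to the same photon momentum). Accordingly, using the completeness relation given below Eq. (16.32):

$$
\sum_{\lambda = 1, 2} e_\lambda^i\, e_\lambda^j = \delta^{ij} - \frac{k^i k^j}{\mathbf{k}^2},
\tag{25.26}
$$

we obtain

$$
\sum_{\lambda_1, \lambda_2} \big[ e_{\lambda_1}^\mu\, e_{\lambda_2 \mu} \big]^2 \equiv \sum_{\lambda_1, \lambda_2} e_{\lambda_1}^i\, e_{\lambda_1}^j\, e_{\lambda_2}^i\, e_{\lambda_2}^j = \Big( \delta^{ij} - \frac{k_1^i k_1^j}{\mathbf{k}_1^2} \Big) \Big( \delta^{ij} - \frac{k_2^i k_2^j}{\mathbf{k}_2^2} \Big) = 1 + \cos^2\theta,
\tag{25.27}
$$

where $\theta$ is the angle between $\mathbf{k}_1$ and $\mathbf{k}_2$, and hence $(k_1^i k_2^i)^2 = \mathbf{k}_1^2\, \mathbf{k}_2^2 \cos^2\theta$, $\delta^{ij} \delta^{ij} = \delta^{jj} = 3$. Thus the expression in (25.25) becomes replaced by

$$
\frac{1}{2} \sum_{\lambda_1, \lambda_2} \frac{1}{2} \sum_{\sigma_1, \sigma_2} \big| [\cdots] \big|^2
= \frac{1}{4m^2} \times \frac{1}{2} \Big[ 4 \Big( \frac{k_2^0}{k_1^0} + \frac{k_1^0}{k_2^0} \Big) + 4 \big( 1 + \cos^2\theta \big) - 8 \Big]
= \frac{1}{2m^2} \Big[ \frac{k_2^0}{k_1^0} + \frac{k_1^0}{k_2^0} - \sin^2\theta \Big].
\tag{25.28}
$$

The $p_2$ integration to be carried out in (25.22) is evaluated by using the integral

5 See also Eq. (16.32) and below it in Chap. 16.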

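$$
\int \frac{d^3\mathbf{p}_2}{2p_2^0}\, \delta^{(4)}(p_2 + k_2 - p_1 - k_1) = \int (dp_2)\, \Theta(p_2^0)\, \delta(p_2^2 + m^2)\, \delta^{(4)}(p_2 + k_2 - p_1 - k_1)
= \Theta(m + k_1^0 - k_2^0)\, \delta\big( (p_1 + k_1 - k_2)^2 + m^2 \big) = \Theta(m + k_1^0 - k_2^0)\, \delta\big( -2k_1 k_2 - 2m (k_1^0 - k_2^0) \big).
\tag{25.29}
$$

The transition probability, per unit time, of the process for the photon with initial momentum-polarization state $(\mathbf{k}_1, \lambda_1)$ impinging on an electron at rest to a photon with momentum $\mathbf{k}_2$ within the element $d^3\mathbf{k}_2$ and polarization state $\lambda_2$, is then from (25.22), (25.28), (25.29) given by ($k_2^0 < k_1^0 + m$)

$$
\frac{e^4}{16 \pi^2\, m\, V}\, \frac{d^3\mathbf{k}_2}{k_1^0\, k_2^0} \Big[ \frac{k_2^0}{k_1^0} + \frac{k_1^0}{k_2^0} - \sin^2\theta \Big]\, \Theta(m + k_1^0 - k_2^0)\, \delta\big( -2k_1 k_2 - 2m (k_1^0 - k_2^0) \big)
= \frac{e^4}{32 \pi^2\, m^2\, V}\, \frac{(k_2^0)^2\, dk_2^0\, d\Omega}{(k_1^0)^2} \Big[ \frac{k_2^0}{k_1^0} + \frac{k_1^0}{k_2^0} - \sin^2\theta \Big]\, \delta\Big( k_2^0 - \frac{k_1^0}{(k_1^0/m)(1 - \cos\theta) + 1} \Big),
\tag{25.30}
$$

which upon integration over $k_2^0$ gives

$$
\frac{e^4}{32 \pi^2\, m^2\, V} \Big( \frac{k_2^0}{k_1^0} \Big)^2 \Big[ \frac{k_2^0}{k_1^0} + \frac{k_1^0}{k_2^0} - \sin^2\theta \Big]\, d\Omega,
\tag{25.31}
$$

where

$$
\frac{k_2^0}{k_1^0} = \frac{1}{1 + (k_1^0/m)(1 - \cos\theta)}.
\tag{25.32}
$$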

The differential cross section is defined by comparing the transition probability, per unit time, of the process in question above to the probability per unit area $A$ per unit time $T$ that the impinging photon crosses, perpendicularly, the area $A$ centered around the position of the electron at rest as if the electron were absent, i.e., in the absence of the interaction. The latter probability per unit area per unit time is referred to as the flux. The probability of finding the photon within a box of cross-sectional area $A$ and width $cT$, during a time $T$, is given by $A\,cT/V$:

[Figure not reproduced: a photon $\gamma$ crossing a box of cross-sectional area $A$ and width $cT$.]

where recall that $V$ is the volume of the 3-dimensional space in which the scattering process occurs. Thus the probability in question per unit area per unit time, that is the flux, is just $c/V$, and with $c = 1$ it is given by $1/V$. Upon dividing the expression in (25.31) by $1/V$, the volume $V$ cancels out, as it should. The differential cross section then emerges as

$$
\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{2m^2} \Big( \frac{k_2^0}{k_1^0} \Big)^2 \Big[ \frac{k_2^0}{k_1^0} + \frac{k_1^0}{k_2^0} - \sin^2\theta \Big], \qquad \alpha = \frac{e^2}{4\pi},
\tag{25.33}
$$

known as the Klein–Nishina formula,⁶ where $\alpha$ is the fine-structure constant. In the low energy limit $k_1^0 \to 0$ of the incident photon, (25.32) shows that $k_2^0/k_1^0 \to 1$, and the above expression, with $d\Omega = 2\pi \sin\theta\, d\theta$, is readily integrated to give

$$
\sigma = \frac{8\pi}{3}\, r_0^2, \qquad r_0 = \frac{\alpha}{m},
\tag{25.34}
$$

known as the Thomson cross section, where $r_0 \simeq 2.8 \times 10^{-13}$ cm is the classical radius of the electron.
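The low-energy limit quoted above can be confirmed by integrating (25.33) numerically over the scattering angle. The sketch below assumes natural units with $m = 1$, a rounded value of $\alpha$, and a simple trapezoidal rule; the function names are illustrative and not taken from the text.

```python
import numpy as np

ALPHA = 1.0 / 137.035999  # fine-structure constant (rounded value, assumed)

def energy_ratio(theta, k1_over_m):
    """k_2^0 / k_1^0 from (25.32)."""
    return 1.0 / (1.0 + k1_over_m * (1.0 - np.cos(theta)))

def klein_nishina(theta, k1_over_m, m=1.0):
    """Differential cross section d(sigma)/d(Omega) of (25.33)."""
    r = energy_ratio(theta, k1_over_m)
    return ALPHA**2 / (2.0 * m**2) * r**2 * (r + 1.0 / r - np.sin(theta) ** 2)

def total_cross_section(k1_over_m, m=1.0, n=20001):
    """Integrate (25.33) with d(Omega) = 2*pi*sin(theta) d(theta), trapezoidal rule."""
    theta = np.linspace(0.0, np.pi, n)
    integrand = klein_nishina(theta, k1_over_m, m) * 2.0 * np.pi * np.sin(theta)
    dtheta = theta[1] - theta[0]
    return dtheta * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

thomson = 8.0 * np.pi / 3.0 * ALPHA**2  # (25.34) with r_0 = alpha/m and m = 1
print(total_cross_section(1e-6) / thomson)  # approaches 1 at low energy
print(total_cross_section(1.0) / thomson)   # suppressed when k_1^0 is comparable to m
```

The second ratio illustrates how the cross section falls below the Thomson value once the photon energy becomes comparable to the electron mass.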

6 For the chain of developments which finally led to the derivation of the Klein–Nishina formula, see the fascinating discussion by Weisskopf [4].

References

1. Manoukian, E. B. (2016). Quantum field theory I: Foundations and abelian and non-abelian gauge theories. Switzerland: Springer.
2. Schwinger, J. (1969). Particles and sources. New York: Gordon & Breach.
3. Schwinger, J. (1970). Particles, sources, and fields (Vol. 1). Reading: Addison-Wesley.
4. Weisskopf, V. F. (1980). Growing up with field theory, and recent trends in particle physics. “The (1979) Bernard Gregory Lectures at CERN”, 29 p. Geneva: CERN.

26 Radiative Corrections

Prerequisites: Chaps. 21, 23 and 25

Virtual particles, in general, give rise to corrections in the analytical expressions of propagators, which are referred to as radiative corrections. Such radiative corrections, in turn, appear in transition amplitudes involving these propagators. For a proper particle interpretation, one sets up boundary conditions to be satisfied by these propagators on their respective mass shells. As a result of radiative corrections, coupling parameters and masses in a Lagrangian density are modified and are eliminated in favor of physically measurable parameters that one encounters in particle detection regions. These modified parameters are referred to as renormalized parameters. This fundamental property of QFT gives rise to renormalized propagators, involving wavefunction renormalization constants and renormalized parameters in the theory, and allows one to confront experiments. In the present chapter, we consider, in QED and to lowest order in the charge of the electron, the modifications that arise in the photon and electron propagators, as well as in the vertex function; the latter describes, in a fundamental way, the nature of the interaction of the electron and the photon. The parameters $e_0$, $m_0$ in the Lagrangian density in Eq. (23.1) are, in turn, then expressed in the forms $e_0 = e\,(1 + O(e^2))$, $m_0 = m\,(1 + O(e^2))$, where $e$ and $m$ are the experimentally measured charge and mass of the electron. The integrals needed for carrying out these radiative corrections in this chapter are conveniently given in Box 26.1.¹ You will observe that a shift of an integration variable $k \to k + p$ is possible for finite and logarithmically divergent integrals. For integrals with higher degrees of divergence, surface terms arise, as shown in the corresponding integrals.

The vacuum expectation value $\langle 0_+ | A^\mu(x) | 0_- \rangle / \langle 0_+ | 0_- \rangle = \langle A^\mu(x) \rangle$ of the vector potential $A^\mu(x)$, with $D_+^{\mu\nu}$ given in the momentum description in (23.11), satisfies the equation (see Chap. 23)

$$
\langle A^\mu(x) \rangle = \int (dz)\, D_+^{\mu}{}_{\rho}(x - z)\, \big[ J^\rho(z) + \langle j^\rho(z) \rangle \big].
\tag{26.1}
$$

The exact fully interacting photon propagator $\tilde{D}^{\mu\nu}$, in the absence of the external current $J^\nu$, is defined by

$$
(i)(-i)\, \frac{\delta}{\delta J_\nu(y)}\, \langle A^\mu(x) \rangle \Big|_{J = 0} = \tilde{D}^{\mu\nu}(x - y),
\tag{26.2}
$$

which coincides with $D_+^{\mu\nu}$ when the quantum electromagnetic current $j^\nu$ is also set equal to zero.

1 A fairly detailed treatment of doing integrals in QFT is given in Appendix II of Manoukian [3]. This book may also be consulted for more details on radiative corrections in general, as may Schwinger [6], Kinoshita [5], Weinberg [9], as well as early classic books, which involve extensive computations of radiative corrections and are always of great pedagogical value, such as those by Jauch and Rohrlich [4], Bjørken and Drell [1] and Itzykson and Zuber [2].

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_26

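Box 26.1 A panorama of basic integrals occurring in QFT

• $\displaystyle\int\frac{(dk)}{[k^2+M^2]^3} = \frac{i\pi^2}{2M^2}$

• $\displaystyle\int\frac{(dk)}{[k^2+M^2]^4} = \frac{i\pi^2}{6(M^2)^2}$

• $\displaystyle\int\frac{k^\mu k^\nu\,(dk)}{[k^2+M^2]^4} = \frac{i\pi^2}{12M^2}\,\eta^{\mu\nu}$

• $\displaystyle\int\frac{(k^2)^{m-2}\,(dk)}{[k^2+M^2]^n} = \frac{i\pi^2}{(M^2)^{n-m}}\,\frac{\Gamma(m)\Gamma(n-m)}{\Gamma(n)}$, which exists for $n > m > 0$; $\Gamma(z)$ is the gamma function.

• $\displaystyle\int\frac{(dk)}{[(k-p)^2+M^2]^3} = \frac{i\pi^2}{2M^2}$

• $\displaystyle\int\frac{k^\mu\,(dk)}{[(k-p)^2+M^2]^3} = \frac{i\pi^2\,p^\mu}{2M^2}$

• $\displaystyle\int k^\mu f(k^2)\,(dk) = 0$

• $\displaystyle\int(dk)\left[\frac{k^\mu}{[(k-p)^2+m^2]^2} - \frac{p^\mu}{[k^2+m^2]^2}\right] = -\frac{i\pi^2}{2}\,p^\mu$

• $\displaystyle\int(dk)\left[\frac{k^\mu k^\nu}{[(k-p)^2+M^2]^3} - \frac{p^\mu p^\nu}{[k^2+M^2]^3} - \frac{k^\mu k^\nu}{[k^2+M^2]^3}\right] = 0$ (A)

• $\displaystyle\int(dp)\left[\frac{p^\mu p^\nu}{[(k-p)^2+M^2]^2} - \frac{k^\mu k^\nu}{[p^2+M^2]^2} - \frac14\,\frac{p^2\,\eta^{\mu\nu}}{[p^2+M^2]^2}\right] = -\frac{i\pi^2}{6}\left[5k^\mu k^\nu + k^2\eta^{\mu\nu}\right]$ (B)

• $\displaystyle\int\frac{(dk)}{[k^2+M^2]^2}\,\frac{\Lambda^2}{[k^2+\Lambda^2]} = i\pi^2\left[\ln\frac{\Lambda^2}{M^2} - 1\right],\quad \Lambda^2\to\infty$ (C)

• $\displaystyle\int_0^1\frac{x\,dx}{[\,x^2+(1-x)(\mu^2/M^2)\,]} = -\frac12\ln\frac{\mu^2}{M^2},\quad (\mu^2/M^2)\to 0$ (D)

• $\displaystyle\int_0^1\frac{dx}{[Ax+B(1-x)]^2} = \frac{1}{AB},\qquad 2\int_0^1 dx\int_0^x\frac{dz}{[A(1-x)+Bz+C(x-z)]^3} = \frac{1}{ABC}$ (E)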
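The Feynman-parameter identities (E) and the small-photon-mass integral (D) of Box 26.1 are one-dimensional (or two-dimensional) ordinary integrals, so they can be spot-checked numerically. The following is a minimal sketch, not part of the original text: the function names, the midpoint rule, and the sample values of $A$, $B$, $C$ and $\mu^2/M^2$ are all illustrative assumptions.

```python
import math

def midpoint(f, a, b, n):
    """Composite midpoint rule for a smooth integrand on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def feynman_two(A, B, n=2000):
    """int_0^1 dx / [A x + B (1 - x)]^2, expected to equal 1/(A B)."""
    return midpoint(lambda x: 1.0 / (A * x + B * (1.0 - x)) ** 2, 0.0, 1.0, n)

def feynman_three(A, B, C, n=300):
    """2 int_0^1 dx int_0^x dz / [A(1-x) + B z + C(x-z)]^3, expected 1/(A B C)."""
    def inner(x):
        g = lambda z: 1.0 / (A * (1.0 - x) + B * z + C * (x - z)) ** 3
        return midpoint(g, 0.0, x, n)
    return 2.0 * midpoint(inner, 0.0, 1.0, n)

def integral_D(eps, n=200000):
    """int_0^1 x dx / [x^2 + (1 - x) eps], with eps = mu^2/M^2 taken small."""
    return midpoint(lambda x: x / (x * x + (1.0 - x) * eps), 0.0, 1.0, n)

if __name__ == "__main__":
    print(feynman_two(2.0, 5.0), 1.0 / (2.0 * 5.0))
    print(feynman_three(1.0, 2.0, 3.0), 1.0 / (1.0 * 2.0 * 3.0))
    eps = 1.0e-4  # hypothetical value of mu^2/M^2
    print(integral_D(eps), -0.5 * math.log(eps))
```

For (D) the two printed numbers agree only up to terms that vanish as $\mu^2/M^2 \to 0$, which is the sense in which the box states the result.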

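Using the chain rule:

$$\frac{\delta}{\delta J_\nu(y)}\,\langle j^\rho(z)\rangle = \int (dz')\,\Big[\frac{\delta}{\delta J_\nu(y)}\,\langle A_\sigma(z')\rangle\Big]\,\frac{\delta}{\delta\langle A_\sigma(z')\rangle}\,\langle j^\rho(z)\rangle, \tag{26.3}$$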

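we obtain from Eqs. (26.1) and (26.2) the following convenient expression for the exact photon propagator $\tilde D^{\mu\nu}$, with the external current $J^\mu$ set equal to zero:

$$\tilde D^{\mu\nu}(x-y) = D_+^{\mu\nu}(x-y) + \int (dz)(dz')\,D_+^{\mu}{}_{\rho}(x-z)\,\tilde D_{\sigma}{}^{\nu}(z'-y)\,\frac{\delta}{\delta\langle A_\sigma(z')\rangle}\,\langle j^\rho(z)\rangle\Big|_{J=0} \tag{26.4}$$

$$\phantom{\tilde D^{\mu\nu}(x-y)} = D_+^{\mu\nu}(x-y) - \int (dz)(dz')\,D_+^{\mu}{}_{\rho}(x-z)\,\Pi^{\rho\sigma}(z-z')\,\tilde D_{\sigma}{}^{\nu}(z'-y), \tag{26.5}$$

where we have set

$$\frac{\delta}{\delta\langle A_\sigma(z')\rangle}\,\langle j^\rho(z)\rangle = -\,\Pi^{\rho\sigma}(z-z'). \tag{26.6}$$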

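In Box 26.2 it is shown that, to lowest order in $e_0$,

$$-\,\Pi^{\rho\sigma}(z-z') = \frac{\delta}{\delta\langle A_\sigma(z')\rangle}\,\langle j^\rho(z)\rangle = e\,\frac{\delta}{\delta\langle A_\sigma(z')\rangle}\,\big\langle\big(\bar\psi(z'')\gamma^\rho\psi(z)\big)_+\big\rangle\Big|_{z''=z}$$

$$= i\,e^2\,\mathrm{Tr}\big[\gamma^\rho S_+(z-z')\gamma^\sigma S_+(z'-z)\big] = i\,e^2\,\mathrm{Tr}\big[S_+(z-z')\gamma^\sigma S_+(z'-z)\gamma^\rho\big]. \tag{26.7}$$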


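Box 26.2 $\;e\,\delta\big\langle\big(\bar\psi(y)\gamma^\nu\psi(x)\big)_+\big\rangle/\delta\langle A_\mu(z)\rangle = i\,e_0^2\,\mathrm{Tr}\big[\gamma^\nu S_+(x-z)\gamma^\mu S_+(z-y)\big]$ to 2nd order

Starting from

$$\Big[\frac{\gamma\partial}{i} + m\Big]\langle 0_+|\psi(x)|0_-\rangle - e_0\,\frac{1}{i}\frac{\delta}{\delta J^\mu(x)}\,\langle 0_+|\gamma^\mu\psi(x)|0_-\rangle = \eta(x)\,\langle 0_+|0_-\rangle,$$

take $i\,\delta/\delta\eta(y)$, use $\langle 0_+|\big(\bar\psi(y)\psi(x)\big)_+|0_-\rangle = -\langle 0_+|\big(\psi(x)\bar\psi(y)\big)_+|0_-\rangle$, set $\eta = 0$, $\bar\eta = 0$, divide by $\langle 0_+|0_-\rangle$ and multiply by $\gamma^\nu$ to obtain

$$-\big\langle\big(\gamma^\nu\psi(x)\bar\psi(y)\big)_+\big\rangle = i\,\gamma^\nu S_+(x-y) + i\,e\int(dz)\,\gamma^\nu S_+(x-z)\gamma^\mu\,\frac{1}{\langle 0_+|0_-\rangle}\,\frac{1}{i}\frac{\delta}{\delta J^\mu(z)}\,\langle 0_+|\big(\psi(z)\bar\psi(y)\big)_+|0_-\rangle.$$

Use the identity

$$\frac{1}{\langle 0_+|0_-\rangle}\,\frac{1}{i}\frac{\delta}{\delta J^\mu(z)}\,\langle 0_+|0_-\rangle = \langle A_\mu(z)\rangle + \frac{1}{i}\frac{\delta}{\delta J^\mu(z)},$$

as well as $i\,\big\langle\big(\psi(z)\bar\psi(y)\big)_+\big\rangle = \tilde S(z,y)$, the exact electron propagator, to obtain

$$e\,\big\langle\big(\bar\psi(y)\gamma^\nu\psi(x)\big)_+\big\rangle = i\,e\,\mathrm{Tr}\big[\gamma^\nu S_+(x-y)\big] + i\,e_0^2\int(dz)\,\mathrm{Tr}\big[\gamma^\nu S_+(x-z)\gamma^\mu\langle A_\mu(z)\rangle\,\tilde S(z,y)\big] + i\,e^2\int(dz)\,\mathrm{Tr}\Big[\gamma^\nu S_+(x-z)\gamma^\mu\,\frac{\delta\tilde S(z,y)}{i\,\delta J^\mu(z)}\Big].$$

To second order in $e$, we may replace $\tilde S(z,y)$ by $S_+(z-y)$ inside the integrals, which is independent of $J^\mu$, to obtain

$$e\,\big\langle\big(\bar\psi(y)\gamma^\nu\psi(x)\big)_+\big\rangle = i\,e\,\mathrm{Tr}\big[\gamma^\nu S_+(x-y)\big] + i\,e^2\int(dz)\,\mathrm{Tr}\big[\gamma^\nu S_+(x-z)\gamma^\mu S_+(z-y)\big]\langle A_\mu(z)\rangle,$$

giving

$$e\,\frac{\delta}{\delta\langle A_\mu(z)\rangle}\,\big\langle\big(\bar\psi(y)\gamma^\nu\psi(x)\big)_+\big\rangle = i\,e_0^2\,\mathrm{Tr}\big[\gamma^\nu S_+(x-z)\gamma^\mu S_+(z-y)\big].$$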

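Using the equations

$$\gamma\partial_z\,S_+(z-z') = +\,i\,\delta^{(4)}(z-z') - i\,m\,S_+(z-z'), \tag{26.8}$$

$$S_+(z'-z)\,\gamma\overleftarrow{\partial}_z = -\,i\,\delta^{(4)}(z-z') + i\,m\,S_+(z'-z), \tag{26.9}$$

we may infer from (26.7) that $\partial_\mu^{(z)}\,\Pi^{\mu\nu}(z-z') = 0$, $\partial_\nu^{(z')}\,\Pi^{\mu\nu}(z-z') = 0$, and hence the Fourier transform of $\Pi^{\mu\nu}(z-z')$ has the following structure

$$\Pi^{\mu\nu}(k) = \big[\eta^{\mu\nu}k^2 - k^\mu k^\nu\big]\,\Pi(k^2),\qquad k_\mu\Pi^{\mu\nu}(k) = 0,\qquad k_\nu\Pi^{\mu\nu}(k) = 0, \tag{26.10}$$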

referred to as the polarization tensor. By Fourier transforming Eq. (26.5), in turn, to lowest order, we obtain, using in the process the fact that $\Pi(Q^2) \propto e^2$,

$$\tilde D^{\mu\nu}(k) = D_+^{\mu\nu}(k) - D_+^{\mu\rho}(k)\,\big[\eta_{\rho\sigma}k^2 - k_\rho k_\sigma\big]\,\Pi(k^2)\,D_+^{\sigma\nu}(k). \tag{26.11}$$

The orthogonality relations in (26.10) then lead to the expression

$$\tilde D^{\mu\nu}(k) = \Big[\eta^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}\Big]\frac{[\,1-\Pi(k^2)\,]}{k^2 - i\epsilon} + \lambda\,\frac{k^\mu k^\nu}{k^2\,(k^2 - i\epsilon)}, \tag{26.12}$$

and the longitudinal part, proportional to $\lambda$, does not change with the interaction. The latter is true to all orders as well. A Fourier transform of (26.6)/(26.7), with $\mu$ set equal to $\nu$ and summed over in (26.10), gives

$$(2\pi)^4\,3k^2\,\Pi(k^2) = -\,i\,e^2\int(dp)\,\mathrm{Tr}\Big[\gamma^\mu S_+\Big(p-\frac{k}{2}\Big)\gamma_\mu S_+\Big(p+\frac{k}{2}\Big)\Big], \tag{26.13}$$


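and, up to the $-i$ factor, is depicted as:

[Figure not reproduced: lowest-order photon self-energy (vacuum polarization) diagram.]

The explicit expression for $\Pi(k^2)$ is given in Box 26.3. For $k^2 \simeq 0$, i.e., near the mass shell, the following behavior of the photon propagator arises:

$$\tilde D^{\mu\nu}(k) = \Big[\eta^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}\Big]\frac{Z_3 + O(k^2)}{k^2 - i\epsilon} + \lambda\,\frac{k^\mu k^\nu}{k^2\,(k^2 - i\epsilon)}, \tag{26.14}$$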

Box 26.3 Trace of the polarization tensor

Recall that $S_+(p) = (-\gamma p + m)/(p^2 + m^2)$, suppressing the $i\epsilon$ factor. Using the elementary integral

$$\frac{1}{AB} = \int_0^1\frac{dz}{[Az + B(1-z)]^2},$$

referred to as the Feynman parameter representation of the product of two denominators, and carrying out the trace in (26.10), the integral in it becomes

$$-\,i\,e^2\int(dp)\int_0^1 dz\,\frac{-8p^2 + 2k^2 - 16m^2}{\big[\big(p + (k/2)(1-2z)\big)^2 + k^2 z(1-z) + m^2\big]^2}.$$

For the evaluation of this integral we may, conveniently, use integral (B) in Box 26.1, with $\mu = \nu$. After the subsequent integrations referred to in Box 26.1, the following expression emerges:

$$\Pi(k^2) = \Pi(0) - \frac{e^2}{2\pi^2}\int_0^1 dz\,z(1-z)\,\ln\Big[1 + \frac{k^2}{m^2}\,z(1-z)\Big],\qquad \Pi(0) = \frac{e^2}{12\pi^2}\Big[\frac{1}{i\pi^2}\int\frac{(dp)}{[p^2+m^2]^2} + \frac12\Big].$$

$\Pi(0)$ is a logarithmically divergent integral for large $p$. We may evaluate it by introducing an ultraviolet cut-off $\Lambda$ in it as follows:

$$\Pi(0) = \frac{e^2}{12\pi^2}\Big[\frac{1}{i\pi^2}\int\frac{(dp)}{[p^2+m^2]^2}\,\frac{\Lambda^2}{[p^2+\Lambda^2]} + \frac12\Big],$$

which for $\Lambda^2 \to \infty$ (see Box 26.1) gives

$$\Pi(0) = \frac{e^2}{12\pi^2}\Big[\ln\frac{\Lambda^2}{m^2} - \frac12\Big],$$

whose interpretation is given in the text.

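where $Z_3 = 1 - \Pi(0)$ (see Box 26.3), i.e.,

$$Z_3 = 1 - \frac{\alpha}{3\pi}\Big[\ln\frac{\Lambda^2}{m^2} - \frac12\Big] \simeq \frac{1}{1 + \dfrac{\alpha}{3\pi}\Big[\ln\dfrac{\Lambda^2}{m^2} - \dfrac12\Big]},\qquad \alpha = \frac{e^2}{4\pi}, \tag{26.15}$$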

and $\Lambda$ is an ultraviolet cut-off. The expression in (26.14) coincides with the correct behavior of a free photon propagator near the mass shell, corresponding to asymptotically large distances between the emission and detection regions of a photon, only if we scale $\tilde D^{\mu\nu}$ by $1/Z_3$. This amounts to defining a scaled vector potential

$$\frac{1}{\sqrt{Z_3}}\,A^\mu = A^\mu_{\rm ren}, \tag{26.16}$$

referred to as a renormalized field, with $\sqrt{Z_3}$ referred to as a wavefunction renormalization constant. Here we recall how wavefunction renormalization constants arise, quite generally, as witnessed earlier in Chap. 15. Actually, $Z_3$ has an important physical interpretation, which arises in the following manner. Upon writing the interacting and free photon propagators as

$$\tilde D^{\mu\nu}(k) = \Big[\eta^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}\Big]\tilde D_+(k^2) + \lambda\,\frac{k^\mu k^\nu}{k^2\,(k^2 - i\epsilon)},\qquad \tilde D_+(k^2) = \frac{1}{[\,1+\Pi(k^2)\,]}\,\frac{1}{k^2 - i\epsilon}, \tag{26.17}$$

$$D_+^{\mu\nu}(k) = \Big[\eta^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}\Big]D_+(k^2) + \lambda\,\frac{k^\mu k^\nu}{k^2\,(k^2 - i\epsilon)},\qquad D_+(k^2) = \frac{1}{k^2 - i\epsilon}, \tag{26.18}$$


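the Coulomb interaction between two static charges $q_1$, $q_2$ may be extracted from the expression of $D_+(k^2)$ and is given by

$$U(r) = q_1 q_2\int\frac{d^3\mathbf{k}}{(2\pi)^3}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,D_+(k^2)\Big|_{k^0=0} = q_1 q_2\int\frac{d^3\mathbf{k}}{(2\pi)^3}\,\frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{\mathbf{k}^2} = \frac{q_1 q_2}{4\pi r}. \tag{26.19}$$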
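The last step in (26.19) rests on the Fourier pair $1/\mathbf{k}^2 \leftrightarrow 1/(4\pi r)$. A direct numerical evaluation of the momentum integral is awkward because it converges only conditionally, so the sketch below checks the inverse direction instead, with a small fictitious photon mass $\mu$ as a regulator: the transform of $e^{-\mu r}/(4\pi r)$ should give $1/(\mathbf{k}^2+\mu^2)$, which tends to $1/\mathbf{k}^2$ as $\mu \to 0$. The regulator, the function name and the sample values are assumptions made for illustration only.

```python
import math

def yukawa_transform(k, mu, n=200000, cutoff=40.0):
    """Radial form of int d^3r exp(-i k.r) exp(-mu r) / (4 pi r).

    After the angular integration this is (1/k) int_0^inf dr sin(k r) exp(-mu r);
    the upper limit is truncated at cutoff/mu, where the integrand is negligible.
    """
    r_max = cutoff / mu
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h  # midpoint rule
        total += math.sin(k * r) * math.exp(-mu * r)
    return total * h / k

if __name__ == "__main__":
    for k, mu in [(1.0, 0.1), (2.0, 0.5)]:
        # second printed number is the expected 1/(k^2 + mu^2)
        print(k, mu, yukawa_transform(k, mu), 1.0 / (k * k + mu * mu))
```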

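On the other hand, in the presence of radiative corrections, and for large separation of the charges,

$$\tilde U(r) = q_1 q_2\,\underset{r\to\infty}{\mathrm{Lim}}\int\frac{d^3\mathbf{k}}{(2\pi)^3}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,\tilde D_+(k^2)\Big|_{k^0=0} \;\to\; q_1 q_2\int\frac{d^3\mathbf{k}}{(2\pi)^3}\,\frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{\mathbf{k}^2}\,\big[Z_3 + O(\mathbf{k}^2)\big],$$

$$\tilde U(r) \;\to\; \frac{\sqrt{Z_3}\,q_1\,\sqrt{Z_3}\,q_2}{4\pi r} \equiv \frac{q_{1\,\rm ren}\,q_{2\,\rm ren}}{4\pi r},\qquad q_{i\,\rm ren} = q_i\sqrt{Z_3}, \tag{26.20}$$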

leading to renormalization of charges. Moreover, at large distances from the electron the coupling parameter $e_0$ gets renormalized as follows:

$$e_0\sqrt{Z_3} = e,\qquad \alpha_0 Z_3 = \alpha \simeq 1/137, \tag{26.21}$$

with $e$, $\alpha$ denoting the electromagnetic charge of the electron and the fine-structure constant as measured in the laboratory. This, in turn, implies that $e_0$, $\alpha_0$ correspond to the values at zero distance, when one is right on top of the electron, and are clearly unattainable in any experiment. Their elimination in favor of the experimentally observable parameters $e$, $\alpha$ is the essence of renormalization theory. This point will be taken up again in forthcoming chapters (Chaps. 30–32) dealing with renormalization theory, in which one considers how these parameters, at these extreme energies, are related. For the time being, one may define an effective fine-structure parameter $\alpha(k^2)$ corresponding to small distances, i.e., at high energies $k^2 \gg m^2$, given to lowest order in $\alpha \simeq 1/137$ through the following set of equations:

$$\alpha_0\,\tilde D^{\mu\nu}(k) = \Big[\eta^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}\Big]\frac{\alpha_0\,d(k^2)}{k^2 - i\epsilon} + \lambda\,\alpha_0\,\frac{k^\mu k^\nu}{k^2\,(k^2 - i\epsilon)},$$

$$\alpha(k^2) = \alpha_0\,d(k^2) = \frac{\alpha_0}{1+\alpha_0\pi(k^2)} = \frac{\alpha_0}{1+\alpha_0\pi(0)+\alpha_0\big[\pi(k^2)-\pi(0)\big]} = \frac{\alpha}{Z_3+\alpha\,\pi(0)+\alpha\big[\pi(k^2)-\pi(0)\big]} = \frac{\alpha}{1+\alpha\big[\pi(k^2)-\pi(0)\big]},$$

$$\frac{1}{\alpha(k^2)} = \frac{1}{\alpha}\Big[1+\alpha\big(\pi(k^2)-\pi(0)\big)\Big] \;\to\; \frac{1}{\alpha}\Big[1-\frac{\alpha}{3\pi}\Big(\ln\frac{k^2}{m^2}-\frac53\Big)\Big],\qquad k^2 \gg m^2, \tag{26.22}$$

where we have used $\int_0^1 dz\,z(1-z) = 1/6$, $\int_0^1 dz\,z(1-z)\ln[z(1-z)] = -5/18$ (see Box 26.3). For a wide range of high energies available in experiments, $\alpha(k^2)$ is still a small number of the order of $\alpha$. An important application of this will be carried out in a forthcoming chapter which incorporates the charged quarks and charged leptons in the underlying analysis.

We now consider the modification of the electron propagator. From the third line in Box 26.2, not involving the $i\gamma^\sigma$ factor, we have, after setting $\eta = 0$, $\bar\eta = 0$, and working to second order in $e$,

 1 δ μ e ψ(y)γ ψ(z) +  (dz)S+ (x − z) i δ J μ (z) J =0  

 δ νμ ν e ψ(y)γ ψ(z) +  = S+ (x − y) − (dz)(dw)S+ (x − z)D+ (w − z) δAν (w) J =0  νμ = S+ (x − y) − ie2 (dz)(dw)S+ (x − z)D+ (w − z)γ μ S+ (z − w)γ ν S+ (w − y),

˜ − y) = S+ (x − y) − S(x



(26.23)


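$D_+^{\nu\mu}(w-z) = D_+^{\mu\nu}(z-w)$. Upon Fourier transform, the above equation becomes

$$\tilde S(p) = S_+(p)\Big[1 - i\,e^2\int\frac{(dk)}{(2\pi)^4}\,\gamma^\mu S_+(p-k)\gamma^\nu D_{+\,\nu\mu}(k)\,S_+(p)\Big], \tag{26.24}$$

$$\tilde S^{-1}(p) = \gamma p + m_0 - \Sigma(p)\qquad(-\ \text{sign by convention}), \tag{26.25}$$

$$\Sigma(p) = -\,i\,e^2\int\frac{(dk)}{(2\pi)^4}\,\gamma^\mu S_+(p-k)\gamma^\nu D_{+\,\mu\nu}(k),\qquad S_+(p-k) = \frac{-\gamma(p-k)+m}{(p-k)^2+m^2}. \tag{26.26}$$

To second order in $e$, $\Sigma(p)$, up to the $-i$ factor, is depicted as

[Figure not reproduced: lowest-order electron self-energy diagram.]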

In evaluating the above integral we include a fictitious small photon mass $\mu \to 0$ in the denominator part $(k^2 + \mu^2 - i\epsilon)$ of the photon propagator to avoid divergences in the low-energy (infrared) region. By combining the denominators of the electron and photon propagators in Eq. (26.26), in an arbitrary covariant gauge, using the equations (E) in Box 26.1, carrying out the momentum integration and using, in the process, integral (D) in Box 26.1, the following behavior of the inverse of the electron propagator $\tilde S^{-1}(p)$ emerges near the mass shell $p^2 \simeq -m^2$:

˜S −1 ( p) (γ p + m 0 + δm) + O (γ p + m)2 , δm = 3α m ln Λ + 1 , Z2 4π m2 2  2

  2  1 μ 9 Λ α −1 +(3 − λ)ln +λ+ , m = m 0 +δm, λ ln =1+ 2 2 Z2 4π m m 2

(26.27) (26.28)

where $m$ denotes the experimentally observed, i.e., renormalized, mass of the electron, and $\delta m$, which is gauge independent, is called the self mass. To have the correct behavior of the electron propagator near the mass shell, the unrenormalized mass, also called the bare mass, is eliminated in favor of the renormalized mass $m$, and the electron field is to be divided by the wavefunction renormalization constant $\sqrt{Z_2}$: $\psi/\sqrt{Z_2} = \psi_{\rm ren}$. We note that $Z_2$ is ultraviolet finite in the Landau gauge $\lambda = 0$ and infrared finite in the Yennie gauge $\lambda = 3$. It is both ultraviolet and infrared divergent in the Feynman gauge $\lambda = 1$.

We now consider the vertex part $\Gamma^\mu(x,y;z)$ describing the interaction of the electron/positron with the photon. The latter is defined by the equation

$$\frac{\delta}{\delta\langle A_\mu(z)\rangle}\,\tilde S^{-1}(x,y) = -\,e_0\,\Gamma^\mu(x,y;z), \tag{26.29}$$

after setting all the external sources equal to zero. In Box 26.4, it is shown that to second order in $e$ it satisfies the equation

$$\Gamma^\mu(x,y;z) = \gamma^\mu\,\delta^{(4)}(x-z)\,\delta^{(4)}(y-z) - i\,e^2\,\gamma^\sigma S_+(x-z)\gamma^\mu S_+(z-y)\gamma^\rho D_{\sigma\rho}(x-y), \tag{26.30}$$

where, in the momentum description,

$$\Gamma^\mu(x,y;z) = \int\frac{(dp')}{(2\pi)^4}\frac{(dp)}{(2\pi)^4}\,e^{ip'(x-z)}\,e^{ip(y-z)}\,\Gamma^\mu(p',p). \tag{26.31}$$


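Box 26.4 The vertex part to 2nd order

Referring to line 7 in Box 26.2, we have, without the $\gamma^\nu$ matrix:

$$\tilde S(x,y) = S_+(x-y) + e_0\int(dz')\,S_+(x-z')\gamma^\sigma\langle A_\sigma(z')\rangle\,\tilde S(z',y) + e_0\int(dz')\,S_+(x-z')\gamma^\sigma\,\frac{\delta\tilde S(z',y)}{i\,\delta J^\sigma(z')}.$$

Multiply from the right by $\int(dy)\,\tilde S^{-1}(y,y')$ and from the left by $\int(dx)\,S_+^{-1}(x'-x)$, and integrate, to obtain

$$\tilde S^{-1}(x',y') = S_+^{-1}(x'-y') - e_0\,\delta^{(4)}(x'-y')\,\gamma^\sigma\langle A_\sigma(y')\rangle - e_0\int(dy)\,\gamma^\sigma\,\frac{\delta\tilde S(x',y)}{i\,\delta J^\sigma(x')}\,\tilde S^{-1}(y,y').$$

The last term may be rewritten as

$$+\,e_0\int(dy)\,\gamma^\sigma\,\tilde S(x',y)\,\frac{\delta\tilde S^{-1}(y,y')}{i\,\delta J^\sigma(x')} = -\,i\,e_0\int(dy)(dz')\,\gamma_\sigma\,\tilde S(x',y)\,\tilde D^{\rho\sigma}(z',x')\,\frac{\delta\tilde S^{-1}(y,y')}{\delta\langle A^\rho(z')\rangle}.$$

Accordingly, from Eq. (26.29),

$$\Gamma^\mu(x',y';z) = \gamma^\mu\,\delta^{(4)}(x'-y')\,\delta^{(4)}(y'-z) - i\,e_0\int(dy)(dz')\,\gamma^\sigma\,\frac{\delta}{\delta\langle A_\mu(z)\rangle}\Big[\tilde S(x',y)\,\tilde D_{\rho\sigma}(z',x')\,\Gamma^\rho(y,y';z')\Big].$$

To lowest order only the term $\delta\tilde S(x',y)/\delta\langle A_\mu(z)\rangle = e_0\,S_+(x'-z)\gamma^\mu S_+(z-y)$ contributes; the other non-vanishing terms are of higher order. With all the sources set equal to zero, we have

$$\Gamma^\mu(x',y';z) = \gamma^\mu\,\delta^{(4)}(x'-z)\,\delta^{(4)}(y'-z) - i\,e^2\,\gamma^\sigma S_+(x'-z)\gamma^\mu S_+(z-y')\gamma^\rho D_{\sigma\rho}(x'-y').$$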

Hence

$$\Gamma^\mu(p',p) = \gamma^\mu + \Lambda^\mu(p',p), \tag{26.32}$$

$$\Lambda^\mu(p',p) = -\,i\,e^2\int\frac{(dk)}{(2\pi)^4}\,D_{\sigma\rho}(k)\,\gamma^\sigma S_+(p'-k)\gamma^\mu S_+(p-k)\gamma^\rho, \tag{26.33}$$

and, up to the $-i$ factor, the latter is depicted as

[Figure not reproduced: lowest-order vertex correction diagram.]

Again, to evaluate this expression, one introduces a fictitious photon mass $\mu$, and by combining the denominators, using the equations (E) as well as, in the process, the integral (D) in Box 26.1, the leading contribution to first order in $Q = p' - p$ is given by

$$\Gamma^\mu(p',p) = \frac{1}{Z_1}\,\gamma^\mu + \frac{\alpha}{8\pi m}\,[\gamma^\mu,\gamma^\nu]\,Q_\nu,\qquad Z_1 = Z_2. \tag{26.34}$$

The equality $Z_1 = Z_2$ holds to all orders in perturbation theory and is referred to as the Ward identity.² The renormalized vertex is defined by

$$\Gamma^\mu_{\rm ren}(p',p) = Z_1\,\Gamma^\mu(p',p), \tag{26.35}$$

and it satisfies the boundary condition $\Gamma^\mu_{\rm ren}(p',p) \to \gamma^\mu$ for $Q \to 0$, indicating that with a zero momentum transfer of a scattered particle one is not able to probe the internal structure of the electron, and the latter is "seen" to behave like a point-like particle.

The consequence of the renormalization constants $Z_1$, $Z_2$, $Z_3$ is considered next. We recall that each end of a photon line connected to (the rest of) a diagram, i.e., connected to an electron line, is multiplied by a single power of the unrenormalized charge $e_0$. That is, each photon field is multiplied by $e_0$. Moreover, the electron field, the photon field and the vertex part are renormalized as $\psi \to \sqrt{Z_2}\,\psi_{\rm ren}$ ($\bar\psi \to \sqrt{Z_2}\,\bar\psi_{\rm ren}$), $A^\mu \to \sqrt{Z_3}\,A^\mu_{\rm ren}$, $\Gamma^\mu \to \Gamma^\mu_{\rm ren}/Z_1$. That is, when the theory is expressed in terms of renormalized components, the unrenormalized charge $e_0$ gets renormalized only by $\sqrt{Z_3}$, as seen from the formal scalings of the electron and photon lines as well as of the vertex part:

[Diagrammatic scaling relation not reproduced.] (26.36)

a result which holds true to all orders, using the Ward identity $Z_2 = Z_1$ to cancel out the $(\sqrt{Z_2})^2$ and $1/Z_1$ factors. To second order in $Q$, with $Q^2 \neq 0$, the external photon line to the vertex part represents a virtual photon, and from the expression of $\Pi(k^2)$, with $k \to Q$, as given in the 7th line in Box 26.3, $\Pi(Q^2) - \Pi(0) = -\alpha Q^2/(15\pi m^2)$ is readily extracted from the lowest order of a Taylor expansion in $Q^2$. For the external electron lines of the vertex part with the electrons on their mass shell, the vertex part, after mass and charge renormalization, up to second order in $Q$, is given by

$$\gamma^\mu\Big[1 - \frac{\alpha\,Q^2}{3\pi m^2}\Big(-\frac12\ln\frac{\mu^2}{m^2} - \frac38 - \frac15\Big)\Big] + \frac{\alpha}{8\pi m}\,[\gamma^\mu,\gamma^\nu]\,Q_\nu, \tag{26.37}$$

2 Ward [8]. See also Takahashi [7].

which is infra-red divergent. Experimentally, however, there exists a limit of resolution such that photons of momenta of magnitudes |k| < K , are not detectable. This allows one to introduce an infrared cut-off photon propagator part defined by



D+ (x − x ) =

 = Limμ→0

 |k|>K 

eik(x−x ) (dk) (2π )4 k 2 + μ2 − i





dk 0 eik(x−x ) (2π ) k 2 − i   eik(x−x ) d3 k dk 0 . 3 (2π ) k 2 + μ2 − i |k| K , and consistently apply perturbation theory to the Dirac theory. On the other hand, for |k| < K , one is in the low energy region and a modified perturbation treatment, in which the Coulomb potential is taken exactly, will be carried out. The corresponding energy contributions in the two regions are denoted, respectively, by δ E > and δ E < , and, quite importantly, the total contribution δ E = δ E > + δ E < is independent of the infra-red cut-off K . The δ E > Contribution:   The bound-state problem of the Dirac equation in the Coulomb potential Aμ (x) = 0, A0 (x) , eA0 (x) = −α/|x|, is described by the equation

5 Schwinger [13]. For a textbook treatment in which the electron is also treated non-relativistically, see Manoukian [10], p. 453, with the corrective factor given by $[1 + \kappa(\alpha/2\pi)]$, where $\kappa = (16/9) - 2\ln(3/2) \simeq 0.97$, which compares well with the QED result.

6 Feynman [4], p. 7.

7 See, e.g., Kinoshita [7].

8 C. N. Yang [15], p. 176, writes that Enrico Fermi, Edward Teller and Gregor Wentzel, together with several graduate students including himself, gathered in Fermi's office from April to May in 1948, after attending lectures given, in particular, by Schwinger and Feynman at the New York APS meeting in January 1948, to try to understand, with much difficulty, some of Schwinger's computations in QED such as the anomaly. At the end of the six weeks of work, somebody asked, "Wasn't it true that Feynman also talked?" All three said, "Yes, yes, Feynman did talk." "What did he say?" All they could remember from Feynman's lecture, however, was the strange notation of a $p$ with a slash in it, $p\!\!\!/$, as a notation standing for $\gamma p$. Surprisingly, Fermi was taking notes while attending Schwinger's talk, which apparently was unusual for Fermi.

27

Anomalous Magnetic Moment of the Electron and the Lamb Shift

187



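$$\Big[\frac{\gamma^\mu\partial_\mu}{i} - e\,\gamma^0 A_0(\mathbf{x}) + m\Big]\psi_n(x) = 0,\qquad \psi_n(x) = \psi_n(\mathbf{x})\,e^{-iE_n x^0}, \tag{27.11}$$

$$\Big[\frac{\gamma^0\gamma^j\partial_j}{i} - e\,A_0(\mathbf{x}) + m\,\gamma^0\Big]\psi_n(\mathbf{x}) = E_n\,\psi_n(\mathbf{x})\qquad\big((\gamma^0)^2 = I\big), \tag{27.12}$$

$$-\,\frac{\partial_j}{i}\,\psi_n^\dagger(\mathbf{x})\,\gamma^0\gamma^j + \psi_n^\dagger(\mathbf{x})\big[-e\,A_0(\mathbf{x}) + m\,\gamma^0\big] = \psi_n^\dagger(\mathbf{x})\,E_n,\qquad \big(\gamma^0\gamma^j\big)^\dagger = \gamma^0\gamma^j. \tag{27.13}$$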

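For $|\mathbf{Q}|$ of the order of the energy difference $m\alpha^2$ between the levels in the atom, we have $\mathbf{Q}^2/m^2 \ll 1$. Also, from Eqs. (26.37) and (26.40) in Chap. 26, we may infer that in the presence of radiative corrections, the Fourier transform $e\gamma^0 A_0(\mathbf{Q})$ of the interaction term $e\gamma^0 A_0(\mathbf{x})$ is, for $\mathbf{Q}^2/m^2 \ll 1$, modified to

$$\Big\{\gamma^0\Big[1 - \frac{\alpha\,\mathbf{Q}^2}{3\pi m^2}\Big(\ln\frac{m}{2K} + \frac56 - \frac38 - \frac15\Big)\Big] + \frac{\alpha}{8\pi m}\,[\gamma^0,\gamma^j]\,Q_j\Big\}\,e\,A_0(\mathbf{Q}), \tag{27.14}$$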

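giving rise to an additional interaction term in the Dirac equation given by

$$\gamma^0\,\delta U = \gamma^0\Big[\frac{\alpha}{3\pi}\Big(\ln\frac{m}{2K} + \frac56 - \frac38 - \frac15\Big)\frac{-\boldsymbol{\nabla}^2}{m^2} - \frac{\alpha}{4\pi m}\,\frac{\boldsymbol{\gamma}\cdot\boldsymbol{\nabla}}{i}\Big]\frac{\alpha}{|\mathbf{x}|} \tag{27.15}$$

$$\phantom{\gamma^0\,\delta U} = \frac{4\alpha^2\gamma^0}{3}\Big(\ln\frac{m}{2K} + \frac56 - \frac38 - \frac15\Big)\frac{\delta^{(3)}(\mathbf{x})}{m^2} - \frac{\alpha^2\gamma^0}{4\pi m}\,\frac{\boldsymbol{\gamma}\cdot\boldsymbol{\nabla}}{i}\,\frac{1}{|\mathbf{x}|}, \tag{27.16}$$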

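where in writing the second line we have used the relation $-\boldsymbol{\nabla}^2(1/|\mathbf{x}|) = 4\pi\,\delta^{(3)}(\mathbf{x})$.⁹ Thus the term $-\gamma^0 e A_0(\mathbf{x})$ in the Dirac equation (27.11), as well as $E_n$, $\psi_n$, are then modified as follows:

$$-e\gamma^0 A_0 \to -e\gamma^0 A_0 + \gamma^0\,\delta U,\qquad E_n \to E_n + \delta E_n,\qquad \psi_n \to \psi_n + \delta\psi_n, \tag{27.17}$$

and the following modification of the Dirac equation in (27.11) emerges:

$$\Big[\frac{\gamma^0\gamma^j\partial_j}{i} - e\,A_0(\mathbf{x}) + \gamma^0 m\Big]\delta\psi_n(\mathbf{x}) + \delta U\,\psi_n = E_n\,\delta\psi_n(\mathbf{x}) + \psi_n\,\delta E_n. \tag{27.18}$$

By multiplying the above equation from the left by $\psi_n^\dagger$, integrating over $\mathbf{x}$, and using, in the process, the normalization condition $\langle\psi_n|\psi_n\rangle = 1$, as well as Eq. (27.13), we obtain

$$\delta E_n = \langle\psi_n|\delta U|\psi_n\rangle. \tag{27.19}$$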

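For the problem at hand, we have to make the replacement $n \to (n = 2, \ell)$, where $\ell$ is the orbital quantum number. One also has to combine the angular momentum of the electron and its spin.¹⁰ We have the explicit matrix elements:

$$\int d^3\mathbf{x}\;\psi^\dagger_{n=2,\ell}(\mathbf{x})\,\delta^{(3)}(\mathbf{x})\,\psi_{n=2,\ell}(\mathbf{x}) = \frac{(m\alpha)^3}{8\pi}\,\delta_{\ell,0}, \tag{27.20}$$

$$\int d^3\mathbf{x}\;\psi^\dagger_{n=2,\ell}(\mathbf{x})\Big[\frac{\boldsymbol{\gamma}\cdot\boldsymbol{\nabla}}{i}\,\frac{1}{|\mathbf{x}|}\Big]\psi_{n=2,\ell}(\mathbf{x}) = \frac{(m\alpha)^3}{4m}\Big[\frac{\delta_{\ell,1}}{3} - \delta_{\ell,0}\Big]. \tag{27.21}$$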
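The contact matrix element (27.20) can be cross-checked against the familiar nonrelativistic hydrogen wavefunctions, which is the leading-order content of the four-component wavefunctions used in the text. The sketch below is an illustration under that assumption: it uses the standard Schrödinger radial functions $R_{20}$, $R_{21}$ with Bohr radius $a = 1/(m\alpha)$, units with $m = 1$, and helper names that are not from the original.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine-structure constant
M = 1.0                # electron mass in units where m = 1 (assumption)

def R20(r, a):
    """Nonrelativistic 2s radial function."""
    return (1.0 / math.sqrt(2.0)) * a ** -1.5 * (1.0 - r / (2.0 * a)) * math.exp(-r / (2.0 * a))

def R21(r, a):
    """Nonrelativistic 2p radial function; it vanishes at the origin."""
    return (1.0 / (2.0 * math.sqrt(6.0))) * a ** -1.5 * (r / a) * math.exp(-r / (2.0 * a))

def radial_norm(R, a, n=20000, span=60.0):
    """Midpoint estimate of int_0^inf r^2 R(r)^2 dr, truncated at span * a."""
    h = span * a / n
    return h * sum(((i + 0.5) * h) ** 2 * R((i + 0.5) * h, a) ** 2 for i in range(n))

def density_at_origin(R, a):
    """|psi(0)|^2 for an s-state (Y_00 = 1/sqrt(4 pi)); zero whenever R(0) = 0."""
    return R(0.0, a) ** 2 / (4.0 * math.pi)

if __name__ == "__main__":
    a = 1.0 / (M * ALPHA)
    print(radial_norm(R20, a), radial_norm(R21, a))           # both close to 1
    print(density_at_origin(R20, a), (M * ALPHA) ** 3 / (8.0 * math.pi))
    print(density_at_origin(R21, a))                          # 0.0
```

The 2s density at the origin reproduces $(m\alpha)^3/8\pi$, and the 2p density vanishes, in line with the factor $\delta_{\ell,0}$ in (27.20).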

Equations (27.19)–(27.21) then give rise to

$$\delta E^{>} = \frac{4\alpha^2}{3m^2}\Big[\ln\frac{m}{2K} + \frac56 - \frac38 - \frac15\Big]\frac{(m\alpha)^3}{8\pi}\,\delta_{\ell,0} - \frac{\alpha^2}{4\pi m}\,\frac{(m\alpha)^3}{4m}\Big[\frac{\delta_{\ell,1}}{3} - \delta_{\ell,0}\Big]$$

$$\phantom{\delta E^{>}} = \frac{\alpha^5 m}{2\pi}\Big[\frac13\Big(\ln\frac{m}{2K} + \frac56 - \frac38 - \frac15\Big) + \frac18\Big]\delta_{\ell,0} - \frac{\alpha^5 m}{48\pi}\,\delta_{\ell,1}. \tag{27.22}$$

9 A derivation of this expression, used often in classical electrodynamics, is given for the convenience of the reader in Box 51.2.

10 For the very elaborate and explicit construction of the four-component wave-functions $\psi_{n=2,\ell}$ for the problem at hand, see Manoukian [11], pp. 295–298. See also Manoukian [10], pp. 383, 384, 407.
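Equation (27.22) is simple enough to evaluate directly. The sketch below is illustrative only: it codes the final form of (27.22) for $\ell = 0, 1$ in units with $m = 1$, with the infrared separation scale $K$ left as a free input. Since only the sum $\delta E^{>} + \delta E^{<}$ is stated to be independent of $K$, the numbers produced for a given $K$ are not physical on their own; the function name and defaults are assumptions.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine-structure constant
M = 1.0                # electron mass in units where m = 1 (assumption)

def delta_E_high(ell, K, m=M, alpha=ALPHA):
    """High-momentum (|k| > K) contribution of Eq. (27.22) for the n = 2 level."""
    prefactor = alpha ** 5 * m
    if ell == 0:
        bracket = (math.log(m / (2.0 * K)) + 5.0 / 6.0 - 3.0 / 8.0 - 1.0 / 5.0) / 3.0 + 1.0 / 8.0
        return prefactor / (2.0 * math.pi) * bracket
    if ell == 1:
        return -prefactor / (48.0 * math.pi)
    raise ValueError("only ell = 0 or 1 occur for n = 2")

if __name__ == "__main__":
    K = ALPHA * M  # hypothetical choice of the separation scale
    print(delta_E_high(0, K), delta_E_high(1, K))
    # Doubling K lowers the ell = 0 term by (alpha^5 m / 6 pi) ln 2:
    print(delta_E_high(0, K) - delta_E_high(0, 2.0 * K),
          ALPHA ** 5 * M * math.log(2.0) / (6.0 * math.pi))
```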

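The $\delta E^{<}$ Contribution: Now we are in the low-energy region for photons with momenta $|\mathbf{k}| < K$. For the hydrogen atom, in the absence of the interaction of the electron with radiation, the Hamiltonian is simply given by $H = \mathbf{p}^2/2m - \alpha/|\mathbf{x}|$. The interaction of the electron with radiation is obtained through the substitution $\mathbf{p} \to (\mathbf{p} - e\mathbf{A}_{\rm rad})$, where $\mathbf{A}_{\rm rad}$ denotes the radiation field. Moreover, we need to introduce the free Hamiltonian $H_{0\,\rm rad}$ of radiation. The Hamiltonian of the electron-photon system in hydrogen may then be defined by

$$H = \frac{(\mathbf{p} - e\mathbf{A}_{\rm rad})^2}{2m} - \frac{\alpha}{|\mathbf{x}|} + H_{0\,\rm rad} \tag{27.23}$$

$$\phantom{H} = \frac{\mathbf{p}^2}{2m} - \frac{\alpha}{|\mathbf{x}|} + H_{0\,\rm rad} + H_I = H_C + H_{0\,\rm rad} + H_I, \tag{27.24}$$

$$H_I = \frac{e^2}{2m}\,\mathbf{A}_{\rm rad}^2 - \frac{e}{2m}\big[\mathbf{A}_{\rm rad}\cdot\mathbf{p} + \mathbf{p}\cdot\mathbf{A}_{\rm rad}\big]. \tag{27.25}$$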

The Lamb shift has an lnα dependence on the fine-structure constant α and this requires to treat the Coulomb interaction term exactly. Of course, we can do this because the Hydrogenic eigenfunctions are known. That is, we treat the part of the Hamiltonian [HC + H0rad ] exactly, with the HI term in (27.25) as a perturbation. Let |vac denote the vacuum state involving no photons, and |k, λ a single photon state of momentum k and polarization specified by the parameter λ. In particular, note that with HI as a perturbation, we may, with eλ denoting real polarization vectors satisfying the relations: k · eλ = 0, eλ · eλ = δλλ , conveniently, use the expressions involving the field Arad (x) in Box 27.1. Let |ϕn  and En denote, respectively, the Hydrogen-atom eigenfunctions and the corresponding energy levels. That is, |ϕ n  ≡ |ϕn |vac, [HC + H0rad ]|ϕ n  = En |ϕ n ,   [HC + H0rad ]|ϕn |k, λ = En + |k| |ϕn |k, λ.

(27.26) (27.27)

In the presence of perturbation, the bound-state problem is described by the set of equations Box 27.1 Expressions involving the field Arad (x) vac|Arad (x)|vac = 0, vac|Arad (x)|k, λ = eik·x eλ (k), a  vac|Arad (x) |k j , λ j  = 0 for a ≥ 2, k, λ|Arad (x) = e−ik·x eλ (k)vac|.  j   d3 k |vacvac| + |k, λk, λ| + · · · = 1. (2π )3 2|k| λ=1,2   d3 k vac|Arad (x)|k, λ · k, λ|Arad (x) vac|A2rad (x) = vac|Arad (x)|vac · vac|Arad (x) + (2π )3 2|k| λ   d3 k = 0 + vac|. (2π )3 2|k| λ

H |φn  = En |φn , ⇒ ϕ n |H |φn  = En ϕ n |φn ,  ⇒ ϕ n |HI |φn  = En − En )ϕ n |φn , using(27.26), (27.25),  ϕ  |HI |φn  En = n  , En = En − En ). ϕ n |φn 

(27.28)

(27.29)


From the last two equations in Box 27.1, we may infer that ϕ n |A2rad (x)|φn  ϕ n |1 A2rad (x)|φn  ϕ n |φn  = = ϕ n |φn  ϕ n |φn  ϕ n |φn 

  λ

d3 k = (2π )3 2|k|

  λ

d3 k , (2π )3 2|k|

(27.30)

which is independent of the quantum numbers n, and hence is common to all the energy levels. Accordingly, the first term in HI defined in (27.25) does not contribute in computing differences in energy levels.11 Multiply the first equation in (27.28) by k, λ|, generating a state corresponding to HC given by 1 k, λ|HI |φn , [ En − HC − |k| ]

k, λ|φn  =

k, λ|H0rad = |k|k, λ| ,

(27.31)

which, using in the process the properties of Arad (x) in Box 27.1, and, notably, that the matrix elements between the vacuum state and multi-photon states are zero, leads to the following expression corresponding to the second term in HI defined in (27.25)

e ϕ n | Arad (x) · p + p · Arad (x) |φn  2m     4π e2 d3 k   1  ik·x −ik·x · e ϕ e = e e · p p ϕn . λ λ 3 2|k| n m2 (2π ) [ E − H − |k| ] n C λ=1,2 −

(27.32)

Here to the lowest order contribution of HI , we have replaced the denominator in (27.29) by one, replaced En in the denominator of the above equation by En , and used the property k · eλ = 0. In Box 27.2, it is shown that this equation leads to the following expression for δ E < :      2α  K   H p ϕ ] · ln δE = ϕ [p, n=2, C n=2, 3π m 2 |HC − En |   

K   2α   Ry  [p, H 2,

p + [p, H ≡ ] · ln ] · p ln   2, , C C 3π m 2 |HC − En | Ry 0, dτ 3π

(31.1)

where the positive sign of the slope, of course, means that the effective parameter increases with $\tau$, and for a sufficiently large $Q^2$, the term $\alpha\ln(Q^2/m^2)$, as an expansion parameter, may become too large. This is unlike the situation in a non-abelian gauge theory such as QCD, in which the effective coupling parameter decreases with $Q^2$, a welcome property of QCD referred to as asymptotic freedom, so that a perturbation expansion may be valid at high energies; this is considered in the next chapter (Chap. 32). More generally, the effective fine-structure parameter $\alpha(Q^2)$ in QED, at a given $Q^2$ value, may be defined through the equations:

$$\alpha_0\,\tilde D^{\mu\nu}(Q) = \Big[\eta^{\mu\nu} - \frac{Q^\mu Q^\nu}{Q^2}\Big]\frac{\alpha_0\,d(Q^2)}{(Q^2 - i\epsilon)} + \lambda\,\alpha_0\,\frac{Q^\mu Q^\nu}{Q^2\,(Q^2 - i\epsilon)}, \tag{31.2}$$

$$\alpha(Q^2) = \alpha_0\,d(Q^2) = \frac{\alpha_0}{1+\alpha_0\,\pi(Q^2)} = \frac{\alpha_0}{1+\alpha_0\,\pi(\xi^2)+\alpha_0\big[\pi(Q^2)-\pi(\xi^2)\big]}, \tag{31.3}$$

where $\alpha_0$ is the unrenormalized parameter, $\tilde D^{\mu\nu}(Q)$ is the full photon propagator,¹ given to lowest order in the momentum description in (26.12) with $1/[1 - \alpha_0\pi(Q^2)] \simeq [1 + \alpha\,\pi(Q^2)]$, and $\pi(Q^2)$ was explicitly given to lowest order in Box 26.3 of Chap. 26. The polarization tensor is given by $\Pi^{\mu\nu}(Q) = \big[\eta^{\mu\nu}Q^2 - Q^\mu Q^\nu\big]\Pi(Q^2)$ (see Eq. (26.10)), where

$$\Pi(Q^2) = \alpha\,\pi(Q^2),\qquad \alpha(\xi^2) = \frac{\alpha_0}{1+\alpha_0\,\pi(\xi^2)}. \tag{31.4}$$

For $Q^2 = 0$, $\alpha(0) = \alpha \simeq 1/137$ and is given explicitly by

$$\alpha = \alpha(0) = \frac{\alpha_0}{1+\alpha_0\,\pi(0)} = Z_3\,\alpha_0,\qquad Z_3 = \frac{1}{1+\alpha_0\,\pi(0)}. \tag{31.5}$$

1 It is given in spacetime variables in (26.4). See also (26.17).
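To see how (31.5) ties the bare and renormalized couplings to the cut-off, one can insert the lowest-order $\pi(0) = (1/3\pi)[\ln(\Lambda^2/m^2) - 1/2]$ quoted in Chap. 26 and solve for $\alpha_0$. The following is a minimal sketch under that lowest-order assumption; the function names are illustrative, and the cut-off is passed as $\ln(\Lambda/m)$ so that very large cut-offs do not overflow.

```python
import math

ALPHA = 1.0 / 137.036  # approximate measured fine-structure constant

def pi0(log_cutoff_over_m):
    """Lowest-order pi(0) = (1/3pi)[ln(Lambda^2/m^2) - 1/2]; argument is ln(Lambda/m)."""
    return (2.0 * log_cutoff_over_m - 0.5) / (3.0 * math.pi)

def bare_alpha(log_cutoff_over_m, alpha=ALPHA):
    """Invert alpha = alpha0 / (1 + alpha0 pi(0)) of Eq. (31.5) for alpha0."""
    denom = 1.0 - alpha * pi0(log_cutoff_over_m)
    if denom <= 0.0:
        raise ValueError("cut-off beyond the point where Z3 reaches zero")
    return alpha / denom

def Z3(log_cutoff_over_m, alpha=ALPHA):
    """Z3 = alpha / alpha0, which equals 1 - alpha pi(0) at this order."""
    return alpha / bare_alpha(log_cutoff_over_m, alpha)

def log_cutoff_where_Z3_vanishes(alpha=ALPHA):
    """ln(Lambda/m) at which alpha pi(0) = 1, i.e. alpha0 diverges at this order."""
    return 3.0 * math.pi / (2.0 * alpha) + 0.25

if __name__ == "__main__":
    L = math.log(1.0e10)  # hypothetical cut-off Lambda = 10^10 m
    print(Z3(L), bare_alpha(L))
    print(log_cutoff_where_Z3_vanishes())
```

Even for a cut-off ten orders of magnitude above the electron mass, $Z_3$ stays within a few percent of one; the divergence of $\alpha_0$ at this order sets in only at an astronomically large $\ln(\Lambda/m)$.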

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_31


31

Renormalization Group of an Abelian Gauge Theory: QED

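With a dependence on the fine-structure constant $\alpha$, we also have the renormalized part $d_{\rm ren}$ of the photon propagator, given by

$$\alpha\,d_{\rm ren}(Q^2,\alpha) = \alpha_0\,d(Q^2), \tag{31.6}$$

$$\alpha_0\,\tilde D^{\mu\nu}(Q) = \alpha\,\tilde D^{\mu\nu}_{\rm ren}(Q) = \Big[\eta^{\mu\nu} - \frac{Q^\mu Q^\nu}{Q^2}\Big]\frac{\alpha\,d_{\rm ren}(Q^2,\alpha)}{(Q^2 - i\epsilon)} + \lambda\,\alpha_0\,\frac{Q^\mu Q^\nu}{Q^2\,(Q^2 - i\epsilon)}. \tag{31.7}$$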

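We may now use Eqs. (31.2) and (31.3) to write

$$\alpha\,d_{\rm ren}(Q^2,\alpha) = \alpha(\xi^2)\,d_{\rm ren}\big(Q^2,\xi^2,\alpha(\xi^2)\big), \tag{31.8}$$

$$d_{\rm ren}\big(Q^2,\xi^2,\alpha(\xi^2)\big) = \frac{1}{1+\alpha(\xi^2)\big[\pi(Q^2)-\pi(\xi^2)\big]}, \tag{31.9}$$

$$d_{\rm ren}\big(Q^2,0,\alpha(0)\big) \equiv d_{\rm ren}\big(Q^2,\alpha\big). \tag{31.10}$$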

Equation (31.8) gives the important invariance equation relating the effective fine-structure parameter for any two values $\xi_1^2$, $\xi_2^2$, referred to as a renormalization group equation:

$$\alpha(\xi_1^2)\,d_{\rm ren}\big(Q^2,\xi_1^2,\alpha(\xi_1^2)\big) = \alpha(\xi_2^2)\,d_{\rm ren}\big(Q^2,\xi_2^2,\alpha(\xi_2^2)\big), \tag{31.11}$$

$$\alpha(\xi^2) = \alpha\,d_{\rm ren}\big(\xi^2,\alpha\big), \tag{31.12}$$

with no reference to the unrenormalized coupling $\alpha_0$. We have, however, the boundary conditions:

$$\alpha(\xi^2) = \begin{cases}\alpha, & \text{for } \xi^2 \to 0,\\[2pt] \alpha_0, & \text{for } \xi^2 \to \infty.\end{cases} \tag{31.13}$$
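The invariance statement (31.11) can be made concrete with the lowest-order vacuum polarization function. The sketch below is an illustration, not the author's procedure: it takes $\pi(Q^2) - \pi(0) = -(2/\pi)\int_0^1 dz\,z(1-z)\ln[1 + (Q^2/m^2)z(1-z)]$ from Box 26.3, builds $\alpha(\xi^2)$ and $d_{\rm ren}$ from (31.9) and (31.12), and confirms that the product $\alpha(\xi^2)\,d_{\rm ren}(Q^2,\xi^2,\alpha(\xi^2))$ does not depend on the reference point $\xi^2$. Arguments are the dimensionless ratios $Q^2/m^2$ and $\xi^2/m^2$; all names are assumptions.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine-structure constant at Q^2 = 0

def pi_diff(x, n=4000):
    """Lowest-order pi(Q^2) - pi(0) as a function of x = Q^2/m^2 (midpoint rule)."""
    if x == 0.0:
        return 0.0
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * h
        total += z * (1.0 - z) * math.log1p(x * z * (1.0 - z))
    return -(2.0 / math.pi) * h * total

def alpha_eff(x, alpha=ALPHA):
    """alpha(xi^2) = alpha d_ren(xi^2, alpha), Eq. (31.12)."""
    return alpha / (1.0 + alpha * pi_diff(x))

def d_ren(x_q, x_xi, alpha_xi):
    """d_ren(Q^2, xi^2, alpha(xi^2)) of Eq. (31.9)."""
    return 1.0 / (1.0 + alpha_xi * (pi_diff(x_q) - pi_diff(x_xi)))

def invariant(x_q, x_xi, alpha=ALPHA):
    """Left- or right-hand side of Eq. (31.11) for a chosen reference point xi^2."""
    a_xi = alpha_eff(x_xi, alpha)
    return a_xi * d_ren(x_q, x_xi, a_xi)

if __name__ == "__main__":
    x_q = 1.0e4  # hypothetical Q^2/m^2
    for x_xi in (0.0, 1.0, 1.0e2, 1.0e6):
        print(x_xi, invariant(x_q, x_xi))
```

Every reference point returns the same number, which is just $\alpha(Q^2)$; the choice $\xi^2 = 0$ reproduces the boundary condition $\alpha(0) = \alpha$ in (31.13).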

Equations (31.3)–(31.6) lead to the following expression for $d_{\rm ren}^{-1}(Q^2,\alpha)$, using, in the process, the second equality in (31.5):

$$d_{\rm ren}^{-1}(Q^2,\alpha) = Z_3\big[1+\alpha_0\,\pi(Q^2)\big] = Z_3 + Z_3\,\alpha_0\,\pi(0) + Z_3\,\alpha_0\big[\pi(Q^2)-\pi(0)\big],$$

$$d_{\rm ren}^{-1}(Q^2,\alpha) = 1 + Z_3\,\alpha_0\big[\pi(Q^2)-\pi(0)\big] = 1 + \alpha\big[\pi(Q^2)-\pi(0)\big]. \tag{31.14}$$

We differentiate the above equation with respect to the mass $m$ of the electron, keeping the unrenormalized fine-structure parameter $\alpha_0$ and the arbitrarily large ultraviolet cut-off $\Lambda$ fixed. To this end,

$$\frac{m}{Z_3}\frac{d}{dm}Z_3 = \beta(\alpha) = \frac{2\alpha}{3\pi} + \cdots,\qquad m\frac{d}{dm}\,\pi(Q^2)\Big|_{Q^2\gg m^2} = O\Big(\frac{m^2}{Q^2}\Big), \tag{31.15}$$

where the extreme right-hand side of the first equation gives the leading-order contribution in $\alpha$, as obtained from the expression $Z_3 = 1 + (2\alpha/3\pi)\ln(m/\Lambda) + \alpha/6\pi$ in Eq. (26.15) in Chap. 26 and given in Box 31.1. Furthermore, the vanishing behavior as $m^2/Q^2$ is derived in Box 31.1, to lowest order in $\alpha$, and holds true to every order of perturbation theory² up to powers of $\ln(m^2/Q^2)$. Accordingly, upon taking the derivative $m\,d/dm$ of the relation $d_{\rm ren}^{-1}(Q^2,\alpha) = Z_3[1+\alpha_0\pi(Q^2)]$, we obtain

$$m\frac{d}{dm}\,d_{\rm ren}^{-1}(Q^2,\alpha) = \Big[\frac{m}{Z_3}\frac{d}{dm}Z_3\Big]\,d_{\rm ren}^{-1}(Q^2,\alpha) + \alpha\,m\frac{d}{dm}\,\pi(Q^2),\qquad (\alpha = Z_3\,\alpha_0) \tag{31.16}$$

remembering that we are keeping the unrenormalized $\alpha_0$ and $\Lambda^2$ fixed, and hence the derivative on the left-hand side of the above equation must involve the derivative of the fine-structure parameter $\alpha$. That is, the left-hand side of the above equation reads

$$m\frac{d}{dm}\,d_{\rm ren}^{-1}(Q^2,\alpha) = \Big[m\frac{\partial}{\partial m} + \Big(\frac{m}{\alpha}\frac{d\alpha}{dm}\Big)\alpha\frac{\partial}{\partial\alpha}\Big]d_{\rm ren}^{-1}(Q^2,\alpha) = \Big[m\frac{\partial}{\partial m} + \Big(\frac{m}{Z_3}\frac{dZ_3}{dm}\Big)\alpha\frac{\partial}{\partial\alpha}\Big]d_{\rm ren}^{-1}(Q^2,\alpha), \tag{31.17}$$

2 This is a consequence of a famous theorem due to Weinberg [13]. For a proof of the Weinberg theorem which takes into account the subtractions of renormalization, see Manoukian [10].

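where in the last equality we have used the relation $d\alpha/dm = \alpha_0\,dZ_3/dm$. We may now combine (31.16) and (31.17) and use the definition of $\beta(\alpha)$ in (31.15) to write

$$\Big[m\frac{\partial}{\partial m} + \beta(\alpha)\Big(\alpha\frac{\partial}{\partial\alpha} - 1\Big)\Big]d_{\rm ren}^{-1}(Q^2,\alpha) = \alpha\,m\frac{d}{dm}\,\pi(Q^2). \tag{31.18}$$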

Such an equation is referred to as a Callan-Symanzik scaling equation³ and is in the form given in a remarkable paper by Stephen Adler.⁴ Now we use the fact that the effective fine-structure parameter is given by $\alpha(Q^2) = \alpha\,d_{\rm ren}(Q^2,\alpha)$, and from Eq. (31.15) that for $Q^2 \gg m^2$, $\alpha\,m\,d\pi(Q^2)/dm = O(m^2/Q^2)$, to infer from (31.18) that

$$\Big[m\frac{\partial}{\partial m} + \alpha\,\beta(\alpha)\frac{\partial}{\partial\alpha}\Big]\alpha^{\rm as}(Q^2) = 0, \tag{31.19}$$

where $\alpha^{\rm as}(Q^2)$ is obtained from $\alpha(Q^2)$ by neglecting terms which vanish for $Q^2/m^2 \to \infty$. $\alpha^{\rm as}(Q^2)$ will be a function of $\ln(Q^2/m^2)$. As a function of the latter variable, we have from (31.15) and the last relation in Box 31.1, or directly from (31.1),

$$\alpha^{\rm as}(m^2) \equiv q(\alpha) = \alpha - \frac{5}{9\pi}\,\alpha^2 + \cdots. \tag{31.20}$$
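The leading-order asymptotic coupling behind (31.20) follows from (31.14) together with the asymptotic form $\alpha[\pi(Q^2)-\pi(0)]^{\rm as} = -(\alpha/3\pi)\ln(Q^2/m^2) + 5\alpha/9\pi$ listed in Box 31.1. A minimal numerical sketch is given below. It keeps only the electron loop, so the value it produces at a high scale is illustrative rather than a prediction; the scale $Q = 91.19$ GeV and the electron mass $0.000511$ GeV are hypothetical inputs chosen for the example, and the function names are assumptions.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine-structure constant at Q^2 = 0

def alpha_as(tau, alpha=ALPHA):
    """Leading-order alpha^as as a function of tau = ln(Q^2/m^2), electron loop only."""
    return alpha / (1.0 - (alpha / (3.0 * math.pi)) * (tau - 5.0 / 3.0))

def q_second_order(alpha=ALPHA):
    """q(alpha) of Eq. (31.20), truncated after the alpha^2 term."""
    return alpha - 5.0 * alpha ** 2 / (9.0 * math.pi)

if __name__ == "__main__":
    print(alpha_as(0.0), q_second_order())     # agree up to O(alpha^3)
    Q, m = 91.19, 0.000511                     # GeV, illustrative inputs
    tau = math.log(Q * Q / (m * m))
    print(tau, 1.0 / alpha_as(tau))
```

At $\tau = 0$ the closed form reduces to $\alpha/(1 + 5\alpha/9\pi)$, whose expansion reproduces (31.20). The inverse coupling at the high scale comes out a few units below 137, reflecting the slow logarithmic growth of the effective coupling.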

To develop an all-order perturbation expansion of $\alpha^{\rm as}(Q^2)$ as a function of $\ln(Q^2/m^2)$, we introduce the function

$$\psi\big(q(\alpha)\big) = \frac12\,\alpha\,\beta(\alpha)\,q'(\alpha), \tag{31.21}$$

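referred to as the Gell-Mann and Low function,⁵ to rewrite (31.19) as

$$\Big[\frac{\partial}{\partial\tau} - \psi\big(q(\alpha)\big)\frac{\partial}{\partial q(\alpha)}\Big]\alpha^{\rm as}(Q^2) = 0,\qquad \tau = \ln(Q^2/m^2). \tag{31.22}$$

Box 31.1 Leading order contributions to the renormalization group Eqs. (31.18)/(31.19) in QED

• $d_{\rm ren}^{-1}(Q^2,\alpha) = Z_3\big[1+\alpha_0\pi(Q^2)\big] = Z_3 + Z_3\alpha_0\pi(0) + Z_3\alpha_0\big[\pi(Q^2)-\pi(0)\big] = 1 + Z_3\alpha_0\big[\pi(Q^2)-\pi(0)\big].$

• $\displaystyle\pi(Q^2) = \pi(0) - \frac{2}{\pi}\int_0^1 dz\,z(1-z)\,\ln\Big[1+\frac{Q^2}{m^2}\,z(1-z)\Big],\qquad \pi(0) = \frac{1}{3\pi}\Big[\ln\frac{\Lambda^2}{m^2} - \frac12\Big].$

• $\displaystyle m\frac{d}{dm} = m\frac{\partial}{\partial m} + \Big(\frac{m}{\alpha}\frac{d\alpha}{dm}\Big)\alpha\frac{\partial}{\partial\alpha} = m\frac{\partial}{\partial m} + \Big(\frac{m}{Z_3}\frac{dZ_3}{dm}\Big)\alpha\frac{\partial}{\partial\alpha} = m\frac{\partial}{\partial m} + \beta(\alpha)\,\alpha\frac{\partial}{\partial\alpha}.$

• $\displaystyle m\frac{d}{dm}\ln\Big[1+\frac{Q^2}{m^2}z(1-z)\Big] = -2\,\frac{Q^2}{m^2}\,z(1-z)\Big[1+\frac{Q^2}{m^2}z(1-z)\Big]^{-1} = -2 + 2\Big[1+\frac{Q^2}{m^2}z(1-z)\Big]^{-1}.$

• $\displaystyle m\frac{d}{dm}\,\alpha\,\pi(Q^2)\Big|_{Q^2\gg m^2} = -\frac{2\alpha}{3\pi} + \frac{4\alpha}{\pi}\int_0^1 dz\,z(1-z) - \frac{4\alpha}{\pi}\,\frac{m^2}{Q^2}\int_0^1 dz = -\frac{4\alpha}{\pi}\,\frac{m^2}{Q^2}.$

• $\displaystyle Z_3 = 1 - \frac{\alpha}{3\pi}\Big[\ln\frac{\Lambda^2}{m^2} - \frac12\Big],\qquad \frac{m}{Z_3}\frac{d}{dm}Z_3 = \beta(\alpha) = \frac{2\alpha}{3\pi}.$

• $\displaystyle \alpha\big[\pi(Q^2)-\pi(0)\big]^{\rm as} = -\frac{2\alpha}{\pi}\ln\frac{Q^2}{m^2}\int_0^1 dz\,z(1-z) - \frac{2\alpha}{\pi}\int_0^1 dz\,z(1-z)\ln[z(1-z)] = -\frac{\alpha}{3\pi}\ln\frac{Q^2}{m^2} + \frac{5\alpha}{9\pi}.$

• $\displaystyle \alpha\big[\pi(Q^2)-\pi(0)\big]^{\rm as}\Big|_{Q^2=m^2} = \frac{5\alpha}{9\pi}.$

3 Callan [3]; Symanzik [12].

4 Adler [1].

5 Gell-Mann and Low [5].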


An all order expansion of α as (Q 2 ) in powers of ln(Q 2 /m 2 ) readily follows from this equation and is given6 by n 

   ∞ 

 ln(Q 2 /m 2 ) d d n−1

α (Q ) = q(α) + ψ q(α) z . ψ(z) n! dz dz z=q(α) n =1 as

2

(31.23)

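Higher order computations of $\beta(\alpha)$ and $\psi(z)$ have been given in the literature:⁷

$$\beta(\alpha) = \frac{2\alpha}{3\pi} + \frac12\,\frac{\alpha^2}{\pi^2} - \frac{121}{144}\,\frac{\alpha^3}{\pi^3} + \cdots, \tag{31.24}$$

$$\psi(z) = z\Big[\frac13\,\frac{z}{\pi} + \frac14\,\frac{z^2}{\pi^2} + \frac18\Big(\frac83\,\zeta(3) - \frac{101}{36}\Big)\frac{z^3}{\pi^3} + \cdots\Big], \tag{31.25}$$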

where $\zeta(3) \approx 1.202$, with $\zeta$ denoting the Riemann zeta function. It has been speculated that the fine-structure constant $\alpha$ may be determined from a zero of the beta function, $\beta(\alpha) = 0$.⁸ It has been argued, however, that $\beta(\alpha)$ may not have a non-trivial zero.⁹ One would then expect that $\alpha_0$ may be infinite. Such a possible divergence should be of no surprise, as it would correspond to an unattainable charge "measurement" right at the core of the electron, and our theories are not justified in being extended all the way to absolute zero distances, as mentioned earlier. Various other possible alternative "solutions" for the effective fine-structure parameter may also be inferred directly from Eq. (31.22) without¹⁰ carrying out an expansion in powers of $\ln(Q^2/m^2)$, and we refer the reader to the just mentioned references. We will also carry out a renormalization group analysis of the effective coupling parameter in QCD in the next chapter and discuss its important consequence.
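The truncated series (31.24) and (31.25) can be checked for mutual consistency through the defining relation (31.21), $\psi(q(\alpha)) = \tfrac12\alpha\beta(\alpha)q'(\alpha)$, with $q(\alpha)$ from (31.20). The sketch below is illustrative: because $q(\alpha)$ is only known here through order $\alpha^2$, the two sides are expected to agree through order $\alpha^3$, and the leftover residual should be of order $\alpha^4$. Function names and the numerical value used for $\zeta(3)$ are assumptions of the example.

```python
import math

ALPHA = 1.0 / 137.036
ZETA3 = 1.2020569031595943  # Riemann zeta(3)

def beta(a):
    """Truncated beta function of Eq. (31.24)."""
    return (2.0 * a / (3.0 * math.pi)
            + 0.5 * a ** 2 / math.pi ** 2
            - (121.0 / 144.0) * a ** 3 / math.pi ** 3)

def psi(z):
    """Truncated Gell-Mann and Low function of Eq. (31.25)."""
    return (z ** 2 / (3.0 * math.pi)
            + z ** 3 / (4.0 * math.pi ** 2)
            + (z ** 4 / (8.0 * math.pi ** 3)) * ((8.0 / 3.0) * ZETA3 - 101.0 / 36.0))

def q(a):
    """q(alpha) of Eq. (31.20), truncated after the alpha^2 term."""
    return a - 5.0 * a ** 2 / (9.0 * math.pi)

def dq(a):
    """Derivative q'(alpha) of the truncated q(alpha)."""
    return 1.0 - 10.0 * a / (9.0 * math.pi)

def residual(a):
    """psi(q(alpha)) - (1/2) alpha beta(alpha) q'(alpha); O(alpha^4) if consistent."""
    return psi(q(a)) - 0.5 * a * beta(a) * dq(a)

if __name__ == "__main__":
    for a in (ALPHA, 0.01, 0.05):
        print(a, residual(a), residual(a) / a ** 4)
```

The ratio `residual(a) / a**4` stays roughly constant as `a` varies, which is the signature of agreement through order $\alpha^3$.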

References

1. Adler, S. L. (1972). Short distance behavior of quantum electrodynamics and an eigenvalue condition for α. Physical Review D, 5, 3021–3047.
2. Baker, M., & Johnson, K. (1969). Quantum electrodynamics at small distances. Physical Review, 183, 1292–1299.
3. Callan, C. G. (1970). Broken scale invariance in scalar field theory. Physical Review D, 2, 1541–1547.
4. de Rafaël, E., & Rosner, J. L. (1974). Short-distance behavior of quantum electrodynamics and the Callan-Symanzik equation for the photon. Annals of Physics (NY), 82, 369–406.
5. Gell-Mann, M., & Low, F. E. (1954). Quantum electrodynamics at small distances. Physical Review, 95, 1300–1312.
6. Gorishinii, S. G., et al. (1991). The analytic four loop corrections to the QED beta function in the MS scheme and the QED psi function: Total reevaluation. Physics Letters B, 256, 81–86.
7. Jentschura, U. D., & Nándori, I. (2014). Attempts at a determination of the fine-structure constant from first principles: A brief historical overview. The European Physical Journal H, 39, 591–613.
8. Krasnikov, N. V. (1989). Is finite charge renormalization possible in quantum electrodynamics? Physics Letters B, 225, 284–286.
9. Manoukian, E. B. (1975). Fundamental identity for the infinite-order-zero nature in quantum electrodynamics. Physical Review D, 12, 3365–3367.
10. Manoukian, E. B. (1983). Renormalization. New York/London/Paris: Academic Press.
11. Manoukian, E. B. (2016). Quantum Field Theory I: Foundations and abelian and non-abelian gauge theories. Switzerland: Springer.
12. Symanzik, K. (1970). Small distance behaviour in quantum field theory. Communications in Mathematical Physics, 18, 499–520.
13. Weinberg, S. (1960). High-energy behavior in quantum field theory. Physical Review, 118, 838–849.

6 For the corresponding details see Manoukian [11], pp. 351–353.
7 See de Rafaël and Rosner [4], and Baker and Johnson [2]. See also Gorishinii et al. [6]. Note that we use Adler’s β(α) function notation, which is β(α)/α of de Rafaël and Rosner.
8 See Adler [1] and, e.g., Manoukian [9].
9 See, e.g., Krasnikov [8]. See also Jentschura and Nándori [7].
10 See, for example, Adler [1], and Manoukian [11], pp. 351–353.

32 Renormalization Group of a Non-Abelian Gauge Theory: QCD and Asymptotic Freedom

Prerequisites Chaps. 17, 23, 24 and 31

QCD, as the modern theory of the strong interaction, involves massless non-abelian gauge fields, the so-called gluons, interacting with quarks, and is based on the color gauge symmetry group SU(3). The necessity of having the quantum number color for a consistent description was discussed in Chap. 18. The Lagrangian density of QCD is similar in structure to the one in QED, with some obvious variations. For the SU(N) gauge symmetry, in particular with N = 3 colors for QCD, with generators of the group given by N × N matrices $t_a$, $a = 1, 2, \ldots, N^2 - 1$, and $n_f$ quark flavors, it may be spelled out, by using in the process Eq. (23.12) as well as Chap. 17, to be¹

$$\mathcal{L} = -\frac{1}{4}\, G_{a\mu\nu}\, G_a^{\mu\nu} - \sum_{j=1}^{n_f} \bar{\psi}_j \Big[\gamma^\mu \Big(\frac{\partial_\mu}{i} - g_o A_\mu\Big) + m_{0j}\Big] \psi_j, \qquad A_\mu = A_{a\mu}\, t_a, \tag{32.1}$$

$$G_c^{\mu\nu} = \partial^\mu A_c^\nu - \partial^\nu A_c^\mu + g_o f_{abc} A_a^\mu A_b^\nu, \qquad [t_a, t_b] = i f_{abc}\, t_c, \qquad f_{acd} f_{bcd} = C_A\, \delta_{ab}, \tag{32.2}$$

$$i f_{abc}\, t_b t_c = -\frac{1}{2}\, C_A\, t_a, \qquad \mathrm{Tr}[t_a t_b] = \delta_{ab}/2, \qquad [t_a t_a]_{\ell s} = C_F\, \delta_{\ell s}, \quad \ell, s = 1, 2, \ldots, N, \tag{32.3}$$

$$C_A = N, \qquad C_F = \frac{N^2 - 1}{2N}, \qquad t_b t_a t_b = t_a \Big(t_b t_b - \frac{1}{2}\, C_A\Big) = -\frac{1}{2N}\, t_a. \tag{32.4}$$
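The group-theory identities in (32.2)–(32.4) can be checked numerically. The following is a minimal sketch, not taken from the text, that assumes the simplest case N = 2, with $t_a = \sigma_a/2$ and $f_{abc}$ the Levi-Civita symbol; the helper names (`check_identities`, `matmul`, and so on) are hypothetical, and only the Python standard library is used.

```python
# Numerical check of the identities (32.2)-(32.4) for the illustrative case N = 2,
# where t_a = sigma_a / 2 and f_abc is the Levi-Civita symbol (assumed choice).
N = 2
C_A = N
C_F = (N * N - 1) / (2 * N)

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[entry / 2 for entry in row] for row in s] for s in SIGMA]
ZERO = [[0, 0], [0, 0]]
IDENTITY = [[1, 0], [0, 1]]


def f(a, b, c):
    """Structure constants of SU(2): Levi-Civita symbol for indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) / 2


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(N)] for i in range(N)]


def scale(c, x):
    return [[c * x[i][j] for j in range(N)] for i in range(N)]


def trace(x):
    return sum(x[i][i] for i in range(N))


def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(N) for j in range(N))


def total(matrices):
    """Sum an iterable of N x N matrices."""
    result = ZERO
    for m in matrices:
        result = add(result, m)
    return result


def check_identities():
    """Return a dict mapping each identity to True/False."""
    idx = range(N * N - 1)
    return {
        "[t_a, t_b] = i f_abc t_c": all(
            close(add(matmul(T[a], T[b]), scale(-1, matmul(T[b], T[a]))),
                  total(scale(1j * f(a, b, c), T[c]) for c in idx))
            for a in idx for b in idx),
        "Tr[t_a t_b] = delta_ab / 2": all(
            abs(trace(matmul(T[a], T[b])) - (0.5 if a == b else 0.0)) < 1e-12
            for a in idx for b in idx),
        "f_acd f_bcd = C_A delta_ab": all(
            abs(sum(f(a, c, d) * f(b, c, d) for c in idx for d in idx)
                - (C_A if a == b else 0)) < 1e-12
            for a in idx for b in idx),
        "t_a t_a = C_F": close(total(matmul(T[a], T[a]) for a in idx),
                               scale(C_F, IDENTITY)),
        "i f_abc t_b t_c = -(C_A/2) t_a": all(
            close(total(scale(1j * f(a, b, c), matmul(T[b], T[c]))
                        for b in idx for c in idx),
                  scale(-C_A / 2, T[a]))
            for a in idx),
        "t_b t_a t_b = -(1/2N) t_a": all(
            close(total(matmul(matmul(T[b], T[a]), T[b]) for b in idx),
                  scale(-1 / (2 * N), T[a]))
            for a in idx),
    }


if __name__ == "__main__":
    for name, ok in check_identities().items():
        print(f"{name}: {ok}")
```

The same checks would apply to SU(3) with the eight Gell-Mann matrices divided by two; N = 2 is used here only to keep the sketch short.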

At high energies, specified by a high-energy scale parameter μ, one may set the masses of those quarks with $m_q \ll \mu$ equal to zero² in comparison to μ. On the other hand, for those quarks with masses $m_q \gg \mu$, one may invoke the decoupling theorem, discussed in Chap. 28, to eliminate Feynman diagrams which involve lines representing the propagators of these relatively massive quarks. This restricts one’s study at high energies to a massless theory, with $n_f$ counting, in the process, only those quarks with masses small in comparison to the high-energy scale μ. By including the contributions of the quantum ghost fields, which we denote by $\omega_a$,³ the effective interaction Lagrangian density then reads:

$$\mathcal{L}_{I\,\mathrm{eff}} = -\frac{g_o}{2}\, f_{abc} \big(\partial_\mu A_{a\nu} - \partial_\nu A_{a\mu}\big) A_b^\mu A_c^\nu - \frac{g_o^2}{4}\, f_{abc} A_b^\mu A_c^\nu\, f_{ade} A_{d\mu} A_{e\nu} + g_o\, \bar{\psi}_j \gamma^\mu A_\mu \psi_j - g_o f_{acb}\, \partial_\mu \bar{\omega}_a\, A_c^\mu\, \omega_b. \tag{32.5}$$

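In n-dimensional space, and in particular for n = 4, one has the following integral, which often occurs in evaluating Feynman integrals,

$$\int \frac{d^n k}{(2\pi)^n}\, \frac{1}{\big(k^2 + M^2 - i\epsilon\big)^{\nu}} = \frac{i}{(4\pi)^{n/2}}\, \frac{1}{\big(M^2 - i\epsilon\big)^{\nu - (n/2)}}\, \frac{\Gamma\big(\nu - \frac{n}{2}\big)}{\Gamma(\nu)}, \tag{32.6}$$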

1 Proper “symmetrizations” of products of fields are understood.
2 For detailed studies of zero mass limits of renormalized Feynman amplitudes, see Manoukian [9].
3 See Eqs. (23.33) and (24.27).

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_32


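which exists for ν > n/2, and where Γ(z) is the gamma function. One may then define an integral by an analytic continuation of the spacetime dimension to D as follows⁴

$$\int \frac{d^D k}{(2\pi)^D}\, \frac{1}{\big(k^2 + M^2 - i\epsilon\big)^{\nu}} = \frac{i}{(4\pi)^{D/2}}\, \frac{1}{\big(M^2 - i\epsilon\big)^{\nu - (D/2)}}\, \frac{\Gamma\big(\nu - \frac{D}{2}\big)}{\Gamma(\nu)}, \tag{32.7}$$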

and dimensional regularization consists in choosing D = 4 − ε, with ε > 0 arbitrarily small. In studying dimensionally regularized integrals, some of the following basic properties of Γ(z) are quite useful:

$$\Gamma(z + 1) = z\, \Gamma(z), \qquad \Gamma\Big(1 - \frac{\varepsilon}{2}\Big) = 1 + \frac{\varepsilon}{2}\, \gamma_E + O(\varepsilon^2), \qquad \Gamma\Big(\frac{\varepsilon}{2}\Big) = \frac{2}{\varepsilon} - \gamma_E + O(\varepsilon), \tag{32.8}$$

where $\gamma_E = 0.5772157\ldots$ is Euler’s constant. Clearly, by choosing the dimensionality of spacetime D = 4 − ε < 4, a logarithmically divergent integral becomes finite. It develops, however, a singularity of order O(1/ε) for ε ≈ 0. Some of the properties satisfied by the gamma matrices $\gamma^\mu$ in such a D-dimensional spacetime are given in Box 32.1. Box 32.2 provides a panorama of some dimensionally regularized integrals, not only for ultraviolet divergent integrals but for infrared divergent ones as well. These integrals may be directly applied to obtain the explicit expressions of the self-energy and vertex parts to leading order in perturbation theory given below, and their evaluation may be carried out in a straightforward manner. We note that the $\gamma^\mu$ of QED is replaced in QCD by $\gamma^\mu t_a$, with the electron replaced by the quarks and the photon by gluons. Thus obtaining the expressions of the quark self-energy and vertex parts is facilitated by those in QED. We work explicitly in the Feynman gauge, in which the inverse of a free gluon propagator is given by $D^{-1}_{ab\mu\nu}(Q) = Q^2\, \delta_{ab}\, \eta_{\mu\nu}$. In particular, the inverse of the propagator of a given quark, in massless QCD, is given by

$$S^{-1}_{\ell s}(p) = \delta_{\ell s}\, \gamma p + i g^2\, (t_a t_a)_{\ell s} \int \frac{(dk)}{(2\pi)^4}\, \frac{\gamma^\mu \big[-\gamma (p - k)\big] \gamma_\mu}{(p - k)^2\, k^2}\bigg|_{\mathrm{Reg}}, \tag{32.9}$$

which, from the properties of the gamma matrices in Box 32.1 and the explicit integrals in Box 32.2, gives

$$S^{-1}_{\ell s}(p) = \delta_{\ell s} \Big[1 + g^2 \Big(1 - \frac{\varepsilon}{2}\Big) C_F\, F(\mu_D^2/p^2, \varepsilon)\Big] \gamma p, \qquad C_F = (N^2 - 1)/2N. \tag{32.10}$$

Box 32.1 Some properties of the gamma matrices in D = 4 − ε dimensions

$$\{\gamma^\mu, \gamma^\nu\} = -2\, \eta^{\mu\nu} I, \qquad \mathrm{Tr}[\gamma^\mu \gamma^\nu] = -4\, \eta^{\mu\nu}, \qquad \eta^\mu{}_\mu = (4 - \varepsilon),$$

$$\gamma^\mu \gamma_\mu = -(4 - \varepsilon)\, I, \qquad \mathrm{Tr}\, I = 4, \qquad \gamma^\mu \gamma^\nu \gamma_\mu = (2 - \varepsilon)\, \gamma^\nu,$$

$$\mathrm{Tr}[\gamma^\alpha \gamma^\beta \gamma^\mu \gamma^\nu] = 4 \big(\eta^{\alpha\beta} \eta^{\mu\nu} - \eta^{\alpha\mu} \eta^{\beta\nu} + \eta^{\alpha\nu} \eta^{\beta\mu}\big), \qquad \gamma^\sigma\, \gamma k\, \gamma^\mu\, \gamma k\, \gamma_\sigma = \gamma^\sigma \gamma^\alpha \gamma^\mu \gamma^\beta \gamma_\sigma\, k_\alpha k_\beta.$$

4 Do not confuse the ϵ of the −iϵ prescription in the denominators with the dimensional regularization parameter ε.
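The small-ε expansions of the gamma function quoted in (32.8) are easy to test numerically. The sketch below is an illustration only, using Python’s standard `math.gamma`; the function names are hypothetical. It compares the exact values with the truncated expansions and shows that the remainders shrink as O(ε²) and O(ε), respectively.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant gamma_E


def gamma_expansion_errors(eps):
    """Return (exact - truncated expansion) for the two small-eps formulas in (32.8)."""
    err_regular = math.gamma(1 - eps / 2) - (1 + (eps / 2) * EULER_GAMMA)
    err_pole = math.gamma(eps / 2) - (2 / eps - EULER_GAMMA)
    return err_regular, err_pole


def recursion_holds(z, tol=1e-12):
    """Check Gamma(z + 1) = z * Gamma(z) at a sample point z > 0."""
    lhs = math.gamma(z + 1)
    return abs(lhs - z * math.gamma(z)) < tol * abs(lhs)


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        err_regular, err_pole = gamma_expansion_errors(eps)
        print(f"eps={eps:g}  O(eps^2) remainder={err_regular:.3e}  "
              f"O(eps) remainder={err_pole:.3e}")
```

Reducing ε by a factor of ten should reduce the first remainder by roughly a factor of one hundred and the second by roughly a factor of ten, which is the behavior expected from the orders indicated in (32.8).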


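Box 32.2 A panorama of dimensionally regularized integrals with ε > 0, δ > 0 corresponding, respectively, to ultraviolet and infrared cutoffs in D = 4 − ε, D′ = 4 + δ dimensions, and appropriate mass scales $\mu_D$, $\mu_{D'}$. In the integrals involving $p_1$ and $p_2$, $p_1^2 = 0 = p_2^2$, and $Q^2 = -2\, p_1 p_2$.

$$\int \frac{(dk)}{(2\pi)^4}\, \frac{1}{(Q - k)^2\, k^2}\bigg|_{\mathrm{Reg}} = i\, F(\mu_D^2/Q^2, \varepsilon), \qquad F(\mu_D^2/Q^2, \varepsilon) \equiv \frac{1}{(4\pi)^{D/2}} \Big(\frac{\mu_D^2}{Q^2}\Big)^{\varepsilon/2} \frac{\Gamma^2\big(1 - \frac{\varepsilon}{2}\big)\, \Gamma\big(\frac{\varepsilon}{2}\big)}{\Gamma(2 - \varepsilon)}.$$

$$\int \frac{(dk)}{(2\pi)^4}\, \frac{1}{(Q - k)^2}\bigg|_{\mathrm{Reg}} = 0, \qquad \int \frac{(dk)}{(2\pi)^4}\, \frac{k_\mu}{(Q - k)^2\, k^2}\bigg|_{\mathrm{Reg}} = i\, \frac{Q_\mu}{2}\, F(\mu_D^2/Q^2, \varepsilon).$$

$$\int \frac{(dk)}{(2\pi)^4}\, \frac{k_\mu k_\nu}{(Q - k)^2\, k^2}\bigg|_{\mathrm{Reg}} = \frac{i}{4 (3 - \varepsilon)} \Big[-\eta_{\mu\nu}\, Q^2 + (4 - \varepsilon)\, Q_\mu Q_\nu\Big] F(\mu_D^2/Q^2, \varepsilon).$$

$$\int \frac{(dk)}{(2\pi)^4}\, \frac{1}{(p_1 - k)^2 (p_2 - k)^2\, k^2}\bigg|_{\mathrm{Reg}} = \frac{i}{(4\pi)^{D'/2}}\, \frac{1}{Q^2} \Big(\frac{\mu_{D'}^2}{Q^2}\Big)^{-\delta/2} \frac{\Gamma^2\big(\frac{\delta}{2}\big)\, \Gamma\big(1 - \frac{\delta}{2}\big)}{\Gamma(1 + \delta)}.$$

$$\int \frac{(dk)}{(2\pi)^4}\, \frac{k^\mu}{(p_1 - k)^2 (p_2 - k)^2\, k^2}\bigg|_{\mathrm{Reg}} = \frac{i\, (p_1^\mu + p_2^\mu)}{(4\pi)^{D'/2}}\, \frac{1}{Q^2} \Big(\frac{\mu_{D'}^2}{Q^2}\Big)^{-\delta/2} \frac{\Gamma\big(\frac{\delta}{2}\big)\, \Gamma\big(1 - \frac{\delta}{2}\big)\, \Gamma\big(1 + \frac{\delta}{2}\big)}{\Gamma(2 + \delta)}.$$

$$\int \frac{(dk)}{(2\pi)^4}\, \frac{k^\mu k^\nu}{(p_1 - k)^2 (p_2 - k)^2\, k^2}\bigg|_{\mathrm{Reg}} = \frac{i}{(4\pi)^{D/2}}\, \frac{\eta^{\mu\nu}}{2} \Big(\frac{\mu_D^2}{Q^2}\Big)^{\varepsilon/2} \frac{\Gamma\big(\frac{\varepsilon}{2}\big)\, \Gamma^2\big(1 - \frac{\varepsilon}{2}\big)}{\Gamma(3 - \varepsilon)}$$
$$\qquad + \frac{i}{(4\pi)^{D'/2}}\, \frac{1}{Q^2} \Big(\frac{\mu_{D'}^2}{Q^2}\Big)^{-\delta/2} \Gamma\Big(1 - \frac{\delta}{2}\Big) \bigg[\big(p_1^\mu p_1^\nu + p_2^\mu p_2^\nu\big)\, \frac{\Gamma\big(\frac{\delta}{2}\big)\, \Gamma\big(2 + \frac{\delta}{2}\big)}{\Gamma(3 + \delta)} + \big(p_1^\mu p_2^\nu + p_1^\nu p_2^\mu\big)\, \frac{\Gamma^2\big(1 + \frac{\delta}{2}\big)}{\Gamma(3 + \delta)}\bigg].$$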

On the other hand, for the inverse of a gluon propagator, we have

$$\tilde{D}^{-1}_{ab\mu\nu}(Q) = Q^2\, \delta_{ab}\, \eta_{\mu\nu} + \Pi^{(1)}_{ab\mu\nu}(Q) + \Pi^{(2)}_{ab\mu\nu}(Q) + \Pi^{(3)}_{ab\mu\nu}(Q), \tag{32.11}$$

with the quarks represented by solid lines, the gluons represented by curly spring-like ones, while the dashed lines denote the ghosts. The corresponding integrals are given in Box 32.3, together with their analytic expressions. Here it is important to note that the presence of ghosts, i.e., the last diagram, ensures the gauge invariance of the propagator, and the familiar transversality character, involving the multiplicative factor $(\eta_{\mu\nu} Q^2 - Q_\mu Q_\nu)$, arises only if we add $\Pi^{(3)}_{ab\mu\nu}(Q)$ to $\Pi^{(2)}_{ab\mu\nu}(Q)$ (see Box 32.3).


For a quark-gluon vertex part, we have

$$\Gamma^{\mu}_{a \ell s}(p_1, p_2) = [t_a]_{\ell s}\, \gamma^\mu + \Lambda^{(1)\mu}_{a \ell s}(p_1, p_2) + \Lambda^{(2)\mu}_{a \ell s}(p_1, p_2), \tag{32.12}$$


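Box 32.4 Renormalization constants

We introduce the renormalization constants $Z_2$, $Z_3$, $Z_1$, defined at a high-energy scale μ and associated, respectively, with the quark and gluon wavefunction renormalizations and the quark-gluon vertex renormalization. With $F(\mu_D^2/Q^2, \varepsilon)$ defined in Box 32.2, and using the fact that $(\mu_D^2/p^2)^{\varepsilon/2} = 1 + (\varepsilon/2) \ln(\mu_D^2/p^2) + \cdots$, as well as the properties of the gamma function in (32.8) and the elementary property $\ln(\mu_D^2/p^2) = \ln(\mu^2/p^2) - \ln(\mu^2/\mu_D^2)$, we may write the inverse of a quark propagator in (32.10) for ε ≈ 0, with $a_1$ some finite constant resulting in the limit of small ε, as:

$$S^{-1}_{ij}(p) = \delta_{ij}\, \frac{1}{Z_2} \Big[1 + \frac{g^2}{(4\pi)^2}\, C_F \Big(\ln\frac{\mu^2}{p^2} + a_1\Big)\Big] \gamma p,$$

$$\bullet \quad \frac{1}{Z_2} = 1 + \frac{g^2}{(4\pi)^2}\, C_F \Big[\frac{2}{\varepsilon} - \gamma_E - \ln\Big(\frac{\mu^2}{4\pi \mu_D^2}\Big)\Big].$$

Similarly

$$\tilde{D}_{ab\mu\nu}(Q) = \delta_{ab}\, \eta_{\mu\nu}\, \frac{1}{Q^2} + \delta_{ab} \Big[Z_3 \big(1 - A(\mu^2/Q^2)\big) - 1\Big] \Big(\eta_{\mu\nu} - \frac{Q_\mu Q_\nu}{Q^2}\Big) \frac{1}{Q^2},$$

$$A(\mu^2/Q^2) = \frac{g^2}{(4\pi)^2} \Big(\frac{2}{3}\, n_f - \frac{5}{3}\, C_A\Big) \ln\frac{\mu^2}{Q^2} + a_2, \quad \text{where } a_2 \text{ is some finite constant},$$

$$\bullet \quad \frac{1}{Z_3} = 1 + \frac{g^2}{(4\pi)^2} \Big(\frac{2}{3}\, n_f - \frac{5}{3}\, C_A\Big) \Big[\frac{2}{\varepsilon} - \gamma_E - \ln\Big(\frac{\mu^2}{4\pi \mu_D^2}\Big)\Big].$$

Finally

$$\Gamma^{\mu}_{a i j} = [t_a]_{ij}\, \gamma^\mu\, \frac{1}{Z_1} \Big[1 + \frac{g^2}{(4\pi)^2} \big(t_b t_b + C_A\big) \Big(\ln\frac{\mu^2}{Q^2} + a_3\Big) + \frac{g^2}{(4\pi)^2}\, h_{\mathrm{IR}}\Big(\frac{\mu_{D'}^2}{Q^2}, \delta\Big)\Big],$$

where $h_{\mathrm{IR}}(\mu_{D'}^2/Q^2, \delta)$ is a function of the infrared regularization parameters and is independent of the ultraviolet ones. Its explicit expression will not be needed in the sequel.

$$\bullet \quad \frac{1}{Z_1} = 1 + \frac{g^2}{(4\pi)^2}\, [C_F + C_A] \Big[\frac{2}{\varepsilon} - \gamma_E - \ln\Big(\frac{\mu^2}{4\pi \mu_D^2}\Big)\Big].$$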

with the corresponding integrals and the analytic expressions given in Box 32.3. Given the explicit expressions of a quark and a gluon propagator, as well as of a quark-gluon vertex part, in Eqs. (32.6)–(32.12) and Box 32.3, the renormalization constants are determined in Box 32.4, evaluated at a large scale parameter μ. The renormalized quark-gluon coupling parameter may then be defined as:

$$g_s(\mu^2) = \frac{\big(\sqrt{Z_2}\big)^2 \sqrt{Z_3}}{Z_1}\, g_o, \qquad \alpha_s(\mu^2) = g_s^2(\mu^2)/4\pi, \qquad \alpha_{os} = g_o^2/4\pi, \tag{32.13}$$

and, upon inserting the renormalization constants to leading order, one obtains

$$\alpha_s(\mu^2) = \alpha_{os} \Big[1 + \frac{\alpha_s(\mu^2)}{4\pi} \Big(\frac{11 N}{3} - \frac{2 n_f}{3}\Big) \Big(\frac{2}{\varepsilon} - \gamma_E - \ln\Big(\frac{\mu^2}{4\pi \mu_D^2}\Big)\Big)\Big]. \tag{32.14}$$

It is convenient to set

$$\beta_0 = \frac{11 N - 2 n_f}{12\pi}, \qquad \Big[\beta_0 = \frac{33 - 2 n_f}{12\pi} \ \text{for SU(3)}\Big], \tag{32.15}$$

to rewrite (32.14) as

$$\Big[\frac{2}{\varepsilon} - \gamma_E + \ln(4\pi \mu_D^2)\Big] - \frac{1}{\beta_0\, \alpha_{os}} = -\frac{1}{\beta_0\, \alpha_s(\mu^2)} + \ln(\mu^2). \tag{32.16}$$

Since the left-hand side of the above equation is independent of $\mu^2$, we may infer that for all $\mu^2$ such that $n_f$ corresponds to the quark flavors with masses squared $\ll \mu^2$, the combination on the right-hand side of the equation is independent of $\mu^2$ as well. That is, it is invariant. Thus, taking the derivative of the r.h.s. of the above equation with respect to $\mu^2$ and setting it equal to zero, we have the remarkable equation

$$\mu^2\, \frac{d}{d\mu^2}\, \alpha_s(\mu^2) = -\beta_0\, \alpha_s^2(\mu^2), \tag{32.17}$$

where, in general, $(33 - 2 n_f) > 0$, and hence $\beta_0$ is strictly positive. The minus sign on the r.h.s. of the above equation is the origin of asymptotic freedom,⁵ showing that the effective coupling $\alpha_s(\mu^2)$, denoting the strength of the strong interaction, decreases with increasing $\mu^2$, which makes it quite useful in carrying out perturbation analysis at high energies in QCD. This is unlike the situation in QED, as discussed below Eq. (31.1) in Chap. 31.⁶ It is interesting to note that, due to the $\mu^2$ independence of the combination on the r.h.s. of (32.16), given a typical high-energy squared value $Q^2$ associated with a given physical process, one obtains, in particular, the (invariant) renormalization group equation, as well as the solution for the effective coupling $\alpha_s(Q^2)$:

$$-\frac{1}{\beta_0\, \alpha_s(Q^2)} + \ln(Q^2) = -\frac{1}{\beta_0\, \alpha_s(\mu^2)} + \ln(\mu^2), \tag{32.18}$$

$$\alpha_s(Q^2) = \frac{\alpha_s(\mu^2)}{1 + \beta_0\, \alpha_s(\mu^2)\, \ln(Q^2/\mu^2)}. \tag{32.19}$$
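The one-loop solution (32.19) can be evaluated directly. The sketch below is an illustration under stated assumptions: the reference values μ = 91.19 GeV, $\alpha_s(\mu^2)$ = 0.118 and $n_f$ = 5 are typical inputs chosen here, not values taken from the text, and the function names are hypothetical. It also checks numerically that the combination in (32.18) is invariant and that (32.17) holds.

```python
import math


def beta0(n_f, n_colors=3):
    """One-loop coefficient of (32.15): (11 N - 2 n_f) / (12 pi)."""
    return (11 * n_colors - 2 * n_f) / (12 * math.pi)


def alpha_s(q2, mu2, alpha_mu, n_f, n_colors=3):
    """Effective coupling of (32.19), given alpha_s(mu^2) = alpha_mu."""
    b0 = beta0(n_f, n_colors)
    return alpha_mu / (1 + b0 * alpha_mu * math.log(q2 / mu2))


def rg_invariant(q2, alpha, n_f, n_colors=3):
    """The scale-independent combination -1/(beta0 alpha_s(Q^2)) + ln(Q^2) of (32.18)."""
    return -1 / (beta0(n_f, n_colors) * alpha) + math.log(q2)


def log_derivative(q2, mu2, alpha_mu, n_f, step=1e-4):
    """Central-difference estimate of Q^2 d(alpha_s)/d(Q^2), i.e. d(alpha_s)/d ln(Q^2)."""
    up = alpha_s(q2 * math.exp(step), mu2, alpha_mu, n_f)
    down = alpha_s(q2 * math.exp(-step), mu2, alpha_mu, n_f)
    return (up - down) / (2 * step)


# Illustrative inputs (assumptions, not from the text).
MU = 91.19          # GeV
ALPHA_MU = 0.118
N_F = 5

if __name__ == "__main__":
    for q in (10.0, 91.19, 1000.0):  # GeV
        a = alpha_s(q * q, MU * MU, ALPHA_MU, N_F)
        print(f"Q = {q:8.2f} GeV   alpha_s = {a:.4f}   "
              f"invariant = {rg_invariant(q * q, a, N_F):.6f}")
```

Keeping $n_f$ fixed over the whole range is a simplification; in practice the number of active flavors changes as quark thresholds are crossed.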

It is customary to introduce a mass scale Λ through the equation⁷

$$\alpha_s(\mu^2)\, \ln\Big(\frac{\mu^2}{\Lambda^2}\Big) = \frac{1}{\beta_0}, \tag{32.20}$$

which simplifies the expression for $\alpha_s(Q^2)$ in (32.19), giving

$$\alpha_s(Q^2) = \frac{1}{\beta_0\, \ln(Q^2/\Lambda^2)}, \tag{32.21}$$

which vanishes for $Q^2 \to \infty$. The part of the interaction which is responsible for asymptotic freedom, by giving rise to a positive $\beta_0$, is its non-abelian character inherent in the gluon self-interaction, which is absent for the photon in QED. For $n_f = 6$, the gluon self-interaction contribution to $\beta_0$ (the term $11N = 33$) is more than twice as big in magnitude as the remaining part of the interaction (the term $2 n_f = 12$), making the overall sign of $\beta_0$ positive. We have seen that asymptotic freedom shows that the strong interaction gets weaker at high energies, which is in conformity with the celebrated deep-inelastic electron-nucleon scattering experiments.⁸ Moreover, as mentioned earlier, it allows one to carry out perturbation analysis of the strong interaction at high energies. Elastic electron-proton scattering, via the exchange of a virtual photon $\gamma^*$, is depicted by the diagram on the left-hand side below, and may be described in terms of electric and magnetic form factors $G_E(Q^2)$, $G_M(Q^2)$, as functions of the momentum transfer squared $Q^2$ carried by the photon, which vanish quite rapidly for large $Q^2$. As $Q^2$ is increased further one reaches the so-called resonance region,⁹ beyond which one moves to the deep-inelastic region, with the scattering as shown in the right-hand diagram below, where “anything” refers to any particles that may be created in the process:

[Diagrams: (left) elastic electron-proton scattering via the exchange of a virtual photon γ*; (right) deep-inelastic scattering, electron + proton → electron + anything.]

5 Asymptotic freedom was discovered in 1973 by David Gross, Frank Wilczek and Hugh Politzer, showing renormalization theory in action. See their Nobel lectures: Gross [8], Wilczek [14], Politzer [12], and references in these papers. Pioneering but rather preliminary related work was carried out earlier by Vanyashin and Terentyev [13].
6 For higher order terms of the beta function, i.e., of the expression on the right-hand side of (32.17), see, e.g., Patrignani et al. [11].
7 The phenomenological parameter Λ is taken from experiments. Typical recommendations for the value of Λ are: 205–221 MeV, 286–306 MeV, 329–349 MeV, for $n_f$ = 5, 4, 3, respectively. (See, e.g., Beringer et al. [1]).
8 See, e.g., the review by Friedman and Kendall [7]. For early experiments, see Bloom et al. [4]; Breidenbach et al. [5].

In the deep-inelastic region, the reaction changes “character” and the scattering process may be described in terms of two dimensionless functions $F_1(Q^2, x)$, $F_2(Q^2, x)$, referred to as structure functions, where

$$x = \frac{Q^2}{2 M_p\, \nu}, \qquad \nu = -\frac{P Q}{M_p}, \qquad Q = k - k', \tag{32.22}$$

which, instead of vanishing for large $Q^2$ as the form factors do, become, to a first-order approximation, independent of $Q^2$. That is, they become just functions of x < 1. This property is referred to as Bjørken scaling.¹⁰ This non-vanishing character of the structure functions seems to indicate the presence of almost point-like, i.e., structureless, particles within the proton, referred to as Feynman partons or just partons,¹¹ which almost do not interact with each other, consistent with asymptotic freedom. The virtual photon interacts with each of the charged constituents of the proton independently, and the scattering process may then be calculated from the incoherent sum of elastic scatterings off the charged constituents. All the constituents of the proton, quarks and gluons, are termed partons. Although in deep-inelastic scattering one is involved with high energies, the latter are practically finite. QCD corrections may then be implemented, where partons may split into partons,¹² giving rise to radiative corrections, and Bjørken scaling becomes broken by calculable logarithmic corrections in $Q^2$. Quantum field theory provides a picture of the proton as consisting not just of three quarks but of a multitude of partons as well. One of the interesting conclusions reached by such an elaborate analysis is that gluons carry about half of the momentum of the proton.¹³
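The kinematic variables in (32.22) can be made concrete with a short numerical sketch. It rests on assumptions not spelled out in the text: P is taken to be the proton four-momentum with the proton initially at rest, the electron mass is neglected, and k, k′ are the incoming and outgoing electron momenta with energies E, E′ and scattering angle θ, so that $Q^2 = 4 E E' \sin^2(\theta/2)$ and $\nu = E - E'$. The function names and the sample beam values are hypothetical.

```python
import math

M_PROTON = 0.938272  # GeV, proton mass


def dis_kinematics(e_in, e_out, theta):
    """Return (Q^2, nu, x) for electron scattering off a proton at rest.

    Assumes a massless electron: Q^2 = 4 E E' sin^2(theta/2), nu = E - E',
    and x = Q^2 / (2 M_p nu) as in (32.22). Energies in GeV, theta in radians.
    """
    q2 = 4.0 * e_in * e_out * math.sin(theta / 2.0) ** 2
    nu = e_in - e_out
    x = q2 / (2.0 * M_PROTON * nu)
    return q2, nu, x


def elastic_e_out(e_in, theta):
    """Outgoing electron energy for elastic scattering, where x = 1."""
    return e_in / (1.0 + (2.0 * e_in / M_PROTON) * math.sin(theta / 2.0) ** 2)


if __name__ == "__main__":
    theta = math.radians(10.0)
    q2, nu, x = dis_kinematics(20.0, 8.0, theta)
    print(f"inelastic: Q^2 = {q2:.3f} GeV^2, nu = {nu:.3f} GeV, x = {x:.4f}")
    q2, nu, x = dis_kinematics(20.0, elastic_e_out(20.0, theta), theta)
    print(f"elastic:   Q^2 = {q2:.3f} GeV^2, nu = {nu:.3f} GeV, x = {x:.4f}")
```

In this picture elastic scattering corresponds to x = 1, while inelastic events, in which the electron loses more energy at the same angle, have x < 1.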

References

1. Beringer, J., et al. (2012). Particle data group. Physical Review D, 86, 010001.
2. Bjørken, J. D. (1969). Asymptotic sum rules at infinite momentum. Physical Review, 179, 1547–1553.
3. Bjørken, J. D., & Paschos, E. A. (1969). Inelastic electron-proton and γ-proton scattering and the structure of the nucleon. Physical Review, 185, 1975–1982.
4. Bloom, E. D., et al. (1969). High-energy inelastic e − p scattering at 6° and 10°. Physical Review Letters, 23, 930–935.
5. Breidenbach, M., et al. (1969). Observed behavior of highly inelastic electron-proton scattering. Physical Review Letters, 23, 935–939.
6. Feynman, R. P. (1969). Very high-energy collisions of hadrons. Physical Review Letters, 23, 1415–1417.
7. Friedman, J. I., & Kendall, H. W. (1972). Deep-inelastic electron scattering. Annual Review of Nuclear Science, 22, 203–254.
8. Gross, D. (2005). The discovery of asymptotic freedom and the emergence of QCD. Reviews of Modern Physics, 77, 837–849.
9. Manoukian, E. B. (1983). Renormalization. New York: Academic Press.
10. Manoukian, E. B. (2016). Quantum Field Theory I: Foundations and abelian and non-abelian gauge theories. Switzerland: Springer.
11. Patrignani, C., et al. (2016). Particle data group. Chinese Physics C, 40, 100001, and 2017 update.
12. Politzer, D. (2005). The dilemma of attribution. Reviews of Modern Physics, 77, 851–856.
13. Vanyashin, V. S., & Terentyev, M. V. (1965). The vacuum polarization of a charged vector field. Soviet Physics JETP, 21, 375–380. [Original Russian version: Zhurnal Experimental’noi i Teoreticheskoi Fiziki, 48, 565–573 (1965).]
14. Wilczek, F. (2005). Asymptotic freedom: From paradox to paradigm. Reviews of Modern Physics, 77, 857–870.

9 A typical resonance is the Δ⁺ of mass approximately 1.232 GeV.
10 Bjørken [2]; Bjørken and Paschos [3].
11 Feynman [6].
12 For an explicit pedagogical treatment of parton splitting see: Manoukian [10], pp. 437–444.
13 For a fairly detailed treatment of deep-inelastic electron-proton scattering see: Manoukian [10], pp. 429–456, which should be beneficial to the serious reader.

33 Birth of the Electroweak Theory

Prerequisites Chaps. 17, 18 and 26

The early work on the weak interaction theory was carried out by Enrico Fermi¹ in an attempt to describe β decay, $n \to p + e^- + \tilde{\nu}_e$, incorporating the particle postulated by Wolfgang Pauli in 1930, which Pauli² called the neutron, while Fermi later referred to it as the neutrino. Now this particle is referred to as the anti-neutrino, due to the fact that different lepton numbers are assigned to the neutrino and anti-neutrino, consistent with other reactions, and due to the opposite helicities carried by them. Pauli actually considered the mass of the neutrino to be very small but non-zero.³ Fermi described weak interactions by taking the interaction Hamiltonian density to involve the product of four fermion fields at a point, having the structure $\mathcal{H}_I \sim G_F\, [(\bar{\psi}_1 O_i \psi_2)(\bar{\psi}_3 O_i \psi_4) + \mathrm{h.c.}]$, where the $O_i$ are, in general, expressed in terms of gamma matrices, and $G_F$ is the Fermi coupling, given by $G_F = 1.166 \times 10^{-5}/\mathrm{GeV}^2$. The latter constant has the dimensionality of [mass]⁻². Thus, from what we have seen on renormalization theory in Chap. 30, his underlying theory is non-renormalizable. As a matter of fact, consider some low-order radiative contributions to the scattering of four fermions:

By assigning a dimension $E^{-1}$ to every fermion propagator, and a dimension $E^4$ to every loop integration, we see that even when there are no derivative couplings, the degree of divergence of the diagrams, including radiative corrections, increases as $\Lambda^2, \Lambda^4, \ldots$ with the order of the interaction, as a function of an ultraviolet cut-off Λ, with no bound. The theory gets out of control, and no sense can be made of it. To improve on this divergence problem, an attempt was then made to describe weak interactions by introducing, in the process, intermediate massive vector bosons as mediators of the interactions, with two fermions and one vector boson meeting at a point, similar to what one encounters in QED with a (massless) vector boson being the photon, instead of the four fermions meeting at a point of the original Fermi theory. For example, the description of the beta decay process $n \to p + e^- + \tilde{\nu}_e$ in the old Fermi theory becomes replaced as follows:

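[Diagrams: (a) Fermi theory, with n, p, e⁻ and ν̃ₑ meeting at a single spacetime point x with coupling G_F; (b) the same process mediated by a W⁻ boson exchanged between spacetime points x and x′, with coupling g at each vertex.]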

where g is a dimensionless constant, and $W^-$ is a vector boson of negative charge which mediates an interaction between spacetime points x and x′, and is necessarily quite massive to be consistent with the Fermi description, which has a zero range of the interaction: $\delta^{(4)}(x - x')$, with the four fermion fields meeting at a point in spacetime. As we have seen in Chap. 16, the propagator of a massive vector boson, for large mass $M_W$, behaves like $\eta_{\mu\nu}\, \delta^{(4)}(x - x')/M_W^2$. Upon comparison of both descriptions above we may infer that

1 Fermi [2].
2 See, e.g., Pauli [10], for the historical account.
3 In this respect, see, e.g., Perrin [11], and Fermi [2].

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_33

$$G_F \approx \frac{g^2}{M_W^2}. \tag{33.1}$$

In the momentum description, the massive vector boson propagator behaves like $k^\mu k^\nu / k^2 M^2$ for $k^2 \to \infty$, which is unlike that of its massless counterpart, which instead vanishes rapidly. This makes the degree of divergence in such theories, involving non-vanishing longitudinal parts, in general, increase with the order of perturbation theory without bound. Consider, for example, radiative corrections to four-fermion scattering, where the wavy line represents a massive vector boson:

Again by power counting, assigning in the process a dimension $E^0 = 1$ to each wavy line representing a massive vector boson, a dimension $E^{-1}$ to every fermion propagator, and a dimension $E^4$ to every loop integration, we see that even when there are no derivative couplings, the degree of divergence of the diagrams, including radiative corrections, again increases, as in the Fermi theory, without bound in perturbation theory.⁴ Such a theory thus again becomes uncontrollable, with not much hope of making sense of it and of carrying out practical computations as one does in QED through the renormalization programme. A different approach had to be taken to deal with the weak interactions, as discussed next. The emergence of a gauge invariant theory, such as QED or its generalization, where the vector bosons of the theory are, a priori, massless, so that problems arising from longitudinal components of massive ones are avoided, is to be expected. At earlier stages, Julian Schwinger was the first to come up with the idea that vacuum expectation values of scalar fields could provide a way of breaking symmetries and of generating masses for fermions,⁵ and advanced as well the idea of the unification of the electromagnetic and weak interactions⁶ involving vector bosons.⁷ That such a unification may lead to a renormalizable theory was expressed by Glashow in his Ph.D. thesis in 1958 as a graduate student of Schwinger. In 1961, Glashow⁸ proposed an SU(2) × U(1) gauge symmetry for such a model. To deal with the problem of the masses of the vector bosons, Weinberg⁹ and independently Salam¹⁰ proposed a renormalizable theory in which the vector bosons are taken to be massless in the action integral and acquire masses, by spontaneous symmetry breaking, via the so-called Higgs mechanism, where the Higgs boson field has a non-vanishing vacuum expectation value and gives rise to masses for particles.
The associated photon field, however, remains massless, as it should, and the SU(2) × U(1) gauge symmetry is spontaneously broken to the U(1) gauge symmetry of phase transformations of electrodynamics. This theory is popularly known as the Weinberg–Salam–Glashow electroweak theory. By spontaneous symmetry breaking one means that although the underlying equations of the theory are symmetrical, they admit a solution which is not symmetrical. As we will see below and in the next chapter, the development of, a priori, such symmetrical equations for the theory in question is not difficult. With initially massless vector bosons the theory behaves well in the light of renormalization theory. Masses of the vector bosons are then generated by the Higgs mechanism. Needless to say, rather large masses of the vector bosons are necessary for consistency with the short range nature of the weak interactions dictated, a priori, by the Fermi description. The QED interaction as well as the interaction in the modified Fermi theory are mediated by vector bosons, with respective couplings ∼ e and ∼ g. In a unified description of electromagnetism and the weak interaction, one expects these two couplings to be comparable, i.e.,

4 The longitudinal part will not contribute if the vector boson is coupled to a conserved current. In QED, even if a longitudinal component arises because of a specific choice of a gauge, this problem is eliminated in computing gauge invariant expressions because the photon is coupled to a conserved current.
5 See Martin and Glashow [9], p. 16. See also Johnson [6], p. 96.
6 Schwinger [14].
7 The legendary Victor Weisskopf, in [17], p. 17, states that Schwinger was the first to suggest that the weak interactions should be interpreted as transmitted by boson fields, and that his original idea initiated an impressive development that culminated in the unification of the electromagnetic and weak interactions.
8 Glashow [4].
9 Weinberg [16].
10 Salam [13].

$$g^2 \approx e^2 = 4\pi\alpha, \qquad \text{where } \alpha \approx \frac{1}{137}, \quad G_F \approx 1.166 \times 10^{-5}/(\mathrm{GeV})^2. \tag{33.2}$$

From (33.1) and (33.2), we may then estimate the mass of the W boson, by inserting in the process the speed of light c, to be

$$M_W \approx \sqrt{\frac{4\pi\alpha}{G_F}} \approx 90\ \mathrm{GeV}/c^2, \tag{33.3}$$

which is in good agreement with the observed masses. Inserting the fundamental constants ħ and c, we may estimate the range of the weak interaction, from dimensional analysis corresponding to the exchange of a W boson, to be

$$R_W \approx \frac{\hbar c}{M_W c^2} \approx 2.2 \times 10^{-16}\ \mathrm{cm}. \tag{33.4}$$
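The estimates (33.3) and (33.4) amount to a few lines of arithmetic. The sketch below reproduces them with the inputs of (33.2); the value ħc ≈ 0.19733 GeV fm is a standard constant supplied here rather than quoted from the text, and the function names are hypothetical.

```python
import math

ALPHA = 1.0 / 137.0          # fine-structure constant, as in (33.2)
G_FERMI = 1.166e-5           # Fermi coupling in GeV^-2, as in (33.2)
HBAR_C_GEV_FM = 0.1973269804 # hbar * c in GeV fm (standard value, not from the text)
CM_PER_FM = 1.0e-13


def w_mass_estimate(alpha=ALPHA, g_fermi=G_FERMI):
    """Estimate M_W c^2 in GeV from g^2 ~ 4 pi alpha and G_F ~ g^2 / M_W^2."""
    return math.sqrt(4.0 * math.pi * alpha / g_fermi)


def weak_range_cm(m_w_gev):
    """Range hbar c / (M_W c^2) of the interaction, in cm."""
    return HBAR_C_GEV_FM / m_w_gev * CM_PER_FM


if __name__ == "__main__":
    m_w = w_mass_estimate()
    print(f"M_W ~ {m_w:.1f} GeV/c^2")
    print(f"R_W ~ {weak_range_cm(m_w):.2e} cm")
```

This is only an order-of-magnitude estimate, since it identifies g with e and ignores the numerical factors that relate $G_F$ to $g^2/M_W^2$ in the full theory.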

A key ingredient in the formulation of the weak interaction was that of parity violation, that is, violation of symmetry under space reflection. It was discovered in the 1950s,¹¹ and it was observed that neutrinos are left-handed,¹² that is, the spin of a neutrino projected along its momentum points in the direction opposite to the momentum (hence the anti-neutrinos are right-handed). The leptonic current corresponding to the outgoing electron and the anti-neutrino in β decay, mentioned above, was taken as $J^\mu = \bar{e}\, \gamma^\mu (1 - \gamma^5)\, \nu$ instead of $\bar{e}\, \gamma^\mu \nu$. Left and right handedness and the role of parity are described in Box 16.6 of Chap. 16, as well as the need for the presence of the $(1 - \gamma^5)$ matrix associated with projecting out “left-handedness”. To explain some processes which were observed not to happen, at some confidence level, such as $\mu^- \to e^- + \gamma$, and to describe other processes which did happen, such as $\mu^- \to e^- + \nu + \tilde{\nu}$, it was inferred that there are different¹³ neutrinos associated with $e^-$, $\mu^-$, and different quantum labelings (lepton numbers) were assigned to the pairs $(\nu_e, e^-)$, $(\nu_\mu, \mu^-)$, with electronic lepton-number $L_e = +1$ and muonic lepton-number $L_\mu = +1$, as well as corresponding values of “−1” for their respective anti-particles. (See also Chap. 18 on particles.) Separate conservation laws of electronic-type leptons and muonic-type leptons were then assumed. This, for example, explained why processes such as $\tilde{\nu}_\mu + p \to e^+ + n$ were not observed, while processes such as $\tilde{\nu}_\mu + p \to \mu^+ + n$ did happen. Landau,¹⁴ Lee and Yang,¹⁵ Salam,¹⁶ and Wu et al.¹⁷ assumed that the neutrino is massless. By the late fifties, Feynman and Gell-Mann,¹⁸ and notably Sudarshan and Marshak,¹⁹ suggested that all the particles, massive and massless, are to be taken left-handed in weak interaction theory, generating what is called the V − A theory, involving vector minus axial vector (charge changing) currents.
The effective Fermi interaction Lagrangian density, based on four fermions at a point, having the structure of a V − A theory and incorporating parity violation, turns out to provide a good description of μ decay ($\mu^- \to e^- + \tilde{\nu}_e + \nu_\mu$), and may be written as

$$\mathcal{L}_F = \frac{G_F}{\sqrt{2}}\, J_\alpha^\dagger J^\alpha, \qquad J^\alpha = \Big[\bar{e}\, \gamma^\alpha (1 - \gamma^5)\, \nu_e + \bar{\mu}\, \gamma^\alpha (1 - \gamma^5)\, \nu_\mu\Big], \tag{33.5}$$

from which the μ decay process may be extracted, and the Fermi constant may be determined experimentally. After the discovery of parity violation, as discussed above, the charge exchanging interaction between leptons and hadrons, prior to the birth of the electroweak theory, was taken to have the structure

$$\frac{1}{\sqrt{2}}\, G_F \Big[\bar{e}\, \gamma_\rho (1 - \gamma^5)\, \nu_e + \bar{\mu}\, \gamma_\rho (1 - \gamma^5)\, \nu_\mu\Big] J_h^\rho + \mathrm{h.c.}, \tag{33.6}$$

11 For the experiment, see Wu et al. [18]. See also Chap. 18 for parity violation in the weak interactions.
12 The corresponding experiment was carried out by Goldhaber et al. [5].
13 The observation that there are different types of neutrinos was made by Danby et al. [1].
14 Landau [8].
15 Lee and Yang [7].
16 Salam [12].
17 Wu et al. [18].
18 Feynman and Gell-Mann [3].
19 Sudarshan and Marshak [15].


involving left-handed leptons (and right-handed for their anti-particles), where $J_h^\rho$ is a hadronic current, with an inherent universality of the processes involving the pairs $(\nu_e, e^-)$ and $(\nu_\mu, \mu^-)$ with equal strength, as shown by the common coupling constant $G_F$. Upon including the pair $(\nu_\tau, \tau^-)$, using the anti-commutativity of $\gamma^5$ with the gamma matrices, and using the notation

$$\ell_1 = \begin{pmatrix} \nu_e \\ e^- \end{pmatrix}_L, \qquad \ell_2 = \begin{pmatrix} \nu_\mu \\ \mu^- \end{pmatrix}_L, \qquad \ell_3 = \begin{pmatrix} \nu_\tau \\ \tau^- \end{pmatrix}_L, \tag{33.7}$$

where L stands for left-handed, and the fact that neutrinos carry no electromagnetic charge, we may write the charge leptonic-current as²⁰

$$J^\rho = \sum_i \bar{\ell}_i\, \gamma^\rho\, T^-\, \ell_i, \qquad T^\pm = T^1 \pm i\, T^2, \qquad T^i = \frac{1}{2}\, \sigma^i, \tag{33.8}$$

with the $\sigma^i$ denoting the Pauli matrices,²¹ not involving the neutrinos as is easily verified. Invoking the conservation of the different lepton numbers, and due to the very structure of the charge current in (33.8), these left-handed lepton pairs are considered, in the simplest possible way, as doublets of an SU(2) group, referred to as weak iso-doublets, with the latter group often denoted by SU(2)$_L$ to remind us that these lepton doublets are left-handed. In setting up the original electroweak theory, the neutrinos were assumed to be massless, as we do here,²² and, as seen in Box 16.6 in Chap. 16,²³ parity violation allows only one helicity for them, and the underlying theory would not involve right-handed neutrinos. Accordingly, the right-handed leptons $e_R^-$, $\mu_R^-$, and $\tau_R^-$, respectively, would not have right-handed neutrinos to be paired with, and they are considered as singlets under the group. For convenience, we denote these charged right-handed leptons by $r_i$, i = 1, 2, 3. The SU(2) transformation rules of the (left-handed) iso-doublets $\ell_i$, and of the singlets $r_i$, are then simply given by (see Eqs. (17.9), (17.15) and Box 17.2)

$$\ell_i \to \exp[\, i\, \boldsymbol{\varphi} \cdot \boldsymbol{\sigma}/2\, ]\, \ell_i, \qquad r_i \to r_i. \tag{33.9}$$
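The transformation rule (33.9) can be illustrated numerically. The sketch below is an illustration only: it uses the closed form $\exp[i \boldsymbol{\varphi} \cdot \boldsymbol{\sigma}/2] = \cos(|\boldsymbol{\varphi}|/2) + i\, (\hat{\boldsymbol{\varphi}} \cdot \boldsymbol{\sigma}) \sin(|\boldsymbol{\varphi}|/2)$, treats the two components of a doublet as plain complex numbers, and uses hypothetical function names. It checks that the doublet transformation is unitary with unit determinant, while a singlet is left unchanged.

```python
import math


def su2_matrix(phi):
    """exp(i phi.sigma / 2) via cos(|phi|/2) + i (n.sigma) sin(|phi|/2)."""
    p1, p2, p3 = phi
    norm = math.sqrt(p1 * p1 + p2 * p2 + p3 * p3)
    if norm == 0.0:
        return [[1 + 0j, 0j], [0j, 1 + 0j]]
    n1, n2, n3 = p1 / norm, p2 / norm, p3 / norm
    c, s = math.cos(norm / 2), math.sin(norm / 2)
    return [
        [c + 1j * s * n3, 1j * s * (n1 - 1j * n2)],
        [1j * s * (n1 + 1j * n2), c - 1j * s * n3],
    ]


def transform_doublet(phi, doublet):
    """Apply (33.9) to a doublet given as [upper, lower] complex components."""
    u = su2_matrix(phi)
    return [u[0][0] * doublet[0] + u[0][1] * doublet[1],
            u[1][0] * doublet[0] + u[1][1] * doublet[1]]


def transform_singlet(phi, singlet):
    """Singlets do not transform under SU(2): r -> r."""
    return singlet


def is_unitary(u, tol=1e-12):
    """Check U U^dagger = 1 entry by entry."""
    for i in range(2):
        for j in range(2):
            entry = sum(u[i][k] * u[j][k].conjugate() for k in range(2))
            if abs(entry - (1 if i == j else 0)) > tol:
                return False
    return True


def determinant(u):
    return u[0][0] * u[1][1] - u[0][1] * u[1][0]


if __name__ == "__main__":
    phi = (0.3, -1.1, 0.7)
    u = su2_matrix(phi)
    print("unitary:", is_unitary(u), " det:", determinant(u))
    print("doublet:", transform_doublet(phi, [1.0 + 0j, 0j]))
```

For a rotation about the third axis, φ = (0, 0, φ₃) with φ₃ > 0, the upper and lower components simply acquire the phases exp(±iφ₃/2), reflecting their third components of weak isospin ±1/2.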

The weak isospin quantum number of the doublets is given by $I_L = 1/2$, with the third components of weak isospin for the upper and lower entries given by $I_3 = \pm 1/2$, respectively. On the other hand, $I_R = 0$. In terms of the electromagnetic charge Q and the third component of the weak isospin $I_3$, we have the following equations for the left-handed (L) and right-handed (R) leptons:

$$L:\ Q = I_3 - \frac{1}{2}, \qquad R:\ Q = I_3 - 1 \quad (I_3 = 0). \tag{33.10}$$

This set of equations certainly calls for a new quantum number, which we denote by Y, and we rewrite the above equations simply as

$$Q = I_3 + \frac{1}{2}\, Y, \qquad L:\ Y = -1, \qquad R:\ Y = -2. \tag{33.11}$$
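The assignments in (33.10) and (33.11) can be tabulated and checked mechanically. The following sketch is an illustration with a hypothetical table layout; it lists the first-generation leptons with the quantum numbers given above and verifies Q = I₃ + Y/2 with exact rational arithmetic. The other two generations carry the same assignments.

```python
from fractions import Fraction

# (label, chirality, I_3, Y, expected charge Q in units of e)
LEPTONS = [
    ("nu_e", "L", Fraction(1, 2), -1, 0),
    ("e-", "L", Fraction(-1, 2), -1, -1),
    ("e-", "R", Fraction(0), -2, -1),
]


def charge(i3, hypercharge):
    """Electromagnetic charge from (33.11): Q = I_3 + Y / 2."""
    return i3 + Fraction(hypercharge, 2)


def check_assignments(table=LEPTONS):
    """Return True if every row satisfies Q = I_3 + Y / 2."""
    return all(charge(i3, y) == q for _, _, i3, y, q in table)


if __name__ == "__main__":
    for label, chirality, i3, y, q in LEPTONS:
        print(f"{label:5s} {chirality}  I3 = {str(i3):>4s}  Y = {y:2d}  "
              f"Q = {charge(i3, y)}")
    print("all consistent:", check_assignments())
```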

This new quantum number is called weak hypercharge. In turn, this suggests introducing the new symmetry group U(1)$_Y$ to account for this quantum number. The combined symmetry group generated thus far is then SU(2)$_L$ × U(1)$_Y$, with $Y$ corresponding to the weak hypercharge. Since the weak hypercharge of the right-handed leptons $r_i$ is twice that of the left-handed ones $\ell_i$, we have the following transformation rules corresponding to the U(1)$_Y$ group as phase transformations

$$r \to e^{-i\beta}\, r, \qquad \ell \to e^{-i\beta/2}\, \ell. \tag{33.12}$$

20 In this chapter, we avoid using the standard notation $\boldsymbol{\tau} = \boldsymbol{\sigma}/2$ to avoid confusion with the notation $\tau$ used for the tau particle.
21 In Chap. 17, we have seen that the Pauli matrices (divided by two) are generators of SU(2) transformations.
22 The problem of the neutrino masses, together with the fact that they are small, is addressed in the next chapter.
23 Parity exchanges left-handedness ⇔ right-handedness, as seen at the end of Box 16.6.
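The bookkeeping behind (33.10) and (33.11) can be checked mechanically. The following is a minimal sketch, not part of the original treatment: the dictionary `LEPTONS` and the helper `weak_hypercharge` are illustrative names, and the $(I_3, Q)$ entries are simply read off from the text for one lepton generation.

```python
from fractions import Fraction

# (I3, Q) for one lepton generation, read off from the text around (33.10).
# The names used here are illustrative, not notation from the book.
LEPTONS = {
    "nu_L": (Fraction(1, 2), Fraction(0)),    # upper entry of the doublet
    "e_L": (Fraction(-1, 2), Fraction(-1)),   # lower entry of the doublet
    "e_R": (Fraction(0), Fraction(-1)),       # right-handed singlet
}


def weak_hypercharge(i3, q):
    """Solve Q = I3 + Y/2 of (33.11) for Y."""
    return 2 * (q - i3)


HYPERCHARGE = {name: weak_hypercharge(i3, q) for name, (i3, q) in LEPTONS.items()}

for name, y in HYPERCHARGE.items():
    print(f"{name}: Y = {y}")
```

Both members of the left-handed doublet come out with the same $Y$, as they must for $Y$ to label the doublet as a whole, while the singlet carries twice that value.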


Promoting the transformations in (33.9) and (33.12) to local ones necessitates the introduction of four gauge vector fields, namely three vector bosons $W_1^\rho$, $W_2^\rho$, $W_3^\rho$ for the SU(2)$_L$ group and one vector boson $B^\rho$ for the U(1)$_Y$ group, to restore the symmetry in a Lagrangian density involving the above-mentioned leptons. From what we have learnt from QCD, except that now we are working with an SU(2) group and are dealing with three vector bosons instead of eight, and from the observation that the electromagnetic charge in QED is now replaced by the weak hypercharge, one may readily write down the Lagrangian density for the system of leptons and vector bosons. This will be done in Chap. 35 on the intricacies of the electroweak theory. In summary, we intend to write down first a Lagrangian density, involving the leptons and the four massless vector bosons, which is invariant under the symmetry group SU(2)$_L$ × U(1)$_Y$. We will see that this requires taking the initial masses of the charged leptons equal to zero as well. By the Higgs mechanism, three massive vector bosons $W_\mu^+$, $W_\mu^-$, $Z_\mu^0$ and one massless vector boson $A_\mu$, identified with the photon, arise in the theory, the charged leptons develop masses, and a massive Higgs boson excitation emerges as well. The initial symmetry is broken to the QED symmetry U(1)$_e$. The neutrinos remain massless. A mechanism will, however, be discussed by which masses may be generated for the neutrinos, and finally the incorporation of quarks in the theory will be described. The vacuum expectation value of the Higgs field sets the energy scale of the electroweak symmetry breaking. Although weak isospin is not conserved at the present stage of the Universe, it is expected that it was conserved at earlier stages, when the Universe was at much higher temperatures (high energies) and particles were massless, moving with the speed of light, unable to form atoms and molecules.
This will bring us to the so-called grand unified theories (GUTs), which will be introduced in Chap. 37. This will also be discussed in the light of inflation in cosmology, and the latter will, in turn, be considered in detail in Chaps. 88–90.


34 The Higgs Field and the God Particle

Prerequisite Chaps. 15, 33

Have you ever wondered why QED does not require a Higgs field to generate a mass for the electron, and why one may instead introduce an (unrenormalized) mass at the outset in the Lagrangian density and still arrive at a consistent description of electron-photon interactions in conformity with experiments? Why do we seem to need a Higgs field in the electroweak theory? Why does the Higgs field generate a massive Higgs boson excitation, or equivalently, why does the Higgs boson acquire a mass, as other particles do, rather than appear as massless in the Lagrangian density? All of these questions have a common answer which we will investigate. Have you ever thought about why one must start with a Lagrangian density for the electroweak theory in which all the particles are massless? From the last chapter, perhaps you can guess why the vector bosons have to be a priori massless. But why do the fermions of spin 1/2 have to be taken a priori massless as well? Finally, why do we not worry much if quarks are not observed as such in experiments, while the observation of the Higgs boson signal in experiments1 was and is imperative? We will address these questions in the next chapter, after Eq. (35.24), when we see the Higgs field in action. In the early days of the Higgs mechanism, the legendary Victor Weisskopf, who had been the Director of CERN in the 1960s, did not seem to be impressed by this particular way of generating masses. In his CERN publications,2 he once remarked that “this is an awkward way to explain masses and that he believes that Nature should be more inventive, but experiments may prove him wrong”. The Higgs mechanism as we know it now, however, works rather well, and the electroweak theory is in good agreement with experiments, though open, of course, to modifications like almost any other theory.
A basic mechanism for the production of the Higgs boson at the Large Hadron Collider (LHC) at CERN is gluon-gluon fusion (gluon + gluon → $H^0$), arising from the internal structure of the protons, in which the gluons couple to the Higgs boson via heavy-quark loops such as3

[Figure: two colliding protons (p) each emit a gluon; the two gluons fuse through a quark (q) loop to produce $H^0$.]

The Higgs boson decays, for example, into two photons ($H^0 \to \gamma\gamma$), as well as through other channels such as into two W bosons or two Z bosons. The W and Z particles decay almost immediately, and the decay into two photons becomes important for studies involving the Higgs boson. Since photons are massless, the Higgs boson cannot interact directly with photons, but the decay may proceed by the emission of a virtual fermion-antifermion pair first, as shown here:

[Figure: $H^0$ decaying into two photons ($\gamma$, $\gamma$) through a virtual fermion ($f$) loop.]

1 For the experiments concerning the observation of the Higgs boson signal, see Aad et al. [1] and Chatrchyan et al. [2]. For the underlying theory, see the Nobel Lectures of Englert [4] and Higgs [6], and references therein.
2 Weisskopf [8], on page 7, 11th line from below. In “Growing up with field theory, and recent trends in particle physics”. The 1979 Bernard Gregory Lectures at CERN, 29 pages. CERN: Geneva.
3 For the underlying theory of the production, see Georgi et al. [5].

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_34

The search for the Higgs boson is based on the head-on collision of high-energy protons. Protons may be generated by ripping hydrogen atoms apart through the application of a strong electric field, which makes the protons and electrons move in opposite directions. The LHC presently has the capability of operating at an energy of 13 TeV: protons are accelerated to 6.5 TeV each in opposite directions and then collide head-on. Energy-momentum conservation for a Higgs boson of energy-momentum $p$ decaying into two photons of energy-momenta $k_1$ and $k_2$ is given by $p = k_1 + k_2$, with $\mathbf{k}_1 + \mathbf{k}_2 = \mathbf{0}$ in the rest frame of the particle, and from your knowledge of special relativity (see Chap. 13), the mass $m_{H^0}$ of the Higgs boson is then simply given by $m_{H^0}^2 = -p^2 = -(k_1 + k_2)^2$, where $p^2 = \eta_{\mu\nu}\, p^\mu p^\nu = -m_{H^0}^2$. A theoretical description for the detection of the Higgs boson, and for predicting its mass and its spin, may be given as follows. Through proton-proton collisions as just described, you may consider all those processes giving rise to two photons: $p\,p \to 2\gamma + X$, where $X$ stands for any other particles created. You may then count the number of events which give rise to two photons with invariant mass $m_{\gamma\gamma}$ having values within selected energy bands centered around various energies, e.g., around 100 GeV, …, 125 GeV, …, 150 GeV, …. Fairly detailed data taken in various energy bands below and above the one centered around 125 GeV, together with a smooth extrapolation of the number of events across that band, give the distribution expected if there were no Higgs boson decaying into two photons, referred to as the background. The presence of a bump, referred to as the signal, in the energy band centered around 125 GeV indicates the presence of the Higgs boson as a peak at a mass of about 125 GeV:

[Figure: number of events versus the diphoton invariant mass $m_{\gamma\gamma}$ (axis marks at 100, 125 and 150 GeV); the signal appears as a bump near 125 GeV above the background obtained by smooth extrapolation.]
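The relation $m_{H^0}^2 = -(k_1 + k_2)^2$ quoted above is easy to try out numerically. The sketch below is only an illustration under the metric convention $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ used in the text; the function names and the sample photon energies and angles are hypothetical choices, not experimental data.

```python
import math


def minkowski_square(p):
    """p^2 = eta_{mu nu} p^mu p^nu with signature (-, +, +, +)."""
    energy, px, py, pz = p
    return -energy * energy + px * px + py * py + pz * pz


def photon(energy, theta, phi):
    """Massless four-momentum (E, k) with |k| = E, in GeV."""
    return (
        energy,
        energy * math.sin(theta) * math.cos(phi),
        energy * math.sin(theta) * math.sin(phi),
        energy * math.cos(theta),
    )


def diphoton_mass(k1, k2):
    """Invariant mass from m^2 = -(k1 + k2)^2."""
    total = tuple(a + b for a, b in zip(k1, k2))
    return math.sqrt(max(0.0, -minkowski_square(total)))


def mass_from_opening_angle(e1, e2, angle):
    """Equivalent form for two photons: m^2 = 2 E1 E2 (1 - cos(angle))."""
    return math.sqrt(2.0 * e1 * e2 * (1.0 - math.cos(angle)))


# Rest frame of the decaying particle: back-to-back photons of equal energy.
K1 = photon(62.5, 0.3, 1.1)
K2 = photon(62.5, math.pi - 0.3, 1.1 + math.pi)
M_REST = diphoton_mass(K1, K2)

# A hypothetical lab-frame pair with unequal energies and a 90 degree opening angle.
Q1 = photon(100.0, math.pi / 2, 0.0)
Q2 = photon(80.0, math.pi / 2, math.pi / 2)
M_LAB = diphoton_mass(Q1, Q2)
```

Filling a histogram of `diphoton_mass` over many recorded photon pairs is what produces the distribution in $m_{\gamma\gamma}$ described above.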

Due to the probabilistic nature of our quantum world, the number of events giving rise to two photons within an energy band will change from one set of measurements to another. Accordingly, the number of events is to be replaced by the probability of having an event in which two photons are produced with a given energy. Experiments of this sort have been carried out.4 The presently available energy is 13 TeV, to be compared with the earlier one of only 7 TeV, and the data indeed show such a peak at about 125 GeV for the decay of the Higgs boson into $\gamma\gamma$, and into other pairs of particles as well, such as $ZZ$ bosons.

4 Chatrchyan et al. [2, 3].


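Box 34.1 Decay into two photons and characteristics of spin 0 and spin 1 bosons. Here $\mathbf{p} = (p^1, p^2, p^3)$

Consider the state of two photons $|\mathbf{p}\rangle_i\, |{-\mathbf{p}}\rangle_j$ of momenta $\mathbf{p}$, $-\mathbf{p}$, where $i$ and $j$ specify the components of their polarization states, with $p^i\, |{\pm\mathbf{p}}\rangle_i = 0$ $(*)$. In general a two-photon state with total angular momentum zero may be written as

$$|2\gamma\rangle = \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2|\mathbf{p}|}\; A_{ij}(\mathbf{p})\, |\mathbf{p}\rangle_i\, |{-\mathbf{p}}\rangle_j,$$

and because of condition $(*)$, $A_{ij}(\mathbf{p})$ cannot be proportional to $p^i$, $p^j$, hence quite generally

$$A_{ij}(\mathbf{p}) = A(|\mathbf{p}|)\, \delta_{ij} + B(|\mathbf{p}|)\, \varepsilon_{ijk}\, p^k, \qquad A_{ij}(-\mathbf{p}) = A(|\mathbf{p}|)\, \delta_{ij} - B(|\mathbf{p}|)\, \varepsilon_{ijk}\, p^k.$$

Under a parity transformation:

$$\mathcal{P}\, |\mathbf{p}\rangle_i\, |{-\mathbf{p}}\rangle_j = |{-\mathbf{p}}\rangle_i\, |\mathbf{p}\rangle_j, \qquad \mathcal{P}\, |2\gamma\rangle = \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2|\mathbf{p}|}\; A_{ij}(-\mathbf{p})\, |\mathbf{p}\rangle_i\, |{-\mathbf{p}}\rangle_j.$$

• Therefore for a scalar particle decaying into two photons $A_{ij}(-\mathbf{p}) = A_{ij}(\mathbf{p})$, i.e., $A_{ij}(\mathbf{p}) = A(|\mathbf{p}|)\, \delta_{ij}$, and the polarizations of the two photons are parallel, $\mathcal{P}\, |2\gamma\rangle = +|2\gamma\rangle$.
• Therefore for a pseudo-scalar particle decaying into two photons $A_{ij}(-\mathbf{p}) = -A_{ij}(\mathbf{p})$, i.e., $A_{ij}(\mathbf{p}) = B(|\mathbf{p}|)\, \varepsilon_{ijk}\, p^k$, and the polarizations of the two photons are perpendicular, $\mathcal{P}\, |2\gamma\rangle = -|2\gamma\rangle$.

Attempt to write a two-photon state with total spin one:

$$|2\gamma\rangle_i = \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2|\mathbf{p}|}\; A_{ijk}(\mathbf{p})\, |\mathbf{p}\rangle_j\, |{-\mathbf{p}}\rangle_k = \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2|\mathbf{p}|}\; A_{ikj}(\mathbf{p})\, |\mathbf{p}\rangle_k\, |{-\mathbf{p}}\rangle_j$$
$$\qquad = \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2|\mathbf{p}|}\; A_{ikj}(-\mathbf{p})\, |{-\mathbf{p}}\rangle_k\, |\mathbf{p}\rangle_j = \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2|\mathbf{p}|}\; A_{ikj}(-\mathbf{p})\, |\mathbf{p}\rangle_j\, |{-\mathbf{p}}\rangle_k,$$

i.e., $A_{ijk}(\mathbf{p}) = A_{ikj}(-\mathbf{p})$ $(**)$. Moreover, since $p^i\, e_i(\pm\mathbf{p}) = 0$, $A_{ijk}(\mathbf{p})$ cannot be proportional to $p^j$, $p^k$, i.e.,

$$A_{ijk}(\mathbf{p}) = A(|\mathbf{p}|)\, \varepsilon_{ijk} + B(|\mathbf{p}|)\, \varepsilon_{\ell jk}\, p^i p^\ell + C(|\mathbf{p}|)\, p^i\, \delta_{jk},$$
$$A_{ijk}(-\mathbf{p}) = A(|\mathbf{p}|)\, \varepsilon_{ijk} + B(|\mathbf{p}|)\, \varepsilon_{\ell jk}\, p^i p^\ell - C(|\mathbf{p}|)\, p^i\, \delta_{jk} \qquad (\varepsilon_{ijk} = -\varepsilon_{ikj})$$
$$\qquad = -\big[ A(|\mathbf{p}|)\, \varepsilon_{ikj} + B(|\mathbf{p}|)\, \varepsilon_{\ell kj}\, p^i p^\ell + C(|\mathbf{p}|)\, p^i\, \delta_{jk} \big] = -A_{ikj}(\mathbf{p}),$$

i.e., $A_{ijk}(-\mathbf{p}) = -A_{ikj}(\mathbf{p})$ $(***)$. From $(**)$ and $(***)$, we may infer that $A_{ijk}(\mathbf{p}) \equiv 0$.

• Therefore a spin one particle cannot decay into two photons.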

Irrespective of the fact that a Higgs field may have a non-vanishing expectation value due to its scalar nature, what can one say about the spin of a particle decaying into two photons? It cannot be of spin one, simply because a spin one particle cannot decay into two photons. Angular momentum conservation alone eliminates spin 1/2. The polarizations of the two photons would indicate whether the spin 0 particle is a scalar or a pseudo-scalar. We recall that polarization is defined by the oscillating electric field perpendicular to the momentum of light. For a pseudo-scalar particle, the planes of polarization of the two photons turn out to be perpendicular to each other, while for a scalar one they are parallel to each other. In particular, we learn that the elimination of the Higgs boson as being of spin 1 is immediate. These detailed aspects of a particle decaying into two photons are worked out in Box 34.1. The Higgs boson is also popularly known as the “God Particle”.5 It appears as a momentary apparition leaving behind, sometimes, two photons of light. The Higgs field, denoted say by $\chi(x)$, spreads through space and has an oscillatory character about its vacuum expectation value, with the Higgs boson as a Higgs field excitation, as shown in Box 34.2. Due to its scalar nature, it may develop a non-vanishing vacuum expectation value: $\langle \mathrm{vac}|\chi(0)|\mathrm{vac}\rangle \equiv \chi_c$ (see Chap. 15). With $P$ denoting the energy-momentum operator corresponding to the particle spectrum of the underlying theoretical description, translation invariance of the vacuum

5 This name was given by Lederman and Teresi [7].


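Box 34.2 Higgs field vibration about its vacuum expectation value, and the Higgs boson excitation

Using Eqs. (15.1)–(15.6) of Chap. 15, we may write, in a more convenient form for the problem at hand, $\chi(x) = e^{-ixP}\, \chi(0)\, e^{ixP}$, and $e^{ixP} = \int \frac{(dp)}{(2\pi)^4}\, e^{ixp}\, \Pi(p)$, where

$$\Pi(p) = (2\pi)^4\, \delta^{(4)}(p)\, |\mathrm{vac}\rangle\langle\mathrm{vac}| + \int_0^\infty d\mu^2\, (2\pi)\, \delta(p^2 + \mu^2)\, \theta(p^0)\, \widetilde{\Pi}(p), \qquad \theta(p^0) = 1 \ \text{for}\ p^0 \ge 0, \quad = 0 \ \text{for}\ p^0 < 0.$$

Hence,

$$\chi(x) = |\mathrm{vac}\rangle\, \chi_c\, \langle\mathrm{vac}| + \int \frac{(dp)}{(2\pi)^4}\, e^{-ixp} \int_0^\infty d\mu^2\, (2\pi)\, \delta(p^2 + \mu^2)\, \theta(p^0)\; \widetilde{\Pi}(p)\, \chi(0)\, |\mathrm{vac}\rangle\langle\mathrm{vac}|$$
$$\qquad + \int \frac{(dp)}{(2\pi)^4}\, e^{+ixp} \int_0^\infty d\mu^2\, (2\pi)\, \delta(p^2 + \mu^2)\, \theta(p^0)\; |\mathrm{vac}\rangle\langle\mathrm{vac}|\, \chi(0)\, \widetilde{\Pi}(p)$$
$$\qquad + \int \frac{(dp)}{(2\pi)^4} \frac{(dp')}{(2\pi)^4}\, e^{ix(p' - p)} \int_0^\infty d\mu^2 \int_0^\infty d\mu'^2\, (2\pi)^2\, \delta(p^2 + \mu^2)\, \delta(p'^2 + \mu'^2)\, \theta(p^0)\, \theta(p'^0)\; \widetilde{\Pi}(p)\, \chi(0)\, \widetilde{\Pi}(p').$$

This shows the oscillatory character of the Higgs field about its vacuum expectation value: $\chi_c = \langle\mathrm{vac}|\chi(0)|\mathrm{vac}\rangle$. The translational invariance of the vacuum is necessary so that the mass of a particle, such as the electron, is the same everywhere. We may readily isolate the contribution of the Higgs boson creation out of the vacuum by the Higgs field, or of its annihilation into the vacuum, by considering the following contribution to $\Pi(p)$: $\sum_\nu (2\pi)\, \delta(p^2 + m_H^2)\, \theta(p^0)\, |p, \nu\rangle\langle p, \nu|$, and using the identity $\delta(p^2 + m_H^2) = (1/2E)\, [\delta(p^0 - E) + \delta(p^0 + E)]$, $E = +\sqrt{\mathbf{p}^2 + m_H^2}$, to obtain from the expression of $\chi(x)$ above,

$$\chi(x) = |\mathrm{vac}\rangle\, \chi_c\, \langle\mathrm{vac}| + \sum_\nu \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2p^0}\; |p, \nu\rangle\, e^{-ixp}\, \langle p, \nu|\chi(0)|\mathrm{vac}\rangle \langle\mathrm{vac}|$$
$$\qquad + \sum_\nu \int \frac{d^3\mathbf{p}}{(2\pi)^3\, 2p^0}\; |\mathrm{vac}\rangle\, e^{+ixp}\, \langle\mathrm{vac}|\chi(0)|p, \nu\rangle \langle p, \nu| + \cdots,$$

where the second term on the r.h.s. of the equation denotes the creation of a Higgs boson out of the vacuum by the Higgs field, giving rise to the state $|p, \nu\rangle$, and the third term represents the annihilation of a Higgs boson, leading to the vacuum state $|\mathrm{vac}\rangle$. The remaining terms, denoted by the dots, necessarily represent the creation or annihilation of multiparticle combinations. Finally, the first term corresponds to the situation in which the vacuum state is retained.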

state, as described by $e^{ixP}\, |\mathrm{vac}\rangle = |\mathrm{vac}\rangle$, is necessary in order to generate a constant vacuum expectation value, so that the mass it gives to a particle, such as the electron, is the same everywhere. The more strongly a particle interacts with the Higgs field, the more massive it gets. On the other hand, the photon does not interact directly with the Higgs field, and it remains massless, moving with the constant speed of light, in conformity with the relativistic principle that the speed of light in vacuum is a constant. As one moves back in time to much earlier epochs, at which the Universe was at much higher temperatures, it is expected that conditions were not favorable for a Higgs field to have a non-vanishing vacuum expectation value, and the existing particles were massless, moving with the speed of light, unable to combine and form atoms and molecules.

References
1. Aad, G., et al. (2012). Observation of a new particle in the search for the standard model Higgs boson with the ATLAS detector at the LHC. Physics Letters B, 716, 1–29.
2. Chatrchyan, S., et al. (2012). Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC. Physics Letters B, 716, 30–61.
3. Chatrchyan, S., et al. (2012). Search for the standard model Higgs boson decaying into two photons in pp collisions at $\sqrt{s} = 7$ TeV. Physics Letters B, 710, 403–425.
4. Englert, F. (2014). The BEH mechanism and its scalar boson. Reviews of Modern Physics, 86, 843–850.
5. Georgi, H. M., et al. (1978). Higgs bosons from two-gluon annihilation in proton-proton collisions. Physical Review Letters, 40, 692–694.
6. Higgs, P. W. (2014). Evading the Goldstone theorem. Reviews of Modern Physics, 86, 851–853.
7. Lederman, L., & Teresi, D. (2006). The God particle: If the Universe is the answer, what is the question? New York: Mariner Books.
8. Weisskopf, V. F. (1979). Personal impressions of recent results in particle physics. CERN Ref. Th 2732, on page 7, 11th line from below. In Growing up with field theory, and recent trends in particle physics. The 1979 Bernard Gregory Lectures at CERN, 29 pages. CERN: Geneva.

35 The Electroweak Model and the Significant Role of the Higgs Field

Prerequisites Chaps. 17, 23, 33 and 34

In this chapter we carry out the details of the electroweak theory, which is based on the SU(2)$_L$ × U(1)$_Y$ symmetry involving the leptons and the vector bosons, already introduced in Chap. 33. This is followed by a brief discussion of how the quarks are included in the theoretical description. We also elaborate on the masses of neutrinos and on neutrino oscillations, which are tied to their masses. To write down the Lagrangian density of the electroweak theory and eventually describe spontaneous symmetry breaking, we recall from Chap. 33 that, to start with, we have three SU(2)$_L$ left-handed doublets

$$\ell_i = \begin{pmatrix} \nu_i \\ e_i \end{pmatrix}_{\!L}, \qquad \nu_1 = \nu_e,\ e_1 = e; \quad \nu_2 = \nu_\mu,\ e_2 = \mu; \quad \nu_3 = \nu_\tau,\ e_3 = \tau, \tag{35.1}$$

of weak isospin $I = 1/2$, and three singlets $e_R$, $\mu_R$, $\tau_R$. This group requires $2^2 - 1 = 3$ generators (see Chap. 17), which are given by $(\sigma_1/2, \sigma_2/2, \sigma_3/2)$, where the $\sigma_i$ are the Pauli matrices, and three real (non-abelian) gauge vector fields $W_{1\mu}$, $W_{2\mu}$, $W_{3\mu}$. In terms of the electromagnetic charge $Q$ and the third component $I_3$, we have the following additive expressions for the left-handed (L) and right-handed (R) leptons:

$$Q = I_3 + \tfrac{1}{2}\, Y, \qquad \mathrm{L}:\; Y = -1, \qquad \mathrm{R}:\; Y = -2, \tag{35.2}$$

calling for a new quantum number $Y$ referred to as weak hypercharge, as given in Eq. (33.11). This necessitates the introduction of an additional (abelian) gauge vector field $B_\mu$ associated with a group U(1)$_Y$, thus generating the symmetry group SU(2)$_L$ × U(1)$_Y$. The properties of the leptons are summarized in Table 35.1. By introducing two coupling constants $g$ and $g'$, corresponding to the two groups, we may immediately write down the SU(2)$_L$ × U(1)$_Y$ Lagrangian density of the system of massless leptons and massless vector bosons

$$\mathcal{L}_1 = -\frac{1}{4}\, G_{a\mu\nu}\, G_a^{\mu\nu} - \frac{1}{4}\, F_{\mu\nu} F^{\mu\nu} - \frac{1}{2} \sum_i \Big[\, \bar{\ell}_i\, \gamma^\rho D_\rho\, \ell_i + \bar{r}_i\, \gamma^\rho D'_\rho\, r_i + \text{h.c.} \Big], \tag{35.3}$$

$$D_\rho = \frac{\partial_\rho}{i} + \frac{g'}{2}\, B_\rho - g\, \frac{\sigma_a}{2}\, W_{a\rho}, \qquad D'_\rho = \frac{\partial_\rho}{i} + g'\, B_\rho, \qquad (\sigma_a W_{a\mu} \equiv \boldsymbol{\sigma}\cdot\mathbf{W}_\mu), \tag{35.4}$$

$$F_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu, \qquad G_{c\mu\nu} = \partial_\mu W_{c\nu} - \partial_\nu W_{c\mu} + g\, \varepsilon_{abc}\, W_{a\mu} W_{b\nu}, \tag{35.5}$$

where $a, b, c = 1, 2, 3$, and the $\varepsilon_{abc}$, totally antisymmetric with $\varepsilon_{123} = +1$, are the SU(2) structure constants (see Chaps. 17, 23 and 33). We will see below that in order to have a completely symmetric theory all the leptons have to be taken to be massless as well. The SU(2)$_L$ × U(1)$_Y$ transformation rules of the components of the theory which leave the Lagrangian density $\mathcal{L}_1$ in (35.3) invariant are spelled out in Box 35.1.1 We consider the physical interpretation of the above Lagrangian density. In particular, we will learn that neither $W_{3\mu}$ nor $B_\mu$ gets a definite mass by the Higgs mechanism, and we have to consider linear combinations of the two real vector fields. We define

1 This includes the transformation rule for the scalar doublet.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_35


Table 35.1 Electroweak quantum numbers of leptons. The anti-particles have the same I but opposite Q, I3 values (for Q ≠ 0, I3 ≠ 0), and Y = 2(Q − I3)

Particle        I       I3               Q          Y
(νi, ei)L       1/2     (+1/2, −1/2)     (0, −1)    −1
eiR             0       0                −1         −2

$$W_\mu = \frac{1}{\sqrt{2}} \big( W_{1\mu} + i W_{2\mu} \big) \equiv W^-, \qquad W_\mu^\dagger = \frac{1}{\sqrt{2}} \big( W_{1\mu} - i W_{2\mu} \big) \equiv W^+, \qquad \tan\theta_W = g'/g, \tag{35.6}$$

$$A_\mu = W_{3\mu} \sin\theta_W + B_\mu \cos\theta_W, \qquad Z_\mu = -W_{3\mu} \cos\theta_W + B_\mu \sin\theta_W, \tag{35.7}$$

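where $\theta_W$ is referred to as the Weinberg angle or the weak angle. We also use the identity ($a = 1, 2, 3$)

$$\sigma_a W_{a\mu} = \frac{1}{\sqrt{2}}\, (\sigma_1 - i\sigma_2)\, W_\mu + \frac{1}{\sqrt{2}}\, (\sigma_1 + i\sigma_2)\, W_\mu^\dagger + \sigma_3\, W_{3\mu}, \tag{35.8}$$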

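to rewrite the purely interaction part in (35.3) between the leptons and the vector bosons as:

$$\mathcal{L}_1\big|_{\rm int} = -g \sin\theta_W \sum_i \bar{e}_i\, \gamma^\rho e_i\, A_\rho$$
$$\qquad - g \sin\theta_W \sum_i \Big[\, \bar{e}_i\, \gamma^\rho \big( \alpha_e - \beta_e \gamma^5 \big) e_i + \alpha_\nu\, \bar{\nu}_i\, \gamma^\rho \big( 1 - \gamma^5 \big) \nu_i \Big] Z_\rho$$
$$\qquad + \frac{g}{\sqrt{2}} \sum_i \Big[\, \bar{e}_{iL}\, \gamma^\rho \nu_{iL}\, W_\rho + \bar{\nu}_{iL}\, \gamma^\rho e_{iL}\, W_\rho^\dagger \Big], \tag{35.9}$$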

where we recognize that the interactions on the first line are just QED interactions, and hence for the electron charge we have $e = -g \sin\theta_W$. In detail, we have

$$-e = g \sin\theta_W = g' \cos\theta_W, \qquad \alpha_e = \frac{1}{4}\, (3 \tan\theta_W - \cot\theta_W), \tag{35.10}$$

$$\beta_e = -\alpha_\nu, \qquad \alpha_\nu = \frac{1}{4}\, (\tan\theta_W + \cot\theta_W) = \frac{1}{4 \sin\theta_W \cos\theta_W}. \tag{35.11}$$
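To get a feeling for the sizes involved in (35.10) and (35.11), the coefficients may be evaluated for a given weak angle. The sketch below is only illustrative: the numerical inputs `SIN2_THETA_W` and `E_MAGNITUDE` are assumed round values (with $|e| = \sqrt{4\pi\alpha}$ in natural units), and the function names are not notation from the text.

```python
import math

# Assumed illustrative inputs, not values quoted in this chapter.
SIN2_THETA_W = 0.231
E_MAGNITUDE = math.sqrt(4.0 * math.pi / 137.036)  # |e| from the fine-structure constant


def neutral_current_couplings(sin2_theta_w):
    """Coefficients alpha_e, beta_e, alpha_nu of (35.10)-(35.11)."""
    s = math.sqrt(sin2_theta_w)
    c = math.sqrt(1.0 - sin2_theta_w)
    tan_w, cot_w = s / c, c / s
    alpha_e = (3.0 * tan_w - cot_w) / 4.0
    alpha_nu = (tan_w + cot_w) / 4.0
    return {"alpha_e": alpha_e, "beta_e": -alpha_nu, "alpha_nu": alpha_nu}


def gauge_couplings(e_magnitude, sin2_theta_w):
    """Magnitudes of g and g' from |e| = g sin(theta_W) = g' cos(theta_W)."""
    s = math.sqrt(sin2_theta_w)
    c = math.sqrt(1.0 - sin2_theta_w)
    return e_magnitude / s, e_magnitude / c


COUPLINGS = neutral_current_couplings(SIN2_THETA_W)
G, G_PRIME = gauge_couplings(E_MAGNITUDE, SIN2_THETA_W)
```

Since $3\tan\theta_W - \cot\theta_W = (4\sin^2\theta_W - 1)/(\sin\theta_W \cos\theta_W)$, the coefficient $\alpha_e$ becomes small whenever $\sin^2\theta_W$ is close to $1/4$, which is the case for the assumed input.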

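It is interesting to note that once the angle $\theta_W$ is determined, all the parameters above may be expressed in terms of $e$ and $\theta_W$. On the second line in (35.9), $Z_\rho$ is coupled to both left-handed and right-handed (charged and uncharged) leptons through a neutral current, which is not a pure V − A type of current.

Box 35.1 SU(2)$_L$ × U(1)$_Y$ transformation rules

$$\ell_i \to \exp[-i \beta(x)/2]\, \exp[\, i \boldsymbol{\varphi}(x)\cdot\boldsymbol{\sigma}/2 \,]\; \ell_i, \qquad r_i \to \exp[-i \beta(x)]\; r_i, \qquad i = 1, 2, 3.$$

$$\frac{1}{2}\, \boldsymbol{\sigma}\cdot\mathbf{W}_\mu \to \exp[\, i \boldsymbol{\varphi}(x)\cdot\boldsymbol{\sigma}/2 \,] \Big[ \frac{1}{2}\, \boldsymbol{\sigma}\cdot\mathbf{W}_\mu + \frac{i}{g}\, \partial_\mu \Big] \exp[-i \boldsymbol{\varphi}(x)\cdot\boldsymbol{\sigma}/2 \,].$$

$$\frac{1}{2}\, \boldsymbol{\sigma}\cdot\mathbf{G}_{\mu\nu} \to \exp[\, i \boldsymbol{\varphi}(x)\cdot\boldsymbol{\sigma}/2 \,]\; \frac{1}{2}\, \boldsymbol{\sigma}\cdot\mathbf{G}_{\mu\nu}\; \exp[-i \boldsymbol{\varphi}(x)\cdot\boldsymbol{\sigma}/2 \,].$$

$$B_\mu \to B_\mu + \frac{\partial_\mu \beta(x)}{g'}, \qquad \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} \to \exp[\, i \beta(x)/2 \,]\, \exp[\, i \boldsymbol{\varphi}(x)\cdot\boldsymbol{\sigma}/2 \,] \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix}.$$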

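We note that for a spinor field $\psi$, the combination $\bar{\psi}\psi$ may be expressed as

$$\bar{\psi}\psi = \bar{\psi}_L \psi_R + \bar{\psi}_R \psi_L, \qquad \psi_{R/L} = \frac{1}{2} \big( I \pm \gamma^5 \big) \psi, \tag{35.12}$$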


and we note that $(\bar{\ell}_i\, r_i + \bar{r}_i\, \ell_i)$ is not invariant under the transformation rules in Box 35.1. Hence no leptonic mass may be included initially in the Lagrangian density. We may, however, introduce a doublet $\Phi = \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix}$, with transformation rule as given in Box 35.1, of weak hypercharge $Y = 1$, consisting of two scalar fields $\phi^+$, $\phi^0$ and their adjoints, and introduce, instead of a mass term, a Yukawa-type interaction, with each term in (35.12) corresponding to the leptons, which restores the SU(2)$_L$ × U(1)$_Y$ symmetry:

$$\mathcal{L}_\Phi = -\sum_i \lambda_i \Big[ \big( \bar{\nu}_i \;\; \bar{e}_i \big)_L \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} e_{iR} + \text{h.c.} \Big]. \tag{35.13}$$

Such a term gives rise to an interaction of scalar fields (dotted lines) with spin 1/2 fields (solid lines) such as shown, for example, in the left-hand box diagram, which by power counting, $(E^{-1})^4 \times E^4 = 1$, is logarithmically divergent, and a contact term $\Phi^4$ is to be added, for consistency, to counteract such a potentially divergent term, as shown in the right-hand diagram:

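Accordingly, renormalizability implies that one should introduce, in general, a potential2 for the self-interaction of $\Phi$ as follows:

$$V(\Phi^\dagger \Phi) = -\frac{m^2}{2}\, \Phi^\dagger \Phi + \frac{1}{4}\, \lambda\, (\Phi^\dagger \Phi)^2, \qquad m^2 > 0, \quad \lambda > 0, \tag{35.14}$$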

where a positive $\lambda$ is chosen so that the energy of the system does not become arbitrarily large and negative, and a negative sign multiplying the constant $m^2$ in (35.14) is chosen, which is opposite to the sign of a mass term in a corresponding Lagrangian density, so that a stationary point of the potential is achieved giving rise to a non-vanishing expectation value of the Higgs field. Such a potential is just what is needed for generating masses for particles via spontaneous symmetry breaking. We may now subtract $V(\Phi^\dagger \Phi)$ from the Lagrangian density in (35.3). In the no-radiative approximation (referred to as the tree approximation), the stationary point of the potential is given by

$$\langle\mathrm{vac}|\Phi^\dagger|\mathrm{vac}\rangle \langle\mathrm{vac}|\Phi|\mathrm{vac}\rangle \equiv \frac{v^2}{2}, \qquad \text{and} \quad \frac{v}{\sqrt{2}} = \sqrt{\frac{m^2}{\lambda}}. \tag{35.15}$$

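From the SU(2)$_L$ × U(1)$_Y$ transformation rule of the doublet $\Phi$ given at the end of Box 35.1, the kinetic term $-|(\partial_\mu / i)\, \Phi|^2$ of the doublet, in a Lagrangian density, becomes replaced by the invariant expression:

$$\mathcal{L}_{\text{vec.bosons},\Phi} = -\Big| \Big( \frac{\partial_\mu}{i} - g\, \frac{\sigma_a}{2}\, W_{a\mu} - \frac{g'}{2}\, B_\mu \Big) \Phi \Big|^2, \tag{35.16}$$

consistent with the transformation rules in Box 35.1. Now according to the analysis in Box 35.2, we may always carry out an SU(2)$_L$ × U(1)$_Y$ gauge transformation, $\exp[i\beta/2]\, \exp[i \boldsymbol{\varphi}\cdot\boldsymbol{\sigma}/2]$, such that $(\phi^+ \;\; \phi^0)^\top \to (0 \;\; \chi)^\top$, for which $\chi$ is real, referred to as a unitary gauge. According to (35.15), we may then write $\chi(x) = (v + \rho(x))/\sqrt{2}$, where $\langle\mathrm{vac}|\chi(x)|\mathrm{vac}\rangle = v/\sqrt{2}$ and $\langle\mathrm{vac}|\rho(x)|\mathrm{vac}\rangle = 0$. The masses acquired by the vector bosons are then readily obtained from the expression

$$-\Big| \Big( \frac{\partial_\mu}{i} - g\, \frac{\sigma_a}{2}\, W_{a\mu} - \frac{g'}{2}\, B_\mu \Big) \begin{pmatrix} 0 \\ v/\sqrt{2} \end{pmatrix} \Big|^2_{\rm mass} = -\Big| \frac{1}{2} \begin{pmatrix} g\, W_\mu^\dagger\, v \\ 0 \end{pmatrix} + \frac{1}{2\sqrt{2}} \begin{pmatrix} 0 \\ -(g W_{3\mu} - g' B_\mu)\, v \end{pmatrix} \Big|^2$$
$$\qquad = -\frac{v^2}{2} \Big[ \frac{g^2}{2}\, W_\rho^\dagger W^\rho + \frac{1}{4}\, (g \sec\theta_W)^2\, Z_\rho Z^\rho \Big]. \tag{35.17}$$

The acquired masses of the vector bosons are then given by ($g'/g = \tan\theta_W$)

$$M_W = \frac{v}{2}\, |g|, \qquad M_Z = \frac{v}{2}\, |g \sec\theta_W|, \qquad M_\gamma = 0, \tag{35.18}$$

2 A cubic term clearly violates the symmetry in consideration (see Box 35.1).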

where we note that $W_\rho$ is a complex field, while $Z_\rho$ is a real one, which explains the presence of the overall 1/2 factors in the above equations. As the photon field $A_\rho$ does not appear on the extreme right-hand side of (35.17), the photon remains massless. On the other hand, for the Higgs boson $H^0$ excitation we have

$$-V(\Phi^\dagger \Phi)\Big|_{\rm mass} = -\frac{\lambda}{16}\, (\rho + v)^2 \big[ \rho^2 + 2\rho v - v^2 \big] \Big|_{\rm mass} = -\frac{1}{4}\, \lambda\, v^2 \rho^2, \quad \Rightarrow \quad M_{H^0}^2 = \frac{\lambda}{2}\, v^2, \tag{35.19}$$

where $\lambda$ is undetermined.3 Moreover, from (35.13) the masses acquired by the charged leptons are given by

$$m_i = \lambda_i\, \frac{v}{\sqrt{2}}. \tag{35.20}$$
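The mass formulas (35.18)–(35.20) lend themselves to a quick numerical check. The following is a rough sketch under stated assumptions: the inputs `V`, `G` and `SIN2_THETA_W` are assumed illustrative values rather than numbers taken from this chapter, and the function names are hypothetical. The potential is evaluated in the unitary gauge, where $\Phi^\dagger\Phi = (v + \rho)^2/2$, with $m^2$ eliminated through (35.15).

```python
import math

# Assumed illustrative inputs (not quoted in the text above).
V = 246.0              # vacuum expectation value scale v, in GeV
G = 0.65               # SU(2)_L coupling g
SIN2_THETA_W = 0.231   # sin^2 of the weak angle
LAMBDA = 0.5           # quartic coupling, the example value used in footnote 3


def vector_boson_masses(v, g, sin2_theta_w):
    """M_W = v|g|/2 and M_Z = v|g sec(theta_W)|/2, as in (35.18)."""
    cos_w = math.sqrt(1.0 - sin2_theta_w)
    return 0.5 * v * abs(g), 0.5 * v * abs(g / cos_w)


def higgs_potential(rho, v, lam):
    """V of (35.14) in the unitary gauge, with m^2 = lam * v^2 / 2 from (35.15)."""
    m2 = lam * v * v / 2.0
    x = 0.5 * (v + rho) ** 2          # Phi^dagger Phi
    return -0.5 * m2 * x + 0.25 * lam * x * x


def higgs_mass(v, lam):
    """M_H = v * sqrt(lam / 2), from (35.19)."""
    return v * math.sqrt(lam / 2.0)


def yukawa_coupling(lepton_mass, v):
    """Invert (35.20): lambda_i = sqrt(2) * m_i / v."""
    return math.sqrt(2.0) * lepton_mass / v


M_W, M_Z = vector_boson_masses(V, G, SIN2_THETA_W)
M_H = higgs_mass(V, LAMBDA)

# Curvature of the potential at rho = 0 by a central finite difference;
# it should reproduce M_H^2, since the mass term is (1/2) M_H^2 rho^2.
STEP = 0.1
CURVATURE = (
    higgs_potential(STEP, V, LAMBDA)
    - 2.0 * higgs_potential(0.0, V, LAMBDA)
    + higgs_potential(-STEP, V, LAMBDA)
) / STEP ** 2

ELECTRON_YUKAWA = yukawa_coupling(0.000511, V)  # electron mass in GeV
```

The ratio $M_W/M_Z = \cos\theta_W$ holds for any choice of $v$ and $g$, which is why it is a cleaner test of the tree-level relations than either mass separately.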

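We have started with two complex scalar fields $\phi^+$, $\phi^0$ and their adjoints, that is, equivalently, with four real scalar fields, and four massless vector bosons. We were able to gauge away three fields and generated,

Box 35.2 Achieving a unitary gauge

Let $\boldsymbol{\varphi} = |\boldsymbol{\varphi}|\, (n_1, n_2, n_3)$, $n_1^2 + n_2^2 + n_3^2 = 1$. Then

$$\exp[\, i\beta/2 \,]\, \exp[\, i \boldsymbol{\varphi}\cdot\boldsymbol{\sigma}/2 \,] = \exp[\, i\beta/2 \,] \begin{pmatrix} \cos(|\boldsymbol{\varphi}|/2) + i\, n_3 \sin(|\boldsymbol{\varphi}|/2) & (i\, n_1 + n_2) \sin(|\boldsymbol{\varphi}|/2) \\ (i\, n_1 - n_2) \sin(|\boldsymbol{\varphi}|/2) & \cos(|\boldsymbol{\varphi}|/2) - i\, n_3 \sin(|\boldsymbol{\varphi}|/2) \end{pmatrix}.$$

Consider the expression $\exp[\, i\beta/2 \,]\, \exp[\, i \boldsymbol{\varphi}\cdot\boldsymbol{\sigma}/2 \,] \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix}$:

• For $\phi^+ = 0$, choose $\boldsymbol{\varphi} = \mathbf{0}$, and $\beta$ such that $\exp[\, i\beta/2 \,]\, \phi^0$ is real.
• For $\phi^0 = 0$, choose $\boldsymbol{\varphi} = \pi\, (0, -1, 0)$, and $\beta$ such that $\exp[\, i\beta/2 \,]\, \phi^+$ is real.
• For $\phi^+ \neq 0$, let $\phi^+ = |\phi^+| \exp[\, i \delta_+ ]$, $\phi^0 = |\phi^0| \exp[\, i \delta_0 ]$. Choose $\boldsymbol{\varphi} = |\boldsymbol{\varphi}|\, (n_1, n_2, 0)$, $\cot(|\boldsymbol{\varphi}|/2) = |\phi^0/\phi^+|$, $-(i\, n_1 + n_2) = \exp i[\delta_+ - \delta_0]$, and $\beta/2 = -\delta_0$.

This analysis implies that we can always find locally a $\boldsymbol{\varphi}$ and a $\beta$ such that

$$\begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} = \exp[-i\beta/2 \,]\, \exp[-i \boldsymbol{\varphi}\cdot\boldsymbol{\sigma}/2 \,] \begin{pmatrix} 0 \\ \chi \end{pmatrix}, \quad \text{and } \chi \text{ is real}.$$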
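The construction in Box 35.2 can be tried out on explicit numbers. The sketch below is an illustration only, assuming the closed form of $\exp[i\beta/2]\exp[i\boldsymbol{\varphi}\cdot\boldsymbol{\sigma}/2]$ quoted in the box; the function names and the sample field values are hypothetical.

```python
import cmath
import math


def gauge_matrix(beta, phi):
    """exp[i beta/2] exp[i phi.sigma/2] in the closed form quoted in Box 35.2."""
    norm = math.sqrt(sum(x * x for x in phi))
    if norm > 0.0:
        n1, n2, n3 = (x / norm for x in phi)
    else:
        n1, n2, n3 = 0.0, 0.0, 0.0
    c, s = math.cos(norm / 2.0), math.sin(norm / 2.0)
    phase = cmath.exp(1j * beta / 2.0)
    return (
        (phase * (c + 1j * n3 * s), phase * (1j * n1 + n2) * s),
        (phase * (1j * n1 - n2) * s, phase * (c - 1j * n3 * s)),
    )


def unitary_gauge_parameters(phi_plus, phi_zero):
    """Return (beta, phi) following the three cases listed in Box 35.2."""
    if phi_plus == 0:
        return -2.0 * cmath.phase(phi_zero), (0.0, 0.0, 0.0)
    if phi_zero == 0:
        return -2.0 * cmath.phase(phi_plus), (0.0, -math.pi, 0.0)
    delta = cmath.phase(phi_plus) - cmath.phase(phi_zero)
    # cot(|phi|/2) = |phi_zero / phi_plus|
    angle = 2.0 * math.atan2(abs(phi_plus), abs(phi_zero))
    # -(i n1 + n2) = exp[i delta]  =>  n1 = -sin(delta), n2 = -cos(delta)
    n1, n2 = -math.sin(delta), -math.cos(delta)
    return -2.0 * cmath.phase(phi_zero), (angle * n1, angle * n2, 0.0)


def to_unitary_gauge(phi_plus, phi_zero):
    """Apply the gauge transformation; the result should be (0, chi) with chi real."""
    beta, phi = unitary_gauge_parameters(phi_plus, phi_zero)
    (a, b), (c, d) = gauge_matrix(beta, phi)
    return a * phi_plus + b * phi_zero, c * phi_plus + d * phi_zero


# Hypothetical field values at some spacetime point.
UPPER, CHI = to_unitary_gauge(0.3 - 0.7j, -1.2 + 0.4j)
```

With these choices the surviving component comes out real and equal to $\sqrt{|\phi^+|^2 + |\phi^0|^2}$, as expected for a unitary transformation.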

in turn, three massive vector bosons and one massless vector boson. That is, three degrees of freedom associated with the three fields gauged away were lost, and three degrees of freedom were gained, namely those associated with the longitudinal components of the three massive vector bosons. The three bosonic fields gauged away are referred to as Goldstone bosonic fields. The SU(2)$_L$ × U(1)$_Y$ gauge symmetry is spontaneously broken to the U(1)$_{\rm em}$ gauge symmetry of phase transformations of electrodynamics. The value of $v$ in the vacuum expectation value $v/\sqrt{2}$ of the Higgs boson field may be readily estimated as follows. By using the fact that the propagator of the now massive vector boson $W$, as a function of $q^2$, approaches $\eta^{\alpha\beta}/M_W^2$ at low

3 If one is to rely on the validity of perturbation theory, and restricts $\lambda \in (0, 1)$, then formally for $\lambda \simeq 1/2$, for example, (35.19)/(35.22) give the value $M_H \simeq 123$ GeV/$c^2$, comparable to the experimentally quoted value of 125 GeV/$c^2$. The coupling is, however, constrained by renormalization restrictions.


2 energies |q 2 | = u, c, t. q< = d, s, b. The antiquarks have the same I , but opposite Q, I3 (for I3 = 0), and Y = 2(Q − I3 ). Each of the quarks comes in three different colors Particle I I3 Q Y (q> L , qR , q γ ρ q> − q< γ ρ q< . 3 q 3 q >

(35.32)


0, i.e., it is strictly positive, since $\langle\mathrm{vac}|Q_a^\dagger Q_a|\mathrm{vac}\rangle > 0$ for at least one $a$. The fact that $[\,Q_a, P^2\,] = 0$ in Eq. (39.14), with $P^2 = -\mathrm{Mass}^2$, implies that particles and their superpartners have the same masses in a SUSY theory. On the other hand, Eq. (39.13) implies that $[\,Q_a, J_{\mu\nu}\,] \neq 0$, i.e., different spins arise in a given supermultiplet, as is investigated next.

Supermultiplets

A supermultiplet may be identified by a given spin index, say, $j$. Now we want to find all the particles of a given fixed mass $m$ that are grouped together in such a supermultiplet, and find out their spins, for a given $j$. To this end, we will see that the Super-Poincaré algebra in Eqs. (39.12)–(39.17) contains all the information we need to investigate the nature of a supermultiplet identified with a given spin $j$. We first consider a supermultiplet of particles with a given non-zero mass $m$. Consider a particle at rest: $P^\mu = (m, \mathbf{0})$. It will be shown that simultaneous eigenstates $|m, j, \sigma\rangle$ of $P^2$, $\mathbf{J}^2$, $J^3$ may be chosen such that $A_a\, |m, j, \sigma\rangle = 0$ for $a = 1, 2$, where $A_a = Q_a/\sqrt{m}$. On the other hand, we may use the identity ($\boldsymbol{\sigma} = (\sigma_1, \sigma_2, \sigma_3)$)

$$\mathbf{J}\, A_a^\dagger = A_b^\dagger \Big( \mathbf{J}\, \delta_{ba} + \frac{1}{2}\, \boldsymbol{\sigma}_{ba} \Big) \ \text{in Box 39.2, and the facts that } (A_1^\dagger)^2 = 0 = (A_2^\dagger)^2, \tag{39.22}$$

$$[\,\mathbf{J}, A_1^\dagger A_2^\dagger\,] = \frac{1}{2}\, A_1^\dagger A_2^\dagger \big( \boldsymbol{\sigma}_{11} + \boldsymbol{\sigma}_{22} \big) = \frac{1}{2}\, A_1^\dagger A_2^\dagger\, \mathrm{Tr}[\boldsymbol{\sigma}] = 0. \quad \text{That is, } \mathbf{J}\, A_1^\dagger A_2^\dagger = A_1^\dagger A_2^\dagger\, \mathbf{J}. \tag{39.23}$$

Hence $|m, j, \sigma\rangle$, $A_1^\dagger A_2^\dagger\, |m, j, \sigma\rangle$, $A_1^\dagger\, |m, j, \sigma\rangle$, $A_2^\dagger\, |m, j, \sigma\rangle$ are mutually ortho-normalized states, due to the facts that $A_a\, |m, j, \sigma\rangle = 0$ and, as shown in Box 39.2, $\{A_a, A_b^\dagger\} = \delta_{ab}$, $a, b = 1, 2$, and $(A_1)^2 = 0$, $(A_2)^2 = 0$, where $A_a$ refers to the operator or its adjoint. Due to the Grassmann property of the spinor generators just given, $(A_a)^2 = 0$, and the anti-commutation relations $\{A_a, A_b^\dagger\} = \delta_{ab}$, $a, b = 1, 2$, no other states may be constructed by the application of these generators and their products to the states already obtained above, as they would either be zero or proportional to the preexisting ones. According to the last equation in (39.23), the state $|m, j, \sigma\rangle' = A_1^\dagger A_2^\dagger\, |m, j, \sigma\rangle$ has the same spin as the state $|m, j, \sigma\rangle$. On the other hand, according to the first equation in (39.22), involving the operators $A_1^\dagger$, $A_2^\dagger$, we note that $\mathbf{J} + \boldsymbol{\sigma}/2$ represents the addition of two angular momenta, one of which corresponds to spin 1/2. Hence we may conclude that a supermultiplet of spin index $j$ and fixed non-zero mass $m$ is generated as follows:

• Starting with a $(2j + 1)$-component column matrix $\{|m, j, \sigma\rangle\}$ of spin $j > 0$, with a given fixed non-zero mass $m$, we may generate another $(2j + 1)$-component column matrix of spin $j$, as well as a $[2(j + 1/2) + 1]$-component column matrix of spin $j + 1/2$, and a $[2(j - 1/2) + 1]$-component column matrix of spin $j - 1/2$. That is, the following spin content emerges: $(j, j, j + 1/2, j - 1/2)$ for $j > 0$. In total, we have $2(2j + 1) + (2j + 2) + 2j = 4(2j + 1)$ spin states. A spin $j$ will be of fermionic or bosonic type depending on whether $j$ is a half-odd integer or an integer. In either case, the total number of components of

2 See Eqs. (4.9), (4.10) of Chap. 4 for the infinitesimal change of a field given by $\delta\chi = \chi - \overline{\chi} = [\chi, G]/i$, $\overline{\chi} = (1 - iG)\, \chi\, (1 + iG)$, $U = 1 + iG$.

264

39

The Very Nature of a SUSY Theory: Super-Poincaré Algebra …

spin states of fermionic and bosonic types are equal and each is given by [ 2(2 j + 1) + (2 j + 2) + 2 j ]/2 = 2(2 j + 1). For j = 0, starting with a one component spin 0-(scalar) state, we generate another one component spin 0-state, as well as a 2-component column matrix of spin 1/2. The following spin content then emerges as: (0, 0, 1/2) with the total number of bosonic and fermionic spin states being each equal to [ 0 + 0 + 2 + 2 ]/2 = 2. • In Box 39.3, it is shown that if λ denotes the largest helicity in a supermultiplet of mass zero, then the only other state in the supermultiplet has helicity λ − 1/2. On the other hand, CPT invariance implies the existence of its anti-supermultiplet consisting of states with helicities −λ and −λ + 1/2 of opposite helicities. For example the superpartner of the photon, called the photino, is a massless fermion, of helicities ± 1/2, and that of the graviton, the superpartner, called the gravitino is a fermion, a Rarita–Schwinger massless particle, of helicities ± 3/2. The photon and the graviton being so-called gauge fields, their superpartners are also each referred to as gauginos. In general supermultiplets with λ = 1/2, 1 are referred to as chiral and vector supermultiplets, respectively. Box 39.2 Intricacies of a supermultiplet for particles with a common non-zero fixed mass m 1 1 The Poincaré algebra in Eqs. (39.12)–(39.17) imply that : [Q, J i ] = Σ i Q, [J i , Q † ] = Q † Σ i , 2 2 i 1 i σ 0 J i = εi j k J j k , Σ i = εi j k [γ j , γ k ] = (from Eq. (39.13)), 0 σi 2 2  [P 2 , J i ] = 0, (from Eq. (39.16)), [J 3 , J2 ] = 0 (from Eq. (39.17)). ∗ 1 1 {Aa , A†b } = δab , Aa = √ Q a , Aa† = √ Q a† , (γ 0 γ 0 = −I ) (from Eq. (39.18)) according m m  ∗ we may identify a particle state by |m, j, σ , σ = − j, − j + 1, · · · j − 1, j. 
to the equations in  i Using the fact that Σ ab = (σ i )ab , for a, b = 1, 2, we may use the first two equation on the   1 1 first line to infer that: JAa† = A†b Jδba + σ ba , JAa = Ab Jδba − σ ba , f or a, b = 1, 2, 2 2 where recall that σ 3 = diag [1, −1]. One can always define a normalized state |ϕ such that Aa |ϕ = 0, a = 1, 2. For example, if A1 |ϕ = 0, then you may consider the normalized state A1 |ϕ which is annihilated by A1 , since (A1 )2 = 0 and so on. Accordingly, given Aa |ϕ = 0, consider the following. For any given unit vector n, and any angle ϑ: (∂/∂ϑ) exp[i ϑn · J ] Aa exp[−i ϑn · J ] = −(i/2)n · σ exp[i ϑn · J ] Aa exp[−i ϑn · J ], ⇒  Ab . That is Aa exp [−i ϑn · J ]|ϕ = 0, as well. exp[i ϑn · J ] Aa exp[−i ϑn · J ] = exp[−i ϑn · σ /2] ab

Aa | j, σ  j, σ | exp[−iϑn · J] |ϕ, which We may carry out the expansion 0 = Aa exp [−iϑn · J ] |ϕ = j,σ

is true for all ϑ, n. That is, the latter is true for all arbitrary coefficients  j, σ | exp [−iϑn · J] |ϕ as we may vary over the infinite possible values that may be taken by (n, ϑ). Thus we may infer that one may set-up normalized spin states | j, σ  ≡ |m, j, σ  such that Aa |m, j, σ  = 0, for a = 1, 2.

Box 39.3 Intricacies of a supermultiplet for particles with a common zero mass

For massless supermultiplets, consider a particle with energy-momentum $P^{\mu} = (E, 0, 0, E)$, $E > 0$. Working in the chiral representation of the Dirac matrices, the first equation in (39.18) implies that for $B_a = Q_a/\sqrt{2E}$, $B_a^{\dagger} = Q_a^{\dagger}/\sqrt{2E}$:
\[ \{B_1, B_1^{\dagger}\} \neq 0, \quad \{B_2, B_2^{\dagger}\} = 0, \quad \{B_1, B_2^{\dagger}\} = 0, \quad \{B_2, B_1^{\dagger}\} = 0, \quad (B_1)^2 = 0, \quad (B_1^{\dagger})^2 = 0. \]
We also have
\[ J^3 B_1^{\dagger} = B_1^{\dagger}\big(J^3 + 1/2\big), \qquad J^3 B_1 = B_1\big(J^3 - 1/2\big). \]
Hence if $\lambda$ denotes the largest helicity in a supermultiplet, then $B_1^{\dagger}|E, \lambda\rangle = 0$, and another state is $B_1|E, \lambda\rangle = |E, \lambda - 1/2\rangle$. Since $(B_1)^2 = 0$, this means that $B_1|E, \lambda - 1/2\rangle = 0$.


These multiplets are represented by superfields which are functions of x and θ . The fields are components of these superfields. The latter are introduced in the next chapter.3
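The counting of spin states in a massive supermultiplet described above can be checked with a few lines of code. The following is only a minimal sketch: the function names `massive_supermultiplet` and `count_states` are hypothetical helpers introduced here for illustration, and the only input taken from the text is the spin content $(j, j, j+1/2, j-1/2)$ for $j > 0$ and $(0, 0, 1/2)$ for $j = 0$.

```python
from fractions import Fraction


def massive_supermultiplet(j):
    """Spin content of the massive supermultiplet built on the spin index j."""
    j = Fraction(j)
    if j < 0 or (2 * j).denominator != 1:
        raise ValueError("j must be a non-negative integer or half-integer")
    # (j, j, j + 1/2, j - 1/2) for j > 0, and (0, 0, 1/2) for j = 0
    spins = [j, j, j + Fraction(1, 2)]
    if j > 0:
        spins.append(j - Fraction(1, 2))
    return spins


def count_states(spins):
    """Return (bosonic, fermionic) numbers of spin states, 2s + 1 for each spin s."""
    bosonic = sum(int(2 * s + 1) for s in spins if s.denominator == 1)
    fermionic = sum(int(2 * s + 1) for s in spins if s.denominator == 2)
    return bosonic, fermionic


for j in (0, Fraction(1, 2), 1, Fraction(3, 2)):
    spins = massive_supermultiplet(j)
    print(j, [str(s) for s in spins], count_states(spins))
```

For every $j$ in the loop the two counts agree and equal $2(2j+1)$, in line with the statement that a supermultiplet contains equal numbers of bosonic and fermionic spin states.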

3 For additional related details see Manoukian [1], pp. 118–126.

Reference
1. Manoukian, E. B. (2016). Quantum field theory II: Introductions to quantum gravity, supersymmetry and string theory. Switzerland: Springer.

40 Superfields

Prerequisite Chaps. 38, 39

In order to develop SUSY field-theoretical interactions, it is important first to introduce superfields. The latter represent the supermultiplets considered in the last chapter. In Eq. (38.27) of Chap. 38, we have already introduced a scalar superfield by making an expansion in products of the components $\theta_a$ up to the product of four components $\theta_1 \cdots \theta_4$, due to the fact that a product involving two or more factors of any given component is zero, e.g., $(\theta_1)^2 = 0$ due to its Grassmann character. We will first elaborate further on the nature of the components of the Majorana spinor $\theta$ and their products. The Majorana character of $\theta$ leads to the following:
\[ \theta = \mathcal{C}\,\bar\theta^{\top}, \quad \mathcal{C}_{\mathrm{chiral}} = \begin{pmatrix} -i\sigma^2 & 0 \\ 0 & i\sigma^2 \end{pmatrix}, \quad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \;\Rightarrow\; \theta = \begin{pmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \\ \theta_4 \end{pmatrix}, \quad \theta_1^{*} = \theta_4, \quad \theta_2^{*} = -\theta_3. \tag{40.1} \]

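Moreover, from Eqs. (38.4) and (38.5) of Chap. 38, in particular,
\[ \bar\theta\gamma^{\mu}\theta = 0, \quad \bar\theta[\gamma^{\mu}, \gamma^{\nu}]\theta = 0, \quad \bar\theta\theta \neq 0, \quad \bar\theta\gamma^5\theta \neq 0, \quad \bar\theta\gamma^5\gamma^{\mu}\theta \neq 0, \quad \bar\theta\gamma^5\gamma^{\mu}\gamma^{\nu}\theta = -\eta^{\mu\nu}\,\bar\theta\gamma^5\theta, \tag{40.2} \]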

where we have included the last identity, which is easily established from the elementary properties of the gamma matrices and the Majorana character of $\theta$. Thus if, say, we want to expand products of (anti-commuting) components of a Majorana spinor, we may do this solely in terms of $\bar\theta\theta$, $\bar\theta\gamma^5\theta$, $\bar\theta\gamma^5\gamma^{\mu}\theta$, as the first two expressions in (40.2) vanish. Here we recall that the matrices $I$, $\gamma^5$, $\gamma^{\mu}$, $\gamma^5\gamma^{\mu}$, $[\gamma^{\mu}, \gamma^{\nu}]$ give rise to 16 independent $4 \times 4$ matrices. For example, for $\theta$ with anti-commuting components, we may write
\[ \bar\theta_b\,\theta_a = c_1\,\delta_{ab}\,\bar\theta\theta + c_2\,\gamma^5_{ab}\,\bar\theta\gamma^5\theta + c_3\,(\gamma^5\gamma_{\mu})_{ab}\,\bar\theta\gamma^5\gamma^{\mu}\theta. \tag{40.3} \]
The coefficients $c_1$, $c_2$, $c_3$ are easily obtained by multiplying, in turn, by $\delta_{ab}$, by $\gamma^5_{ba}$, and finally by $(\gamma^5\gamma^{\sigma})_{ba}$, to obtain $c_1 = c_2 = c_3 = 1/4$, where recall, in particular, that $\{\gamma^5, \gamma^{\mu}\} = 0$, $\{\gamma^{\mu}, \gamma^{\nu}\} = -2\,\eta^{\mu\nu}$, $(\gamma^5)^2 = I$. For further applications, we use the notation $B$ for the matrices $I$, $\gamma^5$, $\gamma^5\gamma^{\mu}$, and rewrite (40.3) in the more elegant form, after multiplying it by $-\mathcal{C}_{bc}$, using in the process the Majorana condition in (40.1), and doing an obvious relabeling of the components of $\theta$, as
\[ \theta_a\theta_b = -\frac{1}{4}\sum_{B}\mathcal{C}_{ca}\,B_{bc}\;\bar\theta B\theta = \frac{1}{4}\sum_{B}(B\,\mathcal{C})_{ab}\;\bar\theta B\theta, \tag{40.4} \]
where note from the properties in Eq. (38.3) of Chap. 38 that
\[ \mathcal{C}^{-1}B\,\mathcal{C} = B^{\top}, \qquad \mathcal{C}^{\top} = \mathcal{C}^{-1} = -\,\mathcal{C}, \qquad B\,\mathcal{C} = -(B\,\mathcal{C})^{\top}. \tag{40.5} \]

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_40
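The projection argument that gives $c_1 = c_2 = c_3 = 1/4$ in (40.3) rests on a handful of gamma-matrix traces, which can be checked numerically. The sketch below is illustrative only. It assumes one explicit chiral representation consistent with the relations quoted in the text ($\{\gamma^{\mu}, \gamma^{\nu}\} = -2\eta^{\mu\nu}$ with $\eta = \mathrm{diag}(-1, 1, 1, 1)$, and $\gamma^5 = \mathrm{diag}(I, -I)$); the particular block form chosen for $\gamma^0$ and $\gamma^j$ is an assumption, and all helper names are hypothetical.

```python
# 2x2 building blocks (plain nested lists, no external libraries)
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def neg(m):
    return [[-x for x in row] for row in m]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(4)] for i in range(4)]


def scale(c, m):
    return [[c * x for x in row] for row in m]


def trace(x):
    return sum(x[i][i] for i in range(4))


ETA = [-1, 1, 1, 1]  # diagonal of the Minkowski metric
# Assumed chiral representation: gamma^0 = [[0, I], [I, 0]], gamma^j = [[0, s_j], [-s_j, 0]]
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, s, neg(s), Z2) for s in SIGMA]
GAMMA5 = block(I2, Z2, Z2, neg(I2))
ONE = block(I2, Z2, Z2, I2)


def lower(mu):
    """gamma_mu = eta_{mu nu} gamma^nu for the diagonal metric."""
    return scale(ETA[mu], GAMMA[mu])


def projection_traces():
    """Traces used to project out c1, c2, c3 in Eq. (40.3)."""
    t1 = trace(ONE)
    t2 = trace(mul(GAMMA5, GAMMA5))
    t3 = [[trace(mul(mul(GAMMA5, lower(mu)), mul(GAMMA5, GAMMA[sig])))
           for sig in range(4)] for mu in range(4)]
    return t1, t2, t3


t1, t2, t3 = projection_traces()
c1, c2, c3 = 1 / t1, 1 / t2, 1 / t3[0][0]
print(c1, c2, c3)
```

Since $\mathrm{Tr}[I] = \mathrm{Tr}[(\gamma^5)^2] = 4$ and $\mathrm{Tr}[\gamma^5\gamma_{\mu}\gamma^5\gamma^{\sigma}] = 4\,\delta_{\mu}{}^{\sigma}$, while the mixed traces vanish, each contraction isolates a single coefficient, which is how the common value $1/4$ arises.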


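Continuing in this manner, we may conveniently generate the dictionary in Box 40.1, which is useful in dealing with superfields.

Box 40.1 Dictionary of basic properties involving the spinor θ and the charge conjugation matrix C

\[ \bar\theta_b\,\theta_a = \frac{1}{4}\big[\,\delta_{ab}\,\bar\theta\theta + \gamma^5_{ab}\,\bar\theta\gamma^5\theta + (\gamma^5\gamma_{\mu})_{ab}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\big], \qquad \theta_4\theta_3\theta_2\theta_1 = \frac{1}{8}\,(\bar\theta\gamma^5\theta)^2. \]
\[ \theta_a\theta_b\theta_c = \frac{1}{4}\,\bar\theta\gamma^5\theta\,\big[\,(\mathcal{C}\gamma^5)_{ab}\,\theta_c - (\mathcal{C}\gamma^5)_{ac}\,\theta_b + (\mathcal{C}\gamma^5)_{bc}\,\theta_a - \mathcal{C}_{ab}\,(\gamma^5\theta)_c + \mathcal{C}_{ac}\,(\gamma^5\theta)_b - \mathcal{C}_{bc}\,(\gamma^5\theta)_a\,\big]. \]
\[ \theta_a\theta_b\theta_c\theta_d = \frac{1}{16}\,(\bar\theta\gamma^5\theta)^2\,\big[\,(\mathcal{C}\gamma^5)_{ab}(\mathcal{C}\gamma^5)_{cd} - (\mathcal{C}\gamma^5)_{ac}(\mathcal{C}\gamma^5)_{bd} + (\mathcal{C}\gamma^5)_{bc}(\mathcal{C}\gamma^5)_{ad} - \mathcal{C}_{ab}\mathcal{C}_{cd} + \mathcal{C}_{ac}\mathcal{C}_{bd} - \mathcal{C}_{bc}\mathcal{C}_{ad}\,\big]. \]
\[ \bar\theta\theta\;\theta_c = -\,\bar\theta\gamma^5\theta\,(\gamma^5\theta)_c, \quad \bar\theta\theta\;\bar\theta_c = -\,\bar\theta\gamma^5\theta\,(\bar\theta\gamma^5)_c, \quad \bar\theta\gamma^5\gamma^{\mu}\theta\;\theta_c = -\,\bar\theta\gamma^5\theta\,(\gamma^{\mu}\theta)_c, \quad \bar\theta\gamma^5\gamma^{\mu}\theta\;\bar\theta_c = \bar\theta\gamma^5\theta\,(\bar\theta\gamma^{\mu})_c, \]
\[ \bar\theta\theta\;\theta_a\theta_b = -\frac{1}{4}\,\mathcal{C}_{ab}\,(\bar\theta\gamma^5\theta)^2, \quad \bar\theta\gamma^5\theta\;\theta_a\theta_b = \frac{1}{4}\,(\gamma^5\mathcal{C})_{ab}\,(\bar\theta\gamma^5\theta)^2, \quad \bar\theta\gamma^5\gamma^{\sigma}\theta\;\theta_a\theta_b = \frac{1}{4}\,(\gamma^5\gamma^{\sigma}\mathcal{C})_{ab}\,(\bar\theta\gamma^5\theta)^2. \]
\[ (\bar\theta\theta)(\bar\theta\gamma^5\gamma^{\mu}\theta) = 0, \quad (\bar\theta\gamma^5\gamma^{\mu}\theta)(\bar\theta\gamma^5\theta) = 0, \quad (\bar\theta\gamma^5\gamma^{\mu}\theta)(\bar\theta\gamma^5\gamma^{\sigma}\theta) = \eta^{\mu\sigma}\,(\bar\theta\gamma^5\theta)^2, \]
\[ (\bar\theta\theta)(\bar\theta\gamma^5\theta) = 0, \quad (\bar\theta\theta)^2 = -\,(\bar\theta\gamma^5\theta)^2, \quad \bar\theta[\gamma^{\mu}, \gamma^{\nu}]\theta = 0, \quad \bar\theta\gamma^5[\gamma^{\mu}, \gamma^{\nu}]\theta = 0, \]
\[ \bar\theta\xi = \bar\xi\theta, \quad \bar\theta\gamma^5\gamma^{\mu}\xi = \bar\xi\gamma^5\gamma^{\mu}\theta, \quad \bar\theta\gamma^{\mu}\xi = -\,\bar\xi\gamma^{\mu}\theta, \quad \bar\theta\gamma^5\xi = \bar\xi\gamma^5\theta, \]
\[ \bar\theta\gamma^5\gamma^{\mu}\gamma^{\nu}\xi = -\,\eta^{\mu\nu}\,\bar\theta\gamma^5\xi - \frac{1}{2}\,\bar\xi\gamma^5[\gamma^{\mu}, \gamma^{\nu}]\theta. \]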
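A few entries of the dictionary in Box 40.1 can be verified by brute force with a small Grassmann-algebra routine. The sketch below is only illustrative and rests on stated assumptions: the chiral-representation matrices $\mathcal{C}$ and $\gamma^5 = \mathrm{diag}(I, -I)$ quoted in this chapter, the relation $\bar\theta = \theta^{\top}\mathcal{C}$ (which follows from $\bar\theta = -\theta^{\top}\mathcal{C}^{-1}$ and $\mathcal{C}^{-1} = -\mathcal{C}$), and treating $\theta_1, \ldots, \theta_4$ as four independent anti-commuting generators. All function names are hypothetical.

```python
from itertools import product


def gmul(x, y):
    """Product of two Grassmann elements.

    An element is a dict mapping a sorted tuple of generator indices
    (the monomial theta_i theta_j ... with i < j < ...) to its coefficient.
    """
    out = {}
    for (mx, cx), (my, cy) in product(x.items(), y.items()):
        if set(mx) & set(my):
            continue  # a repeated generator gives zero
        seq = list(mx + my)
        sign = 1
        # bubble sort: one sign flip per transposition of neighbours
        for i in range(len(seq)):
            for k in range(len(seq) - 1 - i):
                if seq[k] > seq[k + 1]:
                    seq[k], seq[k + 1] = seq[k + 1], seq[k]
                    sign = -sign
        key = tuple(seq)
        out[key] = out.get(key, 0) + sign * cx * cy
    return {k: v for k, v in out.items() if v != 0}


def gadd(x, y):
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}


def gscale(c, x):
    return {k: c * v for k, v in x.items() if c * v != 0}


def theta(i):
    return {(i,): 1}


# Chiral-representation matrices quoted in the text (real entries here)
C = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]]
GAMMA5 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY = [[1 if a == b else 0 for b in range(4)] for a in range(4)]


def theta_bar(b):
    """bar(theta)_b = theta_a C_ab, with spinor indices running from 1 to 4."""
    out = {}
    for a in range(1, 5):
        out = gadd(out, gscale(C[a - 1][b - 1], theta(a)))
    return out


def bilinear(m):
    """bar(theta) M theta for a 4x4 numerical matrix M."""
    out = {}
    for a in range(1, 5):
        for b in range(1, 5):
            out = gadd(out, gscale(m[a - 1][b - 1], gmul(theta_bar(a), theta(b))))
    return out


tt = bilinear(IDENTITY)   # bar(theta) theta
t5 = bilinear(GAMMA5)     # bar(theta) gamma^5 theta
top = gmul(gmul(theta(4), theta(3)), gmul(theta(2), theta(1)))

print(gmul(t5, t5) == gscale(8, top))            # theta4 theta3 theta2 theta1 = (1/8)(...)^2
print(gmul(tt, tt) == gscale(-1, gmul(t5, t5)))  # (bar-theta theta)^2 = -(bar-theta g5 theta)^2
print(gmul(tt, t5) == {})                        # (bar-theta theta)(bar-theta g5 theta) = 0
```

The same routine also reproduces $\bar\theta\theta_L = (\bar\theta\theta - \bar\theta\gamma^5\theta)/2 = 2\,\theta_3\theta_4$, which is used later for the chiral superfield.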

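Scalar Superfield

From Eq. (40.2), we note that the only non-vanishing bilinear (sesquilinear) forms $\bar\theta \cdots \theta$ are $\bar\theta\theta$, $\bar\theta\gamma^5\theta$, $\bar\theta\gamma^5\gamma^{\mu}\theta$, where note that from the last line in Box 40.1, $\bar\theta\gamma^5\gamma^{\mu}\gamma^{\nu}\theta = -\eta^{\mu\nu}\,\bar\theta\gamma^5\theta$. The scalar superfield is defined by
\[ \Phi(x, \theta) = A(x) - i\,\bar\theta\gamma^5\psi(x) + \frac{i}{4}\,\bar\theta\gamma^5\theta\,B(x) - \frac{1}{4}\,\bar\theta\theta\,G(x) - \frac{1}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,V_{\mu}(x) + \frac{i}{2\sqrt{2}}\,\bar\theta\gamma^5\theta\;\bar\theta\Big[\chi(x) + \frac{i}{\sqrt{2}}\,\gamma\partial\,\psi(x)\Big] - \frac{1}{16}\,(\bar\theta\gamma^5\theta)^2\Big[D(x) + \frac{1}{2}\,\Box A(x)\Big]. \tag{40.6} \]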

Here $A(x)$, $B(x)$, $G(x)$, $D(x)$ are Lorentz scalars, $V^{\mu}(x)$ is a Lorentz vector, $\psi(x)$, $\chi(x)$ are spinors, and $\Box = \partial_{\mu}\partial^{\mu}$. The numerical coefficients in this expansion, as well as the coefficients of $\bar\theta\gamma^5\theta\,\bar\theta$ and $(\bar\theta\gamma^5\theta)^2$, expressed in the above forms, are written for convenience. One may, of course, absorb the numerical coefficients in the fields and introduce other symbols for the coefficients just mentioned, at the cost of complicating the algebra and the interpretation that follows from the analysis. In particular, it is important to emphasize that the $\theta$-independent part $A(x)$ of the above expression is a Lorentz scalar. The SUSY transformations between spinor and bosonic fields may be readily obtained by considering the infinitesimal change of the scalar superfield in Eq. (40.6) under the SUSY transformation in Eq. (38.8) of Chap. 38 for $b^{\mu} = 0$:
\[ x'^{\mu} = x^{\mu} + \frac{i}{2}\,\bar\epsilon\gamma^{\mu}\theta, \qquad \theta'_a = \theta_a + \epsilon_a. \tag{40.7} \]
By definition, a scalar superfield satisfies $\Phi'(x', \theta') = \Phi(x, \theta)$. That is,
\[ \Phi'(x, \theta) = \Phi\Big(x - \frac{i}{2}\,\bar\epsilon\gamma\theta,\; \theta - \epsilon\Big) = \Phi(x, \theta) + \Big(-\frac{i}{2}\,\bar\epsilon\gamma^{\mu}\theta\,\partial_{\mu} - \bar\epsilon\,\frac{\partial}{\partial\bar\theta}\Big)\Phi(x, \theta), \tag{40.8} \]
\[ \delta\Phi(x, \theta) = \Phi(x, \theta) - \Phi'(x, \theta) = \bar\epsilon\,\Big(\frac{\partial}{\partial\bar\theta} + \frac{i}{2}\,\gamma^{\mu}\theta\,\partial_{\mu}\Big)\Phi(x, \theta), \quad \text{for infinitesimal } \epsilon. \tag{40.9} \]

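We note that the above operation is almost the same as $D$ in Eq. (38.22) of Chap. 38, with the sign in the middle of $D$ changed to a plus sign. We have already learnt to take derivatives with respect to $\bar\theta$ in Eqs. (38.16)–(38.18) in Chap. 38. In particular
\[ \frac{\partial}{\partial\bar\theta_a}(\bar\theta\gamma^5)_b = (\gamma^5)_{ab}, \qquad \frac{\partial}{\partial\bar\theta_a}\,\bar\theta\gamma^5\theta = 2\,(\gamma^5\theta)_a, \qquad \frac{\partial}{\partial\bar\theta_a}\,\bar\theta\theta = 2\,\theta_a, \qquad \frac{\partial}{\partial\bar\theta_a}\,\bar\theta\gamma^5\gamma^{\mu}\theta = 2\,(\gamma^5\gamma^{\mu}\theta)_a, \]
\[ \frac{\partial}{\partial\bar\theta_a}\big(\bar\theta\gamma^5\theta\;\bar\theta_b\big) = 2\,(\gamma^5\theta)_a\,\bar\theta_b + \bar\theta\gamma^5\theta\;\delta_{ab}, \qquad \frac{\partial}{\partial\bar\theta_a}\,(\bar\theta\gamma^5\theta)^2 = 4\,(\bar\theta\gamma^5\theta)\,(\gamma^5\theta)_a. \tag{40.10} \]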

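Accordingly, from Eqs. (40.6), (40.9) and (40.10), the following SUSY transformation rules emerge between the fields:
\[ \delta A = -\,i\,\bar\epsilon\gamma^5\psi, \qquad \delta\psi = -\frac{1}{2}\,(B + i\gamma^5 G)\,\epsilon - \frac{1}{2}\,\gamma^{\mu}(\gamma^5\partial_{\mu}A + iV_{\mu})\,\epsilon, \tag{40.11} \]
\[ \delta B = \frac{1}{\sqrt{2}}\,\bar\epsilon\,\big(\chi + i\sqrt{2}\,\gamma^{\mu}\partial_{\mu}\psi\big), \qquad \delta G = \frac{i}{\sqrt{2}}\,\bar\epsilon\gamma^5\big(\chi + i\sqrt{2}\,\gamma^{\mu}\partial_{\mu}\psi\big), \tag{40.12} \]
\[ \delta V_{\mu} = \frac{i}{\sqrt{2}}\,\bar\epsilon\,\big(\gamma_{\mu}\chi - i\sqrt{2}\,\partial_{\mu}\psi\big), \qquad \delta\chi = -\frac{1}{\sqrt{2}}\Big[\frac{1}{2}\,[\gamma^{\sigma}, \gamma^{\mu}]\,\partial_{\sigma}V_{\mu} - i\gamma^5 D\Big]\epsilon, \qquad \delta D = \frac{1}{\sqrt{2}}\,\bar\epsilon\gamma^5\gamma^{\mu}\partial_{\mu}\chi. \tag{40.13} \]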

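From the expression of the scalar superfield $\Phi(x, \theta)$, the second identity in Box 40.1, $(1/8)(\bar\theta\gamma^5\theta)^2 = \theta_4\theta_3\theta_2\theta_1$, and the expression for $\delta D$ in Eq. (40.13), we learn the following:
\[ \int(\mathrm{d}x)(\mathrm{d}\theta)\,\Phi(x, \theta) = -\frac{1}{2}\int(\mathrm{d}x)\Big[D(x) + \frac{1}{2}\,\Box A(x)\Big] = -\frac{1}{2}\int(\mathrm{d}x)\,D(x), \tag{40.14} \]
\[ \delta\int(\mathrm{d}x)(\mathrm{d}\theta)\,\Phi(x, \theta) = -\frac{1}{2}\int(\mathrm{d}x)\,\delta D(x) = -\frac{1}{2\sqrt{2}}\int(\mathrm{d}x)\,\partial_{\mu}\big(\bar\epsilon\gamma^5\gamma^{\mu}\chi(x)\big) = 0, \quad \text{where} \tag{40.15} \]
\[ (\mathrm{d}x) = \mathrm{d}x^0\,\mathrm{d}x^1\,\mathrm{d}x^2\,\mathrm{d}x^3, \quad (\mathrm{d}\theta) = \mathrm{d}\theta_1\,\mathrm{d}\theta_2\,\mathrm{d}\theta_3\,\mathrm{d}\theta_4, \quad \int(\mathrm{d}\theta)\,\theta_4\theta_3\theta_2\theta_1 = 1, \quad \int\mathrm{d}\theta_a\,\theta_b = \delta_{ab}. \tag{40.16} \]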

The first equality in (40.14) arises as a consequence of the $\theta$-integrals in (40.16), and the second equality in (40.14) arises due to the fact that $\Box A = \partial_{\mu}(\partial^{\mu}A)$ is a total derivative. That is, only the $D$ term contributes to the integral of the scalar superfield. The last expression in (40.15) is zero due to the fact that $\partial_{\mu}(\bar\epsilon\gamma^5\gamma^{\mu}\chi(x))$ is a total derivative. This establishes the SUSY invariance of the integral $\int(\mathrm{d}x)(\mathrm{d}\theta)\,\Phi(x, \theta)$ of a scalar superfield.

Chiral Superfield

The scalar superfield is a reducible superfield, in the sense that one may select a subset of its fields which transform among themselves. To this end, one may define left-chiral and right-chiral fields by
\[ \psi_L \equiv \Big(\frac{1 - \gamma^5}{2}\Big)\psi, \quad \psi_R \equiv \Big(\frac{1 + \gamma^5}{2}\Big)\psi, \quad \psi = \psi_L + \psi_R, \quad \gamma^5 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \quad \theta_L = \begin{pmatrix} 0 \\ 0 \\ \theta_3 \\ \theta_4 \end{pmatrix}, \quad \theta_R = \begin{pmatrix} \theta_1 \\ \theta_2 \\ 0 \\ 0 \end{pmatrix}, \tag{40.17} \]

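which have, respectively, only two lower and only two upper non-vanishing components, and note that $\gamma^5\psi_{L/R} = \mp\,\psi_{L/R}$, defining left and right chiralities. A subset of the fields mentioned above may be obtained by setting
\[ \psi_R = 0, \quad \chi = 0, \quad D = 0, \quad i\,V_{\mu} = \partial_{\mu}A, \quad -\,i\,G = B \equiv -\,i\,\mathcal{F}, \tag{40.18} \]
which from (40.6), (40.11)–(40.13) leads to the left-chiral superfield:
\[ \Phi_L(x, \theta) = A(x) + i\,\bar\theta\psi_L(x) - \frac{1}{2}\,\bar\theta\theta_L\,\mathcal{F}(x) + \frac{i}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\partial_{\mu}A(x) - \frac{1}{4}\,\bar\theta\gamma^5\theta\;\bar\theta\gamma\partial\,\psi_L(x) - \frac{1}{32}\,(\bar\theta\gamma^5\theta)^2\,\Box A(x), \tag{40.19} \]
\[ \delta\psi_L = i\,\mathcal{F}\,\epsilon_L - \gamma^{\mu}\epsilon_R\,\partial_{\mu}A, \qquad \delta\mathcal{F} = -\,\bar\epsilon\,\gamma\partial\,\psi_L, \qquad \delta A = i\,\bar\epsilon\,\psi_L, \tag{40.20} \]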

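where $\epsilon_{R/L}$ are defined in the same way as for the $\psi$ field in (40.17), and note that $\bar\theta\theta_L = 2\,\theta_3\theta_4$. We note that the last term in (40.19) is a total derivative and hence cannot contribute to $\int(\mathrm{d}x)\,\Phi_L(x, \theta)$. On the other hand, the change $\delta\mathcal{F}$ is a total derivative, and one expects that an integral of $\Phi_L(x, \theta)$ will simply lead to $\int(\mathrm{d}x)\,\mathcal{F}$. This is indeed the case. To this end, we define, in the process, $\delta^{(2)}(\theta_R)$, to obtain
\[ \int(\mathrm{d}\theta)\,\delta^{(2)}(\theta_R)\,\delta^{(2)}(\theta_L) = 1, \quad \int(\mathrm{d}\theta)\,\delta^{(2)}(\theta_R)\,\bar\theta\theta_R = 0, \quad \delta^{(2)}(\theta_R) = \theta_2\theta_1, \quad \delta^{(2)}(\theta_L) = \theta_4\theta_3, \tag{40.21} \]
\[ \int(\mathrm{d}\theta)\,\delta^{(2)}(\theta_R)\,\bar\theta\theta_L = -2, \quad \int(\mathrm{d}\theta)\,\delta^{(2)}(\theta_R)\,\bar\theta\gamma^5\gamma^{\mu}\theta = 0, \quad \int(\mathrm{d}\theta)\,\delta^{(2)}(\theta_R)\,\bar\theta\gamma^5\theta\;\bar\theta_a = 0, \tag{40.22} \]
\[ \int(\mathrm{d}x)(\mathrm{d}\theta)\,\delta^{(2)}(\theta_R)\,\Phi_L(x, \theta) = \int(\mathrm{d}x)\,\mathcal{F}(x), \qquad \delta\int(\mathrm{d}x)(\mathrm{d}\theta)\,\delta^{(2)}(\theta_R)\,\Phi_L(x, \theta) = \int(\mathrm{d}x)\,\delta\mathcal{F}(x) = 0, \tag{40.23} \]
and the last integral follows from the expression of $\delta\mathcal{F}$ in (40.20), as a result of the fact that $\int(\mathrm{d}x)\,\partial_{\mu}(\bar\epsilon\gamma^{\mu}\psi_L) = 0$. A simplified expression may be obtained for $\Phi_L(x, \theta)$ and is given in Box 40.2.

Box 40.2 Simplified expression for Φ_L(x, θ)

By defining, in the process, the variable $\hat{x}^{\mu}$, and using the Grassmann property of $\theta$, $\bar\theta$, we have
\[ \hat{x}^{\mu} = x^{\mu} + \frac{i}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta, \qquad \text{and} \qquad \exp\Big[-\frac{1}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\frac{\partial_{\mu}}{i}\Big] = 1 + \frac{i}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\partial_{\mu} - \frac{1}{32}\,(\bar\theta\gamma^5\theta)^2\,\Box, \]
where from Box 40.1 we have used the identity $(\bar\theta\gamma^5\gamma^{\mu}\theta)(\bar\theta\gamma^5\gamma^{\sigma}\theta) = \eta^{\mu\sigma}\,(\bar\theta\gamma^5\theta)^2$. Also, Box 40.1 gives
\[ \bar\theta\gamma^5\gamma^{\mu}\theta\;\bar\theta_c = \bar\theta\gamma^5\theta\,(\bar\theta\gamma^{\mu})_c, \qquad \bar\theta\gamma^5\gamma^{\mu}\theta\;\bar\theta\theta_L = \frac{1}{2}\,\bar\theta\gamma^5\gamma^{\mu}\theta\;\bar\theta\theta - \frac{1}{2}\,\bar\theta\gamma^5\gamma^{\mu}\theta\;\bar\theta\gamma^5\theta = 0. \]
Accordingly, using the fact that $\partial_{\mu}/i$ generates a translation, we have
\[ A(\hat{x}) + i\,\bar\theta\psi_L(\hat{x}) - \frac{1}{2}\,\bar\theta\theta_L\,\mathcal{F}(\hat{x}) = \exp\Big[-\frac{1}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\frac{\partial_{\mu}}{i}\Big]\Big[A(x) + i\,\bar\theta\psi_L(x) - \frac{1}{2}\,\bar\theta\theta_L\,\mathcal{F}(x)\Big] \equiv \Phi_L(x, \theta). \]
That is,
\[ \Phi_L(x, \theta) = A(\hat{x}) + i\,\bar\theta\psi_L(\hat{x}) - \frac{1}{2}\,\bar\theta\theta_L\,\mathcal{F}(\hat{x}), \qquad \hat{x}^{\mu} = x^{\mu} + \frac{i}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta. \]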
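The $\theta$-integrals quoted in (40.16) and (40.21)–(40.22) follow from the single rule $\int\mathrm{d}\theta_a\,\theta_b = \delta_{ab}$ applied from the innermost differential outward. The sketch below is a minimal illustration under stated assumptions: Berezin integration over $\mathrm{d}\theta_a$ is implemented as removing $\theta_a$ after anti-commuting it to the far left, the expression $\bar\theta\theta_L = 2\,\theta_3\theta_4$ is taken from the text, and the helper names are hypothetical.

```python
from itertools import product


def gmul(x, y):
    """Product of Grassmann elements stored as {sorted index tuple: coefficient}."""
    out = {}
    for (mx, cx), (my, cy) in product(x.items(), y.items()):
        if set(mx) & set(my):
            continue  # a repeated generator gives zero
        seq = list(mx + my)
        sign = 1
        for i in range(len(seq)):
            for k in range(len(seq) - 1 - i):
                if seq[k] > seq[k + 1]:
                    seq[k], seq[k + 1] = seq[k + 1], seq[k]
                    sign = -sign
        key = tuple(seq)
        out[key] = out.get(key, 0) + sign * cx * cy
    return {k: v for k, v in out.items() if v != 0}


def theta(i):
    return {(i,): 1}


def berezin(x, a):
    """Integrate over d(theta_a): move theta_a to the far left, then remove it."""
    out = {}
    for mono, c in x.items():
        if a in mono:
            p = mono.index(a)  # number of generators theta_a must pass
            key = mono[:p] + mono[p + 1:]
            out[key] = out.get(key, 0) + (-1) ** p * c
    return {k: v for k, v in out.items() if v != 0}


def integrate_all(x):
    """(d theta) = d theta_1 d theta_2 d theta_3 d theta_4, innermost integral first."""
    for a in (4, 3, 2, 1):
        x = berezin(x, a)
    return x.get((), 0)


top = gmul(gmul(theta(4), theta(3)), gmul(theta(2), theta(1)))
delta_R = gmul(theta(2), theta(1))              # delta^(2)(theta_R)
delta_L = gmul(theta(4), theta(3))              # delta^(2)(theta_L)
tbar_theta_L = {k: 2 * v for k, v in gmul(theta(3), theta(4)).items()}

print(integrate_all(top))
print(integrate_all(gmul(delta_R, delta_L)))
print(integrate_all(gmul(delta_R, tbar_theta_L)))
```

The last value is the factor $-2$ that converts the coefficient $-\tfrac{1}{2}\,\mathcal{F}(x)$ of $\bar\theta\theta_L$ in (40.19) into $\int(\mathrm{d}x)\,\mathcal{F}(x)$ in (40.23).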

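A right-chiral superfield may be defined by taking the Hermitian conjugate of a left-chiral one.

(Scalar) Vector Superfields

The (scalar) vector superfield is defined by imposing a reality condition on the general scalar superfield $\Phi(x, \theta)$ in Eq. (40.6): $\Phi(x, \theta) = \Phi^{\dagger}(x, \theta)$. It may be conveniently written as
\[ V(x, \theta) = S(x) - i\,\bar\theta\gamma^5\tilde\psi(x) + \frac{i}{4}\,\bar\theta\gamma^5\theta\,B(x) - \frac{1}{4}\,\bar\theta\theta\,G(x) - \frac{1}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,V_{\mu}(x) + \frac{i}{2\sqrt{2}}\,\bar\theta\gamma^5\theta\;\bar\theta\Big[\chi(x) + \frac{i}{\sqrt{2}}\,\gamma\partial\,\tilde\psi(x)\Big] - \frac{1}{16}\,(\bar\theta\gamma^5\theta)^2\Big[D(x) + \frac{1}{2}\,\Box S(x)\Big]. \tag{40.24} \]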

Although this is a scalar superfield, it is, rather unfortunately, referred to as a (scalar) vector superfield because it involves the Lorentz vector $V^{\mu}$ as a component field. The pure vector superfield will be considered at the end of the chapter. We consider, in turn, the abelian and non-abelian versions of the (scalar) vector superfield.

(Scalar) Vector Superfield: Abelian Case

$S$, $B$, $G$, $D$ in (40.24) are now real Lorentz scalars, while $V^{\mu}$ is a real Lorentz vector, and $\tilde\psi$, $\chi$ are Majorana spinors. In analogy to ordinary gauge transformations, we define a supergauge transformation of, say, a chiral superfield in the following manner
\[ \Phi_L(x, \theta) \to e^{\,i\,e\,\Lambda(x, \theta)}\,\Phi_L(x, \theta), \tag{40.25} \]
such that
\[ \Phi_L^{\dagger}(x, \theta)\,e^{-2\,e\,V(x, \theta)}\,\Phi_L(x, \theta) \to \Phi_L^{\dagger}(x, \theta)\,e^{-2\,e\,V(x, \theta)}\,\Phi_L(x, \theta), \tag{40.26} \]

where the factor of 2 in the exponential is introduced for convenience, $\Lambda(x, \theta)$ is a classical left-chiral superfield, $e^{-2\,e\,V(x, \theta)}$ is referred to as the gauge connection, $e$ is a parameter, and the structure of $\Lambda(x, \theta)$ will be spelled out below. Equation (40.26) is a statement of gauge invariance used in constructing gauge invariant theories. Equation (40.26) then requires that $V$, in turn, has the following supergauge transformation
\[ V(x, \theta) \to V(x, \theta) - \frac{i}{2}\big[\Lambda^{\dagger}(x, \theta) - \Lambda(x, \theta)\big] = V'(x, \theta), \qquad \text{with} \quad \frac{i}{2}\big[\Lambda^{\dagger}(x, \theta) - \Lambda(x, \theta)\big] \quad \text{real}. \tag{40.27} \]

The exponential term $\exp[-2eV]$ in (40.26) is problematic for the simple reason that the presence of the $\theta$-independent term $S(x)$ in $V$, given in (40.24), will generate an infinite series in powers of the parameter $e$. A specific supergauge, referred to as the Wess–Zumino gauge [7], may however be chosen, which remedies this situation. This is done by choosing $\Lambda(x, \theta)$ in the gauge connection $\exp[-2\,e\,V(x, \theta)]$ to gauge away the field components $(S, \tilde\psi, B, G)$ in $V$. To achieve this gauge, we choose
\[ \Lambda(x, \theta) = a(x) - \frac{i}{\sqrt{2}}\,\bar\theta\,\xi_L(x) - \frac{i}{2}\,\bar\theta\theta_L\,b(x) + \frac{i}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\partial_{\mu}a(x) + \frac{1}{4\sqrt{2}}\,\bar\theta\gamma^5\theta\;\bar\theta\gamma\partial\,\xi_L(x) - \frac{1}{32}\,(\bar\theta\gamma^5\theta)^2\,\Box\,a(x), \tag{40.28} \]

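and consistently check how the $\theta$-independent part $S(x)$, in particular, in the expression for $V$ in Eq. (40.24) is gauged away. From (40.25)–(40.27), we obtain the supergauge transformations of the field components of $V$:
\[ S' = S - \mathrm{Im}\,a, \qquad \tilde\psi' = \tilde\psi + \frac{i}{2\sqrt{2}}\,\gamma^5\xi, \qquad B' = B - \mathrm{Im}\,b, \tag{40.29} \]
\[ G' = G - \mathrm{Re}\,b, \qquad V'_{\mu} = V_{\mu} + \partial_{\mu}\,\mathrm{Re}\,a, \qquad \chi' = \chi, \qquad D' = D. \tag{40.30} \]
Hence the field components $(S, \tilde\psi, B, G)$ of $V$ may be gauged away by choosing $\mathrm{Im}\,a$, $\xi$, $b$ appropriately to cancel the just mentioned field components. This supergauge is referred to as the Wess–Zumino gauge.1 In this gauge, $V$ and $\Lambda$ reduce to the following forms
\[ V(x, \theta) = -\frac{1}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,V_{\mu}(x) + \frac{i}{2\sqrt{2}}\,\bar\theta\gamma^5\theta\;\bar\theta\chi(x) - \frac{1}{16}\,(\bar\theta\gamma^5\theta)^2\,D(x), \tag{40.31} \]
\[ \Lambda(x, \theta) = a(x) + \frac{i}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\partial_{\mu}a(x) - \frac{1}{32}\,(\bar\theta\gamma^5\theta)^2\,\Box\,a(x), \qquad a(x) \equiv \mathrm{Re}\,a(x). \tag{40.32} \]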

It is remarkable that $V$ has no $\theta$-independent part, so the gauge connection $\exp[-2\,e\,V(x, \theta)]$ must reduce to a finite number of terms, i.e., to a polynomial, upon expansion in powers of the parameter $e$, as a consequence of the Grassmann properties of $\theta$, $\bar\theta$. We note that $\chi$ and $D$ are gauge invariant, while the vector field components $V_{\mu}$ have the conventional gauge transformations of an abelian gauge theory
\[ V' = V - \frac{1}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\partial_{\mu}a, \qquad V_{\mu}(x) \to V_{\mu}(x) + \partial_{\mu}a(x). \tag{40.33} \]

1 Wess and Zumino [7].

(Scalar) Vector Superfield: Non-Abelian Case

In the non-abelian case, one deals with the Hermitian generators $t_C$ satisfying the commutation relations $[\,t_A, t_B\,] = i\,f_{ABC}\,t_C$, where $f_{ABC}$ are the structure constants. One often uses capital indices for the generators in order not to confuse them with other indices occurring in the theory. One may define a (scalar) vector superfield for the non-abelian case directly from that in (40.24), with now $t_C V_C = V$, where all the fields now carry an additional index $C$. Similarly, one defines $t_C\,\Phi_{C\,L} = \Phi$, $t_C\,\Lambda_C = \Lambda$. Under a gauge transformation, (40.27) is replaced by
\[ \exp(-2\,g\,V) \to \exp(i\,g\,\Lambda^{\dagger})\,\exp(-2\,g\,V)\,\exp(-i\,g\,\Lambda) \equiv \exp(-2\,g\,V'), \tag{40.34} \]

noticing the complicated nature of the matrix form in the exponentials. To deal with this expression, we may use the Baker–Campbell–Hausdorff formula2
\[ e^{C}\,e^{A}\,e^{B} = \exp\Big(A + (B + C) + \frac{1}{2}\,[\,A, B - C\,] + \frac{1}{12}\,[\,A, [\,A, B + C\,]\,] + \mathcal{O}\Big), \tag{40.35} \]
where $\mathcal{O}$ involves three or more factors of the matrix $A$ in the subsequent commutators. We set
\[ A = -2\,g\,V, \qquad B = -\,i\,g\,\Lambda, \qquad C = i\,g\,\Lambda^{\dagger}. \tag{40.36} \]

The basic idea in handling the non-abelian case, in the light of the complicated exponential structures in (40.34), (40.35), is to show that in a Wess–Zumino gauge $V$ involves at least two factors of $\theta$'s. By doing this, the factor $\mathcal{O}$ in (40.35), due to the expression of $A$ in (40.36), will involve at least six factors of $\theta$'s and hence is identically equal to zero on account of the Grassmann property of the $\theta$'s. We note that
\[ A + (B + C) = -\,2\,g\,t_{\alpha}\Big[V^{\alpha} - \frac{i}{2}\,\big(\Lambda^{\alpha\dagger} - \Lambda^{\alpha}\big)\Big], \tag{40.37} \]
and the Wess–Zumino supergauge now consists, from (40.35), (40.37), (40.24), and finally (40.29) and the first equation in (40.30), in choosing
\[ \mathrm{Im}\,a_E \ \text{to cancel} \ S_E, \qquad -\frac{i}{2\sqrt{2}}\,\gamma^5\xi_E \ \text{to cancel} \ \tilde\psi_E, \tag{40.38} \]
\[ \mathrm{Im}\,b_E \ \text{to cancel} \ B_E, \qquad \mathrm{Re}\,b_E \ \text{to cancel} \ G_E. \tag{40.39} \]

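The (scalar) vector superfield $V$ then reduces explicitly to the structure given in (40.31), now with $V = t_E V_E$, involving at least two powers of $\theta$'s. Also, $(B + C) = i\,g\,(\Lambda^{\dagger} - \Lambda)$ from (40.36) involves at least two powers of the $\theta$'s. Hence $[\,A, [\,A, B + C\,]\,]$ also vanishes. Accordingly, (40.34), (40.35) give
\[ V' = V - \frac{i}{2}\,(\Lambda^{\dagger} - \Lambda) - i\,g\,\Big[V, \frac{1}{2}\,(\Lambda^{\dagger} + \Lambda)\Big], \tag{40.40} \]
\[ \frac{1}{2}\,(\Lambda^{\dagger} + \Lambda) = a - \frac{1}{32}\,(\bar\theta\gamma^5\theta)^2\,\Box\,a, \qquad \frac{i}{2}\,(\Lambda^{\dagger} - \Lambda) = \frac{1}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\partial_{\mu}a, \tag{40.41} \]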

where the last term on the right-hand side of (40.40) corresponds to the commutator $(1/2)[\,A, B - C\,]$, and we used the expression in (40.32), now with $\Lambda = t_C\Lambda_C$, to write the corresponding ones in (40.41). That is,
\[ V'_C = V_C - \frac{1}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\partial_{\mu}a_C + g\,f_{DEC}\,V_D\,a_E, \qquad V'_{C\mu} = V_{C\mu} + \partial_{\mu}a_C + g\,f_{DEC}\,V_{D\mu}\,a_E, \tag{40.42} \]
\[ \chi'_C = \chi_C + g\,f_{DEC}\,\chi_D\,a_E, \qquad D'_C = D_C + g\,f_{DEC}\,D_D\,a_E, \qquad a_E(x) = \mathrm{Re}\,a_E(x), \tag{40.43} \]

with the infinitesimal transformations of the vector $V_{C\mu}$, the spinors $\chi_C$, and $D_C$, as given in (40.42), (40.43), as expected (see also Eqs. (17.24)–(17.26) in Chap. 17). We also introduce two more superfields, the Spinor Superfield and the Pure Vector Superfield, without going much into their underlying analyses, however, as they are quite involved and beyond the scope of the present book.3

2 For a proof of the Baker–Campbell–Hausdorff formula and its generalizations, see Manoukian [2], Appendix I, pp. 975, 976.

3 For the underlying analyses, see p. 138, pp. 148, 149 for the spinor superfield, and pp. 135–138, pp. 146–148 for the Pure Vector Superfield of my book Manoukian [3].

Spinor Superfield

In the Wess–Zumino gauge with $\Lambda$ given in (40.32), the Spinor Superfield is defined by
\[ W_{Aa} = \exp\Big[\frac{i}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\partial_{\mu}\Big]\Big[\chi_{Aa} + \frac{1}{\sqrt{2}}\,(\gamma^{\mu}\gamma^{\nu}\theta_L)_a\,G_{A\mu\nu} - i\,(\gamma\nabla\chi)_{Aa}\,\bar\theta\theta_L - i\sqrt{2}\,D_A\,(\theta_L)_a\Big], \tag{40.44} \]
\[ G_{A\mu\nu} = \partial_{\mu}V_{A\nu} - \partial_{\nu}V_{A\mu} + g\,f_{ABC}\,V_{B\mu}V_{C\nu}, \qquad (\gamma\nabla\chi)_{Aa} = \big[\gamma^{\mu}\big(\delta_{AC}\,\partial_{\mu} + g\,f_{ABC}\,V_{B\mu}\big)\chi_C\big]_a, \tag{40.45} \]

where a is a spinor index, and, as carried out in Box 40.1, we have factored out the translational operator which simply changes the variable x in the fields above to xˆ which is also defined in this Box. The presence of a spinor χ Aa as a θ independent part of W Aa should be noted. For the abelian counterpart simply set f AB C = 0, and delete the corresponding indices. Pure Vector Superfield With the success of abelian and non-abelian gauge theories involving vector bosons gauge fields, the first thing that comes into one’s mind is promoting a Lorentz vector field into a (pure) vector superfield. Surprisingly, the explicit expression of the (pure) vector superfield has been explicitly derived and spelled out recently4 and are defined as follows: Pure Abelian Vector Superfield The pure abelian vector superfield, in the Wess–Zumino supergauge, is given by   i i i V μ = V μ + √ θ γ μ χ + θ γ 5 γν θ Aνμ +θ γ 5 θ θ B μ + (θ γ 5 θ )2 ∂ν Aνμ − ∂ ν V μ , 4 8 2

(40.46)

and with F μν = ∂ μ V ν − ∂ ν V μ , Aνμ =

 i μ ν 1 αβνμ 1 1  1 ∂ V + ε Fαβ + ηνμ D, B μ = √ ημν − γ 5 γ μ γ ν ∂ν χ , 4 8 4 2 2 2

(40.47)

and note that the θ independent part of Vμ is a Lorentz vector. Also note the order of the indices in ∂ μ V ν in the first term on the right-hand of the above equation. Pure Non-Abelian Vector Superfield The pure vector superfield Vμ , in the Wess–Zumino supergauge, is again given by the expression in (40.46) but now Vμ = t E V Eμ .   i i i V μ = V μ + √ θ γ μ χ + θ γ 5 γν θ Aνμ +θ γ 5 θ θ B μ + (θ γ 5 θ )2 ∂ν Aνμ − ∂ ν V μ , 4 8 2 i λ ρ i 1 1 ∂ V (x) − G λρ (x) + ε λρσ μ G σ μ (x) + ηλρ D(x), 4 4 8 4 G σ μ (x) = ∂σ Vμ (x) − ∂μ Vσ (x) − i g [Vσ (x), Vμ (x)],

Aλρ (x) =

B ρ (x) = λρ

μ

1+γ 5 1 1 ig [ Vσ (x), χ (x) ]. √ (ηρσ − γ 5 γ ρ γ σ ) ∂σ χ (x) + √ γ ρ γ σ 2 2 2 2 2 2

(40.48)

(40.49) (40.50) (40.51)

μ

Aλρ = t E A E , B a = t E B E a , where a is a spinor index, [ Vσ , χ ] = i t E f EC D VCσ χ D ,

(40.52)

as usual. Note the order of the indices in $\partial^{\lambda}V^{\rho}$, now, in the first term on the right-hand side of (40.49). For additional general references on superfields see also Manoukian [3], Binetruy [1], Weinberg [6]. Of these textbooks, only the first one [3] includes the pure vector superfield.

4 See Manoukian [4], as well as Manoukian [3], pp. 135–138, pp. 146–148. See also the pioneering work of Salam and Strathdee [5].

References
1. Binetruy, P. (2006). Supersymmetry, experiment, and cosmology. Oxford: Oxford University Press.
2. Manoukian, E. B. (2006). Quantum theory: A wide spectrum. Dordrecht: Springer.
3. Manoukian, E. B. (2016). Quantum field theory II: Introductions to quantum gravity, supersymmetry and string theory. Switzerland: Springer.
4. Manoukian, E. B. (2012). The explicit pure vector superfield in gauge theories. Journal of Modern Physics, 3, 682–685.
5. Salam, A., & Strathdee, J. (1974). Supersymmetry and non-abelian gauges. Physics Letters, B51, 353–355.
6. Weinberg, S. (2000). The quantum theory of fields. III: Supersymmetry. Cambridge: Cambridge University Press.
7. Wess, J., & Zumino, B. (1974). Supergauge transformations in four dimensions. Nuclear Physics B, 70, 39–50.

41 SUSY Field Theories: Maxwellian, Yang–Mills, Spin0-Spin1/2 Interactions

Prerequisite Chap. 40

Now that we have introduced a large class of superfields in the last chapter, we are ready to construct SUSY field theories. We first want to see how the very first gauge theory, that of Maxwell, is modified in a SUSY setting. After investigating the SUSY Maxwell-Field Theory, we also consider the SUSY Yang and Mills-Field Theory, as well as the very first SUSY theory studied, involving spin 0-spin 1/2 particles, referred to as a Wess–Zumino Theory [4].

SUSY Maxwell-Field Theory

In the Wess–Zumino gauge, we have seen that the spinor superfield associated with an abelian gauge field $V^{\mu}$ is, from Eqs. (40.44), (40.45) of Chap. 40 with $f_{ABC}$ set equal to zero, given by
\[ W_a = \exp\Big[\frac{i}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\partial_{\mu}\Big]\Big[\chi_a + \frac{1}{\sqrt{2}}\,(\gamma^{\mu}\gamma^{\nu}\theta_L)_a\,F_{\mu\nu} - i\,(\gamma\partial\chi)_a\,\bar\theta\theta_L - i\sqrt{2}\,D\,(\theta_L)_a\Big], \tag{41.1} \]
\[ F_{\mu\nu} = \partial_{\mu}V_{\nu} - \partial_{\nu}V_{\mu}. \tag{41.2} \]

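Using the fact that $\mathcal{C}^{-1} = \mathcal{C}^{\top}$, and $\bar\theta = -\theta^{\top}\mathcal{C}^{-1} = -(\mathcal{C}\theta)^{\top}$, $\bar\theta_a = -(\mathcal{C}\theta)_a$, we now consider the bilinear form $(\overline{W}W) = -(\mathcal{C}W)^{\top}W$, and introduce the integral
\[ I = \int(\mathrm{d}\theta)\,\delta^{(2)}(\theta_R)\,(\mathcal{C}W)_a\,W_a, \tag{41.3} \]
and only a term involving $\theta_3\theta_4 = (1/2)\,\bar\theta\theta_L$, with, as given in Eq. (40.22),
\[ \int(\mathrm{d}\theta)\,\delta^{(2)}(\theta_R)\,\bar\theta\theta_L = -\,2, \]
can contribute from the expression of $(\mathcal{C}W)_a W_a$, on account that only $\int(\mathrm{d}\theta)\,\theta_4\theta_3\theta_2\theta_1 \neq 0$, which is equal to 1. According to Eq. (40.19) of Chap. 40, this term, involving the factor $\bar\theta\theta_L$, is referred to as the $F$ term. Clearly, the exponential term in Eq. (41.1) cannot contribute, as it would generate, upon expansion, in addition to terms involving three $\theta$'s, a term involving two $\theta$'s proportional to $\frac{i}{4}\,(\bar\theta\gamma^5\gamma^{\mu}\theta)\,\partial_{\mu}\chi_a$, which gives no contribution since, according to the second integral in Eq. (40.22) of Chap. 40,
\[ \int(\mathrm{d}\theta)\,\delta^{(2)}(\theta_R)\,\bar\theta\gamma^5\gamma^{\mu}\theta = 0. \tag{41.4} \]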

In Box 41.1, it is shown that
\[ (\mathcal{C}W)_a\,W_a\,\Big|_{F} = \big(2\,i\,\bar\chi\gamma^{\mu}\partial_{\mu}\chi + 2\,D^2 - F^{\mu\nu}F_{\mu\nu}\big)\,\bar\theta\theta_L, \tag{41.5} \]

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_41

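Box 41.1 Derivation of Eq. (41.5)

We note that:
\[ \mathcal{C}_{ab}\Big[\chi_b + \frac{1}{\sqrt{2}}\,(\gamma^{\mu}\gamma^{\nu}\theta_L)_b\,F_{\mu\nu} - i\,(\gamma\partial\chi)_b\,\bar\theta\theta_L - i\sqrt{2}\,(\theta_L)_b\,D\Big] = -\,\bar\chi_a - \frac{1}{\sqrt{2}}\Big(\bar\theta\,\gamma^{\nu}\gamma^{\mu}\,\frac{1 - \gamma^5}{2}\Big)_a F_{\mu\nu} - i\,\partial_{\mu}(\bar\chi\gamma^{\mu})_a\,\bar\theta\theta_L + i\sqrt{2}\,D\,\Big(\bar\theta\,\frac{1 - \gamma^5}{2}\Big)_a. \]
Accordingly,
\[ (\mathcal{C}W)_a\,W_a\,\Big|_{F}(x, \theta) = 2\,i\,\bar\chi\gamma^{\mu}\partial_{\mu}\chi\;\bar\theta\theta_L + 2\,D^2\,\bar\theta\theta_L - \frac{1}{2}\,\bar\theta\,\gamma^{\nu}\gamma^{\mu}\gamma^{\alpha}\gamma^{\beta}\Big(\frac{1 - \gamma^5}{2}\Big)\theta\;F_{\mu\nu}F_{\alpha\beta}. \tag{$*$} \]
On the other hand, by using, in the process, the first identity in Box 40.1 in Chap. 40,
\[ \bar\theta\,\gamma^{\nu}\gamma^{\mu}\gamma^{\alpha}\gamma^{\beta}\Big(\frac{1 - \gamma^5}{2}\Big)\theta = \frac{1}{4}\,\mathrm{Tr}\big[\gamma^{\nu}\gamma^{\mu}\gamma^{\alpha}\gamma^{\beta}(1 - \gamma^5)\big]\,\frac{[\,\bar\theta\theta - \bar\theta\gamma^5\theta\,]}{2} = \big(\eta^{\nu\mu}\eta^{\alpha\beta} - \eta^{\nu\alpha}\eta^{\mu\beta} + \eta^{\nu\beta}\eta^{\mu\alpha} + i\,\varepsilon^{\nu\mu\alpha\beta}\big)\,\bar\theta\theta_L, \]
using the traces $\mathrm{Tr}[\gamma^{\nu}\gamma^{\mu}\gamma^{\alpha}\gamma^{\beta}] = 4\,(\eta^{\nu\mu}\eta^{\alpha\beta} - \eta^{\nu\alpha}\eta^{\mu\beta} + \eta^{\nu\beta}\eta^{\mu\alpha})$, $\mathrm{Tr}[\gamma^{\nu}\gamma^{\mu}\gamma^{\alpha}\gamma^{\beta}\gamma^5] = -4\,i\,\varepsilon^{\nu\mu\alpha\beta}$. Now we show that $\varepsilon^{\nu\mu\alpha\beta}F_{\mu\nu}F_{\alpha\beta}$ is a total derivative:
\[ \varepsilon^{\nu\mu\alpha\beta}F_{\mu\nu}F_{\alpha\beta} = -4\,\varepsilon^{\mu\nu\alpha\beta}\,(\partial_{\mu}V_{\nu}\,\partial_{\alpha}V_{\beta}) = -4\,\varepsilon^{\mu\nu\alpha\beta}\,\big[\,\partial_{\mu}(V_{\nu}\,\partial_{\alpha}V_{\beta}) - V_{\nu}\,(\partial_{\mu}\partial_{\alpha})V_{\beta}\,\big]. \]
The second term within the square brackets gives zero since it is symmetric in $(\mu\alpha)$ while $\varepsilon^{\mu\nu\alpha\beta}$ is anti-symmetric in interchanging these two indices. The first term is just a total derivative. All told, the equation labeled by ($*$) above leads to Eq. (41.5) in the text.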

up to a total derivative. Therefore from (41.3), (41.5) and the first equation in (40.22) in Chap. 40, the modified Lagrangian density of the Maxwell field in the Wess–Zumino supergauge is obtained by dividing the above expression, after carrying out the integration in (41.3), by $-8$:
\[ \mathcal{L}(x) = -\frac{1}{4}\,F^{\mu\nu}(x)\,F_{\mu\nu}(x) - \bar\chi(x)\,\frac{\gamma\partial}{2\,i}\,\chi(x) + \frac{1}{2}\,D^2(x), \tag{41.6} \]

to be integrated over Minkowski spacetime, giving rise to a modification of the Maxwell Lagrangian density $-(1/4)F^{\mu\nu}F_{\mu\nu}$, where the particle associated with the field $\chi$ is referred to as the photino. The presence of the 1/2 factor in the second term in the Lagrangian density should not be surprising since, for a Majorana spinor, this term may be rewritten as $[\,-(1/2i)\,\mathcal{C}^{-1}_{ba}\,\chi_a\,(\gamma\partial)_{bc}\,\chi_c\,]$, where we have used the property $\mathcal{C}^{-1}_{ab} = -\,\mathcal{C}^{-1}_{ba}$, and the spinor field $\chi$ appears twice, in the same way that the kinetic term of the Lagrangian density of a real scalar field is $[\,-(1/2)\,\partial_{\mu}\phi\,\partial^{\mu}\phi\,]$, involving a factor of 1/2. The field $D(x)$ is referred to as an auxiliary field, which will not contribute in the Euler–Lagrange equation. Its presence ensures that the SUSY algebra relating the different fields closes.

SUSY Yang and Mills-Field Theory

Directly from Eqs. (40.44), (40.45), the spinor superfield in the Wess–Zumino gauge is given by
\[ W_{Aa} = \exp\Big[\frac{i}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\partial_{\mu}\Big]\Big[\chi_{Aa} + \frac{1}{\sqrt{2}}\,(\gamma^{\mu}\gamma^{\nu}\theta_L)_a\,G_{A\mu\nu} - i\,(\gamma\nabla\chi)_{Aa}\,\bar\theta\theta_L - i\sqrt{2}\,D_A\,(\theta_L)_a\Big], \tag{41.7} \]
\[ G_{A\mu\nu} = \partial_{\mu}V_{A\nu} - \partial_{\nu}V_{A\mu} + g\,f_{ABC}\,V_{B\mu}V_{C\nu}, \qquad (\gamma\nabla\chi)_{Aa} = \gamma^{\mu}\big(\delta_{AC}\,\partial_{\mu} + g\,f_{ABC}\,V_{B\mu}\big)\chi_{Ca}. \tag{41.8} \]
The analysis carried out for the Maxwell-field case can be repeated with $F^{\mu\nu}$ now replaced by $G_E^{\mu\nu}$, and $(\gamma\partial\chi)_a$ replaced by $(\gamma\nabla\chi)_{Aa}$, except that we also have to show that $\varepsilon^{\mu\nu\alpha\beta}\,G_{A\mu\nu}\,G_{A\alpha\beta}$ is also a total derivative. The proof of this is spelled out in Box 41.2. The Lagrangian density of the SUSY Yang–Mills field in the Wess–Zumino supergauge then becomes

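Box 41.2 ε^{μναβ} G_{Aμν} G_{Aαβ} is a total derivative

We may write
\[ \varepsilon^{\mu\nu\alpha\beta}\,G_{A\mu\nu}\,G_{A\alpha\beta} = 4\,\varepsilon^{\mu\nu\alpha\beta}\Big(\partial_{\mu}V_{A\nu} + \frac{g}{2}\,f_{ABC}\,V_{B\mu}V_{C\nu}\Big)\Big(\partial_{\alpha}V_{A\beta} + \frac{g}{2}\,f_{ADE}\,V_{D\alpha}V_{E\beta}\Big). \]
Using the facts that $\varepsilon^{\mu\nu\alpha\beta}f_{ABC} = \varepsilon^{\mu\beta\alpha\nu}f_{CBA} = \varepsilon^{\mu\nu\beta\alpha}f_{ACB}$, the earlier equation may be rewritten as
\[ 4\,\varepsilon^{\mu\nu\alpha\beta}\Big[\,\partial_{\mu}\big(V_{A\nu}\,\partial_{\alpha}V_{A\beta}\big) - V_{A\nu}\,(\partial_{\mu}\partial_{\alpha})V_{A\beta} + \frac{g}{3}\,f_{ABC}\,\partial_{\mu}\big(V_{A\nu}V_{B\alpha}V_{C\beta}\big) + \frac{g^2}{4}\,f_{ABC}\,f_{ADE}\,V_{B\mu}V_{C\nu}V_{D\alpha}V_{E\beta}\,\Big]. \]
The first term above is a total derivative. The second term is zero because $\partial_{\mu}\partial_{\alpha}$ is symmetric in $\mu$ and $\alpha$ while $\varepsilon^{\mu\nu\alpha\beta}$ is anti-symmetric. Thus only the last term, multiplying $g^2$, is to be considered. Using the facts that $\mathrm{Tr}\,(t_A t_B) = \delta_{AB}/2$, $[\,V_{\mu}, V_{\nu}\,] = i\,t_A\,f_{ABC}\,V_{B\mu}V_{C\nu}$, some algebra shows that it may be rewritten as
\[ -\,2\,\varepsilon^{\mu\nu\alpha\beta}\,\mathrm{Tr}\big[\,V_{\mu}V_{\nu}V_{\alpha}V_{\beta}\,\big]. \]
The latter is obviously zero since, under the change of indices $(\mu, \nu, \alpha, \beta) \to (\nu, \alpha, \beta, \mu)$, the trace is symmetric while $\varepsilon^{\mu\nu\alpha\beta}$ is anti-symmetric.

\[ \mathcal{L} = -\frac{1}{4}\,G_E^{\mu\nu}\,G_{E\mu\nu} - \frac{1}{2\,i}\,\bar\chi_E\,\gamma^{\nu}\big(\delta_{EC}\,\partial_{\nu} + g\,f_{EAC}\,V_{A\nu}\big)\chi_C + \frac{1}{2}\,D_E D_E, \tag{41.9} \]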

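to be integrated over Minkowski spacetime, with sums over repeated indices understood.

Spin0 - Spin1/2 SUSY Interacting Theory

Given the left-chiral superfield in Eq. (40.19) of Chap. 40,
\[ \Phi_L(x, \theta) = A(x) + i\,\bar\theta\psi_L(x) - \frac{1}{2}\,\bar\theta\theta_L\,\mathcal{F}(x) + \frac{i}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\partial_{\mu}A(x) - \frac{1}{4}\,\bar\theta\gamma^5\theta\;\bar\theta\gamma\partial\,\psi_L(x) - \frac{1}{32}\,(\bar\theta\gamma^5\theta)^2\,\Box A(x), \tag{41.10} \]
and its adjoint
\[ \Phi_L^{\dagger}(x, \theta) = A^{\dagger}(x) - i\,\bar\theta\psi_R(x) - \frac{1}{2}\,\bar\theta\theta_R\,\mathcal{F}^{\dagger}(x) - \frac{i}{4}\,\bar\theta\gamma^5\gamma^{\mu}\theta\,\partial_{\mu}A^{\dagger}(x) - \frac{1}{4}\,\bar\theta\gamma^5\theta\;\bar\theta\gamma\partial\,\psi_R(x) - \frac{1}{32}\,(\bar\theta\gamma^5\theta)^2\,\Box A^{\dagger}(x), \tag{41.11} \]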

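we may readily derive a SUSY theory of non-interacting spin 0, spin 1/2 particles by considering first the easily verified expression
\[ \Phi_L^{\dagger}(x, \theta)\,\Phi_L(x, \theta)\,\Big|_{D} = -\frac{1}{8}\,(\bar\theta\gamma^5\theta)^2\Big[-\,\partial_{\mu}A^{\dagger}\,\partial^{\mu}A + \mathcal{F}^{\dagger}\mathcal{F} - \bar\psi\,\frac{\gamma\partial}{2\,i}\,\psi_L - \bar\psi\,\frac{\gamma\partial}{2\,i}\,\psi_R\Big], \tag{41.12} \]
up to total derivatives within the square brackets. The last two terms may be combined on account of the fact that $\psi_L + \psi_R = \psi$. Using the fact that $(1/8)(\bar\theta\gamma^5\theta)^2 = \theta_4\theta_3\theta_2\theta_1$, as given on the first line of Box 40.1 in Chap. 40, and $\int(\mathrm{d}\theta)\,\theta_4\theta_3\theta_2\theta_1 = 1$, we may write the Lagrangian density in question for the spin 0/spin 1/2 non-interacting theory simply from (41.12) and the fact that $\psi = \psi_R + \psi_L$, with an overall correct sign, as
\[ \mathcal{L}_0 = -\,\partial_{\mu}A^{\dagger}\,\partial^{\mu}A + \mathcal{F}^{\dagger}\mathcal{F} - \bar\psi\,\frac{\gamma\partial}{2\,i}\,\psi, \tag{41.13} \]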

which is to be integrated over Minkowski spacetime. The fields F , F † as before are referred to as auxiliary fields. The Lagrangian density in (41.13) is referred to as a Wess–Zumino Lagrangian density. To introduce interactions, we may readily consider the products




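$$\Phi_L(x, \theta)\, \Phi_L(x, \theta)\Big|_F = \Big(-A F + \frac{1}{2}\, \bar\psi \psi_L\Big)\, \bar\theta \theta_L, \tag{41.14}$$
$$\Phi_L(x, \theta)\, \Phi_L(x, \theta)\, \Phi_L(x, \theta)\Big|_F = \frac{3}{2}\, \big(-A A F + A\, \bar\psi \psi_L\big)\, \bar\theta \theta_L, \tag{41.15}$$
and we may conveniently define the combination, to generate an interaction,
$$W[\Phi] = \frac{m}{2}\, \Phi_L(x, \theta)\, \Phi_L(x, \theta) + \frac{2}{3}\, \lambda\, \Phi_L(x, \theta)\, \Phi_L(x, \theta)\, \Phi_L(x, \theta), \tag{41.16}$$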

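which is appropriately referred to as a superpotential, where $m$ and $\lambda$ are real, and $m > 0$. We may, in turn, consider the contribution to the action integral
$$\int (\mathrm{d}x)(\mathrm{d}\theta)\, \delta^{(2)}(\theta_R)\, W[\Phi] + \mathrm{h.c.},$$
generating, together with $\mathcal{L}_0$ in (41.13), the Lagrangian density
$$\mathcal{L} = \mathcal{L}_0 - \Big[m \Big(-A F + \frac{1}{2}\, \bar\psi \psi_L\Big) + 2 \lambda\, \big(-A A F + A\, \bar\psi \psi_L\big) + \mathrm{h.c.}\Big], \tag{41.17}$$
where $F$, $F^\dagger$ are auxiliary fields. Their field equations, obtained from the Euler–Lagrange equations, allow us to eliminate them in favor of the other fields: $-F = m A^\dagger + 2 \lambda A^\dagger A^\dagger$, $-F^\dagger = m A + 2 \lambda A A$. Upon substituting these expressions in (41.17), the following Lagrangian density emerges:
$$\mathcal{L} = -\partial_\mu A^\dagger \partial^\mu A - \bar\psi\, \frac{\gamma\partial}{2\mathrm{i}}\, \psi - |m A + 2 \lambda A A|^2 - \frac{m + 4 \lambda A}{2}\, \bar\psi \psi_L - \frac{m + 4 \lambda A^\dagger}{2}\, \bar\psi \psi_R. \tag{41.18}$$
It is customary to write $A = (\varphi_1 + \mathrm{i}\, \varphi_2)/\sqrt{2}$, and simply rewrite the above Lagrangian density in terms of the real fields $\varphi_1$, $\varphi_2$:
$$\mathcal{L} = -\frac{1}{2}\, \partial_\mu \varphi_1 \partial^\mu \varphi_1 - \frac{1}{2}\, \partial_\mu \varphi_2 \partial^\mu \varphi_2 - \frac{1}{2}\, \bar\psi\, \frac{\gamma\partial}{\mathrm{i}}\, \psi - \frac{m}{2}\, \bar\psi \psi - \frac{m^2}{2}\, (\varphi_1^2 + \varphi_2^2) - \lambda^2 (\varphi_1^2 + \varphi_2^2)^2 - \sqrt{2}\, \lambda\, m\, \varphi_1 (\varphi_1^2 + \varphi_2^2) - \sqrt{2}\, \lambda\, \varphi_1\, \bar\psi \psi - \mathrm{i} \sqrt{2}\, \lambda\, \varphi_2\, \bar\psi \gamma^5 \psi, \tag{41.19}$$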

referred to as a Wess–Zumino Lagrangian density with interaction [4]. For additional details on such SUSY theories, see also Binetruy [1], Manoukian [2], and Weinberg [3].
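The step from (41.18) to (41.19) for the purely scalar terms can be checked numerically. The following is a minimal sketch, not part of the original derivation: it compares $|mA + 2\lambda AA|^2$ with the three non-derivative scalar terms of (41.19) at random field values. The function names, the sampling ranges and the number of trials are illustrative assumptions.

```python
import math
import random


def potential_complex(A, m, lam):
    """Scalar potential |m A + 2 lam A A|^2 as it enters (41.18)."""
    return abs(m * A + 2.0 * lam * A * A) ** 2


def potential_real(phi1, phi2, m, lam):
    """Non-derivative scalar terms of (41.19), overall minus sign stripped."""
    r2 = phi1 ** 2 + phi2 ** 2
    return (0.5 * m ** 2 * r2
            + lam ** 2 * r2 ** 2
            + math.sqrt(2.0) * lam * m * phi1 * r2)


def max_mismatch(trials=1000, seed=0):
    """Largest absolute difference between the two forms over random inputs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        phi1 = rng.uniform(-2.0, 2.0)
        phi2 = rng.uniform(-2.0, 2.0)
        m = rng.uniform(0.1, 3.0)      # m > 0, as assumed below (41.16)
        lam = rng.uniform(-1.0, 1.0)   # real coupling
        A = complex(phi1, phi2) / math.sqrt(2.0)  # A = (phi1 + i phi2)/sqrt(2)
        diff = abs(potential_complex(A, m, lam)
                   - potential_real(phi1, phi2, m, lam))
        worst = max(worst, diff)
    return worst
```

Calling `max_mismatch()` should return a value at the level of floating-point rounding, which is what one expects if the two forms of the potential agree identically.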

References
1. Binetruy, P. (2006). Supersymmetry, experiment, and cosmology. Oxford: Oxford University Press.
2. Manoukian, E. B. (2016). Quantum field theory II: Introductions to quantum gravity, supersymmetry and string theory. Switzerland: Springer.
3. Weinberg, S. (2000). The quantum theory of fields. III: Supersymmetry. Cambridge: Cambridge University Press.
4. Wess, J., & Zumino, B. (1974). Supergauge transformations in four dimensions. Nuclear Physics B, 70, 39–50.

42 SUSY and the Standard Model: Couplings Unification

Prerequisite Chaps. 37, 40

In order to generate the superpartners of the particles of the standard model, we have to promote the fields corresponding to the latter particles to superfields. In particular, a left-chiral superfield associated with the electron may be defined from Eq. (40.19) in Chap. 40 to be
$$E(x, \theta) = A(x) + \mathrm{i}\, \bar\theta \psi_L^{(e)}(x) - \frac{1}{2}\, \bar\theta \theta_L\, F(x) + \frac{\mathrm{i}}{4}\, \bar\theta \gamma^5 \gamma^\mu \theta\, \partial_\mu A(x) - \frac{1}{4}\, \bar\theta \gamma^5 \theta\; \bar\theta \gamma\partial \psi_L^{(e)}(x) - \frac{1}{32}\, (\bar\theta \gamma^5 \theta)^2\, \Box A(x), \tag{42.1}$$
whose $\psi_L^{(e)}$ component is left-handed. Moreover, $A$ is the field associated with the selectron, the superpartner of the electron, of spin 0, and $F$ is an auxiliary field which may be eliminated in favor of other fields from its Euler–Lagrange equation in an interacting theory involving the superfield $E$. The superfields associated with the fermions involved in the standard model may now be expressed as
$$\begin{pmatrix} N_i \\ E_i \end{pmatrix} \equiv L_i, \quad \bar E_i; \qquad \begin{pmatrix} U_i \\ D_i \end{pmatrix}, \quad \bar U_i, \quad \bar D_i, \tag{42.2}$$

where the subscript $i$ specifies the various generations, with the $N_i$ denoting the left-chiral superfields associated with the left-handed neutrino fields, the $U_i$ denoting the left-chiral superfields associated with the left-handed u, c, t quarks, and the $D_i$ denoting the left-chiral superfields associated with the left-handed d, s, b quarks, respectively, for $i = 1, 2, 3$. The $\bar E_i$; $\bar U_i$, $\bar D_i$ are associated with the left-handed charged antileptons, and the left-handed antiquarks, respectively. The superpartner of a lepton is referred to as a slepton, and that of a quark as a squark. The quark superfields form SU(3) triplets of colors, while the leptons are SU(3) singlets. Quantum numbers of the particles just mentioned, together with the Higgs fields introduced below, are summarized in Table 42.1. We note that since for each $i$, $Y(E_i) + Y(\bar E_i) = +1$, $Y(D_i) + Y(\bar D_i) = +1$, while $Y(U_i) + Y(\bar U_i) = -1$, a single SU(2) doublet of left-chiral Higgs superfields is not enough to generate masses for all the quarks. The need then arises to introduce two SU(2) doublets of left-chiral Higgs superfields:
$$H_1 = \begin{pmatrix} H_1^0 \\ H_1^- \end{pmatrix}, \qquad H_2 = \begin{pmatrix} H_2^+ \\ H_2^0 \end{pmatrix}, \tag{42.3}$$
respectively, of weak hypercharges $Y = \mp 1$. Accordingly, to generate masses for the charged leptons and the quarks, we may introduce Yukawa-type interactions of the fermions with the Higgs bosons, which involve linear combinations of the $F$ parts of the following anti-symmetric terms:
$$\frac{1}{\sqrt{2}}\, (E_i H_1^0 - N_i H_1^-)\, \bar E_j, \qquad \frac{1}{\sqrt{2}}\, (D_i H_1^0 - U_i H_1^-)\, \bar D_j, \qquad \frac{1}{\sqrt{2}}\, (D_i H_2^+ - U_i H_2^0)\, \bar U_j, \tag{42.4}$$

with coefficients depending on $i$ and $j$, where we note that the anti-symmetric combinations of the two doublets, in each term, give rise to SU(2) singlets. In particular, the two Higgs fields now give masses to u-type and d-type quarks.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_42


Table 42.1 Quantum numbers of the particles in Eq. (42.2), Y = 2(Q − T3)

        N_i     E_i     Ē_i     U_i     D_i     Ū_i     D̄_i     H_1^0   H_1^−   H_2^+   H_2^0
Q_i     0       −1      +1      2/3     −1/3    −2/3    1/3     0       −1      1       0
T3      1/2     −1/2    0       1/2     −1/2    0       0       1/2     −1/2    1/2     −1/2
Y_i     −1      −1      2       1/3     1/3     −4/3    2/3     −1      −1      +1      +1
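The entries of Table 42.1 and the need for two Higgs doublets can be cross-checked with exact rational arithmetic. The sketch below is an illustration only: the dictionary keys and function names are hypothetical labels chosen here, and the check simply confirms that Y = 2(Q − T3) reproduces the tabulated hypercharges and that each term in (42.4) carries zero total hypercharge and zero total electric charge.

```python
from fractions import Fraction as F

# (Q, T3) for each left-chiral superfield, read off Table 42.1.
# The string labels are illustrative names, not notation from the text.
QUANTUM_NUMBERS = {
    "N": (F(0), F(1, 2)),
    "E": (F(-1), F(-1, 2)),
    "Ebar": (F(1), F(0)),
    "U": (F(2, 3), F(1, 2)),
    "D": (F(-1, 3), F(-1, 2)),
    "Ubar": (F(-2, 3), F(0)),
    "Dbar": (F(1, 3), F(0)),
    "H1_0": (F(0), F(1, 2)),
    "H1_minus": (F(-1), F(-1, 2)),
    "H2_plus": (F(1), F(1, 2)),
    "H2_0": (F(0), F(-1, 2)),
}


def hypercharge(name):
    """Weak hypercharge from Y = 2(Q - T3)."""
    q, t3 = QUANTUM_NUMBERS[name]
    return 2 * (q - t3)


def total_hypercharge(*fields):
    """Sum of hypercharges of the fields entering a product."""
    return sum(hypercharge(f) for f in fields)


def total_charge(*fields):
    """Sum of electric charges of the fields entering a product."""
    return sum(QUANTUM_NUMBERS[f][0] for f in fields)


# The six products appearing in (42.4).
YUKAWA_TERMS = [
    ("E", "H1_0", "Ebar"), ("N", "H1_minus", "Ebar"),
    ("D", "H1_0", "Dbar"), ("U", "H1_minus", "Dbar"),
    ("D", "H2_plus", "Ubar"), ("U", "H2_0", "Ubar"),
]
```

With these definitions, `total_hypercharge("U", "H1_0", "Ubar")` evaluates to −2 rather than 0, which illustrates why the doublet $H_1$ alone cannot couple to the u-type quarks in this way.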

One may similarly define supersymmetric extensions of the gauge fields, promoting the ones in the standard model. The theory under consideration with just two Higgs doublets, with a minimal number of couplings and a minimal number of fields, emerging from the standard model, is referred to as the Minimal Supersymmetric Standard Model (MSSM). The beta functions of the MSSM are given in Table 42.2 together with the corresponding ones for the SM in Chap. 37, and in the same notation, for comparison,¹ where
$$\mu^2 \frac{\mathrm{d}}{\mathrm{d}\mu^2}\, \frac{1}{\alpha_\#(\mu^2)} = \beta_\#, \qquad \alpha_\# = \frac{g_\#^2}{4\pi}, \tag{42.5}$$
for the three $\alpha$'s: $\alpha_s$, $\alpha$, $\alpha'$, with B.C.:
$$\alpha_s(\mu^2)\Big|_{\mu^2 = M^2} = \alpha(\mu^2)\Big|_{\mu^2 = M^2} = \frac{5}{3}\, \alpha'(\mu^2)\Big|_{\mu^2 = M^2}, \tag{42.6}$$
at a unifying energy scale $M$ to be determined below. The number of complex Higgs boson doublets is taken to be $n_H = 1$ for the SM, and $n_H = 2$ for the MSSM.² Moreover, $n_g = 3$ for the number of generations. The equations in (42.5) may be readily integrated from $\mu^2 = M_Z^2$ to $\mu^2 = M^2$, where $M_Z$ denotes the mass of the vector boson Z and $M$ corresponds to the unifying energy, by using, in the process, the beta functions in Table 42.2 for the MSSM (with $n_g = 3$, $n_H = 2$):
$$\frac{1}{\alpha_s(M^2)} = \frac{1}{\alpha_s(M_Z^2)} + \beta_s \ln\Big(\frac{M^2}{M_Z^2}\Big) = 1/\alpha_3, \tag{42.7}$$
$$\frac{1}{\alpha(M^2)} = \frac{1}{\alpha(M_Z^2)} + \beta \ln\Big(\frac{M^2}{M_Z^2}\Big) = 1/\alpha_2, \tag{42.8}$$
$$\frac{3}{5}\, \frac{1}{\alpha'(M^2)} = \frac{3}{5}\, \frac{1}{\alpha'(M_Z^2)} + \frac{3}{5}\, \beta' \ln\Big(\frac{M^2}{M_Z^2}\Big) = 1/\alpha_1. \tag{42.9}$$
The coupling parameters $\alpha'$ and $\alpha$ are defined in terms of the electromagnetic one $\alpha_{\rm em}$ and the weak angle $\theta_W$ by
$$\frac{1}{\alpha'} = \frac{\cos^2\theta_W}{\alpha_{\rm em}}, \qquad \frac{1}{\alpha} = \frac{\sin^2\theta_W}{\alpha_{\rm em}}. \tag{42.10}$$

Table 42.2 Expressions for 12πβ, with the beta functions as introduced in (42.5), for the SM and the MSSM as functions of the number of generations n_g and the number of complex Higgs doublets n_H

12πβ_#      SM                                 MSSM
12πβ_s      33 − 4 n_g                         27 − 6 n_g
12πβ        22 − 4 n_g − (1/2) n_H             18 − 6 n_g − (3/2) n_H
12πβ′       −((20/3) n_g + (1/2) n_H)          −(10 n_g + (3/2) n_H)

Experimentally [2], $\alpha_s(M_Z^2) = 0.1184 \pm 0.0007$, $1/\alpha_{\rm em}(M_Z^2) = 127.916 \pm 0.015$.³ The following parameters, estimated in Box 42.1 and indicated there by •, are: $\sin^2\theta_W\big|_{M_Z^2} \simeq 0.231$, $M \simeq 2.2 \times 10^{16}$ GeV, $1/\alpha_s(M^2) \approx 24.3$, $1/\alpha_{\rm em}(M^2) \approx 65$.

¹ For the SM, see Chap. 37. For details concerning the beta functions of the MSSM theory, see, e.g., Einhorn and Jones [4], Jones [5, 6]. The computations of these beta functions closely parallel those carried out in Chap. 37 but are too involved to be given here. See also Rosiek [9], Tanabashi [10].
² The small contribution of the Higgs boson with $n_H = 1$ has been included in the SM.
³ We have also computed $1/\alpha_{\rm em} \simeq 128$ from QED by taking into account all the charged leptons and all those quarks having masses less than $M_Z$, as dictated by the decoupling theorem of QFT (see Chap. 28).

As a measure of the accuracy of the approach of the coupling parameters to eventual unification in (42.6), one may introduce the following critical parameter of Peskin [8], as we have also done in Chap. 37 on GUT:
$$\Delta = \frac{\alpha^{-1}(M_Z^2) - \alpha_s^{-1}(M_Z^2)}{(3/5)\, \alpha'^{-1}(M_Z^2) - \alpha^{-1}(M_Z^2)}, \tag{42.11}$$
which experimentally takes the value 0.74. On the other hand, from Eqs. (42.7)–(42.9) and the beta values for the MSSM in Table 42.2, we obtain for the theoretical expression for (42.11)
$$\Delta_{\rm theor} = \frac{\beta - \beta_s}{(3/5)\, \beta' - \beta} = \frac{-3 - 9}{-(3/5) \times 33 + 3} = 0.714, \tag{42.12}$$

which compares well with the experimental value. This is unlike the value of 0.5 obtained for the non-supersymmetric version in Chap. 37. In effect, the following behavior of the couplings is then obtained from (42.7)–(42.9):

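[Figure: the inverse couplings 3/(5g′²), 1/g² and 1/g_s² plotted against log10(μ/GeV); horizontal axis marked at 5, 10, 15, vertical axis marked at 20, 40, 60.]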
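The value in (42.12), and the value 0.5 quoted for the non-supersymmetric case, follow directly from the entries of Table 42.2. The following is a small sketch using exact fractions; the function names are illustrative, and the common factor 1/(12π) is dropped since it cancels in the ratio.

```python
from fractions import Fraction as F


def twelve_pi_betas(model, n_g=3, n_h=None):
    """Return (12*pi*beta_s, 12*pi*beta, 12*pi*beta') from Table 42.2."""
    if model == "SM":
        n_h = 1 if n_h is None else n_h
        beta_s = F(33) - 4 * n_g
        beta = F(22) - 4 * n_g - F(1, 2) * n_h
        beta_prime = -(F(20, 3) * n_g + F(1, 2) * n_h)
    elif model == "MSSM":
        n_h = 2 if n_h is None else n_h
        beta_s = F(27) - 6 * n_g
        beta = F(18) - 6 * n_g - F(3, 2) * n_h
        beta_prime = -(10 * n_g + F(3, 2) * n_h)
    else:
        raise ValueError("model must be 'SM' or 'MSSM'")
    return beta_s, beta, beta_prime


def delta_theor(model):
    """Theoretical Delta of (42.12): (beta - beta_s) / ((3/5) beta' - beta)."""
    beta_s, beta, beta_prime = twelve_pi_betas(model)
    return (beta - beta_s) / (F(3, 5) * beta_prime - beta)
```

Here `delta_theor("MSSM")` gives the exact fraction 5/7, i.e. 0.714, while `delta_theor("SM")` gives 115/218, i.e. about 0.53, consistent with the rounded value 0.5 mentioned in the text.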

The SUSY version is a considerable improvement over the non-supersymmetric one.

Box 42.1 Estimates from the renormalization group equations of MSSM

The following two equations immediately follow from Eq. (42.5) and Table 42.2:
$$\mu^2 \frac{\mathrm{d}}{\mathrm{d}\mu^2} \Big(\frac{1}{\alpha_s(\mu^2)} - \frac{1}{\alpha(\mu^2)}\Big) = \frac{1}{\pi}, \qquad \mu^2 \frac{\mathrm{d}}{\mathrm{d}\mu^2} \Big(\frac{1}{\alpha_s(\mu^2)} - \frac{3}{5}\, \frac{1}{\alpha'(\mu^2)}\Big) = \frac{12}{5\pi}.$$
Subtracting the second equation above from 12/5 times the first one, and using the unifying boundary condition in Eq. (42.6) as well as the definitions $1/\alpha' = \cos^2\theta_W/\alpha_{\rm em}$, $1/\alpha = \sin^2\theta_W/\alpha_{\rm em}$, gives rise to the following expression for $\sin^2\theta_W$ at the energy scale $M_Z$ specified by the mass of the Z boson:
$$(\ast) \qquad \sin^2\theta_W\Big|_{M_Z^2} = \frac{1}{5} + \frac{7}{15}\, \frac{\alpha_{\rm em}(M_Z^2)}{\alpha_s(M_Z^2)}.$$
The experimental results below (42.10) then give
• $\sin^2\theta_W\big|_{M_Z^2} \simeq 0.231$,
which compares well with experiment. On the other hand, Eqs. (42.7), (42.8), the second equation in (42.10), together with the one above $(\ast)$, give
$$\ln\Big(\frac{M^2}{M_Z^2}\Big) = \frac{\pi}{5\, \alpha_{\rm em}(M_Z^2)} \Big[1 - \frac{8}{3}\, \frac{\alpha_{\rm em}(M_Z^2)}{\alpha_s(M_Z^2)}\Big].$$
The experimental values for $\alpha_{\rm em}(M_Z^2)$, $\alpha_s(M_Z^2)$ below Eq. (42.10) give
• $M \simeq 2.2 \times 10^{16}$ GeV.
Accordingly, from Eqs. (42.7), (42.8),
• $1/\alpha_s(M^2) \approx 24.3$, $1/\alpha_{\rm em}(M^2) \approx 65$.
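The bulleted estimates of Box 42.1 can be reproduced with a few lines of arithmetic. The sketch below uses the two experimental inputs quoted below (42.10) and the MSSM value 12πβ_s = 9 from Table 42.2. The Z mass of 91.1876 GeV is an assumed input that is not quoted in this chapter, and the function names are illustrative.

```python
import math

ALPHA_S_MZ = 0.1184        # alpha_s(M_Z^2), quoted below (42.10)
INV_ALPHA_EM_MZ = 127.916  # 1/alpha_em(M_Z^2), quoted below (42.10)
M_Z_GEV = 91.1876          # assumed Z boson mass in GeV (not given in the text)
BETA_S = 9.0 / (12.0 * math.pi)  # MSSM beta_s for n_g = 3 (Table 42.2)


def alpha_ratio():
    """alpha_em(M_Z^2) / alpha_s(M_Z^2)."""
    return 1.0 / (INV_ALPHA_EM_MZ * ALPHA_S_MZ)


def sin2_theta_w_at_mz():
    """Relation (*) of Box 42.1."""
    return 1.0 / 5.0 + (7.0 / 15.0) * alpha_ratio()


def log_mass_ratio():
    """ln(M^2 / M_Z^2) from the second boxed relation."""
    return (math.pi * INV_ALPHA_EM_MZ / 5.0) * (1.0 - (8.0 / 3.0) * alpha_ratio())


def unification_mass_gev():
    """Unification scale M in GeV."""
    return M_Z_GEV * math.exp(0.5 * log_mass_ratio())


def inverse_couplings_at_unification():
    """Return (1/alpha_s(M^2), 1/alpha_em(M^2)).

    1/alpha_s follows from (42.7); at M the boundary condition (42.6) gives
    1/alpha = 1/alpha_s and 1/alpha' = (5/3)/alpha_s, and (42.10) gives
    1/alpha_em = 1/alpha + 1/alpha'.
    """
    inv_alpha_s = 1.0 / ALPHA_S_MZ + BETA_S * log_mass_ratio()
    inv_alpha_em = inv_alpha_s + (5.0 / 3.0) * inv_alpha_s
    return inv_alpha_s, inv_alpha_em
```

Evaluating these functions should give values close to the rounded numbers listed in the box; small differences in the last digit depend on the assumed value of the Z mass and on rounding.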


One of the consequences of grand unification is that the proton becomes unstable and may decay. With the lifetime of a proton $\propto M^4$, as discussed in Chap. 37, one obtains the very welcome additional factor of $\sim 10^4$ for the lifetime relative to that of the non-supersymmetric theory.⁴ The fact that the unification energy is large, $M \sim 2 \times 10^{16}$ GeV, gives one hope that a unification scheme may be set up which also includes gravitation at energies $M \sim 10^{19}$ GeV, and even possibly at a lower energy, since quantum gravity is expected to play an important role in particle physics at the Planck energy $\sqrt{\hbar c^5/G_N} \simeq 1.221 \times 10^{19}$ GeV, where we have conveniently inserted $\hbar$ and $c$ in the latter expression, and $G_N$ is the Newtonian gravitational constant. The Planck energy involves gravitation ($G_N$), relativity ($c$), and quantum physics ($\hbar$).

A fundamental energy scale arises in the SM from the vacuum expectation value of the Higgs field, 246 GeV,⁵ which sets the scale for the masses of the particles in the theory, including the mass of its boson excitation, the Higgs boson. This energy scale is much smaller than the Planck energy scale $\sim 10^{19}$ GeV, or even a lower energy than the Planck one, at which gravitation may be significant. It is even much smaller than a grand unified energy scale, say $\sim 10^{16}$ GeV. The question then arises as to what accounts for the enormous difference between a grand unified energy scale and the energy scale characteristic of the SM. This is known as the hierarchy problem. What kind of new physics arises in this huge range of energy?

As a scalar particle, the Higgs boson has a self-mass squared $\delta M_H^2$ which, in the non-supersymmetric SM, is quadratically divergent, as is directly inferred, by simple power counting, from such diagrams as shown here, as also discussed in Chap. 37, where a solid line denotes a fermion of spin 1/2, while a dashed line denotes a spin 0 boson. This happens because, for each diagram, one is integrating in four dimensions in energy-momentum space, involving a power $\sim E^4$, and a spin 1/2 propagator vanishes like $1/E$, while a spin-zero propagator vanishes like $1/E^2$, as the energy-momentum $E \to \infty$:⁶

[Figure: the two one-loop diagrams, one with a spin 1/2 fermion loop (solid line) and one with a spin 0 boson loop (dashed line).]

where the first diagram involves two fermion spin 1/2 propagators, and the second one involves only one spin-0 propagator. With an ultraviolet cut-off taken of the order, say, $\Lambda \sim 10^{16}$–$10^{19}$ GeV, each diagram diverges with the enormous energy squared $\Lambda^2$. Due to the opposite signs of the statistics of the fermion of spin 1/2 and its superpartner of spin 0, however, this quadratic divergence cancels out⁷ between two such diagrams in a supersymmetric setting. This unnatural cancelation of enormously large numbers has been termed a facet of the hierarchy problem. On the other hand, since the SM involves no superpartners, the so-called bare mass of the Higgs boson has to be enormous in order to cancel out such a quadratic divergence arising from radiative corrections and give a finite mass for the Higgs boson. We have seen in Chap. 35 that at the tree level approximation, that is, without radiative corrections, the electroweak theory provides very good agreement with experiments for the masses of the gauge bosons. A supersymmetric removal of an unwanted quadratic divergence is a positive contribution to the hierarchy problem. Supersymmetry is of significance in dealing with the hierarchy problem, as in supersymmetric field theories cancelations of such large quadratic corrections, a priori, generally occur between loops involving particles and loops involving their supersymmetric counterparts in a supersymmetric version of a non-supersymmetric field theory, similar to the ones between the two diagrams mentioned above. Moreover, this cancelation is, possibly, up to divergences of logarithmic type which are, in general, tolerable, thus protecting a scalar particle from acquiring a large bare mass.
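The power counting described above amounts to simple bookkeeping of powers of E. The sketch below is only an illustration of that counting, under the assumption that the vertices carry no momentum factors, so that each loop contributes $E^4$, each spin 1/2 propagator $1/E$ and each spin 0 propagator $1/E^2$. The function and constant names are hypothetical.

```python
def superficial_degree(n_fermion_props, n_scalar_props, n_loops=1, dim=4):
    """Net power of E left in the loop integral as E -> infinity.

    Assumes momentum-independent vertices: each loop integration
    contributes E**dim, each spin 1/2 propagator 1/E and each spin 0
    propagator 1/E**2.
    """
    return dim * n_loops - n_fermion_props - 2 * n_scalar_props


def divergence_label(degree):
    """Human-readable name for a given net power of E."""
    if degree < 0:
        return "convergent"
    names = {0: "logarithmic", 1: "linear", 2: "quadratic"}
    return names.get(degree, "power %d" % degree)


# First diagram: two spin 1/2 propagators; second: one spin 0 propagator.
FERMION_LOOP = superficial_degree(n_fermion_props=2, n_scalar_props=0)
SCALAR_LOOP = superficial_degree(n_fermion_props=0, n_scalar_props=1)
```

Both `FERMION_LOOP` and `SCALAR_LOOP` evaluate to 2, which corresponds to the $\Lambda^2$ behavior of each diagram taken separately.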

⁴ The recent large experimental lower bound (see Abe et al. [1]) for the lifetime ($> 5.9 \times 10^{33}$ years) of the proton always justifies, however, the need for further investigations of unification schemes.
⁵ See Chap. 37.
⁶ See Chap. 16.
⁷ For an explicit, detailed pedagogical treatment of the cancelation of such a quadratic divergence between a spin 1/2 particle and its superpartner of spin 0, in the so-called Wess–Zumino model, developed in Chap. 41, see Manoukian [7], pp. 153–157.


Although the SUSY version of the standard model is a considerable improvement over the non-supersymmetric one, it awaits the discovery of the superpartners predicted by SUSY, which are expected to be much heavier than the particles with which they are associated. Much theoretical work is also necessary, in particular, to include gravitation in its formalism at a quantum level. Supergravity, as the SUSY version of general relativity, will be studied in detail in Chap. 70. For additional details on the incorporation of SUSY in the Standard Model, see also Tanabashi et al. [10], Binetruy [3], and Weinberg [11].

References
1. Abe, K., et al. (2014). Search for proton decay via p → νK⁺ using 260 kiloton·year data of Super-Kamiokande. Physical Review D, 90, 072005.
2. Beringer, J., et al. (2012). Particle data group. Physical Review D, 86, 010001.
3. Binetruy, P. (2006). Supersymmetry, experiment, and cosmology. Oxford: Oxford University Press.
4. Einhorn, M. B., & Jones, D. R. T. (1982). The weak mixing angle and unification mass in supersymmetric SU(5). Nuclear Physics B, 196, 475–488.
5. Jones, D. R. T. (1974). Two-loop diagrams in Yang-Mills theory. Nuclear Physics B, 75, 531–538.
6. Jones, D. R. T. (1975). Asymptotic behavior of supersymmetric Yang-Mills theories in the two-loop approximation. Nuclear Physics B, 87, 127–132.
7. Manoukian, E. B. (2016). Quantum field theory II: Introductions to quantum gravity, supersymmetry and string theory. Switzerland: Springer.
8. Peskin, M. (1997). Beyond the standard model. In N. Ellis & M. N. (Eds.). 1996 European school of high-energy physics. Genève: CERN-97-03.
9. Rosiek, J. (1990). Complete set of Feynman rules for the minimal supersymmetric extension of the standard model. Physical Review D, 41, 3464–3501.
10. Tanabashi, M., et al. (2018). Particle data group. Physical Review D, 98, 010001.
11. Weinberg, S. (2000). The quantum theory of fields. III: Supersymmetry. Cambridge: Cambridge University Press.

43 String Theory

As a string, whether open or closed, moves in spacetime, it sweeps out a two-dimensional surface called a worldsheet. String Theory is a quantum field theory which operates on this two-dimensional worldsheet with remarkable consequences in spacetime itself, albeit in higher dimensions. A string is supposedly very small in extension and may “appear” almost point-like if it is indeed very small, say, of the order of the Planck length, as no present experiments can probe distances of such an order. We will learn in Chap. 44, in particular, that string theory (involving closed strings) predicts the existence of a spin two massless particle which is identified with the graviton.¹ Thus if string theory has to do with quantum gravity, it must involve the three fundamental constants: Newton’s gravitational constant $G_N$, the quantum unit of action $\hbar$, and the speed of light $c$. The unit of length emerging from these fundamental constants is the Planck length $\ell_P = \sqrt{\hbar G_N/c^3} \sim 10^{-33}$ cm just mentioned. Due to the assumed non-zero extensions of strings, it is hoped that they provide, naturally, an ultraviolet cut-off $\Lambda \sim (\ell_P)^{-1}$ and give rise to an ultraviolet finite theory. This is unlike conventional quantum field theory interactions, where all the quantum fields are multiplied locally at the same spacetime points, like multiplying distributions, e.g., two delta distributions, at the same point, and are, in this sense, quite troublesome. A remarkable property of string theory, as we will see, is that particles that are needed to describe the dynamics of elementary particles, such as the graviton, arise naturally in the mass spectra of oscillating strings, and are not, a priori, assumed to exist or put in by hand in the underlying theories. Particles are identified as vibrational modes of strings, and a single vibrating string may describe several particles depending on its vibrational modes.
This remarkable property, together with the emergence of the graviton from the theory, gives hope that string theory may provide a unified description of all the fundamental interactions in Nature and, in particular, give rise to a consistent theory of quantum gravity. Massless particles encountered in string theory are really the physically relevant ones here because of the large unit of mass $(\ell_P)^{-1} \sim 10^{19}$ GeV in attributing masses to the spectrum of massive particles.² Strings describing bosonic particles only are referred to as bosonic strings, while the ones describing bosonic as well as fermionic particles are referred to as superstrings. The dimensionality of the spacetime in which the strings live is predicted by the underlying theory as well and is necessarily higher than four. We will learn that internal consistency of the theory at the quantum level dictates a dimensionality of 26 for the bosonic strings (Chap. 44) and a spacetime dimensionality of 10 for the superstrings (Chap. 46). We will see, in particular, how the knowledge of the number of degrees of freedom associated with massless fields leads self-consistently to the determination of the dimensionality of the underlying spacetime of a string theory.³ The extra dimensions are expected to curl up into a space that is too small to be detectable with presently available energies. For example, the surface of a hollow extended cylinder with circular base is two dimensional, with one dimension along the cylinder, and another one encountered as one moves on its circumference. If the radius of the base of the cylinder is relatively small, the cylinder will appear as one dimensional when viewed from a large distance (low energies). Accordingly, the extra dimensions in string theory are expected to be small, and methods, referred to as compactifications, have been developed to deal with them, thus ensuring that the “observable” dimensionality of spacetime is four. Compactification will be studied in Chap. 45.
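The orders of magnitude quoted for the Planck length and for the associated energy scale can be checked from the three constants involved. The sketch below is illustrative: the numerical values of $\hbar$, $c$, $G_N$ and of the joule-to-GeV conversion are CODATA-style inputs assumed here, not values given in the text, and the function names are hypothetical.

```python
import math

# Assumed CODATA-style values in SI units (not quoted in the text).
HBAR = 1.054571817e-34           # J s
C = 299792458.0                  # m / s
G_N = 6.67430e-11                # m^3 kg^-1 s^-2
JOULE_PER_GEV = 1.602176634e-10  # J per GeV


def planck_length_cm():
    """Planck length sqrt(hbar G_N / c^3), converted from m to cm."""
    return math.sqrt(HBAR * G_N / C ** 3) * 100.0


def planck_energy_gev():
    """Planck energy sqrt(hbar c^5 / G_N), converted from J to GeV."""
    return math.sqrt(HBAR * C ** 5 / G_N) / JOULE_PER_GEV
```

With these inputs the length comes out near $1.6 \times 10^{-33}$ cm and the energy near $1.22 \times 10^{19}$ GeV, in line with the orders of magnitude used in the discussion.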
¹ For the underlying pioneering work, see Yoneya [22], Scherk and Schwarz [17].
² A systematic analysis of all the massless field excitations encountered in both bosonic strings and superstrings is carried out in Manoukian [8–10], in their respective higher dimensional spacetimes, and includes the determination of the degrees of freedom associated with them.
³ Note that in four dimensional spacetime the number of degrees of freedom (spin states) of non-scalar massless fields is always two. This is not true in higher dimensional spacetime. For example, the number of degrees of freedom associated with a massless vector particle is 8 in 10 dimensions, while for the graviton it is 35, as shown in Chap. 46.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_43

There exist several superstring theories, and a theory, referred to as M-Theory,⁴ based on non-perturbative methods, is envisaged to unify the existing superstring theories into one single theory, instead of several ones, and be related to them by various limiting and/or transformation rules, referred to as dualities. We will not, however, go into M-Theory in the present introductory treatment of the subject.

String theory was accidentally discovered⁵ through work carried out by Veneziano in 1968 when he attempted to write down consistent explicit expressions of meson-meson scattering amplitudes in strong interaction physics prior to the discovery of QCD.⁶ One of the remarkable properties of the Veneziano scattering amplitude is that, unlike a Feynman scattering amplitude where a single particle exchange is described by a Feynman propagator, an infinite number of particles may be exchanged in the corresponding propagator. We will explicitly derive the Veneziano scattering amplitude in Chap. 47 from string theory, thus witnessing this remarkable property. As early as 1971, fermions were included⁷ in various investigations which eventually led to the notion of superstrings.

Other extended objects, called branes, are also encountered in string theory. In general, they are of higher spatial dimensions than one, with the string defined as a one dimensional brane. For example, we will learn that an open string, satisfying a particular boundary condition, referred to as a Dirichlet boundary condition, specifies a hypersurface, referred to as a D brane, on which the end points of the open string reside. On the other hand, the graviton corresponds to a vibrational mode of closed strings, as we will see, and the latter, having no ends, need not be restricted to a brane and may move away from it. This might explain the weakness of the gravitational field, if our universe is a 3 dimensional brane embedded in a higher dimensional spacetime. As we will see in Chap. 45, a massless particle may acquire mass, through a Higgs-like mechanism in string theory, if, for example, the end points of the open string are attached to two different branes, instead of a single brane. In Chap. 48, we will learn that string theory re-invents Yang-Mills Field Theory, and in Chap. 69, we will learn that it re-invents Einstein’s General Relativity as well, after the reader has acquired a good understanding of Einstein’s theory in the chapters preceding Chap. 69.

Bosonic strings are introduced in Chap. 44, while Chap. 45 deals with compactifications, D branes, as well as a “Higgs-like mechanism” for a massless particle acquiring a mass in string theory, as mentioned above. Chapter 46 deals with superstrings, while Chap. 47 introduces the basics of how vertices, interactions and scattering amplitudes may be described in string theory. Finally, Chap. 48 shows how string theory re-invents the Yang-Mills field theory.

My experience with graduate students, as well as with established physicists with different backgrounds but with almost no training in field theory, is that they have much difficulty learning string theory even by reading several treatments on strings. The main reason for this is that many authors writing about string theory forget that they are dealing with a subject of physics and not merely of mathematics. I have much confidence that a reader of the next 5 chapters on strings will have a pretty good preparation to go into more advanced studies. Your best bet after these introductory 5 chapters is to consult my book: Manoukian [11].⁸

As a preparation for introducing bosonic strings and superstrings, it is instructive to first construct the action integrals of a relativistic particle as well as of a relativistic superparticle, from which the actions of the strings may be generated as generalizations of the actions of these particles:

relativistic particle → bosonic string,
relativistic superparticle → superstring.

A relativistic particle, as it moves in spacetime, traces a curve referred to as its worldline. In Minkowski spacetime, generalized, for convenience in later studies, to $D$ dimensions, with one time variable $x^0$ and $D - 1$ space variables $x^1, \ldots, x^{D-1}$, as set up in a given coordinate system, the interval $\mathrm{d}s$ may be expressed as
$$\mathrm{d}s = c\, \mathrm{d}t \sqrt{1 - \frac{1}{c^2}\, \dot{\mathbf{X}}^2} \simeq -\frac{1}{m c} \Big[\frac{1}{2}\, m \dot{\mathbf{X}}^2 - m c^2\Big]\, \mathrm{d}t, \tag{43.1}$$
where the $X^\mu$, in capital letters, with $\mu = 0, 1, \ldots, D - 1$, $\mathbf{X} = (X^1, \ldots, X^{D-1})$, are the coordinate labels of points on the worldline in the “laboratory” system one has set up, and we have inserted the speed of light constant $c$ for further analysis. Here $\dot{\mathbf{X}} = \mathrm{d}\mathbf{X}/\mathrm{d}t$, and as usual $X^0 = c\, t$. The Minkowski metric is defined by $[\eta_{\mu\nu}] = \mathrm{diag}[-1, 1, \ldots, 1]$. On the extreme right-hand side of Eq. (43.1), we have carried out an expansion for low speeds, and we recognize the term within its square brackets as the non-relativistic kinetic energy minus a constant potential energy. Accordingly, one may define the action of a relativistic particle simply by
$$W_{\rm rel.\,part.} = -m c \int \mathrm{d}s = -m c \int \mathrm{d}\tau \sqrt{-\eta_{\mu\nu}\, \frac{\mathrm{d}X^\mu}{\mathrm{d}\tau}\, \frac{\mathrm{d}X^\nu}{\mathrm{d}\tau}}, \tag{43.2}$$
where in the last integral we have parameterized the integral by the proper time $\tau$. In Box 43.1, it is shown that the action $W$ in (43.2) may be equivalently rewritten as
$$W = \frac{1}{2\kappa} \int \mathrm{d}\tau\, \big(\dot X^\mu \dot X_\mu - m^2 c^2 \kappa^2\big), \qquad \dot X^\mu = \frac{\mathrm{d}X^\mu}{\mathrm{d}\tau}. \tag{43.3}$$
For a massless particle, $m = 0$, $1/\kappa$ may be chosen as any convenient mass scale parameter. On the other hand, for $m \neq 0$, set $\kappa = 1/m$.

To deal with the relativistic superparticle, we note that the integral in Eq. (43.2) is a one dimensional integral of one variable $\tau$. That is, to describe a superparticle and develop an action with worldline supersymmetry, we note that in one dimension, one may introduce one theta, and define a general superfield
$$\Phi^\mu(\tau, \theta) = X^\mu(\tau) + \frac{\mathrm{i}}{\sqrt{2}}\, \theta\, \psi^\mu(\tau), \qquad (\theta)^2 = 0, \tag{43.4}$$
where $\psi^0(\tau), \ldots, \psi^{D-1}(\tau)$ denote $D$ fermion fields. The $\mathrm{i}$ factor in the equation is to ensure the reality of $\Phi^\mu$, since complex conjugation reverses the order of anti-commuting factors in a product, and $\{\psi^\mu, \theta\} = 0$.

⁴ See, e.g., Townsend [19], Witten [21], Duff [5].
⁵ See Nambu [12], Nielsen [14], Susskind [18].
⁶ See Veneziano [20]; see also Lovelace and Squires [7], Di Vecchia [4]. For an elementary treatment of the Veneziano construction, see Manoukian [11], pp. 188–189.
⁷ Neveu and Schwarz [13], Ramond [16].
⁸ For more advanced general studies of string theory, the following references are recommended: Bailin and Love [1], Becker et al. [2], Dine [3], Polchinski [15].

Box 43.1 Derivation of the action W in (43.3) for a relativistic particle

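Varying the action $W$ in Eq. (43.2) with respect to $X^\mu$ gives
$$(\ast) \qquad \frac{\mathrm{d}}{\mathrm{d}\tau} \Big[\Big(-\frac{\mathrm{d}X^\nu}{\mathrm{d}\tau}\, \frac{\mathrm{d}X_\nu}{\mathrm{d}\tau}\Big)^{-1/2}\, \frac{\mathrm{d}}{\mathrm{d}\tau} X^\mu\Big] = 0.$$
This equation may be more conveniently rewritten as a simultaneous set of two equations by introducing, in the process, an additional variable, as a function of $\tau$, as follows. Set
$$(\ast\ast) \qquad m c\, a(\tau) = \Big(-\frac{\mathrm{d}X^\nu}{\mathrm{d}\tau}\, \frac{\mathrm{d}X_\nu}{\mathrm{d}\tau}\Big)^{1/2},$$
to obtain
$$(\ast\ast\ast) \qquad \frac{\mathrm{d}}{\mathrm{d}\tau} \Big[\frac{1}{a(\tau)}\, \frac{\mathrm{d}}{\mathrm{d}\tau} X^\mu(\tau)\Big] = 0.$$
Therefore, instead of working with the action in (43.2), involving the square root expression, one may consider anew the following expression:
$$W = \frac{1}{2} \int \mathrm{d}\tau\, \Big[\frac{1}{a(\tau)}\, \frac{\mathrm{d}X^\mu}{\mathrm{d}\tau}\, \frac{\mathrm{d}X_\mu}{\mathrm{d}\tau} - m^2 c^2\, a(\tau)\Big].$$
Varying with respect to $X^\mu$ gives the expression in $(\ast\ast\ast)$. On the other hand, varying with respect to $a(\tau)$ gives the expression in $(\ast\ast)$. The latter action integral may be rewritten as
$$W = \frac{1}{2} \int \Big[\frac{1}{a(\tau)\, \mathrm{d}\tau}\, \mathrm{d}X^\mu\, \mathrm{d}X_\mu - m^2 c^2\, a(\tau)\, \mathrm{d}\tau\Big],$$
which is invariant under re-parametrizations $\tau \to \tau'$, with the function $a(\tau)$ transforming as $a'(\tau') = a(\tau)\, \mathrm{d}\tau/\mathrm{d}\tau'$. Indeed, under an infinitesimal variation $\delta\tau = \tau - \tau' = \lambda(\tau)$, $\mathrm{d}\tau'/\mathrm{d}\tau = 1 - \mathrm{d}\lambda/\mathrm{d}\tau$, the following variations emerge:
$$\delta X^\mu(\tau) = X^\mu(\tau) - X'^\mu(\tau) = -\dot X^\mu(\tau)\, \lambda(\tau), \qquad \dot X^\mu = \frac{\mathrm{d}X^\mu}{\mathrm{d}\tau}, \qquad \delta a(\tau) = a(\tau) - a'(\tau) = -\frac{\mathrm{d}}{\mathrm{d}\tau} \big(a(\tau)\, \lambda(\tau)\big),$$
and $\delta W = 0$, up to a total derivative. Accordingly, one may choose a gauge such that $a(\tau)$ is some arbitrary constant, say, $\kappa$, having the dimension of (mass)$^{-1}$, to rewrite the action in this gauge as
$$W = \frac{1}{2\kappa} \int \mathrm{d}\tau\, \big(\dot X^\mu \dot X_\mu - m^2 c^2 \kappa^2\big), \qquad \dot X^\mu = \mathrm{d}X^\mu/\mathrm{d}\tau.$$
For a massless particle, $m = 0$, $1/\kappa$ may be chosen as any convenient mass scale parameter. On the other hand, for $m \neq 0$, set $\kappa = 1/m$.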
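The equivalence stated in Box 43.1 can be illustrated at the level of the integrands: inserting the solution $(\ast\ast)$ for $a(\tau)$ into the integrand with the auxiliary variable gives back the square-root integrand of (43.2). The sketch below checks this numerically for a timelike velocity; the function names and the sample values are illustrative assumptions.

```python
import math


def square_root_integrand(xdot_sq, m, c=1.0):
    """Integrand of (43.2): -m c sqrt(-Xdot.Xdot); needs xdot_sq < 0."""
    return -m * c * math.sqrt(-xdot_sq)


def auxiliary_integrand(xdot_sq, a, m, c=1.0):
    """Integrand with the auxiliary variable a: (Xdot.Xdot / a - m^2 c^2 a) / 2."""
    return 0.5 * (xdot_sq / a - m ** 2 * c ** 2 * a)


def auxiliary_solution(xdot_sq, m, c=1.0):
    """Value of a fixed by (**): m c a = sqrt(-Xdot.Xdot)."""
    return math.sqrt(-xdot_sq) / (m * c)


# Sample timelike value Xdot.Xdot = -4 with m = 2, c = 1 (arbitrary choices).
XDOT_SQ, MASS = -4.0, 2.0
A_STAR = auxiliary_solution(XDOT_SQ, MASS)
ON_SHELL = auxiliary_integrand(XDOT_SQ, A_STAR, MASS)
```

For the sample values, `ON_SHELL` equals `square_root_integrand(XDOT_SQ, MASS)`, and nearby values of `a` give a smaller integrand, so that $(\ast\ast)$ is indeed the stationary point with respect to $a$.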

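We may, in turn, define a one dimensional supersymmetry transformation
$$\tau' = \tau + \frac{\mathrm{i}}{2}\, \epsilon\, \theta, \qquad \theta' = \theta + \epsilon. \tag{43.5}$$
Accordingly, with $\Phi'^\mu(\tau', \theta') = \Phi^\mu(\tau, \theta)$, and hence $\Phi'^\mu(\tau, \theta) = \Phi^\mu(\tau - (\mathrm{i}/2)\, \epsilon\, \theta,\ \theta - \epsilon)$, i.e.,
$$\delta\Phi^\mu(\tau, \theta) = \Phi^\mu(\tau, \theta) - \Phi'^\mu(\tau, \theta) = \epsilon \Big(\frac{\partial}{\partial\theta} + \frac{\mathrm{i}}{2}\, \theta\, \frac{\partial}{\partial\tau}\Big) \Phi^\mu(\tau, \theta), \tag{43.6}$$
$$\Rightarrow \quad \delta X^\mu = \frac{\mathrm{i}}{\sqrt{2}}\, \epsilon\, \psi^\mu, \qquad \delta\psi^\mu = \frac{1}{\sqrt{2}}\, \epsilon\, \dot X^\mu, \qquad \dot X^\mu = \frac{\mathrm{d}X^\mu}{\mathrm{d}\tau}. \tag{43.7}$$
Using the, by now, well known integrals $\int \mathrm{d}\theta = 0$, $\int \mathrm{d}\theta\, \theta = 1$, we may define the action of a relativistic superparticle, consistent with the transformation rules in Eq. (43.7), by
$$W_{\rm rel.\,superpart.} = -\frac{1}{2\kappa} \int \mathrm{d}\theta\, \mathrm{d}\tau\, \theta\, \Big[\dot X^\mu \dot X_\mu + \frac{1}{\mathrm{i}}\, \psi^\mu \dot\psi_\mu\Big] = -\frac{1}{2\kappa} \int \mathrm{d}\tau\, \Big[\dot X^\mu \dot X_\mu + \frac{1}{\mathrm{i}}\, \psi^\mu \dot\psi_\mu\Big], \tag{43.8}$$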

and if $\tau$ is taken as a dimensionless parameter, $\kappa$, an arbitrary constant, would have the dimensions of [length]$^2$ in units for which $\hbar$ and $c$ are set equal to one.

It is rather surprising to me to hear those who criticize string theory by stating, in one way or the other, that string theory is “physically irrelevant”. It is not out of the question that certain critics may not be very well informed. In this respect, I would like to re-iterate what has already been mentioned above in this chapter about some remarkable properties that string theory possesses:
(1) Particles that are needed to describe the dynamics of elementary particles emerge naturally from the theory and are not, a priori, assumed to exist or put in by hand in the underlying theory. This includes the elusive graviton, as we will witness, e.g., in Chap. 44.
(2) General Relativity (GR) may be recovered from string theory (Chap. 69). The success of GR as our present theory in describing gravitational phenomena, including in cosmology, has been well recorded and will be witnessed in several chapters in this book.
(3) Yang-Mills Field Theory may be recovered from string theory (Chap. 48). Needless to say, the success of the description of the dynamics of elementary particles in Nature, based on Yang-Mills Field Theory (quantum field theory in terms of gauge fields), has been well recorded and is well witnessed in several chapters in this book. We recall what the legendary Feynman [6], p. 1, after all, said in his lectures on the quantum field theory description of fundamental processes in Nature: that his lectures cover all of physics.
I may add to these aspects of the theory by saying that string theory may, hopefully, solve the ultraviolet divergence problem owing to the finite extension of a string. I may add that it may, hopefully, lead to a consistent quantum theory of gravitation. I may add that it may, hopefully, provide a unified description of the fundamental forces in Nature which includes gravitation. But the above three aspects of string theory alone justify learning the subject of strings and carrying out research in it, thus leading, hopefully, to further developments in the field by putting, in the process, emphasis on physical considerations related to the description of Nature.

References

1. Bailin, D., & Love, A. (1994). Supersymmetric gauge field theory and string theory. Bristol: Institute of Physics Publishing.
2. Becker, K., Becker, M., & Schwarz, J. H. (2006). String theory and M-theory: A modern introduction. Cambridge: Cambridge University Press.
3. Dine, M. (2007). Supersymmetry and string theory: Beyond the standard model. Cambridge: Cambridge University Press.
4. Di Vecchia, P. (2008). The birth of string theory. In M. Gasperini & J. Maharana (Eds.), String theory and fundamental interactions: Gabriele Veneziano and theoretical physics: Historical and contemporary perspectives. Lecture Notes in Physics (Vol. 737, pp. 59–118). Berlin: Springer.
5. Duff, M. J. (1996). M-Theory (The theory formerly known as superstrings). International Journal of Modern Physics A, 11, 5623–5642.
6. Feynman, R. P. (1982). The theory of fundamental processes. Menlo Park, California: The Benjamin/Cummings Publishing Co., 6th Printing.
7. Lovelace, C., & Squires, E. (1970). Veneziano theory. Proceedings of the Royal Society of London, A318, 321–353.
8. Manoukian, E. B. (2012). All the fundamental massless bosonic fields in bosonic string theory. Fortschritte der Physik, 60, 329–336.
9. Manoukian, E. B. (2012). All the fundamental bosonic massless fields in superstring theory. Fortschritte der Physik, 60, 337–344.
10. Manoukian, E. B. (2012). All the fundamental massless fermion fields in superstring theory: A rigorous analysis. Journal of Modern Physics, 3, 1027–1030.
11. Manoukian, E. B. (2016). Quantum field theory II: Introductions to quantum gravity, supersymmetry and string theory. Switzerland: Springer.
12. Nambu, Y. (1969). In Proceedings of International Conference on Symmetries and Quark Models (p. 269). New York: Wayne State University, Gordon and Breach.
13. Neveu, A., & Schwarz, J. H. (1971). Factorizable dual model of pions. Nuclear Physics B, 31, 86–112.
14. Nielsen, H. (1970). International Conference on High Energy Physics, Kiev Conference, Kiev.
15. Polchinski, J. (2005). Superstring theory (Vols. I and II). Cambridge: Cambridge University Press.
16. Ramond, P. (1971). Dual theory for free fermions. Physical Review D, 3, 2415–2418.
17. Scherk, J., & Schwarz, J. H. (1974). Dual models for non-hadrons. Nuclear Physics B, 81, 118–144.
18. Susskind, L. (1970). Dual symmetric theory of hadrons. I. Nuovo Cimento A, 69, 457–496.
19. Townsend, P. K. (1995). The eleven-dimensional supermembrane revisited. Physics Letters, B350, 184–187.
20. Veneziano, G. (1968). Construction of a crossing-symmetric Regge-behaved amplitude for linearly rising trajectories. Nuovo Cimento, 57A, 190–197.
21. Witten, E. (1995). String theory dynamics in various dimensions. Nuclear Physics B, 443, 85–126.
22. Yoneya, T. (1974). Connection of dual models to electrodynamics and gravidynamics. Progress of Theoretical Physics, 51, 1907–1920.

44 Bosonic Strings

Prerequisite Chaps. 16, 43

The points on the worldsheet of a string are parametrized by coordinates $\xi^{0} = \tau$, $\xi^{1} = \sigma$, which correspond, respectively, to timelike and spacelike directions. That is, $\xi^{1} = \sigma$ is the spatial coordinate along the string, i.e., it provides a partition of the string for somebody "sitting" on it, while $\xi^{0} = \tau$ describes its propagation in time.

In a coordinate system set up ("laboratory system"), a point on the string is labeled by $X^{\mu}(\tau,\sigma)$, $\mu = 0, 1, \ldots, D-1$, in a $D$-dimensional Minkowski spacetime. If the worldsheet of the string itself were Minkowskian, i.e., flat, the Lagrangian density $\mathscr{L}$, multiplied by $d\tau\,d\sigma$, involving the $D$ real bosonic fields $X^{0}, X^{1}, \ldots, X^{D-1}$, would be simply given by a sum over all these fields (see Eq. (21.22) in Chap. 22)

$$d\tau\,d\sigma\,\mathscr{L}_{\text{Mink.}} \propto d\tau\,d\sigma\left[-\frac{1}{2}\,\eta^{\alpha\beta}\,\partial_{\alpha}X^{\mu}\,\partial_{\beta}X_{\mu}\right],\qquad \alpha,\beta \text{ each taking the values } \tau,\sigma, \tag{44.1}$$

with a summation over the Minkowski index $\mu$ corresponding to the "laboratory" frame, and the factor 1/2 is due to the reality of the $D$ fields. More generally, a curved worldsheet endowed with a metric $h_{\alpha\beta}$ simply generalizes the above expression as follows¹

$$\eta^{\alpha\beta}\to h^{\alpha\beta},\qquad d\tau\,d\sigma \to \sqrt{-\det[h_{\alpha\beta}]}\;d\tau\,d\sigma, \tag{44.2}$$

and gives rise to an action

$$W = -\frac{1}{2\kappa}\int \sqrt{-h}\;d\tau\,d\sigma\;h^{\alpha\beta}\,\partial_{\alpha}X^{\mu}\,\partial_{\beta}X_{\mu},\qquad h\equiv\det[h_{\alpha\beta}], \tag{44.3}$$

where $\kappa$ is a constant that will be elaborated upon below. In Boxes 44.1 and 44.2, it is shown that one can always parametrize a worldsheet such that the action takes the simple Minkowskian form, with the reparametrization referred to as the light-cone gauge given in Box 44.2. Without loss of generality, we will still use the standard notation $\tau$ and $\sigma$ for the resulting parameters and consider them as dimensionless.

¹ We will have ample opportunity to deal with curved structures in general relativity in subsequent chapters.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_44


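Box 44.1 Diagonalization of a metric in 2 dimensions

Consider a coordinate transformation $\xi = (\xi^{0},\xi^{1}) \to \rho = (\rho^{0},\rho^{1})$ in 2 dimensions. The underlying metric transforms as follows: $h_{\alpha\beta}(\xi) \to \hat{h}_{\alpha'\beta'}(\rho)$, or

$$\hat{h}_{\alpha'\beta'}(\rho) = h_{\alpha\beta}(\xi)\,\frac{\partial\xi^{\alpha}}{\partial\rho^{\alpha'}}\,\frac{\partial\xi^{\beta}}{\partial\rho^{\beta'}}.$$

The latter may be rewritten as

$$\begin{pmatrix}\hat{h}_{00} & \hat{h}_{01}\\ \hat{h}_{10} & \hat{h}_{11}\end{pmatrix} = \begin{pmatrix}\dfrac{\partial\xi^{0}}{\partial\rho^{0}} & \dfrac{\partial\xi^{1}}{\partial\rho^{0}}\\[2mm] \dfrac{\partial\xi^{0}}{\partial\rho^{1}} & \dfrac{\partial\xi^{1}}{\partial\rho^{1}}\end{pmatrix}\begin{pmatrix}h_{00} & h_{01}\\ h_{10} & h_{11}\end{pmatrix}\begin{pmatrix}\dfrac{\partial\xi^{0}}{\partial\rho^{0}} & \dfrac{\partial\xi^{0}}{\partial\rho^{1}}\\[2mm] \dfrac{\partial\xi^{1}}{\partial\rho^{0}} & \dfrac{\partial\xi^{1}}{\partial\rho^{1}}\end{pmatrix}.$$

This leads us to consider the equation

$$\begin{pmatrix}\hat{h}_{00} & \hat{h}_{01}\\ \hat{h}_{10} & \hat{h}_{11}\end{pmatrix} = \begin{pmatrix}a & b\\ c & d\end{pmatrix}\begin{pmatrix}h_{00} & h_{01}\\ h_{10} & h_{11}\end{pmatrix}\begin{pmatrix}a & c\\ b & d\end{pmatrix},$$

with the first matrix on the right-hand side being the transpose of the third one. One may explicitly write:

$$\hat{h}_{00} = h_{00}\,a^{2} + 2h_{01}\,ab + h_{11}\,b^{2},\qquad \hat{h}_{01} = h_{00}\,ac + h_{01}(bc + ad) + h_{11}\,bd,\qquad \hat{h}_{11} = h_{00}\,c^{2} + 2h_{01}\,cd + h_{11}\,d^{2},$$

where $h = \det[h_{\alpha\beta}] = h_{00}h_{11} - h_{01}^{2}$.

For $h_{00} = 0$, and hence $h = -h_{01}^{2} \neq 0$, choose $d = b \neq 0$, $a = -(1 + b^{2}h_{11})/2b\,h_{01}$, $c = (1 - b^{2}h_{11})/2b\,h_{01}$, giving $[\hat{h}_{\alpha\beta}] = \mathrm{diag}[-1, 1]$.

For $h_{00} > 0$, choose $a = h_{01}/\sqrt{-h}$, $b = -h_{00}/\sqrt{-h}$, $c = \pm 1$, $d = 0$. This gives $[\hat{h}_{\alpha\beta}] = (h_{00})\,\mathrm{diag}[-1, 1]$.

For $h_{00} < 0$, choose $c = h_{01}/\sqrt{-h}$, $d = -h_{00}/\sqrt{-h}$, $a = \pm 1$, $b = 0$. This gives $[\hat{h}_{\alpha\beta}] = (-h_{00})\,\mathrm{diag}[-1, 1]$.

That is, we may write $[\hat{h}_{\alpha\beta}] = e^{\varphi}\,\mathrm{diag}[-1, 1]$, with $e^{\varphi}$ defining a positive function.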
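The three choices of $(a, b, c, d)$ in Box 44.1 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `transform` and `diagonalize` and the free parameter `b_free` are illustrative assumptions, and the sign choice $c = +1$ (or $a = +1$) is picked arbitrarily from the $\pm 1$ allowed in the box.

```python
import math

def transform(h00, h01, h11, a, b, c, d):
    """Return (hhat00, hhat01, hhat11) for M h M^T with M = [[a, b], [c, d]]."""
    hh00 = h00 * a * a + 2 * h01 * a * b + h11 * b * b
    hh01 = h00 * a * c + h01 * (b * c + a * d) + h11 * b * d
    hh11 = h00 * c * c + 2 * h01 * c * d + h11 * d * d
    return hh00, hh01, hh11

def diagonalize(h00, h01, h11, b_free=1.0):
    """Apply the Box 44.1 choices; assumes det[h] < 0 (one timelike direction)."""
    h = h00 * h11 - h01 * h01
    if h >= 0:
        raise ValueError("expected det[h] < 0")
    if h00 == 0:
        b = d = b_free  # any nonzero value works (hypothetical parameter)
        a = -(1 + b * b * h11) / (2 * b * h01)
        c = (1 - b * b * h11) / (2 * b * h01)
    elif h00 > 0:
        s = math.sqrt(-h)
        a, b, c, d = h01 / s, -h00 / s, 1.0, 0.0
    else:
        s = math.sqrt(-h)
        a, b, c, d = 1.0, 0.0, h01 / s, -h00 / s
    return transform(h00, h01, h11, a, b, c, d)

# One sample metric per case; each result is proportional to diag[-1, 1].
for metric in [(0.0, 2.0, 3.0), (2.0, 1.0, -3.0), (-2.0, 1.0, 3.0)]:
    print(metric, "->", diagonalize(*metric))
```

In each case the off-diagonal entry vanishes and the diagonal entries are $-e^{\varphi}$ and $+e^{\varphi}$ with $e^{\varphi} = 1$, $h_{00}$, or $-h_{00}$, respectively.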

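Box 44.2 The light-cone gauge

We can relate the diagonal metric in Box 44.1, $[\hat{h}_{\alpha\beta}] = \mathrm{diag}[-e^{\varphi}, e^{\varphi}]$, to the Minkowski one via a transformation rule as follows:

$$e^{\varphi}\,\eta_{\alpha\beta} = \eta_{\alpha'\beta'}\,\frac{\partial\zeta^{\alpha'}}{\partial\rho^{\alpha}}\,\frac{\partial\zeta^{\beta'}}{\partial\rho^{\beta}}.$$

Multiplying the latter by minus one gives

$$e^{\varphi} = \frac{\partial\zeta^{0}}{\partial\rho^{0}}\frac{\partial\zeta^{0}}{\partial\rho^{0}} - \frac{\partial\zeta^{1}}{\partial\rho^{0}}\frac{\partial\zeta^{1}}{\partial\rho^{0}},\qquad 0 = \frac{\partial\zeta^{0}}{\partial\rho^{0}}\frac{\partial\zeta^{0}}{\partial\rho^{1}} - \frac{\partial\zeta^{1}}{\partial\rho^{0}}\frac{\partial\zeta^{1}}{\partial\rho^{1}},\qquad -e^{\varphi} = \frac{\partial\zeta^{0}}{\partial\rho^{1}}\frac{\partial\zeta^{0}}{\partial\rho^{1}} - \frac{\partial\zeta^{1}}{\partial\rho^{1}}\frac{\partial\zeta^{1}}{\partial\rho^{1}}.$$

Clearly

$$\frac{\partial\zeta^{0}}{\partial\rho^{0}} = \frac{\partial\zeta^{1}}{\partial\rho^{1}},\qquad \frac{\partial\zeta^{1}}{\partial\rho^{0}} = \frac{\partial\zeta^{0}}{\partial\rho^{1}}.$$

Linear combinations of these two equations lead to

$$\left(\frac{\partial}{\partial\rho^{0}} - \frac{\partial}{\partial\rho^{1}}\right)\zeta^{+} = 0,\qquad \left(\frac{\partial}{\partial\rho^{0}} + \frac{\partial}{\partial\rho^{1}}\right)\zeta^{-} = 0,\qquad\text{where}\quad \zeta^{+} = \frac{\zeta^{0} + \zeta^{1}}{\sqrt{2}},\quad \zeta^{-} = \frac{\zeta^{0} - \zeta^{1}}{\sqrt{2}},$$

with $\rho^{\pm}$ defined analogously in terms of $\rho^{0}$, $\rho^{1}$.

Since $[\,(\partial/\partial\rho^{0}) \pm (\partial/\partial\rho^{1})\,]\rho^{\mp} = 0$, we may infer that $\zeta^{\pm}$ is each a linear combination of functions of $\rho^{\pm}$. Since in turn $\zeta^{0} = (\zeta^{+} + \zeta^{-})/\sqrt{2}$, we note that $\zeta^{0}$ is a linear combination of functions of $\rho^{+}$ and $\rho^{-}$. That is, $\zeta^{0}$ satisfies the free field equation:

$$\frac{\partial}{\partial\rho^{+}}\frac{\partial}{\partial\rho^{-}}\,\zeta^{0} = 0\qquad\text{or}\qquad \left[\left(\frac{\partial}{\partial\rho^{1}}\right)^{2} - \left(\frac{\partial}{\partial\rho^{0}}\right)^{2}\right]\zeta^{0} = 0.$$

The light-cone gauge consists of choosing $\zeta^{0}$ as a simple linear combination $\zeta^{0} = c_{1} + c_{2}X^{+}$, since $X^{+}$, in particular, also satisfies the free field equation mentioned above. Here $c_{1}$, $c_{2}$ are constants.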


That is, the action of the bosonic string will be taken in this gauge as

$$W = -\frac{1}{2\pi\ell^{2}}\int (d\zeta)\;\eta^{\alpha\beta}\,(\partial_{\alpha}X)\cdot(\partial_{\beta}X),\qquad (d\zeta)\equiv d\zeta^{0}\,d\zeta^{1}\equiv d\tau\,d\sigma,\qquad [\eta_{\alpha\beta}] = \mathrm{diag}[-1, 1], \tag{44.4}$$

which is the action of free fields $X^{\mu}$: $[\partial_{\sigma}^{2} - \partial_{\tau}^{2}]X^{\mu} = 0$, where the dot ($\cdot$) denotes a scalar product in Minkowski spacetime, $A\cdot B = A^{\mu}B_{\mu}$. We have used the fact that, since the $X^{\mu}$ have the dimension of length, $\tau$, $\sigma$ are taken as dimensionless, and $\hbar$, $c$ are set equal to one, $\kappa$ in Eq. (44.3) must have the dimension of (length)$^2$. We use the standard notation for it: $\kappa = \pi\ell^{2}$, with $\ell$ having the dimension of length. Another standard notation used often is to set

$$\ell^{2} = 2\alpha',\qquad\text{with } 1/\sqrt{\alpha'} \text{ providing a mass scale.} \tag{44.5}$$

We will see below, in Eq. (44.27), how $1/\sqrt{\alpha'}$ sets the mass scale of the particles generated.² Here we pause for a moment to note that in Eq. (44.3) there is no kinetic term for the metric $h_{\alpha\beta}$ and that the so-called worldsheet energy-momentum tensor, being the source of its field equation, should vanish. This would be the same if the Faraday tensor $F^{\mu\nu}$ is "formally" set equal to zero in the Maxwell Lagrangian density, or equivalently in Maxwell's equation $-\partial_{\mu}F^{\mu\nu} = J^{\nu}$, leading to a vanishing electromagnetic current as the source in the latter field equation.³ In Box 44.3, we provide a method for computing the energy-momentum tensor. It is convenient to work with light-cone variables defined by

$$X^{+} = \frac{X^{0} + X^{D-1}}{\sqrt{2}},\qquad X^{-} = \frac{X^{0} - X^{D-1}}{\sqrt{2}},\qquad X^{i},\ i = 1,\ldots,D-2, \tag{44.6}$$

$$X\cdot X = \mathbf{X}\cdot\mathbf{X} - X^{0}X^{0} = X^{i}X^{i} - X^{+}X^{-} - X^{-}X^{+}. \tag{44.7}$$
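The decomposition in Eqs. (44.6) and (44.7) can be checked on arbitrary vectors. The sketch below is an illustration only; the helper names `minkowski_dot`, `light_cone` and `light_cone_dot` are assumptions introduced here, and the metric signature is taken as $\mathrm{diag}[-1, 1, \ldots, 1]$ as in the text.

```python
import math

def minkowski_dot(a, b):
    """A.B = A^mu B_mu with metric diag(-1, +1, ..., +1); index 0 is time."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

def light_cone(v):
    """Split v = (v^0, ..., v^{D-1}) into (v^+, v^-, transverse components)."""
    plus = (v[0] + v[-1]) / math.sqrt(2)
    minus = (v[0] - v[-1]) / math.sqrt(2)
    return plus, minus, list(v[1:-1])

def light_cone_dot(a, b):
    """A.B written as A^i B^i - A^+ B^- - A^- B^+, cf. Eq. (44.7)."""
    ap, am, at = light_cone(a)
    bp, bm, bt = light_cone(b)
    return sum(x * y for x, y in zip(at, bt)) - ap * bm - am * bp

a = [1.0, 2.0, 3.0, 4.0]
b = [0.5, -1.0, 2.0, 7.0]
print(minkowski_dot(a, b), light_cone_dot(a, b))
```

Both expressions agree for any pair of vectors, which is the statement of Eq. (44.7) when $A = B = X$.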

In terms of the $\zeta = (\tau,\sigma)$ variables in the action (44.4), we have seen in Box 44.2 that, in the light-cone gauge, $\zeta^{0}$ may be chosen as a linear combination $\zeta^{0} = c_{1} + c_{2}X^{+}$. This, in turn, allows us to write $X^{+}$ in terms of two constants $x^{+}$, $p^{+}$, whose interpretations will emerge later, as follows:

$$X^{+} = x^{+} + \ell^{2}p^{+}\tau,\qquad \left(\partial_{\tau}X^{+} = \ell^{2}p^{+},\quad \partial_{\sigma}X^{+} = 0\right). \tag{44.8}$$

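Since the action in Eq. (44.4) gives free field equations we have, in particular, $\Box X^{i} = 0$, $i = 1,\ldots,D-2$. We also note from the equation of $X^{+}$ above that $\partial_{\sigma}X^{+} = 0$, as indicated above between parentheses. From the constraints derived in Box 44.3, we obtain the following field equations for bosonic strings

$$X^{+} = x^{+} + \ell^{2}p^{+}\tau,\qquad \ell^{2}p^{+}\,\partial_{\sigma}X^{-} = \partial_{\tau}X^{i}\,\partial_{\sigma}X^{i}, \tag{44.9}$$

$$\ell^{2}p^{+}\,\partial_{\tau}X^{-} = \frac{1}{2}\left[(\partial_{\sigma}X^{i})^{2} + (\partial_{\tau}X^{i})^{2}\right],\qquad \Box X^{i} = 0,\ i = 1,\ldots,D-2,\qquad \Box\equiv\partial_{\sigma}^{2} - \partial_{\tau}^{2}, \tag{44.10}$$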

where the constraints on the derivatives of X − , consisting of the second equation in (44.9), and the first one in (44.10) derived in Box 44.3, follow, in particular, directly from the vanishing of the worldsheet energy-momentum tensor now in flat space. To solve these equations, we have to set up boundary conditions (B.C) for the strings. As mentioned earlier, the coordinates (τ, σ ) are chosen dimensionless, with τi ≤ τ ≤ τ f , and 0 ≤ σ ≤ π , for both open and closed strings with an obvious periodicity condition for closed strings over the coordinate σ .

² For an almost "point-like" string, an intrinsic length $\ell$ of the order of the Planck length ($\sim 10^{-33}$ cm) sets up, in particular, a very large value for the mass scale $1/\sqrt{\alpha'}$, which would be of the order of the Planck mass $\sim 10^{19}$ GeV/$c^{2}$.

³ Another simple example is the following. Consider the Lagrangian density $\mathscr{L}$ involving three scalar fields $\phi_{1}$, $\phi_{2}$, $\phi_{3}$ in 4 dimensional Minkowskian spacetime, with action $\int (dx)\,\mathscr{L}$, given by $\mathscr{L} = \sum_{i=1,2}\left[-(1/2)\,\partial^{\mu}\phi_{i}\,\partial_{\mu}\phi_{i}\right] + \phi_{3}\,f_{1}(\phi_{1},\phi_{2}) + \phi_{3}^{2}\,f_{2}(\phi_{1},\phi_{2})$, where the $f_{1}(\phi_{1},\phi_{2})$, $f_{2}(\phi_{1},\phi_{2})$ terms give rise to interaction terms. We note that the Lagrangian density does not include a "kinetic term" $-(1/2)\,\partial^{\mu}\phi_{3}\,\partial_{\mu}\phi_{3}$ for the $\phi_{3}$ field, and the Euler-Lagrange equation for the latter field gives $0 = f_{1}(\phi_{1},\phi_{2}) + 2\,\phi_{3}\,f_{2}(\phi_{1},\phi_{2})$, which is an equation satisfied by the $\phi_{3}$ field.


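Box 44.3 Vanishing of the worldsheet energy-momentum tensor and emerging constraints

Consider the response of the action in (44.4) under an infinitesimal transformation of the fields: $\delta X^{\mu} = \epsilon^{\alpha}(\zeta)\,\partial_{\alpha}X^{\mu}$, $\delta\psi^{\mu} = \epsilon^{\alpha}(\zeta)\,\partial_{\alpha}\psi^{\mu}(\zeta)$. Using the free field equations of the fields, this leads, up to total derivatives, to

$$\delta W = -\frac{1}{2\pi\ell^{2}}\int (d\zeta)\;\partial^{\alpha}\epsilon^{\beta}(\zeta)\,T_{\alpha\beta}(\zeta),\qquad\text{where}\quad T_{\alpha\beta} = \partial_{\alpha}X\cdot\partial_{\beta}X - \frac{1}{2}\,\eta_{\alpha\beta}\,\partial^{\gamma}X\cdot\partial_{\gamma}X$$

is the energy-momentum tensor. The vanishing of $T_{\alpha\beta}$ allows us to consider the constraint $T_{00} = 0$. That is,

$$0 = \left[\partial_{0}X\cdot\partial_{0}X + \partial_{1}X\cdot\partial_{1}X\right]/2 = -\partial_{1}X^{+}\,\partial_{1}X^{-} - \partial_{0}X^{+}\,\partial_{0}X^{-} + (1/2)\left[(\partial_{1}X^{i})^{2} + (\partial_{0}X^{i})^{2}\right]$$

by using (44.7). Finally, $\partial_{1}X^{+} = 0$, $\partial_{0}X^{+} = \ell^{2}p^{+}$ lead to the first equation in (44.10). Similarly, $0 = T_{01} = \partial_{0}X\cdot\partial_{1}X = \partial_{0}X^{i}\,\partial_{1}X^{i} - \partial_{0}X^{+}\,\partial_{1}X^{-} - 0$, giving the second equation in (44.9).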

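The response of the action $W$ in Eq. (44.4) for a bosonic string to the variation of the fields $\delta X^{\mu}$ is given by

$$\delta W = -\frac{1}{\pi\ell^{2}}\int (d\xi)\;\eta^{\alpha\beta}\,(\partial_{\alpha}\delta X)\cdot\partial_{\beta}X = \frac{1}{\pi\ell^{2}}\left[\int (d\xi)\,(\delta X\cdot\Box X) - \int (d\xi)\;\partial^{\alpha}(\delta X\cdot\partial_{\alpha}X)\right]. \tag{44.11}$$

The first integral in the second equality gives the free field equations $\Box X^{\mu} = 0$. On the other hand, the second integral within the square brackets becomes

$$-\int d\tau\,\left[\,\delta X\cdot\partial_{\sigma}X\,\Big|_{\sigma=\pi} - \delta X\cdot\partial_{\sigma}X\,\Big|_{\sigma=0}\right], \tag{44.12}$$

for $\delta X(\tau_{f},\sigma) = 0 = \delta X(\tau_{i},\sigma)$. For closed strings, we impose the periodicity condition

$$X^{\mu}(\tau,\sigma) = X^{\mu}(\tau,\sigma+\pi),\qquad\text{for all } \tau:\ \tau_{i}\le\tau\le\tau_{f}, \tag{44.13}$$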

which implies the equality of the two terms in Eq. (44.12), leading to $\delta W = 0$. For open strings, the vanishing of the integral in (44.12) leads us to consider the following two immediate boundary conditions:

$$\text{Neumann B.C.:}\qquad \text{Choose}\quad \partial_{\sigma}X^{\mu}(\tau,\sigma)\Big|_{\sigma=\pi} = 0 = \partial_{\sigma}X^{\mu}(\tau,\sigma)\Big|_{\sigma=0}, \tag{44.14}$$

with the ends of the string not taken to be fixed. In the other extreme case, we have the

$$\text{Dirichlet B.C.:}\qquad \text{Choose}\quad X^{\mu}\Big|_{\sigma=\pi} = \bar{X}^{\mu},\quad X^{\mu}\Big|_{\sigma=0} = \bar{X}'^{\mu},\quad\text{and}\quad \delta X^{\mu}(\tau,\sigma)\Big|_{\sigma=0,\pi} = 0, \tag{44.15}$$

where $\bar{X}^{\mu}$, $\bar{X}'^{\mu}$ are constants and the positions of the end points do not change with $\tau$. That is, the end points of the string are fixed. This also implies that the ends of the string carry no momentum. One may also consider mixed B.C. between the two just given, with interesting consequences which will be discussed in the next chapter.

As an example, consider an open string with Neumann B.C., i.e., satisfying the set of equations (44.9), (44.10) and (44.14). We may carry out a Fourier expansion of the fields $X^{i}$ as follows:

$$X^{i}(\tau,\sigma) = x^{i} + \ell^{2}p^{i}\,\tau + i\,\ell\sum_{n\neq 0}\frac{\alpha^{i}(n)}{n}\,e^{-in\tau}\cos n\sigma, \tag{44.16}$$

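where note that $\partial_{\sigma}\cos n\sigma\,|_{\sigma=\pi,0} = 0$. The canonical momentum conjugate to $X_{\mu}$ is given by

$$P^{\mu} = \frac{\partial\mathscr{L}}{\partial\dot{X}_{\mu}} = \frac{1}{\pi\ell^{2}}\,\dot{X}^{\mu},\qquad \mathscr{L} = -\frac{1}{2\pi\ell^{2}}\,\eta^{\alpha\beta}\,(\partial_{\alpha}X)\cdot(\partial_{\beta}X),\qquad \dot{X}^{\mu} = \frac{\partial X^{\mu}}{\partial\tau}. \tag{44.17}$$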
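As a sanity check on the open-string expansion (44.16), one may verify numerically that it is real, solves $\Box X^{i} = 0$, and obeys the Neumann condition (44.14). The sketch below is illustrative only: the function names, the sample values of $x$, $p$, $\ell$ and of the two mode coefficients are arbitrary assumptions, and derivatives are approximated by central finite differences.

```python
import cmath
import math

def X_open(tau, sigma, x=0.0, p=0.3, ell=1.0, alphas=None):
    """Truncated version of Eq. (44.16); alphas maps n > 0 to alpha(n),
    and alpha(-n) is taken as the complex conjugate so that X is real."""
    alphas = alphas or {1: 0.5 + 0.2j, 2: -0.1j}  # hypothetical sample modes
    total = x + ell**2 * p * tau
    for n, a in alphas.items():
        for m, coeff in ((n, a), (-n, a.conjugate())):
            total += 1j * ell * coeff / m * cmath.exp(-1j * m * tau) * math.cos(m * sigma)
    return total

def wave_residual(tau, sigma, h=1e-3):
    """|(d^2/dsigma^2 - d^2/dtau^2) X| via second-order central differences."""
    d2_sigma = (X_open(tau, sigma + h) - 2 * X_open(tau, sigma) + X_open(tau, sigma - h)) / h**2
    d2_tau = (X_open(tau + h, sigma) - 2 * X_open(tau, sigma) + X_open(tau - h, sigma)) / h**2
    return abs(d2_sigma - d2_tau)

def neumann_residual(tau, sigma_end, h=1e-4):
    """|dX/dsigma| at an end point sigma_end (0 or pi)."""
    return abs((X_open(tau, sigma_end + h) - X_open(tau, sigma_end - h)) / (2 * h))

print(abs(X_open(0.7, 1.1).imag), wave_residual(0.7, 1.1))
print(neumann_residual(0.7, 0.0), neumann_residual(0.7, math.pi))
```

All four printed numbers are zero up to discretization and rounding error, consistent with the properties claimed for (44.16).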


That is,

$$P^{i}(\tau,\sigma) = \frac{1}{\pi}\,p^{i} + \frac{1}{\pi\ell}\sum_{n\neq 0}\alpha^{i}(n)\,e^{-in\tau}\cos n\sigma,\qquad \int_{0}^{\pi}d\sigma\,P^{i} = p^{i}, \tag{44.18}$$

and $p^{i}$ denotes the $i$th component of the momentum of the string in the specified directions, and $x^{i} + \ell^{2}p^{i}\tau$ in (44.16) denotes the $i$th component of its center of mass position. We note that $P^{i}(\tau,\sigma)$ above may be conveniently rewritten as

$$P^{i}(\tau,\sigma) = \frac{1}{\pi\ell}\sum_{n=-\infty}^{\infty}\alpha^{i}(n)\,e^{-in\tau}\cos n\sigma,\qquad \ell\,p^{i}\equiv\alpha^{i}(0). \tag{44.19}$$

In Box 44.4, the solution for $X^{-}(\tau,\sigma)$ is worked out from the first equation in (44.10), and it is shown, in particular, that

$$2\,\ell^{2}p^{+}p^{-} = (\alpha^{i}(0))^{2} + \sum_{n\neq 0}\alpha^{i}(-n)\,\alpha^{i}(n) = \ell^{2}p^{i}p^{i} + \sum_{n\neq 0}\alpha^{i}(-n)\,\alpha^{i}(n), \tag{44.20}$$

where in writing the last equality, we have used the last equation in (44.19). As in Eq. (44.7), we note that $p^{\mu}p_{\mu}\equiv p^{2} = -2p^{+}p^{-} + p^{i}p^{i}$, which from Eq. (44.20) gives the simple key equation

$$-p^{2} = \frac{1}{\ell^{2}}\sum_{n\neq 0}\alpha^{i}(-n)\,\alpha^{i}(n), \tag{44.21}$$

which will now be investigated, and a sum over $i = 1,\ldots,D-2$ is understood in the above equation. The quantum aspect of string theory emerges upon introducing the equal-$\tau$ commutation relations of the fields $X^{i}(\tau,\sigma)$ in (44.16) and their canonical momenta $P^{i}(\tau,\sigma)$ in (44.19):

$$[\,X^{i}(\tau,\sigma),\,P^{j}(\tau,\sigma')\,] = i\,\delta^{ij}\,\delta(\sigma-\sigma'), \tag{44.22}$$

$$[\,X^{i}(\tau,\sigma),\,X^{j}(\tau,\sigma')\,] = 0,\qquad [\,P^{i}(\tau,\sigma),\,P^{j}(\tau,\sigma')\,] = 0, \tag{44.23}$$

$$\Rightarrow\quad [\,\alpha^{i}(n),\,\alpha^{j}(m)\,] = n\,\delta(m,-n)\,\delta^{ij},\qquad \alpha^{i}(-n) = (\alpha^{i}(n))^{\dagger}, \tag{44.24}$$

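where the last equality in (44.24) follows from the Hermiticity of the quantum fields $X^{i}(\tau,\sigma)$. We now have a mass shell condition $p^{2} = -M^{2}$ given from (44.21) by

$$M^{2} = \frac{1}{\ell^{2}}\sum_{n\neq 0}\alpha^{i}(-n)\,\alpha^{i}(n) = \frac{1}{\ell^{2}}\sum_{n=1}^{\infty}\left[\alpha^{i}(-n)\,\alpha^{i}(n) + \alpha^{i}(n)\,\alpha^{i}(-n)\right] = \frac{1}{\alpha'}\left[\sum_{n=1}^{\infty}\alpha^{i}(n)^{\dagger}\,\alpha^{i}(n) + \frac{D-2}{2}\sum_{n=1}^{\infty}n\right],\qquad \frac{2}{\ell^{2}} = \frac{1}{\alpha'}, \tag{44.25}$$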

where we have used the commutation relations and the adjoint character of $\alpha^{i}(-n)$ in (44.24), the fact that $i = 1,\ldots,D-2$, as well as the notation in (44.5) for the mass scale $1/\sqrt{\alpha'}$. In making a transition from a classical description to a quantum one, the constant $\sum_{n=1}^{\infty}n$ in (44.25) is


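Box 44.4 Solving for $X^{-}$

From the first equation in (44.10) and (44.16), we have

$$2\,\ell^{2}p^{+}\,\partial_{\tau}X^{-} = (\partial_{\sigma}X^{i})^{2} + (\partial_{\tau}X^{i})^{2} = \ell^{2}\sum_{n,m}e^{-i(n+m)\tau}\,\alpha^{i}(n)\,\alpha^{i}(m)\cos(n+m)\sigma$$

$$= \ell^{2}\sum_{m=-\infty}^{+\infty}\alpha^{i}(-m)\,\alpha^{i}(m) + \ell^{2}\sum_{N\neq 0}e^{-iN\tau}\sum_{m=-\infty}^{+\infty}\alpha^{i}(N-m)\,\alpha^{i}(m)\cos N\sigma.$$

Or

$$\partial_{\tau}X^{-} = \frac{1}{2p^{+}}\sum_{m=-\infty}^{+\infty}\alpha^{i}(-m)\,\alpha^{i}(m) + \frac{1}{2p^{+}}\sum_{N\neq 0}e^{-iN\tau}\sum_{m=-\infty}^{+\infty}\alpha^{i}(N-m)\,\alpha^{i}(m)\cos N\sigma.$$

Integrating the latter gives:

$$X^{-}(\tau,\sigma) = (x^{-} + \ell^{2}p^{-}\tau) + \frac{i}{2p^{+}}\sum_{N\neq 0}\frac{e^{-iN\tau}}{N}\sum_{m=-\infty}^{+\infty}\alpha^{i}(N-m)\,\alpha^{i}(m)\cos N\sigma,$$

where

$$p^{-} = \frac{1}{2\,\ell^{2}p^{+}}\sum_{m=-\infty}^{+\infty}\alpha^{i}(-m)\,\alpha^{i}(m)\quad\Rightarrow\quad 2\,\ell^{2}p^{+}p^{-} = \sum_{m=-\infty}^{+\infty}\alpha^{i}(-m)\,\alpha^{i}(m).$$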

interpreted as the unique analytical continuation of the Riemann zeta function⁴ defined by

$$\zeta(s) = \sum_{m=1}^{\infty}m^{-s},\qquad\text{for } \mathrm{Re}\,s > 1, \tag{44.26}$$

to $s = -1$, having the value $\zeta(-1) = -1/12$. Accordingly, the mass shell condition takes the form

$$M^{2} = \frac{1}{\alpha'}\left[\sum_{m=1}^{\infty}\alpha^{i}(m)^{\dagger}\,\alpha^{i}(m) - \frac{D-2}{24}\right], \tag{44.27}$$

with $(\alpha')^{-1/2}$ clearly providing a fundamental mass scale as mentioned earlier. We may define ground-states, annihilated by $\alpha^{i}(m)$, and labeled by the momenta, say, $|\,0;\mathbf{p}\,\rangle$:

$$\alpha^{i}(m)\,|\,0;\mathbf{p}\,\rangle = 0,\qquad \mathbf{p} = (p^{+}, p^{1},\ldots,p^{D-2}),\qquad i = 1, 2,\ldots,D-2,\qquad m = 1, 2,\ldots. \tag{44.28}$$
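The value $\zeta(-1) = -1/12$ used in (44.27) can be made plausible numerically. The sketch below does not perform the analytic continuation itself; it swaps in an exponential cutoff, an assumption made here purely for illustration, for which $\sum_{n\ge 1} n\,e^{-\epsilon n} = 1/\epsilon^{2} - 1/12 + O(\epsilon^{2})$, so that the finite part as $\epsilon \to 0$ reproduces $-1/12$. The function names are hypothetical.

```python
import math
from fractions import Fraction

def regulated_sum(eps):
    """Closed form of sum_{n>=1} n*exp(-eps*n), valid for eps > 0."""
    one_minus_q = -math.expm1(-eps)  # 1 - exp(-eps), accurate for small eps
    return math.exp(-eps) / one_minus_q**2

def finite_part(eps):
    """Remove the divergent 1/eps^2 piece; tends to -1/12 as eps -> 0."""
    return regulated_sum(eps) - 1.0 / eps**2

def ground_state_mass_sq(D):
    """alpha' * M^2 of the ground state, i.e. ((D-2)/2) * zeta(-1), cf. (44.27)."""
    return Fraction(D - 2, 2) * Fraction(-1, 12)

for eps in (0.5, 0.1, 0.01):
    print(eps, finite_part(eps))
print(ground_state_mass_sq(26))
```

The printed finite parts approach $-0.08333\ldots$ as $\epsilon$ decreases, and for $D = 26$ the ground state has $\alpha' M^{2} = -1$.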

The particle of lowest mass corresponds to

$$M^{2}\,|\,0;\mathbf{p}\,\rangle = -\frac{1}{\alpha'}\left[\frac{D-2}{24}\right]|\,0;\mathbf{p}\,\rangle, \tag{44.29}$$

which for $D > 2$ corresponds to a particle of negative mass squared, a tachyon, which is not a very welcome particle.⁵ The first excited state corresponds to

$$M^{2}\,\alpha^{i}(1)^{\dagger}\,|\,0;\mathbf{p}\,\rangle = \frac{1}{\alpha'}\left[1 - \frac{D-2}{24}\right]\alpha^{i}(1)^{\dagger}\,|\,0;\mathbf{p}\,\rangle. \tag{44.30}$$

Now a vector particle in 4 dimensions which has only two degrees of freedom is a massless particle. On the other hand, we will show in Box 44.5 that in $D$ dimensions, a vector particle with $D-2$ degrees of

⁴ For details and interesting applications of the Riemann zeta function, see, e.g., Elizalde et al. [1], Elizalde [2]. The analytical continuation to $s = -1$ is all that is needed here. We will not, however, go into the mathematical intricacies of this analytical continuation procedure here.

⁵ We will see that in supersymmetric strings no such tachyon arises when one invokes supersymmetry in the theory.


Box 44.5 Vector field in D dimensions of D − 2 degrees of freedom is massless

Consider the Lagrangian density $\mathscr{L} = -(1/4)\,F^{\mu\nu}F_{\mu\nu}$, $F^{\mu\nu} = \partial^{\mu}A^{\nu} - \partial^{\nu}A^{\mu}$. The Lagrangian density is invariant under the gauge transformation $A^{\mu}\to A^{\mu} + \partial^{\mu}\Lambda$. We work in the Coulomb gauge: $\partial_{i}A^{i} = 0$, where a sum over $i = 1, 2,\ldots,D-1$ is understood. We add an external source contribution to the Lagrangian density:

$$\mathscr{L}' = -\frac{1}{4}\,F^{\mu\nu}F_{\mu\nu} + J^{\mu}A_{\mu}.$$

Due to the Coulomb gauge constraint, not all the components of $A^{\mu}$ may be varied independently. We may, however, express $A^{\mu}$ as

$$A^{\mu} = -A^{0}\,\eta^{\mu 0} + \eta^{\mu i}\left(\delta^{ij} - \partial^{i}\partial^{j}/\nabla^{2}\right)A^{j}$$

and vary $A^{0}$, $A^{j}$ to obtain from $\mathscr{L}'$:

$$-\Box A^{i} = \left(\delta^{ij} - \partial^{i}\partial^{j}/\nabla^{2}\right)J^{j},\qquad -\nabla^{2}A^{0} = J^{0}$$

in $D$ dimensions. That is, only $A^{i}$ propagates, as no time derivative arises in the equation of $A^{0}$. On the other hand, $A^{i}$ has $D-1$ components, and the Coulomb gauge, as a single constraint imposed on the $A^{i}$, makes only $D-2$ of the components of the latter independent. Thus we learn that a vector field in $D$ dimensions with $D-2$ degrees of freedom is a massless vector field. As a matter of fact, the number of degrees of freedom of a massive vector field in $D$ dimensions is $D-1$, as we will see in the next box (Box 44.6).

freedom is a massless particle, the photon. Accordingly, by setting the right-hand side of (44.30) equal to zero, we obtain that the critical spacetime dimension of bosonic strings is given by $D = 26$.⁶ On the other hand, a vector field in $D$ dimensions of $D-1$ degrees of freedom is massive, as shown in Box 44.6. The expression for the mass squared $M^{2}$ in (44.27) may now be rewritten as

$$M^{2} = \frac{1}{\alpha'}\left[\sum_{m=1}^{\infty}\alpha^{i}(m)^{\dagger}\,\alpha^{i}(m) - 1\right]. \tag{44.31}$$
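The step from (44.30) to $D = 26$, and the resulting spectrum (44.31), amount to simple arithmetic that can be spelled out. The following sketch uses exact fractions; the function names and the integer level label `N` (the eigenvalue of $\sum_{m}\alpha^{i}(m)^{\dagger}\alpha^{i}(m)$) are notational assumptions introduced here.

```python
from fractions import Fraction

def first_excited_mass_sq(D):
    """alpha' * M^2 of the state alpha^i(1)^dagger |0;p>, from Eq. (44.30)."""
    return 1 - Fraction(D - 2, 24)

def critical_dimension():
    """Smallest D > 2 for which the first excited (vector) state is massless."""
    return next(D for D in range(3, 100) if first_excited_mass_sq(D) == 0)

def open_string_mass_sq(N, D=26):
    """alpha' * M^2 at level N, from Eq. (44.27); reduces to N - 1 when D = 26."""
    return N - Fraction(D - 2, 24)

print(critical_dimension())
print([open_string_mass_sq(N) for N in range(4)])
```

For $D = 26$ the levels $N = 0, 1, 2, \ldots$ give $\alpha' M^{2} = -1, 0, 1, \ldots$: the tachyon, the massless vector, and then massive states spaced by $1/\alpha'$.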

Higher excitations are similarly considered, as well as closed bosonic strings. We recall from (44.5) that $1/\sqrt{\alpha'} = \sqrt{2}/\ell$, and if the intrinsic length $\ell$ is taken of the order of the Planck length ($\sim 10^{-33}$ cm), this sets up, in particular, a very large value for the mass scale $1/\sqrt{\alpha'}$, of the order of the Planck mass $\sim 10^{19}$ GeV/$c^{2}$. Accordingly, it is the massless excitations that are really relevant here. What is interesting is that string theory gives rise to an elegant description of imparting a mass to a massless excitation by the consideration of some additional structures in the theory referred to as D$p$ branes. This is introduced in the next chapter. For the description of all the massless fields in bosonic strings see Manoukian [3, 4]. See also Manoukian [4] for a detailed study of open and closed strings satisfying the various boundary conditions discussed earlier.

We now consider closed bosonic strings, where the B.C. are given in (44.13): $X^{\mu}(\tau,\sigma) = X^{\mu}(\tau,\sigma+\pi)$. Upon introducing right-mover and left-mover fields⁷:

$$X^{i}_{R}(\tau-\sigma) = x^{i}_{R} + \ell\,\alpha^{i}(0)(\tau-\sigma) + \frac{i\,\ell}{2}\sum_{n\neq 0}\frac{\alpha^{i}(n)}{n}\,e^{-2in(\tau-\sigma)}, \tag{44.32}$$

$$X^{i}_{L}(\tau+\sigma) = x^{i}_{L} + \ell\,\bar{\alpha}^{i}(0)(\tau+\sigma) + \frac{i\,\ell}{2}\sum_{n\neq 0}\frac{\bar{\alpha}^{i}(n)}{n}\,e^{-2in(\tau+\sigma)}, \tag{44.33}$$

where $X^{i}(\tau,\sigma) = X^{i}_{R}(\tau-\sigma) + X^{i}_{L}(\tau+\sigma)$, we have

⁶ The critical dimension $D$ of the string theory may also be obtained from Lorentz invariance arguments.

⁷ Note that as $\tau\to\tau+\delta\tau$, $\delta\tau > 0$, one has $\sigma\to\sigma+\delta\sigma$ with $\delta\sigma = \delta\tau > 0$ for invariance of the expression for $X^{i}_{R}(\tau-\sigma)$, justifying the nomenclature for it as a right-mover, and a similar analysis is given for $X^{i}_{L}(\tau+\sigma)$.


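Box 44.6 Vector field in D dimensions of D − 1 degrees of freedom is massive

By adding a mass term to Maxwell's equation, we have: $-\partial_{\mu}F^{\mu\nu} + m^{2}V^{\nu} = 0$, where $F^{\mu\nu} = \partial^{\mu}V^{\nu} - \partial^{\nu}V^{\mu}$ is anti-symmetric in $\mu$, $\nu$ $\Rightarrow$ $\partial_{\nu}\partial_{\mu}F^{\mu\nu} = 0$, $m^{2}\,\partial_{\nu}V^{\nu} = 0$. That is, for $m\neq 0$, we have the single constraint $\partial_{\nu}V^{\nu} = 0$, $\nu = 0, 1,\ldots,D-1$. But $V^{\nu}$ has $D$ components, satisfying one constraint. Hence, the number of independent components (degrees of freedom) of $V^{\nu}$ is $D-1$.

$$X^{i}(\tau,\sigma) = x^{i} + \ell\left[\alpha^{i}(0) + \bar{\alpha}^{i}(0)\right]\tau + \ell\left[\bar{\alpha}^{i}(0) - \alpha^{i}(0)\right]\sigma + \frac{i\,\ell}{2}\sum_{n\neq 0}\left[\frac{\alpha^{i}(n)}{n}\,e^{-2in(\tau-\sigma)} + \frac{\bar{\alpha}^{i}(n)}{n}\,e^{-2in(\tau+\sigma)}\right], \tag{44.34}$$

where $x^{i} = x^{i}_{R} + x^{i}_{L}$. The periodicity condition in (44.13) is satisfied only if

$$\alpha^{i}(0) = \bar{\alpha}^{i}(0), \tag{44.35}$$

in order to make the third term in (44.34) vanish. The canonical momentum conjugate to $X^{i}$ from (44.17) then becomes

$$P^{i}(\tau,\sigma) = \frac{1}{\pi\ell}\sum_{n=-\infty}^{\infty}\left[\alpha^{i}(n)\,e^{-2in(\tau-\sigma)} + \bar{\alpha}^{i}(n)\,e^{-2in(\tau+\sigma)}\right],\qquad p^{i} = \int_{0}^{\pi}d\sigma\,P^{i}, \tag{44.36}$$

$$\ell\,p^{i} = \alpha^{i}(0) + \bar{\alpha}^{i}(0)\quad\Rightarrow\quad \frac{\ell}{2}\,p^{i} = \alpha^{i}(0) = \bar{\alpha}^{i}(0). \tag{44.37}$$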

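The expression for $X^{i}$ in (44.34) then may be rewritten as

$$X^{i}(\tau,\sigma) = x^{i} + \ell^{2}p^{i}\,\tau + \frac{i\,\ell}{2}\sum_{n\neq 0}\left[\frac{\alpha^{i}(n)}{n}\,e^{-2in(\tau-\sigma)} + \frac{\bar{\alpha}^{i}(n)}{n}\,e^{-2in(\tau+\sigma)}\right], \tag{44.38}$$

which obviously satisfy the field equations

$$\Box X^{i}(\tau,\sigma) = 0,\qquad i = 1, 2,\ldots,D-2. \tag{44.39}$$

Also

$$\partial_{\tau}X^{i} = +\,\ell\sum_{n}\left[\alpha^{i}(n)\,e^{-2in(\tau-\sigma)} + \bar{\alpha}^{i}(n)\,e^{-2in(\tau+\sigma)}\right], \tag{44.40}$$

$$\partial_{\sigma}X^{i} = -\,\ell\sum_{n}\left[\alpha^{i}(n)\,e^{-2in(\tau-\sigma)} - \bar{\alpha}^{i}(n)\,e^{-2in(\tau+\sigma)}\right], \tag{44.41}$$

$$\int_{0}^{\pi}d\sigma\;\partial_{\tau}X^{i}\,\partial_{\sigma}X^{i} = 0, \tag{44.42}$$

where the latter equation follows from the constraint in (44.9) and the B.C. in (44.13). The constraint in Eq. (44.10), together with Eqs. (44.40), (44.41) and an analysis as carried out in Box 44.4, gives

$$X^{-} = x^{-} + \ell^{2}p^{-}\tau + f^{-}(\tau-\sigma) + f^{+}(\tau+\sigma),\qquad f^{\mp}(\tau\mp\sigma) \text{ some functions of } \tau\mp\sigma, \tag{44.43}$$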


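$$p^{-} = \frac{1}{2\,\ell^{2}p^{+}}\left[\left(\alpha^{i}(0) + \bar{\alpha}^{i}(0)\right)^{2} + 2\sum_{n\neq 0}\left(\alpha^{i}(-n)\,\alpha^{i}(n) + \bar{\alpha}^{i}(-n)\,\bar{\alpha}^{i}(n)\right)\right], \tag{44.44}$$

$$\sum_{n\neq 0}\left[\alpha^{i}(-n)\,\alpha^{i}(n) - \bar{\alpha}^{i}(-n)\,\bar{\alpha}^{i}(n)\right] = 0, \tag{44.45}$$

where the last equation follows from (44.42). Equations (44.37) and (44.44), together with the basic property in (44.7) now applied to $p^{\mu}p_{\mu} = p^{2}$, give the key equation

$$-p^{2} = 2p^{+}p^{-} - p^{i}p^{i} = \frac{2}{\ell^{2}}\sum_{n\neq 0}\left[\alpha^{i}(-n)\,\alpha^{i}(n) + \bar{\alpha}^{i}(-n)\,\bar{\alpha}^{i}(n)\right]. \tag{44.46}$$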

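To make a transition to a quantum description, the fields $X^{i}$ in (44.38) and their canonical conjugate momenta in (44.36) are to satisfy the commutation relations

$$[\,X^{i}(\tau,\sigma),\,P^{j}(\tau,\sigma')\,] = i\,\delta^{ij}\,\delta(\sigma-\sigma'), \tag{44.47}$$

$$[\,X^{i}(\tau,\sigma),\,X^{j}(\tau,\sigma')\,] = 0,\qquad [\,P^{i}(\tau,\sigma),\,P^{j}(\tau,\sigma')\,] = 0. \tag{44.48}$$

The Hermiticity of the quantum fields $X^{i}(\tau,\sigma)$ also requires that

$$\alpha^{i}(-n) = (\alpha^{i}(n))^{\dagger},\qquad \bar{\alpha}^{i}(-n) = (\bar{\alpha}^{i}(n))^{\dagger}. \tag{44.49}$$

The ensuing commutation rules of $\alpha^{i}(n)$, $\bar{\alpha}^{i}(n)$ are then given by

$$[\,\alpha^{i}(n),\,\alpha^{j}(m)\,] = n\,\delta(m,-n)\,\delta^{ij},\qquad [\,\bar{\alpha}^{i}(n),\,\bar{\alpha}^{j}(m)\,] = n\,\delta(m,-n)\,\delta^{ij}, \tag{44.50}$$

$$[\,\alpha^{i}(m),\,\bar{\alpha}^{j}(n)\,] = 0. \tag{44.51}$$

The mass shell condition (44.46), together with (44.45), then takes the form

$$M^{2} = \frac{2}{\ell^{2}}\sum_{m\neq 0}\left[\alpha^{i}(-m)\,\alpha^{i}(m) + \bar{\alpha}^{i}(-m)\,\bar{\alpha}^{i}(m)\right], \tag{44.52}$$

$$\sum_{m\neq 0}\left[\alpha^{i}(-m)\,\alpha^{i}(m) - \bar{\alpha}^{i}(-m)\,\bar{\alpha}^{i}(m)\right] = 0. \tag{44.53}$$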

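More conveniently, we may write

$$M^{2} = \frac{2}{\alpha'}\left[\sum_{m=1}^{\infty}\left(\alpha^{i}(m)^{\dagger}\,\alpha^{i}(m) + \bar{\alpha}^{i}(m)^{\dagger}\,\bar{\alpha}^{i}(m)\right) - \frac{D-2}{12}\right], \tag{44.54}$$

$$\sum_{m=1}^{\infty}\alpha^{i}(m)^{\dagger}\,\alpha^{i}(m) = \sum_{m=1}^{\infty}\bar{\alpha}^{i}(m)^{\dagger}\,\bar{\alpha}^{i}(m). \tag{44.55}$$

The latter equation signifies that there is an equal amount of excitations of the right and left movers when applied to particle states in the mass spectrum. Again a tachyon with negative mass squared appears. The following first excited states are of great importance as they involve the evasive graviton:

$$\alpha^{i}(1)^{\dagger}\,\bar{\alpha}^{j}(1)^{\dagger}\,|\,0;\mathbf{p}\,\rangle,\qquad \sum_{m=1}^{\infty}\alpha^{i}(m)^{\dagger}\,\alpha^{i}(m) = \sum_{m=1}^{\infty}\bar{\alpha}^{i}(m)^{\dagger}\,\bar{\alpha}^{i}(m)\ \to\ 1, \tag{44.56}$$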


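where $|\,0;\mathbf{p}\,\rangle$ denotes the ground state. Of particular interest is the term

$$\frac{1}{2}\left[\delta^{ik}\delta^{js} + \delta^{is}\delta^{jk} - \frac{2}{D-2}\,\delta^{ij}\delta^{ks}\right]\alpha^{k}(1)^{\dagger}\,\bar{\alpha}^{s}(1)^{\dagger}\qquad\text{contributing to}\quad \alpha^{i}(1)^{\dagger}\,\bar{\alpha}^{j}(1)^{\dagger}, \tag{44.57}$$

$$M^{2}\,\frac{1}{2}\left[\delta^{ik}\delta^{js} + \delta^{is}\delta^{jk} - \frac{2}{D-2}\,\delta^{ij}\delta^{ks}\right]\alpha^{k}(1)^{\dagger}\,\bar{\alpha}^{s}(1)^{\dagger}\,|\,0;\mathbf{p}\,\rangle = \frac{4}{\alpha'}\left[1 - \frac{D-2}{24}\right]\frac{1}{2}\left[\delta^{ik}\delta^{js} + \delta^{is}\delta^{jk} - \frac{2}{D-2}\,\delta^{ij}\delta^{ks}\right]\alpha^{k}(1)^{\dagger}\,\bar{\alpha}^{s}(1)^{\dagger}\,|\,0;\mathbf{p}\,\rangle, \tag{44.58}$$

where $M^{2}$ is given in (44.54), and where a sum over $k, s = 1,\ldots,D-2$ is understood. This defines a field described by a second rank symmetric tensor whose trace is equal to zero, with $i, j = 1,\ldots,D-2$. It hence has $(D-1)(D-2)/2 - 1 = D(D-3)/2$ degrees of freedom, where the minus one term corresponds to the constraint that the trace of the tensor field is zero, and from the analysis in Box 44.7 it is a massless field in $D$ dimensions. This gives again $D = 26$ by setting $(1 - (D-2)/24)$ equal to zero for a massless field.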

Box 44.7 A symmetric second rank tensor of D(D − 3)/2 degrees of freedom in D dimensions is massless

Consider the Lagrangian density of a symmetric tensor massless field $h^{\mu\nu}$ given by

$$\mathscr{L} = -\frac{1}{2}\,\partial^{\sigma}h^{\mu\nu}\,\partial_{\sigma}h_{\mu\nu} + \partial_{\mu}h^{\mu\nu}\,\partial^{\sigma}h_{\sigma\nu} - \partial_{\sigma}h^{\sigma\mu}\,\partial_{\mu}h + \frac{1}{2}\,\partial^{\mu}h\,\partial_{\mu}h + T^{\mu\nu}h_{\mu\nu},\qquad h = h^{\mu}{}_{\mu},$$

where $T^{\mu\nu}$ is a classical source. For a conserved source, $\partial_{\mu}T^{\mu\nu} = 0$, the corresponding action is invariant under the gauge transformation $h^{\mu\nu}\to h^{\mu\nu} + \partial^{\mu}\Lambda^{\nu} + \partial^{\nu}\Lambda^{\mu}$, up to a total derivative. In the Coulomb-like gauge $\partial_{i}h^{i\nu} = 0$, $\nu = 0, 1,\ldots,D-1$, $i = 1, 2,\ldots,D-1$, it is not difficult to verify that the components $h^{ii}$ (sum over $i$), $h^{00}$, $h^{0i}$ satisfy field equations in which no time derivatives act on them. That is, they do not propagate. On the other hand, the symmetric component $h^{ij}$ has $(D-1)D/2$ components and its field equation involves time derivatives acting on it. But its trace $h^{ii}$, together with its constraints $\partial_{i}h^{ij} = 0$, $j = 1, 2,\ldots,D-1$, make up $1 + D - 1 = D$ constraints. That is, the number of degrees of freedom in question is $(D-1)D/2 - D = D(D-3)/2$.
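The counting arguments of Boxes 44.5, 44.6 and 44.7 can be tabulated and cross-checked against the transverse counting used for the string states. This is a small illustrative sketch; the function names are assumptions introduced here.

```python
def massless_vector_dof(D):
    """Box 44.5: D-1 spatial components minus the Coulomb gauge constraint."""
    return (D - 1) - 1

def massive_vector_dof(D):
    """Box 44.6: D components minus the single constraint d_nu V^nu = 0."""
    return D - 1

def massless_tensor_dof(D):
    """Box 44.7: symmetric h^{ij}, i, j = 1..D-1, minus trace and gauge constraints."""
    components = (D - 1) * D // 2
    constraints = 1 + (D - 1)
    return components - constraints

def transverse_traceless_dof(D):
    """Symmetric traceless tensor with indices i, j = 1..D-2, as in (44.57)."""
    n = D - 2
    return n * (n + 1) // 2 - 1

for D in (4, 10, 26):
    print(D, massless_vector_dof(D), massive_vector_dof(D),
          massless_tensor_dof(D), transverse_traceless_dof(D))
```

In $D = 4$ this reproduces the familiar two polarizations of the photon and of the graviton, and for every $D$ the two tensor counts agree with $D(D-3)/2$.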

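In reference to Eqs. (44.32) and (44.33) of the right and left movers, we can rewrite (44.54), (44.55) with $D = 26$ as

$$M^{2} = M_{R}^{2} + M_{L}^{2},\qquad M_{R}^{2} = M_{L}^{2}, \tag{44.59}$$

$$\frac{\alpha'}{2}\,M_{R}^{2} = \sum_{n=1}^{\infty}\alpha^{i}(n)^{\dagger}\,\alpha^{i}(n) - 1,\qquad \frac{\alpha'}{2}\,M_{L}^{2} = \sum_{n=1}^{\infty}\bar{\alpha}^{i}(n)^{\dagger}\,\bar{\alpha}^{i}(n) - 1. \tag{44.60}$$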
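The closed-string mass formula (44.54) with the level-matching condition (44.55) can be evaluated in the same spirit. In the sketch below, `N_R` and `N_L` denote the eigenvalues of the right- and left-mover sums in (44.55); these labels and the function name are assumptions made here for illustration.

```python
from fractions import Fraction

def closed_string_mass_sq(N_R, N_L, D=26):
    """alpha' * M^2 from Eq. (44.54); level matching (44.55) is enforced."""
    if N_R != N_L:
        raise ValueError("level matching requires N_R == N_L")
    return 2 * (N_R + N_L - Fraction(D - 2, 12))

print([closed_string_mass_sq(N, N) for N in range(3)])
```

For $D = 26$ this gives $\alpha' M^{2} = -4, 0, 4$ at the first three levels, so the states in (44.56), which include the graviton, are massless, in agreement with (44.58).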

The operator equality $M_{R}^{2} = M_{L}^{2}$ means that for a particle state $|\varphi\rangle$: $[\,M_{R}^{2} - M_{L}^{2}\,]\,|\varphi\rangle = 0$. The other excitations are similarly analyzed and classified. For the investigation of all the massless field excitations of bosonic strings see Manoukian [3, 4]. In the next chapter we consider the concept of compactification and encounter an additional structure that arises in string theory, referred to as D branes. We also learn how a massless field may acquire mass through a "Higgs-like" mechanism in string theory involving D branes.

References

1. Elizalde, E. et al. (1994). Zeta regularization techniques with applications. Singapore: World Scientific.
2. Elizalde, E. (1995). Ten physical applications of spectral zeta functions. Berlin: Springer.
3. Manoukian, E. B. (2012). All the fundamental massless bosonic fields in bosonic string theory. Fortschritte der Physik, 60, 329–336.
4. Manoukian, E. B. (2016). Quantum field theory II: Introductions to quantum gravity, supersymmetry and string theory. Switzerland: Springer.

45 Compactification, D Branes and Mass Generation

Prerequisite Chaps. 16, 44

We have seen in the previous chapter that bosonic strings require that the underlying dimension of spacetime is 26. So the problem arises as to how one may compactify some of the spatial components of a string, as mentioned in Chap. 43, in reconciliation with the three dimensional character of space we experience in our world. This chapter deals with the compactification of one of the extra spatial components, say $X^{25}$, of a bosonic string into a circle of, say, radius $R$ as an illustration. In this chapter, we also consider some more general structures that may arise in the theory, in addition to the simple string of one dimensional extension, referred to as D branes as well as D$p$ branes, whose significance will arise as they are encountered. We will also see how one may generate mass for a massless vector particle and, in turn, look for the extra degree of freedom which is necessarily required to describe the resulting massive vector particle.

Consider the open string satisfying the Neumann B.C. given in Eq. (44.14) of Chap. 44, with the expression for $X^{i}$ explicitly given in Eq. (44.16) of that chapter, with $i = 25$ in particular, given by

$$\partial_{\sigma}X^{25}(\tau,\sigma)\Big|_{\sigma=0,\pi} = 0,\qquad X^{25}(\tau,\sigma) = x^{25} + \ell^{2}p^{25}\,\tau + i\,\ell\sum_{n\neq 0}\frac{\alpha^{25}(n)}{n}\,e^{-in\tau}\cos n\sigma. \tag{45.1}$$

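One may rewrite $X^{25}(\tau,\sigma)$ as a sum of two terms $X^{25}_{R}(\tau-\sigma)$ and $X^{25}_{L}(\tau+\sigma)$, referred to, respectively, as right-mover and left-mover:

$$X^{25}(\tau,\sigma) = X^{25}_{R}(\tau-\sigma) + X^{25}_{L}(\tau+\sigma), \tag{45.2}$$

$$X^{25}_{R}(\tau-\sigma) = x_{R} + \frac{1}{2}\,\ell^{2}p^{25}(\tau-\sigma) + \frac{i\,\ell}{2}\sum_{n\neq 0}\frac{\alpha^{25}(n)}{n}\,e^{-in(\tau-\sigma)}, \tag{45.3}$$

$$X^{25}_{L}(\tau+\sigma) = x_{L} + \frac{1}{2}\,\ell^{2}p^{25}(\tau+\sigma) + \frac{i\,\ell}{2}\sum_{n\neq 0}\frac{\alpha^{25}(n)}{n}\,e^{-in(\tau+\sigma)}, \tag{45.4}$$

using the fact that $\cos n\sigma = [e^{in\sigma} + e^{-in\sigma}]/2$. One may, in turn, define the dual string by taking, instead, the combination

$$\tilde{X}(\tau,\sigma) = X^{25}_{L}(\tau+\sigma) - X^{25}_{R}(\tau-\sigma) = \tilde{x}^{25} + \ell^{2}p^{25}\,\sigma + \ell\sum_{n\neq 0}\frac{\alpha^{25}(n)}{n}\,e^{-in\tau}\sin n\sigma, \tag{45.5}$$

using, in the process, the fact that $\sin n\sigma = i\,[e^{-in\sigma} - e^{in\sigma}]/2$, with $\tilde{X}(\tau,\sigma)$ referred to as the T-dual of $X^{25}(\tau,\sigma)$, which now satisfies instead a Dirichlet condition:

$$\partial_{\tau}\tilde{X}(\tau,\sigma)\Big|_{\sigma=0,\pi} = 0,\qquad\text{independently of } \tau, \tag{45.6}$$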

unlike $X^{25}$, which satisfies a Neumann condition $\partial_{\sigma}X^{25}(\tau,\sigma)|_{\sigma=0,\pi} = 0$ (see Eqs. (44.14), (44.15)). That is, the ends of the dual string carry no momentum in the 25th direction. Also, since no linear term in $\tau$ occurs in (45.5), there is no momentum in the 25th direction, but only an oscillatory behavior occurs due to the presence of the expression $\exp[-in\tau]$ in it. The ends of the string are restricted to move in a space of 24 spatial dimensions, referred to as a D brane, described by the hyperplane $\tilde{X} = \tilde{x}$:

[Figure: the D brane]

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_45

A Fourier decomposition of a wavefunction in quantum mechanics involves the Fourier factor $\exp[\,i p x\,]$, and single-valuedness of a wavefunction would impose the constraint that if $x$ is increased by $2\pi R$, then $p = k/R$, where $k$ is an integer. Accordingly, for the B.C. of the variable $\tilde X(\tau,\sigma)$ in a circular direction, we may write

$$\tilde X(\tau, 0) = \tilde x, \qquad \tilde X(\tau,\pi) = \tilde x + \ell^2\,\frac{k\pi}{R} \equiv \tilde x + 2\pi k \tilde R, \qquad \tilde R = \frac{\ell^2}{2R}. \tag{45.7}$$
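The relations in (45.7) are easy to tabulate. The sketch below is only an illustration under stated assumptions: the function names are hypothetical and the numerical values of $\ell$, $R$ and $k$ are arbitrary.

```python
import cmath
import math


def dual_radius(radius, ell):
    """Dual radius R~ = ell^2 / (2 R) from (45.7)."""
    return ell**2 / (2.0 * radius)


def allowed_momentum(k, radius):
    """Momentum p = k / R required by single-valuedness of exp(i p x)."""
    return k / radius


def end_separation(k, radius, ell):
    """X~(tau, pi) - X~(tau, 0) = ell^2 p pi, with p = k / R."""
    return ell**2 * allowed_momentum(k, radius) * math.pi


def fourier_factor(k, radius, x):
    """exp(i p x) evaluated at the allowed momentum p = k / R."""
    return cmath.exp(1j * allowed_momentum(k, radius) * x)
```

For example, with `ell = 1.0` and `radius = 1.0e6` the dual radius is `5.0e-7`, so a very large $R$ corresponds to a very small $\tilde R$, and `end_separation(k, radius, ell)` equals $2\pi k \tilde R$, i.e., $k$ windings of the dual circle.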

Hence for $\ell^2$ of the order of the Planck length squared, and $R$ arbitrarily large, one is able to compactify the coordinate in question on a circle of arbitrarily small radius $\tilde R$, where we note that the string wraps the dual circle $k$ times.

We may generalize the concept of the D brane in the following manner: Consider a bosonic open string, with field variables $X^\mu$, $\mu = 0, 1, \ldots, D-1$, with $X^0, X^1, \ldots, X^p$ satisfying Neumann boundary conditions, i.e.,

$$\frac{\partial X^\mu(\tau,\sigma)}{\partial\sigma}\Big|_{\sigma=0,\pi} = 0, \qquad \mu = 0, 1, \ldots, p, \tag{45.8}$$

and $X^{p+1}, \ldots, X^{D-1}$ satisfying Dirichlet conditions

$$X^I(\tau, 0) = X^I(\tau,\pi) = \bar x^I, \qquad I = p+1, \ldots, D-1. \tag{45.9}$$

Here $\bar x^I$, $I = p+1, \ldots, D-1$, specify a location of a D$p$-brane, where $p$ is the number of spatial dimensions. In particular, we note that $X^{p+1}, \ldots, X^{D-1}$ are coordinates normal to the brane. We set $X^\pm = (X^0 \pm X^1)/\sqrt{2}$. From Eq. (44.16) in Chap. 44, corresponding to a Neumann B.C.,

$$X^i(\tau,\sigma) = x^i + \ell^2 p^i\,\tau + i\,\ell\sum_{n\neq 0}\frac{\alpha^i(n)}{n}\,e^{-in\tau}\cos n\sigma, \qquad i = 2, \ldots, p, \tag{45.10}$$

and for the field variables satisfying Dirichlet conditions, satisfying (44.9),

$$X^I(\tau,\sigma) = \bar x^I + \ell\sum_{n\neq 0}\frac{\alpha^I(n)}{n}\,e^{-in\tau}\sin n\sigma, \qquad I = p+1, \ldots, D-1, \tag{45.11}$$

with no momentum: $\partial X^I(\tau,0)/\partial\tau = 0$, $\partial X^I(\tau,\pi)/\partial\tau = 0$ of the end points perpendicular to the brane. From Eq. (44.31) in Chap. 44, we may infer that the mass squared pertinent to the brane, $2p^+p^- - p^i p^i$ with a sum over $i$ from 2 to $p$, is given by

$$\alpha' M^2 = \sum_{n=1}^{\infty}\Big[\alpha^i(n)^\dagger\alpha^i(n) + \alpha^I(n)^\dagger\alpha^I(n)\Big] - 1, \tag{45.12}$$

with a sum over $I = (p+1), \ldots, (D-1)$, as well, understood. The ground state may be labeled as $|0; p^+, p^i\rangle$, with no momentum components perpendicular to the brane. In particular, in terms of a Fourier transform there will be dependence on $x^-$, $x^i$. The massless vector particles are given by $\alpha^i(1)^\dagger|0; p^+, p^i\rangle$ with the index $i$ belonging to Lorentz indices pertinent to the brane only, with the number of states being


equal to $p - 1 = (p+1) - 2$, i.e., the massless vector particles “live” in the brane.¹ We also have $D - 1 - p$ massless scalar particles, associated with the states $\alpha^I(1)^\dagger|0; p^+, p^i\rangle$, so-called Goldstone bosons, arising from the breaking of translational invariance by the presence of the brane. Such states correspond to displacements of the brane in the above-mentioned normal directions. Clearly, for $p = D-1$, there are no massless scalar particles and translational invariance holds true in spacetime.

The “D” in D branes stands for Dirichlet, and the word “brane” is derived from the word membrane. If $p$ is the spatial dimension of a D brane, the latter, as we have seen above, is referred to as a D$p$-brane. The volume of a brane is termed its worldvolume. The space outside the brane is called “the bulk”. If $p = D-1$, then the brane is referred to as space filling. Time flows not only in the brane but in the bulk as well. It is interesting that the existence of branes in string theory would lead one to imagine that the universe in which we live is a D3-brane, with the bulk being associated with the extra dimensions. We have seen that the graviton is in the mass spectrum of closed strings, which have no end points and are thus not restricted to branes, and may move around all over spacetime. One may then infer that gravitons can escape from branes into the bulk, providing, perhaps, an explanation of the feebleness of gravitation in comparison to the other interactions.

Interesting situations arise when there is more than one brane. In particular, we learn that a massless vector particle acquires mass when the two ends of an open string are attached to two spatially separated D branes.

More Than One Brane: Mass Generation: Consider two parallel D$p$-branes, with a string satisfying a Dirichlet condition in (45.11), with one end restricted to one brane and the other to the second brane²:

[Figure: two D branes, labeled 1 and 2]

In the direction perpendicular to the branes, we may write

$$X^I(\tau,\sigma) = \bar x_1^I + (\bar x_2^I - \bar x_1^I)\,\frac{\sigma}{\pi} + \ell\sum_{n\neq 0}\frac{\alpha^I(n)}{n}\,e^{-in\tau}\sin n\sigma, \tag{45.13}$$

such that

$$\partial_\sigma X^I(\tau,\sigma) = \frac{(\bar x_2^I - \bar x_1^I)}{\pi} + \ell\sum_{n\neq 0}\alpha^I(n)\,e^{-in\tau}\cos n\sigma = \ell\,\frac{(\bar x_2^I - \bar x_1^I)}{\ell\,\pi}\,e^{0}\cos(0) + \ell\sum_{n\neq 0}\alpha^I(n)\,e^{-in\tau}\cos n\sigma, \tag{45.14}$$

$$X^I(\tau,\pi) - X^I(\tau,0) = \bar x_2^I - \bar x_1^I, \qquad I = p+1, \ldots, D-1. \tag{45.15}$$

¹ See Box 44.5 for degrees of freedom (independent states) of a massless vector particle.
² A state of a string defined by attaching labels (1,2) to its ends is referred to as a Chan-Paton state: Paton and Chan [2].

Due to the $e^{-in\tau}$ terms in $\partial X^I(\tau,\sigma)/\partial\tau$, the string has an oscillatory character between the two D branes. If we compare the expression for $\partial X^I(\tau,\sigma)/\partial\sigma$ in (45.14) with $\partial X^i(\tau,\sigma)/\partial\sigma = -i\,\ell\sum_{n\neq 0}\alpha^i(n)\,e^{-in\tau}\sin n\sigma$ as obtained, e.g., from (45.10), we note that the expression for $M^2$ in (45.12) necessarily picks up additional terms due to the non-zero $(\bar x_2^I - \bar x_1^I)/\ell\pi$ part in (45.14), thus modifying it to

$$\alpha' M^2 = \sum_{n=1}^{\infty}\Big[\alpha^j(n)^\dagger\alpha^j(n) + \alpha^I(n)^\dagger\alpha^I(n)\Big] - 1 + \sum_{I=p+1}^{D-1}\Big(\frac{\bar x_2^I - \bar x_1^I}{2\pi\sqrt{\alpha'}}\Big)^2. \tag{45.16}$$

For $\sum_{I=p+1}^{D-1}(\bar x_2^I - \bar x_1^I)^2 > 4\pi^2\alpha'$, the state

$\alpha^j(1)^\dagger|0; p^+, p^i; (1,2)\rangle$ corresponds to a massive vector particle but has only $(p-1)$ of the $p$ polarization states needed to describe a massive particle, since $j = 2, \ldots, p$, where we have denoted the ground state by $|0; p^+, p^i; (1,2)\rangle$. We have already seen in Box 44.6 of Chap. 44 that in a spacetime of dimension $p + 1$, corresponding to a D$p$-brane, a massive vector particle has $p$ degrees of freedom. The additional degree of freedom needed to define a massive vector particle is obtained from the scalar excitations in the following manner. One may introduce the unit vector with components $n^I$ along $\bar x_2^I - \bar x_1^I$, and define the state

$$\frac{1}{N}\sum_{I=p+1}^{D-1} n^I\,\alpha^I(0)^\dagger\,|0; p^+, p^i; (1,2)\rangle, \qquad n^I = \frac{\bar x_2^I - \bar x_1^I}{\sqrt{\sum_{J=p+1}^{D-1}(\bar x_2^J - \bar x_1^J)^2}}, \qquad \alpha^I(0) = \frac{(\bar x_2^I - \bar x_1^I)}{\ell\,\pi}, \tag{45.17}$$

where $N$ is a normalization factor, and we emphasize that the index $I$, as far as the branes are concerned, just labels the scalar particles and does not correspond to a Lorentz index for them. The state $\alpha^j(1)^\dagger|0; p^+, p^i; (1,2)\rangle$ together with the one in (45.17) now give the correct number of polarization states of a massive vector particle, given by $(p-1) + 1 = p$, in reference to the branes. This is interesting. The open string with both its ends on the same brane gives rise to a massless vector particle. On the other hand, when it is stretched instead between the two branes, it gives rise to a massive vector particle. Here what we have is what one may call a D brane realization of a “Higgs-like mechanism” in string theory.

Clearly, for $\sum_{I=p+1}^{D-1}(\bar x_2^I - \bar x_1^I)^2 = 4\pi^2\alpha'$ or $\sum_{I=p+1}^{D-1}(\bar x_2^I - \bar x_1^I)^2 \to 0$, one would be dealing with the massless cases.³

³ For additional details, see also Manoukian [1].

References
1. Manoukian, E. B. (2016). Quantum field theory II: Introductions to quantum gravity, supersymmetry and string theory. Switzerland: Springer.
2. Paton, J. E., & Chan, H. (1969). Generalized Veneziano model with isospin. Nuclear Physics B, 10, 516–520.
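As a small numerical companion to the mass formula (45.16), the sketch below evaluates $\alpha' M^2$ for a given oscillator level and brane separation, and counts the polarization states discussed around (45.17). It assumes that the oscillator sum acts on a state as its total level (each excitation $\alpha(n)^\dagger$ contributing $n$); the function names are hypothetical.

```python
import math


def alpha_prime_m2(level, separations=(), alpha_prime=1.0):
    """alpha' M^2 of (45.16) for total oscillator level `level`.

    `separations` lists the components x2^I - x1^I of the brane separation.
    """
    stretch = sum((d / (2.0 * math.pi * math.sqrt(alpha_prime))) ** 2
                  for d in separations)
    return level - 1 + stretch


def vector_polarizations(p, stretched):
    """Polarization states on a Dp-brane: p - 1 when both ends sit on one brane,
    plus one scalar combination (45.17) when the string is stretched."""
    return (p - 1) + (1 if stretched else 0)
```

With both ends on the same brane, `alpha_prime_m2(1)` vanishes (massless vector), while any non-zero separation at level 1 gives a positive value; for a D3-brane, `vector_polarizations(3, True)` returns the three states of a massive vector particle in four spacetime dimensions.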

46

Superstrings Prerequisite Chaps. 16, 43, 44

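In order to develop the theory of superstrings we have to consider the Dirac equation in two dimensions pertaining to the worldsheet of a superstring. Everything you need to know, and more, about the Dirac equation in 2 dimensions in Minkowski spacetime is spelled out in Box 46.1. In particular for a massless field it is given by

$$\rho^\alpha\partial_\alpha\psi(\zeta) = 0, \qquad \psi = \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}, \qquad \rho^0 = \begin{pmatrix}0 & -i\\ i & 0\end{pmatrix}, \qquad \rho^1 = \begin{pmatrix}0 & i\\ i & 0\end{pmatrix}, \tag{46.1}$$

$$\rho^0\rho^1 = \rho^5 = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}, \qquad \{\rho^\alpha, \rho^\beta\} = -2\,\eta^{\alpha\beta}. \tag{46.2}$$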
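The algebra in (46.1), (46.2) can be checked by direct multiplication. The following minimal sketch uses plain nested tuples for the $2\times 2$ matrices; the helper names are hypothetical, and the metric is taken as $[\eta^{\alpha\beta}] = \mathrm{diag}[-1, 1]$, as stated in Box 46.1.

```python
RHO = {
    0: ((0, -1j), (1j, 0)),  # rho^0 of (46.1)
    1: ((0, 1j), (1j, 0)),   # rho^1 of (46.1)
}
ETA = {(0, 0): -1, (0, 1): 0, (1, 0): 0, (1, 1): 1}  # eta^{alpha beta}
IDENTITY = ((1, 0), (0, 1))


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def anticommutator(a, b):
    """{a, b} = a b + b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    return tuple(tuple(ab[i][j] + ba[i][j] for j in range(2)) for i in range(2))


def is_close(a, b, tol=1e-12):
    """Entry-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def clifford_relation_holds():
    """Check {rho^a, rho^b} = -2 eta^{ab} for all index pairs."""
    return all(
        is_close(anticommutator(RHO[a], RHO[b]),
                 tuple(tuple(-2 * ETA[(a, b)] * IDENTITY[i][j] for j in range(2))
                       for i in range(2)))
        for a in (0, 1) for b in (0, 1))
```

Calling `matmul(RHO[0], RHO[1])` reproduces the diagonal matrix $\rho^5$, and `clifford_relation_holds()` confirms the anticommutation relations for the stated metric.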

We may define a general superfield involving the bosonic fields $X^\mu(\zeta)$ and Dirac fields $\psi^\mu(\zeta)$, with $\mu = 0, 1, \ldots, D-1$, as well as a supersymmetry spinor coordinate $\theta = (\theta_1, \theta_2)$ pertaining to the two-dimensional worldsheet of a superstring, as follows:

$$\Phi^\mu(\zeta,\theta) = X^\mu(\zeta) + \frac{1}{\sqrt{2}}\,\bar\theta\,\psi^\mu(\zeta) + \frac{1}{4}\,\bar\theta\theta\,B^\mu(\zeta), \qquad \mu = 0, 1, \ldots, D-1, \quad \zeta = (\tau,\sigma), \tag{46.3}$$

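in view of developing a supersymmetric action for a string, involving the bosonic fields and the spinors $\psi^\mu$. Here $B^\mu$ is called an auxiliary field. The supersymmetry transformations are given by

$$\zeta'^{\,\alpha} = \zeta^\alpha + \frac{i}{2}\,\bar\epsilon\,\rho^\alpha\theta, \qquad \theta' = \theta + \epsilon, \qquad \bar\theta' = \bar\theta + \bar\epsilon. \tag{46.4}$$

Under such a transformation, we have that for each component $\Phi^\mu(\zeta,\theta)$: $\Phi'^{\mu}(\zeta',\theta') = \Phi^\mu(\zeta,\theta)$. That is, $\Phi'^{\mu}(\zeta^\alpha,\theta) = \Phi^\mu(\zeta^\alpha - (i/2)\,\bar\epsilon\rho^\alpha\theta,\ \theta - \epsilon)$. The variation of the superfield is defined by

$$\delta\Phi^\mu(\zeta,\theta) = \Phi^\mu(\zeta,\theta) - \Phi'^{\mu}(\zeta,\theta) = \Big[\bar\epsilon\,\frac{\partial}{\partial\bar\theta} + \frac{i}{2}\,(\bar\epsilon\,\rho^\alpha\theta)\,\partial_\alpha\Big]\Phi^\mu(\zeta,\theta). \tag{46.5}$$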

From Eq. (46.3), the variation on the left-hand side of the above equation is

$$\delta\Phi^\mu(\zeta,\theta) = \delta X^\mu(\zeta) + \frac{1}{\sqrt{2}}\,\bar\theta\,\delta\psi^\mu(\zeta) + \frac{1}{4}\,\bar\theta\theta\,\delta B^\mu(\zeta). \tag{46.6}$$

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_46


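Box 46.1 Dirac equation in two dimensions

In two dimensional spacetime, the mass shell condition reads $[\,p^0 - \sqrt{(p^1)^2 + m^2}\,]\chi = 0$, which upon multiplying by $[\,p^0 + \sqrt{(p^1)^2 + m^2}\,]$ gives $[(p^1)^2 - (p^0 - m)(p^0 + m)]\chi = 0$. Simply set $(p^0 - m)\chi = p^1\varphi$, hence $(p^0 + m)\varphi = p^1\chi$, thus introducing the object $\varphi$. These two equations may be rewritten as $p^1\varphi - (p^0 - m)\chi = 0$, $-p^1\chi + (p^0 + m)\varphi = 0$, and may be combined in the elegant form

$$\Big[\gamma^\alpha\,\frac{\partial_\alpha}{i} + m\Big]\Psi = 0, \qquad \Psi = \begin{pmatrix}\chi\\ \varphi\end{pmatrix}, \qquad \gamma^0 = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}, \qquad \gamma^1 = \begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix},$$

$$\gamma^0\gamma^1 = \gamma^5 = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}, \qquad \{\gamma^\alpha, \gamma^\beta\} = -2\,\eta^{\alpha\beta}, \qquad [\eta^{\alpha\beta}] = \mathrm{diag}[-1, 1].$$

An equivalent description may be introduced by making a unitary transformation: $G\gamma^\alpha G^\dagger = \rho^\alpha$, $G\Psi = \psi$, with

$$G = \frac{1}{\sqrt{2}}\begin{pmatrix}1 & 1\\ i & -i\end{pmatrix},$$

$$\spadesuit\quad \Big[\rho^\alpha\,\frac{\partial_\alpha}{i} + m\Big]\psi = 0, \quad \psi = \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}, \quad \rho^0 = \begin{pmatrix}0 & -i\\ i & 0\end{pmatrix}, \quad \rho^1 = \begin{pmatrix}0 & i\\ i & 0\end{pmatrix}, \quad \rho^0\rho^1 = \rho^5 = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}, \quad \{\rho^\alpha, \rho^\beta\} = -2\,\eta^{\alpha\beta}\quad \spadesuit.$$

Since $\rho^0/i$, $\rho^1/i$ are real, we may infer that the Majorana condition of a spinor simply reads $\psi_1^* = \psi_1$, $\psi_2^* = \psi_2$, signifying the reality of the spinor $\psi$. Since $\rho^5$ is diagonal and, as indicated above, the matrices $\rho^0/i$, $\rho^1/i$ are real, we have a Majorana-chiral, also called Majorana-Weyl, representation. A complete set of 2 by 2 matrices is $I, \rho^0, \rho^1, \rho^5$, where $I$ is the identity matrix. The following basic properties should be noted:

$$(\rho^0)^\dagger = \rho^0, \quad (\rho^1)^\dagger = -\rho^1, \quad \{\rho^\alpha, \rho^5\} = 0, \quad \rho^\alpha\rho_\alpha = -2, \quad (\rho^\alpha)^\dagger\rho^0 = \rho^0\rho^\alpha, \quad \rho^\alpha\rho^\beta\rho_\alpha = 0,$$

$$\big[\rho_\alpha, [\rho^\beta, \rho^\gamma]\big] = 4\,(\rho^\beta\,\delta_\alpha{}^\gamma - \rho^\gamma\,\delta_\alpha{}^\beta).$$

For two anti-commuting Majorana spinors $\psi$, $\omega$ ($\bar\psi = \psi^\dagger\rho^0$), we have

$$\bar\psi\omega = \bar\omega\psi, \quad \bar\psi\rho^\alpha\omega = -\bar\omega\rho^\alpha\psi, \quad (\bar\psi\rho^\alpha\omega)^\dagger = (\bar\omega\rho^\alpha\psi), \quad \bar\psi\rho^\alpha\rho^\beta\omega = \bar\omega\rho^\beta\rho^\alpha\psi, \quad \bar\psi\rho^\alpha\rho^5\omega = -\bar\omega\rho^\alpha\rho^5\psi, \quad \bar\psi\rho^5\omega = -\bar\omega\rho^5\psi.$$

Finally, for a Majorana spinor $\theta$ with Grassmann, i.e., anti-commuting, components, we have

$$\theta = \begin{pmatrix}\theta_1\\ \theta_2\end{pmatrix}, \quad \{\theta_a, \theta_b\} = 0, \quad \bar\theta\rho^\alpha\theta = 0, \quad \bar\theta\rho^5\theta = 0, \quad \bar\theta\rho^\alpha\rho^5\theta = 0, \quad \bar\theta\theta = 2i\,\theta_2\theta_1, \quad \theta_a\bar\theta_b = -\frac{1}{2}\,\bar\theta\theta\,\delta_{ab},$$

$$\frac{\partial}{\partial\bar\theta_c}\,\bar\theta\theta = 2\,\theta_c, \qquad \bar\theta_1 = i\,\theta_2, \qquad \bar\theta_2 = -i\,\theta_1.$$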
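The unitary map of Box 46.1 between the two sets of matrices can be verified by direct multiplication. The sketch below is a check under one stated assumption: $\gamma^1$ is taken with lower-left entry $-1$, which is the form required by $\{\gamma^\alpha, \gamma^\beta\} = -2\eta^{\alpha\beta}$ and by $\gamma^0\gamma^1 = \gamma^5$. The helper names are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)
G = ((S, S), (1j * S, -1j * S))          # unitary matrix G of Box 46.1
GAMMA0 = ((1, 0), (0, -1))
GAMMA1 = ((0, 1), (-1, 0))               # assumed sign convention, see lead-in
RHO0 = ((0, -1j), (1j, 0))
RHO1 = ((0, 1j), (1j, 0))


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return tuple(tuple(complex(a[j][i]).conjugate() for j in range(2))
                 for i in range(2))


def conjugate_by_g(a):
    """Return G a G^dagger."""
    return matmul(matmul(G, a), dagger(G))


def is_close(a, b, tol=1e-12):
    """Entry-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))
```

In this representation `conjugate_by_g(GAMMA0)` and `conjugate_by_g(GAMMA1)` reproduce $\rho^0$ and $\rho^1$, and the product $\gamma^0\gamma^1$ is mapped to the diagonal $\rho^5$.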

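On the other hand, from the properties of the $\theta$ spinor in the last two lines in Box 46.1, and the properties of the product of two Majorana spinors depicted in the fourth line from below in the Box, the right-hand side of Eq. (46.5) reads

$$\frac{1}{\sqrt{2}}\,\bar\epsilon\,\psi^\mu - \frac{i}{2}\,\bar\theta\rho^\alpha\epsilon\,\partial_\alpha X^\mu - \frac{i}{4\sqrt{2}}\,\bar\theta\theta\,\bar\epsilon\rho^\alpha\partial_\alpha\psi^\mu + \frac{1}{2}\,\bar\theta\epsilon\,B^\mu, \tag{46.7}$$

and upon comparing this with the expression in Eq. (46.6), the following basic supersymmetry transformations of the fields emerge:

$$\delta X^\mu = +\frac{1}{\sqrt{2}}\,\bar\epsilon\,\psi^\mu, \qquad \delta\psi^\mu = +\frac{1}{\sqrt{2}}\,(B^\mu - i\,\rho^\alpha\partial_\alpha X^\mu)\,\epsilon, \qquad \delta B^\mu = -\frac{i}{\sqrt{2}}\,\partial_\alpha(\bar\epsilon\,\rho^\alpha\psi^\mu). \tag{46.8}$$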

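It is readily verified that the supersymmetric generalization of the action of a bosonic string in Eq. (44.4) of Chap. 44, consistent with the above supersymmetry transformations, is given by

$$W = -\frac{1}{2\pi\ell^2}\int (d\zeta)\,\Big[\partial^\alpha X^\mu\,\partial_\alpha X_\mu + \bar\psi^\mu\rho^\alpha\,\frac{\partial_\alpha}{i}\,\psi_\mu - B^\mu B_\mu\Big], \tag{46.9}$$

thus achieving worldsheet supersymmetry. Note that $\big[\bar\psi^\mu\rho^\alpha(\partial_\alpha/i)\,\psi_\mu\big]^\dagger = -\big((\partial_\alpha/i)\,\bar\psi^\mu\big)\rho^\alpha\psi_\mu = \bar\psi^\mu\rho^\alpha(\partial_\alpha/i)\,\psi_\mu$. The field equations following from the action (46.9) are

$$\Box X^\mu = 0, \qquad \rho^\alpha\partial_\alpha\psi^\mu = 0, \qquad B^\mu = 0. \tag{46.10}$$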


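In the sequel, we may set the auxiliary field $B^\mu$ equal to zero since it doesn’t, a priori, set a constraint on the other fields. The action in (46.9) then simply becomes

$$W = -\frac{1}{2\pi\ell^2}\int (d\zeta)\,\Big[\partial^\alpha X^\mu\,\partial_\alpha X_\mu + \bar\psi^\mu\rho^\alpha\,\frac{\partial_\alpha}{i}\,\psi_\mu\Big], \tag{46.11}$$

and the supersymmetry transformation rules in (46.8) then also reduce to

$$\delta X^\mu = \frac{1}{\sqrt{2}}\,\bar\epsilon\,\psi^\mu, \qquad \delta\psi^\mu = -\frac{i}{\sqrt{2}}\,\rho^\beta\partial_\beta X^\mu\,\epsilon, \qquad \delta\bar\psi^\mu = \frac{i}{\sqrt{2}}\,\bar\epsilon\,\rho^\beta\partial_\beta X^\mu. \tag{46.12}$$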

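We introduce the following light-cone variables

$$\partial_\pm = \frac{1}{2}(\partial_\tau \pm \partial_\sigma), \qquad \partial_\pm(\tau \pm \sigma) = 1, \qquad \partial_\pm(\tau \mp \sigma) = 0, \tag{46.13}$$

$$X^\pm = \frac{1}{\sqrt{2}}\,[X^0 \pm X^{D-1}], \qquad \psi^\pm = \frac{1}{\sqrt{2}}\,[\psi^0 \pm \psi^{D-1}], \tag{46.14}$$

$$X^\mu A X_\mu = -X^+ A X^- - X^- A X^+ + X^i A X^i, \qquad \bar\psi^\mu B\psi_\mu = -\bar\psi^+ B\psi^- - \bar\psi^- B\psi^+ + \bar\psi^i B\psi^i, \tag{46.15}$$

$i = 1, \ldots, D-2$, where for the Dirac fields the last expression in (46.15) is defined as in a matrix multiplication for a given matrix $B$, and where $A$ is arbitrary. A field equation like $\Box X(\tau,\sigma) = 0$, $\Box \equiv (\partial_\sigma)^2 - (\partial_\tau)^2$, implies, as a wave equation, that $X(\tau,\sigma)$ may be written as

$$X(\tau,\sigma) = X_R(\tau - \sigma) + X_L(\tau + \sigma), \tag{46.16}$$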

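where $X_{R/L}(\tau \mp \sigma)$ describe waves propagating to the right/left and are referred to as a right-mover and a left-mover, respectively. That $X_R(\tau - \sigma)$, for example, describes a wave propagating to the right is easily seen: as $\tau$ increases, $\sigma$ must also increase (i.e., the wave moves to the right) so that the amplitude $X_R(\tau - \sigma)$ attains the same value for the new $\tau$ and $\sigma$ as for the initial ones. On the other hand, the Dirac field satisfies $\rho^\alpha\partial_\alpha\psi = 0$, and with

$$\psi = \begin{pmatrix}\psi_R\\ \psi_L\end{pmatrix}, \qquad \rho^\alpha\partial_\alpha\psi = (\rho^0\partial_0 + \rho^1\partial_1)\begin{pmatrix}\psi_R\\ \psi_L\end{pmatrix} = \begin{pmatrix}0 & -i(\partial_0 - \partial_1)\\ i(\partial_0 + \partial_1) & 0\end{pmatrix}\begin{pmatrix}\psi_R\\ \psi_L\end{pmatrix} = 0, \tag{46.17}$$

$$\Rightarrow\quad \partial_-\psi_L = 0, \qquad \partial_+\psi_R = 0, \qquad \psi_{L/R}(\tau \pm \sigma) \text{ as functional dependence}, \tag{46.18}$$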

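where $\rho^0$, $\rho^1$ are given in Eq. (46.1). Superstring field equations then follow from Boxes 46.2 and 46.3 to be given by

$$\partial^\alpha\psi^+ = 0, \qquad \partial_+\psi_R^\mu = 0, \qquad \partial_-\psi_L^\mu = 0, \tag{46.19}$$

$$2\,\partial_+X_L^+\,\partial_+X_L^- - \partial_+X_L^i\,\partial_+X_L^i - \frac{i}{2}\,\psi_L^i\,\partial_+\psi_L^i + \frac{i}{2}\,\psi_L^+\,\partial_+\psi_L^- = 0. \tag{46.20}$$

We also have

$$X^+ = x^+ + \ell^2 p^+\tau, \qquad \Box X^i = 0, \quad i = 1, 2, \ldots, D-2, \tag{46.21}$$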


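Box 46.2 Some equations relevant to superstrings in the light-cone gauge

From Eqs. (46.14), (46.15), the integrand of the action (46.11) may be explicitly rewritten as:

$$I(\zeta) = -2\,\partial^\alpha X^+\partial_\alpha X^- - \bar\psi^+\,\frac{\rho^\alpha\partial_\alpha}{i}\,\psi^- - \bar\psi^-\,\frac{\rho^\alpha\partial_\alpha}{i}\,\psi^+ + \partial^\alpha X^i\,\partial_\alpha X^i + \bar\psi^i\,\frac{\rho^\alpha\partial_\alpha}{i}\,\psi^i.$$

The supersymmetry transformation rules in (46.12) may be also expressed as:

$$\delta X^+ = +\frac{1}{\sqrt{2}}\,\bar\epsilon\,\psi^+, \qquad \delta\psi^+ = -\frac{i}{\sqrt{2}}\,\rho^\beta\partial_\beta X^+\,\epsilon, \qquad \delta\bar\psi^+ = \frac{i}{\sqrt{2}}\,\bar\epsilon\,\rho^\beta\partial_\beta X^+.$$

The response of the integrand $I(\zeta)$ to the variations of the fields $X^+$, $\psi^+$, $\bar\psi^+$ emerges as

$$\delta_+ I = -2\,(\partial^\alpha\delta X^+)\,\partial_\alpha X^- - (\delta\bar\psi^+)\,\frac{\rho^\alpha\partial_\alpha}{i}\,\psi^- - \bar\psi^-\,\frac{\rho^\alpha\partial_\alpha}{i}\,(\delta\psi^+),$$

which may be rewritten as

$$\delta_+ I = -\sqrt{2}\,(\partial^\alpha\,\bar\epsilon\,\psi^+)\,\partial_\alpha X^- - \frac{1}{\sqrt{2}}\,\bar\epsilon\,(\rho^\beta\partial_\beta X^+)\,\rho^\alpha\partial_\alpha\psi^- + \frac{1}{\sqrt{2}}\,\bar\psi^-\rho^\alpha\partial_\alpha(\rho^\beta\partial_\beta X^+)\,\epsilon.$$

In the light-cone gauge, we recall that $X^+ = x^+ + \ell^2 p^+\tau$ and hence, using the fact that $\partial_\beta X^+ = \delta_\beta{}^0\,\ell^2 p^+$, we may infer that now the light-cone gauge for the problem at hand may be defined by simultaneously setting:

• $\partial^\alpha\psi^+ = 0$, which would make $\delta_+ I$ be reduced to a total derivative.

We note that the Dirac field equations $\rho^\alpha\partial_\alpha\psi^\mu = 0$ allow one to introduce left-movers and right-movers $\psi_L^\mu$, $\psi_R^\mu$, respectively, and write $\psi = (\psi_R\ \psi_L)^\top$, and by using the Dirac field equations we obtain

• $\partial_+\psi_R^\mu = 0$, $\partial_-\psi_L^\mu = 0$.

From Eq. (46.13), we may infer that they have the following functional dependence: $\psi_R(\tau - \sigma)$, $\psi_L(\tau + \sigma)$, and are thus justified to be referred to as right-movers and left-movers in a wave propagation description.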

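Box 46.3 Vanishing of the worldsheet energy-momentum tensor and an emerging constraint

Consider the response of the action in (46.11) under an infinitesimal transformation of the fields: $\delta X^\mu = \epsilon^\alpha(\zeta)\,\partial_\alpha X^\mu$, $\delta\psi^\mu = \epsilon^\alpha(\zeta)\,\partial_\alpha\psi^\mu(\zeta)$, leading, by using the free field equations of the fields, to

$$\delta W = -\frac{1}{2\pi\ell^2}\int (d\zeta)\,\partial^\alpha\epsilon^\beta(\zeta)\,T_{\alpha\beta}(\zeta),$$

where, up to total derivatives,

$$T_{\alpha\beta} = \partial_\alpha X\cdot\partial_\beta X - \frac{1}{2}\,\eta_{\alpha\beta}\,\partial^\gamma X\cdot\partial_\gamma X + \frac{1}{2i}\,\bar\psi^\mu\rho_\alpha\partial_\beta\psi_\mu$$

is the worldsheet energy-momentum tensor. It is easy to see that $T_{\alpha\beta}$ is symmetric. As a matter of fact, since $\rho^0\rho_0 = -I$, we have for $T_{01} = \frac{1}{2i}\,(\psi^\dagger\rho^0\rho_0\,\partial_1\psi)$:

$$T_{01} = -\frac{1}{2i}\,\psi^\dagger\partial_1\psi = -\frac{1}{2i}\,(\psi_R\,\partial_1\psi_R + \psi_L\,\partial_1\psi_L).$$

Also $\partial_\pm\psi_{R/L} = 0$, i.e., $\partial_0\psi_{R/L} = \mp\,\partial_1\psi_{R/L}$. Hence

$$T_{01} = \frac{1}{2i}\,(\psi_R\,\partial_0\psi_R - \psi_L\,\partial_0\psi_L) = \frac{1}{2i}\,\psi^\dagger\rho^0\rho_1\,\partial_0\psi = T_{10},$$

where $\psi = (\psi_R\ \psi_L)^\top$ and $\rho^0\rho^1 = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}$. The vanishing of $T_{\alpha\beta}$ allows us to consider the combination $T_{00} + T_{01} = 0$. From the explicit expression of $T_{\alpha\beta}$, it readily follows that the combination just mentioned leads to the constraint

• $2\,\partial_+X_L^+\,\partial_+X_L^- - \partial_+X_L^i\,\partial_+X_L^i - \dfrac{i}{2}\,\psi_L^i\,\partial_+\psi_L^i + \dfrac{i}{2}\,\psi_L^+\,\partial_+\psi_L^- = 0.$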

where the constraint in (46.20), relating derivatives of the components of the fields, is derived in Box 46.3, while the constraint $X^+ = x^+ + \ell^2 p^+\tau$ is given in (44.8). The constraint in (46.20) emerges, in particular, from the vanishing of the worldsheet energy-momentum tensor. As mentioned in Chap. 44, this point is not sufficiently emphasized, and the reason why it vanishes was also discussed there. In reference to Eqs. (46.19)-(46.21), we consider the following simple B.C.:

$$\partial_\sigma X^\mu(\tau,\sigma)\big|_{\sigma=0,\pi} = 0, \quad \text{chosen in Eq. (44.14) in Chap. 44, referred to as Neumann (N) B.C.}, \tag{46.22}$$

$$\psi_L^\mu(\tau,0) = \psi_R^\mu(\tau,0), \qquad \psi_L^\mu(\tau,\pi) = -\psi_R^\mu(\tau,\pi), \quad \text{referred to as Neveu-Schwarz (NS) B.C.} \tag{46.23}$$

Since the Fermi character implies that only bilinear forms of the spinors have a physical meaning, a B.C. in which the field changes sign at the other end of the string is permissible. The importance of the NS B.C. is that we know from the first equation in (46.19) that $\psi^+$ is independent of $\tau$ and $\sigma$. The NS B.C., because of the change in sign in (46.23) at $\sigma = \pi$, then implies that $\psi^+ = 0$, simplifying, accordingly, the analysis, and the last term in (46.20) does not contribute. In particular, from this constraint and the expression for $X^+$ in (46.21), that is $X_L^+ = x_L^+ + \ell^2 p^+(\tau + \sigma)/2$, $\partial_+X_L^+ = \ell^2 p^+/2$, the constraint in (46.20) simply becomes

$$\ell^2 p^+\,\partial_+X_L^- = \partial_+X_L^i\,\partial_+X_L^i + \frac{i}{2}\,\psi_L^i\,\partial_+\psi_L^i. \tag{46.24}$$

The NS B.C. for the Fermi fields implies that we may carry out a Fourier transform as

$$\psi^i_{L/R}(\tau,\sigma) = \frac{\ell}{\sqrt{2}}\sum_r d^i(r)\,e^{-ir(\tau\pm\sigma)}, \qquad r = \pm\frac{1}{2}, \pm\frac{3}{2}, \ldots \ \text{(NS)}, \tag{46.25}$$

which obviously satisfies the B.C. in question since $\exp[-ir\pi] = -\exp[+ir\pi]$. On the other hand, in the N B.C. for the bosonic fields we have from Eq. (44.16) in Chap. 44,

$$X^i_{L/R} = \frac{1}{2}\,x^i + \frac{1}{2}\,\ell^2 p^i\,(\tau\pm\sigma) + \frac{i\,\ell}{2}\sum_{n\neq 0}\frac{\alpha^i(n)}{n}\,e^{-in(\tau\pm\sigma)}, \qquad \ell\,p^i = \alpha^i(0), \tag{46.26}$$

where the last equality is given in Eq. (44.19) of Chap. 44. Also note that $[e^{in\sigma} + e^{-in\sigma}]/2 = \cos n\sigma$. We note from (46.13) and (46.25) that

$$\psi^i_L = \frac{\ell}{\sqrt{2}}\sum_{r'} d^i(r')\,e^{-ir'(\tau+\sigma)}, \qquad \partial_+\psi^i_L = -i\,\frac{\ell}{\sqrt{2}}\sum_r r\,d^i(r)\,e^{-ir(\tau+\sigma)}, \tag{46.27}$$

$$\frac{i}{2}\,\psi^i_L\,\partial_+\psi^i_L = \frac{\ell^2}{4}\sum_r d^i(-r)\,r\,d^i(r) + F_1(\tau+\sigma), \tag{46.28}$$

$r = \pm 1/2, \pm 3/2, \ldots$, and where we have used different indices $r'$ and $r$ in (46.27) for clarity. Here two situations arise in the product in (46.28): either $r' + r = 0$ or $r' + r \neq 0$. In the first case, one clearly obtains the first expression on the right-hand side of (46.28), which is independent of $(\tau+\sigma)$, while in the second case, clearly an expression arises depending on $(\tau+\sigma)$, which we have simply denoted by $F_1(\tau+\sigma)$. Similarly, from (46.13) and (46.26),

$$\partial_+X_L^i = \frac{\ell}{2}\sum_{n=-\infty}^{\infty}\alpha^i(n)\,e^{-in(\tau+\sigma)}, \qquad \text{where } \alpha^i(0) = \ell\,p^i, \tag{46.29}$$

$$\partial_+X_L^i\,\partial_+X_L^i = \frac{\ell^2}{4}\sum_{n=-\infty}^{\infty}\alpha^i(-n)\,\alpha^i(n) + F_2(\tau+\sigma), \tag{46.30}$$

where $F_2(\tau+\sigma)$ is a function of $(\tau+\sigma)$ as indicated. From (46.24), (46.28) and (46.30), upon integration over $\zeta_+$, we obtain

$$X_L^-(\tau+\sigma) = x^- + \frac{1}{4p^+}\Big[\sum_{n=-\infty}^{\infty}\alpha^i(-n)\,\alpha^i(n) + \sum_r d^i(-r)\,r\,d^i(r)\Big](\tau+\sigma) + F_3(\tau+\sigma) \tag{46.31}$$

$$\equiv x^- + \frac{1}{2}\,\ell^2 p^-\,(\tau+\sigma) + F_3(\tau+\sigma), \tag{46.32}$$

where $F_3(\tau+\sigma) = (1/\ell^2 p^+)\int d\zeta_+\,\big[F_1(\tau+\sigma) + F_2(\tau+\sigma)\big]$. That is,

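$$2\,p^+p^- = \frac{1}{\ell^2}\Big[\sum_{n=-\infty}^{\infty}\alpha^i(-n)\,\alpha^i(n) + \sum_r d^i(-r)\,r\,d^i(r)\Big], \quad \text{or} \tag{46.33}$$

$$2\,p^+p^- - p^i p^i = \frac{1}{\ell^2}\Big[\sum_{n\neq 0}\alpha^i(-n)\,\alpha^i(n) + \sum_r d^i(-r)\,r\,d^i(r)\Big], \quad \text{since } \alpha^i(0) = \ell\,p^i. \tag{46.34}$$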

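We note that the part of the Lagrangian density in (46.11) depending on the fermion field may be rewritten as

$$\mathcal{L}_F = \frac{i}{\pi\ell^2}\,\big[\psi_L\cdot\partial_-\psi_L + \psi_R\cdot\partial_+\psi_R\big], \tag{46.35}$$

$$\Rightarrow\ \text{canonical conjugate momenta of } \psi^i_{L/R}: \quad \frac{\partial\mathcal{L}_F}{\partial(\partial_\tau\psi^i_{L/R})} = \frac{i}{2\pi\ell^2}\,\psi^i_{L/R}, \tag{46.36}$$

with $\psi^i_{L/R}$ given in (46.20), and where we have used the fact that $\partial_-\tau = (1/2)\,\partial_-[(\tau+\sigma) + (\tau-\sigma)] = (0+1)/2$ (see Eq. (46.13)), as well as the chain rule $\partial_- = (\partial_-\tau)\,\partial/\partial\tau + (\partial_-\sigma)\,\partial/\partial\sigma$, and so on. Hence the quantum aspect of the theory emerges upon introducing, in particular, the anti-commutation relations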

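$$\{\psi^i_L(\tau,\sigma),\ \psi^j_L(\tau,\sigma')\} = (2\pi\ell^2)\,\delta^{ij}\,\delta(\sigma-\sigma'), \tag{46.37}$$

$$\{\psi^i_R(\tau,\sigma),\ \psi^j_R(\tau,\sigma')\} = (2\pi\ell^2)\,\delta^{ij}\,\delta(\sigma-\sigma'), \tag{46.38}$$

$$\{\psi^i_L(\tau,\sigma),\ \psi^j_R(\tau,\sigma')\} = 0, \tag{46.39}$$

$$\{d^i(r),\ d^j(r')^\dagger\} = \delta^{ij}\,\delta(r,r'), \qquad \{d^i(r),\ d^j(r')\} = 0, \qquad \{d^i(r)^\dagger,\ d^j(r')^\dagger\} = 0, \tag{46.40}$$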

in addition to the commutation relations of the bosonic field contributions in Eq. (44.24) of Chap. 44,

$$[\,\alpha^i(n),\ \alpha^j(m)\,] = n\,\delta(m,-n)\,\delta^{ij}, \qquad \alpha^i(-n) = (\alpha^i(n))^\dagger. \tag{46.41}$$

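From Eqs. (46.34), (46.40) and (46.41), we obtain, by summing over $i = 1, 2, \ldots, D-2$, with $M^2 = -p^2 = 2\,p^+p^- - p^i p^i$ and $\ell^2 = 2\alpha'$ given in Eq. (44.5) of Chap. 44,

$$M^2 = \frac{1}{\alpha'}\Big[\sum_{m=1}^{\infty}\alpha^i(m)^\dagger\alpha^i(m) + \frac{D-2}{2}\sum_{m=1}^{\infty} m + \sum_{r=1/2}^{\infty} d^i(r)^\dagger\,r\,d^i(r) - \frac{D-2}{2}\sum_{r=1/2}^{\infty} r\Big] \quad \text{(NS)}, \tag{46.42}$$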

where $r = 1/2, 3/2, \ldots$ As for the bosonic string, in making a transition from a classical description to a quantum one, the sum $\sum_{m=1}^{\infty} m$ is interpreted as the analytic continuation of the Riemann zeta function $\zeta(s)$ in Eq. (44.26) of Chap. 44 to $s = -1$, giving the value $\zeta(-1) = -1/12$. On the other hand, the zero point energy

$$\sum_{r=1/2,\,3/2,\ldots}^{\infty} r \equiv \sum_{m=0}^{\infty}(m + 1/2), \tag{46.43}$$

is interpreted as the analytic continuation of the so-called Hurwitz function¹

$$\zeta(s, a) = \sum_{m=0}^{\infty}(m + a)^{-s}, \qquad \mathrm{Re}\,s > 1, \tag{46.44}$$

to $s = -1$, for $a = 1/2$:

$$\zeta(-1, a)\big|_{a=1/2} = -\frac{1}{2}\Big(a^2 - a + \frac{1}{6}\Big)\Big|_{a=1/2} = \frac{1}{24}, \tag{46.45}$$

which is all that is needed about the Hurwitz function here. Accordingly, (46.42) becomes

$$M^2 = \frac{1}{\alpha'}\Big[\sum_{m=1}^{\infty}\alpha^i(m)^\dagger\alpha^i(m) + \sum_{r=1/2}^{\infty} d^i(r)^\dagger\,r\,d^i(r) - \frac{D-2}{16}\Big] \quad \text{(NS)}. \tag{46.46}$$

¹ For some useful references on the Hurwitz function see Adesi [1], Cohen [2], Elizalde [3]. The analytical continuation to $s = -1$, for $a = 1/2$, is all that is needed here.
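The arithmetic leading from (46.42) to (46.46) can be reproduced with exact fractions. The sketch below assumes only the closed form $\zeta(-1, a) = -\tfrac{1}{2}(a^2 - a + \tfrac{1}{6})$ quoted in (46.45), with $\zeta(-1) = \zeta(-1, 1)$; the function names are hypothetical.

```python
from fractions import Fraction


def hurwitz_zeta_minus_one(a):
    """Analytically continued zeta(-1, a) = -(a^2 - a + 1/6) / 2, as in (46.45)."""
    a = Fraction(a)
    return -(a * a - a + Fraction(1, 6)) / 2


def ns_zero_point_constant(dim):
    """(D-2)/2 * [zeta(-1) - zeta(-1, 1/2)], the constant term inside (46.42)."""
    riemann = hurwitz_zeta_minus_one(1)             # sum over m -> -1/12
    half = hurwitz_zeta_minus_one(Fraction(1, 2))   # sum over r -> +1/24
    return Fraction(dim - 2, 2) * (riemann - half)


def dimension_with_massless_vector(max_dim=30):
    """Scan for the D at which the r = 1/2 excitation has alpha' M^2 = 0."""
    for dim in range(3, max_dim + 1):
        if Fraction(1, 2) + ns_zero_point_constant(dim) == 0:
            return dim
    return None
```

`ns_zero_point_constant(dim)` returns $-(D-2)/16$ for any `dim`, matching the constant in (46.46), and requiring the lowest fermionic excitation ($r = 1/2$) to be massless singles out $D = 10$.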

Let $|NS\rangle|0, p\rangle$ denote the corresponding ground state. We first examine the lowest excited state, for $r = 1/2$. The application of the operator $M^2$ to this state gives

$$M^2\,d^i(1/2)^\dagger\,|NS\rangle|0, p\rangle = \frac{1}{\alpha'}\Big[\frac{1}{2} - \frac{D-2}{16}\Big]\,d^i(1/2)^\dagger\,|NS\rangle|0, p\rangle, \tag{46.47}$$

which corresponds to a vector particle with $D - 2$ degrees of freedom² and hence is massless - the photon. This gives the critical dimension of spacetime, from $1/2 - (D-2)/16 = 0$, to be 10, with the number of degrees of freedom of the vector particle being equal to 8. Again we learn that the number of degrees of freedom and the masslessness of the particle (described by a vector field) determine the dimensionality of the underlying spacetime of the theory. There seems to be also a tachyonic state with negative mass squared $-1/(2\alpha')$, as seen by the application of $M^2$ in (46.46) with $D = 10$

$$|NS\rangle|0, p\rangle, \quad\Rightarrow\quad M^2\,|NS\rangle|0, p\rangle = -\frac{1}{2\alpha'}\,|NS\rangle|0, p\rangle, \tag{46.48}$$

which, however, is automatically eliminated by invoking the necessary supersymmetry restriction that the number of fermion states is equal to the number of bosonic states.³ This method of invoking supersymmetry for consistency is referred to as the GSO Projection Method.⁴ Here we only give the basic idea of how this projection method eliminates the tachyon. It is achieved by applying the operator defined below, for example, to the ground state in (46.48)

$$\frac{1}{2}\,[1 - (-1)^F], \qquad F = \sum_{r=1/2}^{\infty} d^i(r)^\dagger d^i(r), \qquad F\,|NS\rangle|0, p\rangle = 0, \qquad \frac{1}{2}\,[1 - (-1)^F]\,|NS\rangle|0, p\rangle = 0, \tag{46.49}$$

where $F$ is the operator counting the number of fermion excitations, and the tachyonic state is automatically eliminated. After the GSO projection is applied, the first excited state $d^i(1/2)^\dagger|NS\rangle|0, p\rangle$ in (46.47) becomes the ground state and supersymmetry is consistently achieved. In particular, we note that

$$\frac{1}{2}\,[1 - (-1)^F]\,d^i(1/2)^\dagger\,|NS\rangle|0, p\rangle = d^i(1/2)^\dagger\,|NS\rangle|0, p\rangle, \tag{46.50}$$

$$M^2\,\frac{1}{2}\,[1 - (-1)^F]\,d^i(1/2)^\dagger\,|NS\rangle|0, p\rangle = 0. \tag{46.51}$$

² See Box 44.5 of Chap. 44.
³ See Chap. 39, below Eq. (39.23).
⁴ The GSO method was proposed by Gliozzi, Scherk and Olive [4, 5]. A pedagogical treatment of this method is given for the interested reader in Manoukian [8], pp. 252, 253, as it is applied in determining the number of states (fermionic equal to bosonic) for each generated mass in the theory.
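The action of the projector in (46.49)-(46.51) on a few low-lying states can be tabulated explicitly. The sketch below is a toy enumeration under stated assumptions: each state is described only by the levels of its bosonic and fermionic excitations, each excitation contributes its level $m$ or $r$ to the bracket in (46.46), and the state labels and function names are hypothetical.

```python
from fractions import Fraction


def gso_projector(fermion_number):
    """Eigenvalue of (1/2)[1 - (-1)^F] on a state with F fermionic excitations."""
    return Fraction(1 - (-1) ** fermion_number, 2)


def alpha_prime_m2(boson_levels=(), fermion_levels=(), dim=10):
    """alpha' M^2 from (46.46): bosonic levels + fermionic levels - (D-2)/16."""
    return (sum(Fraction(m) for m in boson_levels)
            + sum(Fraction(r) for r in fermion_levels)
            - Fraction(dim - 2, 16))


# (bosonic levels, fermionic levels) for a few illustrative states.
STATES = {
    "ground state": ((), ()),
    "d(1/2)^dagger": ((), (Fraction(1, 2),)),
    "alpha(1)^dagger": ((1,), ()),
    "d(3/2)^dagger": ((), (Fraction(3, 2),)),
}


def surviving_states(dim=10):
    """Map each state kept by the projector to its value of alpha' M^2."""
    return {name: alpha_prime_m2(bosons, fermions, dim)
            for name, (bosons, fermions) in STATES.items()
            if gso_projector(len(fermions)) == 1}
```

In this toy list the ground state, with $\alpha' M^2 = -1/2$ at $D = 10$, is projected out, while the state with a single $d^i(1/2)^\dagger$ excitation survives with $\alpha' M^2 = 0$.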


For a detailed study of various boundary conditions for open and closed superstrings, which include, in general, more constraints, and the investigation of all the massless bosonic and Fermionic fields in the mass spectra of superstrings, see, respectively Manoukian [6, 7] as well as Manoukian [8].

References 1. Adesi, V. B., & Zerbini, S. (2002). Analytic continuation of the Hurwitz zeta function with physical application. Journal of Mathematical Physics, 43, 3759–3765. 2. Cohen, H. (2007). Number theory. Vol. II: Analytic and modern tools. New York: Springer. 3. Elizalde, E., et al. (1994). Zeta regularization techniques with applications. Singapore: World Scientific. 4. Gliozzi, F., Scherk, J., & Olive, D. (1976). Supergravity and the spinor dual model. Physics Letters B, 65, 282–286. 5. Gliozzi, F., Scherk, J., & Olive, D. (1977). Supersymmetry, supergravity theories and the dual spinoral model. Nuclear Physics B, 122, 253–290. 6. Manoukian, E. B. (2012). All the fundamental bosonic massless fields in superstring theory. Fortschritte der Physik, 60, 337–344. 7. Manoukian, E. B. (2012). All the fundamental massless fermion fields in supersring theory: A rigorous analysis. Journal of Modern Physics, 3, 1027–1030. 8. Manoukian, E. B. (2016). Quantum field theory II: Introductions to quantum gravity, supersymmetry and string theory. Switzerland: Springer.

Vertices, Interactions and Scattering Prerequisite Chaps. 16, 43, 44

47

In this chapter, we describe a method by which vertices, interactions and scattering amplitudes may be developed in string theory. In scattering theory, external lines will be defined representing single particle excitations, but the internal structure, describing the dynamics, includes all string modes and is thus much more involved than in conventional field theory, where only a finite number of types of particles are exchanged in a scattering process. The worldsheets of two open strings may merge together, forming a single worldsheet, and may then split again. The intermediate worldsheet may, in turn, split into more than one worldsheet, which may then combine and then split again and so on:

and similarly, for closed strings:

One may also generate, for example, three-point vertices which will be important in subsequent chapters:

Scattering theory of string theory is not fully developed to the extent of scattering theory in conventional field theory, but some interesting results emerge. We present a simple intuitive description of scattering theory and of the construction of interacting vertices. In particular, we apply the formalism to evaluate a simple scattering process, interpret its significance and make contact with earlier, seemingly unrelated work of Veneziano in 1968. We evaluate, as well, a vertex for the interaction of massless vector bosons, as well as a vertex for the interaction of massless gravitons. These latter expressions will allow us to show how string theory re-invents the Yang-Mills field theory in the next chapter (Chapter 48), as well as Einstein’s General Relativity (GR) in Chapter 69, after the reader becomes familiar with the foundations of GR in chapters preceding Chapter 69.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_47


Open Strings: Let us now consider bosonic open strings. For any given particle state of, say, type $\Lambda$, of momentum $k^\mu$, in the open string spectrum, one associates a vertex operator $V_\Lambda(k,\tau)$, using a rather standard notation. This vertex describes the emission of such a particle from the $\sigma = 0$ end of the open string at a proper time $\tau$. The nature of such vertices will be discussed below. As a very simplistic view of a scattering process, consider a particle of type, say, $\Lambda_1$, of momentum $k_1$, in the spectrum of the string, being attached to the string at the $\sigma = 0$ end of the string in the far past. As time develops, one may, conveniently, introduce a string diagram, not to be confused with its worldsheet, as the semi-infinite rectangular strip, with $\sigma$ taken along the vertical axis and $\tau$ taken along the horizontal one, with $\tau_1 \le \tau_2 \le \cdots \le \tau_n$, and $\tau_1 \to -\infty$, $\tau_n \to \infty$. At times $\tau_2, \tau_3, \cdots, \tau_{n-1}$, particles may be emitted (absorbed) at the $\sigma = 0$ boundary of the string, such that with each particle a vertex operator is associated, as mentioned above. Finally a particle of momentum $k_n$ emerges in the distant future:

[Figure: String Diagram, with the incoming particle $(\Lambda_1 k_1)$ at $\tau_1$, vertices marked × at $\tau_2, \tau_3, \ldots, \tau_{n-1}$, and the outgoing particle $(\Lambda_n k_n)$ at $\tau_n$]

The transition amplitude $\mathcal{A}$, for example, involving 2 vertices, may be written in terms of the propagator $\Delta$, as introduced in Box 47.1, connecting the two vertices in question as¹

$$\mathcal{A}(\Lambda_1 k_1, \Lambda_2 k_2, \Lambda_3 k_3, \Lambda_4 k_4) = (g)^2\,\langle\Lambda_4 k_4\ \mathrm{out}|\,V_{\Lambda_3}(k_3)\,\Delta\,V_{\Lambda_2}(k_2)\,|\Lambda_1 k_1\ \mathrm{in}\rangle, \tag{47.1}$$

where $g$ denotes an open string coupling constant, and from Box 47.1, the propagator $\Delta$ is given by

$$\Delta = \frac{1/\alpha'}{p^2 + M^2 - i\epsilon}. \tag{47.2}$$

The remarkable thing about this propagator is that, unlike a Feynman propagator, which is restricted to the exchange of a single particle, here $M^2$ is an operator, as given in (44.31), signalling the exchange of an infinite number of particles in the mass spectrum. A parametric representation of the propagator is given by

$$\Delta = i\int_{-\infty}^{0} d\tau_2\ e^{\,i\tau_2\,\alpha'(p^2 + M^2 - i\epsilon)} = i\int_{-\infty}^{0} d\tau_2\ e^{\,i\tau_2\,(H - i\epsilon)}, \tag{47.3}$$

where $H = \alpha'(p^2 + M^2)$ is the Hamiltonian operator derived in Box 47.1. The latter operator annihilates a particle state $|\phi\rangle$ corresponding to a particle (i.e., on the mass shell) in the spectrum of a string: $(p^2 + M^2)|\phi\rangle = 0$. It should not be taken to denote the energy of a system. In particular $H\,|\Lambda_1 k_1\ \mathrm{in}\rangle = 0$. Substituting this expression in (47.1), and suppressing the $\epsilon$ dependence for simplicity of the notation, it takes the form

$$\mathcal{A}(\Lambda_1 k_1, \Lambda_2 k_2, \Lambda_3 k_3, \Lambda_4 k_4) = i\,(g)^2\,\langle\Lambda_4 k_4\ \mathrm{out}|\int_{-\infty}^{0} d\tau_2\ V_{\Lambda_3}(k_3, \tau_3 = 0)\,V_{\Lambda_2}(k_2, \tau_2)\,|\Lambda_1 k_1\ \mathrm{in}\rangle, \tag{47.4}$$

where $V_{\Lambda_2}(k_2, \tau_2) = \exp(i\tau_2 H)\,V_{\Lambda_2}(k_2)\,\exp(-i\tau_2 H)$.

¹ As a standard notation, the variables $\Lambda_i$ label the external particles in this expression.
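The parametric representation (47.3) can be checked numerically on a single eigenvalue of $H$. The sketch below replaces the operator $H = \alpha'(p^2 + M^2)$ by a real number `h`, keeps a small finite `eps`, and truncates the lower limit at a finite cutoff; these choices, the Simpson rule, and the function names are assumptions made only for illustration.

```python
import cmath


def propagator_from_integral(h, eps, cutoff=400.0, steps=40000):
    """Simpson estimate of i * integral_{-cutoff}^{0} dtau exp(i tau (h - i eps))."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    width = cutoff / steps

    def integrand(tau):
        return cmath.exp(1j * tau * (h - 1j * eps))

    total = integrand(-cutoff) + integrand(0.0)
    for k in range(1, steps):
        weight = 4 if k % 2 else 2
        total += weight * integrand(-cutoff + k * width)
    return 1j * total * width / 3.0


def propagator_closed_form(h, eps):
    """Closed form 1 / (h - i eps) on an eigenvalue h of H."""
    return 1.0 / (h - 1j * eps)
```

For `eps = 0.05` the two functions agree to high accuracy for off-shell values such as `h = 1.3` or `h = -0.7`, and also at `h = 0`, where only the $-i\epsilon$ term keeps the expression finite.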


Box 47.1 Translations on the worldsheet of bosonic strings

We consider translation operators on the worldsheet of bosonic strings and, in turn, obtain an expression for the propagator generalizing the Feynman propagator in conventional field theory. First consider an open string. The canonical conjugate momentum densities to the fields $X^\mu$ are given in Eq. (44.17) of Chapter 44 to be $P^\mu = \dot X^\mu/\pi\ell^2$, from which the Hamiltonian is given by

$$H = \int_0^\pi d\sigma\,(\dot X\cdot P - \mathcal{L}) = \frac{1}{2\pi\ell^2}\int_0^\pi d\sigma\,(\dot X^2 + X'^2) = \frac{1}{2}\sum_n \alpha^i(-n)\,\alpha^i(n).$$

With $\tau$ taken to be dimensionless, $H$ is the Hamiltonian times a unit of time, so that $H$ and $H\tau$ are dimensionless. Using light-cone variables in (44.6), (44.7) of Chapter 44, as well as $\partial_\sigma X^+ = 0$ from (44.8), gives $\partial_\alpha X^{D-1}\partial^\alpha X^{D-1} - \partial_\alpha X^0\partial^\alpha X^0 = -\dot X^-\dot X^+ - \dot X^+\dot X^-$. Also note from (44.19) in Chapter 44 that $\alpha^i(0) = \ell\,p^i$, $\alpha^\pm(0) = \ell\,p^\pm$, and from the constraint in (44.10) of Chapter 44 involving $\partial_\tau X^-$ and (25) that

$$\sum_{n\neq 0}\alpha^i(-n)\,\alpha^i(n) = 2\Big[\sum_{n=1}^{\infty}\alpha^i(n)^\dagger\alpha^i(n) - 1\Big],$$

to obtain $H = \alpha'(p^2 + M^2)$.

Regarding this operator, we note from Eq. (21) that for a given particle state $|\varphi\rangle$ in the spectrum of the string, i.e., in particular, on the mass shell ($p^2 = -M^2$): $(p^2 + M^2)|\varphi\rangle = 0$. If one compares this with the ordinary Klein-Gordon equation $(p^2 + m^2)|\varphi\rangle = 0$, which accounts for the exchange of only one particle of mass $m$ via the corresponding propagator $(p^2 + m^2 - i\epsilon)^{-1}$, we see that in the string theory case we have the equivalent of the exchange of an infinite number of particles, in the spectrum of the string, via a corresponding propagator (operator) given by

• $\Delta = \dfrac{1/\alpha'}{p^2 + M^2 - i\epsilon}.$

Also note that a translation of an operator $O(0)$ through the parameter $\tau$ may be defined by

• $O(\tau) = \exp[\,i\tau\alpha'(p^2 + M^2)]\ O(0)\ \exp[-i\tau\alpha'(p^2 + M^2)].\qquad (\ast)$

We may also shift $\tau \to \tau \mp \sigma$, and write $O(\tau \mp \sigma) = \exp[\,i(\tau\mp\sigma)\alpha'(p^2 + M^2)]\ O(0)\ \exp[-i(\tau\mp\sigma)\alpha'(p^2 + M^2)]$. In particular $O(\mp\sigma) = \exp[\mp i\sigma\alpha'(p^2 + M^2)]\ O(0)\ \exp[\pm i\sigma\alpha'(p^2 + M^2)]$. Consider now an operator $O(\mp\sigma)$ such that for any constant $\beta$: $\exp[\,i\beta p^2]\ O(\mp\sigma)\ \exp[-i\beta p^2] = O(\mp\sigma)$. For such an operator, we may write $O(\mp\sigma) = \exp[(\mp\sigma)\,i\alpha' M^2]\ O(0)\ \exp[(\pm\sigma)\,i\alpha' M^2]$.

For a closed string, we have seen in Eqs. (44.59), (44.60) of Chapter 44, in reference to the right and left movers in Eqs. (44.32) and (44.33), that the mass squared operator is given by $M^2 = M_R^2 + M_L^2$ with NS B.C, where

$$\frac{\alpha'}{2}\,M_R^2 = \Big[\sum_{n=1}^{\infty}\alpha^i(n)^\dagger\alpha^i(n) - 1\Big], \qquad \frac{\alpha'}{2}\,M_L^2 = \Big[\sum_{n=1}^{\infty}\bar\alpha^i(n)^\dagger\bar\alpha^i(n) - 1\Big].$$

Hence for operators $O_{R/L}(\tau\mp\sigma)$ which are functions of $(\alpha^{i\dagger}, \alpha^i)$, $(\bar\alpha^{i\dagger}, \bar\alpha^i)$, respectively, we have $O_{R/L}(\mp\sigma) = \exp[\mp\sigma\,i\,M_{R/L}^2]\ O_{R/L}(0)\ \exp[\pm\sigma\,i\,M_{R/L}^2]$. This also gives the useful expression

• $O_R(-\sigma)\,O_L(+\sigma) = \exp\big[-\sigma\,i\,(M_R^2 - M_L^2)\big]\ O_R(0)\,O_L(0)\ \exp\big[+\sigma\,i\,(M_R^2 - M_L^2)\big],\qquad (\ast\ast)$

where for a particle state $|\varphi\rangle$: $[M_R^2 - M_L^2]\,|\varphi\rangle = 0$.

Interaction Vertex of Massless Vector Bosons: Before applying the expression in Eq. (47.4) to compute the amplitude of 4-particle scattering, let us first determine the so-called three-point function representing the interaction vertex of massless vector bosons. To this end, we may directly infer from the expression in (47.4) the following expression for the interaction vertex of three massless vector bosons:

\[ A(\Lambda_1 k_1, \Lambda_2 k_2, \Lambda_3 k_3) = g\,\langle \Lambda_3 k_3\,\mathrm{out}|\, V_{\Lambda_2}(k_2,\tau_2=0)\,|\Lambda_1 k_1\,\mathrm{in}\rangle. \tag{47.5} \]

In response to the translation of a point of a string, $X^\mu \to X^\mu + a^\mu$, the wave-function of a particle of momentum $k$ develops the factor $e^{ika}$. Accordingly, we define a vector operator by the expression

\[ V_\Lambda(k,\tau) = W_\Lambda(k,\tau)\, e^{\,ikX(\tau)}, \qquad W_\Lambda(k,\tau) = e_{\mu\lambda}\,\frac{dX^\mu(\tau)}{d\tau}, \qquad e_\lambda\cdot k = 0, \quad k^2 = 0, \tag{47.6} \]


where $dX^\mu/d\tau$ is a current per unit of charge, and $e_\lambda$ is a polarization vector associated with the massless vector particle. The factor $e^{\,ikX(\tau)}$ is assumed to be defined as normal-ordered. That is, it is defined with creation operators set to the left of annihilation operators. To this end, let us spell out the normal-ordered expression for $e^{\,ikX(\tau)}$, denoted by the standard notation $:e^{\,ikX(\tau)}:$. To simplify the notation, we set $\ell = 1$ in the subsequent analysis. We may use Eq. (44.16) of Chapter 44 with N B.C. for $\sigma = 0$ in a covariant notation

\[ X^\mu(\tau) = x^\mu + p^\mu\tau + i\sum_{n\neq 0}\frac{\alpha^\mu(n)}{n}\,e^{-in\tau} = x^\mu + p^\mu\tau + i\sum_{n=1}^{\infty}\Big[\frac{\alpha^\mu(n)}{n}\,e^{-in\tau} - \frac{\alpha^\mu(n)^\dagger}{n}\,e^{\,in\tau}\Big]. \tag{47.7} \]

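We may thus define

\[ :\exp[\,ikX(\tau)]: \;=\; \exp[\,k\cdot B(\tau)]\;\exp[\,i(k\cdot x + k\cdot p\,\tau)]\;\exp[\,k\cdot A(\tau)], \tag{47.8} \]

\[ B^\mu(\tau) = \sum_{n=1}^{\infty}\frac{\alpha^\mu(n)^\dagger}{n}\,e^{\,in\tau}, \qquad A^\mu(\tau) = -\sum_{n=1}^{\infty}\frac{\alpha^\mu(n)}{n}\,e^{-in\tau}. \tag{47.9} \]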

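In particular, since in (47.5) $\tau_2 = 0$, we have

\[ :\exp[\,ikX(\tau)]:\Big|_{\tau=0} = \exp\Big[\sum_{n=1}^{\infty}\frac{k_\mu\alpha^\mu(n)^\dagger}{n}\Big]\;\exp[\,ik\cdot x\,]\;\exp\Big[-\sum_{n=1}^{\infty}\frac{k_\mu\alpha^\mu(n)}{n}\Big], \tag{47.10} \]

\[ \frac{dX^\mu(\tau)}{d\tau}\Big|_{\tau=0} = p^\mu + \sum_{n=1}^{\infty}\big[\alpha^\mu(n) + \alpha^\mu(n)^\dagger\big]. \tag{47.11} \]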

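On the other hand, the state of a massless vector particle of momentum, say, $k_1$ and polarization specified by the vector $e^\mu_{1\lambda} = (0, \mathbf{e}_{1\lambda})$ may be defined by

\[ |\Lambda_1, k_1\,\mathrm{in}\rangle = e_{1\lambda}\cdot\alpha(1)^\dagger\,|0; k_1\rangle = e_1\cdot\alpha(1)^\dagger\,|0; k_1\rangle, \qquad e_1\cdot k_1 = 0, \quad k_1^2 = 0, \quad \Lambda_1 = \lambda, \tag{47.12} \]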

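suppressing the parameter $\lambda$ for simplicity of the notation, and where we have labeled the ground state by the momentum $k_1$ in a covariant description. Clearly, only the terms $n = 1$ in (47.10), (47.11) may contribute to (47.5). To evaluate the latter, we use (47.10), (47.11) to write

\[ \begin{aligned} V_{\Lambda_2}(k_2,\tau_2=0) &= e_2\cdot\Big[\,p + \sum_{n=1}^{\infty}\big[\alpha(n)+\alpha(n)^\dagger\big]\Big] \times \exp\Big[\sum_{n=1}^{\infty}\frac{k_2\cdot\alpha(n)^\dagger}{n}\Big]\exp[\,ik_2\cdot x\,]\exp\Big[-\sum_{n=1}^{\infty}\frac{k_2\cdot\alpha(n)}{n}\Big] \\ &\to e_2\cdot[\,p+\alpha+\alpha^\dagger\,]\;\exp[\,k_2\cdot\alpha^\dagger]\;\exp[\,ik_2\cdot x\,]\;\exp[-k_2\cdot\alpha] \\ &\to e_2\cdot[\,p+\alpha+\alpha^\dagger\,]\;[\,1+k_2\cdot\alpha^\dagger\,]\;\exp[\,ik_2\cdot x\,]\;[\,1-k_2\cdot\alpha\,], \end{aligned} \tag{47.13} \]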

where $\alpha \equiv \alpha(1)$. Now we work with the expression in the last line of (47.13). We note that $\exp[\,ik_2\cdot x\,]$ is a translation operator of momentum, $p^\mu$ is a momentum operator, i.e., $p^\mu|0,k_1\rangle = k_1^\mu|0,k_1\rangle$, $[\alpha^\mu, \alpha^{\nu\dagger}] = \eta^{\mu\nu}$ in a covariant notation, and $e_j\cdot k_j = 0$, $k_j^2 = 0$ for each $j = 1, 2, 3$. Also, the annihilation operator $\alpha$ annihilates the states $|0; k_{1/3}\rangle$. In the subsequent analysis, we direct the momenta of the three particles such that $k_1+k_2+k_3 = 0$. An elementary application of these rules gives

\[ A(\Lambda_1 k_1, \Lambda_2 k_2, \Lambda_3 k_3) = g\,\big[\, e_2\cdot k_1\; e_1\cdot e_3 + e_3\cdot k_2\; e_2\cdot e_1 + e_1\cdot k_3\; e_3\cdot e_2 + k_1\cdot e_2\; k_2\cdot e_3\; k_3\cdot e_1 \big]\,(2\pi)^D\,\delta^D(k_1+k_2+k_3), \tag{47.14} \]


where we have extracted a momentum-conserving delta function factor by setting $\langle 0; -k_3|0; k_1+k_2\rangle = (2\pi)^D\,\delta^D(k_1+k_2+k_3)$. Since the last term within the brackets in (47.14) involves three powers of momenta, we may infer from dimensional analysis alone that its coefficient is suppressed by a factor $\ell^2$ relative to the other terms and hence is of higher order. The leading contribution to (47.14) may then be conveniently written as

\[ e_1^{\mu}\, e_2^{\nu}\, e_3^{\rho}\, A_{\mu\nu\rho}(k_1,k_2,k_3), \qquad A_{\mu\nu\rho}(k_1,k_2,k_3) = \eta_{\mu\rho}\,k_{1\nu} + \eta_{\mu\nu}\,k_{2\rho} + \eta_{\nu\rho}\,k_{3\mu}, \tag{47.15} \]

with $k_1+k_2+k_3 = 0$, up to the $g\,(2\pi)^D\,\delta^D(k_1+k_2+k_3)$ factor. It is easily verified, by applying momentum conservation and the orthogonality relations $e_j\cdot k_j = 0$ for each $j = 1, 2, 3$, that this amplitude is anti-symmetric in the exchange of any pair $(e_i, k_i) \leftrightarrow (e_j, k_j)$. Hence symmetrization over the exchanges of such pairs, which is necessary for bosons, gives zero unless one considers a linear combination of such amplitudes involving massless vector bosons with corresponding anti-symmetric coefficients, thus leading naturally to a non-abelian gauge theory with anti-symmetric structure constants. Thus this three-point function contributes only in the non-abelian gauge theory case. The description of the vertex of the interacting vector bosons may be completed by attaching an additional label (called a Chan-Paton factor) to the polarization vectors, $e^a$, $a = 1, \ldots, N$, introducing totally anti-symmetric constants $f^{abc}$, and considering such a linear combination

\[ A(k_1 e_1, k_2 e_2, k_3 e_3) = e^{a_1}_{1\mu_1}\, e^{a_2}_{2\mu_2}\, e^{a_3}_{3\mu_3}\; V^{\mu_1\mu_2\mu_3}_{a_1a_2a_3}(k_1,k_2,k_3), \tag{47.16} \]

with an emerging three-vector-boson vertex

\[ V^{\mu_1\mu_2\mu_3}_{a_1a_2a_3}(k_1,k_2,k_3) = (2\pi)^D\,\delta^D(k_1+k_2+k_3) \times i\,g\,f_{a_1a_2a_3}\,\big[\eta^{\mu_1\mu_2}(k_1-k_2)^{\mu_3} + \eta^{\mu_3\mu_1}(k_3-k_1)^{\mu_2} + \eta^{\mu_2\mu_3}(k_2-k_3)^{\mu_1}\big], \tag{47.17} \]

where the overall phase factor "$i$" is introduced for convenience to ensure reality upon Fourier transform to x-space, and the conservation law $k_1+k_2+k_3 = 0$ as well as $e_j\cdot k_j = 0$ have been used in writing the expression in (47.17). The above three-point function $A(k_1 e_1, k_2 e_2, k_3 e_3)$ for bosons, with necessarily totally anti-symmetric expansion coefficients $f^{abc}$, satisfies the Bose character of the vector particles, as it should. Are we on the verge of rediscovering the Yang-Mills field theory from string theory, where the $f^{abc}$ are the so-called structure constants? This is the subject of the next chapter.

A Scattering Process: For historical reasons, the amplitude (47.4) is often applied to the scattering of four tachyons, as the resulting amplitude for the scattering of four scalar particles coincides with an expression conjectured by Veneziano in the sixties,² which was discussed in Chapter 43 and has some desirable properties which will be mentioned again here. As before, we set $\ell = 1$ to simplify the notation. We make a change of variable $\tau \to z$: $e^\tau = z$ in the subsequent analysis. The vertex operator for a tachyon state of momentum $k$ has the simplest possible structure, given by

\[ V_\Lambda(k,z) = \;:\exp[\,ikX(z)]:\,, \tag{47.18} \]

i.e., $W_\Lambda(k,z)$ is set equal to one in (47.6) due to the scalar nature of the particles. We note from (44.29) of Chapter 44, with $D = 26$, $\ell = 1$, i.e., $\alpha' = 1/2$ in (44.5), that for the tachyon $M^2 = -2$, i.e., $k^2 = 2$. We carry out a transformation of variable $(z)^i \to z$, which corresponds to a so-called Wick rotation in the complex time plane. In reference to (47.8), with $e^\tau = z$, the following properties easily follow for a tachyon state:

\[ \exp\Big[-k_{2\mu}\sum_{n=1}^{\infty}\frac{\alpha^\mu(n)}{n}\,z^{-n}\Big]\,|0; k_1\rangle = |0; k_1\rangle, \tag{47.19} \]

\[ \exp[\,ik_2 x - i k_2 p\,\ln z\,]\,|0; k_1\rangle = \exp[\,k_1k_2\,\ln z\,]\;\exp\Big[\frac{k_2^2}{2}\,\ln z\Big]\,|0; k_1+k_2\rangle, \tag{47.20} \]

where in writing the latter equation we have used the elementary relation $e^{A+B} = e^{A}\,e^{B}\,e^{[B,A]/2}$ for $[B,A]$ a c-number, the covariant generalization $[x^\mu, p^\nu] = i\,\eta^{\mu\nu}$, that $p\,|0; k_1\rangle = k_1\,|0; k_1\rangle$, and finally that $x$ is the generator of momentum translation. We recall that $k_2^2 = 2$. From (47.8), (47.9), (47.19), (47.20), the scattering amplitude in (47.4), with $z = e^\tau$, of four tachyons becomes

\[ A(k_1,k_2,k_3,k_4) = (g)^2\int_0^1\frac{dz}{z}\,(z)^{k_1k_2}\,z^{\,1-i\epsilon} \times \Big\langle 0; k_3+k_4\Big|\exp\Big[-\sum_{n=1}^{\infty}\frac{k_3\,\alpha(n)}{n}\Big]\exp\Big[\sum_{m=1}^{\infty}\frac{k_2\,\alpha(m)^\dagger}{m}\,z^m\Big]\Big|0; k_1+k_2\Big\rangle. \tag{47.21} \]

² Veneziano [3], see also Lovelace and Squires [1].

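Using the identity

\[ \Big\langle 0; k_3+k_4\Big|\Big[-\sum_{n=1}^{\infty}\frac{k_3\,\alpha(n)}{n}\Big]\Big[\sum_{m=1}^{\infty}\frac{k_2\,\alpha(m)^\dagger}{m}\,z^m\Big]\Big|0; k_1+k_2\Big\rangle = -k_2k_3\sum_{n=1}^{\infty}\frac{z^n}{n}\,\langle 0; k_3+k_4|0; k_1+k_2\rangle = k_2k_3\,\ln(1-z)\,\langle 0; k_3+k_4|0; k_1+k_2\rangle, \tag{47.22} \]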

after some labor, the following expression emerges:

\[ A(k_1,k_2,k_3,k_4) = (g)^2\int_0^1 dz\,(z)^{k_1k_2}\,(1-z)^{k_2k_3}\,\langle 0; k_3+k_4|0; k_1+k_2\rangle, \tag{47.23} \]

where $\langle 0; k_3+k_4|0; k_1+k_2\rangle$ expresses the conservation of momentum. Upon using the Mandelstam variables and the identities following from them ($k_i^2 = 2$, $i = 1, 2, 3, 4$),

\[ s = -(k_1+k_2)^2, \qquad t = -(k_2+k_3)^2, \qquad k_1k_2 = -\frac{s}{2} - 2, \qquad k_2k_3 = -\frac{t}{2} - 2, \tag{47.24} \]

as well as the definition of the beta function

\[ B(a,b) = \int_0^1 dz\; z^{a-1}(1-z)^{b-1} = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a+b)}, \tag{47.25} \]

the amplitude, up to a momentum-conserving delta function and a phase factor, becomes

\[ A = (g)^2\, B\Big(-\frac{s}{2}-1,\; -\frac{t}{2}-1\Big), \tag{47.26} \]

which is the classic Veneziano amplitude. In addition to the tower of string states contributing to this expression, leading to poles in the gamma functions in the s- and t-channels with an infinite number of particles exchanged, the pole in the s-channel arising, for example, at $s = -2$ is the expected exchange of a tachyon in field theory. This amplitude was written down by Veneziano before the birth of string theory!

Closed Strings: The worldsheet of a closed string does not have specific boundaries such as $\sigma = 0$, as in the open string case, and particle emission (absorption) is to be considered to occur from any $\sigma$ value, as shown by the crosses in the string diagram of a closed string

[Figure: string diagram of a closed string, with crosses marking vertex insertions]

and an averaging is to be performed over $\sigma$ of a vertex operator insertion. Here a string diagram is defined as the surface of the infinite cylinder generated by the closed string, set up straight as shown, in a circular shape, with $\tau$ labeled along the horizontal direction shown, from $\tau \to -\infty$ to $\tau \to \infty$. Consider, for example, a vertex insertion of the graviton which, as we have seen in Chap. 44, is in the spectrum of a closed string. The state of a graviton of momentum $k$, described by a polarization tensor $\epsilon^{\mu\nu}$ (see Eqs. (44.32), (44.33), (44.56), (44.58) in Chap. 44), is

\[ \epsilon_{\mu\nu}\,\alpha^\mu(1)^\dagger\,\bar\alpha^\nu(1)^\dagger\,|0; k\rangle, \qquad \epsilon^{\mu 0} = 0, \quad \epsilon^{\mu}{}_{\mu} = 0, \quad k_\mu\,\epsilon^{\mu\nu} = 0, \tag{47.27} \]

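where $|0; k\rangle$ is the ground state introduced in Eq. (44.56) of Chapter 44 with associated momentum $k$. A graviton vertex operator may then be defined by $V_\Lambda \propto \epsilon_{\mu\nu}\,\partial_\xi X^\mu\,\partial^\xi X^\nu\, e^{\,ikX}$, with $\xi = (\tau,\sigma)$, and hence simply, and more conveniently, as

\[ V_\Lambda(\tau,\sigma) = \epsilon_{\mu\nu}\,\big(\partial_+X^\mu\big)\big(\partial_-X^\nu\big)\,e^{\,ik\cdot X} = \epsilon_{\mu\nu}\,\big(\partial_+X_L^\mu\big)\big(\partial_-X_R^\nu\big)\,e^{\,ik\cdot X}, \tag{47.28} \]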

$\partial_\pm = (\partial_\tau \pm \partial_\sigma)/2$. We note that for $\tau = 0$ we may write $k\cdot X(0,\sigma) = k\cdot x\; +$ an expression consisting of creation and annihilation operators of particles, which is independent of $x$ and which we may denote by $k\cdot\widetilde{X}(0,\sigma)$. Also note that the coefficient of $\exp[\,ik\cdot X(0,\sigma)]$ in $V_\Lambda(0,\sigma)$ in (47.28) is independent of $x$. Accordingly, by using the fact that the graviton is massless, i.e., $k^2 = 0$, we have for any constant $\beta$ the following useful equations:

\[ \exp[\,i\beta p^2]\;\exp[\,ik\cdot x\,]\;\exp[-i\beta p^2] = \exp[\,ik\cdot x\,], \tag{47.29} \]

\[ \exp[\,i\beta p^2]\;V_\Lambda(0,\sigma)\;\exp[-i\beta p^2] = V_\Lambda(0,\sigma). \tag{47.30} \]

Using the normal-ordered expression in (47.8), (47.9), adapted to the present situation, the graviton vertex operator may be defined by

\[ V_\Lambda(k,\tau,\sigma) = \epsilon_{\mu\nu}\,\Big[\frac{1}{2}\,p^\mu + \sum_{n\neq 0}\alpha^\mu(n)\,e^{-2in(\tau-\sigma)}\Big] :e^{\,ik\cdot X_R(\tau-\sigma)}: \;\times\; \Big[\frac{1}{2}\,p^\nu + \sum_{n\neq 0}\bar\alpha^\nu(n)\,e^{-2in(\tau+\sigma)}\Big] :e^{\,ik\cdot X_L(\tau+\sigma)}:\,. \tag{47.31} \]

Upon rewriting the latter as $V = \epsilon_{\mu\nu}V^{\mu\nu}$, we see that $V^{\mu\nu}$ is written as the product of a right- and a left-vertex. Therefore, up to polarization tensors, we may define a vertex operator in the closed string as a product of right and left vertex operators, expressed as

\[ V_\Lambda(k,\tau-\sigma,\tau+\sigma) = V_{R\Lambda}\Big(\frac{k}{2},\tau-\sigma\Big)\,V_{L\Lambda}\Big(\frac{k}{2},\tau+\sigma\Big), \tag{47.32} \]

and perform an integration over $\sigma$ as an averaging over all of its possible values, to obtain a vertex operator which is independent of $\sigma$:

\[ V_\Lambda(k,\tau) = \frac{1}{\pi}\int_0^\pi d\sigma\; V_{R\Lambda}\Big(\frac{k}{2},\tau-\sigma\Big)\,V_{L\Lambda}\Big(\frac{k}{2},\tau+\sigma\Big). \tag{47.33} \]

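From (47.4), we may directly write for a 4-particle scattering amplitude

\[ A(\Lambda_1 k_1, \Lambda_2 k_2, \Lambda_3 k_3, \Lambda_4 k_4) = i\,(g_C)^2\int_{-\infty}^{0} d\tau_2\;\langle \Lambda_4 k_4\,\mathrm{out}|\,V_{\Lambda_3}(k_3,\tau_3=0)\,V_{\Lambda_2}(k_2,\tau_2)\,|\Lambda_1 k_1\,\mathrm{in}\rangle. \tag{47.34} \]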

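Of particular interest is the three-graviton vertex

\[ A(\Lambda_1 k_1, \Lambda_2 k_2, \Lambda_3 k_3) = g_C\,\langle \Lambda_3 k_3|\,V_{\Lambda_2 k_2}\,|\Lambda_1 k_1\rangle, \tag{47.35} \]

\[ V_{\Lambda_2 k_2} = \frac{1}{\pi}\int_0^\pi d\sigma\; V_{R\Lambda_2}\Big(\frac{k_2}{2}, 0-\sigma\Big)\,V_{L\Lambda_2}\Big(\frac{k_2}{2}, 0+\sigma\Big). \tag{47.36} \]

In Box 47.1, Eq. $(\ast\ast)$ at the end of the Box, it is shown that

\[ V_{R\Lambda_2}\Big(\frac{k_2}{2}, 0-\sigma\Big)\,V_{L\Lambda_2}\Big(\frac{k_2}{2}, 0+\sigma\Big) = \exp[-\sigma\, i\,(M_R^2 - M_L^2)]\; V_{R\Lambda_2}\Big(\frac{k_2}{2}, 0\Big)\,V_{L\Lambda_2}\Big(\frac{k_2}{2}, 0\Big)\;\exp[+\sigma\, i\,(M_R^2 - M_L^2)], \tag{47.37} \]

\[ M^2 = M_R^2 + M_L^2, \qquad (M_R^2 - M_L^2)\,|\Lambda_{1/3}\,k_{1/3}\rangle = 0. \tag{47.38} \]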

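Accordingly, the three-graviton vertex is given from (47.35)-(47.38) by

\[ A(\Lambda_1 k_1, \Lambda_2 k_2, \Lambda_3 k_3) = g_C\,\langle \Lambda_3 k_3|\,V_{\Lambda_2 R}\Big(\frac{k_2}{2}, 0\Big)\,V_{\Lambda_2 L}\Big(\frac{k_2}{2}, 0\Big)\,|\Lambda_1 k_1\rangle. \tag{47.39} \]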

Upon comparison of this expression with the three-point function of three massless vector particles through (47.5), (47.6), (47.15), the three-point function for three gravitons emerges as the product of two such vertices of three massless vector bosons of momenta $k_1/2$, $k_2/2$, $k_3/2$, multiplied by graviton polarization tensors, i.e.,

\[ g_C\;\epsilon_1^{\mu_1\mu_2}\,\epsilon_2^{\nu_1\nu_2}\,\epsilon_3^{\rho_1\rho_2}\; A_{\mu_1\nu_1\rho_1}\Big(\frac{k_1}{2},\frac{k_2}{2},\frac{k_3}{2}\Big)\, A_{\mu_2\nu_2\rho_2}\Big(\frac{k_1}{2},\frac{k_2}{2},\frac{k_3}{2}\Big), \tag{47.40} \]

\[ A_{\mu\nu\rho}(k_1,k_2,k_3) = \eta_{\mu\rho}\,k_{1\nu} + \eta_{\mu\nu}\,k_{2\rho} + \eta_{\nu\rho}\,k_{3\mu}, \tag{47.41} \]

with the direction of momenta finally chosen so that momentum conservation reads $k_1+k_2+k_3 = 0$, where we have retained only the leading terms in $\ell^2$, implying that $A^{\mu\nu\rho}(k_1,k_2,k_3)$ must be linear in the momenta (see (47.15) and just below it). We may rewrite (47.40) in a more symmetrical manner. We make use of momentum conservation, $k_1+k_2+k_3 = 0$, and the properties $k_i^\mu\,\epsilon_{i\,\mu\nu} = 0$, $k_i^\nu\,\epsilon_{i\,\mu\nu} = 0$ for $i = 1, 2, 3$. The expression for the three-point function in (47.40) equally holds true for $A^{\mu\nu\rho}(k_1,k_2,k_3) \to -A^{\mu\nu\rho}(k_3,k_1,k_2)$. Accordingly, we may rewrite the three-point function for three gravitons, with the polarization tensors labeling the gravitons, as

\[ A(\epsilon_1 k_1, \epsilon_2 k_2, \epsilon_3 k_3) = \epsilon_{1\,\mu_1\mu_2}\,\epsilon_{2\,\nu_1\nu_2}\,\epsilon_{3\,\rho_1\rho_2}\; V^{\mu_1\mu_2,\,\nu_1\nu_2,\,\rho_1\rho_2}(k_1,k_2,k_3), \tag{47.42} \]

giving rise to a three-graviton vertex

\[ V^{\mu_1\mu_2,\,\nu_1\nu_2,\,\rho_1\rho_2}(k_1,k_2,k_3) = (2\pi)^D\,\delta^{(D)}(k_1+k_2+k_3)\;\frac{g_C}{8}\;\mathrm{Sym}\Big[A^{\mu_1\nu_1\rho_1}(k_1,k_2,k_3)\,A^{\mu_2\nu_2\rho_2}(k_1,k_2,k_3) + A^{\mu_1\nu_1\rho_1}(k_3,k_1,k_2)\,A^{\mu_2\nu_2\rho_2}(k_3,k_1,k_2)\Big], \tag{47.43} \]

where $A^{\mu\nu\rho}(k_1,k_2,k_3)$ is defined in (47.41), and "Sym" stands for the symmetrization operation to be applied to the expression following it over $\mu_1 \leftrightarrow \mu_2$, $\nu_1 \leftrightarrow \nu_2$, $\rho_1 \leftrightarrow \rho_2$, as a consequence of the symmetric nature of the polarization tensors in their two indices. For completeness, we have also introduced a momentum-conserving delta function factor in (47.43). The above expression of the three-graviton vertex will turn out to be important when dealing with the connection between string theory and Einstein's theory of gravitation in Chap. 69, after GR has been introduced. For more details on the content of this chapter, including general expressions for multi-particle scattering as well as applications to superstring theory, see Manoukian [2].
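The beta-function form of the four-tachyon amplitude in (47.25), (47.26) can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names, the midpoint integration rule and the choice g = 1 are assumptions made only for illustration. It compares the integral and gamma-function forms of B(a, b), checks the s-t crossing symmetry of (47.26), and lists the s-channel pole positions, where -s/2 - 1 is a non-positive integer.

```python
import math

def beta_gamma(a, b):
    """Beta function from the gamma-function form in (47.25)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_integral(a, b, steps=200_000):
    """Midpoint-rule estimate of the integral form in (47.25); assumes a, b > 0."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        total += z ** (a - 1.0) * (1.0 - z) ** (b - 1.0)
    return total * h

def veneziano(s, t, g=1.0):
    """Amplitude (47.26), up to the delta function and a phase (g = 1 is an assumption)."""
    return g ** 2 * beta_gamma(-s / 2.0 - 1.0, -t / 2.0 - 1.0)

def s_channel_poles(n_max):
    """Values of s at which Gamma(-s/2 - 1) has a pole: -s/2 - 1 = -n, n = 0..n_max."""
    return [2 * (n - 1) for n in range(n_max + 1)]
```

For example, `s_channel_poles(3)` returns the list starting at the tachyon pole s = -2 quoted in the text, followed by the next levels of the spectrum. At these values `math.gamma` raises `ValueError`, which is how the poles show up in this sketch.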

References
1. Lovelace, C., & Squires, E. (1970). Veneziano theory. Proceedings of the Royal Society of London A, 318, 321–353.
2. Manoukian, E. B. (2016). Quantum field theory II: Introductions to quantum gravity, supersymmetry and string theory. Switzerland: Springer.
3. Veneziano, G. (1968). Construction of a crossing-symmetric Regge-behaved amplitude for linearly rising trajectories. Nuovo Cimento A, 57, 190–197.

String Theory Re-Invents the Yang-Mills Field Theory

48

Prerequisite Chaps. 16, 44, 47

It is remarkable that, when asked to provide a three-massless-vector interaction vertex, string theory delivers one. We have found, by invoking the spin-statistics theorem, that this is possible if the corresponding field carries an extra quantum number ("charge") as an additional label. This is quite interesting since we know, for example, that photons do not carry a charge and hardly interact:¹ they move as blind entities and essentially do not see each other. We have seen in Chap. 44 that string theory gives rise naturally to massless vector particles associated with modes of vibration of some of the strings. On the other hand, it also gives rise to the mathematical expression in Eq. (47.17) of Chap. 47 for the three-massless-vector interaction vertex, as just mentioned. Do these two facts obtained from string theory lead to the Yang-Mills field theory? This chapter addresses this question. To this end, we start from a first-order formulation of free massless vector fields, each described in terms of two sets of fields $\{A_{a\mu}\}$ and $\{F_{a\mu\nu}\}$, with action and corresponding field equations (see also Chap. 16)

\[ W = \int (dx)\,\Big[-\frac{1}{2}\,F_{a\mu\nu}\,(\partial^\mu A_a^\nu - \partial^\nu A_a^\mu) + \frac{1}{4}\,F_{a\mu\nu}F_a^{\mu\nu}\Big], \tag{48.1} \]

\[ \partial^\mu F_{a\mu\nu} = 0, \qquad F_{a\mu\nu} = \partial_\mu A_{a\nu} - \partial_\nu A_{a\mu}. \tag{48.2} \]

The string theory three-vector-particle vertex as obtained in (47.17) of Chap. 47 reads

\[ V^{\mu_1\mu_2\mu_3}_{a_1a_2a_3}(k_1,k_2,k_3) = (2\pi)^D\,\delta^{(D)}(k_1+k_2+k_3) \times i\,g\,f_{a_1a_2a_3}\,\big[\eta^{\mu_1\mu_2}(k_1-k_2)^{\mu_3} + \eta^{\mu_3\mu_1}(k_3-k_1)^{\mu_2} + \eta^{\mu_2\mu_3}(k_2-k_3)^{\mu_1}\big], \tag{48.3} \]

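with $f_{a_1a_2a_3}$ necessarily anti-symmetric in its indices. As we will see below, given the fields $A_{a\mu}$ and $F_{b\mu\nu}$, an interaction term involving the product of these fields which leads to the vertex part in (48.3) is simply given by

\[ \mathcal{L}_I(x) = -\frac{1}{2}\,g\,f^{abc}\,F_{a\mu\nu}(x)\,A_b^\mu(x)\,A_c^\nu(x). \tag{48.4} \]

Using the lowest-order relation $F_{a\mu\nu} = \partial_\mu A_{a\nu} - \partial_\nu A_{a\mu}$ of (48.2) and the anti-symmetry of $f^{abc}$ in $b$, $c$, the two terms so generated are equal, and the latter may be equivalently rewritten as

\[ \mathcal{L}_I(x) = -\,g\,f^{abc}\,\partial_\mu A_{a\nu}(x)\,A_b^\mu(x)\,A_c^\nu(x), \tag{48.5} \]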

involving only one derivative, in conformity with the expression in (48.3), in x-space. The vertex part $\widetilde{V}^{\mu_1\mu_2\mu_3}_{a_1a_2a_3}(k_1,k_2,k_3)$ corresponding to this interaction is defined by

\[ \widetilde{V}^{\mu_1\mu_2\mu_3}_{a_1a_2a_3}(k_1,k_2,k_3) = (2\pi)^{3D}\,\frac{\delta}{\delta A_{a_3\mu_3}(k_3)}\,\frac{\delta}{\delta A_{a_2\mu_2}(k_2)}\,\frac{\delta}{\delta A_{a_1\mu_1}(k_1)}\int (dx)\,\mathcal{L}_I(x). \tag{48.6} \]

We introduce the Fourier transform of the fields

\[ A_{a\mu}(x) = \int\frac{(dp)}{(2\pi)^D}\; e^{\,ipx}\,A_{a\mu}(p), \tag{48.7} \]

with functional derivatives of the fields defined by

\[ \frac{\delta A_a{}^{\mu}(p)}{\delta A_b{}^{\nu}(k)} = \delta_a{}^{b}\,\delta^{\mu}{}_{\nu}\,\delta^{(D)}(p-k). \tag{48.8} \]

¹ Any interaction between photons arises as a quantum correction of higher order and is significantly small.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_48

The evaluation of $\widetilde{V}^{\mu_1\mu_2\mu_3}_{a_1a_2a_3}(k_1,k_2,k_3)$ in (48.6) is then readily carried out. Clearly, it will contain six terms. Up to the $(2\pi)^D\,\delta^{(D)}(k_1+k_2+k_3)$ factor obtained by integrating over $x$, the first term in (48.6) is given by

\[ -\,i\,g\,f^{abc}\,\big[k_{1\mu}\,\delta^{\mu_1}{}_{\nu}\,\delta_{a a_1}\big]\big[\eta^{\mu\mu_2}\,\delta_{b a_2}\big]\big[\eta^{\nu\mu_3}\,\delta_{c a_3}\big]. \qquad (\ast) \]

A straightforward application of the functional derivatives in (48.6), using in the process the anti-symmetry of $f^{abc}$, then gives precisely

\[ \widetilde{V}^{\mu_1\mu_2\mu_3}_{a_1a_2a_3}(k_1,k_2,k_3) = (2\pi)^D\,\delta^{D}(k_1+k_2+k_3) \times i\,g\,f_{a_1a_2a_3}\,\big[\eta^{\mu_1\mu_2}(k_1-k_2)^{\mu_3} + \eta^{\mu_3\mu_1}(k_3-k_1)^{\mu_2} + \eta^{\mu_2\mu_3}(k_2-k_3)^{\mu_1}\big]. \tag{48.9} \]

For example, note that the term just given in $(\ast)$ corresponds to the term $\eta^{\mu_3\mu_1}(-k_1^{\mu_2})$ inside the square brackets in (48.9). The vertex part $\widetilde{V}^{\mu_1\mu_2\mu_3}_{a_1a_2a_3}(k_1,k_2,k_3)$ coincides with the one in (48.3).

The above analysis, in turn, suggests adding the integral of the interaction term in (48.4) to the action in (48.1) as an iteration. Here we note that the fields $F_{a\mu\nu}$ and $A_{a\nu}$ are now subjected to the variation of the new action and hence are automatically modified. All told, the action becomes:

\[ W = \int (dx)\,\Big[-\frac{1}{2}\,F_{a\mu\nu}\,\big(\partial^\mu A_a^\nu - \partial^\nu A_a^\mu + g\,f^{abc}A_b^\mu A_c^\nu\big) + \frac{1}{4}\,F_{a\mu\nu}F_a^{\mu\nu}\Big]. \tag{48.10} \]

The field equations that now follow are

\[ \partial^\mu F_{c\mu\nu} = g\,F_{a\mu\nu}\,A_b^\mu\,f^{abc}, \qquad F_{a\mu\nu} = \partial_\mu A_{a\nu} - \partial_\nu A_{a\mu} + g\,f^{abc}A_{b\mu}A_{c\nu}, \tag{48.11} \]

recognizing the Yang-Mills field theory equations,² where we have used, in the process, the anti-symmetry property of $f^{abc}$. The analysis was based on the Lagrangian density of free massless vector fields and an interaction term which gives rise to the three-vector-particle vertex as obtained from string theory, followed by an iteration. No non-abelian gauge invariance argument was used to obtain the expression for the effective action in (48.10).³ A similar analysis will be applied in Chap. 69, using the three-graviton vertex obtained from string theory in Eq. (47.43) of Chap. 47, to show how Einstein's General Relativity may be obtained from string theory, after the reader has acquired a good understanding of Einstein's theory in the chapters that follow.
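The kinematic statements behind the vertex in (48.3) and (48.9) can be checked numerically. The sketch below is illustrative only and rests on stated assumptions: D = 4 with signature (-,+,+,+), randomly sampled momenta that are not null (the identities tested use only momentum conservation and e_j·k_j = 0, not k_j² = 0), and the Levi-Civita symbol standing in for the totally anti-symmetric f^{abc}. All function names are hypothetical. It checks that the square bracket of (48.3), contracted with polarization vectors, equals -2 times the leading part of (47.14), that the latter is anti-symmetric under exchange of two particles, and that the vertex with f^{abc} attached is Bose symmetric.

```python
import random

ETA = (-1.0, 1.0, 1.0, 1.0)  # assumed signature (-,+,+,+); D = 4 for illustration only

def dot(u, v):
    """Minkowski inner product u.v."""
    return sum(g * a * b for g, a, b in zip(ETA, u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def random_vector(rng):
    return tuple(rng.uniform(-1.0, 1.0) for _ in ETA)

def transverse(e, k):
    """Project e so that the result is orthogonal to k (needs k.k != 0)."""
    c = dot(e, k) / dot(k, k)
    return tuple(a - c * b for a, b in zip(e, k))

def sample_kinematics(seed=0):
    """Pairs (e_j, k_j) with k1 + k2 + k3 = 0 and e_j.k_j = 0 (non-null k_j assumed)."""
    rng = random.Random(seed)
    while True:
        k1, k2 = random_vector(rng), random_vector(rng)
        k3 = tuple(-(a + b) for a, b in zip(k1, k2))
        if all(abs(dot(k, k)) > 0.5 for k in (k1, k2, k3)):
            break
    return [(transverse(random_vector(rng), k), k) for k in (k1, k2, k3)]

def leading_amplitude(p1, p2, p3):
    """Leading part of (47.14): e2.k1 e1.e3 + e3.k2 e2.e1 + e1.k3 e3.e2."""
    (e1, k1), (e2, k2), (e3, k3) = p1, p2, p3
    return (dot(e2, k1) * dot(e1, e3) + dot(e3, k2) * dot(e2, e1)
            + dot(e1, k3) * dot(e3, e2))

def vertex_bracket(p1, p2, p3):
    """Square bracket of (48.3) contracted with the three polarization vectors."""
    (e1, k1), (e2, k2), (e3, k3) = p1, p2, p3
    return (dot(e1, e2) * dot(e3, sub(k1, k2))
            + dot(e3, e1) * dot(e2, sub(k3, k1))
            + dot(e2, e3) * dot(e1, sub(k2, k3)))

def levi_civita(a, b, c):
    """Totally anti-symmetric symbol on labels 0, 1, 2 (stand-in for f^{abc})."""
    return (a - b) * (b - c) * (c - a) // 2

def coloured_vertex(labels, particles):
    """f^{a1 a2 a3} times the contracted bracket; should be Bose symmetric."""
    return levi_civita(*labels) * vertex_bracket(*particles)
```

Exchanging two particles flips the sign of both `levi_civita` and `vertex_bracket`, so `coloured_vertex` is unchanged, which is the Bose symmetry invoked in the text.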

Reference
1. Manoukian, E. B. (2016). Quantum field theory II: Introductions to quantum gravity, supersymmetry and string theory. Switzerland: Springer.

² In earlier chapters (see, e.g., Chap. 17) we have conveniently denoted $F_{a\mu\nu}$ by $G_{a\mu\nu}$.
³ A general reference to this chapter is Manoukian [1].

Gravity and Spacetime

49

Unlike all the interactions we have described in earlier chapters, the gravitational one is a universal interaction experienced by all particles, massive or massless, due to their energy content. Although the gravitational coupling is much weaker than the other couplings, such as the electromagnetic, the Fermi and the QCD ones, it plays a prominent role in describing nature at large distances in our solar system and in describing the fate of the Universe, not to mention the unique role it plays right here on Earth in our everyday lives. Moreover, it is expected that the strengths of the other interactions will be comparable to the strength of gravitation at very high energies of the order of $10^{19}$ GeV, or perhaps even lower, at which quantum gravity will be significant in a unified quantum field theory description of all the interactions in Nature, including gravitation. The present chapter and a few of the following ones are concerned with Einstein's General Relativity (GR) theory¹ as a theory of gravitation which goes far beyond Newton's theory of gravitation. It has been successfully applied to actual experiments, many of which will be considered later, for which Newton's theory is not adequate. GR has been quite rich in its predictions, such as black holes, white holes, wormholes, gravitational waves, and even that spacetime itself rotates around a rotating massive body. The latter, referred to as the frame-dragging effect, has been verified in an experiment in 2011 involving our (rotating) Mother Earth.² Moreover, GR plays a unique role in describing the evolution of the Universe, as we will see in later chapters. The underlying theory of GR, as well as many of its predictions, including the ones just mentioned, will be developed in detail in subsequent chapters. In particular, applications will be given to gravitational waves (Chap. 52), perihelion precession of a planetary orbit (Chap. 56), deflection of light by the Sun (Chap. 57), gravitational lensing (Chap. 57), time delay in the round-trip travel time for radar signals reflecting off other planets (Chap. 58), many aspects of black hole physics including non-rotating black holes (Chaps. 60–66), wormholes (Chaps. 62, 63), rotating black holes (Chaps. 64–66), frame-dragging (Chap. 67), curvature (geodesic effect) of spacetime due to Earth, thermodynamics of black holes (Chap. 68), and the development of the theory of supergravity (Chap. 70). Quantum aspects and cosmological studies will follow later. In particular, in Chap. 69, we will learn the very interesting aspect of how GR may be recovered from string theory. As a guiding principle and a rule of thumb, GR theory treats Newton's theory as a limiting case of a very weak gravitational interaction involving slowly moving bodies. In the GR theory, one assumes the validity of the Weak Principle of Equivalence (WPE), which states that the inertial mass $m_I$ of a body and its gravitational mass $m_G$ are equal: $m_I = m_G$. The inertial mass $m_I$ is a measure of the resistance of a body to a change of its state of motion: the larger it is, the smaller is the acceleration $\mathbf{a}$ produced by a given force $\mathbf{F}$ acting on the body. It is defined through the equation $\mathbf{F} = m_I\,\mathbf{a}$. On the other hand, the gravitational force on a body is proportional to its gravitational mass. We may express this in terms of Newton's law of gravitation as follows: if $m_G$ and, say, $M$ denote the gravitational masses of two bodies, then the force of attraction between the two bodies is proportional to the product $m_G M$, and, in particular, the equation of motion of the particle of gravitational mass $m_G$ is given by $m_I\,\mathbf{a} = -(GMm_G/r^2)\,\mathbf{r}/r$, with $m_I = m_G$. The latter implies that $\mathbf{a} = -(GM/r^2)\,\mathbf{r}/r$. This interesting property, in turn, says that different objects fall at the same rate in a pure gravitational field, i.e., they all have the same acceleration.
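The cancellation of the mass in $\mathbf{a} = -(GM/r^2)\,\mathbf{r}/r$ can be illustrated with a short Python sketch. It is not part of the original text: the function names are hypothetical, the values of G and of the Moon's mass and radius are standard rounded figures inserted here as assumptions, and the two test masses and the drop height are purely illustrative.

```python
import math

G = 6.674e-11      # Newton's constant in m^3 kg^-1 s^-2 (rounded, assumed value)
M_MOON = 7.342e22  # mass of the Moon in kg (rounded, assumed value)
R_MOON = 1.7374e6  # mean radius of the Moon in m (rounded, assumed value)

def acceleration(m_inertial, m_grav, big_m, r):
    """Magnitude of a from m_I a = G M m_G / r^2."""
    force = G * big_m * m_grav / r ** 2
    return force / m_inertial

def fall_time(height, accel):
    """Time to fall from rest through `height` at constant acceleration."""
    return math.sqrt(2.0 * height / accel)

# Two bodies with m_I = m_G (illustrative masses in kg): same acceleration.
a_heavy = acceleration(1.32, 1.32, M_MOON, R_MOON)
a_light = acceleration(0.03, 0.03, M_MOON, R_MOON)

# A hypothetical body violating the WPE by 0.1% would fall differently.
a_violating = acceleration(1.0, 1.001, M_MOON, R_MOON)
```

With $m_I = m_G$ the two accelerations agree (about 1.6 m/s² near the lunar surface with the values above), so `fall_time` gives the same result for both bodies dropped from the same height, whereas the hypothetical violating body acquires a slightly different acceleration.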
The weak principle of equivalence is attributed to Galileo, who supposedly claimed its validity for different particles thrown from the tower of Pisa. Astronaut David Scott verified Galileo's claim that the acceleration is the same for a hammer and a feather, which were released simultaneously from a given height on the Moon (which is essentially in a vacuum), during the Apollo 15 mission in 1971. The early classical test of the weak equivalence principle (WEP) is due to the Austro-Hungarian physicist Roland von Eötvös in about 1890.³ For relevant modern tests see, e.g., the collection of papers on the WEP edited by C. C. Speake and C. M. Will.⁴

¹ Einstein [3, 4], translations of which may be found in the books: Lorentz et al. [8, 9].
² Everitt et al. [7].

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_49

A central role in the development of GR by Einstein, the creator of the theory himself, was played by the so-called Einstein's Equivalence Principle (EEP), also referred to simply as the Principle of Equivalence (PE). The principle states that: "At any point in spacetime one may set up locally a Lorentz frame such that all the (non-gravitational) laws of physics take on their special relativistic forms at the point in question". For example, in generalizing Maxwell's equations in the presence of gravitation, at any given point in spacetime one may set up a Lorentz frame such that, at the point in question, Maxwell's equations are given by the special relativistic forms we have encountered earlier in Chap. 14.⁵

In order to develop the GR theory, we also need the Principle of General Covariance. This means that physical laws must take the same mathematical form in all coordinate systems, and hence that they are to be expressed in terms of vectors and tensors. Accordingly, physical laws expressed in different coordinate systems must take the same form, with just their variables, components, ... simply relabeled. This, in turn, requires that when one introduces the derivative of a vector, one must make sure that it defines a tensor, i.e., that it transforms as a tensor under coordinate transformations.⁶

Before going further into the meaning of the equivalence principle and the development of the GR theory, consider a simple example of an elevator whose cable has been cut and which is falling freely in the gravitational field of the Earth, with the latter assumed to be non-rotating, as shown here:

For a frame set up and fixed in the falling elevator shown on the left-hand side, a test particle placed at the origin of this coordinate system will remain at rest and appear weightless, while for an observer outside the elevator the particle is accelerating in the gravitational field. On the other hand, for a frame set up and fixed in the falling elevator on the right-hand side, a test particle placed at the origin of this coordinate system will eventually move away from it, approaching the center line due to the attraction directed toward the center of the Earth. Thus in the latter case the test particle may be taken to be approximately at rest only in a very small region of space and for a very short time, depending on the accuracy one is seeking. The general, basic rule is then that at every point in spacetime, well before a particle falls to the surface of the Earth, a coordinate system may be set up in which locally, and only locally, i.e., only at the point in question, a particle is considered at rest with the gravitational force wiped out. In a limiting sense, according to the equivalence principle (EP), at every point in spacetime we may set up a local Lorentz frame (LF) such that, at the point in question, the (non-gravitational) laws of physics are given by their special relativistic flat-space forms. The role of gravitation is then taken into account by comparing such LFs through the way they are infinitesimally related to one another. To account for gravity, as just described, one clearly needs a structure to tell us how such coordinates, associated with the LFs set up at infinitesimally close points, may be arranged relative to each other. By doing this, one is able to enmesh non-gravitational laws with gravity by the EP. The structure in question is referred to as the connection. The connection allows one to compare the flat coordinate systems with infinitesimally close origins lying, for example, on given curves parametrized by a given variable $\lambda$, as formally seen here on some curved surfaces:⁷

³ Eötvös, R. V. [6]. For an English version see: Eötvös, L. [5].
⁴ [11].
⁵ Some pertinent tests of the equivalence principle are given in these useful references: Misner et al. [10], Sect. 38.6; Turyshev [12]; Will [13]; Di Casola et al. [2]; Arai et al. [1].
⁶ Under a coordinate transformation $\{x^\mu\} \to \{x'^\mu\}$, a tensor $T^{\mu_1\ldots\mu_n}$ transforms as $T'^{\,\nu_1\ldots\nu_n}(x') = \dfrac{\partial x'^{\nu_1}}{\partial x^{\mu_1}}\cdots\dfrac{\partial x'^{\nu_n}}{\partial x^{\mu_n}}\,T^{\mu_1\ldots\mu_n}(x)$ by the chain rule. In particular, if all the components of a tensor vanish in one coordinate system, then they also vanish in every coordinate system.
⁷ It is important to realize that it is not necessary to consider such curved structures to be embedded in a higher dimensional space. It just helps one to visualize the situations.


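[Fig. 49.1 artwork: a curve parametrized by $\lambda$ on a curved surface with coordinates $x^1(\lambda)$, $x^2(\lambda)$; a LF with basis vectors $e_1$, $e_2$ is set up at a point $\lambda$ on the curve; $e_\mu = \{e^\alpha{}_\mu,\ \alpha = 0, 1, 2, 3\}$.]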

Fig. 49.1 For a given curve traced by a series of events in spacetime, parametrized by a variable, say, λ, a coordinate description is provided by setting a LF at a given fixed point λ corresponding to a given event

Because of the very appealing universal character of the interaction of gravitation with all particles, one may consider the metric, in general, as a property of spacetime itself. At any point $x^\mu$, corresponding to a coordinate labeling of a given event in spacetime, we may set up a LF. Now referring to Fig. 49.1, if the spacetime is flat, then the coordinate axes $x^\mu(\lambda)$ may be taken to coincide with the axes of the LF thus erected. That is, the coordinate axes of the LF will just read: $x^\alpha = \delta^\alpha{}_\mu\, x^\mu$, $\alpha = 0, 1, 2, 3$. Since we will be working in curved spacetime as well as in flat Minkowski spacetime, associated with LFs, it is important to distinguish Lorentz coordinate labels from those of curved spacetime. We, conveniently, use the first few letters $\alpha, \beta, \gamma, \ldots$ of the Greek alphabet for the former coordinates, and letters from about the middle to the end of the Greek alphabet for the latter, such as $\mu, \nu, \xi, \ldots, \rho, \sigma$. Accordingly, in general, we need local structures $e^\alpha{}_\mu(x)$ at a given point $x$, as a generalization of $\delta^\alpha{}_\mu$ corresponding to flat spacetime, to refer the components $V^\mu(x)$ of a vector at the point $x$ to its components $V^\alpha(x)$ in the LF, given by the transformation rule

$$V^\alpha(x) = V^\mu(x)\, e^\alpha{}_\mu(x), \qquad \alpha = 0, 1, 2, 3. \tag{49.1}$$

The fields $e^\alpha{}_\mu(x)$ are referred to as tetrad or vierbein fields. The index $\mu$ specifies the different vector fields in the curvilinear coordinate system, while $\alpha$ specifies the $\alpha$th component of any of these vectors in the local Lorentz coordinate system set up at $x$. A vector field $\mathbf{V}(\lambda)$, with components $V^\alpha(x)$, $\alpha = 0, 1, 2, 3$, in the local Lorentz coordinate system set up at a point $\lambda$ on a curve parametrized by $\lambda$, with coordinate labels $x^\mu(\lambda)$ in curved spacetime, may then be expressed as follows:

$$\mathbf{V}(\lambda) = V^\mu(x)\, e_\mu(x), \qquad e_\mu(x) = \{e^\alpha{}_\mu(x)\}, \qquad V^\alpha(x) = V^\mu(x)\, e^\alpha{}_\mu(x), \tag{49.2}$$

and define the scalar product of $\mathbf{V}$ with itself as

$$\mathbf{V}(\lambda) \cdot \mathbf{V}(\lambda) = V^\mu(x) V^\nu(x)\, e_\mu(x) \cdot e_\nu(x) = V^\mu(x) V^\nu(x)\, \eta_{\alpha\beta}\, e^\alpha{}_\mu(x)\, e^\beta{}_\nu(x), \tag{49.3}$$

where $\eta_{\alpha\beta}$ is the Minkowski metric. This allows one to introduce a metric $g_{\nu\sigma}(x)$ to lower one of the indices, say, of $V^\nu$, and, in turn, define the scalar product of two vectors $\mathbf{V}_1$, $\mathbf{V}_2$, at spacetime point $x$, as follows:

$$e_\mu(x) \cdot e_\nu(x) = g_{\mu\nu}(x), \qquad V^\nu(x)\, g_{\nu\sigma}(x) = V_\sigma(x), \qquad \mathbf{V}_1 \cdot \mathbf{V}_2 = V_1^{\mu}(x)\, V_{2\mu}(x). \tag{49.4}$$

The symmetry of the product $e_\mu(x) \cdot e_\nu(x)$ implies the symmetry in the two indices of the metric, $g_{\mu\nu}(x) = g_{\nu\mu}(x)$. In a local Lorentz coordinate system set up at $x$, the above considerations allow one to define the scalar product of two vectors as in special relativity from the following chain of equalities

$$e_\mu(x) \cdot e_\nu(x) = e^\alpha{}_\mu(x)\, e^\beta{}_\nu(x)\, \eta_{\alpha\beta}, \tag{49.5}$$

$$\mathbf{V}_1 \cdot \mathbf{V}_2 = V_1^\mu(x) V_2^\nu(x)\, e^\alpha{}_\mu(x)\, e^\beta{}_\nu(x)\, \eta_{\alpha\beta} = V_1^\alpha(x)\, V_2^\beta(x)\, \eta_{\alpha\beta}, \tag{49.6}$$

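[Fig. 49.2 artwork: a curve on the sphere with the basis vectors $e_{1\theta}, e_{1\phi}$ at the point $\lambda$ and $e_{2\theta}, e_{2\phi}$ at the point $\lambda + d\lambda$. Relations shown in the figure: $e_\theta = r(\cos\theta\cos\phi,\ \cos\theta\sin\phi,\ -\sin\theta)$, $e_\phi = r(-\sin\theta\sin\phi,\ \sin\theta\cos\phi,\ 0)$, $de_\theta/d\lambda = \cot\theta\, e_\phi\, d\phi/d\lambda$, $de_\phi/d\lambda = \cot\theta\, e_\phi\, d\theta/d\lambda - \sin\theta\cos\theta\, e_\theta\, d\phi/d\lambda$, with $de_\theta/d\lambda$, $de_\phi/d\lambda$ denoting components restricted to the surface of the sphere.]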

Fig. 49.2 The derivatives of the basis vectors eθ and eφ constrained to the surface of the sphere. Details of these are given in Box 49.1 at the end of the chapter

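as expected. In order to define the concept of a derivative, in general, it is most instructive to consider first the geometry of the surface of a sphere, say, of radius $r$, as a constructive illustration. To this end, at any point on the sphere we may introduce two basis vectors $e_\theta$, $e_\phi$ as shown in Fig. 49.2, with motion restricted to the surface of the sphere. For the underlying algebra see Box 49.1 at the end of the chapter. In reference to the surface of the sphere above, a vector $\mathbf{V}$ defined at a point $\lambda$, with coordinate labels $\theta(\lambda)$, $\phi(\lambda)$, together with its derivative, may be written as

$$\mathbf{V}(\lambda) = V^\theta(\lambda)\, e_\theta(\lambda) + V^\phi(\lambda)\, e_\phi(\lambda), \tag{49.7}$$

$$\frac{d\mathbf{V}}{d\lambda} = \frac{dV^\theta}{d\lambda}\, e_\theta + V^\theta \frac{de_\theta}{d\lambda} + \frac{dV^\phi}{d\lambda}\, e_\phi + V^\phi \frac{de_\phi}{d\lambda} = \left(\frac{dV^\theta}{d\lambda} - \sin\theta\cos\theta\, V^\phi \frac{d\phi}{d\lambda}\right) e_\theta + \left(\frac{dV^\phi}{d\lambda} + \cot\theta\, V^\theta \frac{d\phi}{d\lambda} + \cot\theta\, V^\phi \frac{d\theta}{d\lambda}\right) e_\phi. \tag{49.8}$$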

That is, we learn that the components of the derivative of the vector $\mathbf{V}$ in the basis $e_\theta$, $e_\phi$ are not simply given by $dV^\theta/d\lambda$, $dV^\phi/d\lambda$, but involve additional terms due to the curved nature of the surface of a sphere. The basic idea in defining derivatives of a vector, in general, is now easy. We note that $e_\mu(x(\lambda + d\lambda)) - e_\mu(x(\lambda))$ must vanish for $dx^\mu(\lambda) \to 0$. Moreover, it may be expanded in terms of the basis vectors $e_\sigma(x(\lambda))$. That is, the derivative of $e_\mu(x(\lambda))$ with respect to $\lambda$ may be written as

$$\frac{de_\mu(x(\lambda))}{d\lambda} = \Gamma_{\nu\mu}{}^{\sigma}(x(\lambda))\, e_\sigma(x(\lambda))\, \frac{dx^\nu(\lambda)}{d\lambda}, \tag{49.9}$$

where the totality of the expansion coefficients $\{\Gamma_{\nu\mu}{}^{\sigma}(x)\}$ is called the connection, whose interpretation will be spelled out below. The above equation may be equally rewritten as

$$\left[\partial_\nu e_\mu(x) - \Gamma_{\nu\mu}{}^{\sigma}\, e_\sigma(x)\right] \frac{dx^\nu(\lambda)}{d\lambda} = 0. \tag{49.10}$$

From Eq. (49.9), and the fact that $\mathbf{V}(\lambda) = V^\mu(x)\, e_\mu$, we may infer that the derivative of a vector field is given by

$$\frac{d\mathbf{V}(\lambda)}{d\lambda} = \left[\frac{dV^\sigma(x)}{d\lambda} + \Gamma_{\nu\mu}{}^{\sigma}(x)\, V^\mu \frac{dx^\nu(\lambda)}{d\lambda}\right] e_\sigma(x) \equiv \frac{DV^\sigma(x)}{d\lambda}\, e_\sigma(x), \tag{49.11}$$

where we have used the notation $DV^\sigma/d\lambda$, in the latter equation, in order not to confuse it with $dV^\sigma/d\lambda$. A concept which is important in understanding the meaning of the derivative of a vector in curved spacetime is the parallel transfer of a vector $\mathbf{V}$ along a curve. For a curve specified, say, by a parameter $\lambda$, with coordinate variables $x^\mu(\lambda)$, it may simply be defined, as $\lambda$ varies to $\lambda + \delta\lambda$, by


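$$\mathbf{V}(\lambda) = \mathbf{V}(\lambda + \delta\lambda) \ \Rightarrow\ V^\mu(\lambda)\, e_\mu(\lambda) = V^\mu(\lambda + \delta\lambda)\, e_\mu(\lambda + \delta\lambda), \tag{49.12}$$

and to first order in $\delta\lambda$ this gives

$$V^\mu(\lambda)\, e_\mu(\lambda) = V^\mu(\lambda)\, e_\mu(\lambda) + \left[\frac{dV^\mu}{d\lambda}\, e_\mu(\lambda) + V^\mu(\lambda) \frac{de_\mu(\lambda)}{d\lambda}\right] \delta\lambda, \tag{49.13}$$

$$\Rightarrow\ \frac{dV^\nu}{d\lambda} + \Gamma_{\mu\sigma}{}^{\nu}\, V^\mu \frac{dx^\sigma}{d\lambda} = 0, \qquad \partial_\sigma V^\nu(x) + \Gamma_{\mu\sigma}{}^{\nu}(x)\, V^\mu(x) = 0, \tag{49.14}$$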

where we have used Eq. (49.9). That is, a vector $V^\nu$ is parallel transferred along a curve if all the components of its so-called covariant derivative, defined by $\partial_\sigma V^\nu(x) + \Gamma_{\mu\sigma}{}^{\nu}(x)\, V^\mu(x)$ and considered further below, are zero. We also learn that in a local Lorentz frame set up at $x$, a parallel transferred vector has constant components: $\partial_\sigma V^\nu \big|_{\rm Loc.\,LF} = 0$. We note that the components $V_{\parallel}^{\mu}(x + dx) = V^\mu(x) + \delta V^\mu(x)$ of a vector parallel transferred from $x$ to $x + dx$ may be defined through the chain of equalities:

$$\partial_\nu V^\mu\, dx^\nu = V^\mu(x + dx) - V^\mu(x), \tag{49.15}$$

$$\left[\partial_\nu V^\mu + \Gamma_{\nu\sigma}{}^{\mu}\, V^\sigma\right] dx^\nu = V^\mu(x + dx) - \left[V^\mu(x) - \Gamma_{\nu\sigma}{}^{\mu}(x)\, V^\sigma(x)\, dx^\nu\right], \tag{49.16}$$

$$V^\mu{}_{;\nu}\, dx^\nu = V^\mu(x + dx) - V_{\parallel}^{\mu}(x + dx), \tag{49.17}$$

$$V_{\parallel}^{\mu}(x + dx) = V^\mu(x) - \Gamma_{\nu\sigma}{}^{\mu}(x)\, V^\sigma(x)\, dx^\nu, \tag{49.18}$$

where $V^\mu{}_{;\nu} = \partial_\nu V^\mu + \Gamma_{\nu\sigma}{}^{\mu}\, V^\sigma$.

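We must now consistently check that the derivative of a vector just given in (49.11), and spelled out below, defines a second rank tensor, whose properties will be inferred and spelled out. We note that

$$\frac{DV^\sigma(x)}{d\lambda} = \frac{dV^\sigma(x)}{d\lambda} + \Gamma_{\nu\mu}{}^{\sigma}(x)\, V^\mu \frac{dx^\nu(\lambda)}{d\lambda}, \tag{49.19}$$

$$\partial_\nu V^\sigma(x) + \Gamma_{\nu\mu}{}^{\sigma}(x)\, V^\mu(x) = V^\sigma{}_{;\nu}(x) \equiv \nabla_\nu V^\sigma(x), \tag{49.20}$$

where $V^\sigma{}_{;\nu}(x)$, also denoted by $\nabla_\nu V^\sigma(x)$, defines the covariant derivative of $V^\sigma$; both are standard notations for it. One may define the displacement vector $d\mathbf{s}$ and the line element as follows:

$$d\mathbf{s} = dx^\mu\, e_\mu \ \Rightarrow\ d\mathbf{s} \cdot d\mathbf{s} = g_{\mu\nu}\, dx^\mu dx^\nu, \tag{49.21}$$

where we have used the relation $e_\mu \cdot e_\nu = g_{\mu\nu}$ in Eq. (49.4). Using the invariance of the scalar product of a vector $\mathbf{V}$ and $d\mathbf{s}$ under coordinate transformations,

$$\mathbf{V} \cdot d\mathbf{s} = V'^\nu(x')\, dx'_\nu = V^\mu(x)\, dx_\mu, \tag{49.22}$$

$$dx'_\nu\, dx'^\nu = dx_\mu\, dx^\mu = dx_\mu \frac{\partial x^\mu}{\partial x'^\nu}\, dx'^\nu \ \Rightarrow\ dx'_\nu = dx_\mu \frac{\partial x^\mu}{\partial x'^\nu}, \tag{49.23}$$

gives the following transformation of a vector component under coordinate transformations:

$$V^\mu(x) = \frac{\partial x^\mu}{\partial x'^\nu}\, V'^\nu(x'), \quad \text{and similarly} \quad V_\mu(x) = \frac{\partial x'^\nu}{\partial x^\mu}\, V'_\nu(x'), \tag{49.24}$$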

with similar transformations of tensors obtained simply from the transformation rules of products of vector components. That is, $V^\mu{}_{;\nu}(x)$ defines a (second rank) tensor if it satisfies the transformation rule $V^\mu{}_{;\nu}(x) = (\partial x^\mu/\partial x'^\lambda)(\partial x'^\sigma/\partial x^\nu)\, V'^\lambda{}_{;\sigma}(x')$ under a coordinate transformation, say, $x \to x'$. In particular, these transformation rules imply that if a vector or a tensor is zero in one coordinate system then it is zero in every coordinate system. It is readily verified that in order that $V^\mu{}_{;\nu}$ transform as the components of a second rank tensor under a coordinate transformation, say, $x \to x'$, the components of the connection must, in turn, transform as

$$\Gamma'_{\rho\nu}{}^{\xi} = \frac{\partial x'^\xi}{\partial x^\sigma} \frac{\partial x^\mu}{\partial x'^\rho} \frac{\partial x^\tau}{\partial x'^\nu}\, \Gamma_{\mu\tau}{}^{\sigma} + \frac{\partial x'^\xi}{\partial x^\sigma} \frac{\partial^2 x^\sigma}{\partial x'^\rho\, \partial x'^\nu}. \tag{49.25}$$


The connection, in turn, does not transform as the components of a tensor, due to the presence of the second term on the right-hand side of the above equation. On the other hand the combinations

$$\Gamma'_{\rho\nu}{}^{\xi} - \Gamma'_{\nu\rho}{}^{\xi} = \frac{\partial x'^\xi}{\partial x^\sigma} \frac{\partial x^\mu}{\partial x'^\rho} \frac{\partial x^\tau}{\partial x'^\nu} \left[\Gamma_{\mu\tau}{}^{\sigma} - \Gamma_{\tau\mu}{}^{\sigma}\right], \tag{49.26}$$

transform as tensor components. The importance of this equation is that, according to the EP, at any point in spacetime a LF may be chosen so that at the point in question the laws of special relativity hold. This means that at such a point covariant derivatives become just partial derivatives and the connection vanishes. Since the above combination is a tensor and vanishes at such a point in the LF, it vanishes in every coordinate system. That is,

$$\Gamma_{\rho\nu}{}^{\xi} = \Gamma_{\nu\rho}{}^{\xi}, \tag{49.27}$$

is symmetric in its lower two indices as a consequence of the EP. It is readily checked that the partial derivative $\partial\varphi(x)/\partial x^\mu$ of a scalar $\varphi(x)$ defines the components of a vector. On the other hand, since $V^\mu V_\mu$ is a scalar, and hence $\partial_\nu (V^\mu V_\mu)$ defines vector components, this implies that

$$V_{\mu;\nu} = \partial_\nu V_\mu(x) - \Gamma_{\nu\mu}{}^{\sigma}(x)\, V_\sigma(x) \equiv \nabla_\nu V_\mu(x), \tag{49.28}$$

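defining the expression for the covariant derivative of $V_\mu$. The components $V^\mu$ and $V_\mu$ are often referred to as contravariant and covariant components of a vector, especially in the older literature. By considering tensor components as products of vector components, as far as their covariant derivatives are concerned, the explicit expressions of the covariant derivatives of the following tensors become obvious$^{8}$:

$$\nabla_\sigma T^{\mu\nu} = \partial_\sigma T^{\mu\nu} + \Gamma_{\sigma\rho}{}^{\mu}\, T^{\rho\nu} + \Gamma_{\sigma\rho}{}^{\nu}\, T^{\mu\rho}, \tag{49.29}$$

$$\nabla_\sigma T_{\mu\nu} = \partial_\sigma T_{\mu\nu} - \Gamma_{\sigma\mu}{}^{\rho}\, T_{\rho\nu} - \Gamma_{\sigma\nu}{}^{\rho}\, T_{\mu\rho}, \tag{49.30}$$

$$\nabla_\sigma T^{\mu}{}_{\nu} = \partial_\sigma T^{\mu}{}_{\nu} + \Gamma_{\sigma\rho}{}^{\mu}\, T^{\rho}{}_{\nu} - \Gamma_{\sigma\nu}{}^{\rho}\, T^{\mu}{}_{\rho}, \tag{49.31}$$

$$\nabla_\sigma T_{\mu}{}^{\nu} = \partial_\sigma T_{\mu}{}^{\nu} + \Gamma_{\sigma\rho}{}^{\nu}\, T_{\mu}{}^{\rho} - \Gamma_{\sigma\mu}{}^{\rho}\, T_{\rho}{}^{\nu}. \tag{49.32}$$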

Finally, $g_{\mu\nu}\, V^\mu{}_{;\sigma}(x) = V_{\nu;\sigma}(x)$, as lowering the index $\mu$ of the components of a tensor, implies that

$$g_{\mu\nu;\sigma} = 0 \ \Rightarrow\ \partial_\sigma g_{\mu\nu} \Big|_{\rm LF} = 0. \tag{49.33}$$

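The latter means that at any point $x$ we may set up a LF such that at the point in question we encounter the welcome property of the vanishing of the derivative of the metric, $\partial_\sigma g_{\mu\nu} = 0$, as in flat spacetime. This, however, does not imply the vanishing of the higher derivatives of the metric at $x$. Using the fact that the covariant derivative of the metric vanishes, we may, in particular, write the equation

$$g_{\mu\sigma;\nu} + g_{\sigma\nu;\mu} - g_{\nu\mu;\sigma} = 0, \tag{49.34}$$

in which each term vanishes separately, and the relative minus sign of the last term is what isolates a single connection coefficient. Using the definition of the covariant derivative of a second rank tensor in (49.30), together with the symmetry (49.27), the above equation readily gives rise to an explicit expression for the connection

$$\Gamma_{\mu\nu}{}^{\rho} = \frac{1}{2}\, g^{\rho\sigma} \left[\partial_\mu g_{\nu\sigma} + \partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu}\right]. \tag{49.35}$$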

It is interesting to note that, with partial derivatives replaced by covariant derivatives, Maxwell's equations given in Eq. (14.7) of Chap. 14 simply take the following form in curved spacetime:

$$F^{\mu\nu}{}_{;\mu}(x) = -\frac{1}{c}\, J^\nu(x), \tag{49.36}$$

$$F_{\mu\nu;\sigma}(x) + F_{\sigma\mu;\nu}(x) + F_{\nu\sigma;\mu}(x) = 0. \tag{49.37}$$

8 Note that covariant differentiations are non-commutative.


These equations hold true in different coordinate systems, with their variables simply relabeled.

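Box 49.1 Geometry on the Surface of a Sphere

$e_r = r(\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta)$, $\quad e_\theta = r(\cos\theta\cos\phi,\ \cos\theta\sin\phi,\ -\sin\theta)$, $\quad e_\phi = r(-\sin\theta\sin\phi,\ \sin\theta\cos\phi,\ 0)$.

$e_\mu \cdot e_\nu = g_{\mu\nu}$, $\quad e_\theta \cdot e_\theta = r^2$, $\quad e_\phi \cdot e_\phi = r^2\sin^2\theta$, $\quad e_\theta \cdot e_\phi = 0$.

$[g_{\mu\nu}] = {\rm diag}\,[r^2,\ r^2\sin^2\theta]$, $\quad [g^{\mu\nu}] = {\rm diag}\,[1/r^2,\ 1/(r^2\sin^2\theta)]$, $\quad ds^2 = r^2 (d\theta)^2 + r^2\sin^2\theta\, (d\phi)^2$.

$de_\theta = r(-\sin\theta\cos\phi,\ -\sin\theta\sin\phi,\ -\cos\theta)\, d\theta + r(-\cos\theta\sin\phi,\ \cos\theta\cos\phi,\ 0)\, d\phi = -e_r\, d\theta + \cot\theta\, e_\phi\, d\phi$,

$\Rightarrow\ de_\theta = \cot\theta\, e_\phi\, d\phi$ restricted to the surface.

$de_\phi = r(-\cos\theta\sin\phi,\ \cos\theta\cos\phi,\ 0)\, d\theta + r(-\sin\theta\cos\phi,\ -\sin\theta\sin\phi,\ 0)\, d\phi = \cot\theta\, e_\phi\, d\theta - \sin^2\theta\, e_r\, d\phi - \sin\theta\cos\theta\, e_\theta\, d\phi$,

$\Rightarrow\ de_\phi = \cot\theta\, e_\phi\, d\theta - \sin\theta\cos\theta\, e_\theta\, d\phi$ restricted to the surface,

where we have used the fact that $r\sin\theta\,(\cos\phi,\ \sin\phi,\ 0) = \sin^2\theta\, e_r + \sin\theta\cos\theta\, e_\theta$.

$de_\mu = \Gamma_{\mu\nu}{}^{\sigma}\, e_\sigma\, dx^\nu$, $\quad \Gamma_{\theta\phi}{}^{\phi} = \Gamma_{\phi\theta}{}^{\phi} = \cot\theta$, $\quad \Gamma_{\phi\phi}{}^{\theta} = -\sin\theta\cos\theta$, and all other components of the connection are equal to zero $[dx^\theta \equiv d\theta,\ dx^\phi \equiv d\phi]$.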
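As a cross-check of Box 49.1, the connection on the sphere can also be obtained directly from the metric through Eq. (49.35). The following is only a minimal numerical sketch: the function names, the finite-difference step and the index convention (0 for $\theta$, 1 for $\phi$) are illustrative choices, not part of the text.

```python
import math

def metric(theta, phi, r=1.0):
    """Metric g_{mu nu} on a sphere of radius r in coordinates (theta, phi)."""
    return [[r * r, 0.0], [0.0, (r * math.sin(theta)) ** 2]]

def inverse_diag(g):
    """Inverse of a diagonal 2x2 metric."""
    return [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]

def d_metric(x, sigma, h=1e-6):
    """Central-difference estimate of d g_{mu nu} / d x^sigma at the point x."""
    xp, xm = list(x), list(x)
    xp[sigma] += h
    xm[sigma] -= h
    gp, gm = metric(*xp), metric(*xm)
    return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)] for a in range(2)]

def christoffel(x):
    """Return gam[mu][nu][rho] = Gamma_{mu nu}^rho evaluated from Eq. (49.35)."""
    ginv = inverse_diag(metric(*x))
    dg = [d_metric(x, s) for s in range(2)]  # dg[s][a][b] = d_s g_{ab}
    gam = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for mu in range(2):
        for nu in range(2):
            for rho in range(2):
                gam[mu][nu][rho] = 0.5 * sum(
                    ginv[rho][s] * (dg[mu][nu][s] + dg[nu][mu][s] - dg[s][mu][nu])
                    for s in range(2))
    return gam

THETA, PHI = 1.0, 0.3          # an arbitrary sample point
GAMMA = christoffel((THETA, PHI))
print("Gamma_{theta phi}^phi :", GAMMA[0][1][1], "vs cot(theta) =", 1.0 / math.tan(THETA))
print("Gamma_{phi phi}^theta :", GAMMA[1][1][0], "vs -sin*cos   =", -math.sin(THETA) * math.cos(THETA))
```

The numerical values agree with $\cot\theta$ and $-\sin\theta\cos\theta$ of Box 49.1 to within the finite-difference error, and they do not depend on the radius $r$.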

References

1. Arai, S., Nitta, D., & Tashiro, H. (2016). Test of the Einstein Equivalence Principle with spectral distortions in the Cosmic Microwave Background. Physical Review, D, 94, 124048 (pp. 6).
2. Di Casola, E., Liberati, S., & Sonego, S. (2015). Nonequivalence of equivalence principles. American Journal of Physics, 83, 39–49.
3. Einstein, A. (1915). Die feldgleichungen der gravitation. Preussischen Akademie der Wissenschaften zu Berlin, Sitzungsberichte (pp. 844–847).
4. Einstein, A. (1916). Die grundlage der allgemeinen relativitätstheorie. Annalen der Physik, 49, 769–822.
5. Eötvös, L. (2008). On the gravitation produced by the Earth on different substances. The Abraham Zelmanov Journal, 1, 6–9.
6. Eötvös, R. V. (1890). Über die anziehung der erde auf verschiedene Substanzen. Mathematische und Naturwissenschaftliche Berichte aus Ungarn, 8, 65–68.
7. Everitt, C. W. F., et al. (2011). Gravity probe B: Final results of a space experiment to test general relativity. Physical Review Letters, 106, 221101.
8. Lorentz, H. A., Einstein, A., Minkowski, H., Weyl, H., & Sommerfeld, A. (1952). The principle of relativity. New York: Dover Publication.
9. Lorentz, H. A., Einstein, A., Minkowski, H., Weyl, H., & Sommerfeld, A. (1953). The meaning of relativity (5th ed.). Princeton: Princeton University Press.
10. Misner, C. W., Thorne, K. S., & Wheeler, J. A. (1971). Gravitation. New York: W. H. Freeman and Company.
11. Speake, C. C., & Will, C. M. (Eds.). (2012). Tests of the weak principle of equivalence. Classical and Quantum Gravity, 29, 180301–184013.
12. Turyshev, S. G. (2008). Experimental tests of general relativity: Recent progress and future directions. Annual Review of Nuclear Science, 58, 207–248.
13. Will, C. M. (2014). The confrontation between general relativity and experiment. Living Reviews in Relativity, 17, 4–117.

50 Spacetime Curvature

Prerequisite Chap. 49

We recall from Eq. (49.14) in the previous chapter that a vector field with components $V^\mu(x)$ is said to be parallel transferred along a curve, parametrized by a variable, say, $\lambda$, with coordinate labels $x^\mu(\lambda)$, if its covariant derivative is zero along the curve, that is

$$\frac{dV^\mu(x)}{d\lambda} + \Gamma_{\sigma\nu}{}^{\mu}(x)\, V^\sigma(x) \frac{dx^\nu}{d\lambda} = 0. \tag{50.1}$$

Of particular importance is the parallel transfer of the tangent vector to the curve, with components $dx^\mu/d\lambda$ (see Eq. (49.21)), along the curve itself. It satisfies the equation

$$\frac{d^2 x^\mu}{d\lambda^2} + \Gamma_{\sigma\nu}{}^{\mu}\, \frac{dx^\sigma}{d\lambda} \frac{dx^\nu}{d\lambda} = 0. \tag{50.2}$$

This equation is referred to as the geodesic equation. In particular, in a LF set up at a point $x$, the connection is zero, which gives, at the point in question in the LF, $d^2 x^\mu/d\lambda^2 = 0$. From the chain rule with $d/d\lambda = (dx^\nu/d\lambda)(\partial/\partial x^\nu)$ this gives

$$\frac{dx^\nu}{d\lambda} \frac{\partial}{\partial x^\nu} \left(\frac{dx^\mu}{d\lambda}\right) = \frac{d^2 x^\mu}{d\lambda^2} = 0, \tag{50.3}$$

which means that $x^\mu(\lambda)$ satisfies the equation of a straight line. The geodesic equation is the corresponding equation of a straight line in curved spacetime. In a geometrical sense gravity is accounted for by the curvature of spacetime, and the departure of the latter from the flat Minkowski spacetime of special relativity. With gravity one associates a curvature to describe the underlying geometry of spacetime, and in this sense gravity and curvature become simply interchangeable words for the same thing. To study the curvature of spacetime we consider two approaches to investigating the local nature of the geometry of spacetime. One is to investigate the deviation between two infinitesimally close geodesics. As a second approach, we investigate how the components of a vector may change if it is parallel transferred around a closed path back to its initial position.

Consider first two test$^{1}$ particles in a gravitational field in Newton's theory of gravitation at positions $\mathbf{r}_1(t)$, $\mathbf{r}_2(t)$, at a given time $t$, as shown on the left-hand side of Fig. 50.1. The gravitational potential is given by $\varphi(r) = -GM/r$, where $G$ is Newton's gravitational constant, and $M$ is the mass of the gravitating body (non-rotating Earth). The force on a particle of unit mass at position $\mathbf{r}$ is given by $-(GM/r^2)(\mathbf{r}/r) = -\nabla\varphi(r)$. Accordingly, the equations of motion of the two test particles are given by:

$$\frac{d^2 \mathbf{r}_1}{dt^2} = -\nabla_1 \varphi(r_1), \qquad \frac{d^2 \mathbf{r}_2}{dt^2} = -\nabla_2 \varphi(r_2), \qquad \text{with } \mathbf{r} = \frac{1}{2}(\mathbf{r}_1 + \mathbf{r}_2),\ \delta\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1 \ \Rightarrow\ \mathbf{r}_1 = \mathbf{r} - \frac{\delta\mathbf{r}}{2},\ \mathbf{r}_2 = \mathbf{r} + \frac{\delta\mathbf{r}}{2}. \tag{50.4}$$

1 By test particles, in physics, it is meant that the interaction between the two particles is neglected in comparison to their interactions with the massive body of mass M.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_50

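[Fig. 50.1 artwork. Left panel: paths $\mathbf{r}_1(t)$, $\mathbf{r}_2(t)$ of the test particles 1 and 2 as functions of the time $t$. Right panel: two neighboring geodesics $x^\mu(\lambda)$ and $x^\mu(\lambda) + \delta x^\mu(\lambda)$ parametrized by $\lambda$.]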

Fig. 50.1 Two test particles in a gravitational field in Newton’s theory of gravitation shown on the left-hand side of the figure, while two infinitesimally close geodesics, in curved spacetime, are depicted on its right-hand side

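For the relative position vector between the two test particles, $\delta\mathbf{r}(t) = \mathbf{r}_2(t) - \mathbf{r}_1(t)$, with infinitesimal components $\delta x^i$, the relative acceleration of the two test particles, derived in Box 50.1, is given by

$$\frac{d^2 \delta x^i}{dt^2} = -\left(\frac{\partial}{\partial x^i} \frac{\partial}{\partial x^j}\, \varphi(r)\right) \delta x^j, \qquad \left(\mathbf{r} = (x^1, x^2, x^3)\right). \tag{50.5}$$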
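Equation (50.5) is easy to check numerically for the point-mass potential $\varphi = -GM/r$. The sketch below compares the exact difference of the two Newtonian accelerations at $\mathbf{r} \pm \delta\mathbf{r}/2$ with the right-hand side of Eq. (50.5); the value of $GM$, the sample point and the function names are illustrative assumptions.

```python
import math

GM = 1.0  # G*M in arbitrary units (illustrative value)

def accel(pos):
    """Newtonian acceleration -grad(phi) for phi = -GM/r."""
    r = math.sqrt(sum(c * c for c in pos))
    return [-GM * c / r**3 for c in pos]

def tidal_matrix(pos):
    """Second derivatives d_i d_j phi = GM (delta_ij / r^3 - 3 x_i x_j / r^5)."""
    r = math.sqrt(sum(c * c for c in pos))
    return [[GM * ((1.0 if i == j else 0.0) / r**3 - 3.0 * pos[i] * pos[j] / r**5)
             for j in range(3)] for i in range(3)]

def relative_accel_exact(center, delta):
    """Acceleration of particle 2 minus that of particle 1, at r -/+ delta/2."""
    r1 = [c - 0.5 * d for c, d in zip(center, delta)]
    r2 = [c + 0.5 * d for c, d in zip(center, delta)]
    return [a2 - a1 for a1, a2 in zip(accel(r1), accel(r2))]

def relative_accel_tidal(center, delta):
    """Right-hand side of Eq. (50.5): -(d_i d_j phi) delta x^j."""
    k = tidal_matrix(center)
    return [-sum(k[i][j] * delta[j] for j in range(3)) for i in range(3)]

CENTER = (1.0, 2.0, 2.0)          # mean position r, with |r| = 3
DELTA = (1e-4, -2e-4, 3e-4)       # small separation delta r
print(relative_accel_exact(CENTER, DELTA))
print(relative_accel_tidal(CENTER, DELTA))
```

Because the two particles sit symmetrically about $\mathbf{r}$, the two results differ only at third order in $\delta\mathbf{r}$.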

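On the other hand, the two geodesic equations for the curves shown on the right-hand side of Fig. 50.1 are given by:

$$\frac{d^2 (x^\mu + \delta x^\mu)}{d\lambda^2} + \Gamma_{\nu\sigma}{}^{\mu}(x + \delta x)\, \frac{d(x^\nu + \delta x^\nu)}{d\lambda} \frac{d(x^\sigma + \delta x^\sigma)}{d\lambda} = 0, \tag{50.6}$$

$$\frac{d^2 x^\mu}{d\lambda^2} + \Gamma_{\nu\sigma}{}^{\mu}(x)\, \frac{dx^\nu}{d\lambda} \frac{dx^\sigma}{d\lambda} = 0. \tag{50.7}$$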

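Upon defining the displacement vector between the two geodesics at point $\lambda$, $\mathbf{P}(\lambda) = \delta x^\mu\, e_\mu(x(\lambda))$, then referring to Box 50.2, and in comparison to Eq. (50.5), we have

$$\frac{d^2 \mathbf{P}(\lambda)}{d\lambda^2} = \frac{D^2 \delta x^\mu}{d\lambda^2}\, e_\mu(x(\lambda)), \quad \text{with components} \tag{50.8}$$

$$\frac{D^2 \delta x^\mu}{d\lambda^2} = R^\mu{}_{\nu\rho\sigma}(x)\, \frac{dx^\nu}{d\lambda} \frac{dx^\rho}{d\lambda}\, \delta x^\sigma, \quad \text{where} \tag{50.9}$$

$$R^\mu{}_{\nu\rho\sigma} = \partial_\rho \Gamma_{\nu\sigma}{}^{\mu} - \partial_\sigma \Gamma_{\nu\rho}{}^{\mu} + \Gamma_{\rho\tau}{}^{\mu}\, \Gamma_{\nu\sigma}{}^{\tau} - \Gamma_{\sigma\tau}{}^{\mu}\, \Gamma_{\nu\rho}{}^{\tau}, \tag{50.10}$$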

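and the latter is referred to as the Riemann Curvature Tensor. If it is not zero, it provides evidence of the presence of gravitation as a manifestation of spacetime curvature, through the observed deviation of two neighboring geodesics. Equation (50.9) is referred to as the equation of geodesic deviation. At a point $x$, one may set up a LF such that the connection is zero but not its derivatives, and the Riemann tensor will not vanish.$^{2}$

Box 50.1 Derivation of Eq. (50.5)

$\dfrac{\partial}{\partial x_1^i} \dfrac{1}{r_1} = -\dfrac{x_1^i}{r_1^3}$, $\quad \dfrac{\partial}{\partial x_2^i} \dfrac{1}{r_2} = -\dfrac{x_2^i}{r_2^3}$, $\quad x_2^i - x_1^i = \delta x^i$, $\quad x_1^i = x^i - \dfrac{\delta x^i}{2}$, $\quad x_2^i = x^i + \dfrac{\delta x^i}{2}$,

$\dfrac{x_2^i}{r_2^3} - \dfrac{x_1^i}{r_1^3} = x^i \left(\dfrac{1}{r_2^3} - \dfrac{1}{r_1^3}\right) + \dfrac{\delta x^i}{2} \left(\dfrac{1}{r_2^3} + \dfrac{1}{r_1^3}\right) \simeq x^i \left(\dfrac{1}{r_2^3} - \dfrac{1}{r_1^3}\right) + \dfrac{\delta x^i}{r^3}$,

$\dfrac{1}{r_1^3} \equiv \dfrac{1}{|\mathbf{r} - \delta\mathbf{r}/2|^3} \simeq \dfrac{1}{r^3} + \dfrac{3\, \mathbf{r} \cdot \delta\mathbf{r}}{2 r^5}$, $\quad \dfrac{1}{r_2^3} \equiv \dfrac{1}{|\mathbf{r} + \delta\mathbf{r}/2|^3} \simeq \dfrac{1}{r^3} - \dfrac{3\, \mathbf{r} \cdot \delta\mathbf{r}}{2 r^5}$,

$\dfrac{x_2^i}{r_2^3} - \dfrac{x_1^i}{r_1^3} \simeq -\dfrac{3\, x^i\, \mathbf{r} \cdot \delta\mathbf{r}}{r^5} + \dfrac{\delta x^i}{r^3} = \dfrac{\partial}{\partial x^j} \left(\dfrac{x^i}{r^3}\right) \delta x^j = -\dfrac{\partial}{\partial x^i} \dfrac{\partial}{\partial x^j} \left(\dfrac{1}{r}\right) \delta x^j$.

2 As a matter of fact, as stated in the previous chapter, a tensor which vanishes in one coordinate system vanishes in every coordinate system.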


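Box 50.2 Geodesic deviation and the Riemann curvature tensor

Upon carrying out a Taylor expansion of Eq. (50.6) about the point $x$, and subtracting Eq. (50.7) from it, we obtain, to first order in $\delta x$,

$$\frac{d^2 (\delta x^\mu)}{d\lambda^2} + \partial_\rho \Gamma_{\nu\sigma}{}^{\mu}(x)\, \delta x^\rho\, \frac{dx^\nu}{d\lambda} \frac{dx^\sigma}{d\lambda} + 2\, \Gamma_{\nu\sigma}{}^{\mu}(x)\, \frac{dx^\nu}{d\lambda} \frac{d\, \delta x^\sigma}{d\lambda} = 0. \tag{$\ast$}$$

$\mathbf{P}(\lambda) = \delta x^\mu\, e_\mu(x(\lambda))$ denotes the displacement vector between the two neighboring geodesics. The relative acceleration between two particles following these geodesics is then given by $d^2\mathbf{P}/d\lambda^2$. Using Eq. ($\ast$) and $\left[\partial_\nu e_\mu(x) - \Gamma_{\nu\mu}{}^{\sigma}(x)\, e_\sigma(x)\right] dx^\nu/d\lambda = 0$, in Eq. (49.10) of Chap. 49, the following equation emerges

$$\frac{d^2 \mathbf{P}(\lambda)}{d\lambda^2} = R^\mu{}_{\nu\rho\sigma}(x)\, \frac{dx^\nu}{d\lambda} \frac{dx^\rho}{d\lambda}\, \delta x^\sigma\, e_\mu(x(\lambda)),$$

where $R^\mu{}_{\nu\rho\sigma} = \partial_\rho \Gamma_{\nu\sigma}{}^{\mu} - \partial_\sigma \Gamma_{\nu\rho}{}^{\mu} + \Gamma_{\rho\tau}{}^{\mu}\, \Gamma_{\nu\sigma}{}^{\tau} - \Gamma_{\sigma\tau}{}^{\mu}\, \Gamma_{\nu\rho}{}^{\tau}$.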

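That is, the curvature cannot be eliminated locally. What this means, of course, in the light of the EP, is that only non-gravitational laws take on their special relativistic forms locally in a LF. Elementary properties of the Riemann curvature tensor are given in Box 50.3. The so-called Bianchi identity in it is easily verified in a local LF, and the result is then generalized to curved spacetime by promoting the partial derivatives to covariant ones. On the other hand, the contraction $R^\mu{}_{\nu\mu\sigma} = R_{\nu\sigma}$ is referred to as the Ricci tensor, and is explicitly given by

$$R_{\nu\sigma} = \partial_\rho \Gamma_{\nu\sigma}{}^{\rho} - \partial_\sigma \Gamma_{\nu\mu}{}^{\mu} + \Gamma_{\mu\tau}{}^{\mu}\, \Gamma_{\nu\sigma}{}^{\tau} - \Gamma_{\sigma\tau}{}^{\mu}\, \Gamma_{\nu\mu}{}^{\tau}, \qquad R_{\nu\sigma} = R_{\sigma\nu}. \tag{50.11}$$

The scalar curvature $R$ is defined by

$$R = g^{\nu\sigma} R_{\nu\sigma}, \tag{50.12}$$

and, referring to Box 50.3, the last identity in it follows from contractions of the Bianchi identity, defined in Box 50.3, and by using the definitions of the Ricci tensor and the scalar curvature.$^{3}$ The combination

$$R^{\mu\nu} - \frac{1}{2}\, g^{\mu\nu} R = G^{\mu\nu}, \qquad G^{\mu\nu} = G^{\nu\mu}, \qquad \nabla_\mu G^{\mu\nu} = 0, \tag{50.13}$$

is referred to as the Einstein Tensor (components).

Box 50.3 Properties of the Riemann curvature tensor and of its contractions

$R^\mu{}_{\nu\rho\sigma} = -R^\mu{}_{\nu\sigma\rho}$, $\quad g_{\mu\kappa}\, R^\kappa{}_{\nu\sigma\rho} = R_{\mu\nu\sigma\rho}$, $\quad R^\mu{}_{\nu\rho\sigma} + R^\mu{}_{\rho\sigma\nu} + R^\mu{}_{\sigma\nu\rho} = 0$,

$R_{\mu\nu\rho\sigma} = -R_{\nu\mu\rho\sigma} = -R_{\mu\nu\sigma\rho}$, $\quad R_{\mu\nu\rho\sigma} = R_{\rho\sigma\mu\nu}$,

$R^\mu{}_{\nu\rho\sigma;\lambda} + R^\mu{}_{\nu\sigma\lambda;\rho} + R^\mu{}_{\nu\lambda\rho;\sigma} = 0$ (referred to as the Bianchi Identity),

$R_{\nu\sigma} = R^\mu{}_{\nu\mu\sigma}$, $\quad R_{\nu\sigma} = R_{\sigma\nu}$, $\quad R = R^\mu{}_\mu$,

$R^{\mu\nu}{}_{;\mu} - \frac{1}{2}\, g^{\mu\nu} R_{;\mu} \equiv \nabla_\mu \left[R^{\mu\nu} - \frac{1}{2}\, g^{\mu\nu} R\right] = 0$.

3 For the Riemann curvature tensor and its contractions for the geometry of a sphere, see Box 50.4.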


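Box 50.4 Riemann curvature and its contractions on the surface of a sphere

$e_\theta = r(\cos\theta\cos\phi,\ \cos\theta\sin\phi,\ -\sin\theta)$, $\quad e_\phi = r(-\sin\theta\sin\phi,\ \sin\theta\cos\phi,\ 0)$.

$de_\theta = \cot\theta\, e_\phi\, d\phi$, $\quad de_\phi = \cot\theta\, e_\phi\, d\theta - \sin\theta\cos\theta\, e_\theta\, d\phi$ (constrained to the surface).

Also $de_\mu = \Gamma_{\mu\nu}{}^{\sigma}\, e_\sigma\, dx^\nu$, $\quad \Gamma_{\theta\phi}{}^{\phi} = \Gamma_{\phi\theta}{}^{\phi} = \cot\theta$, $\quad \Gamma_{\phi\phi}{}^{\theta} = -\sin\theta\cos\theta$, and all the other components are equal to zero. See Box 49.1 of Chap. 49 for the above derivations.

$R^\theta{}_{\nu\rho\sigma} = \sin^2\theta\ \delta^\phi{}_\nu \left[\delta^\theta{}_\rho\, \delta^\phi{}_\sigma - \delta^\phi{}_\rho\, \delta^\theta{}_\sigma\right]$, $\quad R^\phi{}_{\nu\rho\sigma} = -\delta^\theta{}_\nu \left[\delta^\theta{}_\rho\, \delta^\phi{}_\sigma - \delta^\phi{}_\rho\, \delta^\theta{}_\sigma\right]$.

$g^{\phi\nu} R^\theta{}_{\nu\rho\sigma} = R^{\theta\phi}{}_{\rho\sigma} = \dfrac{1}{r^2} \left[\delta^\theta{}_\rho\, \delta^\phi{}_\sigma - \delta^\phi{}_\rho\, \delta^\theta{}_\sigma\right] = -R^{\phi\theta}{}_{\rho\sigma}$.

$R = R^{\mu\nu}{}_{\mu\nu} = R^{\theta\phi}{}_{\theta\phi} + R^{\phi\theta}{}_{\phi\theta} = \dfrac{2}{r^2}$.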
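The entries of Box 50.4 can be reproduced by inserting the connection of Box 49.1 into Eq. (50.10). The sketch below does so numerically, taking the $\theta$-derivatives of the connection by central differences; the function names, step size and index convention (0 for $\theta$, 1 for $\phi$) are illustrative assumptions.

```python
import math

def gamma(theta):
    """Connection on the sphere: G[mu][nu][rho] = Gamma_{mu nu}^rho (0 = theta, 1 = phi)."""
    G = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
    G[0][1][1] = G[1][0][1] = math.cos(theta) / math.sin(theta)
    G[1][1][0] = -math.sin(theta) * math.cos(theta)
    return G

def d_gamma(theta, rho, h=1e-5):
    """d_rho Gamma_{ab}^c; only the theta-derivative (rho = 0) is non-zero."""
    if rho == 1:
        return [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
    Gp, Gm = gamma(theta + h), gamma(theta - h)
    return [[[(Gp[a][b][c] - Gm[a][b][c]) / (2 * h) for c in range(2)]
             for b in range(2)] for a in range(2)]

def riemann(theta):
    """R[mu][nu][rho][sig] = R^mu_{nu rho sig} evaluated from Eq. (50.10)."""
    G = gamma(theta)
    dG = [d_gamma(theta, 0), d_gamma(theta, 1)]  # dG[rho][a][b][c] = d_rho Gamma_{ab}^c
    R = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for mu in range(2):
        for nu in range(2):
            for rho in range(2):
                for sig in range(2):
                    val = dG[rho][nu][sig][mu] - dG[sig][nu][rho][mu]
                    for tau in range(2):
                        val += (G[rho][tau][mu] * G[nu][sig][tau]
                                - G[sig][tau][mu] * G[nu][rho][tau])
                    R[mu][nu][rho][sig] = val
    return R

def scalar_curvature(theta, r=1.0):
    """R = g^{nu sig} R^mu_{nu mu sig} for the diagonal metric of the sphere."""
    R = riemann(theta)
    ginv = [1.0 / r**2, 1.0 / (r * math.sin(theta))**2]
    return sum(ginv[nu] * sum(R[mu][nu][mu][nu] for mu in range(2)) for nu in range(2))

print("R^theta_{phi theta phi}:", riemann(0.7)[0][1][0][1], "vs sin^2(theta) =", math.sin(0.7)**2)
print("scalar curvature for r = 2:", scalar_curvature(0.7, r=2.0), "vs 2/r^2 =", 0.5)
```

The scalar curvature comes out independent of $\theta$, as it must for a sphere.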

We may also consider the parallel transfer of a vector$^{4}$ with components $V^\mu$ around a closed path and investigate the fate of its components upon their return to the initial point, as shown in Fig. 50.2. Referring to the first three equations in the figure we have

$$V^\mu(3) = V^\mu(2) - \Gamma_{\sigma\rho}{}^{\mu}(2)\, V^\sigma(2)\, d\eta^\rho = \left[V^\mu(1) - \Gamma_{\sigma\rho}{}^{\mu}(1)\, V^\sigma(1)\, d\xi^\rho\right] - \left[\Gamma_{\sigma\rho}{}^{\mu}(1) + \partial_\nu \Gamma_{\sigma\rho}{}^{\mu}\, d\xi^\nu\right] \left[V^\sigma(1) - \Gamma_{\tau\chi}{}^{\sigma}(1)\, V^\tau(1)\, d\xi^\chi\right] d\eta^\rho. \tag{50.14}$$

Expanding the above expression, and taking into account the infinitesimal nature of the displacements $d\xi^\sigma$ and $d\eta^\sigma$, leads to the last expression for $V^\mu(3)$ in Fig. 50.2 at the point 3. For the trip from $3 \to 4 \to 1$, we simply have to interchange the variables $\xi \leftrightarrow \eta$, replace $V^\mu(1)$ by $V^\mu_{\rm final}(1)$, and multiply the whole expression by a minus sign. This leads to

$$V^\mu_{\rm final}(1) - V^\mu(1) = \left[\partial_\nu \Gamma_{\sigma\rho}{}^{\mu} - \partial_\rho \Gamma_{\sigma\nu}{}^{\mu}\right] V^\sigma(1)\, d\xi^\rho\, d\eta^\nu + \left[\Gamma_{\tau\nu}{}^{\mu}\, \Gamma_{\sigma\rho}{}^{\tau} - \Gamma_{\tau\rho}{}^{\mu}\, \Gamma_{\sigma\nu}{}^{\tau}\right] V^\sigma(1)\, d\xi^\rho\, d\eta^\nu = R^\mu{}_{\sigma\nu\rho}\, V^\sigma(1)\, d\xi^\rho\, d\eta^\nu, \tag{50.15}$$

where only the components $V^\sigma(1)$ appear on the right-hand side to the accuracy needed for infinitesimal $d\xi$, $d\eta$. The nonvanishing of the Riemann curvature tensor implies that the parallel transfer of a vector around a closed path may give rise to a change of the components of the vector upon its return to the initial point (Fig. 50.2).

Before closing this chapter we note that upon a coordinate transformation, now as $x \to x'$, the tetrad $e^\alpha{}_\mu$ transforms in its $\mu$ index as a covariant vector (see Eq. (49.24) with $x \leftrightarrow x'$ in the last chapter), i.e.,

$$e'^\alpha{}_\mu(x') = \frac{\partial x^\nu}{\partial x'^\mu}\, e^\alpha{}_\nu(x), \qquad \det[e^\alpha{}_\mu(x)] = \det\!\left[\frac{\partial x'^\mu}{\partial x^\nu}\right] \det[e'^\alpha{}_\rho(x')]. \tag{50.16}$$

On the other hand, $dx'^\mu = (\partial x'^\mu/\partial x^\nu)\, dx^\nu$, and $dx'^0 dx'^1 dx'^2 dx'^3 = \det(\partial x'^\mu/\partial x^\nu)\, dx^0 dx^1 dx^2 dx^3$, expressed in terms of the familiar Jacobian matrix $(\partial x'^\mu/\partial x^\nu)$. Also $e^\alpha{}_\mu(x)\, \eta_{\alpha\beta}\, e^\beta{}_\nu(x) = g_{\mu\nu}(x)$, and $-(\det[e^\alpha{}_\mu(x)])^2 = \det[g_{\mu\nu}(x)] \equiv g(x) < 0$, since $\det[\eta_{\alpha\beta}] = -1$, i.e.,

$$\sqrt{-g(x)} = \sqrt{-g'(x')}\ \det\!\left[\partial x'^\mu/\partial x^\nu\right]. \tag{50.17}$$

Hence for the volume element $(dx) = dx^0 dx^1 dx^2 dx^3$,

$$\text{the combination } \sqrt{-g(x)}\, (dx) \text{ is an invariant under coordinate transformations.} \tag{50.18}$$

4 That is, from Eq. (49.14), satisfying $\partial_\lambda V^\mu(x) = -\Gamma_{\tau\lambda}{}^{\mu}(x)\, V^\tau(x)$.


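[Fig. 50.2 artwork: a closed infinitesimal path $1 \to 2 \to 3 \to 4 \to 1$ with successive displacements $d\xi^\mu$, $d\eta^\mu$, $-d\xi^\mu$, $-d\eta^\mu$. Relations shown in the figure:

$V^\mu(2) = V^\mu(1) - \Gamma_{\sigma\rho}{}^{\mu}(1)\, V^\sigma(1)\, d\xi^\rho$,

$\Gamma_{\sigma\rho}{}^{\mu}(2) = \Gamma_{\sigma\rho}{}^{\mu}(1) + \partial_\nu \Gamma_{\sigma\rho}{}^{\mu}\, d\xi^\nu$,

$V^\mu(3) = V^\mu(2) - \Gamma_{\sigma\rho}{}^{\mu}(2)\, V^\sigma(2)\, d\eta^\rho$,

$V^\mu(3) = V^\mu(1) - \Gamma_{\sigma\rho}{}^{\mu}(1)\, V^\sigma(1)\, d\xi^\rho - \Gamma_{\sigma\rho}{}^{\mu}(1)\, V^\sigma(1)\, d\eta^\rho + \Gamma_{\tau\chi}{}^{\mu}(1)\, \Gamma_{\sigma\rho}{}^{\tau}\, V^\sigma(1)\, d\xi^\rho\, d\eta^\chi - \partial_\nu \Gamma_{\tau\chi}{}^{\mu}\, V^\tau(1)\, d\xi^\nu\, d\eta^\chi$.]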

Fig. 50.2 Parallel transfer of a vector around a closed path. The last expression for V μ (3) denotes the expression of its V μ component at the point 3, expressed in terms of V μ (1) and the displacements dξ μ and dημ
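The change of a vector under parallel transfer around a closed path can also be seen for a finite loop on the sphere. The sketch below integrates the parallel transfer equation (50.1) once around a circle of fixed $\theta = \theta_0$, using the connection of Box 49.1 and a standard fourth-order Runge-Kutta step. The choice of loop, the integrator, and the function name are illustrative assumptions and are not taken from the text, where the loop is infinitesimal.

```python
import math

def transport_around_latitude(theta0, v_theta, v_phi, steps=2000):
    """Parallel-transfer (V^theta, V^phi) once around the circle theta = theta0.

    Integrates dV^mu/dphi = -Gamma_{sigma phi}^mu V^sigma with RK4, using
    Gamma_{phi phi}^theta = -sin(theta) cos(theta) and Gamma_{theta phi}^phi = cot(theta).
    """
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(v):
        vt, vp = v
        return (s * c * vp, -(c / s) * vt)

    v = (v_theta, v_phi)
    h = 2.0 * math.pi / steps
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return v

theta0 = 1.0
vt, vp = transport_around_latitude(theta0, 1.0, 0.0)
# The squared length V^theta^2 + sin^2(theta) V^phi^2 (for r = 1) should be preserved,
# while the components themselves do not return to their initial values.
print("final components:", vt, vp)
print("squared length  :", vt**2 + (math.sin(theta0) * vp)**2)
```

Solving the two coupled equations by hand shows that the orthonormal components $(V^\theta,\ \sin\theta_0\, V^\phi)$ rotate by the angle $2\pi\cos\theta_0$ after one circuit, so the vector returns unchanged only on the equator, where the circle is itself a geodesic.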

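A similar analysis, on the other hand, shows that

$$\text{the combination } (dx)\, \varepsilon^{\mu\nu\sigma\rho}\, T_{\mu\nu\sigma\rho}(x) \text{ is an invariant under coordinate transformations,} \tag{50.19}$$

without the $\sqrt{-g(x)}$ factor, where $T_{\mu\nu\sigma\rho}(x)$ is a tensor field, and $\varepsilon^{\mu\nu\sigma\rho}$ is totally anti-symmetric with $\varepsilon^{0123} = +1$. In particular we note that $[1/\sqrt{-g(x)}\,]\, \varepsilon^{\mu\nu\sigma\lambda}$ transforms as a tensor.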

51 Field Equations of GR

Prerequisite Chaps. 49, 50

A simple way to make a transition from Newton's theory of gravitation$^{1}$ to Einstein's field equations of GR is to compare the equation of geodesic deviation in curved spacetime with the one in the Newtonian approximation, in Eqs. (50.9) and (50.5), respectively, in the last chapter. In particular, we have

$$\frac{D^2 \delta x^i}{d\tau^2} = R^i{}_{00j}\, \delta x^j\, \frac{dx^0}{d\tau} \frac{dx^0}{d\tau} + \cdots, \qquad R_{00} = R^0{}_{000} + R^i{}_{0i0} = -R^i{}_{00i}, \tag{51.1}$$

$$\frac{d^2 \delta x^i}{d\tau^2} = -\frac{1}{c^2} \left(\frac{\partial}{\partial x^i} \frac{\partial}{\partial x^j}\, \varphi(r)\right) \delta x^j\, \frac{dx^0}{d\tau} \frac{dx^0}{d\tau}, \tag{51.2}$$

using the properties $R^0{}_{000} = 0$, $R^i{}_{0i0} = -R^i{}_{00i}$, which follow from the properties of the Riemann curvature tensor in Box 50.3. The differentiations are carried out with respect to the proper time. Hence we may infer that $R_{00} \simeq \nabla^2\varphi/c^2 = 4\pi G\rho/c^2 = (4\pi G/c^4)\, T_{00}$, where we have used in the latter the expression derived in Boxes 51.1, 51.2 for Newton's theory of gravitation. On grounds of general covariance, it is tempting to generalize this to the tensor equation $R_{\mu\nu} = (4\pi G/c^4)\, T_{\mu\nu}$, where $T_{\mu\nu}$ is the energy momentum tensor. However, the special relativistic conservation law $\partial^\mu T_{\mu\nu} = 0$ becomes replaced by the covariant one, $T_{\mu\nu}{}^{;\mu} = 0$, and this is inconsistent with the left-hand side since $R_{\mu\nu}{}^{;\mu} \neq 0$. The equation, however, may be simply replaced by

$$R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R = \frac{8\pi G}{c^4}\, T_{\mu\nu}, \tag{51.3}$$

where now both sides have zero covariant derivative with respect to $\mu$ (and $\nu$) (see Box 50.3). It remains to check that this equation consistently reduces to the Newtonian approximation $R_{00} = 4\pi G\rho/c^2\ \left(= (4\pi G/c^4)\, T_{00}\right)$. Equation (51.3) may be rewritten as

$$R_{\mu\nu} = \frac{8\pi G}{c^4} \left[T_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} T\right], \tag{51.4}$$

$$R_{00} \simeq \frac{8\pi G}{c^4} \left[T_{00} + \frac{1}{2} \left[-T_{00} + T_{ii}\right]\right] \simeq \frac{4\pi G}{c^4}\, T_{00}, \tag{51.5}$$

where we have used the fact that in Newtonian physics $T_{ii} \propto T_{00}\, v^2/c^2 \ll T_{00}$ and is small. The field equations of GR may thus be summarized by the equation

$$G_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu}, \qquad G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R, \tag{51.6}$$

1 Newton's theory of gravitation is summarized in Boxes 51.1, 51.2, and Eq. (51.2).

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_51


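Box 51.1 Newton's theory of gravitation

Mass density of a system of particles of masses $m_1, m_2, \ldots, m_n$ located at points $\mathbf{R}_1, \mathbf{R}_2, \ldots, \mathbf{R}_n$: $\ \rho(\mathbf{r}) = \sum_{i=1}^{n} m_i\, \delta^{(3)}(\mathbf{r} - \mathbf{R}_i)$.

Force acting on a test particle of unit mass at a point $\mathbf{r}$:

$$\mathbf{F} = -\sum_{i=1}^{n} \frac{G\, m_i}{|\mathbf{r} - \mathbf{R}_i|^2}\, \frac{\mathbf{r} - \mathbf{R}_i}{|\mathbf{r} - \mathbf{R}_i|} \equiv \nabla_{\mathbf{r}} \sum_{i=1}^{n} \frac{G\, m_i}{|\mathbf{r} - \mathbf{R}_i|} = -\nabla_{\mathbf{r}}\, \phi(\mathbf{r}),$$

$$\phi(\mathbf{r}) = -\sum_{i=1}^{n} \frac{G\, m_i}{|\mathbf{r} - \mathbf{R}_i|} = -G \int \frac{\rho(\mathbf{R})\, d^3\mathbf{R}}{|\mathbf{r} - \mathbf{R}|}, \quad \text{using the property of the Dirac delta.}$$

The latter expression is now valid for a continuum mass distribution as well. In Box 51.2, it is established that $\nabla^2_{\mathbf{r}} \left[1/|\mathbf{r} - \mathbf{R}|\right] = -4\pi\, \delta^{(3)}(\mathbf{r} - \mathbf{R})$. This leads to the basic equation

$$\nabla^2_{\mathbf{r}}\, \phi(\mathbf{r}) = 4\pi G\, \rho(\mathbf{r})$$

of Newton's theory of gravitation.

Box 51.2 Proof of the equation $\nabla^2_{\mathbf{r}} \left[1/|\mathbf{r} - \mathbf{R}|\right] = -4\pi\, \delta^{(3)}(\mathbf{r} - \mathbf{R})$

Consider the integral

$$I(\mathbf{r}) = \frac{1}{(2\pi)^3} \int d^3\mathbf{k}\ \frac{\exp[\, i\, \mathbf{k} \cdot (\mathbf{r} - \mathbf{R})]}{k^2}.$$

Applying $\nabla^2_{\mathbf{r}}$ to it gives

$$\nabla^2_{\mathbf{r}}\, I(\mathbf{r}) = -\frac{1}{(2\pi)^3} \int d^3\mathbf{k}\ \exp[\, i\, \mathbf{k} \cdot (\mathbf{r} - \mathbf{R})] = -\delta^{(3)}(\mathbf{r} - \mathbf{R}).$$

On the other hand $d^3\mathbf{k} = 2\pi \sin\theta\, d\theta\, k^2\, dk$, $\ \mathbf{k} \cdot (\mathbf{r} - \mathbf{R}) = k\, |\mathbf{r} - \mathbf{R}| \cos\theta$, and

$$I(\mathbf{r}) = \frac{2\pi}{(2\pi)^3} \int_{-1}^{1} d(\cos\theta) \int_0^\infty dk\ \exp[\, i\, k\, |\mathbf{r} - \mathbf{R}| \cos\theta] = \frac{1}{4\pi^2} \int_0^\infty dk\ \frac{\exp[\, i\, k |\mathbf{r} - \mathbf{R}|\,] - \exp[-i\, k |\mathbf{r} - \mathbf{R}|\,]}{i\, |\mathbf{r} - \mathbf{R}|\, k} = \frac{1}{2\pi^2}\, \frac{1}{|\mathbf{r} - \mathbf{R}|} \int_0^\infty dz\ \frac{\sin z}{z},$$

by changing the variable of integration from $k$ to $z = k\, |\mathbf{r} - \mathbf{R}|$. Using the integral $\int_0^\infty dz\, (\sin z)/z = \pi/2$ gives $I = \dfrac{1}{4\pi}\, \dfrac{1}{|\mathbf{r} - \mathbf{R}|}$. Hence, from the expression for $\nabla^2_{\mathbf{r}}\, I(\mathbf{r})$ above, the equation in question follows.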
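The basic equation $\nabla^2\phi = 4\pi G\rho$ of Box 51.1 can be checked numerically on a simple continuum example. The sketch below assumes a ball of uniform density, total mass $M$ and radius $a$, for which the standard result $\phi(r) = -GM(3a^2 - r^2)/(2a^3)$ inside and $\phi(r) = -GM/r$ outside is taken for granted (this example is not in the text), and evaluates the Laplacian by a seven-point finite-difference stencil. All numerical values and names are illustrative.

```python
import math

G, M, A = 1.0, 2.0, 1.5                    # illustrative units: G, total mass, ball radius
RHO = 3.0 * M / (4.0 * math.pi * A**3)     # uniform density inside the ball

def phi(x, y, z):
    """Potential of a uniform ball (assumed standard result, see lead-in)."""
    r = math.sqrt(x * x + y * y + z * z)
    if r < A:
        return -G * M * (3.0 * A * A - r * r) / (2.0 * A**3)
    return -G * M / r

def laplacian(f, x, y, z, h=1e-3):
    """Seven-point central-difference estimate of the Laplacian of f at (x, y, z)."""
    return (f(x + h, y, z) + f(x - h, y, z) + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h) - 6.0 * f(x, y, z)) / (h * h)

inside = laplacian(phi, 0.3, -0.2, 0.5)    # point well inside the ball
outside = laplacian(phi, 2.0, 1.0, 2.0)    # point well outside the ball (rho = 0 there)
print("inside :", inside, "vs 4 pi G rho =", 4.0 * math.pi * G * RHO)
print("outside:", outside, "vs 0")
```

The sample points are chosen away from the surface $r = a$, so that the stencil never straddles the discontinuity in the density.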

which are the celebrated Einstein's Field Equations, where we recall from Eq. (50.13) that the $G_{\mu\nu}$ denote the components of the Einstein tensor in a given coordinate system. An action integral may be introduced for the above equation as follows:

$$A = \frac{c^3}{16\pi G} \int \sqrt{-g(x)}\, (dx)\, R(x) + A_{\rm matter}, \qquad (dx) \equiv dx^0 dx^1 dx^2 dx^3, \tag{51.7}$$

where $c^3/(16\pi G)$ is a normalization factor chosen so that the right-hand side has the units of action (energy-second), as the integral itself has the dimensions of [length]$^2$, with $R(x)$ involving two derivatives. As we have seen in Eq. (50.18), in the last chapter, $\sqrt{-g(x)}\, (dx)$ is the invariant volume element, $R(x)$ is a scalar, and, as a result, the action is invariant under coordinate transformations. The action integral consisting of the first term on the right-hand side of Eq. (51.7) is referred to as the Einstein-Hilbert action. The tensor

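Box 51.3 Variation of the action of GR: $\int (dx)\, \delta[\sqrt{-g}\, g^{\mu\nu} R_{\mu\nu}]$

The variation of the determinant of a matrix $A$ is given by $\delta \det[A] = \det[A]\, {\rm Tr}\,[A^{-1} \delta A]$.$^{\ast}$ Hence $\delta g \equiv \delta \det[g_{\mu\nu}] = g\, g^{\mu\nu}\, \delta g_{\mu\nu} = -g\, g_{\mu\nu}\, \delta g^{\mu\nu}$, using the fact that $\delta(g^{\mu\nu} g_{\nu\sigma}) = \delta(\delta^\mu{}_\sigma) = 0$ in the latter equality. Moreover $\delta\sqrt{-g} = -(1/2)\, \delta g/\sqrt{-g}$. Therefore $\delta\sqrt{-g} = -(1/2)\sqrt{-g}\ g_{\mu\nu}\, \delta g^{\mu\nu}$.

$$\sqrt{-g}\ g^{\mu\nu}\, \delta R_{\mu\nu} \overset{\ast\ast}{=} \sqrt{-g}\ g^{\mu\nu} \left[\partial_\rho\, \delta\Gamma_{\mu\nu}{}^{\rho} + \Gamma_{\sigma\rho}{}^{\sigma}\, \delta\Gamma_{\mu\nu}{}^{\rho} - \Gamma_{\rho\mu}{}^{\sigma}\, \delta\Gamma_{\sigma\nu}{}^{\rho} - \Gamma_{\rho\nu}{}^{\sigma}\, \delta\Gamma_{\sigma\mu}{}^{\rho}\right] - \sqrt{-g}\ g^{\mu\nu}\, \partial_\nu\, \delta\Gamma_{\mu\rho}{}^{\rho} - \partial_\nu\!\left(\sqrt{-g}\ g^{\mu\nu}\right) \delta\Gamma_{\mu\rho}{}^{\rho}$$

$$= -\sqrt{-g} \left[\partial_\rho g^{\mu\nu} + \Gamma_{\rho\sigma}{}^{\mu}\, g^{\sigma\nu} + \Gamma_{\rho\sigma}{}^{\nu}\, g^{\mu\sigma}\right] \delta\Gamma_{\mu\nu}{}^{\rho} + \partial_\rho\!\left(\sqrt{-g}\ g^{\mu\nu}\, \delta\Gamma_{\mu\nu}{}^{\rho}\right) - \partial_\nu\!\left(\sqrt{-g}\ g^{\mu\nu}\, \delta\Gamma_{\mu\rho}{}^{\rho}\right).$$

The first term is $-\sqrt{-g}\ g^{\mu\nu}{}_{;\rho}\ \delta\Gamma_{\mu\nu}{}^{\rho}$, which is zero since $g^{\mu\nu}{}_{;\rho} = 0$; the last two terms are total derivatives. Accordingly $\sqrt{-g}\ g^{\mu\nu}\, \delta R_{\mu\nu} = 0$, up to total derivatives, which do not contribute to the variation of the action. All told, we have

$$\delta[\sqrt{-g}\ g^{\mu\nu} R_{\mu\nu}] = \sqrt{-g} \left[R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R\right] \delta g^{\mu\nu},$$

leading to Einstein's equation away from matter.

$\ast$ A proof of this identity may be found on p. 79 of my book Manoukian [5].

$\ast\ast$ In writing this equality we have used, in the process, that $\sqrt{-g}\ g^{\sigma\nu}\, \Gamma_{\sigma\nu}{}^{\mu} = -\partial_\nu\!\left(\sqrt{-g}\ g^{\mu\nu}\right)$ and $\sqrt{-g}\ g^{\mu\nu}\, \partial_\rho\, \delta\Gamma_{\mu\nu}{}^{\rho} = \partial_\rho\!\left(\sqrt{-g}\ g^{\mu\nu}\, \delta\Gamma_{\mu\nu}{}^{\rho}\right) - \partial_\rho\!\left(\sqrt{-g}\ g^{\mu\nu}\right) \delta\Gamma_{\mu\nu}{}^{\rho}$, as well as the fact that $\partial_\rho \sqrt{-g} = \sqrt{-g}\ \Gamma_{\sigma\rho}{}^{\sigma}$.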

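$T_{\mu\nu}$ comes from the action part $A_{\rm matter}$, associated with matter, as we will see below. In Box 51.3 it is shown that

$$\frac{c^3}{16\pi G}\, \frac{1}{\sqrt{-g(x)}}\, \frac{\delta}{\delta g^{\mu\nu}(x)} \int \sqrt{-g(x')}\, (dx')\, R(x') = \frac{c^3}{16\pi G}\, G_{\mu\nu}(x). \tag{51.8}$$

In order to satisfy Eq. (51.6), the energy momentum tensor $T_{\mu\nu}$ is defined by:

$$-\frac{2c}{\sqrt{-g(x)}}\, \frac{\delta A_{\rm matter}}{\delta g^{\mu\nu}(x)} = T_{\mu\nu}(x). \tag{51.9}$$

In particular, for the action of a real scalar field $\phi(x)$ in curved spacetime,

$$A_{\rm matter} = -\frac{1}{2} \int \sqrt{-g(x)}\, (dx) \left[g^{\mu\nu}\, \partial_\mu \phi(x)\, \partial_\nu \phi(x) + m^2 \phi(x)^2\right], \tag{51.10}$$

$$T_{\mu\nu}(x) = c \left[\partial_\mu \phi(x)\, \partial_\nu \phi(x) - \frac{1}{2}\, g_{\mu\nu} \left(\partial^\rho \phi(x)\, \partial_\rho \phi(x) + m^2 \phi^2(x)\right)\right], \tag{51.11}$$

using the identities $(1/\sqrt{-g}\,)\, \delta(\sqrt{-g}\ g^{\mu\nu}) = -(1/2)\, g_{\sigma\rho}\, \delta g^{\sigma\rho}\, g^{\mu\nu} + \delta g^{\mu\nu}$ and $(1/\sqrt{-g}\,)\, \delta\sqrt{-g} = -(1/2)\, g_{\mu\nu}\, \delta g^{\mu\nu}$ (see Box 51.3). Using the fact that the covariant derivative of the metric is zero, $g_{\mu\nu}{}^{;\mu} = 0$ (see Eq. (49.33)), one may also introduce a parameter $\Lambda$ and generalize the GR field equations in (51.6) simply as follows:

$$G_{\mu\nu} + g_{\mu\nu}\, \Lambda = \frac{8\pi G}{c^4}\, T_{\mu\nu}, \quad \text{or equivalently,} \quad G_{\mu\nu} = \frac{8\pi G}{c^4} \left[T_{\mu\nu} - g_{\mu\nu}\, \frac{c^4}{8\pi G}\, \Lambda\right], \tag{51.12}$$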

where $-g_{\mu\nu}\,(c^4/8\pi G)\,\Lambda$ is interpreted as a contribution of the vacuum. Such a term, where the parameter $\Lambda$ is referred to as a cosmological constant, is very relevant in modern cosmology to account for the fact that the Universe is expanding, as will be discussed in detail in the later chapters on cosmology. Early in his career, Einstein introduced² a cosmological constant to describe a so-called "static universe", which he later dropped. The cosmological constant has played a dominant role in describing the evolution of the Universe in the past 3.5 billion years or so, as will be seen later in Chap. 78.

Finally, note that the metric may be written as

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \tag{51.13}$$

describing its deviation from the Minkowski one. This allows one to carry out perturbative approximations to GR by treating $h_{\mu\nu}$ as a small perturbation. The linearization of the GR field equations for such a small perturbation, as a first order approximation in $h_{\mu\nu}$, is developed in Box 51.4. The corresponding analysis will turn out to be useful in some of the following chapters.

Box 51.4 Linearization of the field equations of GR in Eq. (51.6)

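With $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, and working to first order in the tensor $h_{\mu\nu}$ in this box, we have:

$$g^{\sigma\mu} = \eta^{\sigma\mu} - h^{\sigma\mu},\qquad g^{\sigma\mu}g_{\mu\nu} = \delta^{\sigma}{}_{\nu},\qquad \Gamma_{\mu\nu}{}^{\rho} = \frac{1}{2}\Big[\partial_\mu h_\nu{}^{\rho} + \partial_\nu h_\mu{}^{\rho} - \partial^\rho h_{\mu\nu}\Big],$$

$$R_{\mu\nu} = \partial_\rho\Gamma_{\mu\nu}{}^{\rho} - \partial_\nu\Gamma_{\mu\rho}{}^{\rho} = \frac{1}{2}\Big[\partial_\rho\partial_\mu h_\nu{}^{\rho} + \partial_\rho\partial_\nu h_\mu{}^{\rho} - \Box h_{\mu\nu} - \eta_{\sigma\rho}\,\partial_\mu\partial_\nu h^{\sigma\rho}\Big],$$

where $\Box = \eta_{\sigma\rho}\,\partial^\sigma\partial^\rho$. Furthermore,

$$R = R^{\mu}{}_{\mu} = \big[\partial_\sigma\partial_\rho - \eta_{\sigma\rho}\Box\big]h^{\sigma\rho},\qquad G^{\mu\nu} = R^{\mu\nu} - \frac{1}{2}\,\eta^{\mu\nu}R,$$

$$G^{\mu\nu} = \frac{1}{2}\Big[-\Box h^{\mu\nu} + \partial_\rho\partial^\mu h^{\nu\rho} + \partial_\rho\partial^\nu h^{\mu\rho} - \eta_{\sigma\rho}\,\partial^\mu\partial^\nu h^{\sigma\rho} - \eta^{\mu\nu}\big(\partial_\sigma\partial_\rho - \eta_{\sigma\rho}\Box\big)h^{\sigma\rho}\Big]$$

$$= \frac{1}{2}\Big[-\Box\bar{h}^{\mu\nu} + \partial_\rho\partial^\mu\bar{h}^{\nu\rho} + \partial_\rho\partial^\nu\bar{h}^{\mu\rho} - \eta^{\mu\nu}\,\partial_\sigma\partial_\rho\bar{h}^{\sigma\rho}\Big],\qquad \bar{h}^{\mu\nu} = h^{\mu\nu} - \frac{1}{2}\,\eta^{\mu\nu}\eta_{\sigma\rho}h^{\sigma\rho}.$$

$G^{\mu\nu}$ is invariant under the transformation $\bar{h}^{\mu\nu} \to \bar{h}^{\mu\nu} + \partial^\mu\xi^\nu + \partial^\nu\xi^\mu - \eta^{\mu\nu}\,\partial\xi$, for arbitrary $\xi$. Hence we may choose $-\Box\xi^\nu = \partial_\mu\bar{h}^{\mu\nu}$ to obtain $\partial_\mu\bar{h}^{\mu\nu} \to 0$, and

$$G^{\mu\nu} = -\frac{1}{2}\,\Box\bar{h}^{\mu\nu} = \frac{8\pi G}{c^4}\,T^{\mu\nu}\ \Rightarrow\ -\Box\bar{h}^{\mu\nu} = \frac{16\pi G}{c^4}\,T^{\mu\nu},\qquad \partial_\mu\bar{h}^{\mu\nu} = 0,\qquad \partial_\mu T^{\mu\nu} = 0.$$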
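The gauge invariance stated in Box 51.4 can be illustrated numerically in momentum space. This is only a sketch under stated assumptions: a plane-wave ansatz $\bar{h}^{\mu\nu} \propto e^{ikx}$ is assumed, so that $\partial_\mu \to i k_\mu$ and the overall factors of $i$ are dropped; the function name `einstein_linear` and all numerical values of `k_up`, `xi` and `hbar` are hypothetical.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])  # Minkowski metric, signature (-,+,+,+); its inverse has the same entries

def einstein_linear(hbar_up, k_up):
    """Momentum-space linearized G^{mu nu} built from the trace-reversed field hbar^{mu nu}.

    With d_mu -> i k_mu one has -Box -> k^2 and d_rho d^mu -> -k_rho k^mu.
    """
    k_dn = eta @ k_up                  # k_mu
    k2 = k_dn @ k_up                   # k^2
    kh = hbar_up @ k_dn                # k_rho hbar^{nu rho} (hbar is symmetric)
    khk = k_dn @ hbar_up @ k_dn        # k_sigma k_rho hbar^{sigma rho}
    return 0.5 * (k2 * hbar_up - np.outer(k_up, kh) - np.outer(kh, k_up) + eta * khk)

# Hypothetical off-shell momentum, gauge vector and symmetric field, for illustration only.
k_up = np.array([1.3, 0.2, -0.7, 0.4])
xi = np.array([0.4, -1.1, 0.3, 0.9])
hbar = np.array([[1.0, 0.5, 0.0, 0.0],
                 [0.5, 2.0, 0.0, 0.0],
                 [0.0, 0.0, -1.0, 0.3],
                 [0.0, 0.0, 0.3, 0.7]])

# Gauge shift k^mu xi^nu + k^nu xi^mu - eta^{mu nu} (k . xi)
k_dot_xi = (eta @ k_up) @ xi
shift = np.outer(k_up, xi) + np.outer(xi, k_up) - eta * k_dot_xi

G_original = einstein_linear(hbar, k_up)
G_shifted = einstein_linear(hbar + shift, k_up)
print(np.max(np.abs(G_shifted - G_original)))
```

The printed difference should vanish up to rounding, while `G_original` itself is non-zero for this choice of field.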

References
1. Einstein, A. (1917). Kosmologische Betrachtungen zur allgemeinen Relativitätstheorie (pp. 142–152). Sitzungsberichte: Preussischen Akademie der Wissenschaften zu Berlin.
2. Gamow, G. (1970). My world line. New York: Viking Press.
3. Lorentz, H. A., Einstein, A., Minkowski, H., Weyl, H., & Sommerfeld, A. (1952). The principle of relativity. New York: Dover Publications.
4. Lorentz, H. A., Einstein, A., Minkowski, H., Weyl, H., & Sommerfeld, A. (1953). The meaning of relativity (5th ed.). Princeton: Princeton University Press.
5. Manoukian, E. B. (2016). Quantum field theory II: Introduction to quantum gravity, supersymmetry and string theory. Switzerland: Springer.

² Einstein [1]. A translation of which may be found in the books: Lorentz et al. [3, 4]. See also Gamow [2].

52 Transfer of Gravitational Energy, Polarization Aspects and Gravitational Waves

Prerequisite Chaps. 16, 51

Gravitational waves were predicted by Einstein himself, on the basis of his GR theory, as early as 1916.¹ Accelerating massive bodies,² or the collision of two massive bodies such as two black holes, distort the fabric of spacetime, causing ripples in it. Gravitational waves are these spreading ripples of spacetime that transport energy as gravitational radiation and, unlike in Newton's theory of gravitation, they travel at a finite speed. This explains why it was not possible to predict their existence on the basis of Newton's theory. Gravitational waves were only recently detected,³ about 100 years after Einstein's prediction. The gravitational waves detected, via the Laser Interferometer Gravitational-Wave Observatory (LIGO), were due to the merger of two black holes 1.3 billion years ago, that is, at a distance that light would take 1.3 billion years to travel to reach the Earth. Credit should also be given to Joseph Weber for his earlier pioneering attempts over the years⁴ at the detection of gravitational waves. It is interesting to point out that in the 1970s there was also indirect evidence of the existence of gravitational waves through the discovery⁵ of the Hulse-Taylor pulsar (PSR B1913+16). The existence of gravitational waves in the realm of black hole physics was already emphasized by Stephen Hawking in the 1970s.⁶

The particle associated with gravity is the elusive graviton, which is massless like the photon. The vector character of the photon implies the existence of two perpendicular polarization states which, in turn, are perpendicular to the direction of its propagation. On the other hand, the tensor character of gravitation implies the existence of two polarization states, with one described in a plane perpendicular to the direction of propagation, and another state also in such a plane, with the axes rotated by 45° relative to the axes of oscillation in the plane of the other state, as we will see below.⁷ Propagation of gravitational energy and the extraction of polarization states of gravitation are best described by considerations of the propagator associated with the graviton, thus avoiding the use of plane waves altogether. The power of gravitational radiation is also readily derived from these considerations. We consider only a weak gravitational field.

In the linearized theory of gravitation with the metric written as $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, with $h_{\mu\nu}$ small, we have seen in Box 51.4 of the previous chapter that $h^{\mu\nu}$ satisfies the equation

$$-\Box h^{\mu\nu}(x) = \frac{16\pi G}{c^4}\Big[T^{\mu\nu}(x) - \frac{1}{2}\,\eta^{\mu\nu}\,T(x)\Big] \tag{52.1}$$

$$= \frac{16\pi G}{c^4}\;\frac{\eta^{\mu\sigma}\eta^{\nu\rho} + \eta^{\nu\sigma}\eta^{\mu\rho} - \eta^{\mu\nu}\eta^{\sigma\rho}}{2}\;T_{\sigma\rho}(x). \tag{52.2}$$

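For a detection source, with associated energy-momentum tensor $\tilde{T}_{\mu\nu}$, causally arranged to be in the future of the emission source, associated with the energy-momentum tensor $T_{\sigma\rho}$, the expression $i\int(dx)\,\tilde{T}_{\mu\nu}(x)\,h^{\mu\nu}(x)$, given by the integral⁸

$$\frac{16\pi G}{c^4}\int\!\frac{d^3\mathbf{k}}{(2\pi)^3\,2k^0}\;\big[\,i\,\tilde{T}_{\mu\nu}(k)\big]^{*}\;\frac{\eta^{\mu\sigma}\eta^{\nu\rho} + \eta^{\nu\sigma}\eta^{\mu\rho} - \eta^{\mu\nu}\eta^{\sigma\rho}}{2}\;\big[\,i\,T_{\sigma\rho}(k)\big],\qquad k^0 = |\mathbf{k}|, \tag{52.3}$$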

¹ Einstein [4, 5].
² This is similar to electromagnetic radiation generated by accelerating charged bodies.
³ Abbott et al. [1, 2]; Castelvecchi and Witze [3].
⁴ Weber [11, 12].
⁵ Hulse and Taylor [7].
⁶ Hawking [6].
⁷ See also Manoukian [9].
⁸ For simplicity of notation we put the fundamental constants $\hbar = 1$ and $c = 1$ in this chapter.

© Springer Nature Switzerland AG 2020 E. B. Manoukian, 100 Years of Fundamental Theoretical Physics in the Palm of Your Hand, https://doi.org/10.1007/978-3-030-51081-7_52


describes the transmission of energy from the emission source $T_{\sigma\rho}(k)$ to the detection source $\tilde{T}^{*}_{\mu\nu}(k)$. The multiplicative $i$-factor in $[\,i\,\tilde{T}_{\mu\nu}(k)]^{*}$ was introduced for convenience. We reiterate part of the analysis involved with the massless spin 2 field in Chap. 16. To extract the physical meaning of this equation, we consider the following. For a given momentum $\mathbf{k}$, we define the vectors⁹

$$k = (k^0, \mathbf{k}),\qquad \bar{k} = (k^0, -\mathbf{k}),\qquad (k + \bar{k}) = 2\,(k^0, \mathbf{0}),\qquad (k - \bar{k}) = 2\,(0, \mathbf{k}), \tag{52.4}$$

satisfying the equations

$$(k + \bar{k})^2 = -4\,(k^0)^2,\qquad (k - \bar{k})^2 = 4\,\mathbf{k}^2. \tag{52.5}$$

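In terms of these vectors, the Minkowski metric may be written as

$$\eta^{\mu\nu} = \frac{(k + \bar{k})^\mu (k + \bar{k})^\nu}{(k + \bar{k})^2} + \frac{(k - \bar{k})^\mu (k - \bar{k})^\nu}{(k - \bar{k})^2} + \delta^{\mu}{}_{i}\,\delta^{\nu}{}_{j}\Big[\delta^{ij} - \frac{k^i k^j}{\mathbf{k}^2}\Big], \tag{52.6}$$

as is explicitly verified through these equations:

$$\text{For}\quad\begin{cases}(\mu = 0,\ \nu = 0): & \eta^{00} = 4\,(k^0 k^0)/(-4\,(k^0)^2) + 0 + 0 = -1,\\ (\mu = 0,\ \nu = i): & \eta^{0i} = 0 + 0 + 0 = 0,\\ (\mu = i,\ \nu = 0): & \eta^{i0} = 0 + 0 + 0 = 0,\\ (\mu = i,\ \nu = j): & \eta^{ij} = 0 + (4\,k^i k^j)/(4\,\mathbf{k}^2) + \big[\delta^{ij} - (k^i k^j)/(\mathbf{k}^2)\big] = \delta^{ij}.\end{cases}$$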

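Finally, the mass shell constraint $k^0 = |\mathbf{k}|$ implies that $k^2 = 0$, $\bar{k}^2 = 0$, which lead, from (52.5), to the equality

$$\frac{(k + \bar{k})^\mu (k + \bar{k})^\nu}{(k + \bar{k})^2} + \frac{(k - \bar{k})^\mu (k - \bar{k})^\nu}{(k - \bar{k})^2} = \frac{k^\mu \bar{k}^\nu + \bar{k}^\mu k^\nu}{k\bar{k}}. \tag{52.7}$$

Accordingly, the following decomposition of the Minkowski metric arises:¹⁰

$$\eta^{\mu\nu} = \frac{k^\mu \bar{k}^\nu + \bar{k}^\mu k^\nu}{k\bar{k}} + \delta^{\mu}{}_{i}\,\delta^{\nu}{}_{j}\Big[\delta^{ij} - \frac{k^i k^j}{\mathbf{k}^2}\Big]. \tag{52.8}$$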
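The decomposition in Eq. (52.8) can be checked numerically for an arbitrary on-shell momentum. The sketch below assumes NumPy and the signature $(-,+,+,+)$ used in the text; the function name `metric_from_decomposition` and the sample momentum are hypothetical.

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])  # Minkowski metric, signature (-,+,+,+)

def metric_from_decomposition(kvec):
    """Rebuild eta^{mu nu} from the right-hand side of Eq. (52.8) for a spatial momentum kvec."""
    kvec = np.asarray(kvec, dtype=float)
    k0 = np.linalg.norm(kvec)                 # mass shell constraint k^0 = |k|
    k = np.concatenate(([k0], kvec))          # k^mu    = (k^0,  k)
    kbar = np.concatenate(([k0], -kvec))      # kbar^mu = (k^0, -k)
    k_dot_kbar = k @ eta @ kbar               # equals -2 |k|^2 on the mass shell
    first_term = (np.outer(k, kbar) + np.outer(kbar, k)) / k_dot_kbar
    transverse = np.zeros((4, 4))
    transverse[1:, 1:] = np.eye(3) - np.outer(kvec, kvec) / (kvec @ kvec)  # pi^{ij}(k)
    return first_term + transverse

rebuilt = metric_from_decomposition([0.3, -1.2, 2.0])
print(np.round(rebuilt, 12))
```

The rebuilt matrix should coincide with `eta` for any non-zero spatial momentum.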

Due to the conservation laws $k^\sigma T_{\sigma\rho} = 0$, $k^\rho T_{\sigma\rho} = 0$, $k^\mu \tilde{T}^{*}_{\mu\nu} = 0$, $k^\nu \tilde{T}^{*}_{\mu\nu} = 0$, the first term in the decomposition of the metric in Eq. (52.8) will not contribute to Eq. (52.3). Accordingly, by setting

$$\delta^{ij} - \frac{k^i k^j}{\mathbf{k}^2} \equiv \pi^{ij}(\mathbf{k}), \tag{52.9}$$

we may rewrite Eq. (52.3) as

$$\frac{16\pi G}{c^4}\int\!\frac{d^3\mathbf{k}}{(2\pi)^3\,2k^0}\;\big[\,i\,\tilde{T}^{ij}(k)\big]^{*}\;\frac{\pi^{i\ell}\pi^{jm} + \pi^{im}\pi^{j\ell} - \pi^{ij}\pi^{\ell m}}{2}\;\big[\,i\,T^{\ell m}(k)\big],\qquad k^0 = |\mathbf{k}|. \tag{52.10}$$

Since $k^i\pi^{ij}(\mathbf{k}) = 0$, $k^j\pi^{ij}(\mathbf{k}) = 0$, then for any vector $\mathbf{k}$ we can introduce two unit vectors $\mathbf{n}_1$, $\mathbf{n}_2$ which are perpendicular to each other and to $\mathbf{k}$, as shown in Fig. 52.1. By expressing $\pi^{ij}(\mathbf{k})$ in terms of the components of the unit vectors $\mathbf{n}_1$ and $\mathbf{n}_2$, as given in Fig. 52.1, some elementary algebra allows us to write the following basic equality

⁹ For the convenience of the reader we reiterate the following derivation of polarization vectors for the photon below Eq. (16.32).
¹⁰ This ingenious decomposition goes back to the legendary Julian Schwinger [10].


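$$\frac{1}{2}\Big[\pi^{i\ell}\pi^{jm} + \pi^{im}\pi^{j\ell} - \pi^{ij}\pi^{\ell m}\Big] = \frac{1}{2}\Big[\big(n^i_1 n^j_1 - n^i_2 n^j_2\big)\big(n^\ell_1 n^m_1 - n^\ell_2 n^m_2\big) + \big(n^i_1 n^j_2 + n^i_2 n^j_1\big)\big(n^\ell_1 n^m_2 + n^\ell_2 n^m_1\big)\Big], \tag{52.11}$$

as may be checked, e.g., with $\mathbf{k} = |\mathbf{k}|\,(0, 0, 1)$. This, in turn, allows us to introduce polarization tensors defined by

$$\varepsilon^{ij}_1 = \frac{1}{\sqrt{2}}\big(n^i_1 n^j_1 - n^i_2 n^j_2\big),\qquad \varepsilon^{ij}_2 = \frac{1}{\sqrt{2}}\big(n^i_1 n^j_2 + n^i_2 n^j_1\big), \tag{52.12}$$

$$\pi^{ij,\ell m} \equiv \frac{1}{2}\Big[\pi^{i\ell}\pi^{jm} + \pi^{im}\pi^{j\ell} - \pi^{ij}\pi^{\ell m}\Big] = \sum_{\lambda = 1,2}\varepsilon^{ij}_\lambda\,\varepsilon^{\ell m}_\lambda. \tag{52.13}$$

Fig. 52.1 The figure shows how to decompose $\pi^{ij}(\mathbf{k})$, defined in Eq. (52.9), in terms of the components of the unit vectors $\mathbf{n}_1 = (n^1_1, n^2_1, n^3_1)$ and $\mathbf{n}_2 = (n^1_2, n^2_2, n^3_2)$, with $\mathbf{n}_1\cdot\mathbf{n}_2 = 0$, $\mathbf{n}_1\cdot\mathbf{k} = 0$, $\mathbf{n}_2\cdot\mathbf{k} = 0$: $\pi^{ij} \equiv \delta^{ij} - k^i k^j/\mathbf{k}^2 = \sum_{\lambda = 1,2} n^i_\lambda n^j_\lambda$, with $k^i\pi^{ij} = 0$, $k^j\pi^{ij} = 0$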
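The polarization sum in Eq. (52.13) can be verified numerically for a generic direction of $\mathbf{k}$. This is a minimal sketch assuming NumPy; the helper names (`transverse_basis`, `polarization_tensors`, `pi_four`, `polarization_sum`) and the particular way of constructing $\mathbf{n}_1$, $\mathbf{n}_2$ are hypothetical choices, since any orthonormal pair perpendicular to $\mathbf{k}$ will do.

```python
import numpy as np

def transverse_basis(kvec):
    """Return two orthonormal vectors n1, n2 perpendicular to kvec (one of many possible choices)."""
    k_hat = np.asarray(kvec, dtype=float) / np.linalg.norm(kvec)
    # Pick a helper axis that is safely not parallel to k.
    helper = np.array([1.0, 0.0, 0.0]) if abs(k_hat[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    n1 = np.cross(k_hat, helper)
    n1 /= np.linalg.norm(n1)
    n2 = np.cross(k_hat, n1)
    return n1, n2

def polarization_tensors(n1, n2):
    """Polarization tensors of Eq. (52.12)."""
    eps1 = (np.outer(n1, n1) - np.outer(n2, n2)) / np.sqrt(2.0)
    eps2 = (np.outer(n1, n2) + np.outer(n2, n1)) / np.sqrt(2.0)
    return eps1, eps2

def pi_four(kvec):
    """Left-hand side of Eq. (52.13), built directly from pi^{ij} = delta^{ij} - k^i k^j / k^2."""
    kvec = np.asarray(kvec, dtype=float)
    pi = np.eye(3) - np.outer(kvec, kvec) / (kvec @ kvec)
    return 0.5 * (np.einsum('il,jm->ijlm', pi, pi)
                  + np.einsum('im,jl->ijlm', pi, pi)
                  - np.einsum('ij,lm->ijlm', pi, pi))

def polarization_sum(kvec):
    """Right-hand side of Eq. (52.13): sum over the two polarizations."""
    eps1, eps2 = polarization_tensors(*transverse_basis(kvec))
    return np.einsum('ij,lm->ijlm', eps1, eps1) + np.einsum('ij,lm->ijlm', eps2, eps2)

k = np.array([0.3, -1.2, 2.0])
print(np.max(np.abs(pi_four(k) - polarization_sum(k))))
```

For $\mathbf{k}$ along the $z$-axis with $\mathbf{n}_1$, $\mathbf{n}_2$ along the $x^1$, $x^2$ axes, `polarization_tensors` gives components of magnitude $1/\sqrt{2}$ in the 11, 22, 12 and 21 entries only.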

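We may rewrite Eq. (52.10), in general, as

$$\sum_{\lambda}\ \underbrace{\Big[\sqrt{\frac{8\pi G}{c^4}\,\frac{d^3\mathbf{k}}{(2\pi)^3\,2k^0}}\;\,i\,\tilde{T}^{ij}\,\varepsilon^{ij}_\lambda\Big]^{*}}_{\text{amplitude of detection}}\ \underbrace{\Big[\sqrt{\frac{8\pi G}{c^4}\,\frac{d^3\mathbf{k}}{(2\pi)^3\,2k^0}}\;\,\varepsilon^{\ell m}_\lambda\;i\,T^{\ell m}\Big]}_{\text{amplitude of emission}}, \tag{52.14}$$

expressed as the product of amplitudes of emission and detection of a quantum, a graviton, of momentum $\mathbf{k}$ and polarization specified by the parameter $\lambda$. In particular, by setting up a coordinate system such that the graviton is propagating along the $x^3 = z$-axis, i.e., with momentum $\mathbf{k}$ along the positive $z$-axis, the unit vectors $\mathbf{n}_1$, $\mathbf{n}_2$ may be chosen along the $x^1$, $x^2$ axes, respectively. In this coordinate system $n^i_1 = \delta^{i1}$, $n^i_2 = \delta^{i2}$. That is, the only non-vanishing polarization components are, from (52.12), then given by

$$\varepsilon^{11}_1 = \frac{1}{\sqrt{2}},\qquad \varepsilon^{22}_1 = -\frac{1}{\sqrt{2}}, \tag{52.15}$$

$$\varepsilon^{12}_2 = \frac{1}{\sqrt{2}},\qquad \varepsilon^{21}_2 = \frac{1}{\sqrt{2}}. \tag{52.16}$$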

The polarizations $\varepsilon^{11}_1$ and $\varepsilon^{22}_1$ deform a ring of test particles, respectively, in the $x^1$ and $x^2$ directions, while $\varepsilon^{12}_2$ and $\varepsilon^{21}_2$ deform a ring of test particles at 45° angles to the $x^1$ and $x^2$ directions. To describe the latter polarization set, we may thus consider $\varepsilon^{12}_2$, $\varepsilon^{21}_2$ in a coordinate system rotated clockwise by an angle of 45° relative to the coordinate system in which $\varepsilon^{11}_1$, $\varepsilon^{22}_1$ are depicted (see Fig. 52.2). Remembering that a second rank tensor transforms as the product of two vectors, we have in two dimensions, with $\cos\theta = \sin\theta = 1/\sqrt{2}$:

$$V'^{\,1} = V^1\cos\theta - V^2\sin\theta,\qquad V'^{\,2} = V^1\sin\theta + V^2\cos\theta, \tag{52.17}$$

for vector components, and hence

$$\begin{cases}\varepsilon'^{\,11}_2 = \varepsilon^{11}_2\cos^2\theta + \varepsilon^{22}_2\sin^2\theta - \varepsilon^{12}_2\sin\theta\cos\theta - \varepsilon^{21}_2\sin\theta\cos\theta = -1/\sqrt{2},\\ \varepsilon'^{\,22}_2 = \varepsilon^{11}_2\sin^2\theta + \varepsilon^{22}_2\cos^2\theta + \varepsilon^{12}_2\sin\theta\cos\theta + \varepsilon^{21}_2\sin\theta\cos\theta = +1/\sqrt{2},\\ \varepsilon'^{\,12}_2 = \varepsilon^{11}_2\sin\theta\cos\theta - \varepsilon^{22}_2\sin\theta\cos\theta + \varepsilon^{12}_2\cos^2\theta - \varepsilon^{21}_2\sin^2\theta = 0,\\ \varepsilon'^{\,21}_2 = \varepsilon^{11}_2\sin\theta\cos\theta - \varepsilon^{22}_2\sin\theta\cos\theta - \varepsilon^{12}_2\sin^2\theta + \varepsilon^{21}_2\cos^2\theta = 0.\end{cases} \tag{52.18}$$

Fig. 52.2 Panel (a) shows $\varepsilon^{11}_1$ and $\varepsilon^{22}_1$ along the $x^1$, $x^2$ axes; panel (b) shows $\varepsilon'^{\,11}_2$ and $\varepsilon'^{\,22}_2$ along the axes rotated by 45°; in both panels the arrows act on surfaces labelled positive and negative. Part (b) is obtained by a 45° c.w. rotation of the coordinate system $x^1$–$x^2$ in (a). Note that $\varepsilon^{11}_1$ in (a) is positive; then at a negative surface the direction of the corresponding arrow is in the negative $x^1$-direction, while at the positive surface it is in the positive $x^1$-direction, and so on for the directions of the other arrows. Another cycle may be described by rotating the axes in this figure 90° c.c.w. The momentum of the graviton is in the $x^3$-direction, out of the page

Fig. 52.3 The figure depicts the deformation of a ring of test particles, at different cycles, in response to the polarization states (a) and (b), respectively, of Fig. 52.2
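The rotation in Eqs. (52.17) and (52.18) can be reproduced in a few lines. The sketch below assumes NumPy and simply applies the vector transformation (52.17) to both indices of the second polarization tensor; the variable names are hypothetical.

```python
import numpy as np

theta = np.pi / 4                       # 45 degree rotation, cos(theta) = sin(theta) = 1/sqrt(2)
c, s = np.cos(theta), np.sin(theta)

# Rotation acting on vector components as in Eq. (52.17): V'^a = R^a_b V^b
R = np.array([[c, -s],
              [s,  c]])

# Second polarization tensor restricted to the x^1-x^2 plane, from Eq. (52.16)
eps2 = np.array([[0.0, 1.0],
                 [1.0, 0.0]]) / np.sqrt(2.0)

# A second rank tensor transforms as the product of two vectors: eps' = R eps R^T
eps2_rotated = R @ eps2 @ R.T
print(np.round(eps2_rotated, 12))
```

The rotated tensor is diagonal, with entries $-1/\sqrt{2}$ and $+1/\sqrt{2}$, as in Eq. (52.18).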

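Therefore the two polarization states of the graviton are given by

$$\Big(\varepsilon^{11}_1 = \frac{1}{\sqrt{2}} = -\,\varepsilon^{22}_1\Big),\qquad \Big(\varepsilon'^{\,11}_2 = -\frac{1}{\sqrt{2}} = -\,\varepsilon'^{\,22}_2\Big), \tag{52.19}$$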

and are depicted in Fig. 52.2. The deformation of a ring of test particles in such a gravitational field is formally depicted in Fig. 52.3.

To derive the expression for gravitational radiation, we note from the expression of the amplitude of emission in Eq. (52.14) that the probabilities of the emission of one graviton or of $N$ gravitons, of any momenta and any polarizations, are, respectively, given by

$$\text{Prob}_{\text{emission}}[1] \propto \frac{8\pi G}{c^4}\int\!\frac{d^3\mathbf{k}}{(2\pi)^3\,2k^0}\sum_{\lambda}\big|\varepsilon^{\mu\nu}_\lambda\,T_{\mu\nu}\big|^2 \equiv \chi,\qquad k^0 = |\mathbf{k}|, \tag{52.20}$$

$$\text{Prob}_{\text{emission}}[N] \propto \frac{(\chi)^N}{N!}, \tag{52.21}$$


where, using the indistinguishability of gravitons due to their bosonic character, we have divided the above expression by $N!$. We need a normalization factor such that when we sum the corresponding probabilities over $N$, from 0 to $\infty$, we get one. Clearly, by the definition of an exponential, the properly normalized probability in question is then given by

$$\text{Prob}_{\text{emission}}[N] = \frac{(\chi)^N}{N!}\,\exp[-\chi],\qquad N = 0, 1, 2, \ldots, \tag{52.22}$$

$$\sum_{N=0}^{\infty}\frac{(\chi)^N}{N!}\,\exp[-\chi] = \exp[\chi]\,\exp[-\chi] = 1, \tag{52.23}$$

$$\langle N\rangle = \sum_{N=0}^{\infty} N\,\text{Prob}_{\text{emission}}[N] = \sum_{N=1}^{\infty} N\,\frac{(\chi)^N}{N!}\,\exp[-\chi] = \chi\sum_{N=1}^{\infty}\frac{(\chi)^{N-1}}{(N-1)!}\,\exp[-\chi] = \chi\sum_{N=0}^{\infty}\frac{(\chi)^N}{N!}\,\exp[-\chi] = \chi. \tag{52.24}$$
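The normalization (52.23) and the mean (52.24) can be checked numerically by truncating the sums. This is a small sketch using only the Python standard library; the value of `chi`, the cutoff `n_max`, and the function names are hypothetical, and the cutoff is kept modest so that the factorials stay within floating-point range.

```python
import math

def emission_probability(N, chi):
    """Normalized probability of emitting N gravitons, Eq. (52.22)."""
    return chi**N * math.exp(-chi) / math.factorial(N)

def poisson_summary(chi, n_max=100):
    """Return (sum of probabilities, mean number) with the sums truncated at n_max."""
    probs = [emission_probability(N, chi) for N in range(n_max + 1)]
    total = sum(probs)
    mean = sum(N * p for N, p in enumerate(probs))
    return total, mean

chi = 2.5  # illustrative value only
total, mean = poisson_summary(chi)
print(f"sum of probabilities = {total:.12f}, <N> = {mean:.12f}")
```

For `chi` small compared with `n_max`, the truncated sums reproduce 1 and `chi` to within rounding.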

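This is called a Poisson distribution,¹¹ with the average number of gravitons emitted given by

$$\langle N\rangle = \frac{8\pi G}{c^4}\int\!\frac{d^3\mathbf{k}}{(2\pi)^3\,2k^0}\sum_{\lambda}\big|\varepsilon^{\mu\nu}_\lambda\,T_{\mu\nu}\big|^2 \equiv \chi.$$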

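If $N(E)$ denotes the average number of gravitons emitted with energy in the interval $(E, E + dE)$, with $\int_0^\infty N(E)\,dE = \langle N\rangle$, and $E = k^0 = |\mathbf{k}|$, $\mathbf{k} = k^0\,\mathbf{n}$, then

$$N(E) = 4\pi G\,\frac{|\mathbf{k}|}{(2\pi)^3}\int\! d\Omega\;T^{*ij}(k)\,\pi^{ij,\ell m}(\mathbf{k})\,T^{\ell m}(k),\qquad \big[\pi^{ij,\ell m}\ \text{given in (52.13)}\big],$$

$$\int_0^\infty E\,N(E)\,dE = \frac{G}{\pi}\int_0^\infty\frac{dk^0}{2\pi}\,(k^0)^2\int\! d\Omega\;T^{*ij}(k^0, k^0\mathbf{n})\,\pi^{ij,\ell m}(\mathbf{n})\,T^{\ell m}(k^0, k^0\mathbf{n}),$$

$$\int\! d\Omega\;T^{*ij}(k^0, k^0\mathbf{n})\,\pi^{ij,\ell m}(\mathbf{n})\,T^{\ell m}(k^0, k^0\mathbf{n}) = \int\!(dx')\,e^{\,ik^0x'^0}\int\!(dx)\,e^{-ik^0x^0}\;T^{*ij}(x')\,T^{\ell m}(x)\times\int\! d\Omega\;e^{-ik^0\mathbf{n}\cdot\mathbf{x}'}\,\pi^{ij,\ell m}(\mathbf{n})\,e^{\,ik^0\mathbf{n}\cdot\mathbf{x}}. \tag{52.25}$$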